Lecture Notes in Networks and Systems 458
Jennifer S. Raj Yong Shi Danilo Pelusi Valentina Emilia Balas Editors
Intelligent Sustainable Systems Proceedings of ICISS 2022
Lecture Notes in Networks and Systems Volume 458
Series Editor Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland Advisory Editors Fernando Gomide, Department of Computer Engineering and Automation—DCA, School of Electrical and Computer Engineering—FEEC, University of Campinas— UNICAMP, São Paulo, Brazil Okyay Kaynak, Department of Electrical and Electronic Engineering, Bogazici University, Istanbul, Turkey Derong Liu, Department of Electrical and Computer Engineering, University of Illinois at Chicago, Chicago, USA Institute of Automation, Chinese Academy of Sciences, Beijing, China Witold Pedrycz, Department of Electrical and Computer Engineering, University of Alberta, Alberta, Canada Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland Marios M. Polycarpou, Department of Electrical and Computer Engineering, KIOS Research Center for Intelligent Systems and Networks, University of Cyprus, Nicosia, Cyprus Imre J. Rudas, Óbuda University, Budapest, Hungary Jun Wang, Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
The series “Lecture Notes in Networks and Systems” publishes the latest developments in Networks and Systems—quickly, informally and with high quality. Original research reported in proceedings and post-proceedings represents the core of LNNS. Volumes published in LNNS embrace all aspects and subfields of, as well as new challenges in, Networks and Systems. The series contains proceedings and edited volumes in systems and networks, spanning the areas of Cyber-Physical Systems, Autonomous Systems, Sensor Networks, Control Systems, Energy Systems, Automotive Systems, Biological Systems, Vehicular Networking and Connected Vehicles, Aerospace Systems, Automation, Manufacturing, Smart Grids, Nonlinear Systems, Power Systems, Robotics, Social Systems, Economic Systems, and others. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution and exposure which enable both a wide and rapid dissemination of research output. The series covers the theory, applications, and perspectives on the state of the art and future developments relevant to systems and networks, decision making, control, complex processes and related areas, as embedded in the fields of interdisciplinary and applied sciences, engineering, computer science, physics, economics, social, and life sciences, as well as the paradigms and methodologies behind them. Indexed by SCOPUS, INSPEC, WTI Frankfurt eG, zbMATH, SCImago. All books published in the series are submitted for consideration in Web of Science. For proposals from Asia please contact Aninda Bose (aninda.bose@springer.com).
Jennifer S. Raj · Yong Shi · Danilo Pelusi · Valentina Emilia Balas Editors
Intelligent Sustainable Systems Proceedings of ICISS 2022
Editors Jennifer S. Raj Gnanamani College of Engineering and Technology Namakkal, India
Yong Shi Department of Computer Science Kennesaw State University Kennesaw, GA, USA
Danilo Pelusi Faculty of Communication Sciences University of Teramo Teramo, Italy
Valentina Emilia Balas Automatics and Applied Software Aurel Vlaicu University of Arad Arad, Romania
ISSN 2367-3370 ISSN 2367-3389 (electronic) Lecture Notes in Networks and Systems ISBN 978-981-19-2893-2 ISBN 978-981-19-2894-9 (eBook) https://doi.org/10.1007/978-981-19-2894-9 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022, corrected publication 2024 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
We are honored to dedicate these proceedings to all the participants, organizers, and editors of the 5th ICISS 2022.
Preface
With deep gratification, we are delighted to welcome you to the proceedings of the 5th International Conference on Intelligent Sustainable Systems (ICISS 2022), organized at SCAD College of Engineering and Technology, Tirunelveli, India, on February 17–18, 2022. The major goal of this international conference is to gather academicians, industrialists, researchers, and scholars on a common platform to share their innovative research ideas and practical solutions toward the development of intelligent sustainable systems for a more sustainable future. The conference delegates attended a wide range of technical sessions based on the different technical domains involved in the theme of the conference. The conference program included invited keynote sessions on developing a sustainable future, state-of-the-art research presentations, and informative discussions with the distinguished keynote speakers, covering a wide range of topics in information systems and sustainability research. This year, ICISS received 312 papers across the different conference tracks, and based on 3–4 expert reviews from the technical program committee and the internal and external reviewers, 62 papers were finally selected for the conference. The proceedings include papers from tracks such as intelligent systems, sustainable systems, and applications. Each paper, regardless of track, received at least three reviews from reviewers with professional expertise in the particular research domain of the paper. We are pleased to thank the conference organization committee, the conference program committee, and the technical reviewers for working generously toward the success of the conference event. A special mention goes to the internal and external reviewers for working very hard in reviewing each and every paper submitted to the conference and for giving valuable suggestions to the authors to maintain the quality of the conference. We are truly obliged to the authors, who have contributed their innovative research results to the conference. Special thanks go to Springer for their impeccable support and guidance throughout the publication process.
We hope the proceedings of ICISS 2022 will provide an enjoyable and technically rewarding experience for both attendees and readers. Namakkal, India Kennesaw, USA Teramo, Italy Arad, Romania
Dr. Jennifer S. Raj Dr. Yong Shi Dr. Danilo Pelusi Dr. Valentina Emilia Balas
Contents
Lung Ultrasound COVID-19 Detection Using Deep Feature Recursive Neural Network . . . 1 E. Naveenkumar, B. Dhiyanesh, D. Magesh, G. Muthuram, N. Selvanathan, and R. Radha Predicting New York Taxi Trip Duration Based on Regression Analysis Using ML and Time Series Forecasting Using DL . . . 15 S. Ramani, Anish Ghiya, Pusuluri Sidhartha Aravind, Marimuthu Karuppiah, and Danilo Pelusi Implementation of Classical Error Control Codes for Memory Storage Systems Using VERILOG . . . 29 Sreevatsan Radhakrishnan, Syed Ishtiyaq Ahmed, and S. R. Ramesh Parkinson’s Disease Detection Using Machine Learning . . . 43 Shivani Desai, Darshee Mehta, Vijay Dulera, and Hitesh Chhikaniwala Sustainable Consumption: An Approach to Achieve the Sustainable Environment in India . . . 59 Sunny Dawar, Pallavi Kudal, Prince Dawar, Mamta Soni, Payal Mahipal, and Ashish Choudhary
The Concept of a Digital Marketing Communication Model for Higher Education Institutions . . . 75 Artur Kisiołek, Oleh Karyy, and Ihor Kulyniak A Lightweight Image Colorization Model Based on U-Net Architecture . . . 91 Pham Van Thanh and Phan Duy Hung
Comparative Analysis of Obesity Level Estimation Based on Lifestyle Using Machine Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101 R. Archana and B. Rajathilagam
An Empirical Study on Millennials’ Adoption of Mobile Wallets . . . . . . . 115 M. Krithika and Jainab Zareena An IoT-Based Smart Mirror . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127 K. N. Pallavi, Jagadevi N. Kalshetty, Maithri Suresh, Megha B. Kunder, and Kavya Shetty AI-Assisted College Recommendation System . . . . . . . . . . . . . . . . . . . . . . . . 141 Keshav Kumar, Vatsal Sinha, Aman Sharma, M. Monicashree, M. L. Vandana, and B. S. Vijay Krishna An Agent-Based Model to Predict Student Protest in Public Higher Education Institution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151 T. S. Raphiri, M. Lall, and T. B. Chiyangwa RETRACTED CHAPTER: High Accuracy for Hyperspectral Image Classification Using Hybrid Spectral 3D-2D CNN . . . . . . . . . . . . . . 167 Mohini Shivhare and Sweta Tripathi Design Smart Curtain Using Light-Dependent Resistor . . . . . . . . . . . . . . . 179 Feras N. Hasoon, Mustafa Khalaf Aal Thani, Hilal A. Fadhil, Geetha Achuthan, and Suresh Manic Kesavan Machine Learning Assisted Binary and Multiclass Parkinson’s Disease Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191 Satyankar Bhardwaj, Dhruv Arora, Bali Devi, Venkatesh Gauri Shankar, and Sumit Srivastava Category Based Location Aware Tourist Place Popularity Prediction and Recommendation System Using Machine Learning Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207 Apeksha Arun Wadhe and Shraddha Suratkar Maximization of Disjoint K-cover Using Computation Intelligence to Improve WSN Lifetime . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223 D. L. Shanthi Car-Like Robot Tracking Using Particle Filter . . . . . . . . . . . . . . . . . . . . . . . 239 Cheedella Akhil, Sayam Rahul, Kottam Akshay Reddy, and P. Sudheesh Secured E-voting System Through Blockchain Technology . . . . . . . . . . . . 247 Nisarg Dave, Neev Shah, Paritosh Joshi, and Kaushal Shah A Novel Framework for Malpractice Detection in Online Proctoring Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261 Korrapati Pravallika, M. Kameswara Rao, and Syamala Tejaswini Frequency Reconfigurable of Quad-Band MIMO Slot Antenna for Wireless Communication Applications in LTE, GSM, WLAN, and WiMAX Frequency Bands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271 B. Suresh, Satyanarayana Murthy, and B. Alekya
Intelligent Control Strategies Implemented in Trajectory Tracking of Underwater Vehicles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289 Mage Reena Varghese and X. Anitha Mary Fused Feature-Driven ANN Model for Estimating Code-Mixing Level in Audio Samples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 301 K. Priya, S. Mohamed Mansoor Roomi, R. A. Alaguraja, and P. Vasuki Pre-emptive Caching of Video Content Using Predictive Analysis . . . . . . 317 Rohit Kumar Gupta, Atharva Naik, Saurabh Suthar, Ashish Kumar, and Ankit Mundra Information Dissemination Strategies for Safety Applications in VANET: A Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327 Mehul Vala and Vishal Vora Tech Stack Prediction Using Hybrid ARIMA and LSTM Model . . . . . . . 343 Radha SenthilKumar, V. Naveen, M. Sri Hari Balaji, and P. Aravinth Deceptive News Prediction in Social Media Using Machine Learning Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355 Anshita Malviya and Rajendra Kumar Dwivedi Several Categories of the Classification and Recommendation Models for Dengue Disease: A Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 369 Salim G. Shaikh, B. Suresh Kumar, and Geetika Narang Performance Analysis of Supervised Machine Learning Algorithms for Detection of Cyberbullying in Twitter . . . . . . . . . . . . . . . . . . . . . . . . . . . . 381 Nida Shakeel and Rajendra Kumar Dwivedi Text Summarization of Legal Documents Using Reinforcement Learning: A Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 403 Bharti Shukla, Sonam Gupta, Arun Kumar Yadav, and Divakar Yadav Use of Near-field Communication (NFC) and Fingerprint Technology for Authentication of ATM Transactions . . . . . . . . . . . . . . . . . . 415 K. Renuka, R. P. Janani, K. Lakshmi Narayanan, P. Kannan, R. Santhana Krishnan, and Y. Harold Robinson Light Gradient Boosting Machine in Software Defect Prediction: Concurrent Feature Selection and Hyper Parameter Tuning . . . . . . . . . . . 427 Suresh Kumar Pemmada, Janmenjoy Nayak, H. S. Behera, and Danilo Pelusi An Three-Level Active NPC Inverter Open-Circuit Fault Diagnosis Using SVM and ANN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 443 P. Selvakumar and G. Muthukumaran
Hybrid Control Design Techniques for Aircraft Yaw and Roll Control System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 457 A. C. Pavithra and N. V. Archana A Review of the Techniques and Evaluation Parameters for Recommendation Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 473 S. Vijaya Shetty, Khush Dassani, G. P. Harish Gowda, H. Sarojadevi, P. Hariprasad Reddy, and Sehaj Jot Singh Fostering Smart Cities and Smart Governance Using Cloud Computing Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 491 Lubna Ansari, M. Afshar Alam, Mohd Abdul Ahad, and Md. Tabrez Nafis IoT-Enabled Smart Helmet for Site Workers . . . . . . . . . . . . . . . . . . . . . . . . . 505 D. Mohanapriya, S. K. Kabilesh, J. Nandhini, A. Stephen Sagayaraj, G. Kalaiarasi, and B. Saritha Efficient Direct and Immediate User Revocable Attribute-Based Encryption Scheme . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 517 Tabassum N. Mujawar and Lokesh B. Bhajantri Comparative Analysis of Deep Learning-Based Abstractive Text Summarization Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 531 Dakshata Argade and Vaishali Khairnar Crop Disease Prediction Using Computational Machine Learning Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 541 Rupali A. Meshram and A. S. Alvi A Survey on Design Issues, Challenges, and Applications of Terahertz based 6G Communication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 551 Selvakumar George, Nandalal Vijayakumar, Asirvatham Masilamani, Ezhil E. Nithila, Nirmal Jothi, and J. Relin Francis Raj A Study of Image Characteristics and Classifiers Utilized for Identify Leaves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 559 Dipak Pralhad Mahurkar and Hemant Patidar COVID-19 Detection Using X-Ray Images by Using Convolutional Neural Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 569 S. L. Jany Shabu, S. Bharath Vinay Reddy, R. Satya Ranga Vara Prasad, J. Refonaa, and S. Dhamodaran Polarimetric Technique for Forest Target Detection Using Scattering-Based Vector Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 577 Plasin Francis Dias and R. M. Banakar Multilingual Identification Using Deep Learning . . . . . . . . . . . . . . . . . . . . . 589 C. Rahul and R. Gopikakumari
AI-Based Career Counselling with Chatbots . . . 599 Ajitesh Nair, Ishan Padhy, J. K. Nikhil, S. Sindhura, M. L. Vandana, and B. S. Vijay Krishna A High-Gain Improved Linearity Folded Cascode LNA for Wireless Applications . . . 613 S. Bhuvaneshwari and S. Kanthamani Design and Analysis of Low Power FinFET-Based Hybrid Full Adders at 16 nm Technology Node . . . 631 Shikha Singh and Yagnesh B. Shukla A Review on Fish Species Classification and Determination Using Machine Learning Algorithms . . . 643 Sowmya Natarajan and Vijayakumar Ponnusamy Malicious URL Detection Using Machine Learning Techniques . . . 657 Shridevi Angadi and Samiksha Shukla Comparative Study of Blockchain-Based Voting Solutions . . . 671 Khushi Patel, Dipak Ramoliya, Kashish Sorathia, and Foram Bhut Electrical Simulation of Typical Organic Solar Cell by GPVDM Software . . . 687 Rohma Usmani, Malik Nasibullah, and Mohammed Asim Statistical Analysis of Blockchain Models from a Cloud Deployment Standpoint . . . 695 Himanshu V. Taiwade and Premchand B. Ambhore Deferred Transmission Control Communication Protocol for Mobile Object-Based Wireless Sensor Networks . . . 713 Anand Vaidya and Shrihari M. Joshi Crime Data Analysis Using Machine Learning Techniques . . . 727 Ankit Yadav, Bhavna Saini, and Kavita Subsampling in Graph Signal Processing Based on Dominating Set . . . 737 E. Dhanya, Gigi Thomas, and Jill K. Mathew Optimal Sizing and Cost Analysis of Hybrid Electric Renewable Energy Systems Using HOMER . . . 745 Basanagouda F. Ronad Different Nature-Inspired Optimization Models Using Heavy Rainfall Prediction: A Review . . . 761 Nishant N. Pachpor, B. Suresh Kumar, Prakash S. Parsad, and Salim G. Shaikh
Hyper Chaos Random Bit-Flipping Diffusion-Based Colour Image Cryptosystem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 777 Sujarani Rajendran, Manivannan Doraipandian, Kannan Krithivasan, Ramya Sabapathi, and Palanivel Srinivasan Implementation of Fuzzy Logic-Based Predictive Load Scheduling in Home Energy Management System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 791 Nirmala Jegadeesan and G. Balasubramanian Firmware Attack Detection on Gadgets Using Least Angle Regression (LAR) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 801 E. Arul and A. Punidha STEMS—Smart Traffic and Emergency Management System . . . . . . . . . 811 A. Rajagopal, Chirag C. Choradia, S. Druva Kumar, Anagha Dasa, and Shweta Yadav Retraction Note to: High Accuracy for Hyperspectral Image Classification Using Hybrid Spectral 3D-2D CNN . . . . . . . . . . . . . . . . . . . . Mohini Shivhare and Sweta Tripathi
C1
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 825
Editors and Contributors
About the Editors

Dr. Jennifer S. Raj received the Ph.D. degree from Anna University and the Master's degree in Communication Systems from SRM University, India. Currently, she is working in the Department of ECE, Gnanamani College of Technology, Namakkal, India. She is a Life Member of ISTE, India. She has been serving as Organizing Chair and Program Chair of several international conferences and on the program committees of several international conferences. She is a book reviewer for Tata McGraw-Hill Publications and has published more than 50 research articles in journals and IEEE conferences. Her interests are in wireless healthcare informatics and body area sensor networks.

Dr. Yong Shi is currently working as Associate/Tenured Professor of Computer Science at Kennesaw State University and Director/Coordinator of the Master of Computer Science. He is responsible for directing the Master of Computer Science Program and reviewing applications for the Master of Computer Science. He has published more than 50 articles in national and international journals. He has acted as an editor, reviewer, editorial board member, and program committee member for many reputed journals and conferences. His research interests include Cloud Computing, Big Data, and Cybersecurity.

Danilo Pelusi received the Ph.D. degree in Computational Astrophysics from the University of Teramo, Italy. He is Associate Professor at the Department of Communication Sciences, University of Teramo. Co-editor of books with Springer and Elsevier, he is/was Associate Editor of IEEE Transactions on Emerging Topics in Computational Intelligence, IEEE Access, and the International Journal of Machine Learning and Cybernetics. A guest editor for Elsevier, Springer, and Inderscience journals and a keynote speaker at several conferences, he serves on the editorial boards of many journals. A reviewer of reputed journals such as IEEE Transactions on Fuzzy Systems and IEEE Transactions on Neural Networks and Machine Learning,
his research interests include Fuzzy Logic, Neural Networks, Information Theory, Machine Learning, and Evolutionary Algorithms.

Dr. Valentina Emilia Balas is currently Full Professor at “Aurel Vlaicu” University of Arad, Romania. She is the author of more than 300 research papers. Her research interests are in Intelligent Systems, Fuzzy Control, and Soft Computing. She is Editor-in-Chief of the International Journal of Advanced Intelligence Paradigms (IJAIP) and of IJCSE. She is a Member of EUSFLAT and ACM, a Senior Member of IEEE, a member of TC–EC and TC–FS (IEEE CIS) and TC–SC (IEEE SMCS), and Joint Secretary of FIM.
Contributors Mohd Abdul Ahad Department of Computer Science and Engineering, Jamia Hamdard, Delhi, India Geetha Achuthan Electrical and Computer Engineering Department, National University of Science and Technology, Muscat, Sultanate of Oman M. Afshar Alam Department of Computer Science and Engineering, Jamia Hamdard, Delhi, India Cheedella Akhil Department of Electronics and Communication Engineering Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India R. A. Alaguraja ECE, Thiagarajar College of Engineering, Madurai, Tamil Nadu, India B. Alekya Department of ECE, V R Siddhartha Engineering College, Vijayawada, India A. S. Alvi PRMIT & R, Badnera, Amravati, India Premchand B. Ambhore Department of Information Technology, Government College of Engineering, Amravati, India Shridevi Angadi Christ University, Bangalore, India Lubna Ansari Department of Computer Science and Engineering, Jamia Hamdard, Delhi, India Pusuluri Sidhartha Aravind School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, Tamil Nadu, India P. Aravinth Anna University, Madras Institute of Technology, Chennai, Tamil Nadu, India N. V. Archana Electrical and Electronics Department, NIEIT, Mysore, Karnataka, India
R. Archana Department of Computer Science and Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India Dakshata Argade Terna Engineering College, Navi-Mumbai, India Dhruv Arora Computer Science and Engineering, Manipal University Jaipur, Jaipur, Rajasthan, India E. Arul Department of Information Technology, Coimbatore Institute of Technology, Coimbatore, Tamilnadu, India Apeksha Arun Wadhe Department of Computer Engineering and Information Technology, VJTI, Mumbai, India Mohammed Asim Integral University, Lucknow, India G. Balasubramanian School of Electrical and Electronics Engineering, SASTRA Deemed University, Thanjavur, India R. M. Banakar Department of Electronics and Communication Engineering, BVBCET, Hubli, India H. S. Behera Department of Information Technology, Veer Surendra Sai University of Technology, Burla, India Lokesh B. Bhajantri Department of ISE, Basaveshwar Engineering College, Bagalkot, Karnataka, India S. Bharath Vinay Reddy Sathyabama Institute of Science and Technology, Chennai, India Satyankar Bhardwaj Computer Science and Engineering, Manipal University Jaipur, Jaipur, Rajasthan, India Foram Bhut Department of Information Technology, Devang Patel Institute of Advance Technology and Research (DEPSTAR), Faculty of Technology and Engineering (FTE), Charotar University of Science and Technology (CHARUSAT), CHARUSAT Campus, Changa, Gujarat, India S. Bhuvaneshwari Department of ECE, Thiagarajar College of Engineering, Madurai, India Hitesh Chhikaniwala Info Comm Technology, Adani Institute of Infrastructure Engineering, Ahmedabad, Gujarat, India T. B. Chiyangwa Computer Science Department, University of South Africa, Gauteng, South Africa Chirag C. Choradia Dept. of Electronics and Communication Engineering, Dayananda Sagar College of Engineering, Bangalore, Karnataka, India
Ashish Choudhary Manipal University Jaipur, Jaipur, Rajasthan, India Anagha Dasa Dept. of Electronics and Communication Engineering, Dayananda Sagar College of Engineering, Bangalore, Karnataka, India Khush Dassani Nitte Meenakshi Institute of Technology, Bengaluru, India Nisarg Dave Pandit Deendyal Energy University, Gandhinagar, Gujrat, India Prince Dawar Poornima Group of Colleges, Jaipur, Rajasthan, India Sunny Dawar Manipal University Jaipur, Jaipur, Rajasthan, India Shivani Desai Computer Science and Engineering Department, Nirma University, Ahmedabad, Gujarat, India Bali Devi Computer Science and Engineering, Manipal University Jaipur, Jaipur, Rajasthan, India S. Dhamodaran Sathyabama Institute of Science and Technology, Chennai, India E. Dhanya PG and Research Department of Mathematics, Mar Ivanios College (Autonomous), Thiruvananthapuram, Kerala, India B. Dhiyanesh Hindusthan College of Engineering and Technology, Coimbatore, India Manivannan Doraipandian School of Computing, SASTRA Deemed University, Thanjavur, India S. Druva Kumar Dept. of Electronics and Communication Engineering, Dayananda Sagar College of Engineering, Bangalore, Karnataka, India Vijay Dulera Computer Science and Engineering Department, Nirma University, Ahmedabad, Gujarat, India Phan Duy Hung FPT University, Hanoi, Vietnam Rajendra Kumar Dwivedi Department of Information Technology and Computer Application, MMMUT Gorakhpur, Gorakhpur, India Hilal A. Fadhil Department of Electrical and Computer Engineering, Sohar University, Sohar, Sultanate of Oman Plasin Francis Dias Department of Electronics and Communication Engineering, KLS VDIT, Haliyal, India Selvakumar George Department of ECE, SCAD College of Engineering and Technology, Tirunelveli, Tamilnadu, India Anish Ghiya School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, Tamil Nadu, India R. Gopikakumari Division of Electronics Engineering, School of Engineering, CUSAT, Cochi, India
Rohit Kumar Gupta Manipal University Jaipur, Jaipur, India Sonam Gupta Ajay Kumar Garg Engineering College, Ghaziabad, India P. Hariprasad Reddy Nitte Meenakshi Institute of Technology, Bengaluru, India G. P. Harish Gowda Nitte Meenakshi Institute of Technology, Bengaluru, India Y. Harold Robinson School of Information Technology and Engineering, Vellore Institute of Technology, Vellore, India Feras N. Hasoon Electrical and Computer Engineering Department, National University of Science and Technology, Muscat, Sultanate of Oman Syed Ishtiyaq Ahmed Department of Electronics and Communication Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India R. P. Janani Francis Xavier Engineering College, Tirunelveli, Tamil Nadu, India S. L. Jany Shabu Sathyabama Institute of Science and Technology, Chennai, India Nirmala Jegadeesan School of Electrical and Electronics Engineering, SASTRA Deemed University, Thanjavur, India Paritosh Joshi Pandit Deendyal Energy University, Gandhinagar, Gujrat, India Shrihari M. Joshi SDM College of Engineering and Technology, Dharwad, India Nirmal Jothi Department of ECE, SCAD College of Engineering and Technology, Tirunelveli, Tamilnadu, India S. K. Kabilesh Department of Electronics and Communication Engineering, Jai Shriram Engineering College, Avinashipalayam, Tiruppur, India G. Kalaiarasi Department of Electronics and Communication Engineering, Jai Shriram Engineering College, Avinashipalayam, Tiruppur, India Jagadevi N. Kalshetty Nitte Meenakshi Institute of Technology, Bangalore, India P. Kannan ECE Department, Francis Xavier Engineering College, Tirunelveli, Tamil Nadu, India S. Kanthamani Department of ECE, Thiagarajar College of Engineering, Madurai, India Marimuthu Karuppiah Department of Computer Science and Engineering, SRM Institute of Science and Technology, Ghaziabad, Uttar Pradesh, India Oleh Karyy Lviv Polytechnic National University, Lviv, Ukraine Kavita Department of Information Technology, Manipal University, Jaipur, India Suresh Manic Kesavan Electrical and Computer Engineering Department, National University of Science and Technology, Muscat, Sultanate of Oman
Vaishali Khairnar Terna Engineering College, Navi-Mumbai, India Mustafa Khalaf Aal Thani Electrical and Computer Engineering Department, National University of Science and Technology, Muscat, Sultanate of Oman Artur Kisiołek Greate Poland University of Social Studies and Economics in Środa Wlkp., Środa Wielkopolska, Poland M. Krithika Department of Management Studies, Saveetha School of Engineering, SIMATS, Chennai, India Kannan Krithivasan School of Computing, SASTRA Deemed University, Thanjavur, India Pallavi Kudal Dr DY Patil Institute of Management Studies, Pune, Maharashtra, India Ihor Kulyniak Lviv Polytechnic National University, Lviv, Ukraine Ashish Kumar Manipal University Jaipur, Jaipur, India Keshav Kumar Computer Science, PES University, Bengaluru, India Megha B. Kunder Atos Syntel, Bangalore, India K. Lakshmi Narayanan ECE Department, Francis Xavier Engineering College, Tirunelveli, Tamil Nadu, India M. Lall Department of Computer Science, Tshwane University of Technology, Gauteng, South Africa D. Magesh Hindusthan College of Engineering and Technology, Coimbatore, India Payal Mahipal Manipal University Jaipur, Jaipur, Rajasthan, India Anshita Malviya Department of Information Technology and Computer Application, MMMUT Gorakhpur, Gorakhpur, India X. Anitha Mary Karunya Institute of Technology and Sciences, Coimbatore, India Asirvatham Masilamani Department of ECE, SCAD College of Engineering and Technology, Tirunelveli, Tamilnadu, India Jill K. Mathew PG and Research Department of Mathematics, Mar Ivanios College (Autonomous), Thiruvananthapuram, Kerala, India Darshee Mehta Computer Science and Engineering Department, Nirma University, Ahmedabad, Gujarat, India Rupali A. Meshram PRMIT & R, Badnera, Amravati, India S. Mohamed Mansoor Roomi ECE, Thiagarajar College of Engineering, Madurai, Tamil Nadu, India
D. Mohanapriya Department of Electronics and Communication Engineering, Jai Shriram Engineering College, Avinashipalayam, Tiruppur, India M. Monicashree Computer Science, PES University, Bengaluru, India Tabassum N. Mujawar Research Scholar, Department of CSE, Basaveshwar Engineering College, Bagalkot, Karnataka, India; Department of CE, Ramrao Adik Institute of Technology, D Y Patil Deemed to be University, Navi Mumbai, Maharashtra, India Ankit Mundra Manipal University Jaipur, Jaipur, India Satyanarayana Murthy Department of ECE, V R Siddhartha Engineering College, Vijayawada, India G. Muthukumaran Department of Electrical and Electronics Engineering, School of Electrical Sciences, Hindustan Institute of Technology and Science, Chennai, India G. Muthuram Hindusthan College of Engineering and Technology, Coimbatore, India Atharva Naik Manipal University Jaipur, Jaipur, India Ajitesh Nair Computer Science, PES University, Bengaluru, India J. Nandhini Department of Electronics and Communication Engineering, Jai Shriram Engineering College, Avinashipalayam, Tiruppur, India Geetika Narang Department of CSE, TCOER, Pune, India Malik Nasibullah Integral University, Lucknow, India Sowmya Natarajan Department of ECE, SRM IST, Chennai, India V. Naveen Anna University, Madras Institute of Technology, Chennai, Tamil Nadu, India E. Naveenkumar Hindusthan College of Engineering and Technology, Coimbatore, India Janmenjoy Nayak Department of Computer Science, Maharaja Sriram Chandra Bhanja Deo University, Baripada, Odisha, India J. K. Nikhil Computer Science, PES University, Bengaluru, India Ezhil E. Nithila Department of ECE, SCAD College of Engineering and Technology, Tirunelveli, Tamilnadu, India Nishant N. Pachpor Amity University, Jaipur, India Ishan Padhy Computer Science, PES University, Bengaluru, India K. N. Pallavi NMAM Institute of Technology, Nitte, India Prakash S. Parsad Priyadarshini College of Engineering, Nagpur, India
Khushi Patel Department of Computer Engineering, Devang Patel Institute of Advance Technology and Research (DEPSTAR), Faculty of Technology and Engineering (FTE), Charotar University of Science and Technology (CHARUSAT), CHARUSAT Campus, Changa, Gujarat, India Hemant Patidar Electronics and Communication Engineering, Oriental University, Indore, India A. C. Pavithra Electronics and Communication Department, ATMECE, Mysore, Karnataka, India Danilo Pelusi Faculty of Communications Sciences, University of Teramo, Teramo, Italy Suresh Kumar Pemmada Department of Computer Science and Engineering, Aditya Institute of Technology and Management (AITAM), Tekkali, India; Department of Information Technology, Veer Surendra Sai University of Technology, Burla, India Vijayakumar Ponnusamy Department of ECE, SRM IST, Chennai, India Dipak Pralhad Mahurkar Electronics and Communication Engineering, Oriental University, Indore, India Korrapati Pravallika Department of Electronics and Computer Engineering, KLEF, Vaddeswaram, India K. Priya ECE, Thiagarajar College of Engineering, Madurai, Tamil Nadu, India A. Punidha Department of Computer Science and Engineering, Coimbatore Institute of Technology, Coimbatore, Tamilnadu, India R. Radha Karpagam Institute of Technology, Coimbatore, India Sreevatsan Radhakrishnan Department of Electronics and Communication Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India C. Rahul Division of Electronics Engineering, School of Engineering, CUSAT, Cochi, India Sayam Rahul Department of Electronics and Communication Engineering Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India A. Rajagopal Dept. of Electronics and Communication Engineering, Dayananda Sagar College of Engineering, Bangalore, Karnataka, India B. Rajathilagam Department of Computer Science and Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India Sujarani Rajendran Department of Computer Science and Engineering, Srinivasa Ramanujan Centre, SASTRA Deemed University, Kumbakonam, India
S. Ramani School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, Tamil Nadu, India S. R. Ramesh Department of Electronics and Communication Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India Dipak Ramoliya Department of Computer Science and Engineering, Devang Patel Institute of Advance Technology and Research (DEPSTAR), Faculty of Technology and Engineering (FTE), Charotar University of Science and Technology (CHARUSAT), CHARUSAT Campus, Changa, Gujarat, India M. Kameswara Rao Department of Electronics and Computer Engineering, KLEF, Vaddeswaram, India T. S. Raphiri Department of Computer Science, Tshwane University of Technology, Gauteng, South Africa Kottam Akshay Reddy Department of Electronics and Communication Engineering Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India J. Refonaa Sathyabama Institute of Science and Technology, Chennai, India J. Relin Francis Raj Department of ECE, SCAD College of Engineering and Technology, Tirunelveli, Tamilnadu, India K. Renuka Francis Xavier Engineering College, Tirunelveli, Tamil Nadu, India Basanagouda F. Ronad Department of Electrical and Electronics Engineering, Basaveshwar Engineering College (A), Bagalkot, India Ramya Sabapathi School of Computing, SASTRA Deemed University, Thanjavur, India Bhavna Saini Department of Information Technology, Manipal University, Jaipur, India R. Santhana Krishnan ECE Department, SCAD College of Engineering and Technology, Tirunelveli, Tamil Nadu, India B. Saritha Department of Electronics and Communication Engineering, Jai Shriram Engineering College, Avinashipalayam, Tiruppur, India H. Sarojadevi Nitte Meenakshi Institute of Technology, Bengaluru, India R. Satya Ranga Vara Prasad Sathyabama Institute of Science and Technology, Chennai, India P. Selvakumar Department of Electrical and Electronics Engineering, School of Electrical Sciences, Hindustan Institute of Technology and Science, Chennai, India N. Selvanathan Sona College of Technology, Salem, India
Radha SenthilKumar Anna University, Madras Institute of Technology, Chennai, Tamil Nadu, India Kaushal Shah Pandit Deendyal Energy University, Gandhinagar, Gujrat, India Neev Shah Pandit Deendyal Energy University, Gandhinagar, Gujrat, India Salim G. Shaikh Department of CE, SIT, Lonavala, India; Department of CSE, Amity University, Jaipur, Jaipur, India Nida Shakeel Department of Information Technology and Computer Application, MMMUT Gorakhpur, Gorakhpur, India Venkatesh Gauri Shankar Information Technology, Manipal University Jaipur, Jaipur, Rajasthan, India D. L. Shanthi BMS Institute of Technology and Management, Bengaluru, India Aman Sharma Computer Science, PES University, Bengaluru, India Kavya Shetty Netanalytiks Technologies Pvt Ltd, Bangalore, India Mohini Shivhare Department of Elecronics and Communication, Kanpur Institute of Technology, Kanpur, India Bharti Shukla Ajay Kumar Garg Engineering College, Ghaziabad, India Samiksha Shukla Christ University, Bangalore, India Yagnesh B. Shukla Gujarat Technological University, Ahmedabad, India S. Sindhura Computer Science, PES University, Bengaluru, India Sehaj Jot Singh Nitte Meenakshi Institute of Technology, Bengaluru, India Shikha Singh Gujarat Technological University, Ahmedabad, India Vatsal Sinha Computer Science, PES University, Bengaluru, India Mamta Soni Manipal University Jaipur, Jaipur, Rajasthan, India Kashish Sorathia Department of Information Technology, Devang Patel Institute of Advance Technology and Research (DEPSTAR), Faculty of Technology and Engineering (FTE), Charotar University of Science and Technology (CHARUSAT), CHARUSAT Campus, Changa, Gujarat, India M. Sri Hari Balaji Anna University, Madras Institute of Technology, Chennai, Tamil Nadu, India Palanivel Srinivasan School of Computing, SASTRA Deemed University, Thanjavur, India Sumit Srivastava Information Technology, Manipal University Jaipur, Jaipur, Rajasthan, India
A. Stephen Sagayaraj Bannari Amman Institute of Technology, Sathyamangalam, India P. Sudheesh Department of Electronics and Communication Engineering Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India Shraddha Suratkar Department of Computer Engineering and Information Technology, VJTI, Mumbai, India B. Suresh Kumar Department of CSE, Sanjay Ghodawat University, Kolhapur, India B. Suresh Department of ECE, V R Siddhartha Engineering College, Vijayawada, India Maithri Suresh Oracle, Bangalore, India Saurabh Suthar Manipal University Jaipur, Jaipur, India Md. Tabrez Nafis Department of Computer Science and Engineering, Jamia Hamdard, Delhi, India Himanshu V. Taiwade Department of Computer Science & Engineering, Priyadarshini College of Engineering, Nagpur, India Syamala Tejaswini Department of Electronics and Computer Engineering, KLEF, Vaddeswaram, India Gigi Thomas PG and Research Department of Mathematics, Mar Ivanios College (Autonomous), Thiruvananthapuram, Kerala, India Sweta Tripathi Department of Elecronics and Communication, Kanpur Institute of Technology, Kanpur, India Rohma Usmani Integral University, Lucknow, India Anand Vaidya SDM College of Engineering and Technology, Dharwad, India Mehul Vala Atmiya University, Rajkot, Gujarat, India Pham Van Thanh FPT University, Hanoi, Vietnam M. L. Vandana Computer Science, PES University, Bengaluru, India Mage Reena Varghese Department of Robotics Engineering, Karunya Institute of Technology and Sciences, Coimbatore, India P. Vasuki ECE, Sethu Institute of Technology, Madurai, Tamil Nadu, India B. S. Vijay Krishna CTO, nSmiles, Bengaluru, India S. Vijaya Shetty Nitte Meenakshi Institute of Technology, Bengaluru, India Nandalal Vijayakumar Department of ECE, Sri Krishna College of Engineering and Technology, Coimbatore, Tamilnadu, India
Vishal Vora Atmiya University, Rajkot, Gujarat, India Ankit Yadav Department of Information Technology, Manipal University, Jaipur, India Arun Kumar Yadav National Institute of Technology, Hamirpur, H.P., India Divakar Yadav National Institute of Technology, Hamirpur, H.P., India Shweta Yadav Dept. of Electronics and Communication Engineering, Dayananda Sagar College of Engineering, Bangalore, Karnataka, India Jainab Zareena Department of Management Studies, SCAD College of Engineering and Technology, Tirunelveli, India
Lung Ultrasound COVID-19 Detection Using Deep Feature Recursive Neural Network E. Naveenkumar, B. Dhiyanesh, D. Magesh, G. Muthuram, N. Selvanathan, and R. Radha
Abstract Coronavirus disease (COVID-19) is a universal illness that has been prevalent since December 2019. COVID-19, caused by a large family of viruses, can progress to illnesses far more serious than the flu. It has been declared a global pandemic that has greatly affected the global economy and society. Recent studies show great promise for lung ultrasound (LU) imaging of subjects infected by COVID-19. Nevertheless, the development of an objective, fast, and accurate automated method for evaluating LU images is still in its infancy. The existing algorithms for detecting COVID-19 from LU images are very time consuming and yield a high false rate for the early detection and treatment of affected patients; accurate detection of COVID-19 today usually takes a long time and is prone to human error. To resolve this problem, an Information Gain Feature Selection (IGFS) method combined with a Deep Feature Recursive Neural Network (DFRNN) algorithm is proposed to detect COVID-19 automatically at an early stage. The LU images are preprocessed using a Gaussian filter, quality-enhanced by the Watershed Segmentation (WS) algorithm, and then passed to the IGFS algorithm to select the finest COVID-19 features and improve classification performance. The proposed algorithm thus detects, from an LU image, whether a person is affected by COVID-19 in an efficient manner. The experimental results show improved precision, recall, F-measure, and classification performance with low time complexity and a lower false rate compared to previous algorithms.
E. Naveenkumar (B) · B. Dhiyanesh · D. Magesh · G. Muthuram Hindusthan College of Engineering and Technology, Coimbatore, India e-mail: [email protected] G. Muthuram e-mail: [email protected] N. Selvanathan Sona College of Technology, Salem, India e-mail: [email protected] R. Radha Karpagam Institute of Technology, Coimbatore, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_1
Keywords COVID-19 · Lung ultrasound (LU) · Information gain feature selection (IGFS) · Deep feature recursive neural network (DFRNN) · Preprocessing · Segmentation · Classification
1 Introduction

In December 2019, the world faced a global health disaster: a new coronavirus disease pandemic, commonly known as COVID-19. The outbreak of COVID-19 and the complications of containing it cause global health hazards and affect all aspects of personal life. Spread through human-to-human transmission by direct contact or droplets is a known property of the virus, with a basic reproduction number of 2.25–3.59 and an incubation period of 2–14 days. According to a study, 99% of COVID-19 patients experience runny nose, dry cough, body pain, dyspnea, confusion, headache, sore throat, high fever, and vomiting. There are various diagnostic methods for COVID-19 detection, among which Computed Tomography (CT) and Magnetic Resonance Imaging (MRI) have wide applicability, but these tests have limited sensitivity and accuracy. Quick detection of the highly contagious COVID-19 infection is a requisite to isolate patients, control the epidemic, and save many lives. Lung ultrasound (LU) imaging focuses on lesion characteristics such as shape, number, distribution, density, and associated symptoms. Deep learning (DL) has played a key role in medical imaging tasks, including CT imaging, and DL-based approaches are widely used to achieve a high level of accuracy in disease detection and prediction. This paper focuses on detecting COVID-19 infection from an LU image dataset, using the proposed algorithm to identify the presence of lung infection in patients in a classification mode that is most helpful for diagnosis. Initially, normal tissue is compared with infected tissue using image preprocessing steps for segmentation and finest-feature selection. The Deep Feature Recursive Neural Network (DFRNN) algorithm, which automatically detects the new coronavirus disease from lung ultrasound images, is then evaluated. The proposed pipeline is categorized into four parts: preprocessing, image segmentation, feature selection, and classification.
2 Related Work

COVID-19 is currently spreading worldwide and can be detected by RT-PCR tests and CT scans. The paper [1] proposed an Indefiniteness Elimination Network (IE-Net) to obtain accurate results about COVID-19 and remove the impact of different dimensions. However, the algorithm did not provide an accurate result about the presence or absence of COVID-19.
The authors of [2] investigated lung ultrasound imagery for COVID-19 prediction using DL techniques consisting of VGG19, InceptionV3, and ResNet50. The Infection Segmentation Deep Network (ISD-Net) algorithm was used to find the infected lung area in Computed Tomography (CT) images. Articles [3, 4] explored the convolutional neural network (CNN) algorithm and transfer learning (TL) to predict COVID-19 by determining different abnormalities in X-ray images. Likewise, the authors of [5] used Generative Adversarial Networks (GANs) to improve CNN performance in predicting COVID-19. Paper [6] suggested the use of CNN and Gravitational Search Optimization (GSO) algorithms to detect COVID-19, and in paper [7], GSO was used to identify the finest values while a CNN predicted COVID-19. DL methods have efficiently identified COVID-19 from CT scan and X-ray images [8]. Researchers in [9] employed a DL-based TL algorithm to identify the coronavirus using CT images; however, the method produced low classification accuracy. The authors of [10] introduced the Saliency-Based Region Detection and Image Segmentation (SBRDIS) approach to minimize noise and identify the infected class. A pre-trained CNN method for COVID-19 identification using the RYDLS-20 image dataset was presented in [11]. The authors of [12] evaluated the Mini-COVIDNet-based deep DNN algorithm to efficiently identify COVID-19 using a lung ultrasound image dataset. Similarly, paper [13] analyzed a deep CNN algorithm for automatically classifying X-ray images as COVID-19 positive or negative. Quick and trustworthy recognition of patients infected with COVID-19 is crucial to inhibit and limit its spread [14]. Researchers in [15, 16] utilized deep features with BAT optimization and the fuzzy K-nearest neighbor (FKNN) algorithm for automatic diagnosis of COVID-19. Diagnosing lung disease by analyzing chest CT images has become a significant tool for the prognosis of COVID-19 patients [17, 18]. A Markov Model (MM) and the Viterbi Algorithm (VA) were used to find the affected regions, and a Support Vector Machine (SVM) was used to classify images as COVID-19 or non-Covid in [19]. A powerful deep CNN model was proposed to identify COVID-19 using openly available datasets [20]. The purpose of [21] was to survey various medical imaging methods, such as CT and X-ray, based on deep learning technology and provide an overview of recently developed systems. An Adaptive Thresholding Technique (ATT) and a Semantic Segmentation (SS) algorithm were designed to find infected lungs using the LIDC-IDRI image dataset [22, 23]. The authors of [24] propose a transfer learning framework with deep uncertainty perception to detect COVID-19; the extracted features identifying the status of COVID-19 are handled by various machine learning and statistical modelling techniques. The MM and VA algorithms are used to find the affected regions, and the SVM algorithm is used to classify COVID-19 or non-Covid cases [25]. Given a chest X-ray of a patient, the model of [26] must accurately determine whether COVID-19 is present.
3 Proposed Methodology

This paper presents the Deep Feature Recursive Neural Network (DFRNN) algorithm for accurately detecting the presence or absence of COVID-19 with low time complexity and a low false rate. The method is introduced to detect COVID-19 infections more quickly by analyzing lung ultrasound (LU) images at an early stage. Figure 1 shows the overall architecture for COVID-19 identification using the LU image dataset. First, the collected LU image is fed into the Gaussian filter algorithm to remove noise and resize the image. The preprocessed image is then passed to the Watershed segmentation algorithm to enhance image quality, and the finest features are selected using Information Gain Feature Selection (IGFS). Finally, the proposed algorithm classifies the image and detects the existence of COVID-19.
3.1 Gaussian Filter

Since the features of lung ultrasound (LU) images have different intensities and gray levels, image preprocessing must be applied before using such images as classifier inputs. Image normalization in the preprocessing step is aimed at reducing noise and resizing the images. A Gaussian filter is a linear smoothing filter whose weights follow the shape of the Gaussian function. Gaussian smoothing is performed to eliminate noise and adjust the size of the LU images. The Gaussian filter smooths by replacing each image pixel with the weighted average of adjacent pixels, so the weight given to adjacent pixels diminishes monotonically with distance from the center pixel. The equation below is used
Fig. 1 Proposed framework for COVID-19 detection using lung ultrasound image dataset
to calculate the image smoothing filter $G_s(x, y)$:

$$G_s(x, y) = e^{-\frac{x^2 + y^2}{2\sigma^2}} \qquad (1)$$
where x and y are the image pixel coordinates and σ is the standard deviation. The denoised image is then computed as

$$N_r(x, y) = \frac{1}{n(x, y)} \sum_{i \in \varphi(x, y)} a_w(x, y)\, o(i) \qquad (2)$$
Equation (2) expresses the noise-removal process $N_r(x, y)$, where $a_w(x, y)$ is the average weight assigned to remove noise, $n(x, y)$ is the normalizing constant, $o$ is the original noisy image, $\varphi(x, y)$ is the neighborhood window around pixel $(x, y)$, and $N_r$ is the normalized denoised image.
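As a concrete illustration of this preprocessing stage, the sketch below applies Gaussian smoothing and resizing with OpenCV. The paper does not publish its implementation, so the 5 x 5 kernel, the σ value, and the 224 x 224 target size here are illustrative assumptions rather than the authors' settings.

```python
import cv2

def preprocess_lu_image(path, size=(224, 224), sigma=1.0):
    """Denoise and resize one lung-ultrasound frame (cf. Eqs. 1-2)."""
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    # Gaussian smoothing: each pixel becomes a weighted average of its
    # neighbors, with weights decaying with distance per Eq. (1).
    img = cv2.GaussianBlur(img, ksize=(5, 5), sigmaX=sigma)
    # Normalize intensities to [0, 1] and resize for the later stages.
    img = cv2.resize(img, size) / 255.0
    return img
```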
3.2 Watershed Segmentation (WS)

This phase proposes a Watershed Segmentation (WS) approach to overcome the problem of uneven intensity in the image. The proposed WS algorithm uses a new energy function that can effectively extract objects from complex backgrounds despite severe non-uniformity in the preprocessed LU images. The proposed WS stems from the fact that, when detecting a lung infection, the area of infection is first identified roughly and then its contours are extracted accurately based on local appearance. Therefore, WS first predicts the coarse area and then implicitly models the boundary through reverse attention and edge constraints, clearly enhancing boundary recognition. Equation (3) expresses LU image edge detection $e_d$:

$$e_d = \frac{1}{1 + |\nabla g_\sigma * N_r(x, y)|} \qquad (3)$$
where $\nabla$ is the gradient operator, $g_\sigma$ is a Gaussian kernel with standard deviation σ, $N_r(x, y)$ is the preprocessed image, and the convolution $*$ suppresses the intense background of the preprocessed image.

$$R_a = \bar{e}_d - e_d(x, y) \qquad (4)$$
Equation (4) identifies the coarse COVID-affected area $R_a$, where $\bar{e}_d$ denotes the mean value of $e_d(x, y)$ over the image.

$$W_s = R_{et}(\emptyset) + R_{it} \qquad (5)$$
Equation (5) gives the segmentation energy $W_s$ used to predict the COVID-affected region, where $R_{et}(\emptyset)$ is the external gradient-color information and $R_{it}$ is the internal evolution term of the level set.
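The paper describes WS at the level of its energy terms rather than as runnable code, so the sketch below falls back to the standard marker-based watershed recipe in OpenCV as one plausible realization; the Otsu threshold and the 0.4 distance-transform cutoff are illustrative assumptions, not values from the paper.

```python
import cv2
import numpy as np

def watershed_segment(gray):
    """Marker-based watershed over a denoised LU frame in [0, 1]."""
    img8 = (gray * 255).astype(np.uint8)
    # Coarse foreground/background split via Otsu thresholding.
    _, binary = cv2.threshold(img8, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # Sure background by dilation, sure foreground by distance transform.
    kernel = np.ones((3, 3), np.uint8)
    sure_bg = cv2.dilate(binary, kernel, iterations=3)
    dist = cv2.distanceTransform(binary, cv2.DIST_L2, 5)
    _, sure_fg = cv2.threshold(dist, 0.4 * dist.max(), 255, 0)
    sure_fg = sure_fg.astype(np.uint8)
    unknown = cv2.subtract(sure_bg, sure_fg)
    # Label markers; watershed floods from them, marking boundaries -1.
    _, markers = cv2.connectedComponents(sure_fg)
    markers = markers + 1
    markers[unknown == 255] = 0
    color = cv2.cvtColor(img8, cv2.COLOR_GRAY2BGR)
    markers = cv2.watershed(color, markers)
    return markers  # -1 on region boundaries
```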
3.3 Information Gain Feature Selection

In this phase, the segmented image is fed into Information Gain Feature Selection (IGFS) to extract the finest information from the segmented LU image of COVID-19. The IGFS algorithm is no longer used as a black box: in binary particle swarm optimization, the interrelationship coefficient is added as the feature coefficient that determines the image position, so features carrying additional information are more likely to be selected. The performance of a feature subgroup, i.e., the feature combination with the highest score, is validated by the classifier.

Algorithm steps
Input: Segmented LU images W_s
Output: Optimal original features (f)
Begin
Step 1: Initialize each feature image
Step 2: Calculate the feature weights
  Set all feature weights W(R) = 0.0
  For n = 1 to m do
    Randomly select a feature R
    Find the information of the feature (I_f)
    For each R = Class(W_s) do
      Find the coefficient feature weight (m_f)
    End for
    For R = 1 to feature weights do

$$W(R) = W(R) - \sum_{a=1}^{n} \frac{\mathrm{diff}(R, W_s, D)}{m \times I} + \sum_{R=\mathrm{class}(F)} \left[ \frac{m(R)}{1 - m(\mathrm{class}(F))} \sum_{a=1}^{n} \frac{\mathrm{diff}(R, W_s, D)}{m \times I} \right]$$
    End For
Step 3: Calculate each image's maximum feature weight (m_f)
Step 4: Update the best feature set
Step 5: Update each combined feature value
Step 6: Obtain the finest feature result (f) for COVID-19
End

Here, W(R) is the weight of the randomly selected feature R over the class feature set, n is the number of images, N is the image size, and D is the feature dimension; the IGFS-based feature selection analyzes the maximum number of features and updates the feature set of the images.
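The weight update above follows a ReliefF-style scheme, while information gain itself is commonly estimated via mutual information. As a hedged stand-in for IGFS (the paper's exact scoring code is not published), the sketch below ranks flattened segmented-image features with scikit-learn and keeps the top k; the function name igfs_select and the default k are illustrative.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

def igfs_select(features, labels, k=64):
    """Rank feature columns by estimated information gain (mutual
    information with the class labels) and keep the top k."""
    scores = mutual_info_classif(features, labels, random_state=0)
    top = np.argsort(scores)[::-1][:k]   # indices of the best features
    return features[:, top], top
```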
3.4 Deep Feature Recursive Neural Network

In this phase, the feature-selected image is used to train the proposed DFRNN algorithm to detect COVID-19. Using LU imaging, the DFRNN algorithm can identify a COVID-19 attack by detecting lung consolidation and tissue changes. The DFRNN algorithm is gaining popularity due to its improved prediction accuracy; in addition, it is used to eliminate unwanted information that negatively impacts accuracy.

Algorithm steps
Input: Finest features (f) LU image
Output: Return optimized result
Begin
Import LU finest-features image (f)
Set the initial layers based on the feature weights
Set the DFRNN limitations ε, μ, β
Convert the feature images for training
Train the feature weights of the images
For ε = 1 to ∈ do
  Randomly select the image features from T1
  Compute the loss ranges
  Update the classification weight rates ω∗
Return optimized result
End

The above algorithm steps provide a covid-positive or covid-negative result in an efficient manner. The infection is often bilateral and predominant in the lower area, and the size of the infected area depends on the patient's condition: in mild cases, lesions may appear small, whereas in severe cases, lesions may be widespread. Therefore, the DFRNN algorithm deals with changes in lesion size and location.

$$\varepsilon \downarrow 0: \quad (1 + a)^n = 1 + \frac{na}{1!} + \frac{n(n-1)a^2}{2!} + \cdots$$
where μ refers to the performance rate, ε is the iteration stage, ∈ is the maximum number of iterations, and β is the number of images covered per iteration.
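The paper does not specify the exact DFRNN layer configuration, so the following is only a minimal recurrent-classifier sketch in Keras; every layer size, the input shape, and the training call are assumptions.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Minimal recurrent classifier sketch (not the authors' exact DFRNN):
# selected features are fed as a sequence and mapped to a covid probability.
model = keras.Sequential([
    layers.Input(shape=(16, 16)),            # assumed feature-sequence shape
    layers.SimpleRNN(64, return_sequences=True),
    layers.SimpleRNN(32),
    layers.Dense(1, activation="sigmoid"),   # covid-positive probability
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(X_train, y_train, epochs=20, batch_size=32)  # placeholder call
```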
4 Simulation and Results

The proposed DFRNN algorithm is implemented in the Python language in an Anaconda environment using the LU image dataset and is compared with other algorithms such as the convolutional neural network (CNN) and generative adversarial networks (GANs). Table 1 lists the simulation parameters for the proposed implementation compared with the previous algorithms. Table 2 presents the classification accuracy for coronavirus detection; it is evident that the proposed algorithm achieves higher performance than the existing algorithms. Figure 2 portrays the classification accuracy performance for coronavirus detection. Table 3 depicts the analysis of precision performance for accurate detection of coronavirus; precision measures how many of the predicted positives in the test dataset are correct. Figure 3 represents the precision performance for coronavirus detection. Table 4 shows the analysis of recall performance; recall measures how many of the actual true positives the classifier predicts. Figure 4 shows the recall performance for coronavirus detection. Table 5 shows the false rate performance for coronavirus detection; the proposed algorithm's false rate is low compared to the other existing methods.

Table 1 Details of simulation parameters

Parameters                      Values
Name of the dataset             Lung ultrasound image dataset
Language                        Python
Tool used                       Anaconda
Number of images in dataset     800
Training dataset                600
Testing dataset                 200

Table 2 Classification of accuracy performance

No. of images   CNN (%)   GANs (%)   DFRNN (%)
1               63        67         71
2               69        73         77
3               74        80         83
4               77        83         92
Fig. 2 Exploration of classification accuracy performance (accuracy in % vs. number of images, for CNN, GANs and DFRNN)

Table 3 Analysis of precision performance

No. of images   CNN (%)   GANs (%)   DFRNN (%)
1               61        67         71
2               68        71         78
3               74        73         81
4               77        85         90
Fig. 3 Exploration of precision performance (precision in % vs. number of images, for CNN, GANs and DFRNN)

Table 4 Analysis of recall performance

No. of images   CNN (%)   GANs (%)   DFRNN (%)
1               64        68         72
2               67        71         79
3               73        78         81
4               79        84         91
Fig. 4 Exploration of recall performance (recall in % vs. number of images, for CNN, GANs and DFRNN)

Table 5 Analysis of false rate performance

No. of images   CNN (%)   GANs (%)   DFRNN (%)
1               37        33         29
2               31        27         23
3               26        20         17
4               23        17         8
Figure 5 depicts the analysis of false rate performance for coronavirus detection, and the analysis of the time complexity performance is shown in Fig. 6.

Fig. 5 Analysis of false rate performance (false rate in % vs. number of images, for CNN, GANs and DFRNN)
Fig. 6 Analysis of time complexity performance (time in seconds vs. number of images, for CNN, GANs and DFRNN)
5 Conclusion

In the current COVID-19 pandemic, medical services are often saturated; automatic diagnostic imaging tools can therefore significantly decrease the burden on a medical system with a limited number of specialists. In this paper, the proposed Deep Feature Recursive Neural Network (DFRNN) algorithm classifies a case as COVID-19-affected or non-COVID using the lung ultrasound image dataset. The first stage of the algorithm is a Gaussian filter to reduce noise and resize the image; the preprocessed image is then supplied to the Watershed Segmentation (WS) algorithm to enhance image quality, and the resulting image is fed into the Information Gain Feature Selection (IGFS) approach to increase classification accuracy. Finally, the proposed DFRNN algorithm effectively classifies, at an early stage, whether an individual is COVID-19-affected from the lung ultrasound image. Another advantage of the DFRNN algorithm is that it is very versatile to deploy and achieves a high COVID-19 detection accuracy of 92%, precision of 90%, recall of 91%, a false rate of 8%, and a classification time complexity of 22 s. Hence, the DFRNN algorithm proves superior to CNN and GANs in detecting lung infection.
Predicting New York Taxi Trip Duration Based on Regression Analysis Using ML and Time Series Forecasting Using DL S. Ramani, Anish Ghiya, Pusuluri Sidhartha Aravind, Marimuthu Karuppiah, and Danilo Pelusi
Abstract The taxi fare and the duration of a trip are highly dependent on many factors, such as traffic along the route or late-night drives, which might be a little slower due to restricted night vision, and many more. In this research work, an attempt is made to visualize the various factors that might affect trip durations, such as day of the week, pickup location, drop-off location and time of pickup. The research work mainly analyses the dataset obtained from the NYC Taxi and Limousine Commission (TLC), which contains the data of taxi trips from January 2016 to June 2016 with GPS coordinates. The analysis of the data is performed, and the prediction of the taxi trip duration is done using multiple machine learning and deep learning models. The models are analysed based on the mean squared error and the R² score, computed both without and with scaling of the data. The maximum R² score was attained by the recurrent neural network (RNN) using time series analysis, with a score of 0.99, versus 0.97 with the XGBRegressor; an increment of 0.6% was observed when normalizing the target using a log transform in the regression setting.

Keywords New York City Taxi and Limousine Commission · Regression · Scaling · Logarithmic transformation · Machine learning · Deep learning · Mean squared error (MSE) · R² values
S. Ramani · A. Ghiya · P. S. Aravind School of Computer Science and Engineering, Vellore Institute of Technology, Vellore , Tamil Nadu 632014, India e-mail: [email protected] P. S. Aravind e-mail: [email protected] M. Karuppiah Department of Computer Science and Engineering, SRM Institute of Science and Technology, Delhi-NCR Campus, Ghaziabad, Uttar Pradesh 201204, India D. Pelusi (B) Faculty of Communications Sciences, University of Teramo, Teramo, Italy e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_2
1 Introduction

New York, also known as the 'concrete jungle', is riddled with a large number of one-ways, small side streets and an almost incalculable number of pedestrians at any given point in time, not to mention the number of cars, motorcycles and bicycles clogging up the roads. This, when combined with an urgency to get from point A to point B, will make oneself late for whatever one needs to be on time for. An average taxi organization faces a typical issue of proficiently assigning taxis to travellers so that the service is smooth and bother-free. One of the fundamental issues is determining the duration of the current trip so that it can anticipate when the taxi will be free for the following trip.

The solution to getting from A to B when living in a city like New York (without losing your mind) is to take a taxi, Uber, Lyft, etc. One does not need to stress about the traffic or pedestrians and can have a moment to do something else, like catch up on emails. Although this sounds simple enough, it does not mean you will get to your destination in time. The driver needs to take the shortest trip possible for one to make it on time to their destination. If route A is X kilometres longer but gets you there Y minutes faster than route B would, one would take route A over B.

The New York City Taxi and Limousine Commission (TLC) deals with the licencing of taxicabs operated by private companies in New York, along with overseeing about 40,000 other for-hire vehicles. Taxicab vehicles must each have a medallion to operate and are driven an average of 180 miles per shift. As of March 14, 2014, there were 51,398 individuals licenced to drive medallion taxicabs and 13,605 taxicab medallion licences in existence; by July 2016, the number had dropped slightly to 13,587 medallions, or 18 fewer than the 2014 aggregate. Taxis collectively make between 300,000 and 400,000 trips a day.

The approach presented in this paper uses a dataset with a considerable number of rows (i.e. 1,458,644 trip records) and presents predictive analysis on it. This paper employs ML models for predictive analysis on the datasets mentioned in Sect. 3.1; the time series data from the target variable (trip_duration) is used as input for LSTMs and RNNs, and a comparison between these two modes of analysis is presented.

With a taxicab network this big, along with the huge New York population of nearly 8.4 million people, a huge network of traffic routes is created, generating a large amount of traffic in different places. Thus, it is necessary to know how the traffic behaves at different times and on different days for an organization to efficiently distribute its taxi cabs so as to obtain maximum profit, which requires the cabs to be efficient in dropping off passengers.

To know which route is the best one to take, we need to be able to predict how long the trip will last when taking a specific route. Knowing the duration that the trip would take is also essential in order to take into consideration factors like traffic, weather and many more. Performing data analysis can help us
gain insights about the data and determine how various factors relate to the target variable, trip duration. This study can also give a good visualization of how taxi traffic moves and how people move within the city, providing a better understanding of popular places and thereby helping increase the business of the targeted taxi companies. The study is significant in that it informs organizations about the present situation; it likewise permits individuals to observe the current state of the place they are heading to, and the prediction facility allows people to prepare themselves with predictive cases for the future. With the help of the data obtained from the NY Taxi and Limousine Commission (TLC), this becomes easier, as we have access to a huge amount of data collected over a period of 7½ years, which helps bring about a much more accurate prediction with the help of machine learning (ML) and artificial intelligence (AI). Since this paper focuses on the predictive-analysis aspect of NYC taxi trip durations based on various input features like weather, location, etc., it differs from the travelling salesman problem (TSP), which emphasizes the optimum path to be followed in order to reach the destination in the best possible manner with the least use of resources.
2 Literature Review

This particular paper highlights the prevailing focus on the dataset of NYC taxi trips and fares. Big data has been under the limelight for analysing such a massive dataset since the early 2000s; there were around 180 million taxi rides in the city of New York in the year 2014 alone. The intent of analysing this data serves several purposes, like avoiding traffic, lowering rates where services are not functioning, providing more frequency than a single cab at prime locations, and many more. The paper focuses on analysis along several areas: analysis on individuals, taking fare, distance, time and efficiency; analysis on regions, taking pickup and drop-off locations; and analysis based on fare [1]. The vital section of the paper involves a visual query model that allows users to quickly select data slices and use them. The mentioned analysis is performed to help the vendor provide more taxis where necessary per region and to use the other analyses to make the system work more efficiently.

Another paper deals with the analysis of sales of iced products as affected by the variation of temperature, utilizing data collected over previous years. The paper builds a regression analysis model on data that has been cleansed. Python 3.6, being a completely object-oriented scripting language, allows for combining the essence and design rules of various languages, making it easier to use in building large-scale software [2]. The paper uses Python 3.6 with the Pandas analysis package to set up a linear regression model targeting the effect of temperature variation on company sales. It mainly analyses a simple linear regression model, which studies the relation between an independent variable and a dependent variable. While the paper presents an essential use of Python as a programming
language, we can also see that the linear regression model used may not be of great significance in the case of big data analysis.

Another paper primarily aims to improve the ceramic sector, with its performance indicators analysed via multiple regression analysis, which belongs to the multivariate methods. This type of analysis is used for modelling and analysing several variables; it extends regression analysis by describing the relationship between dependent and independent variables, which can be used for predicting and forecasting [3]. This model can be much more realistic than a uni-factorial regression model, as it is rarely the case that only a single variable affects the data. The results showed that three of the analysed variables are very significant predictors of the magnitude of profit, and significant correlations were found between the analysed indicators. While that paper performs a multivariate correlation analysis limited to only 3 variables, we expand this to deal with nearly 30 variables, bringing the best possible result for the analysis performed and the further model created from it.

Throughout past years, the forecasting domain has been impacted by the fact that specialists have dismissed neural networks (NNs) as being non-competitive, whereas NN enthusiasts have presented few new and improved NN models, generally without solid empirical assessments when contrasted with streamlined univariate statistical techniques [4]. For feature-based forecasting, one paper used XGBoost as a meta-model to assign weights to forecasting variables; in [5], their model outperforms all other individual models in the ensemble. The models used in that paper include the model we have used as our main model for training, and we have also included other boosting models. The idea of using ML models for forecasting was obtained from [6], wherein the authors use multiple basic ML models to make predictions on the M3 dataset and evaluate relative performance across multiple horizons. While the paper provides insight into the advantages that ML models offer, further enhancing them with the help of a booster model furthers the cause of working on larger datasets with ease.

The deep learning approach has taken a more positive turn in the forecasting and analysis of network traffic, as in [7–9], where the authors analyse as well as predict future network traffic using LSTM and RNN. Since the major comparison is between these models in general, for this paper we have also used the same two network architectures, i.e. LSTM and RNN, for analysing the duration of future taxi trips. Traffic analysis has been a hot topic of research, especially in the network domain, and can easily be incorporated into the real-life scenario of taxi trips, since these can be visualized as a network.

For industrial datasets, it was observed in [10] that normalizing plays a vital role when using ML models for predictions; it is termed crucial for a pre-processing pipeline, especially for larger datasets, as it can bring down computation drastically while improving the performance of the models to a great extent. For skewed data, log transforms prove to be beneficial, especially for right-skewed data. Although, as put by the authors of [11], it might not always be the case; analysis needs to be done on the data after the log transform to confirm that the outputs are as required.
If the data output from the log transform is not normalized and presents another skew, then it would be better not to use the log transform; but if it does provide the right output, then it makes complete sense to use the log transform for a skewed dataset, as it might improve performance and also give normalized outputs while removing skew. As put forward by the authors of [12], accurate time series forecasting plays a critical role from a business-operations standpoint, as it covers predicting customer growth, understanding trends and anomaly detection. LSTM was considered for capturing nonlinear traffic dynamics, demonstrating the capability of time series prediction with long temporal dependency. The authors in [13] utilize local information to train an LSTM neural network to predict the traffic status, as shown in Fig. 1. The real-world traffic network being highly dynamic, a deep stacked bidirectional and unidirectional LSTM is used to predict the traffic state. The model takes spatial time series data and outputs the predicted values. The LSTM cell includes the input layer x_t and the output layer h_t, with each cell providing an output state. While the paper provides an avenue for the expansion of the concept to a large network with the help of a hierarchical method, the use of a CatBoostRegressor or an XGBRegressor could have dealt with the sophisticated network being generated in a much more efficient way, as it would be able to handle the increased complexity of the data along with its size. This gated structure enables long-term dependencies to be learned by the LSTM, allowing the useful data to pass along the network.

$$h_t = o_t \times \tanh(C_t) \qquad (1)$$

$$C_t = f_t \times C_{t-1} + i_t \times \tilde{C}_t \qquad (2)$$

These are used to calculate the cell output state and the layer output. Here, \tilde{C}_t is the output generated from performing the calculation for each iteration with the help of the sequential data. Figure 1 depicts the LSTM block, where each timestamp indicates the input; its base design is similar to that mentioned in [13].

Fig. 1 Hidden layer of LSTM blocks
3 Proposed Methodology

3.1 Data Collection

The datasets were collected from Kaggle. For our research work, three datasets were obtained, namely:
1. taxi trip duration [14],
2. distance dataset [15],
3. weather dataset [16].
3.2 Data Pre-processing

The data obtained from these sources were not processed (i.e. the datatypes of some attributes were mismatched; for example, dates were in object format whereas they should have been in date–time format). All the datasets are then merged into one to get a complete data frame that can be used to perform all further analysis. Datasets 1 and 2 from the above section are merged based on the ID column, whereas the weather dataset is merged on the basis of date. After analysing the data, it was observed that the target variable had a right skew in it, and so, to remove the effects of the skew, logarithmic transformations are applied. Also, the attributes with NA values were detected and filled with 0.
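A minimal pandas sketch of this merging step follows; the file and column names are assumptions based on the Kaggle datasets listed in Sect. 3.1, not confirmed by the paper.

```python
import pandas as pd

# Merge the three sources into one frame: trips + distance join on id,
# weather joins on the calendar date of pickup.
trips = pd.read_csv("nyc_taxi_trip_duration.csv")
distance = pd.read_csv("osrm_distance.csv")
weather = pd.read_csv("nyc_weather_2016.csv")

trips["pickup_datetime"] = pd.to_datetime(trips["pickup_datetime"])
trips["date"] = trips["pickup_datetime"].dt.date

df = trips.merge(distance, on="id")   # datasets 1 and 2 join on the ID column
df = df.merge(weather, on="date")     # weather joins on date
df = df.fillna(0)                     # NA values filled with 0
```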
3.3 Feature Engineering

Date–time attributes such as day of the week, day of the month, month, etc. can be extracted from the data. K-means clustering is performed on the dataset to get a cluster analysis of the pickup and drop-off latitudes and longitudes. This approach is adopted so as to apply the cluster-then-predict concept in the models to attain higher accuracies. Figure 2 represents the first 50,000 data points present in the dataset and how they are clustered into 50 clusters. The objective function of the K-means algorithm is as follows:

$$J = \sum_{i=0}^{50000} \sum_{k=1}^{50} \mathrm{weight}_{ik} \, \lVert x_i - u_k \rVert^2 \qquad (3)$$
where x_i is a data point belonging to the kth cluster, u_k is the centroid of x_i's cluster, and weight_{ik} is the weight trained for the kth cluster in the ith training example. From Fig. 3a, it is clear that the data are right-skewed (positively skewed distribution). Using a log transform on the target variable, the skew is normalized, as visualized in Fig. 3b, c for the log and log(1 + x) transforms, respectively.
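The clustering step of Eq. (3) can be sketched with scikit-learn as follows; the synthetic coordinate array stands in for the dataset's pickup latitude/longitude columns.

```python
import numpy as np
from sklearn.cluster import KMeans

# Cluster-then-predict sketch on the first 50,000 pickup coordinates.
coords = np.random.rand(50000, 2) * [0.2, 0.2] + [40.6, -74.1]  # placeholder
kmeans = KMeans(n_clusters=50, random_state=0).fit(coords)
cluster_feature = kmeans.labels_   # new categorical feature, one per trip
```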
Fig. 2 Representation of clusters using K-means
$$x_i = \log(1 + x_i) \qquad (4)$$

$$x_i = \log(x_i) \qquad (5)$$
Here, x_i is the data point, and the log transforms are applied using Eqs. (4) and (5), respectively.
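Equations (4) and (5) correspond directly to NumPy's log1p and log; a small self-contained sketch (with placeholder rows) is shown below.

```python
import numpy as np
import pandas as pd

# Target normalization per Eqs. (4)-(5); log1p is numerically safer for
# durations near zero. Placeholder rows stand in for the merged frame.
df = pd.DataFrame({"trip_duration": [455, 663, 2124, 429]})
df["log1p_duration"] = np.log1p(df["trip_duration"])   # Eq. (4)
df["log_duration"] = np.log(df["trip_duration"])       # Eq. (5)
```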
Fig. 3 Distribution of trip durations: (a) without log transform, (b) with log(1 + x) and (c) with log(x) transforms

3.4 Machine Learning

To achieve a clean dataset, the data is processed using K-means clustering and then normalized using log transformations. Machine learning models, namely XGBRegressor, LGBRegressor and CatBoostRegressor, are used to predict the taxi trip durations as a regression problem. To check neural network performance, the problem is also analysed from a time series perspective. For the deep learning models in this paper, MinMaxScaler was used to bring all the attributes into the range 0–1.

LSTM Network: The LSTM network used for the predictions in this paper follows a three-layered block system where each block consists of 2 LSTM layers followed by 1 dropout layer, which is used for regularization. The three blocks are placed one after the other, and the number of neurons is reduced from block to block, starting with 64 neurons and ending with a final set containing 16 neurons each. A final dense layer is then used for the predictions. The input to this model is of shape (1, 42), which is 1 column with 42 days of previous records.

RNN Network: A simple RNN network with a three-block architecture is used in this paper. Each block consists of two simple RNN modules and one dropout module used for regularization, wherein 20% of the neurons are effectively dropped from the network at random to avoid overfitting. The input remains the same as for the LSTM network.
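A Keras sketch of the described three-block LSTM follows. The paper states only the start (64) and end (16) widths, so the middle width of 32 and the dropout rate are assumptions; the shape (1, 42) follows the text.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Three blocks of (LSTM, LSTM, Dropout), widths 64 -> 32 -> 16 (32 assumed),
# then a dense head; input shape (1, 42) per the description above.
def build_lstm():
    model = keras.Sequential([layers.Input(shape=(1, 42))])
    for units in (64, 32, 16):
        model.add(layers.LSTM(units, return_sequences=True))
        model.add(layers.LSTM(units, return_sequences=True))
        model.add(layers.Dropout(0.2))        # regularization per block
    model.add(layers.Flatten())
    model.add(layers.Dense(1))                # predicted trip duration
    model.compile(optimizer="adam", loss="mse")
    return model
```

The RNN variant swaps each LSTM layer for layers.SimpleRNN with the same shapes.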
3.5 Feature Analysis

Feature analysis is performed to check whether the engineered features are correlated with each other, which might cause issues when fitting the machine learning models. A value of 1 depicts a perfect linear relationship, and 0 shows no linear relationship. Figure 4a shows a heatmap of correlations between all features from the original dataset and after feature engineering; Fig. 4b shows the correlation with the target variable.
Fig. 4 Pearson coefficient
As expected, it is observed that total_distance has the maximum positive correlation with trip_duration. The pickup and drop-off latitudes, as well as the pickup and drop-off longitudes, each have different correlations to the target variable. Speed and cluster are also significantly correlated with the target variable.
3.6 Evaluation

To evaluate the models, two metrics were chosen: the R² score and the mean squared error (MSE). Each model is trained on 90% of the dataset and evaluated on the remaining 10%.

$$R^2\ \mathrm{score} = 1 - \frac{\sum_i \left( y_{\mathrm{test}}[i] - \mathrm{pred}[i] \right)^2}{\sum_i \left( y_{\mathrm{test}}[i] - \mu \right)^2} \qquad (6)$$

This score can be explained as the variance of the model's predictions versus the total variance. A low value indicates that the model's predictions are poorly correlated with the targets; hence, the models used in this paper aim to attain a high R² score.

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( \mathrm{pred}_i - \mathrm{val}_i \right)^2 \qquad (7)$$

The MSE score evaluates the sum of the variance of the predictions and the squared bias of the variables at play, where pred_i is the prediction made by the model and val_i is the test-set value of the target variable.
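The evaluation loop of Eqs. (6) and (7) can be sketched with scikit-learn and XGBoost; the X and y arrays below are placeholders for the engineered features and the trip_duration target.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from xgboost import XGBRegressor

# 90/10 split, then R^2 and MSE per Eqs. (6)-(7).
X = np.random.rand(1000, 10)   # placeholder engineered features
y = np.random.rand(1000)       # placeholder trip_duration target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1)
pred = XGBRegressor().fit(X_train, y_train).predict(X_test)
print("R2 score:", r2_score(y_test, pred))
print("MSE:", mean_squared_error(y_test, pred))
```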
4 Results

From Table 1, it is clearly noticeable that the XGBRegressor was the best performing model, with an R² score of 0.97, which is 12% more than the LGBRegressor and 5% more than the CatBoostRegressor (as shown in Fig. 5). From Fig. 5, we can clearly see that the model generates a non-noisy, non-heteroscedastic error distribution across the range from the lowest to the highest values of y_predicted. The prediction error plot shows the test-set targets from the dataset against the predictions made by the model; it depicts the residuals on the vertical axis against the dependent variable on the horizontal axis, allowing one to detect the regions susceptible to error. The error plot in Fig. 5 shows the variance of the error of the regressor model, and from the figure we can see a fairly random distribution of the data targets in the two dimensions.
Fig. 5 Prediction errors presented from XGBoost
Fig. 6 Comparison of R 2 score
From the implementation, it was observed that the log transforms perform the best; the results are shown in Table 2. In general, it was observed that, with and without scaling, the R² score was the same for the three models used for the regression task, possibly because boosting models are used for the predictions (as shown in Fig. 6). MSE values were the least in the case of the XGBoost model, with a value of only 396, while both the other models are in the vicinity of 1100. From Fig. 6, it is clearly seen that the XGBoost model scores better than all the other models; a major point to note is that with the log transform (i.e. the normalized target) results have been better both for the R² and for the MSE score, except for the CatBoost model, which performs below expectations.
4.1 Deep Learning Model Performance

From Fig. 7, it is clearly notable that the RNN model has performed better, especially because its MSE value is far lower than that of the LSTM model; both of these models performed better than the regression models, with the LSTM also outperforming the ML models and the RNN outperforming the LSTM (values shown in Table 3).

Fig. 7 R² score and MSE values for the deep learning models

Table 1 Results for R² score and MSE without log transform

Model              MSE        R² score
LGBRegressor       1158.154   0.855
XGBRegressor       459.85     0.9787
CatBoostRegressor  836.87     0.9295

Table 2 Results for R² score and MSE with log transform

Model              MSE       R² score
LGBRegressor       1105.9    0.877
XGBRegressor       396.58    0.984
CatBoostRegressor  1089.6    0.881

Table 3 Results for R² score and MSE for the deep learning models

Model   MSE       R² score
LSTM    0.74508   0.97986
RNN     0.48158   0.991586
XGBoost is the best of the ML models that were trained, and RNN is the best of the deep learning models. When comparing the two models indicated above, we find that the RNN performs somewhat better, i.e. by 2% (as shown in Tables 1 and 2).
5 Conclusions

An approach was proposed to predict taxi trip durations using regression analysis with machine learning and time series forecasting with deep learning models. We first adopted the idea of using a logarithmic transform to normalize the target value. Along with that, new features were engineered, including date–time features like day, time, etc., and clusters obtained using K-means. Predictive analysis is done using two methods, the first being regression analysis using ML models, and the second being time series analysis using deep learning models. The deep learning models outperform the ML models by 12% in terms of R² score, and the RNN outperforms the LSTM. Normalizing the target variable using the log transform increases the overall R² score by 0.6% for the XGBoost model. The output from the ML models is decent given the nature of the dataset, and that of the deep learning models is better when compared to the ML models.
References

1. U. Patel, A. Chandan, NYC taxi trip and fare data analytics using BigData, in Analyzing Taxi Data Using Bigdata (2015)
2. S. Rong, Z. Bao-wen, The research of regression model in machine learning field, in MATEC Web of Conferences, vol. 176, p. 01033. EDP Sciences (2018)
3. Z. Turóczy, L. Marian, Multiple regression analysis of performance indicators in the ceramic industry. Procedia Econ. Finan. 3, 509–514 (2012)
4. J.G. De Gooijer, R.J. Hyndman, 25 years of time series forecasting. Int. J. Forecast. 22(3), 443–473 (2006)
5. P. Montero-Manso, G. Athanasopoulos, R.J. Hyndman, T.S. Talagala, FFORMA: feature-based forecast model averaging. Int. J. Forecast. 36(1), 86–92 (2020)
6. S. Makridakis, E. Spiliotis, V. Assimakopoulos, Statistical and machine learning forecasting methods: concerns and ways forward. PLoS One 13(3), e0194889 (2018)
7. R. Madan, P.S. Mangipudi, Predicting computer network traffic: a time series forecasting approach using DWT, ARIMA and RNN, in 2018 Eleventh International Conference on Contemporary Computing (IC3), pp. 1–5. IEEE (2018)
8. S. Nihale, S. Sharma, L. Parashar, U. Singh, Network traffic prediction using long short-term memory, in 2020 International Conference on Electronics and Sustainable Communication Systems (ICESC), pp. 338–343. IEEE (2020)
9. T. Shelatkar, S. Tondale, S. Yadav, S. Ahir, Web traffic time series forecasting using ARIMA and LSTM RNN, in ITM Web of Conferences, vol. 32, p. 03017. EDP Sciences (2020)
10. J. Sola, J. Sevilla, Importance of input data normalization for the application of neural networks to complex industrial problems. IEEE Trans. Nucl. Sci. 44(3), 1464–1468 (1997)
11. C. Feng, H. Wang, N. Lu, T. Chen, H. He, Y. Lu, Log-transformation and its implications for data analysis. Shanghai Arch. Psychiatry 26(2), 105 (2014)
12. S. Du, M. Pandey, C. Xing, Modeling Approaches for Time Series Forecasting and Anomaly Detection (ArXiv, Stanford, 2017)
13. M. Abdoos, A.L. Bazzan, Hierarchical traffic signal optimization using reinforcement learning and traffic prediction with long-short term memory. Expert Syst. Appl. 171, 114580 (2021)
14. https://www.kaggle.com/c/nyc-taxi-trip-duration/data. Last Accessed 4 Oct 2021
15. https://www.kaggle.com/oscarleo/new-york-city-taxi-with-osrm. Last Accessed 4 Oct 2021
16. https://www.kaggle.com/mathijs/weather-data-in-new-york-city-2016. Last Accessed 4 Oct 2021
Implementation of Classical Error Control Codes for Memory Storage Systems Using VERILOG Sreevatsan Radhakrishnan, Syed Ishtiyaq Ahmed, and S. R. Ramesh
Abstract Error coding is a method of detecting and correcting errors that ensures the integrity of information bits and error recovery in case of damage. Encoding is done using mathematical techniques that pad extra bits onto the data, which aid the recovery of the original message. Several error coding techniques, offering different error rates and recovery capabilities, are employed in modern-day communication systems, facilitating error-free transmission of information bits. Hardware-based implementation of these error coding techniques for robust memory systems and processors has become imperative due to their error resistance compared to software counterparts. In this work, the authors demonstrate the VERILOG implementation, targeted for the Artix-7 board, of various error coding and correction methodologies in view of hardware storage using field programmable gate arrays (FPGA), thereby providing the readers an insight into the performance and advantages offered by these techniques. Their performance in terms of power consumption and utilization is evaluated and analyzed.

Keywords Error control codes · Hamming encoding · Cyclic redundancy check · Field programmable gate array · Verilog
S. Radhakrishnan · S. Ishtiyaq Ahmed · S. R. Ramesh (B) Department of Electronics and Communication Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_3

1 Introduction

Error coding is a technique for improved and reliable data storage when the medium has a high bit error rate (BER) due to physical damage and soft errors. Usually, these are termed single event upsets (SEU) and are of paramount importance in memory systems since, with aggressive scaling and higher packaging of devices, the probability of an erroneous flip has increased. Therefore, it has become imperative to include error control modules as part of the hardware architecture. Instead of storing the message bits in raw form, the message is encoded with additional bits before being sent to storage. This longer "code word" is then stored, and the decoder can retrieve the desired data
by using the maximum likelihood decoding rule. The additional bits transform the data into a valid code word for the given coding scheme. The space of valid code words is a subset of the space of all possible bit words of that length, so the destination can recognize invalid code words. Errors introduced while storing are detected in the decoding process at the destination, as the retrieved word would correspond to an invalid word. The code rate is the ratio of data bits to total bits in the code words. A high code rate results in more information content for a given length and fewer overhead bits; however, the fewer the bits appended as redundancy, the more error-prone the scheme. The error control capabilities of a coding scheme are correlated with its complexity and code rate, and a trade-off is made between available bandwidth and the error protection offered for the storage.

Consider a processor that computes the implemented logic and has a data path to store the result in memory. Once the data is sent via the data bus, the processor has no way to ensure the correctness of the data stored in that memory location, and in case of error, the processor must recompute after retrieval. Further, such a mismatch allows accidental bit-flips or induced malicious attacks to introduce error patterns. Herein, the authors propose various models of error control codes that would prevent such flips. This is illustrated in Fig. 1, where the error control block containing the encoder and decoder sits between the storage elements and the processor units, connected via the data bus. The encoder block adds redundant bits to the actual message and stores them in a reserved memory location, while the decoder removes the appended bits and corrects errors, if any. This module also raises an internal flag indicating the correctness of, and recovery from, the introduced error. The authors also show that these modules are utility-efficient and consume a very small amount of hardware to provide the additional protection.

Fig. 1 Block diagram depicting error control codes for a memory storage system

The authors implement various error detection and error correction schemes for such a memory storage system, developing VERILOG modules and schematics, synthesizing the schemes targeted for FPGA implementation, and mapping them out for the Artix-7 board to arrive at various reports. The choice of VERILOG as the hardware language is due to its more flexible constructs and C-like syntax. The following are designed in this work: a checksum module, a CRC encoder, and a hamming encoder and decoder. The authors opt for a hybrid of behavioral and structural modeling styles, as the structural model allows modeling a bigger system out of simpler sub-modules, while the behavioral model allows for top-level abstraction of the system. This modeling style allows for independent processing of the sub-modules before integrating them into a complex system. This work is organized as follows: the next section highlights major works done in the field. This is followed by the methodology section, which explains the key concepts used in this implementation of error control codes and shows the steps taken to arrive at the results. The results and discussion section highlights the waveforms, reports on power and utilization, and shows waveforms for selected inputs. This section is followed by the conclusion, highlighting the further scope of this work.
2 Literature Review

Shannon's noiseless coding theorem states that there exists an efficient lossless method at a rate approaching the channel capacity [1]. The development of modern error control techniques is credited to the works of R. Hamming [2]. The works of Gilbert [3] and Varshamov et al. [4] formalize a bound and introduce error correction efficiency. The series of works by Wozencraft [5] highlights the computational aspects of such error correction schemes. (15,5,3), (15,7,2) and (15,11,1) Bose–Chaudhuri–Hocquenghem (BCH) codes are designed and implemented on FPGA in [6]. The work in [7] proposes a design of a belief propagation decoder using FPGAs for polar codes [8], which shows higher throughput and improved architectural complexity in comparison with a convolutional turbo code decoder. Advances in the serial-in serial-out (SISO) decoder algorithm have enhanced turbo product codes and made them prominent in practice [9, 10]. A convolutional encoder and adaptive Viterbi decoder are implemented on an FPGA platform using VHDL in [11]. The work in [12] presents a bit rate and power consumption comparison of various ECC on different hardware. Data handling capability with low-power design for the system has been addressed in [13, 14]. A multi-data LDPC decoder architecture is implemented on a Xilinx FPGA device [15]. From the literature survey, the authors observe that no significant work has been done on integrating error control codes with FPGA memory systems, even though many error control systems have been published. The authors in this work address the error control system in view of FPGA-based memory system design.
3 Methodology

3.1 Error Detection Schemes

Checksum. The checksum technique for memory systems ensures speed-up in direct memory access (DMA) applications for accelerated hardware (as in C2H compilers) without a trade-off on reliability. The DMA engine using the checksum incurs very little overhead when switching between buffer locations, as this requires no software interruption. The checksum is 8 bits long and is obtained by performing modulo-255 addition over the message bits in the random-access memory (RAM) contents for the entire retrieved 64-byte buffer space. The obtained checksum is then flipped and appended to the message sent to the processor. At the processor side, the checksum is re-computed by including the checksum bits with the message. If the checksum recomputed over the original checksum and message evaluates to 0, this implies correct transmission, while an incorrect transmission requires retransmission from the buffers.

CRC encoder. There is a possibility of soft error propagation in configuration random-access memory (CRAM) cells due to external radiation resulting in SEUs. To protect such information, these bits are encoded using CRCs. The cyclic redundancy check (CRC) is formulated by treating the encoded binary number as the coefficients of a finite polynomial ring, with special operations governed by finite field evaluation. The data bits at the memory side are appended with the remainder bits obtained by the finite polynomial division. The retrieved bit string is divided by the same generator polynomial, and the obtained remainder is compared with 0. The generator polynomial is latched up with pull-up data lines, as it will be utilized in the decoding process. A CRC can be implemented as a linear feedback shift register (LFSR) for serial data input, but such a design is suboptimal, as this implementation requires an unusually high clock speed for operation and processes only one data bit every clock cycle. To achieve higher throughput, using the linearity property of CRC, the serial LFSR is redesigned as a parallel N-bit-wide circuit, so that N bits are processed in every clock cycle. A CRC can detect burst errors of up to r bits, where r is the order of the generator polynomial [16].
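As a hedged software model of the two detection schemes above (the authors' implementation is in Verilog, not shown here), the following Python sketch captures the accumulate-and-complement checksum and a bitwise CRC remainder; the 16-bit CRC-CCITT polynomial is an illustrative choice, not stated in the paper.

```python
def checksum(buf: bytes) -> int:
    """8-bit checksum: modulo-255 accumulation, then flipped (complemented)."""
    total = 0
    for b in buf:
        total = (total + b) % 255
    return (~total) & 0xFF                 # equals 255 - total for total < 255

def checksum_ok(buf: bytes, cs: int) -> bool:
    """Receiver-side re-computation: a zero result implies correct transfer."""
    return (sum(buf) + cs) % 255 == 0

def crc_remainder(data: int, nbits: int, poly: int = 0x11021) -> int:
    """Remainder of GF(2) polynomial division; poly is CRC-16-CCITT here."""
    rem = data << 16                       # append 16 zero bits to the message
    for i in range(nbits + 15, 15, -1):    # long division, MSB first
        if rem & (1 << i):
            rem ^= poly << (i - 16)
    return rem & 0xFFFF

buf = bytes(range(64))                     # stand-in for a 64-byte DMA buffer
assert checksum_ok(buf, checksum(buf))
```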
3.2 Error Correction Schemes

Hamming module: The Hamming code encodes 4 bits of message into 7-bit code words by adding parity bits to each message string, and is thus called a (7,4) Hamming code. The hamming bits are inserted at integral powers of 2 (i.e., bit positions 1, 2, 4). The parity bits computed herein are non-systematic, and using the cyclic
property, these are valid code words. This scheme is suitable for burst-error-free, low-noise communication channels. In FPGAs, the DDR memory controllers use matrix-based hamming encoders. These encoders are implemented using LUTs; even though faster, they are suboptimal for memory utilization, as the matrix multiplication is costly. The work in [17] suggests implementing them using cyclic codes and polynomials as a serial LFSR, with a trade-off on processing cycles. The authors herein implement the encoder simply as a series of XOR functions over the corresponding bit positions, thus saving memory and holding the clock cycle. The hamming decoder is constructed by computing the syndrome bits "s" from the retrieved input word. If "s" is computed to be 0, no error is present; for all non-zero values of "s", the interleaving allows one to read out the position of the single-error bit. The corresponding bit is flipped to arrive at the mapped code word, from which the data bits are recovered by omitting the parity bits interleaved at bit positions that are integral powers of 2.
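A hedged software model of this (7,4) scheme, with parity bits at positions 1, 2 and 4 as described, follows (Python, not the authors' Verilog).

```python
import numpy as np

def hamming_encode(d):
    """Encode 4 data bits into a 7-bit code word, parity at positions 1, 2, 4."""
    c = np.zeros(7, dtype=int)
    c[2], c[4], c[5], c[6] = d            # data at positions 3, 5, 6, 7
    c[0] = c[2] ^ c[4] ^ c[6]             # p1 covers positions 3, 5, 7
    c[1] = c[2] ^ c[5] ^ c[6]             # p2 covers positions 3, 6, 7
    c[3] = c[4] ^ c[5] ^ c[6]             # p4 covers positions 5, 6, 7
    return c

def hamming_decode(c):
    """Compute syndrome s; a non-zero s is the 1-indexed error position."""
    s = ((c[0] ^ c[2] ^ c[4] ^ c[6])
         | (c[1] ^ c[2] ^ c[5] ^ c[6]) << 1
         | (c[3] ^ c[4] ^ c[5] ^ c[6]) << 2)
    if s:
        c = c.copy()
        c[s - 1] ^= 1                     # flip the erroneous bit
    return c[[2, 4, 5, 6]]                # drop parity positions 1, 2, 4

word = hamming_encode([1, 0, 1, 1])
word[3] ^= 1                              # inject a single-bit error
assert list(hamming_decode(word)) == [1, 0, 1, 1]
```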
4 Results and Discussion

Note that the high values of static power in all the implemented schemes are due to the power consumed by the I/O pins, while the actual design consumes far less power, as it is a sub-module within the memory system that would involve no I/O pins and acts only on the computed data bits. So, for all practical applications, only the switching power is considered as dynamic power.
4.1 Error Detection Schemes

Checksum. Figure 2 corresponds to the waveform obtained, where "data" (64 bits, the processed message from the processor) is the input to the checksum module, while "out" (72 bits long) is the checksum-appended message (8 checksum bits at the start, marked in yellow) that ensures the correctness of the written signal during retrieval from memory. Figure 3 shows the circuit schematic. Figure 4 presents the RTL top block, where the "sender" module is the processor module that performs the write operation to memory and the checksum block is an internal sub-module that computes the redundant "checksum" bits. Figure 5 represents the power values, and Fig. 6 corresponds to the utilization report. From the utilization table, we note that the LUT utilization of the module is low (66/63,400 LUT slices), leaving room for further actual logic to be implemented.

Fig. 2 Waveform for 8-bit checksum on 64-bit data (yellow box depicts the checksum bits)
Fig. 3 Checksum circuit schematic (65 cells and 168 nets)
Fig. 4 Top-level module sender depicting the functional module and corresponding I/O pins
Fig. 5 Power report of checksum
Fig. 6 Utilization report of checksum
Fig. 7 Waveform of CRC encoding 1's complement design
Fig. 8 Circuit layout for CRC encoder depicting gates, LUTs, and buffers
CRC encoder: Figure 7 shows the waveform obtained, where "data_in" is the input message and "crc_out" (16 bits long, highlighted in yellow) is the CRC signal that is stored in memory. The waveform also shows "crc_en" as an active-high and "rst" as an active-low signal. The "crc_out" is computed by appending the negated message with the negated remainder obtained by field division. Herein, the mathematical division is reduced to XOR counting, which is further simplified to a parallel form, resulting in a smaller footprint, as in the circuit shown. Figure 8 shows the circuit schematic, illustrating the simplicity of the parallel LFSR implementation, which saves on clock cycles by making the computations linearly independent. The schematic uses 75 cells and 84 nets for the circuit implementation. Figure 9 presents the RTL block view, which maps all the input signals from the data bus and the output signals to memory. Figures 10 and 11 show the power and utilization reports of the implementation. Unlike the checksum, we are required to store the computed results of the initial bits, as they are reused in the serial LFSR implementation, and this uses slice registers. But in comparison with its serial counterpart, this method allows for design flexibility, and the division polynomial can be modified on the fly within only one cycle, giving better throughput.
Fig. 9 Top-level module of CRC encoder
Fig. 10 Power report for CRC encoder
Fig. 11 Utilization report of CRC encoder
Fig. 12 Hamming encoder waveform: Non-systematic “code word” generation
4.2 Error Correction Schemes

Hamming module: The waveform presented in Fig. 12 shows "data" as the input signal to the module and "code word" as the encoded signal that is sent to memory. This signal is arrived at by non-systematic encoding of the message signal, appending parity bits computed by XOR operations. Figure 13 shows the circuit schematic. The module utilizes an active-high clock enable and involves only three additional gates, represented by LUTs in the circuit. The circuit footprint is represented in Fig. 13, and the top-level footprint is shown in Fig. 14. One of the major advantages of this scheme is that it is highly power-efficient and leaves only a very small footprint, as is evident from the reports in Figs. 15 and 16; the utilization table clearly shows very little consumption to add the redundancy.

Similarly, the decoder system is designed with "code word" as the input to the module, while the module sends the actual "data" back to the processor. Figure 18 shows the circuit schematic, depicting the buffers and MUX used in decoding by computing the syndrome bits and mapping them to retrieve the corrected data bits. The top-level module depicting the RTL layout with input and output lines is presented in Fig. 19, with the power report in Fig. 20. Again, the major highlight of this hamming decoder design is its scalability and power efficiency. The power report suggests that the active power in the decoding logic is 0.131 W, which in comparison with the others is optimal for single-bit correction. Scalability to higher hamming codes, depending on the errors, is feasible, as the utilization report in Fig. 21 suggests that less than 1% of the total available LUTs is used in such a design (Table 1).

Fig. 13 Circuit schematic of hamming encoder
Fig. 14 Top-level module of hamming encoder
Fig. 15 Power report of hamming encoder
Fig. 16 Utilization report of hamming encoder
Fig. 17 Waveform for hamming decoder
Fig. 18 Schematic for hamming decoder
Fig. 19 Top-level module of hamming decoder
Fig. 20 Power report for hamming decoder
Fig. 21 Utilization report for hamming decoder
Table 1 Power comparison table for various schemes in this work

Error control scheme   Dynamic power (in W)      Static power (in W)
                       Signal       Logic
Checksum               1.567        0.551        0.223
CRC encoder            0.689        0.334        0.141
Hamming encoder        0.060        0.020        0.087
Hamming decoder        0.075        0.056        0.085
5 Conclusion

With the rapid growth and development of big data and the Internet of Things, as well as the increasing emphasis on hardware security in compact miniaturized systems, the development of efficient error control systems that aid error-free storage and retrieval of data has become imperative. The design integration of such error control codes using FPGAs in various applications is flexible, as the utilization and power depend on the nature of the error and the available bandwidth, and are thus customizable. In this work, the Verilog-based implementation of a range of error control codes is depicted, specifically targeted for the Artix-7 board technology, and their performance is analyzed. The scalability of the system is independent of the targeted board, although power and utilization may differ in such cases. The systems demonstrate improved reliability of data storage in hardware-based memory systems, and the simulations presented in this work validate the use of FPGA-based implementations of error control codes for building low-power systems with improved pipelining efficiency.
References

1. C.E. Shannon, A mathematical theory of communication. Bell Syst. Tech. J. 27(3), 379–423 (1948)
2. R.W. Hamming, Error detecting and error correcting codes. Bell Syst. Tech. J. 29(2), 147–160 (1950)
3. E.N. Gilbert, A comparison of signalling alphabets. Bell Syst. Tech. J. 31(3), 504–522 (1952)
4. R.R. Varshamov, Estimate of the number of signals in error correcting codes. Doklady Akad. Nauk SSSR 117, 739–741 (1957)
5. J.M. Wozencraft, List decoding. Q. Progress Rep. 48, 90–95 (1958)
Implementation of Classical Error Control Codes …
41
6. A.K. Panda, S. Sarik, A. Awasthi, FPGA implementation of encoder for (15, k) binary BCH code using VHDL and performance comparison for multiple error correction control, in 2012 International Conference on Communication Systems and Network Technologies (CSNT) (IEEE, 2012), pp. 780–784 7. A. Pamuk, An FPGA implementation architecture for decoding of polar codes, in 2011 8th International Symposium on Wireless Communication Systems (ISWCS) (IEEE, 2011), pp. 437441 8. E. Arikan, A performance comparison of polar codes and Reed-Muller codes. IEEE Commun. Lett. 12(6), 447–449 (2008) 9. S. Khavya, B. Karthi, B. Yamuna, D. Mishra, Design and analysis of a secure coded communication system using chaotic encryption and turbo product code decoder, in Advances in Computing and Network Communications (Springer, Singapore, 2021), pp. 657–666 10. G. Shivanna, B. Yamuna, K. Balasubramanian, D. Mishra, Design of high-speed turbo product code decoder, in Advances in Computing and Network Communications (Springer, Singapore, 2021), pp. 175–186 11. Y.S. Wong et al., Implementation of convolutional encoder and viterbi decoder using VHDL, in Proceedings of 2009 IEEE Student Conference on Research and Development (IEEE, Serdang, Malaysia, 2009), pp. 22–25 12. G. Balakrishnan et al., Performance analysis of error control codes for wireless sensor networks, in 4th International Conference on Information Technology, 2007 (ITNG’07, IEEE, 2007), pp. 876–879 13. A.S.K. Vamsi, S.R. Ramesh, An efficient design of 16 bit mac unit using vedic mathematics, ın 2019 International Conference on Communication and Signal Processing (ICCSP) (IEEE, 2019), pp. 319–322 14. C. Mahitha, S.C.S. Ayyar, S. Dutta, A. Othayoth, S.R. Ramesh, A low power signed redundant binary vedic multiplier, in 2021 5th International Conference on Trends in Electronics and Informatics (ICOEI) (IEEE, 2021), pp. 76–81 15. L. Yang, H. Liu, C.-J.R. Shi, Code construction and FPGA implementation of a low-error-floor multi-rate low-density parity-check code decoder. IEEE Trans. Circ. Syst. I Regul. Pap. 53(4), 892–904 (2006) 16. E. Stavinov, A practical parallel CRC generation method. Circ. Cellar-Mag. Comput. Appl. 31(234), 38 (2010) 17. J.M. Gilbert, C. Robbins, W. Sheikh, FPGA implementation of error control codes in VHDL: an undergraduate research project. Comput. Appl. Eng. Educ. 27(5), 1073–1086 (2019)
Parkinson's Disease Detection Using Machine Learning Shivani Desai, Darshee Mehta, Vijay Dulera, and Hitesh Chhikaniwala
Abstract Parkinson's disease (PD) is a progressive, non-treatable disorder of the nervous system that affects movement. Symptoms of Parkinson's disease may include tremors, rigid muscles, impaired posture and balance, speech changes, writing changes, and decline in blinking, smiling, and arm movement, and these symptoms worsen as time passes. For this reason, early detection of Parkinson's disease is one of the key applications of the present time. The implementation is divided into two distinct parts. The first involves pre-processing the MRI image dataset using various techniques such as resizing, normalization, histogram matching, thresholding, filtering, and bias removal, in order to focus on the parts which are significant and obtain more accurate results. In the second part, a dataset with various features of human speech which help in detecting Parkinson's disease has been used. Here also, the dataset will first be processed, visualized, and balanced, and then finally split into training and testing sets. Using machine learning algorithms such as decision tree classifier, logistic regression, support vector machine, XGBoost, and K-neighbors classification, we train the model, and after testing we obtain results using performance measures such as accuracy score, precision, recall, and confusion matrix.
Keywords Parkinson's disease · Machine learning · Magnetic resonance imaging · Parkinson's progression markers initiative
S. Desai (B) · D. Mehta · V. Dulera
Computer Science and Engineering Department, Nirma University, Ahmedabad, Gujarat, India
e-mail: [email protected]
D. Mehta e-mail: [email protected]
V. Dulera e-mail: [email protected]
H. Chhikaniwala
Info Comm Technology, Adani Institute of Infrastructure Engineering, Ahmedabad, Gujarat, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_4
1 Introduction Parkinson's disease (PD) is an incurable, progressive nervous system disorder that affects the functioning of the brain and various movements. The disease is caused by the loss of dopamine-producing neurons; it is found mostly in people of 50–70 years of age and is second in prevalence after Alzheimer's disease. The symptoms of PD worsen as the condition progresses over time. Although there is no definitive therapy for Parkinson's disease, early detection and appropriate management may reduce the symptoms and notably improve patients' quality of life. Hence, Parkinson's disease detection is an essential task. Symptoms of PD may include tremors, rigid muscles, impaired posture and balance, speech changes, writing changes, and reduced blinking, smiling, and arm movement. Diagnosis of PD at an early stage is a significantly challenging task, as early non-motor symptoms of PD may be mild and can be caused by many other conditions; consequently, these symptoms are often disregarded [1, 2]. To overcome these difficulties, various AI techniques have been developed for classifying PD against healthy controls or patients with similar clinical presentations (e.g., movement disorders or other Parkinsonian syndromes). Using AI approaches, we can identify relevant features that are not commonly used in the clinical analysis of Parkinson's disease and rely on these alternative measures to identify PD in its early stages [3]. For the analysis of PD, modalities such as magnetic resonance imaging (MRI), single photon emission computed tomography (SPECT), positron emission tomography (PET), and functional magnetic resonance imaging (fMRI) are utilized [4, 5]. We have used MRI scans here because they are considered to give better results than the other modalities when using ML and DL techniques such as SVM, ANN, naïve Bayes, 2D-CNN, and 3D-CNN. For building the model, the steps followed are, in order: data collection (dataset), data pre-processing using various techniques, development of the model, training the model, testing it, and evaluating the outcomes using various performance measures such as precision, recall, confusion matrix, and loss score. The focus is on image pre-processing and, in turn, feature selection and extraction. In the second part, a dataset with various features of human speech which help in detecting Parkinson's disease has been used and compared. Here likewise, the dataset is first processed, dealing with the removal of redundant data and irrelevant data, checking for missing data, and so on, and the data can be visualized using various plots. The dataset is also checked for imbalance, and using SMOTE it has been balanced, along with preparing the train and test features from the dataset for model training and testing. We evaluate the performance of various ML techniques such as the XGBoost classifier, support vector machine, naïve Bayes, and K-neighbors classification. The model is first trained and then tested, and results are obtained using performance measures such as accuracy score, precision, recall, and confusion matrix.
1.1 Motivation Early non-motor symptoms of PD can be subtle and may be caused by a variety of other illnesses, making early diagnosis of PD difficult. Therefore, these symptoms are frequently missed. To address these difficulties, several machine learning and deep learning approaches have been created for classifying PD against healthy controls or patients with similar clinical presentations (e.g., movement disorders or other Parkinsonian syndromes). Using ML and DL techniques, we can identify useful features which cannot be noticed otherwise. Also, while some machine learning models have been created, very few work on MRI images; they mostly work on directly available tabular data. We develop a model which works on MRI images, obtains data from those images, and performs the further processing.
1.2 Research Contribution The research contributions of this paper are as follows: for the MRI scans, pre-processing is done using the various techniques available, as per the need. This helps in focusing on and working with only the parts which are useful for the detection of PD, and in selecting, extracting, and enhancing the features. Results of various ML and DL techniques have also been reported.
1.3 Organization The remaining portion is organized as follows: Sect. 2 presents related work in table format, with details of the papers, their titles, techniques, datasets, pros and cons, etc. Section 3 shows the workflow steps and the design of the proposed system model. Section 4 presents a comparative study of various machine learning and deep learning techniques based on their accuracy and other parameters. Following that, Sect. 5 provides specifics about the dataset, such as the source, type of scans, and quantity of data. Section 6 details the model and discusses the results obtained. Finally, the conclusion and future work are presented.
2 Related Work In this section, a study of some existing ML and DL strategies is presented. For instance, in [5], the authors proposed an artificial neural network (ANN) model.
Around 200 SPECT images were used, and pre-processing and segmentation of this image data were done to improve efficiency. An ANN with sigmoid activation was used to build the model. Another approach by Solana-Lavalle et al. [6] proposed a machine learning model using techniques such as k-nearest neighbors (KNN) and support vector machine (SVM); additionally, voxel-based morphometry is used to focus on only the significant regions. In contrast, [7] developed a model for early detection of Parkinson's disease using the AlexNet model trained on DICOM images, which were pre-processed using techniques such as image rotation and mirror-image handling. In [8, 9], the authors proposed DL models to predict Parkinson's and classify it into stages. They used a CNN model with five convolutional layers with MRI images as the database; normalization was used for pre-processing, and K-fold cross-validation was also performed. Table 1 below shows a comparison of the related studies.
3 Proposed Work 3.1 Proposed System Model The system model proposed in this article has several stages preceding model training, including exploration and pre-processing of the images. The entire flow can be seen in Fig. 1. All the MRI scans used were collected from the PPMI database and filtered by certain parameters, for example, considering only the axial view and the type of MRI sequence. After that, the images were separated into two main folders according to their class, PD or HC. Each image is pre-processed before being given as input to the model, starting with bias field correction, histogram matching, normalization of the pixel values, thresholding, filtering, and resizing all images to a common size. The resulting images are given as inputs to the model, on which it is trained and tested, and the performance of the model is measured using various evaluation parameters [10]. Next, in Fig. 2, the workflow for the text-based speech dataset is proposed. The dataset was collected from the UCI ML repository and contains speech characteristics of HC and PD subjects. Then comes the processing of the data: checking for null values, duplicate values, unnecessary values, etc. Next, the processed data is visualized using various plots such as count plots and pair plots, and observations are made. Before splitting the data for training and testing, one important step remains: balancing the data, which matters when the classes are not nearly equal; this can be done using SMOTE. Next, the models are trained on the training data and then tested using various ML techniques such as KNN, SVM, naïve Bayes, and XGBoost. Once that is done, the performance of the models is measured using various evaluation parameters such as accuracy score, confusion matrix, precision, and recall [11].
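A minimal sketch of the first processing steps for the speech dataset is given below (the file name and the use of pandas are assumptions; the "status" column follows the feature list in Sect. 5):

import pandas as pd

# hypothetical file name for the UCI speech dataset
df = pd.read_csv("parkinson_speech.csv")

df = df.drop_duplicates()           # remove duplicate records
print(df.isnull().sum())            # check every column for missing values

# class balance check before any resampling (0 = HC, 1 = PD)
print(df["status"].value_counts())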
Table 1 Comparison of related work (literature survey)

Author | Year | Title | Technique | Data pre-processing
MosarratRumman, Abu Nayeem Tasneem, Sadia Farzana | 2018 | Early detection of Parkinson's disease using image processing and artificial neural network | Segmentation and ANN | Skull stripping, data augmentation
Soheil Esmaeilzadeh, Yao Yang | 2018 | End-to-end Parkinson disease diagnosis using brain MR-images | CNN | Normalization, resizing
Mahesh, Pavam Richard DelwinMyloth, Rij Jason Tom | 2020 | An explainable machine learning model for early detection of Parkinson's disease using LIME on DaTscan imagery | CNN | Normalization, Gaussian filter
S. Sivaranjini, C. M. Sujatha | 2019 | Deep learning based diagnosis of Parkinson's disease using convolutional neural network | CNN | Normalization
Solana-Lavalle, Rosas-Romero | 2021 | MRI scans classification using machine learning and voxel-based morphometry | KNN, SVM, multilayer perceptron, random forest, Bayesian networks | Segmentation, feature extraction
Mozhdehfarahbakhsh | 2021 | MRI-based deep learning model for prediction of Parkinson disease stages | CNN | Image alignment and segmentation, extraction of ROI, PCA
Bhalchndra | 2018 | Early detection of Parkinson's disease through shape-based features | LDA, SVM classifier | Hough transformation, sequential grass fire algorithm, moments of image, XGBoost

Reported strengths across these studies include good image quality, transfer learning, use of K-fold cross-validation, accurate results (one reason being the speech-based features), model authenticity checked prior to training, and good image processing with adaptive learning. Reported limitations include inefficient or insufficient image processing, use of only one slice or one angle of the image, manual feature extraction, and highly sparse data in MRI.
Fig. 1 Proposed system model for MRI scan database
Fig. 2 Proposed system model for text-based dataset

Table 2 Comparative study of ML/DL algorithms

Technique used | Type of dataset | Accuracy score
SVM | MRI scans | 92.35
Naïve Bayes classifier | MRI scans | 93
Decision tree classifier | 3-T MR imaging | 92
SVM | MRI scans | 86
LDA and SVM classifier | SPECT scans | 98
ANN | SPECT scans | 94
CNN | MRI scans | 96
Transfer learning, AlexNet pre-trained model | MRI scans | 88.9
Various ML techniques | SPECT scans | 97.27
4 Comparative Study of ML/DL Algorithms Taking the technique used, dataset type, and accuracy score as parameters, a comparative study of various ML/DL algorithms has been done [1], as shown in Table 2 [12].
5 Dataset The dataset used here, consisting of MRI scan images for detecting Parkinson's disease, is taken from the Parkinson's Progression Markers Initiative (PPMI). The dataset consists of MRI scan images in the .nii (NIfTI, Neuroimaging Informatics Technology Initiative) file format, and mainly two kinds of images are
available: FLAIR (fluid attenuated inversion recovery) and non-FLAIR (T2-weighted) images. Along with these, different image-format records are available for the FLAIR and non-FLAIR images, such as converted, normalized, and augmented versions. However, we have chosen to work primarily with the FLAIR and T2-weighted images, which are the .nii files. Also, for the other part, we have made use of a speech dataset of normal healthy people and people suffering from PD, taken from the UCI machine learning repository. The dataset has a total of 240 records and 46 attributes, of which 120 records are of healthy subjects and the other 120 are of people with Parkinson's disease. The attributes/features included are jitter (pitch local perturbation measures), shimmer (amplitude local perturbation measures), status (HC/PD), gender, harmonic-to-noise ratio (HNR) measures, recurrence period density entropy (RPDE), pitch period entropy (PPE), glottal-to-noise excitation ratio (GNE), and detrended fluctuation analysis (DFA) [13, 14].
5.1 Data Pre-processing for MRI Scans Data pre-processing is one of the important steps before using images from the dataset with machine learning/deep learning techniques, as it leads to better performance and better diagnosis, improving the accuracy of the model by taking into account only the parts/features which are important and enhancing them. Pre-processing of the dataset may include resizing, normalization, segmentation, cropping of the images, data augmentation, histogram matching, bias removal, etc. Some specific image processing methods used in the case of neuroimages are:
– Normalization
– Histogram matching
– Image resizing
– Smoothening of images
– CLAHE
– Otsu thresholding
– Skull stripping
– Binary mask, unsharp masking, etc.
Bias Field and Its Removal: The position of the patient's head in the scanner, imperfections in the scanner, head movement, temperature, and a few other issues can contribute to intensity heterogeneity across an MRI scan. In other words, the intensity value may differ within the same tissue in a way that is not anatomically meaningful [15]. This is known as the bias field. It can create problems in steps such as segmentation, and accurate results might not be obtained. Therefore, a specific kind of pre-processing is required to remove or correct the
bias field. Through this correction, the low-frequency non-uniformity found in MRI images can be removed.
Histogram Matching: Here, a source image is transformed according to a given reference image [15]. In basic terms, histogram matching makes certain changes to, or adjusts, the original data, which is the MR image, so that the histogram of the source matches the histogram taken as reference. Histogram matching enhances the contrast of the image and can recover the losses that occur because of contrast gains or clipping due to shadows. Figures 7 and 8 show the matching effect: the resulting image attempts to reproduce the pixel intensities of the reference image while maintaining the semantic and spatial features (Figs. 4, 5, 6, 7 and 8).
Normalization: Image normalization is a transformation of the input image intensity values to produce a different contrast between the tissues in the output image. The intensities of the input data are transformed to perform the normalization and to understand the changes and their evolution. The two different contrast image types are the T2-weighted and FLAIR images.
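The paper does not name the tooling used for these two steps; as one common choice, the sketch below applies N4 bias field correction and histogram matching with SimpleITK (file names are hypothetical):

import SimpleITK as sitk

# N4 bias field correction removes the low-frequency intensity
# non-uniformity across the scan.
image = sitk.ReadImage("scan.nii", sitk.sitkFloat32)
mask = sitk.OtsuThreshold(image, 0, 1, 200)          # rough head mask
corrected = sitk.N4BiasFieldCorrectionImageFilter().Execute(image, mask)

# Histogram matching reshapes the source intensities so their
# histogram follows the chosen reference scan.
reference = sitk.ReadImage("reference.nii", sitk.sitkFloat32)
matcher = sitk.HistogramMatchingImageFilter()
matcher.SetNumberOfHistogramLevels(256)
matcher.SetNumberOfMatchPoints(7)
matcher.ThresholdAtMeanIntensityOn()
matched = matcher.Execute(corrected, reference)
sitk.WriteImage(matched, "preprocessed.nii")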
Fig. 3 Flair axial image
Fig. 4 Speech dataset
Fig. 5 Before bias removal
Fig. 6 After bias removal
Fig. 7 Before histogram matching
Normalization is done by subtracting the mean pixel intensity from the original image and dividing by the standard deviation, and then rescaling to obtain a fixed range between 0 and 1 (Figs. 9, 10, and 11):

z = (x − μ) / σ

CLAHE: Contrast limited adaptive histogram equalization (CLAHE) is a variant of adaptive histogram equalization. Here, equalization is not performed on the entire image but on very small regions called tiles. Because it limits the over-amplification of contrast, it is at times used to moderate excessive contrast.
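In code, the z-score normalization described above, followed by the rescaling to [0, 1], amounts to the following sketch (the use of NumPy is an assumption, as the paper does not specify its implementation):

import numpy as np

def z_score_normalize(volume: np.ndarray) -> np.ndarray:
    # z = (x - mu) / sigma, computed over the whole volume
    return (volume - volume.mean()) / volume.std()

def rescale_01(volume: np.ndarray) -> np.ndarray:
    # map intensities into the fixed range [0, 1]
    lo, hi = volume.min(), volume.max()
    return (volume - lo) / (hi - lo)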
Fig. 8 After histogram matching
Fig. 9 Before normalization
Fig. 10 After normalization
Otsu Threshold: This technique is widely used for image thresholding: after the input image and its histogram are processed, a threshold value is calculated [5]. The pixels of the image are then split into two regions, replaced by white if the intensity is above the threshold and by black otherwise.
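A sketch of CLAHE, Otsu thresholding, and resizing with OpenCV follows (the library choice, tile size, and target shape are assumptions; real slices would come from the .nii volume rather than the random stand-in used here):

import cv2
import numpy as np

# stand-in for one MRI slice already scaled to 8-bit
slice_u8 = (np.random.rand(256, 256) * 255).astype(np.uint8)

# CLAHE equalizes contrast on small tiles instead of the whole image
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
enhanced = clahe.apply(slice_u8)

# Otsu picks the threshold automatically from the histogram
_, binary = cv2.threshold(enhanced, 0, 255,
                          cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# resize every slice to a common input shape
resized = cv2.resize(binary, (128, 128), interpolation=cv2.INTER_AREA)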
Fig. 11 Output after applying CLAHE, Otsu threshold and resizing
Fig. 12 Healthy control (HC) count and Parkinson disease (PD) patients count (0 denotes HC, 1 denotes PD)
6 Performance Evaluation and Results Figure 12 shows a count plot of the data in our dataset: there are 120 records of healthy people and 120 records of people suffering from Parkinson's disease, which is shown correctly below (Figs. 12 and 13). Figure 13 shows a pair plot for features such as Shimloc, ShimdB, ShimAPQ3, ShimAPQ5, and ShiAPQ11; one observation from it is that these features are correlated to a noticeable degree. Next, SMOTE is used for balancing the dataset. The number of instances for both classes is equal in our case, but this is not always so, and balancing is then required. Once done, we divide our dataset into training and validation data, and then we apply the ML algorithms (Figs. 14, 15, 16, 17, 18, 19 and 20).
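A minimal sketch of the balancing and splitting steps, continuing the data-loading sketch from Sect. 3 (imbalanced-learn's SMOTE and an 80/20 split are assumptions):

from imblearn.over_sampling import SMOTE
from sklearn.model_selection import train_test_split

X = df.drop(columns=["status"])   # df from the loading sketch above
y = df["status"]

# SMOTE synthesizes minority-class samples until the classes are equal
X_bal, y_bal = SMOTE(random_state=42).fit_resample(X, y)

X_train, X_test, y_train, y_test = train_test_split(
    X_bal, y_bal, test_size=0.2, stratify=y_bal, random_state=42)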
Fig. 13 Pair plot where we can see that all these fundamental frequencies and variations in amplitude are highly correlated with each other
Fig. 14 Confusion matrix for KNN. Accuracy achieved: 85.41
Fig. 15 Confusion matrix for XGBoost. Accuracy achieved: 83.33
Fig. 16 Confusion matrix for naïve Bayes. Accuracy achieved: 83.33
Fig. 17 ROC AUC curve for SVM
Fig. 18 Confusion matrix for SVM. Accuracy achieved: 86.45
Fig. 19 Performance measures for various ML-based techniques (results of our model implementation)
Fig. 20 Performance measures for various ML-based techniques as comparison [9] (results of already implemented models)
We have implemented ML techniques including decision tree classification, support vector machine, naïve Bayes classification, KNN classification, and the XGBoost classifier. From the results we can conclude the following: naïve Bayes and XGBoost achieve almost the same accuracy (83.33%), KNN reaches 85.41%, and SVM achieves the highest accuracy at 86.45%. The accuracy achieved by us is also higher than that of the models we compare against. In this way, we have implemented and performed Parkinson's disease classification using various machine learning techniques.
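A sketch of this training-and-comparison loop, continuing the split above (the hyperparameters shown are assumptions, not tuned values from this study):

from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from xgboost import XGBClassifier
from sklearn.metrics import accuracy_score, confusion_matrix

models = {
    "Decision tree": DecisionTreeClassifier(random_state=42),
    "SVM": SVC(kernel="rbf"),
    "Naive Bayes": GaussianNB(),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "XGBoost": XGBClassifier(eval_metric="logloss"),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    print(name, round(accuracy_score(y_test, pred) * 100, 2))
    print(confusion_matrix(y_test, pred))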
7 Conclusion and Future Work For the detection of PD, the dataset used was obtained from the PPMI Web site, consisting of MRI scans with FLAIR and non-FLAIR images in .nii file format. After that, pre-processing techniques such as histogram matching, z-score normalization, and image resizing have been applied for selection, extraction, and enhancement of image features. From various machine learning techniques, we then study and implement the one which gives the best results. The results and output have been reported using accuracy, precision, recall, and confusion matrix, which are the parameters for performance evaluation. Similarly, for the text-based speech dataset, after processing and visualizing the data, we split it into train and test sets and provide the training data to various ML techniques such as naïve Bayes, SVM, and XGBoost; the results are evaluated with the same performance parameters, and we were able to increase the performance efficiency. The entire implementation has been done in Jupyter Notebook and Google Colab. In the future, deep learning techniques such as CNN can also be implemented, which may give more accurate results for the detection of Parkinson's disease.
References
1. P.M. Shah et al., Detection of Parkinson disease in brain MRI using convolutional neural network, in 2018 24th International Conference on Automation and Computing (ICAC) (IEEE, 2018)
2. S.R. Nair et al., A decision tree for differentiating multiple system atrophy from Parkinson's disease using 3-T MR imaging. Eur. Radiol. 23(6), 1459–1466 (2013)
3. P.R. Magesh, R. DelwinMyloth, R. Jackson Tom, An explainable machine learning model for early detection of Parkinson's disease using LIME on DaTSCAN imagery. Comput. Biol. Med. 126, 104041 (2020)
4. S. Sivaranjini, C.M. Sujatha, Deep learning based diagnosis of Parkinson's disease using convolutional neural network. Multimed. Tools Appl. 79(21), 15467–15479 (2020)
5. M. Rumman et al., Early detection of Parkinson's disease using image processing and artificial neural network, in 2018 Joint 7th International Conference on Informatics, Electronics & Vision (ICIEV) and 2018 2nd International Conference on Imaging, Vision & Pattern Recognition (icIVPR) (IEEE, 2018)
6. G. Solana-Lavalle, R. Rosas-Romero, Classification of PPMI MRI scans with voxel-based morphometry and machine learning to assist in the diagnosis of Parkinson's disease. Comput. Methods Programs Biomed. 198, 105793 (2021)
7. E. Huseyn, Deep learning based early diagnostics of Parkinsons disease (2020). arXiv preprint arXiv:2008.01792
8. A. Mozhdehfarahbakhsh et al., An MRI-based deep learning model to predict Parkinson disease stages. medRxiv (2021)
9. C.O. Sakar et al., A comparative analysis of speech signal processing algorithms for Parkinson's disease classification and the use of the tunable Q-factor wavelet transform. Appl. Soft Comput. 74, 255–263 (2019)
10. M.B.T. Noor et al., Application of deep learning in detecting neurological disorders from magnetic resonance images: a survey on the detection of Alzheimer's disease, Parkinson's disease and schizophrenia. Brain Inform. 7(1), 1–21 (2020)
11. S. Haller et al., Differentiation between Parkinson disease and other forms of Parkinsonism using support vector machine analysis of susceptibility-weighted imaging (SWI): initial results. Eur. Radiol. 23(1), 12–19 (2013)
12. N.A. Bhalchandra et al., Early detection of Parkinson's disease through shape based features from 123I-Ioflupane SPECT imaging, in 2015 IEEE 12th International Symposium on Biomedical Imaging (ISBI) (IEEE, 2015)
13. L. Naranjo et al., Addressing voice recording replications for Parkinson's disease detection. Exp. Syst. Appl. 46, 286–292 (2016)
14. B.E. Sakar et al., Collection and analysis of a Parkinson speech dataset with multiple types of sound recordings. IEEE J. Biomed. Health Inform. 17(4), 828–834 (2013)
15. V. Tarjni et al., Deep learning-based scheme to diagnose Parkinson's disease. Exp. Syst. e12739 (2021)
Sustainable Consumption: An Approach to Achieve the Sustainable Environment in India Sunny Dawar, Pallavi Kudal, Prince Dawar, Mamta Soni, Payal Mahipal, and Ashish Choudhary
Abstract In the last few years, research on the sustainable environment has sought to unfold these problems through different marketing and consumption patterns, providing an alternative path to conceptualize the dynamic nature of society with respect to sustainability. Most conceptual and practical research focuses on the routine problems of people, neglecting the need to protect the environment for future generations. Core issues, such as the role of consumers in sustainable development, have remained unaddressed by behavioural researchers. This research article aims to examine the determinants of consumer behaviour linked with sustainable consumption. The focus remains on sustainable consumption and how the goal of protecting a sustainable environment can be achieved through it. The research attempts to find out the determinants of sustainable consumption and the effects of demographic variables on it.
Keywords Sustainable environment · Sustainable consumption · Human behaviour · Sustainable development
S. Dawar · M. Soni · P. Mahipal (B) · A. Choudhary
Manipal University Jaipur, Jaipur, Rajasthan, India
S. Dawar e-mail: [email protected]
P. Kudal
Dr DY Patil Institute of Management Studies, Pune, Maharashtra, India
e-mail: [email protected]
P. Dawar
Poornima Group of Colleges, Jaipur, Rajasthan, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_5

1 Introduction
Sustainability has been defined by several authors with attention to various variables which directly impact sustainable living, sustainability inhibitors, and sustainability strategies. All of these have a direct influence on the perceptions and attitudes of
people to pursue sustainable development policies and practices. Sustainable development refers to the social, economic, and environmental foundations which must be considered, harmonized, and communicated to sustain the long-term viability of society and the planet. It deals with fulfilling the needs of the present generation without compromising the needs of future generations. Over the last decade, a debate has been going on about how to conserve the environment through sustainable consumption. Sustainable consumption has gained vast importance at different levels of consumption and carries various implications for the welfare of an economy: it directly affects the consumption behaviour of people and encourages sustainable consistency in an economy. The phenomenon of sustainable consumption is influenced by the allocation and utilization of different resources, so it has the ability to affect the long-term growth of an economy. Sustainable consumption is the consumption of goods which have a very low impact on the environment, considering the social and economic parameters viable for meeting the common needs of human beings. It concerns every individual and country, from each person to every government and multinational corporation. The idea behind adopting the practice of sustainable consumption is to provide consumers with various products reflecting their new environmental values. There are different methods by which sustainable consumption can be promoted, such as providing tax rebates for eco-friendly products, increasing taxes on higher water and energy consumption, and promoting the 3R framework (Reduce, Recycle, and Reuse) through communication and educational campaigns. In the current scenario, people show a lot of concern for environmental protection practices and want to purchase and use eco-friendly products, making their purchase decisions with eco-friendly practices in mind. On the demand side, it is sustainable consumption which fosters sustainable production practices. Multinational and multidisciplinary approaches are required to achieve the goal of a sustainable environment. Less exploitation of resources would help people think more about conservation of the environment and reduction of personal consumption. Different issues must be addressed and supported, such as training consumers, increasing awareness, and bringing government and industry closer together to work on sustainability issues. There is an emergent need to find out why and how consumers behave on sustainability issues, and various researchers have identified different contributing factors, both demographic and psychographic, that influence consumers' behaviour towards sustainability. Consumption patterns in the world need to be administered in such a way as to increase the opportunities for sustainable development in the world economy. Many products with environmental advantages are produced by companies to improve the value of production. Greater utilization of these eco-friendly products would help minimize the stress on the environment, increase customer satisfaction, and also promote sustainable consumption. Sustainable consumption, in return, will help human beings ensure their long-term survival on earth.
2 Literature Review
Consumers have gained sustainability knowledge through the research developed by various researchers, yet they engage in mass consumption at the price of damaging consequences for the environment [1]. The fundamentals of sustainability depend on the conscious efforts of consumers to pursue the goals of sustainable consumption by weighing the future benefit to society, the environment, and economic systems [2]. Sustainable consumption depends on the utilization of environmentally labelled products, and organizations are focusing on producing eco-friendly products and organic food items to support the protection of a sustainable environment [3]. The existing research literature takes an individualistic approach to sustainable consumption and environmental labels. Consumer choices reflect not only quality and prices but are also related to consumers' values and social phenomena, which have seen enormous growth in the global market [4]. A major goal of sustainable development remains the recycling and reuse of water through modern sources [5]. Conscious acceptance of preventive design can be the starting step for long-term changes in consumer behaviour: as consumers become more and more aware, the demand for sustainable consumption increases [6]. Different variables like social demographics and explicit and implicit consumer attitudes have a significant impact on sustainable consumption. The majority of studies are concerned with the production of new markets and show much less concern for the consumer-exploration end. Motivation and values support actual sustainable behaviour in the consumption of fashion products [7]. Consumers are responsible for unsustainable consumption behaviour [8]: the choices they make have a substantial environmental effect, which is separate from those facets beyond consumers' influence. Consumer choices are influenced by sustainable consumption. Problems related to sustainability can be addressed by sustainable consumption, as it empowers people towards an active lifestyle and can change the method of sustainable orientation. Using products which disturb the ecological balance while ignoring products with a positive influence on the environment is undesirable [9]. Ethical and sustainable consumption can be encouraged through perceived consumer effectiveness, social norms, values, and positive consumer attitudes [10]. Social media like Twitter can also be used to increase awareness about sustainable consumption: to extract crucial features, tweets go through six stages of pre-processing, after which they are categorized as positive, neutral, or negative [11]. The multi-criteria decision problem [12] and sustainable consumption are gaining a lot of attention from international communities. Various effective programmes have acknowledged the financial and social dimensions of consumer decision making and have drawn more attention to the role of households as a stimulating factor for production [13]. There are some limitations related to the cultural, social, and historical context of sustainable consumption [14]. For more safety and protection against glitches discovered in residences, corporate structures, and various
manufacturing sites, a comprehensive integration of the sanitizer machine with the door lock method is required [15].
3 Research Methodology The current research was conducted using an empirical research design, with the information gathered through a standardized questionnaire. A pilot study with 75 respondents was done first, and several wordings of the final questionnaire were revised as a result. The final questionnaire was divided into two portions, each with structured questions: the first included demographic information, while the second included questions about the structure of the proposed model. The data was collected using snowball sampling, which is useful when a population list is not accessible. The questionnaire was sent on Facebook, one of the most prominent social media networks [16]. The sample was taken using convenience and judgemental sampling techniques, and the question frame was developed from previous studies. The study utilized a five-point Likert rating scale. The study initially adopted exploratory research to generate new ideas and understanding; thereafter, descriptive research was done. The survey method was used to collect the research data via the questionnaire, and the respondents were also interviewed personally to gain more insights. A sample of 350 respondents from different demographic backgrounds was taken. The responses were analysed with SPSS 21.0; correlation analysis and structural equation modelling (SEM) were used for data analysis. Cronbach's alpha was calculated to investigate the internal consistency of the items used in the structured questionnaire, and the content validity of the questionnaire was tested through discussion with experts from industry and academia.
4 Data Analysis and Interpretation Data analysis and interpretation are based on descriptive and inferential analysis.
4.1 Analysis of Reliability For testing reliability, the structured questionnaire was given to 75 respondents and Cronbach's alpha was calculated; it must exceed 0.70 for the questionnaire to be said to show good reliability [17].
Table 1 Analysis of reliability

Reliability statistics
Cronbach's alpha | No. of items
0.830 | 20
The Cronbach's alpha value was 0.83, which is quite good and shows that the statements made to measure the constructs are reliable and not confusing. With some minor changes, we moved ahead and sent the questionnaire to all respondents using snowball sampling (Table 1).
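As a sketch, Cronbach's alpha can be computed directly from the item responses using its standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of the total score); the file name below is hypothetical:

import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    # items: one row per respondent, one column per Likert item
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# responses = pd.read_csv("pilot_survey.csv")   # 20 Likert items
# print(cronbach_alpha(responses))              # 0.83 is reported here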
4.2 Respondent Demographic Profile Table 2 presents the demographic report of the respondents; this data has been collected using the questionnaire. The questionnaire was sent to 850 customers and 475 responses were received, out of which only the 350 appropriately filled responses were taken for the final research study. Table 2 shows that 62% of the respondents were male and only 37% were female. Most respondents were in the 18–30 age group (youngsters), 54% of the respondents had graduate or higher degrees, and most respondents were from the service class. The demographics are largely normally distributed. The following constructs were identified after a detailed review of the literature; they are borrowed from the published work of a few researchers, and the details are given in the next section (Table 3). Data collection was performed through structured questionnaires, which were filled in through personal interaction and self-administered electronic media. The data set comprised 20 items, taken to measure the external variables: demographic factors, awareness and information, influence of social environment, and market determinants.

Table 2 Demographic report of respondents
Variables | Category | Frequency | Percentage (%)
Gender | Male | 220 | 62.86
Gender | Female | 130 | 37.14
Age | 18–30 years | 150 | 42.86
Age | 31–40 years | 90 | 25.71
Age | 41–50 years | 45 | 12.86
Age | Above 50 years | 65 | 18.57
Education | High school | 45 | 12.86
Education | Graduate | 190 | 54.29
Education | Postgraduate | 115 | 32.85
Occupation | Student | 140 | 40.00
Occupation | Business | 55 | 15.71
Occupation | Service | 155 | 44.29
Table 3 Constructs of structural equation modelling (SEM)

Demographic variables
DEM_1: Gender
DEM_2: Age
DEM_3: Education level

Sustainable consumption behaviour
SCB_1: Try to preserve the environment through my daily activities
SCB_2: Promote social justice and human rights through my concrete activities
SCB_3: Consume local products for supporting the economy
SCB_4: Encouraged to make changes in lifestyle for sustainable consumption

Influence of social environment
SEN_1: My family members or friends motivate me to follow them in protecting the environment
SEN_2: Participate in environmental protection and social work
SEN_3: Buy organic and ecologically friendly products from the supermarket
SEN_4: Follow family tradition to protect the environment
SEN_5: In our society, separation of waste for recycling is a normal phenomenon

Awareness and information
AI_1: Have participated in a workshop related to environmental issues
AI_2: From my peer group, I have been taught to be responsible with various resources like electricity, energy, fuel, and water
AI_3: Have been informed about sustainability issues
AI_4: Know about the negative effects of products which are harmful to the environment

Market determinants
MD_1: Organic products help to protect the environment
MD_2: Know about the advertising tools promoting organic products
MD_3: Know about available distribution channels to buy environment-saving products
MD_4: Even though organic products are expensive, I still buy them
4.3 Hypothesized Proposed Model This model was developed using the available literature on the external variables considered as possible determinants. The inclusion of constructs and the relationships among the items in the model are based on earlier
knowledge and research, and on studies of different encouraging factors of sustainable consumption behaviour.
Demographic variables: Adopted from [18], who identified the impact of demographics such as gender, age, and education level on green purchase and consumption behaviour. More demographics than in earlier studies were shown to be linked to a number of specific environmentally friendly activities. The authors concluded that individual behaviours, rather than broad statements or attitudes, may be more responsive to demographic influences.
H1: There is a positive and significant relationship between demographic variables and sustainable consumption behaviour.
Awareness and information: The study [19] found that an understanding of the term is a prerequisite for changes in consumer behaviour and consumption models. We adopted the construct from the findings of their paper "Consumers' Awareness of the Term Sustainable Consumption", published in 2018, which found that consumers are unfamiliar with the concept of sustainable consumption. The majority of respondents had come across concepts connected to sustainable consumption but were unable to make links between them, and only around half of those polled were confident in their ability to interpret sustainable consumption on their own. It is worth mentioning, though, that the wide range of behaviours reported suggests that the respondents understand the notion.
H2: There is a positive and significant relationship between awareness and information and sustainable consumption behaviour.
Influence of social environment: The study [20] contributed to a better understanding of what quality of life means from the standpoint of sustainable consumption. Different consumer motivations are discussed, as well as the contributions of the rich and the poor to unsustainable consumption patterns. Using relevant literature as a starting point, the authors opened a discussion about the complicated relationship among consumption, beliefs, uniqueness, and purchase decision mechanisms in a globalized perspective. We adopted the constructs developed in [20] to understand the impact of the social environment on sustainable consumption behaviour. Social environmental influence can be exerted by family, friends, and other peer-group members, who shape attitudes towards the environment. Groups which operate in the power mode can encourage the behaviour when people turn to them, and the search for identity in a group and the expectation of support also increase the influence of social environmental forces.
H3: There is a positive and significant relationship between the influence of social environmental forces and sustainable consumption behaviour.
Market determinants: Several authors have identified factors like price, advertising, and distribution channels that impact sustainable consumption behaviour [21]. The construct is based on this work, and a few of the relevant items developed there have been borrowed for this research paper. Market determinants of the products and services of
sustainable behaviour also modify consumption behaviour: behaviour would be distorted if the price of a sustainable product increased, and it is very important to know how consumers perceive the efficiency of sustainable products in the market.
H4: There is a positive and significant relationship between market determinants and sustainable consumption behaviour.
4.4 Correlation Analysis
The correlations among all the variables used in the hypotheses were computed. Table 4 shows a positive and high correlation between demographic variables and sustainable consumption behaviour. Table 5 shows a positive and high correlation between awareness and information and sustainable consumption behaviour. Table 6 shows a positive and high correlation between influence of social environment and sustainable consumption behaviour.

Table 4 Correlation between demographic variables and sustainable consumption behaviour
Pearson correlation = 0.725*, Sig. (2-tailed) = 0.002, N = 350
(*) Correlation is significant at the 1% level of significance
Table 5 Correlation between awareness and information and sustainable consumption behaviour
Pearson correlation = 0.745*, Sig. (2-tailed) = 0.003, N = 350
(*) Correlation is significant at the 1% level of significance
Table 6 Correlation between influence of social environment and sustainable consumption behaviour
Pearson correlation = 0.785*, Sig. (2-tailed) = 0.000, N = 350
(*) Correlation is significant at the 1% level of significance
Table 7 Correlation between market determinants and sustainable consumption behaviour
Pearson correlation = 0.765*, Sig. (2-tailed) = 0.002, N = 350
(*) Correlation is significant at the 1% level of significance
Table 7 shows positive and high correlation values between market determinants and sustainable consumption behaviour. All the constructs showed relatively high positive correlation with the sustainable consumption behaviour. Hence, it is a good premise to move ahead and check for structural equation modelling results.
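A sketch of how each of these coefficients can be computed (the file and column names are hypothetical construct scores, e.g., item means per respondent):

import pandas as pd
from scipy.stats import pearsonr

survey = pd.read_csv("survey_responses.csv")   # hypothetical file
for construct in ["DEM", "AI", "SEN", "MD"]:
    r, p = pearsonr(survey[construct], survey["SCB"])
    print(f"{construct} vs SCB: r = {r:.3f}, p = {p:.4f}")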
4.5 Measurement Model Estimation To evaluate the measurement model, the statistical tool structural equation modelling (SEM) in AMOS 21.0 was used. It follows a three-step process:
4.5.1 Assessment of Convergent Validity
This assessment is based on the examination of the average variance extracted (AVE) of every construct. AVE values should be at least 0.50 to demonstrate that most of the variance is accounted for by the construct.
Table 8 AVEs and CRs of the constructs

Construct | No. of items | AVE | CR
Demographic variables | 3 | 0.71 | 0.82
Awareness and information | 4 | 0.77 | 0.86
Influence of social environment | 5 | 0.73 | 0.84
Market determinants | 4 | 0.78 | 0.90
The results of the analysis can be seen in Table 8: the AVEs exceed the threshold value of 0.50, showing that convergent validity is satisfactory.
4.5.2 Examination of Internal Consistency
This is the second step, in which the reliability of each construct is determined; the composite reliability (CR) statistic shows the internal consistency of each construct. It is analogous to Cronbach's alpha and uses 0.70 as the threshold for acceptable construct reliability. Table 8 shows that the CRs of all theoretical constructs exceed 0.70, so there is internal consistency in the items used for the research.
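Both statistics follow directly from the standardized factor loadings: AVE is the mean squared loading, and CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances). A sketch follows (the loadings shown are hypothetical, not values from this study):

import numpy as np

def ave_and_cr(loadings):
    lam = np.asarray(loadings, dtype=float)
    err = 1 - lam**2                  # item error variances
    ave = np.mean(lam**2)
    cr = lam.sum()**2 / (lam.sum()**2 + err.sum())
    return ave, cr

print(ave_and_cr([0.88, 0.90, 0.85, 0.87]))  # hypothetical 4-item construct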
4.5.3 Evaluation of Discriminant Validity
The establishment of suitable discriminant validity is the final step in evaluating a measurement model. It guarantees that each construct is distinct from the others and that the survey instrument variables load on the appropriate constructs. It is investigated by comparing the square root of each construct's AVE to the correlations with the other constructs; discriminant validity is considered satisfied when the square roots of the AVEs are higher than the correlation values. Table 9 shows the discriminant validity of the instruments, examined following Fornell and Larcker (1981). The square root of the AVE, shown in bold on the diagonal, was greater than the corresponding row and column values, indicating discriminant validity of the constructs.

Table 9 Correlations of the constructs and square roots of AVEs (diagonal, in bold)

Construct | 1 | 2 | 3 | 4
1. Demographic variables | 0.81 | | |
2. Information and education | 0.18 | 0.75 | |
3. Influence of social environment | 0.15 | 0.15 | 0.85 |
4. Market determinants | 0.20 | 0.22 | 0.20 | 0.87
4.6 Overall Measurement Model Fit and Testing of Hypotheses To assess this, SPSS AMOS software version 21.0 was applied, as it can handle the complex path model, examine the overall model fit, and perform confirmatory factor analysis. There are two steps: in the first, the overall fit of the model is assessed, and in the second, the significance of the hypothesized relationships is tested. Bivariate analysis was performed, i.e., two-tailed correlation p-values were examined at the 0.05 level. To determine the fit of the model, chi-square, CMIN, GFI, CFI, and RMSEA were computed, as shown in Table 10. The overall fit indices used for the structural model show that it fits with acceptable values, supporting the hypothesized model shown in Fig. 1. GFI, CFI, IFI, and TLI, the model fit indices, were measured at 0.918, 0.950, 0.950, and 0.937, respectively. The RMSEA was 0.040, which is less than 0.05 and indicates a good model fit [22]. All fit indices needed to be better than 0.9 to indicate a good or acceptable model fit [23]. The final measurement model met all fit criteria; as a result, it may be judged to fit the data well and to have an acceptable level of reliability and validity (Table 10).
Fig. 1 Hypothesized proposed model
Table 10 Summary of the overall measurement model

Fit index | Initial | Final
CMIN/DF | 3.816 | 1.540
GFI | 0.665 | 0.918
TLI | 0.443 | 0.937
CFI | 0.515 | 0.950
RMSEA | 0.091 | 0.040
IFI | 0.521 | 0.950
Table 11 shows that all the null hypotheses are rejected. The first hypothesis concluded that there is a positive and significant relationship between demographic variables and sustainable consumption behaviour, which means that the demographic variables of consumers play an important role in shaping sustainable consumption behaviour. The acceptance of the alternative of the second hypothesis emphasized that consumer awareness plays an important part in regulating consumers' sustainable consumption behaviour, so updated information has a significant role in steering consumption behaviour towards sustainability. The acceptance of the alternative of the third hypothesis highlighted that there is a positive and significant relationship between the influence of social environmental forces and sustainable consumption behaviour: the social environmental factors considerably affect sustainable consumption behaviour in achieving a sustainable environment. The acceptance of the alternative of the last and fourth hypothesis emphasized that market determinants have a significant role in controlling sustainable consumption behaviour, so companies need to use better marketing factors and practices to influence sustainable consumption behaviour.
5 Conclusion The study was conducted within the conceptual framework developed from previous researchers' studies. Its aim was to examine the effect of different demographic variables and other determinants on sustainable consumption behaviour. The results of the research have shown that demographic variables and other determinants, namely social environmental forces, awareness and information, and market determinants, all have a positive and significant impact on sustainable consumption behaviour. A sustainable environment can only be achieved when people receive more and more information on the positive effects of sustainability. Social environmental forces also play a significant role in changing consumption behaviour and moulding it towards sustainable consumption. It is not only the task of consumers to attend to the factors responsible for sustainable consumption; it is also the duty of marketers to provide sustainable products and adopt sustainability practices. Consumers require assistance in making sustainable choices, and as their awareness grows, they will begin to comprehend the consequences of their purchasing decisions, leading to long-term sustainable practices [24]. Consumer awareness has shifted as a result of globalization and technological improvements, and customers are now more aware of many social and environmental issues. Strategic problem framing by itself is unlikely to be helpful in changing entrenched attitudes and behaviours in the absence of a larger behavioural change programme [25]. Different methods of raising awareness, such as social media networking, correct advertising, product labelling, and educational programmes, can be used to achieve the goal of sustainable development. Furthermore, consumers' real purchasing decisions are influenced by the price and quality of items and services.
Table 11 Significance test of individual hypotheses

Hypothesis | Path | Proposed model: β / S.E. / C.R. (t) / P | Modified model: β / S.E. / C.R. (t) / P | Result
H1 | DEM → SCB | 0.174 / 0.064 / 1.127 / 0.003 | 0.144 / 0.039 / 2.133 / 0.001 | Supported
H2 | SEN → SCB | 0.168 / 0.61 / 1.137 / 0.002 | 0.156 / 0.45 / 3.562 / 0.002 | Supported
H3 | IE → SCB | 0.68 / 0.035 / 4.653 / 0.080 | 0.095 / 0.028 / 1.146 / 0.002 | Supported
H4 | MD → SCB | 0.072 / 0.038 / 4.730 / 0.086 | 0.102 / 0.030 / 1.152 / 0.003 | Supported
To help people cope with the situation, infotainment activities and programmes might be conducted. In parallel, a legislative framework can be devised to exert control over corporations' unsustainable behaviours, allowing for the timely correction of flaws. The current work offers a fresh perspective that will be effective in raising consumer knowledge about sustainable consumption in order to attain long-term goals. Emotional intelligence increases the impact of involvement on pro-environmental and pro-social consumption behaviour, as well as having a direct impact on pro-environmental conduct [26]. This research has important managerial ramifications: it informs policymakers and marketing executives about the key predictors of consumers' sustainable purchasing behaviour. Marketers would benefit from understanding the drivers of sustainable purchasing behaviour, as this knowledge will allow them to tailor their product offerings and develop marketing strategies that encourage it. The current study also has important implications for public policy. According to the findings, environmental awareness, demographics, market factors, and the social environment are all important factors leading consumers to investigate green products. Policymakers should use environmental education to further nurture and develop this tendency. Consumers are generally sceptical of manufacturers' environmental claims and find it difficult to identify green products; as a result, environmental education should provide information on how a consumer can help the environment.
Acknowledgements The authors sincerely thank the management and administration of Manipal University Jaipur, Dr DY Patil Institute of Management Studies, and Poornima Group of Colleges for providing the necessary support for this research.
References
1. J. Shadymanova, S. Wahlen, H. van der Horst, 'Nobody cares about the environment': Kyrgyz perspectives on enhancing environmental sustainable consumption practices when facing limited sustainability awareness. Int. J. Consum. Stud. 38(6), 678–683 (2014)
2. H.M. Farley, Interpreting sustainability: an analysis of sustainable development narratives among developed nations. Ph.D. diss., Northern Arizona University, 2013
3. S. Koos, Varieties of environmental labelling, market structures, and sustainable consumption across Europe: a comparative analysis of organizational and market supply determinants of environmental-labelled goods. J. Consum. Policy 34(1), 127–151 (2011)
4. N. Mazar, C.B. Zhong, Do green products make us better people? Psychol. Sci. 21(4), 494–498 (2010)
5. S. Zanni, S.S. Cipolla, E. di Fusco, A. Lenci, M. Altobelli, A. Currado, M. Maglionico, A. Bonoli, Modeling for sustainability: life cycle assessment application to evaluate environmental performance of water recycling solutions at the dwelling level. Sustain. Prod. Consump. 17, 47–61 (2019)
6. L.A. Hale, At home with sustainability: from green default rules to sustainable consumption. Sustainability 10(1), 249 (2018)
7. L. Lundblad, I.A. Davies, The values and motivations behind sustainable fashion consumption. J. Consum. Behav. 15(2), 149–162 (2016)
8. D.B. Holt, Constructing sustainable consumption: from ethical values to the cultural transformation of unsustainable markets. Ann. Am. Acad. Pol. Soc. Sci. 644(1), 236–255 (2012)
9. M. Bilharz, K. Schmitt, Going big with big matters. The key points approach to sustainable consumption. GAIA-Ecol. Perspect. Sci. Soc. 20(4), 232–235 (2011)
10. I. Vermeir, W. Verbeke, Sustainable food consumption: exploring the consumer attitude–behavioral intention gap. J. Agric. Environ. Ethics 19(2), 169–194 (2006)
11. V. Sharma, S. Srivastava, B. Valarmathi, N.S. Gupta, A comparative study on the performance of deep learning algorithms for detecting the sentiments expressed in modern slangs, in International Conference on Communication, Computing and Electronics Systems: Proceedings of ICCCES 2020, vol. 733 (Springer, 2021), p. 437
12. M.S. Ishi, J.B. Patil, A study on machine learning methods used for team formation and winner prediction in cricket, in Inventive Computation and Information Technologies (Springer, Singapore, 2021), pp. 143–156
13. M.J. Cohen, Consumer credit, household financial management, and sustainable consumption. Int. J. Consum. Stud. 31(1), 57–65 (2007)
14. P. Dolan, The sustainability of "sustainable consumption". J. Macromark. 22(2), 170–181 (2002)
15. M. Shanthini, G. Vidya, IoT-based smart door lock with sanitizing system, in Inventive Computation and Information Technologies (Springer, Singapore, 2021), pp. 63–79
16. A.C. Nielsen, Global faces and networked places. Retrieved January 29, 2010 (2009)
17. P.J. Lavrakas, Encyclopedia of Survey Research Methods (Sage Publications, 2008)
18. C. Fisher, S. Bashyal, B. Bachman, Demographic impacts on environmentally friendly purchase behaviors. J. Target. Meas. Anal. Mark. 20(3), 172–184 (2012)
19. E. Goryńska-Goldmann, M. Gazdecki, Consumers' awareness of the term sustainable consumption, in Conference Proceedings, International Scientific Days 2018: Towards Productive, Sustainable and Resilient Global Agriculture and Food Systems (Wolters Kluwer, Nitra, 2018), pp. 316–329
20. N.M. Ayala, Sustainable consumption, the social dimension. Revista Ecuatoriana de Medicina y Ciencias Biológicas 39(1) (2018)
21. Y. Joshi, Z. Rahman, Factors affecting green purchase behaviour and future research directions. Int. Strateg. Manage. Rev. 3(1–2), 128–143 (2015)
22. J.H. Steiger, Point estimation, hypothesis testing, and interval estimation using the RMSEA: some comments and a reply to Hayduk and Glaser. Struct. Equ. Model. 7(2), 149–162 (2000)
23. J.F. Hair, R.E. Anderson, B.J. Babin, W.C. Black, Multivariate Data Analysis: A Global Perspective (Pearson, Upper Saddle River, 2010)
24. M. Soni, S. Dawar, A. Soni, Probing consumer awareness and barriers towards consumer social responsibility: a novel sustainable development approach. Int. J. Sustain. Dev. Plan. 16(1), 89–96 (2021)
25. L.P. Fesenfeld, Y. Sun, M. Wicki, T. Bernauer, The role and limits of strategic framing for promoting sustainable consumption and policy. Glob. Environ. Chang. 68, 102266 (2021)
26. S. Kadic-Maglajlic, M. Arslanagic-Kalajdzic, M. Micevski, J. Dlacic, V. Zabkar, Being engaged is a good thing: understanding sustainable consumption behavior among young adults. J. Bus. Res. 104, 644–654 (2019)
The Concept of a Digital Marketing Communication Model for Higher Education Institutions Artur Kisiołek, Oleh Karyy, and Ihor Kulyniak
Abstract Digital marketing has become an essential element of the activities of higher education institutions. Accordingly, higher education institutions need to adapt their marketing communications to modern realities. The authors' intention was to highlight strategic and tactical aspects relevant to digital marketing communication processes, based on a literature review, research on the Internet marketing activity of higher education institutions from Poland and Ukraine, and their own experience. The authors propose a conceptual model for the digital marketing communication of a higher education institution that is universal and applicable to any type of higher education institution regardless of its profile, form of ownership and country. Keywords Digital marketing · Customer data platform · Higher education institutions · Marketing communication model · Omnichannel marketing · Web 2.0
1 Introduction
The use of Internet technologies in the communication process is now an active part of the marketing activities of universities. These modern changes in marketing activities require scientific reflection. Internet marketing can be described as a series of activities, procedures and practices that aim to achieve set goals, attract new visitors, retain them and turn visits into a specific customer response. To understand the different goals of digital marketing for each entity, the first thing to do is to understand the core tools that are widely used today.
A. Kisiołek, Great Poland University of Social Studies and Economics in Środa Wlkp., Środa Wielkopolska, Poland, e-mail: [email protected]
O. Karyy · I. Kulyniak (B), Lviv Polytechnic National University, Lviv, Ukraine, e-mail: [email protected]
O. Karyy, e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_6
Higher education institutions are no exception in this matter. The multiplicity of online marketing instruments, tools and channels indicates both the remarkable potential of digital communication and the need to integrate its various forms across many possible channels. The aim of the article is to present a conceptual model of digital marketing communication of a higher education institution, based on literature studies, the authors' own research on digital marketing of higher education institutions in Poland and Ukraine, and their professional experience in this matter. The model is an attempt to show the directions of integration in a multichannel and then omnichannel approach, with the use of a Customer Data Platform (CDP) database, cloud computing, artificial intelligence (AI) and machine learning.
2 Theoretical Outline Modern researchers distinguish three main directions of technological change affecting the operation of organisations, and these include transformational transitions: • from e-communication to m-communication; • from simple algorithmisation to artificial intelligence; • from simple analytics, via the Internet of Things (IoT) and Big Data, to predictive analytics [1–3]. The Internet, in the first, stationary phase of its existence (Web 1.0), became a database which was, for its current size, quite limited, and its further development in the interactive phase Web 2.0 brought an exponential growth of data. In such an overcrowded Web, in the chaos of publishing, finding valuable information is difficult, time-consuming and complex. Therefore, the next phase in the development of the Internet is the Web 3.0 concept, based on the semantic Web and artificial intelligence, the aim of which is to enable intelligent information search in a selective, targeted and user-driven manner. According to Rudman and Bruwer [4], in Web 3.0, a machine will be able to understand and catalogue data in a human-like way, making the Internet a base where any data format can be shared and understood by any device in any network. This evolution will enable the autonomous integration of data and services, as well as the creation of new functionalities. It also brings risks related to the protection of personal data, including unauthorised access or manipulation of data on an unprecedented scale. According to Baker [5, p. 14]: “The exploding world of data, as we will see, is a giant laboratory of human behaviour. It is a testing ground for social science, for economic behaviour and psychology”. A further development of the Web in the direction indicated above is already a fact, and its consequence will be, as Porter and Heppelmann [6] specified, the exit of data “trapped on two-dimensional document pages and computer screens” into three-dimensional Augmented Reality (AR) understood as “a set of technologies that put digital data and images on the physical world”, which gives hope for bridging
the gap between digital data collected on the Web and the possibility of using them in the physical world. The Internet, according to Kiełtyka and Zygoń [7, p. 24], has evolved from a simple communication tool into "the basis for the functioning of social life in many areas of the economy", and technologies and adapted IT tools, both real and virtual, are still developing. The evolution of the Internet does not lose its momentum, and the changes it generates are reflected, among others, in the marketing activities of organisations worldwide. The digitalization of marketing is a process that has been progressing very dynamically for several years, determined by the speed of technological change, which is directly proportional to the changes that occur in the model of customer behaviour. It all started with the World Wide Web, which has now transcended computer screens and followed the user to include the mobile phone, the tablet, wearable devices and the space around us, through objects hitherto offline, as exemplified by home appliances (Internet of Things). Artificial intelligence (AI) techniques are applied to customer data, which can be analysed to anticipate customer behaviour. AI, big data and advanced analytics techniques can handle both structured and unstructured data with greater speed and precision than conventional computing, which underpins Digital Marketing (DM) [8]. According to Zeeshan and Saxena [9]: "The function of artificial intelligence and digital marketing have been started to be used in various fields, especially in marketing because it is the trendiest work in the market creating major changes". The research conducted by the authors showed a certain picture of the use of the Internet in the marketing activities of higher education institutions in Poland and Ukraine. Given the pace of change, this picture could not be described as fully "current" even at the time of the survey interviews with respondents or the literature search. The findings point to the significant and growing role of the Internet in the marketing activities of higher education institutions in the analysed countries, and to areas that are gaining strategic importance for the future, such as social media and mobile marketing. The research results indicated the great potential of these areas, as well as opportunities for improvement, the use of additional tools and integration towards multichannel and omnichannel communication. The results obtained during the research, as well as the speed and nature of changes in the area of digital marketing, led the authors to develop a model of digital marketing communication of a higher education institution, which is intended to be an aid in the process of the digital transformation of a higher education institution's marketing, as well as material for discussion on the future of marketing communication in an omnichannel environment using cloud-based marketing automation technology. Selected aspects of the marketing activity of a higher education institution are the subject of many research works by scientists representing different educational systems and nationalities [10, 11]. In the following, selected models are presented which provide the theoretical background to the concept prepared by the authors. According to Williams [12], the last 20 years have seen profound changes as a result of the development of information and communication technologies, which are unique in their comprehensive reach and the stakeholders they affect.
Higher education institutions are under increasing pressure to respond to national
and international economic, social and political changes. According to Chaurasia et al. [13], the answer to this evolving, complex picture of contemporary relationships between a higher education institution and its external stakeholders is "data". Structured data streams from, for example, transaction systems, web sites, social networks and radio frequency identification (RFID) scanners, as well as "physiological" data, "user footprint" data and other data from various types of sensors (such as popular beacons), are used to build collections of Big Data about users, which, after processing, become important for an organisation that has them at its disposal. The data-stimulated market for higher education services is referred to in the literature as "academic and educational analytics" [14]. According to Chaurasia et al. [13, p. 1100], Big Data can help an academic community better understand its students' needs. Based on the concept of capability maturity models (CMM) developed at the Carnegie Mellon Software Engineering Institute, Chaffey and Ellis-Chadwick [15] proposed a CMM model for digital marketing consisting of processes considered in relation to five maturity levels, i.e. initial, managed, defined, quantitatively managed and optimised. In the earlier model, the researchers included six processes: digital channel strategy development; online customer acquisition; online customer conversion and customer experience; customer development and growth; cross-channel integration and brand development; and overall digital channel management, including change management. In their subsequent work, Chaffey and Ellis-Chadwick [15] revised the model, detailing the following seven processes: strategy; performance improvement; management commitment; resources and structure; data and infrastructure; integrated customer communication; and integrated customer experience. A summary of selected CMM models in relation to Web 2.0 technologies in digital marketing is included in Table 1. These models take a variety of perspectives: the broader the problem is framed, the more general the concept becomes, and consequently the more significant the changes required to adapt it to the specific requirements of each organisation. Consequently, as reported by Al-Thagafi et al. [16, p. 1149], many sectors and organisations are developing their own, narrower and more detailed CMM models. The CMM model proposed by these researchers relates to the use of social media (Web 2.0 technologies) in the recruitment of foreign students at higher education institutions in Saudi Arabia. To develop it, the authors used the four business processes of the AIDA (Attention, Interest, Desire, Action) marketing communication model. The aim of the research was to assess the extent to which Web 2.0 technologies were implemented to support these processes. The analysis covered the period from when a prospective student first learned about an educational service until he/she decided to apply to a particular higher education institution.
Table 1 Overview of capability maturity models (CMM) in relation to social media marketing activities

Model: Dynamic capabilities model of organisations (Bolat et al. [17])
Description: Four abilities: 1. Sensing the market; 2. Managing relationships; 3. Branding; 4. Developing content
Advantages: Distinguishes the skills and knowledge needed for implementing mobile social media capabilities at every level
Restrictions: Focuses on the advertising industry in business-to-business (B2B) environments; based on a single operational level rather than maturity levels

Model: Excellence in social media (Chaffey [18])
Description: Five abilities: 1. Initial (unintentional use of social media); 2. Managed (specific objectives); 3. Defined (SMART objectives); 4. Quantitatively managed (statistical measurement of social media activity); 5. Optimised (return on investment after verification)
Advantages: Constantly updated by the Smart Insights research group to provide information on the latest and best practices in social media
Restrictions: Focuses on plans and activities without paying attention to the capacity of staff to implement them

Model: Strategic capabilities of the organisation (Nguyen et al. [19])
Description: Three abilities: 1. Gaining knowledge from social media; 2. Integrating knowledge; 3. Applying knowledge in line with the strategic directions and choices of the organisation
Advantages: Focuses on how organisational social behaviour can lead to effective integration of social media when implementing an organisation's strategy
Restrictions: Switches between organisational change and marketing strategy; embedded in the cultural and economic context of China and its widely used Web 2.0 technologies, e.g. WeChat, Weibo

Model: Social media opportunities in B2B communication [20]
Description: Five abilities: 1. Broadcast speed (the speed at which a social media message reaches its target audience); 2. Parallelism (the extent of mutual understanding between a sender and a receiver); 3. Symbol sets (flexibility factor in message encoding); 4. Possibility to practice (whether the message can be edited before sending); 5. Re-processability (whether the message can be edited/deleted after sharing)
Advantages: Developed to improve marketing communication skills limited to B2B practices of small- and medium-sized enterprises (SMEs)
Restrictions: Focuses on B2B practices in SMEs (limited to Chinese SMEs and Web 2.0)

Model: Social media opportunities in B2B marketing [21]
Description: Four abilities: 1. Technological (understanding and categorising different social networks according to the organisation's strategic objective(s)); 2. Operational (building online communities to increase user benefits); 3. Managed (mechanisms to assess, control and measure social media performance and results); 4. Strategic (ensuring that the organisation has the necessary cultural and individual capabilities to use social media in its business over the long term)
Advantages: Suggests that a dynamic environment capability using a social-cognitive approach to the differences in individuals' skills is essential for the successful implementation of Web 2.0 in B2B marketing
Restrictions: A descriptive study based on literature analysis of 112 articles; no primary data were collected

Source: Based on [16, pp. 1150–1151]
3 Omnichannel Versus Multichannel Approach According to Chaffey and Ellis-Chadwick [15], a strategic approach to digital marketing requires defining capabilities and initiatives to support the marketing and business objectives that an organisation should implement to leverage digital media, data and marketing technologies to increase multichannel audience engagement using digital devices and platforms. This definition will serve as a basis for further discussion, as the number of tools and channels and the new technological possibilities mean that modern marketing and, more specifically, marketing communications are moving towards a multichannel and omnichannel environment. In the multichannel approach, multiple tools are used independently of each other, and their interaction is non-existent or limited. The omnichannel model is based on the multichannel template, but the essence of its functioning is to integrate all tools and channels into one coherent message for the recipient. According to Nowak [22], omnichannel marketing is “a complex ecosystem based on a central data system with many interrelated components”. It is not just about giving the customer different ways to interact, but it is about “providing them with a consistent experience and maintaining a complete sequence of actions at every level of the customer’s relationship with the brand”. Omnichannel marketing, according to Berman and Thelen [23], is generally opposed to multichannel marketing. The researchers outline the main differences between each of these approaches based on two dimensions: organisational strategy
and consumer behaviour. In terms of strategy, the differences are based on the objectives of the organisation, the uniformity of messages across devices and channels, the distinction between offline and online services, the use of simple or multiple touchpoints, the format of the organisation and the extent to which the customer and databases are unified across all channels. Differences based on consumer behaviour include the following: the design of the consumer's purchase path (uniform or different and linear or non-linear), the place of purchase versus the place of collection and return, and the degree of effort the consumer has to make when moving across channels and devices. Ultimately, the differences between multichannel and omnichannel marketing will result from the strategy, customer profile and process maturity of the organisation. In the case of higher education institutions, it is a matter of evolution: first the multichannel approach and then more personalization thanks to the implementation of the omnichannel approach. Gotwald-Feja [24, p. 263] believes that the changing marketing environment as well as the tools and scope of marketing communication force researchers to modify the known and previously applied models of marketing communication, which can be exemplified by a critical analysis of classical and contemporary models of communication by Szymoniuk [25, 26] and the concept of a spherical model of marketing communication proposed by this researcher. The models of marketing communication on the Internet, from the point of view of the considerations carried out, should be extended to the reality of the omnichannel environment. A proposal for an omnichannel marketing communication (OMC) model was presented by Gotwald (Fig. 1). The author introduced a mix of tools in the process of encoding and decoding a message, as well as turned the singularity of the process into its multiplicity and parallelism. Understood in this way, the sender's (company's) message is the sum of "messages delivered across multiple channels, using a variety of marketing communication tools" [27, p. 46].
Fig. 1 Omnichannel marketing communication model. Source Gotwald, p. 46
The researcher points out the need to encode messages that complement each other synergistically, as they are decoded by the recipient (customer) applying different tools at different times. Synergy may occur at nodal points, where the recipient applies numerous tools at the same time; then, according to the author, "the message conveyed has a chance to resound in an exceptionally clear and unambiguous manner" [24, p. 268]. The discussed part of the model is shown in the upper part of Fig. 1, while its lower part illustrates the feedback process, in which the sender (customer) selects one communication channel and within it transmits a message which is then decoded by the receiver (company). Effective communication in an omnichannel environment requires, according to the author of the OMC model, planning communication in such a way as to "achieve synergy between messages, tools and objectives" [27, p. 47]. The aforesaid model illustrates only a fragment of the complexity of the digital marketing communications environment. Modern marketing is undergoing a profound transformation, part of which is the integration of multiple tools, channels, technologies and modes of operation with the requirements and needs of the customer operating in an offline and online reality. It is therefore appropriate to conceptualise new models that take into account the impact of digitalisation on marketing activity, thus setting the scene for further research and practical applications.
4 Conceptual Model of Digital Marketing Communication of Higher Education Institutions Development
Digital technology has made the so-called customer shopping path increasingly complex. Pasqua and Elkin [28, pp. 382–388] distinguish its four components: portability, preference, proximity and presence. Marketing has evolved from a monologue into a real-time dialogue with a single customer (with specific and unique needs), in which market players participate thanks to the development of social platforms and mobile technologies. Dependencies of this type will also exist in the market for higher education services. With a smartphone in hand, consumers everywhere are online 24/7, which Kall [29, p. 30] calls "the most important consequence of the massive spread of mobile phones"; moreover, thanks to the development of location-based services (LBS), companies can tailor their communication strictly to the specific needs of a given moment. Consequently, there are changes in customers' behaviour, their habits and expectations. This raises new challenges for today's managers to keep up with adapting their marketing communications strategy to the new multichannel and then omnichannel reality. Based on the research results obtained from higher education institutions from Poland and Ukraine, the literature research and the authors' experience in online marketing, a conceptual model of digital marketing communication of higher education institutions has been constructed. The concept is universal, applicable to any type of higher education institution regardless of its profile, form of ownership and country. The digital marketing communication of a higher education institution, according to the model shown in Fig. 2, can be multichannel in the first stage or omnichannel in the second stage.
Fig. 2 A conceptual model for digital marketing communication of higher education institution. Source Own work
Changes in consumer behaviour, as well as new technologies, according to Berman and Thelen, foster a shift from multichannel marketing to omnichannel marketing. Changes related to consumer behaviour include the increasing use of mobile devices, the widespread use of social media and the popularity of related software (e.g. applications). According to the cited authors, the large number of differences between multichannel marketing and omnichannel marketing indicates the complexity and multidimensional aspect of omnichannel marketing. It also suggests that an organisation may be in the initial, intermediate or final stages of adopting an omnichannel marketing strategy [23, p. 598]. The presented model does not assume the division of the multichannel and omnichannel stages into sub-periods related to the implementation phase, as this may be the subject of separate research. The sender transmits the communication (as total messages) via a cloud-based Student Data Platform (SDP). The SDP system is based on the Customer Data Platform (CDP), which, according to Kihn and O'Hara [30, p. 41], provides a place to store customer data and perform analysis, as well as a layer that takes abstract customer data and connects it to real-time systems to perform tasks such as managing interactions (in real time), making decisions and connecting content. The communication from the Student Data Platform is directed multichannel to the
recipient, who is a potential or current student or a graduate of a higher education institution (for simplicity, a single term, recipient or student, is used in the following description). The message in each channel is either the same for the same recipient (the omnichannel approach) or different across channels (the multichannel approach). The reception of a communication is multichannelled, as a student may receive emails, text messages, social media notifications, etc., from a higher education institution; such opportunities are created by a multi- or omnichannel environment. The mix of tools used allows for additional synergies and streamlining of the communication process, thanks to the data management processes offered by the CDP and the technical ability to coordinate them (linking of channels). Once the communication (total messages) has been received and decoded, interaction from the receiver follows. This process, as Gotwald [27, p. 47], inter alia, points out, "has a linear character and depends on the effectiveness of the first tool (and channel) that the consumer chooses to convey the message" (in this case a student is treated as a customer). A student chooses a communication channel convenient for him or her at a given moment and sends a message to the university. His or her activity may include a response to a digital ad, social media post, email, web site or mobile app post. In the next step, the feedback message is decoded by the higher education institution. In a multi- or omnichannel environment, this decoding is simplified, as it "primarily involves filters and amplifications, applied not at the level of need, but at the level of the perception of the relevance of the problem and the total value of the consumer" (here, of the student) to the higher education institution. At this point, one cycle is closed and another university–student interaction is possible. It should be emphasised that all the communication described takes place in the cloud, and the platform itself can also be described as software for managing and sharing data for other systems. The components of the Student Data Platform are the data, the Customer Data Platform database and the key communication channels. The data feeding the CDP come from the CRM (Customer Relationship Management) system or other databases at a higher education institution's disposal, and from digital communication channels. The model includes the basic ones, such as digital advertising, social media, email, mobile tools, web site or video. The data to be acquired can be categorised as follows:
• 1st-party data: data derived from user interaction with content that the higher education institution publishes online (e.g. through forms, subscriptions, likes and activity on its own web sites);
• 2nd-party data: data obtained through the cooperation of the higher education institution with its partners in the network (e.g. publishers of digital content aimed at the same target group);
• 3rd-party data: external data collected by third parties and made commercially available.
information database, but it is also a real-time engagement system that makes split-second decisions with the support of five main processes:
1. Data processing: the aim of which is to build a unified user profile with data from all the systems collected over the years, integrated and tailored to further applications.
2. Segmentation: creating intelligent segments based on artificial intelligence and machine learning.
3. Personalisation: a process in which the planning and response phases are intertwined, enabling campaigns to be managed and decisions to be taken in real time. As part of this process, the marketing specialist can create the student experience as part of the academic customer journey, and develop rules (including machine learning-based decisions) to handle related events, such as an unfamiliar user appearing on a web site or mobile application.
4. Engagement: the ability to contact the student directly (e.g. by sending an email or a text message) or interact with systems that reach the recipient through the channel of their choice.
5. Optimisation: a process in which user conversion data and any other signals about user activity are captured, and actions are then taken based on the reported results to improve them. The main goal of optimisation is Real-time Interaction Management (RTIM). According to M. Kihn and Ch. O'Hara, "in a world where consumers move from channel to channel in near real time, RTIM is the way to connect and relate fast-moving experiences, driving relevance and delivering lift" [30, p. 113].
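To make the first two of these processes concrete, the sketch below shows how a Student Data Platform might fold raw channel events into unified profiles and apply a simple segmentation rule. It is a deliberately minimal Python illustration: all field names, event shapes and the engagement rule are hypothetical, and commercial CDPs [33–36] expose far richer, vendor-specific interfaces.

```python
from dataclasses import dataclass, field

@dataclass
class StudentProfile:
    """Unified profile assembled from 1st-, 2nd- and 3rd-party data."""
    student_id: str
    email_opens: int = 0
    site_visits: int = 0
    channels: set = field(default_factory=set)   # e.g. {"email", "website"}

def process_events(profiles, events):
    """Data processing: fold raw channel events into unified profiles."""
    for e in events:
        p = profiles.setdefault(e["student_id"], StudentProfile(e["student_id"]))
        p.channels.add(e["channel"])
        if e["channel"] == "email":
            p.email_opens += 1
        elif e["channel"] == "website":
            p.site_visits += 1
    return profiles

def segment_engaged(profiles):
    """Segmentation: a hypothetical rule flagging highly engaged prospects."""
    return [p for p in profiles.values()
            if p.site_visits >= 3 and "email" in p.channels]

# Illustrative usage with made-up events
events = [{"student_id": "s1", "channel": "website"}] * 3 + \
         [{"student_id": "s1", "channel": "email"}]
print(segment_engaged(process_events({}, events)))
```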
Customer data platforms are offered in the SaaS (software as a service) model, in the cloud, with the full possibility of integration with the IT environment functioning in a given organisation. These platforms mark the further path of the digital transformation of marketing, where what is offline intersects with what is online [31]. They are an evolution of CRM and Data Management Platforms (DMP), using Big Data resources and the latest AI and machine learning solutions to automate marketing activities and integrate them at an omnichannel level. Marketing automation based on AI and machine learning means a new approach to the problem of information overload and the need to constantly analyse it in order to improve communication and streamline decision-making processes. The main property of machine learning technology is the ability to continuously learn and improve results based on experience, and to continuously collect and analyse data [32]. CDP solutions offered by IT companies [33–36] of different sizes and scales, as well as in the banking sector [37], can be successfully implemented in higher education institutions [38] (e.g. in marketing communication, recruitment and enrolment processes, day-to-day student services, etc.).
5 Implementation of the Concept Model The implementation of the Student Data Platform as proposed requires the full support of the authorities of the higher education institution and strategic decisions at this level. The relationships between the different key entities in the implementation and management of the SDP are shown in the lower part of Fig. 2. The Marketing Department (or the higher education institution’s organisational unit responsible for marketing activities) is the initiator and main manager of the Student Data Platform. Modern marketing is real-time, and instant feedback means that higher education marketing professionals should regularly use technology platforms to measure, analyse and tailor their activities to the needs and experiences of students operating across multiple digital media channels [39]. User expectations move towards personalisation and highly individualised communication, and therefore, increasing demands will be placed on the technology applied and the skills of marketers. The challenge for a higher education institution using a digital marketing automation platform will be to define new roles for Marketing Department managers and specialists. The marketing manager will be responsible for leading the team and supervising processes and the implemented strategy (including ongoing projects and campaigns); coordinating cross-channel communication; defining strategic direction; approving content and analysing campaign metrics. The role of the specialist marketer will be to develop, plan and send marketing messages; analyse data for decision-making; project audit [40]; and monitor and optimise ongoing campaigns. At this point, it is important to emphasise the importance of a clear legal framework related to guaranteeing privacy rights for all users. Marketing departments of higher education institutions are responsible for operational contacts with the IT provider, while at the strategic level, including the scope and conditions of implementation of the CDP, the management role is performed by the authorities of the higher education institution in cooperation with the IT provider. The authorities of the higher education institution will be, among others, responsible for approving the content of major marketing communication campaigns in line with the adopted marketing strategy. Success in digital marketing depends on the ability of the marketing division to cooperate with the IT Department, which is responsible for assisting in the implementation of the platform, managing the infrastructure and carrying out day-to-day supervision of the functioning of the higher education institution’s IT resources. In a formal way, IT Department employees report to the authorities of the higher education institution.
6 Conclusions
The discussed outline of the dependencies and roles of various organisational units of a higher education institution is of a general and postulative nature. The authors' intention was to highlight strategic and tactical aspects relevant to digital marketing
communication processes. The intertwining of marketing and IT competences, the role of the authorities of the higher education institution and the extent of outsourcing in the digital transformation of marketing indicate areas for further research. The ability to coordinate different channels in a marketing message and to personalise interactions and deliver the right content at the right time requires organisations to increase their technological competence almost exponentially. The model of digital marketing communication of the higher education institution outlined above is a proposal for higher education marketing professionals and managers, as well as those who supervise these areas. The impact of modern technologies on contemporary marketing broadens the scope of discussion on the marketisation of higher education, as well as on the essence and role of the marketing activities of the higher education institution. In this context, it is highly plausible that today's students expect communication with their favourite brands to be active across all the channels in which they themselves are present.
References
1. K. Kania, Gamifikacja w procesie wprowadzania nowych technologii informatycznych do organizacji jako zadanie specjalistów HR. Zarządzanie Zasobami Ludzkimi 5, 27–44 (2018)
2. B. Filipczyk, Zarządzanie wiedzą i komunikacją cyfrową w procesie onboardingu studentów, in Cyfrowa komunikacja organizacji, ed. by B. Filipczyk, J. Gołuchowski (Wydawnictwo Uniwersytetu Ekonomicznego w Katowicach, Katowice, 2020)
3. P. Karthigaikumar, Industrial quality prediction system through data mining algorithm. J. Electr. Infor. 3(2), 126–137 (2021)
4. R. Rudman, R. Bruwer, Defining Web 3.0: opportunities and challenges. Electron. Libr. 34(1), 132–154 (2016)
5. S. Baker, The Numerati (Mariner Books Houghton Mifflin Harcourt, Boston-New York, 2009)
6. M.E. Porter, J. Heppelmann, Strategiczne podejście do rzeczywistości rozszerzonej. HBR Polska 4, 42–56 (2018)
7. L. Kiełtyka, O. Zygoń, Współczesne formy komunikacji – jak zarządzać z wykorzystaniem Internetu Rzeczy i Wszechrzeczy. Przegląd Organizacji 2, 24–33 (2018)
8. B.R. Arun Kumar, AI-based digital marketing strategies—a review, in Inventive Computation and Information Technologies, ed. by S. Smys, V.E. Balas, K.A. Kamel, P. Lafata. Lecture Notes in Networks and Systems, vol. 173 (Springer, Singapore, 2021), pp. 957–969
9. M. Zeeshan, K. Saxena, Explorative study of artificial intelligence in digital marketing, in Proceeding of the International Conference on Computer Networks, Big Data and IoT (ICCBI–2019), ed. by A. Pandian, R. Palanisamy, K. Ntalianis. Lecture Notes on Data Engineering and Communications Technologies, vol. 49 (Springer, Cham, 2019), pp. 968–978
10. Z. Malara, Y. Ziaeian, Marketing model in global companies: designing and management. Organ. Rev. 6(953), 23–30 (2019)
11. I. Oplatka, J. Hemsley-Brown, A systematic and updated review of literature on higher education marketing: 2005–2019, in Handbook of Operations Research and Management Science in Higher Education. International Series in Operations Research and Management Science, ed. by Z. Sinuany-Stern (Springer International, 2021), pp. 1–71. Preprint: https://www.researchgate.net/publication/351845192_A_systematic_and_updated_review_of_the_literature_on_higher_education_marketing_2005--2019
12. P. Williams, Assessing collaborative learning: big data, analytics and university futures. Assess. Eval. High. Educ. 42(6), 978–989 (2016)
13. S.S. Chaurasia, D. Kodwani, H. Lachhwani, M. Avadhut Ketkar, Big data academic and learning analytics. Connecting the dots for academic excellence in higher education. Int. J. Educ. Manage. 32(6), 1099–1117 (2018)
14. P.D. Long, G. Siemens, Penetrating the fog: analytics in learning and education. Ital. J. Educ. Technol. 22(3), 132–137 (2014)
15. D. Chaffey, F. Ellis-Chadwick, Digital Marketing (Pearson Education Limited, Harlow, 2019)
16. A. Al-Thagafi, M. Mannion, N. Siddiqui, Digital marketing for Saudi Arabian University student recruitment. J. Appl. Res. High. Educ. 12(5), 1147–1159 (2020)
17. E. Bolat, K. Kooli, L.T. Wright, Businesses and mobile social media capability. J. Bus. Indus. Mark. 31(8), 971–981 (2016)
18. D. Chaffey, Digital marketing benchmarking templates. https://www.smartinsights.com/guides/digital-marketing-benchmarking-templates
19. B. Nguyen, X. Yu, T.C. Melewar, J. Chen, Brand innovation and social media: knowledge acquisition from social media, market orientation, and the moderating role of social media strategic capability. Ind. Mark. Manage. 51, 11–25 (2015)
20. W.Y.C. Wang, D.J. Pauleen, T. Zhang, How social media applications affect B2B communication and improve business performance in SMEs. Ind. Mark. Manage. 54, 4–14 (2016)
21. W.Y.C. Wang, M. Rod, S. Ji, Q. Deng, Social media capability in B2B marketing: toward a definition and a research model. J. Bus. Indus. Mark. 32(8), 1125–1135 (2017)
22. G. Nowak, Przyszłość marketingu to wszechkanałowość. Część 1. Geneza. https://marketerplus.pl/przyszlosc-marketingu-to-wszechkanalowosc-czesc-1-geneza
23. B. Berman, S. Thelen, Planning and implementing an effective omnichannel marketing program. Int. J. Retail Distrib. Manage. 46(7), 598–614 (2018)
24. B. Gotwald-Feja, Komunikacja marketingowa w realiach omnichannel – ujęcie modelowe. Marketing i Zarządzanie 1(47), 261–271 (2017)
25. B. Szymoniuk, Komunikacja marketingowa w klastrach i uwarunkowania jej skuteczności (Wydawnictwo Politechniki Lubelskiej, Lublin, 2019)
26. B. Szymoniuk, Sferyczny model zintegrowanej komunikacji marketingowej. Marketing i Zarządzanie 3(49), 193–208 (2017)
27. B. Gotwald, Komunikacja marketingowa w środowisku omnikanałowym. Potrzeby i zachowania konsumentów na rynku centrów nauki (Wydawnictwo Uniwersytetu Łódzkiego, Łódź, 2020)
28. R. Pasqua, N. Elkin, Godzina dziennie z mobile marketingiem (Helion, Gliwice, 2014)
29. J. Kall, Branding na smartfonie. Komunikacja mobilna marki (Wolters Kluwer Business, Warszawa, 2015)
30. M. Kihn, C.B. O'Hara, Customer Data Platforms: Use People Data to Transform the Future of Marketing Engagement (Wiley, 2020)
31. O. Prokopenko, O. Kudrina, V. Omelyanenko, ICT support of higher education institutions participation in innovation networks, in 15th International Conference on ICT in Education, Research and Industrial Applications, vol. 2387 (Kherson, 2019), pp. 466–471
32. I.P. Rutkowski, Inteligentne technologie w marketingu i sprzedaży – zastosowania, obszary i kierunki badań (Intelligent technologies in marketing and sales – applications, research areas and directions). Marketing i Rynek/Journal of Marketing and Market Studies 6, 3–12 (2020)
33. Microsoft. https://dynamics.microsoft.com/pl-pl/ai/customer-insights/what-is-a-customer-data-platform-cdp
34. IBM. https://www.ibm.com/pl-pl/analytics/data-ai-platform
35. Salesforce. https://www.salesforce.com/products/marketing-cloud/overview
36. SAP. https://apollogic.com/pl/2020/12/platforma-danych-klientow
37. S. Subarna, S. Smys, Big data analytics for improved risk management and customer segregation in banking applications. J. ISMAC 3(03), 235–249 (2021)
38. Higher Education Marketing Platform. https://www.salesforce.org/highered/marketing
39. O. Prokopenko, V. Omelyanenko, Marketing aspect of the innovation communications development. Innov. Mark. 14(2), 41–49 (2018)
40. A.K. Yadav, The substance of auditing in project system. J. Inf. Technol. Digital World 3(1), 1–11 (2021)
A Lightweight Image Colorization Model Based on U-Net Architecture Pham Van Thanh and Phan Duy Hung
Abstract For the problem of grayscale image colorization, many authors have proposed methods to produce the most plausible, vivid images from a gray input. Almost all of them introduce quite large neural network models with hundreds of megabytes of parameters. This paper proposes a relatively lightweight model that solves the problem with performance equivalent to recent methods. The model is based on the well-known U-net architecture, which is frequently used for semantic segmentation problems. It is trained to predict the chromatic ab channels given the lightness L channel in Lab color space to finally produce a colorful image. Our method applies self-supervised representation learning, where input and labeled output are different channels of the same image. Experiments on the commonly used PASCAL VOC 2012 and Places205 datasets show that our method has performance equivalent to other state-of-the-art algorithms while the model size is relatively smaller and consumes less computing resources. Keywords Image colorization · U-net architecture · Self-supervised learning
1 Introduction
Image colorization assigns a color to each pixel of a grayscale image. This is a difficult problem because two of the three channels of the ground truth image are lost. However, colorization is an interesting area with many applications. One of the most popular is colorizing legacy images or videos, which helps people learn more about the colors of old images. A historical movie is more vivid when it is colorized and gives viewers a better sense of what they are watching.
P. Van Thanh · P. Duy Hung (B) FPT University, Hanoi, Vietnam e-mail: [email protected] P. Van Thanh e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_7
Before the era of deep learning, most colorization methods required human intervention. The earlier work of [1] requires color scribbles, while [2–4] require reference images. With the emergence of deep learning, many authors [5–11] introduced deep convolutional neural networks (CNNs) to solve this problem in a fully automatic way; among them, [9] proposes a deep learning method for user-guided image colorization. Their objective is to colorize the images with the most plausible colors instead of finding the lost original colors. In deep learning approaches, authors propose deep CNNs to predict the chromatic ab channels in Lab color space given the lightness L channel. The CNNs usually have a contracting path that contains downsampling steps and an expanding path containing upsampling steps to grow the image back to its original resolution. These models perform well and generate images that can even fool humans in a "colorization Turing test," where people are asked to choose which is a generated image and which is the ground truth image [5]. Most of the authors highlight the importance of semantic colorization, which is the key to colorizing images plausibly. Furthermore, some authors [8, 10, 12–14] focus on the diversity of the generated colors. A car can be red, yellow, or white, and a bird can be green, blue, or black. Their methods are supposed to predict more vivid images with diverse colors instead of assigning a single color to a class of objects. To improve performance, other authors such as [15] present a memory-augmented model that can produce high-quality colorization with limited data, or [16] use a pre-trained Inception-ResNetv2 to extract features of images in parallel with a normal encoder. Su et al. [17] propose a method of instance-aware colorization. They leverage an off-the-shelf object detector to obtain object images and use an instance colorization network to extract object-level features. The above studies show that the problem of image colorization can be solved by different approaches. Many state-of-the-art algorithms have been proposed to colorize images with care for semantics and diversity. However, the size of these deep CNN models prevents people from deploying a colorization application on computationally limited devices. A common property of the CNNs proposed by the mentioned authors is that they are quite large. The model of [5] has 129 MB of parameters, while [7] introduces a model which consists of 588 MB of parameters. Heavy models with too many parameters take a lot of time to colorize an image, and much more time for a video or long movies. This motivates us to find a relatively lightweight model that performs as well as most recent models and is light enough to be deployed and run smoothly on regular devices. The main contribution of this study is to propose a lightweight model for semantic colorization with performance equivalent to state-of-the-art colorization methods. It is supposed to predict the ab channels given the lightness L channel and can colorize images of any resolution in a fully automatic manner. The model is inspired by a successful semantic segmentation architecture called U-net, proposed by Ronneberger et al. [18]. This work applies self-supervised representation learning, where raw images are used as the source of supervision.
2 Methodology
This section of the paper presents the proposed architecture and the loss function. The model is supposed to predict the ab channels from a given L channel in the Lab color space. There is a pre-processing step to convert each RGB image in the training set to a pair of input (L channel) and labeled output (ab channels). After prediction, the calculated ab channels are combined with the original lightness channel to produce the predicted Lab image. Images of any resolution are resized to 256 × 256 before processing and restored to the original resolutions after prediction.
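The pre-processing just described can be sketched as follows. This is an illustrative OpenCV-based implementation rather than the authors' published code; in particular, the value scaling (L to [0, 1], ab to roughly [-1, 1]) is an assumption on our part.

```python
import cv2
import numpy as np

def rgb_to_training_pair(rgb_image):
    """Convert an RGB image into an (L, ab) training pair in Lab space."""
    # Resize to the fixed working resolution used by the network
    resized = cv2.resize(rgb_image, (256, 256), interpolation=cv2.INTER_AREA)

    # For float32 input in [0, 1], OpenCV returns L in [0, 100]
    # and a, b roughly in [-128, 127]
    lab = cv2.cvtColor(resized.astype(np.float32) / 255.0, cv2.COLOR_RGB2LAB)

    L = lab[:, :, :1] / 100.0    # network input, shape (256, 256, 1)
    ab = lab[:, :, 1:] / 128.0   # training target, shape (256, 256, 2)
    return L, ab

def lab_to_rgb(L, ab):
    """Recombine the original L channel with predicted ab channels."""
    lab = np.concatenate([L * 100.0, ab * 128.0], axis=-1).astype(np.float32)
    rgb = cv2.cvtColor(lab, cv2.COLOR_LAB2RGB)
    return np.clip(rgb * 255.0, 0, 255).astype(np.uint8)
```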
2.1 The Architecture
The U-net architecture was proposed by Ronneberger et al. [18] based on the work of [19]. It was originally used for biomedical image segmentation. The CNN proposed by this study is a simplified version of the U-net (Fig. 1). Like some other colorization methods, this network has a contracting path and an expanding path; however, skip connections are added to retain information that might be lost during downsampling. In Fig. 1, the number of channels is denoted on top of each box while the resolution is on the left side. Inputs are 256 × 256 × 1 tensors, where 1 represents the lightness channel L. Outputs are 256 × 256 × 2 tensors representing the 2 chromatic channels a and b.
Fig. 1 The simplified U-net architecture
Fig. 2 The proposed model
Table 1 Model size comparison of the proposed method and the others

Method              Parameters (MB)
Zhang et al. [5]    129
Larsson et al. [7]  588
Proposed model      55
In the contracting path, 3 × 3 kernels and ReLU activations are used for all convolutional layers, and 2 × 2 max pooling is used for downsampling. Unlike [18], in this method, "same" paddings are applied in all convolutional layers. In the expanding path, each upsampling step is done by an up-convolution that halves the number of channels and a concatenation with a corresponding cropped feature map from the contracting path, followed by a 3 × 3 convolution with ReLU activation and "same" padding. Finally, a 1 × 1 convolutional layer with 2 filters is used to predict the ab channels before they are combined with the original lightness L channel to produce a colorful Lab image. Figure 2 shows the proposed model, where each rectangular box represents a convolutional block. The model consists of 9 convolutional blocks and a 1 × 1 convolutional layer. It has 4.7 million parameters, equivalent to 55 MB of memory storage. Compared to other models, this is relatively small and can be called "lightweight." Table 1 shows the comparison, where the model size of [5] is approximately 2 times, and that of [7] 10 times, larger than the proposed model.
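Read together with Figs. 1 and 2, the description in this subsection translates into Keras code roughly as below. This is an illustrative sketch, not the authors' released implementation: the per-level filter counts, the use of two 3 × 3 convolutions per contracting block and the tanh output activation are assumptions, so the parameter count will not match the reported 4.7 million exactly.

```python
import tensorflow as tf
from tensorflow.keras import layers

def conv_block(x, filters):
    # Two 3x3 convolutions with ReLU and "same" padding
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return x

def build_simplified_unet():
    inputs = layers.Input(shape=(256, 256, 1))   # lightness channel L

    # Contracting path: save feature maps for the skip connections
    skips, x, filters = [], inputs, 32
    for _ in range(4):
        x = conv_block(x, filters)
        skips.append(x)
        x = layers.MaxPooling2D(2)(x)
        filters *= 2

    x = conv_block(x, filters)   # bottleneck (9th convolutional block)

    # Expanding path: up-convolutions halve the channel count, then
    # concatenate the corresponding contracting feature map
    for skip in reversed(skips):
        filters //= 2
        x = layers.Conv2DTranspose(filters, 2, strides=2, padding="same")(x)
        x = layers.Concatenate()([x, skip])
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)

    # 1x1 convolution predicting the two chromatic channels a and b
    outputs = layers.Conv2D(2, 1, activation="tanh")(x)
    return tf.keras.Model(inputs, outputs)
```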
2.2 Loss Function
Unlike [5] and some other authors who use custom loss functions, this work uses the mean-squared-error (MSE) loss to evaluate the predicted mask. The loss is calculated over the ab channels as follows:

$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2$$
where N represents the number of entries in the predicted mask which is equal to 256 × 256 × 2.
3 Implementation and Evaluation
3.1 Experiment Preparation
The proposed model is trained and evaluated on two different datasets. The first is a common semantic segmentation dataset, Pascal VOC 2012. This work merges the train/validation set and the test set into a bigger one and then splits it into two new subsets for training (31,792 images) and validation (1440 images). Evaluation on PASCAL VOC 2012 is used to compare the performance of the model to other methods whose metrics are also reported on this commonly used dataset. The second dataset is Places205, where 227,880 images are used for training and 12,101 images for evaluation. Training the model on Places205 shows how well the model works on large datasets. The model is built using the TensorFlow framework and trained on an Nvidia GeForce GTX 1650 GPU. Adam optimization [20] is adopted.
3.2 Training
This work uses mini-batch training with a mini-batch size of 16. When the model is trained on Places205, a single epoch contains 14,243 iterations and takes about 7500 s to complete. The validation loss reaches its minimum after 140,000 iterations; in total, the training process takes approximately 20 h to reach the best performance. This fast learning enables researchers to make various changes to the model and quickly see their effect during the tuning process. On PASCAL VOC 2012, the model learns even faster, reaching the lowest validation loss after only 18,000 iterations. After that point, however, the validation loss increases while the model still fits the training data better, which indicates overfitting on such a small dataset as PASCAL VOC 2012. Validation-based early stopping [21] is adopted to terminate the training process when overfitting has begun.
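Continuing the sketch above, the training configuration described here (MSE loss, Adam optimization [20], mini-batches of 16, validation-based early stopping [21]) could be wired together as follows; train_pairs and val_pairs are assumed to be tf.data.Dataset objects yielding (L, ab) pairs from the pre-processing step, and the patience value is an illustrative choice.

```python
model = build_simplified_unet()

# Adam optimization with MSE loss over the predicted ab channels
model.compile(optimizer=tf.keras.optimizers.Adam(), loss="mse")

# Validation-based early stopping: halt once the validation loss
# stops improving, keeping the best weights seen so far
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True)

model.fit(train_pairs.batch(16),
          validation_data=val_pairs.batch(16),
          epochs=100,
          callbacks=[early_stop])
```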
3.3 Quantitative Comparison to Other Methods

Model performance is compared to some state-of-the-art colorization models proposed by [5–7, 9, 10]. The two metrics used for evaluation are RMSE (Root Mean Squared Error) and PSNR (Peak Signal-to-Noise Ratio), measured on the predicted ab channels against the ground truth. The reports of [5–7, 10] are provided by the work of [10], while the metrics of [9] are reported by this study. Like [10], this work selects 1440 images from the Pascal VOC 2012 validation set for evaluation. This study also reports the average prediction time of [5, 9] and the proposed model for speed comparison. This experiment is done on an Intel Core i7-10750H CPU to show how fast each method runs on a regular device without a powerful GPU. Table 2 shows the experiment results.

Table 2 Quantitative evaluation

Method             | RMSE      | PSNR  | Runtime (ms)
Zhang et al. [5]   | 12.37     | 22.81 | 349.65 (b)
Iizuka et al. [6]  | 10.66     | 24.20 | –
Larsson et al. [7] | 10.26 (a) | 24.56 | –
Zhang et al. [9]   | 10.52     | 25.52 | 1,182.15
Zhao et al. [10]   | 11.43     | 23.15 | –
Proposed model     | 11.33     | 24.88 | 112.98

(a) Bold values indicate the best performance in comparisons
(b) Italic values indicate the metrics reported by this work

The results in Table 2 show that [7] outperforms the others on RMSE, while Zhang et al. [9] achieves the best PSNR. The proposed model performs better than most of the others on PSNR and ranks in the middle on RMSE. In terms of prediction time, this model is roughly 3 times faster than [5] and 10 times faster than [9].
3.4 Qualitative Comparison

To show how good the generated images are, the proposed model is trained on Places205 and the predictions are compared to the ground-truth images. Figure 3 shows how closely they resemble each other. Experiments on Places205 indicate that the model predicts better on blue areas like the sky or the ocean, while it does not perform as well on green objects like grass or trees. Therefore, in images of nature, there is less green in the predicted masks than in the ground truth. This work also compares the predicted images to the results of Zhang's methods [5, 9], two state-of-the-art models of semantic colorization; Fig. 4 shows the comparison. Most of the images generated by the methods of [5, 9] have more green color than those of the proposed method, and even more than the ground truth. This makes their images more vivid and able to fool people in a "colorization Turing test" [5]. However, this "green effect" sometimes makes the images look unreal. For example, in the images on the first row of Fig. 5, the railway in the predictions of [5, 9] is green while it is not in the ground truth; in fact, a railway is rarely green!
Fig. 3 Predicted images of this work compared to the ground truths
In general, the images generated by this work are as close to the ground truth as those of Zhang's methods. However, because of the "green effect," Zhang's methods do better on images of nature, where there is much green, while the proposed model does better on images of "less-green" things like cars, horses, lakes, and railways.
4 Conclusion and Future Works

This paper proposed a neural network architecture for fully automatic image colorization. The model is a simplified version of the U-net architecture and is relatively lightweight compared to other authors' models. The predicted images are not as vivid as those of some state-of-the-art methods like [5, 9]; however, they are as close to the ground truth as those methods' outputs. Because the model is several times smaller, it runs fast even on a device with no GPU available. Therefore, it can be used to build instant video colorization applications, where many image frames need to be colorized in an acceptable time. Images predicted by this work are quite grayish and sometimes desaturated. In the future, a custom loss function will be applied to the model for training and predicting more
Fig. 4 The predictions of the proposed method compared to Zhang's methods and the ground truth. From left to right: grayscale image; predictions of [5, 9]; proposed model; and the ground truth
colorful and saturated images. Furthermore, the proposed architecture can be used as the convolutional part of complicated colorization models where various additional techniques are adopted to achieve the best performance on particular aspects. The paper can be a good reference for many machine learning problems [22–24].
Fig. 5 The “green effect” in images generated by Zhang’s methods
References

1. A. Levin, D. Lischinski, Y. Weiss, Colorization using optimization. ACM Trans. Graph. 23(3), 689–694 (2004)
2. G. Charpiat, M. Hofmann, B. Schölkopf, Automatic image colorization via multimodal predictions, in Computer Vision-ECCV (Springer, 2008), pp. 126–139
3. X. Liu, L. Wan, Y. Qu, T.T. Wong, S. Lin, C.S. Leung et al., Intrinsic colorization. ACM Trans. Graph. 27(5), 152 (2008)
4. R.K. Gupta, A.Y.S. Chia, D. Rajan, E.S. Ng, H. Zhiyong, Image colorization using similar images, in Multimedia (2012)
5. R. Zhang, P. Isola, A.A. Efros, Colorful image colorization, in ECCV (2016)
6. S. Iizuka, E. Simo-Serra, H. Ishikawa, Let there be color!: joint end-to-end learning of global and local image priors for automatic image colorization with simultaneous classification. ACM Trans. Graph. 35(4), 110 (2016)
7. G. Larsson, M. Maire, G. Shakhnarovich, Learning representations for automatic colorization, in ECCV (2016)
8. A. Royer, A. Kolesnikov, C.H. Lampert, Probabilistic image colorization, in BMVC (2017)
9. R. Zhang, J.Y. Zhu, P. Isola, X. Geng, A.S. Lin, T. Yu et al., Real-time user-guided image colorization with learned deep priors, in SIGGRAPH (2017)
10. J. Zhao, J. Han, L. Shao, C.G.M. Snoek, Pixelated semantic colorization. Int. J. Comput. Vision 128, 818–834 (2020)
11. P. Vitoria, L. Raad, C. Ballester, ChromaGAN: adversarial picture colorization with semantic class distribution, in WACV (2020)
12. A. Deshpande, J. Lu, M.C. Yeh, M.J. Chong, D. Forsyth, Learning diverse image colorization, in CVPR (2017)
13. S. Messaoud, D. Forsyth, A.G. Schwing, Structural consistency and controllability for diverse colorization, in ECCV (2018)
14. Y. Wu, X. Wang, Y. Li, H. Zhang, X. Zhao, Y. Shan, Towards vivid and diverse image colorization with generative color prior, in ICCV (2021)
15. S. Yoo, H. Bahng, S. Chung, J. Lee, J. Chang, J. Choo, Coloring with limited data: few-shot colorization via memory augmented networks, in CVPR (2019)
16. P.A. Kalyan, R. Puviarasi, M. Ramalingam, Image colorization using convolutional neural networks, in ICRTCCNT (2019)
17. J.W. Su, H.K. Chu, J.B. Huang, Instance-aware image colorization, in CVPR (2020)
18. O. Ronneberger, P. Fischer, T. Brox, U-net: convolutional networks for biomedical image segmentation, in MICCAI (2015)
19. J. Long, E. Shelhamer, T. Darrell, Fully convolutional networks for semantic segmentation, in CVPR (2015)
20. D.P. Kingma, J. Ba, Adam: a method for stochastic optimization, in ICLR (2015)
21. L. Prechelt, Early stopping—but when? in Neural Networks: Tricks of the Trade. Lecture Notes in Computer Science, vol. 7700, ed. by G. Montavon, G.B. Orr, K.R. Müller (Springer, Berlin, Heidelberg, 2012)
22. N.T. Su, P.D. Hung, B.T. Vinh, V.T. Diep, Rice leaf disease classification using deep learning and target for mobile devices, in Proceedings of International Conference on Emerging Technologies and Intelligent Systems, ICETIS (2021)
23. P.D. Hung, N.T. Su, Unsafe construction behavior classification using deep convolutional neural network. Pattern Recognit. Image Anal. 31, 271–284 (2021)
24. N.Q. Minh, P.D. Hung, The system for detecting Vietnamese mispronunciation, in Future Data and Security Engineering. Big Data, Security and Privacy, Smart City and Industry 4.0 Applications. FDSE 2021. Communications in Computer and Information Science, vol. 1500, ed. by T.K. Dang, J. Küng, T.M. Chung, M. Takizawa (Springer, Singapore, 2021)
Comparative Analysis of Obesity Level Estimation Based on Lifestyle Using Machine Learning R. Archana and B. Rajathilagam
Abstract Obesity is a global epidemic in which excessive body fat increases the risk of health problems. It is a major contributor to the burden of chronic disease and disability, affecting people of all ages and genders. The study focuses on assessing the obesity level of individuals between 14 and 60 years of age based on their dietary behavior, lifestyle pattern, and physical condition. Supervised machine learning algorithms were used to develop a predictive model, and the factors that have a major impact on developing obesity were analyzed. Compared to traditional methods, the models using boosting algorithms achieved better performance in predicting the various levels of obesity. Keywords Food security · Obesity · Machine learning · Random forest · Ensemble model
1 Introduction

Food security is the ability of people from all sections of society to access and consume an adequate amount of nutritious and safe food at all times, meeting their dietary needs and food preferences for an active life [1]. Food insecurity and obesity are strongly correlated: when one rises, so does the other. Obesity is a medical disorder in which the accumulation of a high volume of body fat raises the risk of multiple disease progression, severe impairment, and even premature death [2]. Deficient nutritional intake and the prevalence of urban food deserts are major factors behind obesity. People on low incomes suffer from a lack of nutritious food, a budget-constrained market food basket, and an increased release of stress hormones, making them easily prone to obesity. On the contrary, people living in high socioeconomic regions have more access to high-caloric and junk foods that also lead to obesity. Obesity also decreases the quality of life: a person with obesity is likely to have an inferiority complex that affects mental health and decreases cognitive skills. Thereby, being food insecure is a direct threat to the very development of society.

Obesity has reached epidemic proportions globally, with at least 4 million people dying each year as a result of being overweight or obese. The Food and Agriculture Organization (FAO) report shows that between 2000 and 2016 there was a steady increase in the prevalence of obesity, greater than that of overweight; it has increased across all ages, especially among school children and adults [3]. When an individual's Body Mass Index (BMI) is greater than 25, he/she is said to be overweight, and above 30 is considered obese. BMI is determined by dividing the weight in kilograms by the square of the height in meters. Obesity is generally caused by an increased intake of high-caloric or processed food, poor diet, lack of physical activity, and genetic or other medical disorders [4]. The lifestyle, mode of travel, and eating habits of individuals with sedentary work have a greater impact on the development of excessive body fat [5]. De-La-Hoz-Correa et al. [5] and Cervantes et al. [6] mention other factors that lead to obesity, such as "being only child, familiar conflicts such as divorce of parents, depression, or anxiety." As the Centers for Disease Control and Prevention (CDC) has stated, obesity is considered an immediate health challenge and also a winnable battle. As the prevalence of obesity in individuals increases, the development of a computational approach becomes the need of the hour. Based on prior study, estimating the obesity level from BMI alone, omitting other factors like family history, eating habits, and physical condition, does not necessarily uncover the possibility of being obese.

The COVID pandemic has increased the sedentary nature of many people around the globe. It has changed the overall lifestyle of people, leaving them with limited choices for physical activity: there are restrictions on their habits of walking, exercise, and outdoor activities, and most have shifted to a work-from-home pattern for their livelihood. These changes have a huge impact on the increase of obesity, not just in adults but also among children. Therefore, a study to estimate the obesity level in individuals of all ages is crucial. This study uses several machine learning algorithms to predict the obesity level of individuals based on their eating habits and physical condition. Early prediction can help avoid certain diseases by modifying the lifestyle and eating patterns. The structure of the paper is as follows: studies with a similar background are discussed in Sect. 2, the data samples and methodology used for the experimentation are described in Sect. 3, the experimental analysis is presented in Sect. 4, the results obtained from the techniques in Sect. 5, and the conclusion of the study in Sect. 6.
2 Related Work

De-La-Hoz-Correa [5] presented a study to estimate obesity levels using decision trees. 712 data samples from students between 18 and 25 years old from Colombia, Mexico, and Peru were used, and six levels of obesity were taken into consideration. The authors used the SEMMA data mining methodology, and three methods were selected: decision trees, Bayesian networks, and logistic regression. The decision tree model was the best, obtaining a precision of 97%, and software built with NetBeans was deployed to classify patients with obesity. Cervantes [6], in his study to estimate obesity levels, used data samples of 178 students from Colombia, Mexico, and Peru between 18 and 25 years old. Using the WEKA tool, algorithms like decision tree and Support Vector Machine (SVM) were used to train the model, with simple K-Means selected as the clustering method for validation; the study obtained a precision of 98.5%. The authors of [7] used Naïve Bayes and a genetic algorithm for predicting obesity in children: 19 parameters were used for the study, the genetic algorithm was used to optimize these parameters, and this hybrid approach achieved an accuracy of 75%. In order to reduce childhood obesity in Malaysia, the authors of [8] used Naïve Bayes to identify children who are prone to obesity and developed a knowledge-based system to suggest suitable menus for improving health among school children; this system has a precision of 73.3%. Garg et al. [9] built a framework using Python Flask. They leveraged various machine learning algorithms to predict obesity level, body weight, and fat percentage, and several hyperparameter optimization algorithms, such as genetic algorithm, random search, grid search, and Optuna, were used to improve the performance of the model. They also included features like customizable diet plans, workout plans, and a dashboard to track a person's progress. Based on the nutritional facts of food intake and health status, Manoharan [10] developed a patient recommendation system that automatically suggests the food diet to be followed based on the patient's health condition. The study introduced a k-clique embedded deep learning classifier recommendation system; data with thirteen features from patients with various disorders were collected over the Internet and through hospitals, and machine learning techniques were used to compare the proficiency of the k-clique deep learning classifier.
3 Methodology

To predict the obesity level, a machine learning model has been developed, as depicted in Fig. 1. Data that satisfies the requirements of the study was identified, and the output variable was defined. The data was preprocessed, and various
Fig. 1 Flowchart of the methodology
machine learning algorithms were applied. The hyperparameters were fine-tuned, and the model was validated. The study reaches its purpose when the model can correctly classify and predict the samples that are most prone to the high risk of obesity, for instance Obesity class III. Body Mass Index (BMI) cannot be considered a sole indicator for predicting the obesity level of an individual: the BMI of an athlete or a body builder will be much higher than normal. Therefore, other factors that indicate the physical condition and daily eating habits should also be considered when estimating the obesity level.
Table 1 WHO classification of obesity

BMI (kg/m²)    | Nutritional status  | Risk
Below 18.5     | Insufficient weight | Low (but risk of other clinical problems increased)
18.5–24.9      | Normal weight       | Average
25.0–26.9      | Overweight level I  | Increased
26.9–29.9      | Overweight level II | Increased
30.0–34.9      | Obesity class I     | Moderate
35.0–39.9      | Obesity class II    | Severe
40.0 and above | Obesity class III   | Very severe
3.1 Data

The study uses data from individuals aged between 14 and 61 years from Colombia, Peru, and Mexico, collected through a survey [11]. The dataset contains 17 variables and 2111 observations, collected via an online survey using a Web platform. The dataset was balanced using the Synthetic Minority Oversampling Technique (SMOTE) filter, a Weka tool that generates synthetic samples for the minority classes. 48% of the samples were students aged between 14 and 22, with an equal distribution of male and female respondents, and 46% of the samples were collected from people suffering from one of the obesity levels. The drivers used for the estimation of the obesity level are eating habits and physical condition. The attributes based on eating habits are frequent consumption of high-caloric food, vegetable intake, number of main meals per day, consumption of water, food between meals, and drinking or smoking habits. The attributes based on physical condition are calorie-consumption monitoring, physical activity frequency, time using technology devices, and transportation used. The attributes based on an individual's characteristics are gender, age, height, weight, and family history of overweight. The obesity levels are categorized based on the WHO classification of different degrees of BMI [12], as given in Table 1. The class distribution of the target variable is given in Fig. 2. Data cleaning and pre-processing were done on the selected dataset: ordinal attributes were converted using label encoding, atypical and missing data were handled, and the correlation between the attributes was checked.
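The original balancing was done with Weka's SMOTE filter; a rough Python equivalent of the preprocessing described above, using imbalanced-learn (the file and column names are hypothetical):

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from imblearn.over_sampling import SMOTE

df = pd.read_csv("obesity.csv")  # hypothetical file name
# Label-encode the ordinal/categorical attributes.
for col in df.select_dtypes(include="object"):
    df[col] = LabelEncoder().fit_transform(df[col])

X = df.drop(columns="obesity_level")  # hypothetical target column name
y = df["obesity_level"]
X_bal, y_bal = SMOTE(random_state=42).fit_resample(X, y)
```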
3.2 Proposed Model The research hypothesized that supervised machine learning algorithms can be used to predict the obesity level of an individual provided the details about their
Fig. 2 Target variable class distribution (Insufficient weight, Normal weight, Overweight level I, Overweight level II, Obesity class I, Obesity class II, Obesity class III)
lifestyle. The study also focuses on improving the accuracy of prediction. The term "prediction" implies estimating the output class of an unseen input sample. Ensemble modeling is the process of using multiple independent models to predict an output class. The result of such a model is more accurate than individual models, as it achieves "the wisdom of the crowd." In our case, ensemble methods are leveraged to increase predictive accuracy and to correctly identify people with a higher tendency of being obese. A tree-based ensemble method can also identify the most important features that have the largest impact on the obesity level. The advanced ensemble modeling techniques of bagging and boosting are used for the study.

Bagging: Bagging is an ensemble meta-estimator designed to improve the performance of machine learning algorithms in both classification and regression. It creates multiple models and combines them to generate a generalized result. Bagging, or bootstrapping, is a sampling method used to subdivide the original dataset into bags or subsets with replacement. Weak base models c1, c2, c3, ..., cn are built on these bootstrap samples, and the final estimator is derived from the combination of all these base models by majority vote or by averaging the predicted classes. A bagging classifier can be applied to any classification algorithm, especially decision trees and neural networks, to improve accuracy [13].

Random Forest: Random Forest is a tree-based ensemble learning method that uses the bagging principle with a randomization technique called random feature selection. It consists of an ensemble of independent decision trees, each trained on a bootstrap sample of the original dataset. Using the decision tree induction method, it randomly splits the dataset and selects the best split for building the model [14]. Although decision trees have many advantages, they are prone to over-fitting; Random Forest limits over-fitting without increasing the error. In classification, the prediction function f(x) is the most frequently predicted class [15]
Table 2 Hyperparameter tuning values

S. No | Approach    | Hyperparameters                                                                                                 | Accuracy
1     | Grid search | n estimators: 200, max features: 2, max depth: 80, min samples split: 9, min samples leaf: 2, bootstrap: True  | 93
given in Eq. (1). To make a prediction for a new input point x, the following equation is used, where h_j(x) is the prediction of the output variable by the jth tree:

$$f(x) = \arg\max_{y \in Y} \sum_{j=1}^{J} I\left( y = h_j(x) \right) \quad (1)$$
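Equation (1) is simply a majority vote over bootstrap-trained trees. The sketch below builds such an ensemble by hand with scikit-learn decision trees, using random feature selection at each split; it assumes the label-encoded X_bal, y_bal from the preprocessing sketch above:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(42)
J, n = 25, len(X_bal)
trees = []
for _ in range(J):
    idx = rng.integers(0, n, n)  # bootstrap: sample n rows with replacement
    trees.append(
        DecisionTreeClassifier(max_features="sqrt")  # random feature selection
        .fit(X_bal.iloc[idx], y_bal.iloc[idx]))

# Eq. (1): for each sample, the class with the most votes across the J trees.
per_tree = np.stack([t.predict(X_bal) for t in trees])        # shape (J, n)
f_x = np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, per_tree)
```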
3.2.1 Hyperparameter Tuning
Hyperparameter tuning is a technique used to optimize the parameter values of learning algorithms so as to reduce the overall cost function and improve model behavior. It helps generalize the model with the best performance to unseen data. In our study, the holdout validation approach is used: part of the dataset is split off and held aside as a holdout set that is later used to test the performance of the model. The holdout set is left untouched during the training phase; the model is trained on the training data, the hyperparameters are tuned on it for better performance, and the model is then validated on the holdout set. Grid search is one approach to tuning hyperparameters: it takes all possible combinations of parameter values in the form of a grid, evaluates them, and returns the best among them. A variety of key parameters are tuned to define an optimal model with a precision of 93%; the key parameters and their best values found after tuning are given in Table 2. The model is at its best when the gap between the training score and the validation score is small. As seen in Fig. 3, the validation score is much lower than the training score, which makes the model moderate in its performance. The classification report of the Random Forest model is given in Table 3. The model performs well when predicting the target class Obesity class III, the most severe state of obesity; the overall performance of the model can be described as good but not the best.

Boosting: Boosting is an iterative process of creating a team of weak, less accurate classifiers that together form a strong, accurate prediction rule. Each subsequent model tends to rectify the errors of the previous one. Gradient boosting gradually trains different models in an additive, sequential manner. It identifies the shortcomings of a weak classifier by using gradients of the loss function. The loss function is a measure of how well the model's coefficients fit the data.
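Returning to the grid search described above, it can be reproduced with scikit-learn; a sketch in which the grid contains the reported best values plus nearby alternatives (an assumption, since the full search space is not listed):

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Hold out part of the data, untouched during training and tuning.
X_train, X_hold, y_train, y_hold = train_test_split(
    X_bal, y_bal, test_size=0.2, random_state=42, stratify=y_bal)

param_grid = {
    "n_estimators": [100, 200], "max_features": [2, 3],
    "max_depth": [40, 80], "min_samples_split": [8, 9],
    "min_samples_leaf": [2, 3], "bootstrap": [True],
}
search = GridSearchCV(RandomForestClassifier(random_state=42),
                      param_grid, cv=5, n_jobs=-1)
search.fit(X_train, y_train)
print(search.best_params_, search.score(X_hold, y_hold))
```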
Fig. 3 Validation curve for Random Forest
Table 3 Classification report of Random Forest

Output class            | Precision | Recall | F1 score | Support
0 = Insufficient weight | 0.97      | 0.97   | 0.97     | 61
1 = Normal weight       | 0.76      | 0.87   | 0.81     | 45
2 = Obesity class I     | 0.97      | 0.92   | 0.95     | 79
3 = Obesity class II    | 0.95      | 0.98   | 0.96     | 54
4 = Obesity class III   | 0.99      | 0.99   | 0.99     | 63
5 = Overweight level I  | 0.93      | 0.89   | 0.91     | 61
6 = Overweight level II | 0.95      | 0.93   | 0.94     | 60
In our study, success is measured by the correct classification of the severe obesity class (Obesity class III).

Histogram-based gradient boosting: Because gradient boosting is a sequential process of training and adding models, training slows down as models accumulate. To increase the efficiency of gradient boosting, the data are reduced by binning the values of continuous attributes: "Instead of finding the split points on the sorted feature values, histogram-based algorithm buckets continuous feature values into discrete bins and uses these bins to construct feature histograms during training" [16]. Histogram-based gradient boosting therefore fits faster on the training data. The model is evaluated using repeated k-fold cross-validation; a single model is then fit on the data, and the mean accuracy is reported. The classification report is given in Table 4. As shown, the prediction accuracy is much higher when compared to the other models, and the model performs well in estimating all obesity classes. We hypothesized that model performance is best if it can correctly classify the worst state of obesity (Obesity class III), which carries the highest risk.
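Both the histogram-based booster and the repeated k-fold evaluation are available in scikit-learn; a minimal sketch (the fold and repeat counts are assumptions):

```python
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

hgb = HistGradientBoostingClassifier(max_bins=255, random_state=42)
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=42)
scores = cross_val_score(hgb, X_bal, y_bal, cv=cv, n_jobs=-1)
print(f"mean accuracy: {scores.mean():.3f} (+/- {scores.std():.3f})")
```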
Table 4 Classification report of histogram-based gradient boosting

Output class            | Precision | Recall | F1 score | Support
0 = Insufficient weight | 0.99      | 0.97   | 0.98     | 61
1 = Normal weight       | 0.96      | 0.99   | 0.98     | 45
2 = Obesity class I     | 0.99      | 0.97   | 0.98     | 79
3 = Obesity class II    | 0.96      | 0.99   | 0.98     | 54
4 = Obesity class III   | 0.99      | 0.99   | 0.99     | 63
5 = Overweight level I  | 0.98      | 0.98   | 0.98     | 61
6 = Overweight level II | 0.98      | 0.97   | 0.97     | 60
The proposed model performed well not only in predicting the extreme case of obesity but also in estimating all levels of obesity with an almost equal predictive rate.
4 Experimental Analysis

4.1 Model Comparison

In previous studies, De-La-Hoz-Correa et al. [5] and Cervantes et al. [6] limited their models to estimating the obesity level among students between 18 and 25 years of age, and the levels of obesity addressed were also limited. It is crucial to know the exact stage of the body's nutritional status, since the effort needed to maintain health and the dietary or lifestyle pattern to be followed varies at each stage. Thereby, our study focuses on addressing all possible levels of nutritional status, which leads to seven classes of the target variable. The purpose of our study was to improve on the prediction rate of previous research with a similar background. The hypothesis was that advanced machine learning algorithms such as ensemble methods can predict all stages of nutritional status for people of all ages with better performance. We also used various machine learning algorithms like Support Vector Machine (SVM), Naïve Bayes, and Decision Trees, which have been used in previous works, to compare the performance of our model. The widely tuned hyperparameters are, for SVM, the kernel, the regularization parameter (C), and the kernel coefficient (gamma); for the decision tree classifier, the criterion, max depth, and number of components; and for the Random Forest classifier, n estimators, max features, max depth, min samples split, min samples leaf, and bootstrap. We fine-tuned these parameters for better predictive performance; the parameter values after tuning, along with the precision rates, are given in Table 5. The Receiver Operating Characteristic (ROC) curve is used to visualize the performance of each classifier. Figure 4 depicts the ROC curves of the various supervised machine learning algorithms.
Table 5 Hyperparameter tuning

S. No | Classifier    | Hyperparameters                                                                                                | Precision
1     | SVM           | C: 1000, gamma: 1, kernel: linear                                                                              | 95
2     | Decision tree | criterion: entropy, max depth: 12, number of components: 12                                                    | 96
3     | Random Forest | n estimators: 200, max features: 3, max depth: 80, min samples split: 8, min samples leaf: 3, bootstrap: True  | 93
Fig. 4 ROC curve of supervised algorithms
From Fig. 4, it is evident that the precision of Random Forest and histogram-based gradient boosting are almost equal. However, the latter is considered the best model, as the gap in predictive rate between the holdout set and the training set is smaller than that of Random Forest. The multiclass ROC curve is given in Fig. 5. The One-vs-Rest strategy is used in multiclass classification to fit one classifier per class; this improves interpretability and gives knowledge about each label independently against the rest. As shown in Fig. 5, class 4 has the leading rate, which helped the model predict class 4, i.e., Obesity class III, with the highest predictive rate.
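One-vs-Rest ROC curves like those in Fig. 5 can be produced along these lines; the sketch assumes the fitted model and the train/holdout split from the earlier sketches:

```python
import matplotlib.pyplot as plt
from sklearn.metrics import auc, roc_curve
from sklearn.preprocessing import label_binarize

proba = hgb.fit(X_train, y_train).predict_proba(X_hold)  # one column per class
y_bin = label_binarize(y_hold, classes=hgb.classes_)     # one-vs-rest targets
for k, name in enumerate(hgb.classes_):
    fpr, tpr, _ = roc_curve(y_bin[:, k], proba[:, k])
    plt.plot(fpr, tpr, label=f"class {name} (AUC = {auc(fpr, tpr):.2f})")
plt.xlabel("False positive rate")
plt.ylabel("True positive rate")
plt.legend()
plt.show()
```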
Fig. 5 Multiclass ROC curve
4.2 Evaluation Measures

Precision is used to compare the proposed model with the accuracy of previous works; other evaluation metrics like recall and F1 score are also taken into consideration. Precision gives the proportion of positive identifications that were actually correct; thus, it is the proportion of true positives compared to all predicted positives [17]. Recall gives the proportion of actual positives that were identified correctly; thus, it is the proportion of true positives compared to the samples that should have been positive. To measure the correctness of a model, both precision and recall should be considered. The F1 score is the harmonic mean of precision and recall [17]. To evaluate the obesity level estimation model, the following metrics are used, based on the True Positives (TP), True Negatives (TN), False Positives (FP), and False Negatives (FN).
• True Positive (TP) = number of obese samples predicted as obese
• True Negative (TN) = number of non-obese samples predicted as non-obese
• False Positive (FP) = number of non-obese samples predicted as obese
• False Negative (FN) = number of obese samples predicted as non-obese
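In code, the metrics above reduce to the standard formulas; a small sketch (the per-class report can equally be produced with scikit-learn's classification_report, using the hypothetical holdout variables from the earlier sketches):

```python
def precision_recall_f1(tp, tn, fp, fn):
    precision = tp / (tp + fp)  # predicted-obese samples that are truly obese
    recall = tp / (tp + fn)     # truly obese samples that were found
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Per class, directly from predictions:
# from sklearn.metrics import classification_report
# print(classification_report(y_hold, hgb.predict(X_hold)))
```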
5 Results and Discussion

The supervised algorithms SVM, Decision Tree, Random Forest, and gradient boosting were used to develop the proposed work. The results of the models are
compared based on the evaluation metrics, and it is found that the best result is obtained by histogram-based gradient boosting. The results of the classifiers are given in Fig. 6. The Obesity class III classification, which is marked severe, has an accuracy greater than 90% in all classifiers. The Area Under the ROC Curve (AUC), which measures the ability of the classifiers to identify the various classes, has a score of 1.00. Using the Random Forest classifier, the six most important features that influence the rate of obesity or the nutritional status, other than height and weight, are found to be frequent consumption of high-caloric food, age, consumption of food between meals, mode of transport, smoking, and physical activity frequency. The features and their relative scores are given in Fig. 7. Table 6 depicts the comparison summary between the traditional models and the proposed work.
Fig. 6 Model evaluation

Fig. 7 Feature importance
Table 6 Comparative summary

Traditional approach:
1. Studies on the prevalence of obesity among children and students aged between 18 and 25 years
2. Machine learning methods such as decision trees, Bayesian networks, and logistic regression [5]; decision tree, Support Vector Machine (SVM), and simple K-Means [6]; and Random Forest, decision tree, XGBoost, Extra Trees, and KNN [9] were used to estimate the level of obesity
3. The two stages of pre-obese are fused into one label, limiting the nutritional status to six classes [5]; four gender-specific clusters were formed that segregate people who are prone/not prone to overweight, and an unsupervised technique is implemented [6]
4. Models with best performance: Random Forest (84%) [9], Decision Tree (97%) [5], Naïve Bayes (75%) [7], and Naïve Bayes (73%) [8]

Proposed work:
1. The study on the prevalence of obesity is extended to a wider range of age-groups
2. The model leveraged supervised machine learning algorithms: logistic regression, decision tree, support vector machine, KNN, Random Forest, and gradient boosting
3. The obesity levels are categorized based on the WHO classification of different degrees of BMI, resulting in seven classes of nutritional status; addressing the exact level of obesity is crucial for taking proportionate precautions, especially at the pre-obese level
4. All classifiers used in the traditional methods are built with a wider range of age-groups, and hyperparameter optimization was used to improve the results of previous works. For instance, the precision of Random Forest was 86% in the traditional methods [9] and is improved to 93% in the proposed work
5. The model with the best predictive accuracy was histogram-based gradient boosting (98%)
6. There was an extension in the obesity classification labels and the range of population addressed; the model obtained good performance using boosting algorithms without stepping into deep learning models
6 Conclusion

The COVID pandemic brought a severe change to most of our lives: the movement of people around the world was restricted within four walls, the decrease in physical activity triggered negative bodily reactions, and the prevalence of obesity increased in no time. Thereby, a study on nutritional status is all the more crucial, as it can trigger early actions that save lives and reduce suffering. Machine learning is a method of recognizing trends in data and utilizing them to make automatic predictions or decisions. Early prediction of obesity is crucial, since 39% of adults globally are overweight and 13% are obese. In our research, histogram-based gradient boosting achieved a precision of 98% in classifying obesity levels, and the classification of Obesity class III, which is marked severe, has an accuracy greater than 99%. This study has surpassed the results obtained in [7] (75% precision), in [8] (73% precision), and in [5] (97% precision). The proposed work can be used to identify individuals with a tendency to suffer from obesity in advance.
Acknowledgements We thank Innovation in Science Pursuit for Inspired Research (INSPIRE) managed by Department of Science and Technology for supporting the research.
References

1. Food security, International Food Policy Research Institute Report (2021)
2. F. Ofei, Obesity-a preventable disease. Ghana Med. J. 39(3), 98 (2005)
3. WHO Report, The state of food security and nutrition in the world (2019)
4. N.H. Service, Obesity, National Health Service (2019)
5. E. De-La-Hoz-Correa, F. Mendoza Palechor, A. De-La-Hoz-Manotas, R. Morales Ortega, A.B. Sánchez Hernández, Obesity level estimation software based on decision trees (2019)
6. R. Cañas Cervantes, U. Martinez Palacio, Estimation of obesity levels based on computational intelligence (2020)
7. M.H.B. Muhamad Adnan, W. Husain, N. Abdul Rashid, A hybrid approach using naïve Bayes and genetic algorithm for childhood obesity prediction, in 2012 International Conference on Computer Information Science (ICCIS), vol. 1 (2012), pp. 281–285
8. W. Husain, M.H.M. Adnan, L.K. Ping, J. Poh, L.K. Meng, MyHealthyKids: intelligent obesity intervention system for primary school children, in The 3rd International Conference on Digital Information Processing and Communications (ICDIPC 2013) (2013)
9. S. Garg, P. Pundir, MOFit: a framework to reduce obesity using machine learning and IoT, in 2021 44th International Convention on Information, Communication and Electronic Technology (MIPRO) (2021), pp. 1733–1740. https://doi.org/10.23919/MIPRO52101.2021.9596673
10. S. Manoharan, Patient diet recommendation system using K clique and deep learning classifiers. J. Artif. Intell. 2(02), 121–130 (2020)
11. E. De-La-Hoz-Correa, F. Mendoza Palechor, A. De-La-Hoz-Manotas, R. Morales Ortega, A.B. Sánchez Hernández, Obesity level estimation software based on decision trees. J. Comput. Sci. 15(1), 67–77 (2019)
12. P.T. James, R. Leach, E. Kalamara, M. Shayeghi, The worldwide obesity epidemic. Obes. Res. 9(S11), 228S–233S (2001)
13. D. Lavanya, K.U. Rani, Ensemble decision tree classifier for breast cancer data. Int. J. Inf. Technol. Convergence Serv. 2(1), 17 (2012)
14. S. Bernard, L. Heutte, S. Adam, Influence of hyperparameters on random forest accuracy, in International Workshop on Multiple Classifier Systems (Springer, 2009), pp. 171–180
15. A. Cutler, D.R. Cutler, J.R. Stevens, Random forests, in Ensemble Machine Learning (Springer, 2012), pp. 157–175
16. G. Ke, Q. Meng, T. Finley, T. Wang, W. Chen, W. Ma, Q. Ye, T.-Y. Liu, LightGBM: a highly efficient gradient boosting decision tree. Adv. Neural Inf. Process. Syst. 30, 3146–3154 (2017)
17. Google, Classification: precision and recall (2019). https://developers.google.com/machine-learning/crash-course/classification/precision-and-recall
An Empirical Study on Millennials’ Adoption of Mobile Wallets M. Krithika and Jainab Zareena
Abstract Nowadays, Millennials are good opinion leaders on technology because they are comfortable with new technology and social media. As Millennials gain prominence in Chennai, it is important to explore their adoption behaviour towards mobile wallets. The conceptual model developed for the study is based on the TAM and was tested in the study. The important variables taken into account are perceived ease of use, perceived usefulness, attitude, and the intention of Chennai Millennials towards mobile wallet adoption. Quantitative data was obtained by circulating online questionnaires (240) through social media such as Facebook and WhatsApp, and SPSS was used to analyse the results. This research indicates that perceived ease of use positively influences the intention to adopt; PEOU and PU were significant determinants of the intent to use mobile wallets. Implications and drawbacks of the study are addressed. Keywords Stepwise regression · Technology Acceptance Model · Mobile wallet · Adoption · Chennai Millennials · Technology · Innovation
1 Introduction

India is gradually paving its way towards a cashless society. From hefty physical wallets to virtual wallets, we are evolving at a significant rate. Recall the days of carrying bulky wallets full of cash and cards? Mobile wallets have lightened that load while facilitating payments and transactions. Now, from the convenience of our homes, we can pay for almost any product or service, transfer funds, make bill payments, book tickets, and more. Gone are the days when people
had to wait hours just to get their hands on their favourite movie's "first day, first show" ticket. With one-tap functionality and fast processing all in one go, mobile wallets have simplified our lives. Mobile wallets were developed to allow a smooth, seamless flow of trouble-free transactions. A mobile wallet uses bank account and credit or debit card information for seamless processing of payments while entirely securing all user details. Compared to physical wallets, they help lower payment processing time, reduce fraud, and are economical. The Indian government's demonetization drive pushed users towards these wallets, and since that time the user base has been continuously growing. According to CFO India, nearly 95% of transactions were cash transactions, 85% of payments were made in cash, and nearly 70% of consumers voted "cash on delivery" as the preferred method of payment. However, the mobile wallet industry in India is expected to grow by 150% next year, with $4.4 billion in transactions [1, 2]. With the increasing adoption and promotion of information technology and Internet marketing, the corporate environment has transformed at a fast rate [2]. Cell phones are popular in most emerging and industrialized economies due to the exponential development of the mobile industry. Small programs that run on a mobile device and carry out tasks such as banking, gaming, and web browsing are known as mobile applications. A broad variety of applications have been created for mobile users, including sports apps, social networking apps, Internet shopping apps, and travel planning apps. Statistical figures indicate that young consumers are exposed to multiple mobile wallets [3, 4]. In developing countries like India, mobile wallets have become popular; widely used apps include WhatsApp, Snapchat, Uber, OLX, Hotstar, Paytm, Google Pay, Zomato, and Amazon, to name a few. Mobile wallets are a major boon for many young consumers in these developing markets. Government agencies have also released valuable public-domain applications such as MyGov and Meri Sadak. The youngest and major market group from 2017 to 2030 is Gen Z: a highly qualified, technically knowledgeable, inventive, and creative generation, actively engaged with technology and digital devices through social media and mobile wallets (apps) [5, 6].
2 Literature Review

2.1 Background of the Theoretical Framework

The Technology Acceptance Model, developed for studying technology adoption behaviour, has been extensively used by researchers. Extensive studies show that the TAM reliably describes changes in adoption behaviour and is well regarded as an effective model for predicting intention to use [7].
The Technology Acceptance Model (TAM) is an evergreen term in the area of consumer adoption studies. Much TAM research studies the driving forces behind new technologies; Davis [4] found that perceived usefulness and perceived ease of use are important factors in determining how a technology is used. Factors that influence technology adoption behaviour are multifaceted, self-motivated, and diverse [8]. For instance, incorporating consumer creativity and perceived risk into the Technology Acceptance Model is a key to understanding Internet banking behaviour, and the Technology Acceptance Model is often combined with social influence and personal innovativeness to analyse adoption behaviour towards online services. The study carried out by [9] found an indirect impact of social and individual innovativeness on perceived usefulness and ease of use. However, few studies have addressed consumers' preferences for third-party mobile payments [10]. As a result of the above findings, this study explores how young Indian Millennials adopt mobile wallets, giving more insight into the TAM with regard to new technology adoption behaviour. The four main TAM variables considered are perceived usefulness, perceived ease of use, attitude, and adoption intention. Those who find a new technology useful and easy to use will be prepared to get on board [11, 12]. According to a study carried out by [13], the term "attitude" describes people's expectations and opinions of modern technology. Research [14] has shown that two determinants of attitude, perceived product usefulness and perceived ease of use, are crucial elements in the decision-making cycle. The relationship between attitude and adoption intention is clear and positive [6]. Therefore, the study uses the basic TAM to test the factors, namely perceived usefulness (PU) and perceived ease of use (PEOU).
2.2 Perceived Ease of Use

Ease of use is considered a highly critical factor in technology adoption and usage behaviour. PEOU has been described as the degree to which a person believes that a technology is easy to understand or easy to use [4]. It reflects to what degree people accept the rapid advancement of technology and to what degree people feel the technology is useful. Previous studies indicate that variables such as perceived ease of use, perceived usefulness, attitude, and adoption intention have significant consequences for the implementation of new technology [15]. Studies on app adoption have examined ease-of-use factors on their own, and the results have had major implications for user intent [16]. In the literature, researchers have found that for mobile wallets, usability, perceived usefulness, perceived risk, attitude, and customer preferences have a significant effect on both acceptance and customer loyalty [9]. The following hypothesis is therefore proposed.
H1: PEOU significantly influences the adoption intention of young consumers towards using mobile wallets.
2.3 Perceived Usefulness

PU is one of the main components of the original technology acceptance paradigm [17]. If people feel that using a mobile app will boost their job efficiency, they may use it more frequently [18]. Numerous previous works have proven perceived usefulness to be a critical factor in the utilization of smartphone devices [19]. When considering the utility of a technology [6], mobile wallet users expect the use of the program to boost the efficiency of a job in an organization. In the online technology framework, a technology is greatly useful to the consumer when it helps perform a particular task [20]. Perceived usefulness, along with perceived protection and scalability, affects merchants' conduct towards a mobile wallet programme [19], and merchants' trust and intention to use the wallet programme depend on its perceived protection and usefulness [21]. Further studies found that the perceived usefulness of innovations, as judged by consumers, has a plain effect on the decision to use them [12, 22]. Research performed by [20, 23] on network usage and mobile payments found that merchants' desire to use a technology is influenced by its perceived value. To identify this effect, the variable perceived usefulness was taken; past studies have also used this variable [24]. Hence, the researchers proposed the hypothesis below.

H2: Perceived usefulness has a significant effect on the behavioural intention of young consumers to use mobile wallets.
2.4 Attitude and Intention

Prior research in the area of emerging technologies has demonstrated a close correlation between attitude towards a technology or innovation and its adoption or use. Many longitudinal studies in the area also confirm the connection between attitude and intention. Customer attitude towards mobile shopping was found to influence consumers' intent to participate in mobile shopping [25]. Further work has shown that users' attitudes have a major effect on their choice to use e-readers [26], and customers' attitudes heavily influence their choice to use smart mobile wallet platforms [27]. Consumers' intent to purchase via online shopping has also been examined [28]: involving university students in the USA, the authors studied the effect of congruity
on confidence and attitude, and their influence on the intention to buy. They clarified that attitude significantly affected purchasing intentions and that congruity with self-image significantly impacted attitude.

H3: Attitude significantly affects the adoption intention of young consumers towards using mobile wallets.
2.5 Sampling and Methodology

People between 18 and 40 years of age are considered appropriate respondents for the study because this is a widely accepted defining range for the Millennial generation. Unbiased sampling was pursued by circulating questionnaires in university clusters, and snowball sampling was adopted during data collection, with web questionnaires circulated through the WhatsApp social networking channel. Networking strategies were adopted by the researchers to share the questionnaires with many respondents. Data was collected from young consumers aged 18 to 40 in Chennai and surrounding areas who had been using smartphones and mobile wallets for at least the last two years. Of the 250 questionnaires distributed, almost 96% (240) were completed and returned. Nearly equal gender representation was achieved, at 53% males and 47% females.
3 Data Analysis and Results

The quantitative data collected for this study was entered into SPSS 23, a specialized statistical platform for research analysis. Beforehand, all the data was carefully reviewed and coded. Respondents who gave the same answer to all of the questionnaire's statements were eliminated [29]; invalid responses were reviewed, and negatively worded items were recoded into positive ones [29].
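The reliability analysis in the next subsection was run in SPSS; for readers reproducing it elsewhere, Cronbach's alpha can be computed directly from the item scores. A minimal sketch, assuming the item responses sit in a pandas DataFrame with hypothetical column names:

```python
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    # items: one column per questionnaire item of a single construct.
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances / total_variance)

# e.g., cronbach_alpha(df[["PEOU1", "PEOU2", "PEOU3", "PEOU4"]])
```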
3.1 Reliability of the Constructs

The reliability of the constructs was tested with Cronbach's alpha. As Table 1 shows, every alpha value is above 0.7, which means the constructs are strongly internally consistent and ready for further analysis [30].

Table 1 Reliability of the variables

Constructs            | Total items in the construct | Alpha value
Perceived ease of use | 4                            | 0.921
Perceived usefulness  | 4                            | 0.946
Attitude              | 4                            | 0.916
Adoption intention    | 3                            | 0.913

In the descriptive assessment, about 250 responses were collected from the respondents, of which 240 were valid. Of the 10 rejected responses, four came from respondents younger than 18 and six from respondents older than 40; they were not relevant to this study because the study population is Millennials (aged between 18 and 40). Hence, the final sample size for the study is 240.

Demographic Profile

The study respondents' demographic characteristics are listed in Table 2. The number of male respondents is 128, contributing 53% of the survey, and the number of female respondents is 112, representing 47%. The sample is representative of the study population of Millennials (aged between 18 and 40). Respondent ages ranged between 18 and 40 years: respondents aged between 18 and 24 accounted for 59.5% of the survey, followed by those aged between 31 and 40, who accounted for 22.5%; the number of respondents between the ages of 25 and 30 is 64, representing 18.0% of the survey.
Table 2 Demographic details of the respondents

Demographic characteristics  | Items                 | Percentage (%)
Gender of respondents        | Male                  | 53
                             | Female                | 47
Age                          | 18–24                 | 59.5
                             | 25–30                 | 18.0
                             | 31–40                 | 22.5
Living region                | Urban                 | 46.7
                             | Semi-urban            | 28.3
                             | Rural                 | 25.0
Hours spent on mobile        | Less than 1 h         | 5.0
                             | 1–2 h                 | 27.5
                             | 2–5 h                 | 39.2
                             | 5–8 h                 | 17.1
                             | More than 8 h         | 11.3
Experience on mobile payment | Less than three years | 52.1
                             | 3–6 years             | 29.6
                             | Over six years        | 18.3
Table 3 Mean and SD of the study variables

Constructs            | N   | Mean  | Std. deviation
Perceived ease of use | 240 | 18.90 | 3.749
Perceived usefulness  | 240 | 19.13 | 3.803
Attitude              | 240 | 13.90 | 3.232
Adoption intention    | 240 | 21.47 | 4.596
Valid N (listwise)    | 240 |       |
The mean and standard deviation values of the study constructs are displayed in Table 3. Adoption intention has the highest mean value (21.47), and perceived usefulness ranks second, with a mean of 19.13.
3.1.1 Stepwise Regression
Stepwise regression is a variation of forward selection in which, at every stage where a variable has been introduced, all variables in the model are checked for significance against a specified tolerance level; if a non-significant variable is detected, it is removed from the model. A summary of the model is given in Table 4, in which stepwise multiple regression is used to obtain the standardized coefficients, with the independent variables entered one at a time. The independent variables perceived ease of use (Model 1), perceived usefulness (Model 2), and attitude (Model 3) were introduced one by one against the dependent variable adoption intention, increasing the R and R-square values at each step. Table 4 shows the R-square and R-square-change values for each step. The first step shows an R-value of 0.635; as further steps were applied, the R-value increased, reaching 0.787 in the final step, an increase of 0.152 (0.787 − 0.635). The final model was statistically significant, included 3 independent variables, and accounted for about 62% of the variation in mobile wallet adoption behaviour.
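The stepwise entry shown in Table 4 can be reproduced outside SPSS by adding predictors one at a time and tracking R-square and the p-values of the variables already entered; a sketch with statsmodels (the variable names are hypothetical):

```python
import statsmodels.api as sm

order, prev_r2 = ["PEOU", "PU", "ATT"], 0.0
for i in range(1, len(order) + 1):
    X = sm.add_constant(df[order[:i]])     # predictors entered so far
    fit = sm.OLS(df["ADOPTION"], X).fit()  # dependent: adoption intention
    # Stepwise check: a variable whose p-value now exceeds the tolerance
    # (e.g., 0.05) would be dropped; here the p-values are just reported.
    print(order[:i], round(fit.rsquared, 3),
          round(fit.rsquared - prev_r2, 3), fit.pvalues.round(3).to_dict())
    prev_r2 = fit.rsquared
```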
Table 4 Model summary of independent and dependent variables

Model       | R         | R square | R square change | F change | Sig. F change
1. PEOU     | 0.635 (a) | 0.403    | 0.403           | 160.464  | 0.000
2. PU       | 0.735 (b) | 0.541    | 0.138           | 71.263   | 0.000
3. ATTITUDE | 0.787 (c) | 0.619    | 0.079           | 48.731   | 0.000

a. Independent variable: Perceived Ease of Use
b. Independent variable: Perceived Usefulness
c. Independent variable: Attitude
Table 5 Standardized and unstandardized coefficients

Model |          | B (unstandardized) | Std. error | Beta (standardized) | t
1     | Constant | 6.769              | 1.183      | –                   | 5.722
      | PEOU     | 0.778              | 0.061      | 0.635               | 12.667
2     | Constant | 3.751              | 1.099      | –                   | 3.412
      | PEOU     | 0.214              | 0.086      | 0.175               | 2.495
      | PU       | 0.715              | 0.085      | 0.591               | 8.442
3     | Constant | 2.777              | 1.013      | –                   | 2.742
      | PEOU     | 0.086              | 0.080      | 0.070               | 1.066
      | PU       | 0.469              | 0.085      | 0.388               | 5.522
      | ATT      | 0.467              | 0.067      | 0.405               | 6.981
The results are described in Table 5, which gives the coefficient values. Every model changes as we move down the steps, but the main interest lies in understanding the final model (Model 3). The standardized coefficients identify the predictors that greatly influence the dependent variable. Mobile wallet adoption is significantly associated with perceived ease of use at p < 0.01 (BETA = 0.635, p < 0.01): if Chennai Millennials believe it is easy to use mobile wallets, they will find them helpful. At p < 0.05 (BETA = 0.405, p = 0.00), mobile wallet adoption correlates positively with attitude, meaning Chennai Millennials will be more likely to adopt mobile wallets if they hold a positive attitude towards them. Among the three independent variables, perceived ease of use (BETA = 0.635, p = 0.001) is the most influential factor in consumers' mobile wallet adoption behaviour.
4 Conclusion

The theoretical model is stable, and the findings of the analysis are confirmed; in sum, the results align with the hypotheses proposed. The main objective of the study was to identify the key variables that impact Millennials' third-party mobile payments. Attitude was closely associated with adoption intention, which is consistent with previous research on technology acceptance [6, 31]. The results of the study show that Chennai Millennials hold an adoption intention towards third-party mobile payments. The objectives of the research are achieved by evaluating the TAM for mobile payment across different user groups, and the analysis has shown that attitude has a major impact on mobile payment usage. As a result, it is concluded that the Chennai Millennials
believe that third-party mobile payment is beneficial only when they have a positive attitude towards it. Prior work has established the important connection between customers' perceptions and the decision to use technology [23] and has confirmed that ease of use, usefulness, user behaviour, and social influence are strongly linked to consumer behavioural intentions [20]. Given that ease of use is an ideal measure of behaviour, it is suggested that Chennai Millennials accept third-party smartphone transactions because they find them easy to use. These findings are consistent with previous research on technology adoption [31, 32]. The regression analysis also confirmed the significant effect of perceived usefulness on the intention to adopt, and the findings are similar to previous work on technology acceptance [31]. This study reveals that third-party mobile payments are only seen as useful if Chennai Millennials think that they are really user-friendly.

Theoretical Implications

The purpose of this study was to determine the functionality of the TAM for different users, as perceived usefulness and perceived ease of use have been described as two key factors in the technology acceptance behaviour of Chennai Millennials in mobile wallet applications. The results suggest that the model has served as a helpful reference for understanding technology acceptance and can be used in future studies. In the meantime, a concrete model (R square = 67.9%) has been developed, consisting of three main factors that define Chennai Millennials' acceptance of third-party mobile payments (perceived usefulness, perceived ease of use, and attitude). The study also investigates the behaviour of Millennials in the area of global online payments from an Indian point of view. Because most research on product adoption behaviour focuses on online shopping, these models should be important for future research if mobile payment adoption is to be investigated further [12, 31].

Managerial Implications

Mobile payments have shaped the economic structure at both the local and global levels because they are booming fast. Since young people living in India are digital natives, a paradigm shift towards third-party mobile payments has been demonstrated [31, 32]; because of this rapid change in payment mode, third-party mobile payments are growing faster in India. From the results, it is identified that the study variables affect Millennials' intention. This work is, therefore, useful for businesses and companies engaging in third-party mobile payments to frame strategies around consumer requirements.

Limitations and Directions for Further Research

Due to time and cost constraints, the sample size taken for the study was 240, and snowball sampling was used; this may introduce bias. The research also discusses only Chennai Millennials' decision to accept third-party electronic payments, and Millennials are a particular category of users with a radically different degree of exposure than traditional customers. A national review of the
trend for the implementation of mobile third-party payments should be conducted across all regions of India in the future. Besides, this work does not analyse perceived risk from a security or privacy perspective; failure to consider other types of risk can result in divergent outcomes, so perceived risk can be investigated in further studies. Future studies could also explore gender-based stereotypes and the moderating impact of age in order to extend the inquiry into third-party mobile payment acceptance.
References
1. CFO India, Post demonetisation, Indians prefer mobile wallets to plastic money (2020)
2. T.S. Kumar, Construction of hybrid deep learning model for predicting children behavior based on their emotional reaction. J. Inf. Technol. 3(01), 29–43 (2021)
3. C. Liu, Y. Au, H. Choi, Effects of freemium strategy in the mobile app market: an empirical study of Google Play. J. Manag. Inf. Syst. 31(3), 326–354 (2014)
4. F. Davis, Perceived usefulness, perceived ease of use, and user acceptance of information technology. MIS Q. 13(3), 319 (1989)
5. R.T. Localytics, 21% of users abandon an app after one use (2020)
6. G. Tan, K. Ooi, S. Chong, T. Hew, NFC mobile credit card: the next frontier of mobile payment? Telemat. Inform. 31(2), 292–307 (2014)
7. A. Angus, Top 10 global consumer trends for 2018: emerging forces shaping consumer behavior. Euromonitor International (2018)
8. S. Manoharan, Study on Hermitian graph wavelets in feature detection. J. Soft Comput. Paradigm (JSCP) 1(01), 24–32 (2019)
9. F. Davis, R. Bagozzi, P. Warshaw, User acceptance of computer technology: a comparison of two theoretical models. Manage. Sci. 35(8), 982–1003 (1989)
10. E. Slade, M. Williams, Y. Dwivedi, N. Piercy, Exploring consumer adoption of proximity mobile payments. J. Strateg. Mark. 23(3), 209–223 (2014)
11. C. Tam, T. Oliveira, Understanding the impact of m-banking on individual performance: DeLone & McLean and TTF perspective. Comput. Hum. Behav. 61, 233–244 (2016)
12. K. Madan, R. Yadav, Behavioural intention to adopt mobile wallet: a developing country perspective. J. Indian Bus. Res. 8(3), 227–244 (2016)
13. Discovering Statistics Using SPSS, 4th edn., with Using IBM SPSS Statistics for Research Methods and Social Science Statistics, 4th edn. (Sage Publications, 2012)
14. X. Lu, H. Lu, Understanding Chinese millennials' adoption intention towards third-party mobile payment. Inf. Resour. Manage. J. 33(2), 40–63 (2020)
15. K. Kim, D. Shin, An acceptance model for smart watches. Internet Res. 25(4), 527–541 (2015)
16. E.E.B. Adam, Survey on medical imaging of electrical impedance tomography (EIT) by variable current pattern methods. J. ISMAC 3(02), 82–95 (2021)
17. F. Liébana-Cabanillas, I. Ramos de Luna, F. Montoro-Ríos, User behaviour in QR mobile payment system: the QR payment acceptance model. Technol. Anal. Strateg. Manage. 27(9), 1031–1049 (2015). https://doi.org/10.1080/09537325.2015.1047757
18. Y. Dwivedi, N. Rana, M. Janssen, B. Lal, M. Williams, M. Clement, An empirical validation of a unified model of electronic government adoption (UMEGA). Gov. Inf. Q. 34(2), 211–230 (2017)
19. T. Apanasevic, J. Markendahl, N. Arvidsson, Stakeholders' expectations of mobile payment in retail: lessons from Sweden. Int. J. Bank Mark. 34(1), 37–61 (2016). https://doi.org/10.1108/ijbm-06-2014-0064
20. A. Erdem, U. Pala, M. Özkan, U. Sevim, Factors affecting usage intention of mobile banking: empirical evidence from Turkey. J. Bus. Res.-Turk 11(4), 2384–2395 (2019)
21. E. Pantano, C. Priporas, The effect of mobile retailing on consumers' purchasing experiences: a dynamic perspective. Comput. Hum. Behav. 61, 548–555 (2016)
22. Y. Lee, J. Park, N. Chung, A. Blakeney, A unified perspective on the factors influencing usage intention toward mobile financial services. J. Bus. Res. 65(11), 1590–1599 (2012)
23. C. Kim, M. Mirusmonov, I. Lee, An empirical examination of factors influencing the intention to use mobile payment. Comput. Hum. Behav. 26(3), 310–322 (2010). https://doi.org/10.1016/j.chb.2009.10.013
24. P. Schierz, O. Schilke, B. Wirtz, Understanding consumer acceptance of mobile payment services: an empirical analysis. Electron. Commer. Res. Appl. 9(3), 209–216 (2010)
25. S.R. Mugunthan, T. Vijayakumar, Design of improved version of sigmoidal function with biases for classification task in ELM domain. J. Soft Comput. Paradigm (JSCP) 3(02), 70–82 (2021)
26. I. Ramos-de-Luna, F. Montoro-Ríos, F. Liébana-Cabanillas, J. Luna, NFC technology acceptance for mobile payments: a Brazilian perspective. Rev. Bus. Manage. 19(63), 82–103 (2017)
27. I. de Luna, F. Liébana-Cabanillas, J. Sánchez-Fernández, F. Muñoz-Leiva, Mobile payment is not all the same: the adoption of mobile payment systems depending on the technology applied. Technol. Forecast. Soc. Chang. 146, 931–944 (2019)
28. N. Singh, S. Srivastava, N. Sinha, Consumer preference and satisfaction of M-wallets: a study on North Indian consumers. Int. J. Bank Mark. 35(6), 944–965 (2017)
29. S. Yang, Y. Lu, S. Gupta, Y. Cao, R. Zhang, Mobile payment services adoption across time: an empirical study of the effects of behavioral beliefs, social influences, and personal traits. Comput. Hum. Behav. 28(1), 129–142 (2012)
30. C. Antón, C. Camarero, J. Rodríguez, Usefulness, enjoyment, and self-image congruence: the adoption of e-book readers. Psychol. Mark. 30(4), 372–384 (2013)
31. V. Badrinarayanan, E. Becerra, S. Madhavaram, Influence of congruity in store-attribute dimensions and self-image on purchase intentions in online stores of multichannel retailers. J. Retail. Consum. Serv. 21(6), 1013–1020 (2014). https://doi.org/10.1016/j.jretconser.2014.01.002
32. T. Perry, J. Thiels, Moving as a family affair: applying the SOC model to older adults and their kinship networks. J. Fam. Soc. Work 19(2), 74–99 (2016)
33. W. Kunz et al., Customer engagement in a Big Data world. J. Serv. Mark. 31(2), 161–171 (2017)
34. J. Rowley, Designing and using research questionnaires. Manage. Res. Rev. 37(3), 308–330 (2014)
35. A. Bashar, Survey on evolving deep learning neural network architectures. J. Artif. Intell. 1(02), 73–82 (2019)
36. N. Koenig-Lewis, M. Marquet, A. Palmer, A. Zhao, Enjoyment and social influence: predicting mobile payment adoption. Serv. Ind. J. 35(10), 537–554 (2015)
37. R. Hill, M. Fishbein, I. Ajzen, Belief, attitude, intention and behavior: an introduction to theory and research. Contemp. Sociol. 6(2), 244 (1977)
38. R. Bagozzi, The legacy of the technology acceptance model and a proposal for a paradigm shift. J. Assoc. Inf. Syst. 8(4), 244–254 (2007)
39. Y. Lu, S. Yang, P. Chau, Y. Cao, Dynamics between the trust transfer process and intention to use mobile payment services: a cross-environment perspective. Inf. Manage. 48(8), 393–403 (2011)
An IoT-Based Smart Mirror K. N. Pallavi, Jagadevi N. Kalshetty, Maithri Suresh, Megha B. Kunder, and Kavya Shetty
Abstract In today's world, many technologies have emerged to make human life more comfortable. Through the Internet, people can connect to the whole world, access information easily, and follow current events via television or the Internet. People increasingly want everything to be smart. Smart houses are built to connect homes to the Internet, linking digital devices that communicate with each other. The world is changing, life is changing, and smart technologies keep improving. As a part of this trend, a smart device called the smart mirror has been created. The smart mirror concept is inspired by people's basic habit of looking into the mirror every day, which raised the question of why this mirror cannot be made smart. The result of this thought process is the "smart mirror". Keywords Alexa · Raspberry Pi · AIoT (Artificial Internet of Things) · Face recognition · Flask
K. N. Pallavi (B) NMAM Institute of Technology, Nitte, India e-mail: [email protected] J. N. Kalshetty Nitte Meenakshi Institute of Technology, Bangalore, India e-mail: [email protected] M. Suresh Oracle, Bangalore, India M. B. Kunder Atos Syntel, Bangalore, India K. Shetty Netanalytiks Technologies Pvt Ltd, Bangalore, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_10
1 Introduction
A smart mirror is a customised mirror that shows the date, time, local weather forecasts, real-time news, upcoming calendar events, social media feeds [1] and so on. The big limitation of any existing mirror is that it only shows the object in front of it, such as a human face; there is no interaction. Voice instructions through the Amazon Alexa Voice Service or Google Home Assistant are used to communicate with the user. With voice commands, one can interact with the screen by asking questions, setting reminders, playing songs and doing many other things. This project was developed with inspiration drawn from people [2] who spend quality time in front of the mirror. A smart mirror acts as a single-window system through which a person can obtain all information. To get the latest news updates, weather forecasts [3] or any other information, people normally have to turn to time-consuming television or mobile apps. To eliminate these problems and make the search easy, the smart mirror [4] concept is introduced: all the necessary information, such as the weather forecast and the latest news, can be found in one place. The smart mirror works with the help of Alexa and Jitsi Meet. Jitsi Meet is a set of open-source projects which empower users to deploy video conferencing platforms with state-of-the-art video quality and features. Alexa, also known as Amazon Alexa, is a virtual assistant technology by Amazon. The smart mirror is programmed in the Python language. It can be connected to a camera near the door. Whenever the door is locked, the smart mirror becomes active. If the camera captures a person, the smart mirror checks whether that person [5] is authorised or not; if it concludes that the person is not authorised, it sends an email alert to the user. Also, whenever the user wants to start a meeting [6] or conference call, the user just says "Alexa, ask Mirror to start meeting". With this voice command, the smart mirror joins a Jitsi Meet video conferencing call with friends and family [7, 8].
2 Proposed Methodology
2.1 Functional Requirements
2.1.1 Software Requirements
OpenCV is an open-source computer vision library with machine learning capabilities. It is widely used for applications such as video analysis and image processing. With its assistance, the computer can process and understand images and videos. Here, OpenCV is used for face recognition of the user.
Raspbian OS: Raspberry Pi hardware is supported by a free, optimised operating system called Raspbian OS. Raspbian contains over 35,000 packages with many pre-built functions and is very easy to install on a Raspberry Pi computer.
Python: Python is a powerful programming language whose main advantage is that it is easy to learn. Python provides high-quality data structures and a simple, effective approach to object-oriented programming. Its elegant syntax makes it an excellent language for scripting and for developing applications quickly on many platforms.
2.1.2 Hardware Requirements
Raspberry Pi: It is a credit-card-sized computer [9]. The original intention behind the Raspberry Pi was educational, and its development was inspired by the BBC Micro computer of 1981. Its creator, Eben Upton, wanted to build a computer that would help people understand hardware better and improve their programming [10] skills cost-effectively. It is small and affordable, and it has been quickly adopted by electronics enthusiasts for projects.
Speaker: The speaker is required to provide voice output.
Webcam: The webcam is used to detect the user's face. Any type of webcam is compatible with the Raspberry Pi.
Mirror: A special two-way mirror is used in this project. Unlike a normal mirror, a two-way mirror is not painted with an opaque colour on the back.
Microphone: The microphone is required to provide voice input.
Mouse: The mouse is used for navigation.
Keyboard: The keyboard is used to provide input.
2.2 Software Approach
2.2.1 IoT
The Internet of Things (IoT) is a network of physical objects. These objects are embedded with software, sensors and other technologies for the purpose of connecting and exchanging data with other applications or devices through the Internet. They can be common household items or complex industrial tools. More than 7 billion IoT devices are connected today, and experts expected this number to grow to 10 billion by 2020 and 22 billion by 2025.
2.2.2 Artificial Intelligence
Artificial intelligence (AI) is intelligence manifested by machines [11]. It differs from natural intelligence, which is displayed by humans and animals in the form of knowledge, emotions, empathy, and so on. AI and IoT cannot be separated [12]; AI plays a vital role in the world of technology, and combining AI with IoT improves both technologies. IoT deals with connecting two or more objects, networks or sensors to enable data transfer for applications, while AI enables the analysis of the resulting data, providing important information and supporting more informed decisions.
2.2.3 Amazon Alexa
Using a script, the Alexa Voice Service can be set up on the Raspberry Pi. The Alexa Voice Service makes it possible to do many different things, such as playing favourite songs or checking cricket scores by voice command. The steps involved in setting up Amazon Alexa are:
1. Go to the Amazon Developer portal and register an AVS device.
2. Install and configure the AVS Device SDK dependencies on the Raspberry Pi.
3. Create an AVS sample app and use it on the Raspberry Pi.
2.2.4 Jitsi Meet Video Conferencing
The smart mirror can be used for video conferencing with friends and family. The Jitsi Meet tool belongs to the Jitsi collection of open-source software, which provides voice, video conferencing and instant messaging services. Jitsi Meet allows group video calls without creating an account, which is the most useful advantage for the smart mirror: even people who have no experience with smartphones can use it. Users can conduct or join a meeting from the web; there is no compulsion to download the Jitsi Meet application. This makes it easy to run multiple meetings per day and manage data at all times.
2.3 System Design
2.4 System Description
The set-up of the AI-based smart mirror is shown in Fig. 1:
Fig. 1 High-level architecture
• The Raspberry Pi is connected to the power supply. A micro SD card with Raspbian installed is placed in its slot.
• A monitor with HDMI input serves as the screen of the smart mirror. Any display monitor with an available HDMI input can be used; it is connected to the Raspberry Pi using an HDMI cable.
• A two-way mirror or one-way reflective film is placed on the monitor so that it acts as a mirror.
• The USB speaker, mouse, keyboard and microphone are connected to their respective slots.
• The Raspberry Pi ribbon camera is attached for facial recognition.
3 Implementation
3.1 Displaying Information
A Python program was written to display details such as the date, time, a greeting message, news, weather and calendar events on the mirror. Weather information is obtained from Google Weather, news from Google News and events from the Google Calendar API.
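As a rough illustration, a minimal sketch of such a display loop is given below using Tkinter. The actual program and the Google data sources it calls are not published with the paper, so the window layout and the fetch_headlines placeholder are assumptions.

```python
# Minimal sketch of a mirror display loop (assumed layout; the paper's
# actual program and its Google API calls are not shown).
import time
import tkinter as tk

def fetch_headlines():
    # Placeholder: the real system pulls headlines from Google News.
    return ["Headline one", "Headline two"]

root = tk.Tk()
root.configure(bg="black")          # black background blends behind the mirror
clock = tk.Label(root, fg="white", bg="black", font=("Helvetica", 48))
news = tk.Label(root, fg="white", bg="black", font=("Helvetica", 18))
clock.pack(pady=20)
news.pack(pady=10)

def refresh():
    clock.config(text=time.strftime("%H:%M:%S  %a %d %b %Y"))
    news.config(text="\n".join(fetch_headlines()))
    root.after(1000, refresh)       # update once per second

refresh()
root.mainloop()
```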
3.2 Amazon Alexa
Using the script, the Alexa Voice Service is set up on the Raspberry Pi, giving the mirror the ability to do various things such as playing favourite songs or checking cricket scores by voice command. The steps involved in setting up Amazon Alexa are as follows:
1. Go to the Amazon Developer portal and register an AVS device.
2. Install and configure the AVS Device SDK dependencies on the Raspberry Pi.
3. Create an AVS sample app and use it on the Raspberry Pi.
3.3 Face Recognition
OpenCV is an open-source computer vision and machine learning library. It is frequently used for applications like video analysis and image processing; with its assistance, the computer processes images and videos and helps in recognising them. Here, OpenCV is used for recognising the face of the user. Initially, the LBPH face recogniser model is trained: once labelled images of the user are ready, the training is performed, and after successful training the recogniser can be used for face recognition. The name of the detected face is also displayed on the web for future use. LBPH is an algorithm used for face detection [14]. It can detect side and front faces and is known for giving high performance. It compares the input face with the registered faces and is thus able to identify images. Images are stored as a matrix of pixels, and LBPH makes use of this matrix for its facial detection capability.
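A minimal sketch of this training and prediction flow with OpenCV's LBPH recogniser (available in the opencv-contrib package) might look as follows; the directory layout and label mapping are assumptions, not the authors' code.

```python
# Sketch of LBPH training and prediction (assumed file layout, not the
# authors' code). Requires opencv-contrib-python.
import os
import cv2
import numpy as np

def load_faces(folder):
    # Assumes grayscale face crops saved as "<label_id>_<n>.png".
    images, labels = [], []
    for name in os.listdir(folder):
        img = cv2.imread(os.path.join(folder, name), cv2.IMREAD_GRAYSCALE)
        if img is not None:
            images.append(img)
            labels.append(int(name.split("_")[0]))
    return images, np.array(labels)

recognizer = cv2.face.LBPHFaceRecognizer_create()
faces, labels = load_faces("training_faces")
recognizer.train(faces, labels)

# Predict on a new frame: detect the face first, then classify the crop.
detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
frame = cv2.imread("test.png", cv2.IMREAD_GRAYSCALE)
for (x, y, w, h) in detector.detectMultiScale(frame, 1.3, 5):
    label, confidence = recognizer.predict(frame[y:y + h, x:x + w])
    print(label, confidence)   # a lower confidence value means a closer match
```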
3.4 Door Lock and Thief Detection
Whenever the door is locked, the smart mirror is active. If the camera captures a person, the smart mirror checks whether that person is authorised or not; if it concludes that the person is not authorised, it sends an email alert to the user. A page through which the user can open or close the door is created using Flask on the Raspberry Pi's localhost (http://192.168.43.173:5000/door). In the real world, a switch and a camera can be placed on the door, with the camera connected to the smart mirror. When the door is locked, the camera is turned on, and if any unknown person is detected [15] using the OpenCV library, an email alert is sent.
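A minimal sketch of such a Flask door page and alert email is shown below; the route name, addresses and credentials are illustrative assumptions rather than the authors' actual code.

```python
# Sketch of the door-control page and email alert (illustrative names and
# credentials; not the authors' code).
import smtplib
from email.message import EmailMessage
from flask import Flask

app = Flask(__name__)
door_locked = False

def send_alert(snapshot_path):
    msg = EmailMessage()
    msg["Subject"] = "Smart mirror: unknown person detected"
    msg["From"] = "mirror@example.com"       # assumed sender
    msg["To"] = "owner@example.com"          # assumed recipient
    msg.set_content("An unrecognised face was captured at the door.")
    with open(snapshot_path, "rb") as f:
        msg.add_attachment(f.read(), maintype="image", subtype="png")
    with smtplib.SMTP_SSL("smtp.example.com") as smtp:
        smtp.login("mirror@example.com", "app-password")
        smtp.send_message(msg)

@app.route("/door/<action>")
def door(action):
    global door_locked
    door_locked = (action == "close")   # closing the door arms the camera
    return f"Door {'locked' if door_locked else 'open'}"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```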
3.5 Jitsi Meet Video Conferencing
With the smart mirror, video conferencing with friends and family is possible. As described in Sect. 2.2.4, Jitsi Meet allows group video calls from the web without creating an account or downloading an application, so even people with no smartphone experience can use the smart mirror, and multiple meetings can be run per day. The user can simply ask Alexa, "Ask Mirror to start meeting", and join the Jitsi Meet video conferencing call with friends and family. The ngrok library is used for forwarding the video conferencing address to the Raspberry Pi.
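One way to expose the locally hosted page through a tunnel, assuming the pyngrok wrapper is the "ngrok library" meant here, is sketched below.

```python
# Sketch: exposing the local Flask server through an ngrok tunnel, assuming
# the pyngrok package is the "ngrok library" referred to above.
from pyngrok import ngrok

public_url = ngrok.connect(5000)   # tunnel to the local Flask server on port 5000
print("Forwarding address:", public_url)
```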
4 Developed System
An interactive, futuristic smart mirror is designed using AI and IoT on the Raspberry Pi. Artificial intelligence is used in the face recognition and voice command services. Many devices based on AIoT technology have been introduced, and such devices supply comfortable, stable, reliable and futuristic personal services everywhere. The face detection task is performed using OpenCV: the mirror recognises the user's face and carries out further processing with the use of the Raspberry Pi. The Raspberry Pi is the most critical part of the smart mirror and acts as its processing unit. The Pi is programmed using the Python language; since it offers many built-in IDEs, it can be programmed in different languages like C, Java, C++ and Python. Installing an OS on the Raspberry Pi is a simple process. The mirror displays the following information:
• Weather report: displays climate forecasts.
• Local news: displays information, announcements and headlines based on favourite subjects.
• Alexa: adds voice instructions and provides assistance to the smart mirror.
• Facial recognition: recognises the user.
• Calendar: displays upcoming events.
• Sends mail on detecting an unknown person.
• Jitsi Meet video conferencing calls with friends and family.
5 Results
Overall set-up of the smart mirror: the monitor, speaker, microphone, keyboard and mouse are all connected to the Raspberry Pi (Fig. 2). The mirror displays the date, time, weather, news, events, etc. obtained using Google-provided APIs (Fig. 3). Amazon Alexa voice service: as shown in the figure, Alexa can interact with users; the output shows the listening, thinking and speaking states of Alexa (Fig. 4).
Fig. 2 Overall set-up of the smart mirror
Fig. 3 Display information
Fig. 4 Alexa voice services
Face detection using OpenCV: the "hey beautiful" message is replaced with "Hi username" here (Fig. 5). Door close and open using the website: the website created to open and close the door is displayed here; if the door is closed, security is ON (Fig. 6). Thief detection: when the door is closed, if any other human face is captured, a thief is detected (Fig. 7).
Fig. 5 Detect face
Fig. 6 Door open and close
Fig. 7 Detect thief
Mail sent: a picture of the detected thief is sent to the user's mail (Fig. 8). Jitsi Meet video conferencing: Jitsi can be used for interacting with friends and family through the mirror (Fig. 9).
Fig. 8 Mail received
6 Conclusion
We have designed a futuristic smart mirror which is interactive with its user and additionally provides exciting home services. The mirror display is provided through an LCD monitor which shows news, weather, messages, search engines, conference calls, etc. on one screen. The smart mirror is a unique way of making a smart
Fig. 9 Jitsi Meet
interacting device. The device is reliable and easy to use in the interactive world. Throughout the project, the main intention was to design an interactive device for the home, and the smart mirror will be most useful in smart home design. The service-oriented architecture has been tailored for the improvement and deployment of numerous applications in places like shopping malls, hospitals and offices, where the mirror may be used for news feeds and Internet service communication mechanisms. The facial recognition technology may be used to provide further security. The houses of the future will be brilliantly designed using smart technology, making human life comfortable, easier and more enjoyable.
References 1. B. Cvetkoska, N. Marina, D.C. Bogatinoska, Z. Mitreski, Smart mirror E-health assistant— Posture analyze algorithm proposed model for upright posture, in IEEE EUROCON 2017—17th International Conference on Smart Technologies, Ohrid, 507–512 (2017) 2. M.M. Yusri et al., Smart mirror for smart life, in 2017 6th ICT International Student Project Conference (ICT-ISPC), Skudai, 1–5 (2017) 3. D. Gold, D. Sollinger and Indratmo, SmartReflect: a modular smart mirror application platform, in 2016 IEEE 7th Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON), Vancouver, BC, 1–7 (2016)
4. O. Gomez-Carmona, D. Casado-Mansilla, SmiWork: an interactive smart mirror platform for workplace health promotion, in 2017 2nd International Multidisciplinary Conference on Computer and Energy Science (SpliTech), Split, pp. 1–6 (2017)
5. S. Athira, F. Francis, R. Raphel, N.S. Sachin, S. Porinchu, S. Francis, Smart mirror: a novel framework for interactive display, in 2016 International Conference on Circuit, Power and Computing Technologies (ICCPCT), Nagercoil, pp. 1–6 (2016)
6. M. Rodriguez-Martinez et al., Smart mirrors: peer-to-peer web services for publishing electronic documents, in 14th International Workshop on Research Issues on Data Engineering: Web Services for e-Commerce and e-Government Applications, Proceedings, pp. 121–128 (2004)
7. Y.-C. Yu, S.-C.D. You, D.-R. Tsai, Magic mirror table with social-emotion awareness for the smart home, in 2012 IEEE International Conference on Consumer Electronics (ICCE), Las Vegas, NV, pp. 185–186 (2012)
8. M.A. Hossain, P.K. Atrey, A.E. Saddik, Smart mirror for ambient home environment, in 2007 3rd IET International Conference on Intelligent Environments, Ulm, pp. 589–596 (2007)
9. J. Markendahl, S. Lundberg, O. Kordas, S. Movin, On the role and potential of IoT in different industries: analysis of actor cooperation and challenges for introduction of new technology, in 2017 Internet of Things Business Models, Users, and Networks, Copenhagen, pp. 1–8 (2017)
10. S.S.I. Samuel, A review of connectivity challenges in IoT-smart home, in 2016 3rd MEC International Conference on Big Data and Smart City (ICBDSC), Muscat, pp. 1–4 (2016)
11. P. Maheshwari, M.J. Kaur, S. Anand, Smart mirror: a reflective interface to maximize productivity. Int. J. Comput. Appl. 166(9), (0975–8887) (2017)
12. A. Sungheetha, R. Sharma, Real time monitoring and fire detection using Internet of Things and cloud based drones. J. Soft Comput. Paradigm (JSCP) 2(03), 168–174 (2020)
13. J.I.Z. Chen, L.-T. Yeh, Graphene based web framework for energy efficient IoT applications. J. Inf. Technol. 3(01), 18–28 (2021)
14. I.J. Jacob, P.E. Darney, Design of deep learning algorithm for IoT application by image based recognition. J. ISMAC 3(03), 276–290 (2021)
15. E. Kovatcheva, R. Nikolov, M. Madjarova, A. Chikalanov, Internet of Things for wellbeing—pilot case of a smart health cardio belt, in IFMBE Proceedings, pp. 1221–1224 (2014)
AI-Assisted College Recommendation System Keshav Kumar, Vatsal Sinha, Aman Sharma, M. Monicashree, M. L. Vandana, and B. S. Vijay Krishna
Abstract For an aspiring undergraduate student, choosing which college and courses to apply to is a conundrum; students often wonder which colleges are best suited for them. This issue has been addressed by building an artificial intelligence (AI)-driven recommendation engine developed using the latest techniques in machine intelligence. The system builds user profiles by explicitly asking users questions, maps them to college profiles gathered by web scraping, and then generates recommendations based on novel hybrid recommendation techniques. The main aim of this system is to utilize artificial intelligence-based techniques to build an efficient recommendation engine, particularly for college students, to help them select the college which best suits their interests. Keywords Recommender · User profiling · Similarity metric · Collaborative filtering · Content based · Hybrid recommender system
1 Introduction
Nowadays, students are provided services like career counseling or career guidance by many companies and websites, which also provide tips for aspiring undergraduate students through psychometric tests. But none of them provide college recommendations. The needs and requirements of every student are unique; therefore, the recommender system should be able to provide recommendations that consider opinions from the student community as a whole as well as the user's individual preferences. Demographic factors also play a crucial role in this.
K. Kumar (B) · V. Sinha · A. Sharma · M. Monicashree · M. L. Vandana Computer Science, PES University, Bengaluru, India e-mail: [email protected]
M. L. Vandana e-mail: [email protected]
B. S. Vijay Krishna CTO, nSmiles, Bengaluru, India e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_11
A knowledge-based recommender system fulfills this purpose by attempting to suggest objects based on inferences about a user's needs [1–16]. The proposed work obtains the user profile by explicitly asking users questions and then uses that profile to generate a score for each college with respect to the user. The college data has been scraped from the Internet. Finally, the students are able to get top college recommendations. This system can be extended to students in other fields, streams or grades. Recommendation engines are systems that suggest items in which users might be interested. Such systems play a vital role in almost every sector that involves selling or presenting items to customers [7–10], helping users filter the heaps of information in the product space. The various types of recommendation systems include:
• Collaborative filtering
• Content-based
• Knowledge-based
• Hybrid
A. Collaborative Filtering: Collaborative filtering uses the profiles of different users and generates recommendations based on similar user profiles. In simple words, similar users will like similar things in the future. For example, if a teenager likes Iron Man, then it is highly likely that other teenagers will also like Iron Man. This is done through user–user similarity. Advantages of collaborative filtering include:
a. These systems work on ratings provided by users and hence do not require any additional information from users.
b. The recommendations generated by this method are completely new, as it considers users with similar behavior.
Disadvantages are:
a. It suffers from a cold start problem, as it cannot generate recommendations if no ratings are available.
b. If there is very little data, the accuracy is very poor [11, 12].
B. Content-Based: In this system, the recommendations are generated with the help of the items; to be precise, the similarity between the features of items is used [5, 6], hence item–item similarity. For example, if a person likes books on science fiction, then other books in the science fiction category can be recommended. Advantages of content-based systems are:
a. It is independent of other users, as it only uses ratings provided by the current user, unlike collaborative filtering.
b. It is highly capable of recommending new items which are yet to be rated.
c. It is transparent, as it shows on which features the recommendations were based.
Disadvantages are:
a. The system needs many features to generate distinguishable recommendations.
b. There can be situations where it generates only similar items and nothing new.
C. Hybrid Based: To counter the disadvantages of collaborative filtering and content-based approaches, hybrid approaches are used, which mostly involve user–item similarity. In this paper, a weighted hybrid model is used, which applies a user–item similarity metric to generate accurate recommendations [13–15].
The system developed in this paper helps students get college recommendations with the help of psychometric tests.
2 Methodology
The proposed system adapts machine learning techniques for the specific task of assisting students in choosing a college. The architecture of the system is shown in Fig. 1. The novel approach here is that the scores for each college are generated with respect to the user, by calculating the cosine similarity between the user preferences vector and the college profile vector. The scores are not calculated directly, because a few features carry higher weights than others; for example, placement ratings are much more important than campus infrastructure ratings. Hence, each feature used to build the vector has a different weight assigned to it. To obtain the feature weights, a survey was performed on more than 50 students who were searching for colleges, and from the results all features were divided into three weight classes. For example, the placement, safety and food ratings were given the most preference and were therefore assigned to the weight class with the highest value. To build the user profile, the questions are based on the very features that are used to calculate the score with respect to a college [4]; the answers to these questions build the user profile. This also addresses the cold start problem that collaborative filtering recommender systems suffer from, as the data is gathered explicitly from the user. The user profile and college profile have identical structures, as scores are calculated using a similarity metric between user and college. The equation for the score is:
Score = w1 × sim(U1, C1) + w2 × sim(U2, C2) + w3 × sim(U3, C3)
where w1, w2, w3 are the weights, and sim stands for the similarity score between the user features and the college features.
Fig. 1 Architecture diagram
U1, U2, and U3 represent subsets of the user feature array; C1, C2, and C3 represent subsets of the college feature array.
How the equation works: suppose there is one user profile U and two college profiles C1 and C2. Also, let there be two weights w1 = 2 and w2 = 1, where w1 carries more weight than w2. The values for the user profile and the college profiles are:
U = [1, 2, 4, 2, 1, 3, 3, 4, 1, 2]
C1 = [1, 2, 3, 3, 1, 1, 2, 4, 1, 3]
C2 = [4, 4, 1, 2, 4, 3, 3, 1, 2, 3]
Let the first five features of the vector carry weight w1 and the other five weight w2. The score of college C1 according to the equation is then:
Score(C1) = w1 × sim(U[1:5], C1[1:5]) + w2 × sim(U[6:10], C1[6:10])
= 2 × sim([1, 2, 4, 2, 1], [1, 2, 3, 3, 1]) + 1 × sim([3, 3, 4, 1, 2], [1, 2, 4, 1, 3])
= 2.842
Similarly, the score for college C2 works out to 2.142. From these values, it can easily be determined that college C1 is a better match than college C2. The similarity metric used here is cosine similarity. The scores can easily be converted to percentages, as the maximum attainable value is 3.
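As a quick illustration, the following minimal Python sketch reproduces the two scores above with the weighted cosine formula; the vector values and weights come from the worked example, while the function names are illustrative.

```python
# Minimal sketch of the weighted cosine score, reproducing the worked
# example above (function names are illustrative).
import numpy as np

def cosine(u, c):
    u, c = np.asarray(u, float), np.asarray(c, float)
    return float(u @ c / (np.linalg.norm(u) * np.linalg.norm(c)))

def score(user, college, w1=2, w2=1):
    # The first five features carry weight w1, the last five weight w2.
    return w1 * cosine(user[:5], college[:5]) + w2 * cosine(user[5:], college[5:])

U  = [1, 2, 4, 2, 1, 3, 3, 4, 1, 2]
C1 = [1, 2, 3, 3, 1, 1, 2, 4, 1, 3]
C2 = [4, 4, 1, 2, 4, 3, 3, 1, 2, 3]

print(round(score(U, C1), 3))   # 2.842
print(round(score(U, C2), 3))   # 2.142
```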
c. d. e. f. g. h. i.
First, the user interacts with the interface and is asked to make an account by providing their credentials. After authentication, the user is provided with a list of questions with multiple options. There are no right or wrong answers as these questions are psychometric in nature. The user now interacts with the interface to answer all the questions and then submits their answer. The server receives the user inputs and builds a profile for that user. The recommender logic is applied to the user profile and all the college profiles present in the database. The score for each college w.r.t user is generated. After generating all the recommendations, they are sorted in descending order based on scores. The server now sends the top 50 recommendations to the front end of the interface. The recommendations are displayed to the user.
3 Implementation The system developed in this paper uses a combination of various technologies from the web intelligence domain. The front end of the system is developed using AngularJS [17], and it was chosen because this system was developed as per requirements by ‘nSmiles’ because this system has been developed in collaboration with ‘nSmiles’. The database layer is implemented using MongoDB [18] which is a NoSQL database. The server which receives requests and sends responses is developed using technologies or libraries in Python. To develop API, Flask was used which is written in Python. To calculate recommendations, pandas and ‘SciPy’ libraries in Python were used. Dataset The database of the colleges was developed using Web scraping [2]. Various websites which provide the information of colleges for free were used to scrape information. The tools used for web scraping were free software like Parse Hub and Python libraries. After gathering the dataset of almost 5400 colleges, the dataset was cleaned and pre-processed for use. The college features were converted to values in a specific range with the highest value implying that the feature having that value is one of the best.
146
K. Kumar et al.
Fig. 2 College data in JSON format
The dataset was then imported into MongoDB and hosted on the cloud with the help of MongoDB Atlas, which is a free service (Fig. 2). A webhook was then developed on MongoDB Realm, which sends the data of all colleges to the server.
Server
The experimental server was developed with the help of Flask and other data science libraries in Python. The server first receives the request from the front end, which consists of an array storing the user profile gathered from the questions on the front end. After receiving the request, the server connects to the college database through the webhook hosted on MongoDB Realm. The college data is received in JSON format and converted to a pandas data frame. The user profile array and the college data are then used to generate the score of each college with respect to the user. The recommendations are sorted by score, and the top 50 are sent as a JSON response to the front end, which interprets and displays them to the user.
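A minimal sketch of such an endpoint is given below; the route name, payload shape and the stubbed college fetch are illustrative assumptions, not the authors' actual code.

```python
# Sketch of the recommendation endpoint (illustrative route and payload;
# the real server pulls colleges through a MongoDB Realm webhook).
import numpy as np
from flask import Flask, jsonify, request

app = Flask(__name__)

def cosine(u, c):
    return float(u @ c / (np.linalg.norm(u) * np.linalg.norm(c)))

def fetch_colleges():
    # Placeholder: a static list keeps the sketch self-contained and runnable.
    return [{"name": "College A", "features": [1, 2, 3, 3, 1, 1, 2, 4, 1, 3]},
            {"name": "College B", "features": [4, 4, 1, 2, 4, 3, 3, 1, 2, 3]}]

@app.route("/recommend", methods=["POST"])
def recommend():
    user = np.asarray(request.get_json()["profile"], dtype=float)
    ranked = []
    for college in fetch_colleges():
        feats = np.asarray(college["features"], dtype=float)
        s = 2 * cosine(user[:5], feats[:5]) + 1 * cosine(user[5:], feats[5:])
        ranked.append({"name": college["name"], "score": round(s, 3)})
    ranked.sort(key=lambda c: c["score"], reverse=True)
    return jsonify(ranked[:50])   # top 50 recommendations as JSON
```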
Features used
A total of 15 features are used for calculating the scores for each college; a few examples are placements, food quality and hostel. The features do not have the same priority: each feature has an associated weight, calculated by conducting surveys on more than 50 students in classes 11–12 who are looking for colleges. The most important features, such as placements and hostel, were assigned the highest weight. The questions are based on the features of the colleges. For example, for the feature 'research', the question can be 'How important are the research facilities provided by the college to you?'. Since these questions are psychometric in nature, there are no right or wrong answers (Fig. 3). The options for each question are:
• Most important
• More important
• Less important
• Least important
Fig. 3 Questions
4 Experiments and Observations
After building the system, the server was tested with the help of API testers like Insomnia and Postman. The server was sent dummy arrays representing user profiles and returned its response in JSON format in about 4 s on average. The cloud-hosted database was also checked to confirm that it was sending data in the correct format. The front end was implemented in AngularJS, with an interface consisting of questions and answers. The system is able to flawlessly generate recommendations and also shows the score for each recommendation (Fig. 4). The recommender logic was also checked by providing a biased input, and the system works as expected.
5 Conclusions
Students have access to many career counseling services, but no service provides personalized college recommendations. This paper describes a system which provides one solution using the latest techniques from the field of machine learning and addresses the described issue. It handles some issues that a normal recommendation system has, such as the cold start problem. The user only needs to answer questions, and the recommendations are generated with ease. The system properly uses the concept of knowledge-based recommenders and was built using a microservices architecture. Refinements for better UI and performance can be made in the future.
Fig. 4 Recommendations
Acknowledgements This paper is based on ideas provided by ‘nSmiles’. ‘nSmiles’ provides various services for mental health in workplaces as well as career counseling services for students. The project is acknowledged by the CTO of ‘nSmiles’ Mr. B.S. Vijay Krishna and Prof. Vandana M.L., PES University.
References
1. S. Bouraga, I. Jureta, S. Faulkner, C. Herssens, Knowledge-based recommendation systems: a survey. Int. J. Intell. Inf. Technol. (IJIIT) 10(2), 1–19 (2014)
2. A.V. Saurkar, K.G. Pathare, S.A. Gode, An overview on web scraping techniques and tools. Int. J. Future Revolution Comput. Sci. Commun. Eng. 4(4), 363–367 (2018)
3. R. Burke, Knowledge-based recommender systems. Encycl. Libr. Inf. Syst. 69
4. S. Girase, V. Powar, D. Mukhopadhyay, A user-friendly college recommending system using user-profiling and matrix factorization techniques, in 2017 International Conference on Computing, Communication and Automation (ICCCA) (IEEE, 2017), pp. 1–5
5. Z. Cui, X. Xu, F. Xue, X. Cai, Y. Cao, W. Zhang, J. Chen, Personalized recommendation system based on collaborative filtering for IoT scenarios. IEEE Trans. Serv. Comput. 13(4), 685–695 (2020)
6. F. Xue, X. He, X. Wang, J. Xu, K. Liu, R. Hong, Deep item-based collaborative filtering for top-n recommendation. ACM Trans. Inf. Syst. (TOIS) 37(3), 1–25 (2019)
7. Y. Afoudi, M. Lazaar, M. Al Achhab, Hybrid recommendation system combined content-based filtering and collaborative prediction using artificial neural network. Simul. Model. Pract. Theory 113, 102375 (2021)
8. W. Haoxiang, S. Smys, Big data analysis and perturbation using data mining algorithm. J. Soft Comput. Paradigm (JSCP) 3(01), 19–28 (2021)
9. J. Samuel Manoharan, Patient diet recommendation system using K-clique and deep learning classifiers. J. Artif. Intell. 2(2), 121–130 (2020)
10. M.C.V. Joe, J.S. Raj, Location-based orientation context dependent recommender system for users. J. Trends Comput. Sci. Smart Technol. (TCSST) 3(01), 14–23 (2021)
11. S. Milano, M. Taddeo, L. Floridi, Recommender systems and their ethical challenges. AI Soc. 35(4), 957–967 (2020)
12. S. Wang, L. Hu, Y. Wang, L. Cao, Q.Z. Sheng, M. Orgun, Sequential recommender systems: challenges, progress and prospects. arXiv preprint arXiv:2001.04830 (2019)
13. M. Karimi, D. Jannach, M. Jugovac, News recommender systems–survey and roads ahead. Inf. Process. Manage. 54(6), 1203–1227 (2018)
14. P. Kouki, J. Schaffer, J. Pujara, J. O'Donovan, L. Getoor, Personalized explanations for hybrid recommender systems, in Proceedings of the 24th International Conference on Intelligent User Interfaces (2019), pp. 379–390
15. A.C. Harkare, N. Pali, N. Khivasara, I. Jain, R. Murumkar, Personalized college recommender: a system for graduate students based on different input parameters using hybrid model
16. J. Wu, Knowledge-Based Recommender Systems: An Overview, available at medium.com
17. Angular, available at https://angular.io/
18. MongoDB, available at https://docs.mongodb.com
An Agent-Based Model to Predict Student Protest in Public Higher Education Institution T. S. Raphiri, M. Lall, and T. B. Chiyangwa
Abstract The purpose of this paper is to design and implement an agent-based model of student protests to predict the emergence of protest actions. Student protest actions have often resulted in damage to property and the cancellation of academic programs. In this study, an agent-based model that integrates grievance as relative deprivation, perceived risk, and various network effects was used to simulate student protest. The results from a series of simulated experiments show that social influence, political influence, sympathy, net risk, and grievance were statistically significant factors contributing to the probability of students engaging in a protest. Based on logistic regression, the model accounts for 93% of the variance (Nagelkerke R Square) through the variables included in the equation. For university management to effectively handle unruly student protest actions, policies on risk management strategies should place more emphasis on understanding the network structures that shape students' interactions, in order to monitor the propagation of opinions. Keywords Agent-based model · Student protest · Social conflicts · Collective behavior · Network influence
1 Introduction
Public universities in South Africa have been negatively affected by recent student protests, which continue to be prevalent even after more than two decades of democracy. For instance, October 2015 marked the beginning of the #FeesMustFall student movement experienced by South African universities post-apartheid [1]. This
T. S. Raphiri (B) · M. Lall Department of Computer Science, Tshwane University of Technology, Gauteng, South Africa e-mail: [email protected]
M. Lall e-mail: [email protected]
T. B. Chiyangwa Computer Science Department, University of South Africa, Gauteng, South Africa e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_12
#FeesMustFall protest was triggered by a fee increment at Wits University and rapidly spread to other universities across the country. Students continue to be frustrated with several issues, including the lack of transformation and high inequalities in South African universities. This lack of transformation and the unequal distribution of educational resources increase the level of frustration and trigger protest behavior among students [2]. Dominguez-Whitehead [3] argues that students' grievances result from, among others, financial and academic exclusion, lack of financial aid, inadequate student residences, and crime occurring on campuses. Students have identified protest actions as an effective strategy to express their frustration and to challenge perceived injustices. Furthermore, the evolution and widespread accessibility of Internet technologies have made protest mobilization simpler than before and have changed the dynamics of social conflict. To an extent, social media has been used to recruit people to join protests and to share opinions and news about ongoing protest events [4]. These Internet platforms have become an easy target for political campaigning and protest mobilization, as witnessed during the #FeesMustFall movement. The emergence of social conflict events (such as protests, civil violence, and revolutions) is classified as a property of complex adaptive systems and can be modeled through agent-based models (ABM) [5–7]. Extensive research has examined student protests in several disciplines, including social and political studies [3, 8]; however, very little if any ABM has been developed to predict student protests at institutions of higher education. In this study, an ABM that integrates the effects of relative deprivation (RD), net risk, social influence, political influence, and sympathy influence is developed to predict student protests at a public institution of higher education. The constructed model assists in identifying students' micro-level behavioral patterns and the combinations of factors which result in protest action. Understanding this emergent behavior assists university management in identifying behavioral patterns that lead to a protest, and subsequently helps prevent damage to property, intimidation of staff and non-protesting students, and possible injuries [9]. This paper is organized as follows: Sect. 2 presents related work; Sect. 3 provides the methodology, which consists of model design, implementation, calibration, and simulation experiments; Sect. 4 provides the results and discussion, followed by the conclusion in Sect. 5.
2 Related Work
Studies based on crowd simulation have shown how incorporating social conflict theories into ABM can help develop useful techniques to examine protests [10–13]. Epstein [13] developed a widely adopted classical agent-based computational model of civil violence, and since then crowd simulation has evolved. In Epstein [13], civilians rebel if the difference between their grievance (G) and perceived net risk (N) exceeds a constant non-negative threshold (T), whereas law enforcement agents seek
to suppress rebelling civilians in their neighborhood. Grievance was denoted by the product of heterogeneous perceived hardship (H) and fixed government legitimacy (L), and net risk was represented as a function of an agent's risk perception (R) as well as the estimated arrest probability (P). The simplified behavioral rule of the agent in the model proposed by Epstein [13] is: "If G − N > T then 'rebel', else be 'quiet'." Simulation results qualitatively showed that certain parameter sets generate typical characteristics and dynamics of civil violence processes, such as endogenous and sporadic outbursts of violence. However, Epstein [13]'s model is simple, without any defined social interaction, but it serves as a baseline for later developments. The study of Kim and Hanneman [11] extended Epstein [13]'s ABM to simulate crowd dynamics in worker protests, to theoretically examine the interplay between race and class and to investigate patterns of protest waves. In [11], an agent's decision to protest is based on grievance, represented by relative deprivation resulting from wage inequalities; perceived risk of being arrested; and group identity, denoted by ethnic and cultural tags. The simulation results in [11] indicate that the frequency of protest is heavily influenced by wage inequalities (or grievances). However, Kim and Hanneman [11] only include neighborhood social interactions of agents, without any network structure or influence exerted by activists or community leaders. Furthermore, the study in [14] extended Epstein's model by incorporating three types of agents (protester, police, and media). Their proposed model qualitatively explores the interaction of agents in a street protest scenario, in order to understand the emerging crowd patterns, the influence of non-protesters, and the effects of news coverage. In [14], protesting citizens seek to invade an attraction point, while law enforcement agents with sufficient backup arrest violent protesters to defend the attraction points; the media agents seek to be close to the attraction points to capture the violent actions. The results in [14] show that the model reproduced real-life features of a protest, such as clustering of violent and active protesters, formation of a confrontation line, and occasional fights and arrests, with the media agents leading local clustering and seeking hot spots to capture the action as closely as possible. Although this model captures some realistic crowd dynamics of a protest event, the protesting civilians' social interaction is minimal and does not incorporate the effects of social, political, and sympathy influence in the formation of a protest. Other studies that extended Epstein [13]'s model include Ormazábal, Borotto and Astudillo [5], who proposed an ABM to explore civil violence dynamics when hardship is represented as a function of money distribution, aiming to evaluate the effect of inequalities in the distribution of money on social mobilizations. Again, the study of Fonoberova, Mezić, Mezić, Hogg and Gravel [15] presented an ABM of civil violence to explore the effect of non-neighborhood links on protest dynamics when varying cop agent density and network degree. Neither [5] nor [15] integrated social, political, and sympathy influence into their proposed models. This study was aimed at developing an ABM of student protests which extends Epstein's work.
The model integrates grievance, which is
defined by relative deprivation accumulated from discrepancies in resource distribution (inequality level). The model explores the effect of integrating social influence defined by undirected friendship ties, political influence denoted by directed activists’ links, and sympathy influence resulting from Moore’s neighborhood network graph.
3 Methodology
3.1 Model Description
The student protest agent-based model (STUDPRO) consists of two types of turtle objects: student (S) and law enforcement officer (LEO) agents. STUDPRO includes two sorts of student agents: activists and regular students. Students and LEOs interact in an abstract artificial environment defined by a forty-by-forty two-dimensional grid space. Additional entities in the model represent the linkages that operate as network graphs for the agents' interactions. Figure 1 depicts a high-level class diagram
Fig. 1 High-level class diagram of the model (own source)
Fig. 2 Model’s conceptual framework (own source)
of the model's entities. Each object has its own set of internal characteristics as well as processes. Students are heterogeneous agents that execute submodels allowing them to interact with one another through several network architectures, assess their next action state, and migrate to vacant patches. The ACTIVE? variable indicates each student's behavioral state, which is determined by using a threshold rule to assess whether the student is protesting or quiescent. Grievances, risks, and network effects such as social, political, and sympathy influence were all factors in students' decision to engage in protest activity, as depicted in Fig. 2. The green factors contribute positively toward participation, while red represents a negative effect. The threshold rule to evaluate the behavioral state is: if T ≤ RD + SInfl + PInfl + SymInfl − NR, then be active; else, be quiet. That is, if the constant threshold T ≡ 0.1 (taken from Epstein [13]) is less than or equal to the accumulated grievance (quantified as relative deprivation, RD) plus the network influences, which are the social effect (SInfl), political effect (PInfl), and sympathy effect (SymInfl), minus the net risk (NR), then the student becomes active. A student's perceived relative deprivation (RD) represents the source of grievance. The possible range of RD felt by each student with regard to accessibility to resource x, in relation to a certain reference group (social connections, political ties, and neighbors), is [0, x*], where 0 represents the minimum and x* the maximum of resources a student may have. For student i, [0, x_i] denotes the range of resources i can access, and [x_i, x*] represents the resources i is deprived of, which was used to quantify each unit of RD. Each unit of resource (x, x + dx) in [x_i, x*] is weighted by 1 − F(x), where F(x) = ∫_0^x f(y) dy is the cumulative resource distribution and 1 − F(x) is the frequency of students in the reference group having resource accessibility above x. The relative deprivation experienced was calculated using (1):
RD = ∫_{x_i}^{x*} [1 − F(y)] dy    (1)
Similar to the ABM proposed by Epstein [13], students decide to protest if grievances outweigh the consequences of participation. The perceived net risk (NR) of being suspended from academic activities was used to calculate each student's participation cost. Students calculate the participation cost, or perceived net risk, using (2):
NR = RA × [1 − e^(−k × V_law / S_active)] × J    (2)
As in Epstein [13], RA was a randomly assigned value ranging from 0 to 1 which represent a fixed and heterogeneous risk aversion of each student. Constant k = 2.3, while Vlaw represent number of visible LEO agents and Sactive denoted the number of students in active states in the vicinity. The visible radius was determined by number of sideway patches each student and officer agent can occupy, also referred as Moore (or indirect) neighborhood network [16], which was set by a slider when initializing the ABM. The maximum suspension term (J ) was set as homogenous value for student agents and assigned through a slider during model setup. Social effect was determined by interaction of students which resulted from symmetrical friendship network graphs integrated in the model. The structure of the students’ friendship network denoted by undirected graphG{S, L}, whereby S represent set of linked students and L are set of social ties or links between them. The constructed model only expresses fundamental attributes of network structure which include vertex degree (deg(S)), distance between nodes, and nodes clustering. deg(S) was defined by a fixed heterogeneous number of friendship ties ∈ {L 1 , . . . , L Random(1,Maxfriends) }, in which Maxfriends was obtained as input parameter through a slider in the model. In this study, students construct friendship ties by selecting students in the vision radius (V ) and by randomly choosing other students with equal STUDY-LEVEL which contains a value ranging from {0, 4} if deg(S) < L. This selection technique aided in generating a random network graph for realistic representation of friendship ties. Opinions which are represented as the different between the relative deprivation (RD) and net risk (NR) spread from aggrieved student to another via integrated social friendship network as summarized by (3). SInfl =
SInfl = Σ_{s ∈ ASN_t} (RD_s − NR_s) × ω₁    (3)
Here, ASN_t denotes the set of students protesting over time in the friendship network graph of a student, and ω₁ represents the global social effect weight, which was constant. The directed network graph G{A ∈ [A₁, …, A_num_activists], E ∈ [E₁, …, E_political_influence_size]} was integrated into the model to represent political influence
by activists. A and E define the set of activist nodes, in the range [1, num_activists], and the set of directed edges, in the range [1, political_influence_size], respectively. Edges were defined by an ordered node pair (a, n), oriented a → n; as a result, influence is initiated by an activist (a) and directed to a student node (n). Activists have a positive out-degree, which is determined by the POLITICAL-INFLUENCE-SIZE input parameter. Activists are sources with directed connections aimed toward a proportion of students picked at random from the population with the attribute POLITICAL-PARTICIPATION? equal to TRUE. A student can be associated with more than one activist. Equation (4) was used to calculate the political effect (PInfl) in this study:
PInfl = Σ_{a ∈ NAP_i} (RD_a − NR_a) × ω₂    (4)
Here, NAP_i denotes the set of opinions from activists directed toward regular students, each computed as RD minus NR, and ω₂ represents a constant global political effect weight. Student agents were sympathetic toward active students situated in patches within the vision radius. A Moore neighborhood graph G = {[x, y] : |x − x₀| ≤ r, |y − y₀| ≤ r} was used to incorporate the sympathy effect in this study, whereby [x₀, y₀] denotes the patch position occupied by a student, and [x, y] represents the patches adjacent to [x₀, y₀], inclusive of [x₀, y₀]. The maximum vision range, denoted by r, was defined by an input parameter initialized through a slider. For each student in the model, the sympathy effect (SymInfl) was calculated based on (5):
SymInfl = Σ_{s ∈ AV_{i,t}} (RD_s − NR_s) × ω₃    (5)
The differences between the RD and NR were used to define the propagating opinions that construct the sympathy effect. Meanwhile, AV_{i,t} represents the set of active students in the neighborhood network structure of a student, and ω₃ represents the global constant sympathy effect weight. Officers randomly catch one of the active students in their vision radius. Officers suspend active students by moving into the position of their patch. For each student, SUSPEND-TERM runs from 0 to J, where J was defined using the MAXIMUM-SUSPEND-TERM slider. Officers delay their next suspension by a number of time ticks in the range (0, SUSPEND-DELAY). Students and officers follow a standard method in which they evaluate adjacent patches within their vision radius and move at random to any patch that is not occupied by other students or officers.
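The decision logic above can be condensed into a short sketch. The following Python fragment is illustrative only: the function names are invented here, the NetLogo implementation described in Sect. 3.2 is the authoritative one, and the influence terms are assumed to be computed per Eqs. (3)–(5).

```python
import math
import random

T = 0.1  # activation threshold, taken from Epstein [13]
K = 2.3  # constant k used in Eq. (2)

def net_risk(risk_aversion, visible_leos, visible_actives, max_suspend_term):
    # Eq. (2): NR = RA * [1 - e^(-k * V_law / S_active)] * J
    ratio = visible_leos / max(visible_actives, 1)  # guard: no active students seen
    return risk_aversion * (1.0 - math.exp(-K * ratio)) * max_suspend_term

def behavioral_state(rd, s_infl, p_infl, sym_infl, nr):
    # Threshold rule: be active if T <= RD + SInfl + PInfl + SymInfl - NR
    return "ACTIVE" if rd + s_infl + p_infl + sym_infl - nr >= T else "QUIESCENT"

# Example tick for one student; RA is drawn uniformly from [0, 1] as in the model
ra = random.uniform(0.0, 1.0)
nr = net_risk(ra, visible_leos=2, visible_actives=10, max_suspend_term=30)
print(behavioral_state(rd=0.5, s_infl=0.05, p_infl=0.05, sym_infl=0.02, nr=nr))
```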
3.2 Model Implementation

In this study, NetLogo version 6.1 was selected because it is widely utilized by researchers and is user friendly [17]. The NetLogo 6.1 integrated development environment (IDE) was used to code the ABM of students' protests, and multiple
Fig. 3 NetLogo-integrated development environment (own source)
simulation experiments were conducted through BehaviorSpace [18]. The NetLogo IDE is a key tool that aids in the construction, simulation, and examination of models. In addition, the NetLogo IDE contains useful and easy-to-follow tutorials and documentation. BehaviorSpace aided in the off-screen execution of the model, as well as in parameter sweeping and recording the simulated data in a comma-separated values (CSV) file. The user interface of the model implemented in this study is depicted in Fig. 3.
3.3 Experimental Design

A simulation experiment was carried out in this study to investigate the effect of grievance, net risk, and the network influences, which are social, political, and sympathy, on the dynamics of students' protests. Table 1 shows the configuration of parameters that were shared across all simulation experiments. Note that the values flagged with ** are similar to those in [5, 11, 13], while the values for the other parameters were chosen based on the model's sensitivity analysis. Table 2 depicts the experimental variables, which include the decision variables utilized in the parameter sweeping procedure; the sweep itself is sketched below. With the shared parameters held constant, the emphasis was on the impact of inequality, suspension delay, friendship ties, the number of activists and their influence size, and sympathetic influence on the dynamics of student protests under the first and second simulation conditions.
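As a rough Python analogue of the BehaviorSpace sweep (the dictionary keys are invented labels; the level values are those of Table 2), the full-factorial design can be enumerated as follows:

```python
from itertools import product

levels = {
    "inequality": [0.02, 0.03, 0.04],       # IL: low / median / high
    "suspend_delay": [5, 7, 9],             # SD, in ticks
    "max_friend_ties": [5, 10, 15],         # NF: deg(S)
    "num_activists": [0.05, 0.10, 0.15],    # NA, fraction of the population
    "influence_size": [0.05, 0.10, 0.15],   # PI
    "sympathy": [False, True],              # first vs. second condition
}

# Full-factorial sweep: 3^5 * 2 = 486 parameter combinations
runs = [dict(zip(levels, combo)) for combo in product(*levels.values())]
print(len(runs), runs[0])
```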
Table 1 Shared parameters configurations (own source)

Parameter | Description | Value
S | Student population density | **70%
LEO | Law enforcement officers density | **4%
Jmax | Maximum suspend (layoff) term | **30 ticks
Q1 | Fixed arrest probability value | **2.3
V | Moore neighborhood vision radius | **7
AGENT_MOBILITY? | Activate students and officers movement | **True
ACTIVATE_SOCIAL_GROUP? | Activate student friendship links | True
ACTIVATE_POLITICAL_GROUP? | Activate activists mobilization links | True
FRIENDSHIP_WEIGHT | Weight of social friendship links | 0.05
POLITICAL_WEIGHT | Weight of activist links | 0.05
SYMPATHY_WEIGHT | Weight of sympathy to neighbors | 0.02
Lattice dimensions | Grid size (number of patches) | **40 × 40
Time step | Time ticks for each simulation experiment | **250
Table 2 Experimental variables (own source)

Condition | Inequality level (IL) | Suspend delay (SD) | Maximum friendship ties (NF) | Number of activists % (NA) | Activists influence size (PI) | Sympathy
First condition | Low (0.02) / Median (0.03) / High (0.04) | Low (5 ticks) / Median (7 ticks) / High (9 ticks) | Low (deg(S) = 5) / Median (deg(S) = 10) / High (deg(S) = 15) | Low (0.05) / Median (0.1) / High (0.15) | Low (0.05) / Median (0.1) / High (0.15) | False
Second condition | Low (0.02) / Median (0.03) / High (0.04) | Low (5 ticks) / Median (7 ticks) / High (9 ticks) | Low (deg(S) = 5) / Median (deg(S) = 10) / High (deg(S) = 15) | Low (0.05) / Median (0.1) / High (0.15) | Low (0.05) / Median (0.1) / High (0.15) | True
4 Results and Discussion

4.1 Data Screening

Data screening was conducted to ensure that the data captured from the simulated experiments and utilized in the statistical analysis met certain standards, so as to produce accurate and acceptable results. In this study, the simulated data were normally distributed for all cases, and no observation contained missing values. A preliminary analysis was conducted to ensure that multicollinearity issues were addressed before running the predictive model. The regression coefficient results were used to calculate the collinearity tolerance and variance inflation factors (VIF) to evaluate multicollinearity between the decision variables. The results ascertained that there was no multicollinearity between the predictor variables: tolerance was greater than 0.2, while VIF was less than 5. Further confirmation of the absence of multicollinearity was obtained through Pearson's correlation matrix, whereby all associations between the independent variables had correlation coefficients (R-values) under 0.9.
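A sketch of this multicollinearity screen in Python, using statsmodels' VIF helper; the toy data frame and its column names are placeholders standing in for the BehaviorSpace export, not names used by the study:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Toy stand-in for the simulated sweep output (placeholder variable names)
rng = np.random.default_rng(1)
df = pd.DataFrame(rng.normal(size=(200, 5)),
                  columns=["grievance", "net_risk", "social", "political", "sympathy"])
X = sm.add_constant(df)

for i, name in enumerate(X.columns):
    if name == "const":
        continue
    vif = variance_inflation_factor(X.values, i)
    tolerance = 1.0 / vif
    # Screening rule quoted above: tolerance > 0.2 and VIF < 5
    print(f"{name}: VIF = {vif:.2f}, tolerance = {tolerance:.2f}")
```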
4.2 Results

The dynamics of students' action states, which are either "protesting," "quiescent," or "suspended," over time are illustrated in Figs. 4 and 5. When students show empathy toward protesting neighbors, the strength of the protest, represented by the number of students participating in collective action, tends to be higher. As shown in Fig. 4, when all decision variables (that is, inequalities, suspend delays, activists, political influence size, and number of friends) were high (H), the initial number of students protesting surpasses the number of students who are quiescent at time steps 0–5 and thereafter drops to an average of 38% until time step 250. The average number of bystanders over time was 53%, whereas the number of suspended students was 9%. Simulating the decision factors with median (M) values resulted in a slightly decreased number of active students (36.5%), with 55% bystanders and 8.5% suspended students.

Fig. 4 Time and student population first condition (own source)
Fig. 5 Time and student population second condition (own source)
When the decision variables were low (L), the number of students protesting was 29%, the number of students suspended was 4%, and the average number of students who were bystanders was 67%. Figure 5 shows the number of active, suspended, and bystander students over time when students are not sympathetic toward one another. When all decision factors were high (H), an average of 27.5% of the student population was engaged in protest action, 9.5% of students were suspended, and 63% of students were bystanders. Meanwhile, when the decision variables were medium (M), an average of 24% of students were active, 69% were quiescent, and 7% were suspended. When the decision factors were low (L), the number of active students was the lowest at 22.5%, 75% of the population was quiescent, while 2.5% of students were suspended. Furthermore, a logistic regression analysis was conducted to evaluate the influence of the independent variables on the probability of a student protest occurring. The dependent variable STRIKE, which is dichotomous with value 0 (no strike) and value 1 (strike), is used to represent the likelihood of a strike occurring. Based on the Cox and Snell and Nagelkerke pseudo-R² values, the variation in the decision variables explained by the model in this study ranges from 18.1 to 93.0%, respectively. The goodness-of-fit test to evaluate the overall performance of the model with predictors was statistically significant, χ²(5) = 3103.400, p < 0.05. Table 3 depicts the classification table produced by the logistic regression model with the intercept and predictor variables. The accuracy in classification with the added independent variables was 99.7%, as shown by the overall percentage. Sensitivity illustrates that 92.0% of participants who led to the occurrence of the strike were correctly predicted by the model. Specificity shows that 99.9% of participants who did not lead to the occurrence of the strike were correctly predicted by the model. The positive predictive value was 94.19%, while the negative predictive value was 99.82%. Furthermore, the Wald test was used to evaluate the statistical significance of each of the latent variables. The results in Table 4 clearly illustrate that social influence
Table 3 Classification table^a (own source)

Observed | Predicted Strike: No (0) | Predicted Strike: Yes (1) | Percentage correct
Step 1  Strike No (0) | 15,180 | 20 | 99.9
Step 1  Strike Yes (1) | 28 | 324 | 92.0
Step 1  Overall percentage | | | 99.7

^a The cut value is 0.500
(Wald = 358.222; p value < 0.05), political influence (Wald = 32.537; p value < 0.05), sympathy influence (Wald = 27.554; p value < 0.05), grievance (Wald = 261.2; p value = 0.045), and net risk (Wald = 325.411; p value < 0.05) contribute significantly to the model's prediction. In conclusion, the statistical analysis carried out shows that the proposed model of this study differs from the traditional model, as given in Table 5.
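A minimal sketch of the logistic fit behind Tables 3 and 4, here with synthetic stand-in data because the simulated records are not reproduced in the chapter. statsmodels reports McFadden's pseudo-R² by default, so the Cox and Snell and Nagelkerke values quoted above are derived from the log-likelihoods; the per-coefficient Wald statistic is the squared z value.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))                                # five decision variables (toy data)
y = (X.sum(axis=1) + rng.normal(size=500) > 0).astype(int)   # STRIKE: 0 or 1

model = sm.Logit(y, sm.add_constant(X)).fit(disp=0)
wald = (model.params / model.bse) ** 2                       # Wald statistic = z^2

n = len(y)
cox_snell = 1 - np.exp((2 / n) * (model.llnull - model.llf))
nagelkerke = cox_snell / (1 - np.exp((2 / n) * model.llnull))
print(np.round(wald, 3), round(cox_snell, 3), round(nagelkerke, 3))
```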
4.3 Discussion of Results

The results in this research ascertain that grievance is a statistically significant influence on the probability of a student participating in protest action (β = 120.361; p value = 0.045). This finding is in line with [19], who argued that a certain level of grievance tends to increase the probability of an individual participating in political activities such as protest action. Furthermore, the perceived net risk is a statistically significant contributor to the probability of a student participating in protest behavior (β = −293.46; p < 0.05). This is consistent with the study of [19], who argued that when risk reaches a particular level: (1) it reduces the probability of any form of participation in political activities in general, (2) it reduces the interest in participating in protest, and (3) it reduces participation in conflict activities. In addition, the simulation results posit that social influence is a statistically significant factor in the probability of a student participating in protest behavioral action (β = 16,159.44; p < 0.05). This outcome is further supported by the study of [20], which suggested that individuals with friendship links or acquaintances who are actively involved in a protest action are more likely to participate in social movement actions than others. In addition, the results reveal that political influence is a statistically significant determinant of the probability of a student participating in protest behavioral action (β = 2325.886; p < 0.05). These findings are in line with previous literature. According to [21], political discourses such as protest are primarily described through different levels of issues, including the propagation of political influence as well as political mobilization. The study of [22] suggests that the widespread use of social media platforms has enabled activists to target large numbers of people for protest mobilization, which subsequently increases protest participation. Lastly, the result of this study suggests that sympathy influence
Table 4 Variables in the equation (own source)

Variable | Estimate | Standard error | Odds ratio | Z | Wald statistic | Df | p | 95% CI lower bound | 95% CI upper bound
(Intercept) | 139.636 | 8.661 | 4.398e+60 | 16.122 | 259.910 | 1 | < 0.001 | 122.660 | 156.612
Grievance | 12.361 | 27.640 | 233,617.056 | 0.447 | 0.200 | 1 | 0.655 | −41.811 | 66.534
Net risk | −293.460 | 16.268 | 3.565e−128 | −18.039 | 325.411 | 1 | < 0.001 | −325.344 | −261.575
Social influence | 16,159.440 | 853.788 | ∞ | 18.927 | 358.222 | 1 | < 0.001 | 14,486.047 | 17,832.834
Political influence | 2325.886 | 407.756 | ∞ | 5.704 | 32.537 | 1 | < 0.001 | 1526.700 | 3125.072
Sympathy influence | 628.304 | 119.695 | 7.394e+272 | 5.249 | 27.554 | 1 | < 0.001 | 393.705 | 862.903

Note Strike level "1" coded as class 1
Table 5 Comparison of traditional and proposed model (own source)

Proposed model | Traditional model [11, 13] | Decision
Threshold | Threshold | Similar
Grievance | Grievance | Similar
Net risk | Net risk | Similar
Political effect | – | Different
Social effect | – | Different
Sympathy | – | Different
is a statistically significant contributor to the probability of a student participating in protest behavioral action (β = 628.304; p < 0.05). This is in agreement with the study of [20], which argues that an individual's first step in protest participation is guided by consensus mobilization, whereby the general society is divided into people who sympathize with the cause and others who do not. The more effective consensus mobilization has been, the larger the number of sympathizers a protest can attract.
5 Conclusion

The purpose of the study was to design and implement an agent-based prediction model of student protests in the context of South African public universities to predict the emergence of protest actions. Several fields, such as social and political studies, have conducted substantial research on student protest, but no agent-based model had been used to predict student demonstrations at higher education institutions. An ABM was employed to mimic student protest in this study. This study shows that social influence contributes the highest beta value (16,159.44) toward the emergence of a strike. Social influence, political influence, sympathy, net risk, and grievance together explain 93% of the total variance in predicting the probability of protest occurrence, based on the Nagelkerke pseudo-R² from the logistic regression. Future studies can utilize student demographic and empirical data to initialize the internal properties of student agents for a more realistic representation of model entities. Furthermore, based on the results obtained, it can be recommended that the application of ABM of student protest be broadened to explore community-based protests initiated by activists.
6 Limitations

In terms of context and scope, this study only focused on student protests without categorizing them into peaceful or violent protests. This study acknowledges that incorporating protest containment strategies for law enforcement officers could significantly change the dynamics of the system; however, that was
out of scope in this research due to time constraints and the lack of available data on protest policing strategies in South African institutions of higher learning. The interactions of the model entities were limited, since there was no readily available spatial data to integrate a geographical information system to guide the agents' movement rules in their hypothetical environment.
References

1. T. Luescher, L. Loader, T. Mugume, #FeesMustFall: an Internet-age student movement in South Africa and the case of the University of the Free State. Politikon 44(2), 231–245 (2017)
2. T.S. Kumar, T. Senthil, Construction of hybrid deep learning model for predicting children behavior based on their emotional reaction. J. Inf. Technol. 3(01), 29–43 (2021)
3. Y. Dominguez-Whitehead, Executive university managers' experiences of strike and protest activity: a qualitative case study of a South African university. South Afr. J. High. Educ. 25(7), 1310–1328 (2011)
4. T. Ramluckan, S.E.S. Ally, B. van Niekerk, Twitter use in student protests: the case of South Africa's #FeesMustFall Campaign, in Threat Mitigation and Detection of Cyber Warfare and Terrorism Activities (IGI Global, 2017), pp. 220–253
5. I. Ormazábal, F. Borotto, H. Astudillo, Influence of money distribution on civil violence model. Complexity 2017, 1–15 (2017)
6. A. Bashar, Survey on evolving deep learning neural network architectures. J. Artif. Intell. 1(02), 73–82 (2019)
7. S. Mugunthan, T. Vijayakumar, Design of improved version of sigmoidal function with biases for classification task in ELM domain. J. Soft Comput. Paradigm (JSCP) 3(02), 70–82 (2021)
8. B. Oxlund, Responding to university reform in South Africa: student activism at the University of Limpopo. Soc. Anthropol. 18(1), 30–42 (2010)
9. S. Peté, Socrates and student protest in post-apartheid South Africa-Part Two. J. Juridical Sci. 40(1–2), 1–23 (2015)
10. S.S. Bhat, A.A. Maciejewski, An agent-based simulation of the LA 1992 riots (Colorado State University Libraries, 2006)
11. J.-W. Kim, R. Hanneman, A computational model of worker protest. J. Artif. Soc. Soc. Simul. 14(3), 1 (2011)
12. P. Lacko, M. Ort, M. Kyžňanský, A. Kollar, F. Pakan, M. Ošvát, J. Branišová, Riot simulation in urban areas (IEEE, 2013), pp. 489–492
13. J.M. Epstein, Modeling civil violence: an agent-based computational approach. Proc. Natl. Acad. Sci. 99(suppl 3), 7243–7250 (2002)
14. C. Lemos, H. Coelho, R.J. Lopes, Agent-based modeling of protests and violent confrontation: a micro-situational, multi-player, contextual rule-based approach (2014), pp. 136–160
15. M. Fonoberova, I. Mezić, J. Mezić, J. Hogg, J. Gravel, Small-world networks and synchronisation in an agent-based model of civil violence. Global Crime 20(3–4), 161–195 (2019)
16. S. Klancnik, M. Ficko, J. Balic, I. Pahole, Computer vision-based approach to end mill tool monitoring. Int. J. Simul. Model. 14(4), 571–583 (2015)
17. M. Lall, An agent-based simulation of an alternative parking bay choice strategy. S. Afr. J. Ind. Eng. 31(2), 107–115 (2020)
18. U. Wilensky, NetLogo (Center for Connected Learning and Computer-Based Modeling, Northwestern University, Evanston, IL, USA, 1999). http://ccl.northwestern.edu/netlogo/
19. A.F. Lemieux, E.M. Kearns, V. Asal, J.I. Walsh, Support for political mobilization and protest in Egypt and Morocco: an online experimental study. Dyn. Asymmetric Conflict 10(2–3), 124–142 (2017)
20. J. van Stekelenburg, B. Klandermans, The social psychology of protest. Curr. Sociol. 61(5–6), 886–905 (2013)
21. G. Singh, S. Kahlon, V.B.S. Chandel, Political discourse and the planned city: Nehru's projection and appropriation of Chandigarh, the capital of Punjab. Ann. Am. Assoc. Geogr. 109(4), 1226–1239 (2019)
22. T. Poell, J. van Dijck, Social media and activist communication, in The Routledge Companion to Alternative and Community Media, ed. by P. Thomas, J. van Dijck (2015), pp. 527–537
RETRACTED CHAPTER: High Accuracy for Hyperspectral Image Classification Using Hybrid Spectral 3D-2D CNN
Mohini Shivhare and Sweta Tripathi
Abstract HSI classification is broadly utilized for the analysis of remotely sensed images. HSI comprises images across varying spectral bands. CNN is one of the most frequently used deep learning-based techniques for visual data processing. The use of CNN for HSI classification is also apparent in recent works. These approaches are mostly based on 2D CNN. However, the HSI classification performance is highly dependent on both spatial and spectral information. Few methods have utilized the 3D CNN because of the increased computational complexity. This paper introduces a hybrid spectral CNN (HybridSN) for HSI classification. In general, the HybridSN is a spectral-spatial 3D CNN followed by a spatial 2D CNN. The 3D CNN facilitates the joint spatial-spectral feature representation from a stack of spectral bands. The 2D CNN on top of the 3D CNN further learns a more abstract spatial representation. In addition, the use of hybrid CNNs reduces the complexity of the model compared with the use of a 3D CNN alone. A very satisfactory performance is obtained using the proposed HybridSN for HSI classification.
Keywords Hyperspectral image (HSI) · 2D · 3D · Convolutional neural network (CNN) · HybridSN
The original version of this chapter was retracted. The retraction note to this chapter can be found at https://doi.org/10.1007/978-981-19-2894-9_63 M. Shivhare (B) · S. Tripathi Department of Elecronics and Communication, Kanpur Institute of Technology, Kanpur, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022, corrected publication 2024 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_13
1 Introduction
The overall architecture of HSI processing is presented in Fig. 1. It contains four stages, namely image acquisition, pre-processing, dimensionality reduction, and classification. In the first stage, the HSI is acquired with hyperspectral sensors. A hyperspectral remote sensing device captures a scene using multiple imaging spectrometer sensors over a wide range of wavelengths, from the visible to the near-infrared range, which offers detailed spectral information about the ground objects in several continuous spectral bands (from tens to several hundreds) [1]. The hyperspectral data captured using an imaging spectrometer contain various errors which are generated due to viewing geometry, atmospheric conditions, platform movements, and other sources. These errors are classified as atmospheric, radiometric, and geometric errors. They are reduced in the pre-processing stage using various correction approaches. In atmospheric correction, the surface reflectance from remotely sensed imagery is retrieved through the removal of atmospheric effects [2, 3]. In radiometric correction, the gray values of pixels are converted into radiance values corresponding to the radiation reflected or emitted from the surface. Geometric correction involves transformation of a remotely sensed image, enabling it to have the scale and projection properties of a given map projection. These corrections are already conducted on standard benchmark hyperspectral datasets. Nonetheless, while working
Fig. 1 General architecture of hyperspectral image processing
with a real-time hyperspectral dataset, this correction should be performed in the pre-processing stage. Also, the collected spectra are distorted by sensor noise as well as varying illumination or atmospheric conditions. Hence, the HSI usually has a few noisy and water-absorption bands [4]. These bands need to be removed before further processing. In this work, we used standard pre-processed benchmark datasets. Given a set of training samples, hyperspectral classification aims to assign a unique label to each test pixel in the image. A vector represents every pixel of an HSI [5]. Every pixel corresponds to the reflectance of the object, and the length of the vector is equal to the number of discrete spectral bands. Classification of HSI is a challenging task due to the presence of a large number of bands and a limited number of labeled samples [6]. The performance of the classifier depends on the relationship between the number of training samples and the number of features. The huge number of spectral bands in hyperspectral data results in a huge volume of data. With an insufficient number of labeled training samples, the classification performance can suffer a significant degradation due to the "curse of dimensionality" or "Hughes phenomenon". Since the number of features in a hyperspectral image is large, the number of training samples must be large for accurate classification of the hyperspectral image, which is impractical. Moreover, the neighboring bands in the hyperspectral data are mostly strongly correlated [7].
2 HSI
HSI acquires many very narrow, contiguous spectral bands throughout the visible, near-infrared, mid-infrared, and thermal infrared portions of the electromagnetic spectrum. HSI systems normally collect 200 or more bands, enabling the construction of an almost continuous reflectance spectrum for every pixel in the scene. The adjacent narrow bandwidths characteristic of hyperspectral data allow in-depth examination of earth surface features which would otherwise be "lost" within the relatively coarse bandwidths acquired with multispectral scanners. Over the previous decade, broad research and development has been carried out in the field of hyperspectral remote sensing. With commercial airborne HSI and the launch of satellite-based sensors such as Hyperion, HSI is moving quickly into the mainstream of remote sensing and applied remote sensing research. HSI has found many applications in water resource management, agriculture, and environmental monitoring. It is important to remember that there is hardly a difference in spatial resolution between hyperspectral and multispectral data, but rather in their spectral resolutions. HSI is commonly referred to as spectral imaging or spectral analysis. The distinction between hyper- and multi-spectral is sometimes based on an arbitrary "number of bands" or on the type of measurement, depending on what is appropriate to the purpose. Multispectral imaging deals with a few images at discrete and somewhat narrow bands. Being "discrete and
Fig. 2 Concept of HSI
fairly narrow" is what distinguishes multispectral imaging in the visible range from color photography. A multispectral sensor may have numerous bands covering the range from the visible to the long-wave infrared. Multispectral images do not produce the "spectrum" of an object. Landsat is a good example of multispectral imaging (Fig. 2).
3 CNN
CNN is one of the most advanced artificial intelligence approaches for handling computer vision challenges [8]. The deep CNN, established as a class of deep learning feed-forward artificial neural networks, has been applied in several agricultural image classification works [9]. The convolutional layer (CL) is fundamental in CNN, and it extracts the features from input images using filters. A large volume of training data is necessary to improve the performance of CNN [10]. One of the key advantages of using deep CNN in image classification is reducing the need for the feature engineering process. Numerous convolutions are performed in several layers of CNN [11]. They generate different representations of the training data, starting from more general ones in the first layers and becoming more detailed in the deeper layers. Initially, the CLs operate as feature extractors from the training data, whose dimensionality is then reduced using the pooling layers. The CLs refine various lower-level features into more discriminative features [12]. Moreover, the CLs are the essential building blocks of deep CNN. Feature engineering is a very distinctive part of deep learning and a significant step ahead of conventional machine learning [13].
Fig. 3 Convolutional neural network
The pooling layer conducts the down-sampling operation along the spatial dimensions. It helps reduce the number of parameters. The max pooling strategy was utilized in the pooling layer of the proposed model [14]. Max pooling achieves a better performance than average pooling in the proposed deep CNN model. Another significant layer is dropout, which refers to removing units from the network. It is a regularization strategy for reducing overfitting. The proposed model was trained and compared using different dropout values ranging from 0.2 to 0.8. Finally, the dense layer performs the classification using the output of the convolutional and pooling layers. Training a deep CNN is a very iterative process, and one must train various models to find the best one [15]. Gradient descent is a basic optimization strategy that conducts the gradient steps using all training data at each step, and it is also known as batch gradient descent. The execution of gradient descent with a large training set is difficult. Figure 3 shows the sample layered architecture of the deep CNN method. The filters encode the basic features that are searched for in the input image by the CL. It takes a w × h × d (w-width, h-height, d-depth of an image) array of pixels as input, over which a k × k size window, known as a filter or kernel, is slid across the width and height such that it covers the entire area of the input image. While sliding through the image, every pixel value in the input image is multiplied with the values in the filter element by element and summed up to give one pixel value of the output image [10]. Each layer yields a set of activation maps or feature maps, one for each filter, and this is fed as input to the next convolution layer. The convolution operation is clearly explained in Fig. 4. For a given 5 × 5 image, a 3 × 3 filter is slid across the image one pixel at a time until it reaches the last column, which results in three convolution operations for the first row. Then the filter is slid one pixel down from the leftmost corner and again slid across the image until the end. This horizontal and vertical sliding continues until it reaches the bottom-right-most 3 × 3 square. This produces a 3 × 3 activation map. Likewise, for n filters, n activation maps will be generated.
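The sliding-window arithmetic just described (a 5 × 5 input and a 3 × 3 filter giving a 3 × 3 activation map) can be verified in a few lines; scipy's correlate2d in "valid" mode is used here as a stand-in for a convolutional layer:

```python
import numpy as np
from scipy.signal import correlate2d

image = np.arange(25, dtype=float).reshape(5, 5)   # 5 x 5 input image
kernel = np.ones((3, 3))                           # 3 x 3 filter

# 'valid' mode slides the filter only over fully overlapping positions,
# so the output size is (5 - 3 + 1) x (5 - 3 + 1) = 3 x 3
fmap = correlate2d(image, kernel, mode="valid")
print(fmap.shape)   # (3, 3)
```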
Fig. 4 Example of transfer learning technique
Transfer learning is a technique to transfer the knowledge from one machine learning model to another [16]. It reduces the initial model development cycle of the new model by reusing the weights and bias values from existing models [17]. For instance, a machine learning model developed for task A is reused as the foundation for a machine learning model for task B. Transfer learning techniques use pre-trained state-of-the-art models for gathering knowledge and transfer it to the new models [18]. The most well-known pre-trained deep learning models are AlexNet, visual geometry group network (VGGNet), residual network (ResNet), and Inception network [14]. The AlexNet is a widely used pre-trained deep convolutional neural network model. The AlexNet consists of five convolutional layers with a ReLU activation function and three fully connected layers. There are 62 million trainable parameters in the AlexNet. Figure 4 illustrates the process of transfer learning techniques. VGGNet increases performance and decreases the training time compared with the AlexNet. A major difference between the VGGNet and the AlexNet is that VGGNet uses smaller kernel sizes in the convolutional and pooling layers than the AlexNet. The size of the kernel is fixed in the entire training process. The variants of VGGNet
are VGG16 and VGG19. The number indicates the number of layers in the network. The VGG16 has a total of 138 million trainable parameters. The ResNet addresses the vanishing gradient issue in the training process of deep convolutional neural networks. The ResNet uses shortcut connections to improve the performance of the network. It uses only two pooling layers in the entire network. The most frequently used ResNet models are ResNet18, ResNet50, and ResNet101. The ResNet18 has eleven million trainable parameters. Furthermore, the InceptionNet introduced parallel kernel strategies to handle variable kernel sizes [15]. The most basic version of the InceptionNet is GoogleNet. There are 6.4 million trainable parameters in GoogleNet.
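A hedged Keras sketch of the transfer-learning recipe described in this section, reusing ImageNet weights from one of the named backbones (ResNet50) and training only a new classification head; the head sizes and the 10-class output are illustrative assumptions, not values from this chapter:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Pre-trained backbone with its ImageNet weights; original classifier removed
base = tf.keras.applications.ResNet50(weights="imagenet", include_top=False,
                                      input_shape=(224, 224, 3))
base.trainable = False   # freeze the transferred weights and biases

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dropout(0.5),                     # regularization, as discussed above
    layers.Dense(10, activation="softmax"),  # new task-specific head (assumed size)
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```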
4 Proposed Methodology
Let the spectral-spatial HSI data cube be denoted by A ∈ R^(W×H×D), where A is the original input, W is the width, H is the height, and D is the number of spectral bands. Each HSI pixel in A contains D spectral measures and forms a one-hot label vector Y = (y₁, y₂, …, y_C) ∈ R^(1×1×C), where C denotes the land-cover categories. However, the HSI pixels exhibit mixed land-cover classes, introducing high intra-class variability and inter-class similarity into the data. It is a great challenge for any model to handle this issue. To remove the spectral redundancy, first the conventional principal component analysis (PCA) is applied over the original HSI data along the spectral bands. The PCA reduces the number of spectral bands from D to B while maintaining the same spatial dimensions (i.e., the width W and the height H). We reduce only the spectral bands such that the spatial information, which is vital for recognizing any object, is preserved. We denote the PCA-reduced data cube by X ∈ R^(W×H×B), where X is the modified input after PCA. The parameters of CNN, such as the bias b and the kernel weights W, are typically trained using supervised approaches with the help of a gradient descent optimization technique. In conventional 2D CNN, the convolutions are applied over the spatial dimensions only, covering all the feature maps of the previous layer, to compute the 2D discriminative feature maps. On the other hand, for the HSI classification problem, it is desirable to capture the spectral information encoded in multiple bands along with the spatial information. The 2D CNN cannot handle the spectral information. The 3D CNN kernel can extract the spectral and spatial feature representation simultaneously from HSI data, however, at the cost of an increased computational complexity. To exploit the automatic feature learning capability of both the 2D and 3D CNNs, we propose a hybrid feature learning framework called HybridSN for HSI classification. The flow diagram of the proposed HybridSN network is shown in Fig. 5. It comprises three 3D convolutions, one 2D convolution, and three fully connected layers.
Fig. 5 Proposed methodology
To increase the number of spectral-spatial feature maps simultaneously, 3D convolutions are applied three times so as to preserve the spectral information of the input HSI data in the output volume. The 2D convolution is applied once before the flatten layer, keeping in mind that it strongly discriminates the spatial information within the different spectral bands without a substantial loss of spectral information, which is vital for HSI data.
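A compact Keras sketch of the hybrid design just described. The 25 × 25 window follows Table 2; the kernel counts and sizes (8, 16, and 32 3D kernels of sizes 3 × 3 × 7, 3 × 3 × 5, and 3 × 3 × 3, then 64 2D kernels) follow the HybridSN design in [7]; B = 30 PCA bands, 16 output classes, and the 0.4 dropout rate are assumptions for Indian Pines rather than values stated in this chapter.

```python
from sklearn.decomposition import PCA
from tensorflow.keras import layers, models

def reduce_bands(cube, B):
    # PCA along the spectral axis only: (H, W, D) -> (H, W, B)
    H, W, D = cube.shape
    flat = cube.reshape(-1, D)                     # one spectrum per pixel
    return PCA(n_components=B).fit_transform(flat).reshape(H, W, B)

S, B, n_classes = 25, 30, 16
model = models.Sequential([
    layers.Input(shape=(S, S, B, 1)),
    # Three 3D convolutions: joint spatial-spectral feature extraction
    layers.Conv3D(8, (3, 3, 7), activation="relu"),
    layers.Conv3D(16, (3, 3, 5), activation="relu"),
    layers.Conv3D(32, (3, 3, 3), activation="relu"),
    # Fold the remaining spectral axis into channels for the 2D stage
    layers.Reshape((19, 19, 18 * 32)),
    # One 2D convolution: more abstract spatial features
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.Flatten(),
    # Three fully connected layers
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.4),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.4),
    layers.Dense(n_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy")
```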
5 Simulation Result

The computational efficiency of the HybridSN model is reported in terms of training and testing times in Table 1. The proposed model is more efficient than the 3D CNN (Figs. 6 and 7).
Table 1 Result of test and training

Data | 2D CNN: Train (m) / Test (s) | 3D CNN: Train (m) / Test (s) | HybridSN: Train (m) / Test (s)
IP | 1.871 / 1.056 | 14.226 / 4.128 | 13.901 / 4.681
UP | 1.793 / 1.214 | 57.912 / 10.291 | 20.038 / 6.492
SA | 2.103 / 2.078 | 73.291 / 15.101 | 25.128 / 8.902
Fig. 6 HSI image-I
The effect of the spatial dimension on the performance of the HybridSN model is reported in Table 2. It has been found that the 25 × 25 spatial window is the most suitable for the proposed method. We have additionally computed the results with even less training data, i.e., only 10% of the total samples, and have summarized the results in Table 3. It is observed from this experiment that the performance of each model decreases slightly, whereas the proposed method is still able to outperform the other methods in almost all cases (Table 3).
Fig. 7 HSI image-II
Table 2 Spatial window size

Window | IP (%) | UP (%) | SA (%)
19 × 19 | 99.74 | 99.98 | 99.99
21 × 21 | 99.73 | 99.90 | 99.69
23 × 23 | 99.31 | 99.96 | 99.71
25 × 25 | 99.75 | 99.98 | 100
Table 3 Classification accuracies

Method | Indian Pines: OA / Kappa / AA | Univ. of Pavia: OA / Kappa / AA | Salinas Scene: OA / Kappa / AA
2D CNN | 79.25 / 77.98 / 67.77 | 95.22 / 94.67 / 93.59 | 95.22 / 94.87 / 93.33
3D CNN | 78.82 / 75.32 / 74.11 | 95.68 / 94.37 / 96.22 | 84.58 / 82.78 / 88.36
M3D CNN | 80.24 / 80.13 / 81.03 | 94.67 / 92.90 / 96.49 | 93.16 / 92.89 / 95.21
SSRN | 97.04 / 97.46 / 85.04 | 98.38 / 98.93 / 98.33 | 98.22 / 98.48 / 98.38
HybridSN | 97.79 / 97.99 / 97.75 | 98.39 / 98.98 / 98.16 | 98.34 / 98.78 / 98.07
6 Conclusion
The weight assigned to each of the bands was determined based on the criteria of minimizing the distance within each group and maximizing the distance among the various clusters, which highlights the significance of the particular band in the fusion process. The significance of this approach lies in its highly discriminative capability, which leads to a superior classification performance. Experimental results and comparison with the existing feature extraction approaches demonstrated the efficiency of the proposed approach for HSI classification. Compared with the other competing approaches on three standard datasets, the proposed approach has achieved a higher classification accuracy and better visual results. The proposed HybridSN model fundamentally combines the complementary spatial-spectral and spectral information in the form of 3D and 2D convolutions, respectively. The experiments on three benchmark datasets, compared with the recent state-of-the-art methods, confirm the superiority of the proposed method. The proposed model is computationally more efficient than the 3D-CNN model.
References
1. D. Kumar, D. Kumar, Hyperspectral Image Classification Using Deep Learning Models: A Review (ICMAI, IEEE 2021) 2. U. Kulkarni, S.M. Meena Sunil, V. Gurlahosur, U. Mudengudi, Classification of cultural heritage sites using transfer learning, in 2019 IEEE Fifth International Conference on Multimedia Big Data (BigMM), pp. 391–397 (2019) 3. L. Zhang, L. Zhang, B. Du, Deep learning for remote sensing data: a technical tutorial on the state of the art. IEEE Geosci. Remote Sens. Mag. 4(2), 22–40 (2016) 4. B. Rasti, D. Hong, R. Hang, P. Ghamisi, X. Kang, J. Chanussot, J.A. Benediktsson, Feature extraction for hyperspectral imagery: the evolution from shallow to deep: overview and toolbox. IEEE Geosci. Remote Sens. Mag. 8(4), 60–88 (2020) 5. J.M. Haut, M.E. Paoletti, J. Plaza, A. Plaza, J. Li, Hyperspectral image classification using random occlusion data augmentation. IEEE Geosci. Remote Sens. Lett. 16(11), 1751–1755 (2019) 6. Y. Li, H. Zhang, Q. Shen, Spectral–spatial classification of hyperspectral imagery with 3D convolutional neural network. Remote Sens. 9(1), 67 (2017) 7. S.K. Roy, G. Krishna, S.R. Dubey, B.B. Chaudhuri, HybridSN: exploring 3-D–2-D CNN feature hierarchy for hyperspectral image classification. IEEE Geosci. Remote Sens. Lett. 17(2), 277– 281 (2019) 8. X. Zhang, Y. Sun, K. Jiang, C. Li, L. Jiao, H. Zhou, Spatial sequential recurrent neural network for hyperspectral image classification. IEEE J. Select. Top. Appl. Earth Observ. Remote Sens. 11(11), 4141–4155 (2018) 9. Q. Liu, F. Zhou, R. Hang, X. Yuan, Bidirectional-convolutional LSTM based spectral-spatial feature learning for hyperspectral image classification. Remote Sens. 9(12), 1330 (2017) 10. C. Shi, C.M. Pun, Multi-scale hierarchical recurrent neural networks for hyperspectral image classification. Neuro Comput. 294, 82–93 (2018) 11. Y. Liu, L. Gao, C. Xiao, Q. Ying, K. Zheng, A. Marinoni, Hyperspectral image classification based on a shuffled group convolutional neural network with transfer learning. Remot. Sensor. 12, 01–18 (2020)
12. H. Yu, L. Gao, W. Liao, B. Zhang, L. Zhuang, M. Song, J. Chanussot, Global contiguous and local spectral similarity-based manifold learning group sparse representation for hyperspectral imagery classification. IEEE Trans. Geosci. Remote Sens. 58, 3043–3056 (2020) 13. X. Zhao, Y. Liang, A.J. Guo, F. Zhu, Classification of small-scale hyperspectral images with multi-source deep transfer learning. Remote Sens. Lett. 11, 303–312 (2020) 14. X. He, Y. Chen, P. Ghamisi, Heterogeneous transfer learning for hyperspectral image classification based on convolutional neural network. IEEE Trans. Geosci. Remote Sens. 58, 3246–3263 (2019) 15. H. Zhang, Y. Li, Y. Jiang, P. Wang, Q. Shen, C. Shen, Hyperspectral classification based on lightweight 3-D-CNN with transfer learning. IEEE Trans. Geosci. Remote Sens. 57, 5813–5828 (2019) 16. X. Zhang, X. Zhou, M. Lin, J. Sun, ShuffleNet: an extremely efficient convolutional neural network for mobile devices, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018, pp. 6848–6856 17. N. Ma, X. Zhang, H.T. Zheng, J. Sun, ShuffleNet v2: practical guidelines for efficient CNN architecture design, in Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018, pp. 116–131 18. M.E. Paoletti, J.M. Haut, J. Plaza, A. Plaza, A new deep convolutional neural network for fast hyperspectral image classification. ISPRS J. Photogramm. Remote Sens. 145, 120–147 (2018)
Design Smart Curtain Using Light-Dependent Resistor Feras N. Hasoon, Mustafa Khalaf Aal Thani, Hilal A. Fadhil, Geetha Achuthan, and Suresh Manic Kesavan
Abstract With the growth and development of the intelligent home industry, the smart curtain has become part of the advanced technology of this era. A smart curtain is an advanced system that closes and opens automatically without human interference. The system was built based on an external light sensor. By analyzing and detecting the external light, a light-dependent resistor (LDR) automatically closes and opens the curtain according to the light intensity. This paper presents the tools used to build the smart curtain, together with experiments conducted before the entire hardware project was implemented.

Keywords Smart curtain · Intelligent curtain · Automatic curtain · Sensible curtain
1 Introduction

The curtain is the window cover and has an irreplaceable role in daily life. Curtains have both practical and aesthetic advantages, which include regulation of sunlight, insulation from heat and cold, privacy at night, and protection from outside dust. The curtain is a mandatory accessory for homes and offices, as it plays an essential role in decoration [1–4]. Privacy is one of the many top reasons to utilize curtains. It is necessary to protect the privacy of the home. Curtains prevent people from peering into the house, and additionally, they give a feeling of comfort and reassurance in the home, especially at night [5]. Installing curtains gives safety and the feeling of being uninterrupted by the additional glare of lighting and outside vehicle movement. This is a great method to improve privacy in the home. With the development of technology, the home curtain
F. N. Hasoon · M. Khalaf Aal Thani · G. Achuthan · S. M. Kesavan (B) Electrical and Computer Engineering Department, National University of Science and Technology, Muscat, Sultanate of Oman e-mail: [email protected] F. N. Hasoon e-mail: [email protected] H. A. Fadhil Department of Electrical and Computer Engineering, Sohar University, Sohar, Sultanate of Oman © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_14
can be created as a smart curtain. Therefore, the curtain can be made to close and open automatically at times specified by the user. For example, the curtain closes during the night to provide privacy, and it opens throughout the day to illuminate the house and save the electrical energy consumed by the lamps [6, 7]. This paper presents the smart curtain using a light-dependent resistor (LDR) that contributes to operating the curtain according to the sunlight. Curtains are widely installed in home and office environments [8]. Additionally, the paper introduces the major testing and simulation of the components used to design the hardware of the smart curtain. The operation of the smart curtain has been designed to be either manual mode or auto mode. The manual mode acts as a switch that can open and close the curtain manually, in case the user wants to open the curtain at night, for example. The auto mode provides smart operation depending on the external light, whereby the curtain has been designed to open in the daytime and close at night. Moreover, the electric circuit of the smart curtain has an intelligent design in the way the relays inside the electric circuit operate and stop depending on the mode of the curtain.
2 System Design

2.1 Block Diagram

The block diagram of the smart curtain system is shown in Fig. 1, which includes the major components of the electrical circuit. These components include a rectifier, a double pole double throw (DPDT) switch, a single pole triple throw (SP3T) switch,
Fig. 1 Block diagram of smart curtain system
a single pole double throw (SPDT) relay, a double pole double throw (DPDT) relay, an LDR, and a DC motor. The DPDT switch is used to change the operation mode of the curtain between auto and manual. Also, the SP3T switch is used for closing and opening the curtain in manual mode. The AC power supply is converted to a 16 V DC power supply by a step-down transformer with a rectifier. In turn, the AC power supply is used for the AC lamp and the DPDT relay. The LDR is fed by a DC supply to sense the external light. At a certain resistance, depending on the light intensity, the LDR feeds the DPDT relay with a DC supply to proceed with the operation of the curtain. The DPDT relay has three options to send an open or close signal to the next SPDT relay, as well as to turn the AC lamp on during the process of opening or closing the curtain. The DC motor receives power or reversed-power signals from the SPDT relays to proceed with opening or closing the curtain.
2.2 Light Sensor (LDR)

LDR stands for light-dependent resistor. It is a resistor whose value changes according to the intensity of the light falling on it. There is an inverse relationship between the light intensity and the resistance of the LDR. The value of resistance decreases when the intensity of the incident light increases. Conversely, the value of resistance increases when the LDR is shielded from light. In the dark, the resistance of the LDR is very high, reaching more than a mega-ohm. When the LDR is exposed to light, the resistance drops to a few hundred ohms [9]. The LDR is the main component of the smart curtain project. Hence, the automatic operation of the curtain depends mainly on the LDR component, as shown in Fig. 2.
Fig. 2 LDR component and resistance curve [10]
Fig. 3 Symbol of SPDT and DPDT relay [13]
2.3 SPDT and DPDT Relay

A relay is a control device that is utilized to make or break the flow of electrical power in a circuit [11]. Additionally, this device is employed as a switch for opening or closing the circuit. Hence, it is considered an electromechanical device. The relay circuit is often made up of switches, wires, and coils. Two models of relay are implemented in the smart curtain system: the SPDT and the DPDT relay. The SPDT relay has three terminals: a common port and two other ports that alternate connection to the common port. The SPDT is very suitable for choosing between two power supplies, commutating inputs, or any other application requiring one of two circuits to be connected [12]. The DPDT represents two SPDT relays in one package. The DPDT can control two circuits, but these circuits must be switched on or off together. DPDT switches consist of six ports. The relay operates by applying a voltage to the circuit, whereby the coil of the relay is magnetized and causes the pole latch to move and touch the other terminal (from normally closed to open circuit, or vice versa). Figure 3 illustrates the symbols of the SPDT and DPDT relays and the difference between them.
2.4 LDR and Voltage Divider

The LDR in the circuit is used to sense the external light; its resistivity is high or low depending on the intensity of the light, because there is an inverse relationship between the intensity of the light and the resistance of the device. The resistance of the LDR in the dark is very high, reaching more than a mega-ohm.
Fig. 4 LDR and voltage divider circuit diagram
When the LDR is exposed to light, the resistance drops to a few hundred ohms [9]. Figure 4 shows the circuit diagram of the LDR and voltage divider. The output voltage depends on the LDR resistance and R2. The value of R2 is set to 100 kΩ to match the required value of the voltage when the curtain is put into automatic mode. At night (in the absence of light), the resistance of the LDR, and with it the output voltage, is higher, so as to operate the relay responsible for closing the curtain.
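A quick numeric check of the divider, with the output taken across the LDR so that darkness raises the output, using the two LDR values later exercised in the Proteus simulation (100 kΩ for day, 500 kΩ for night); the 12 V supply is an approximation of the rectified rail, not a value stated in this section:

```python
def divider_out(vcc, r_ldr, r2=100e3):
    # Output taken across the LDR: V_out = Vcc * R_LDR / (R_LDR + R2)
    return vcc * r_ldr / (r_ldr + r2)

VCC = 12.0                          # approximate rectified DC rail (assumed)
print(divider_out(VCC, 100e3))      # day:   R_LDR ~ 100 kOhm -> 6.0 V
print(divider_out(VCC, 500e3))      # night: R_LDR ~ 500 kOhm -> 10.0 V
```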
2.5 Transistor and Voltage Divider

An NPN transistor type is used in the smart curtain circuit. The major objective of utilizing a transistor in the electrical circuit is to energize or de-energize the SPDT relay responsible for closing the curtain in automatic mode. Figure 5 shows the circuit diagram of the transistor and voltage divider. At night, the light intensity is very low, and the resistance value of the LDR becomes high, reaching mega-ohms or approaching infinity. In this case, the voltage across the LDR becomes high. In contrast, the voltage across R2 becomes relatively low (below 0.7 V). This value is not enough to activate the transistor (Q1). The collector of the transistor (Q1) is normally connected to the base-emitter of the transistor (Q2). So, the voltage across the collector of the transistor (Q1) is high (above 0.7 V), and it is enough to trigger the transistor (Q2).
Fig. 5 Transistor and voltage divider circuit diagram
The voltage supplies the SPDT relay, and the relay is triggered. The operation of the SPDT relay then controls the other components of the circuit to close the curtain.
2.6 Overall Circuit

Figure 6 shows the overall design of the smart curtain circuit. The DPDT switch-1 is designed to switch the curtain circuit between manual and auto mode. For operating the curtain manually, the DPDT switch-1 should be switched to the manual position. The SP3T switch is responsible for switching on and off the
Fig. 6 Overall circuit diagram of the smart curtain
DPDT relay-2 and SPDT relay-3. When we switch the SP3T switch to position no. 1, the SPDT relay-3 triggers and turns on the motor in the opening direction. When the SP3T switch is moved to position no. 2, the DPDT relay-2 triggers and turns on the motor in the reverse direction to close the curtain and turn on the sleeping lamp. On the side of the auto mode, the DPDT switch-1 should be switched to the auto position. In this case, the operation of the curtain depends on the resistance of the LDR to light. If the intensity of the external light is high, the resistance value of the LDR is very low. Hence, the SPDT relay-1 remains off and normally closed, and the power triggers SPDT relay-3 and turns on the motor to open the curtain. If the intensity of the external light is low, the resistance value of the LDR is high. Hence, SPDT relay-1 triggers, and the power triggers DPDT relay-2, which turns on the motor in the reverse direction to close the curtain and turns on the lamp. The limit switch used in the curtain circuit is normally closed. It is set to open the circuit at a certain position of the curtain so as to stop the curtain at the desired open or closed position.
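The relay logic of this section can be summarized as a small decision function. The sketch below models only the input/output behavior (mode, switch position, and light level in; motor action and lamp state out); the state names are invented for the example and are not identifiers from the paper:

```python
def curtain_action(mode, sp3t_position=0, light_is_bright=True):
    """Return (motor, lamp) following the relay logic of Sect. 2.6."""
    if mode == "manual":
        if sp3t_position == 1:      # SPDT relay-3 energized: open
            return ("open", "off")
        if sp3t_position == 2:      # DPDT relay-2 energized: reverse and close
            return ("close", "on")
        return ("idle", "off")
    # Auto mode: the LDR decides; bright light opens, darkness closes
    return ("open", "off") if light_is_bright else ("close", "on")

print(curtain_action("auto", light_is_bright=False))   # ('close', 'on')
```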
3 Experiment and Results

3.1 Software Simulation

Proteus simulation software is used for circuit simulation of the smart curtain. This simulation enables testing the operation of the curtain project, discovering defects, and finding the appropriate component values to run the circuit effectively. There are four simulations of the curtain operation, covering the closing and opening of the curtain in manual mode and the closing and opening of the curtain in auto mode. Figure 7 shows the circuit diagram of the smart curtain using Proteus software. The simulation completed successfully, and the curtain operation worked in the expected way. Table 1 demonstrates the results obtained during curtain operation in the manual mode. On the side of the auto mode, two values of the LDR resistance were chosen: for the daytime, the LDR value was set to 100 kΩ, and for the nighttime, the LDR value was set to 500 kΩ. Table 2 demonstrates the results obtained during curtain operation in the auto mode.
3.2 Hardware Testing

Figure 8 shows the hardware testing of the smart curtain using manual mode. When the SP3T SWT-2 is switched to position 1 (opening process), the voltage value across the DPDT relay-2 is 0 V, and the voltage value across SPDT relay-3 is 12.16 V. The output voltage value is 12.28 V. The SPDT relay-1 remains off because it only triggers in auto mode with the closing process. When the SP3T SWT-2 is switched to position 2 (closing process), the DPDT relay-2 triggers and closes the DC motor
Fig. 7 Circuit diagram of the smart curtain using Proteus software
Table 1 Results obtained in manual mode for daytime

Position | Relay-1 Vdc | Relay-2 Vdc | Relay-3 Vdc | Lamp Vac | Motor direction | V out Vdc
Open | 0 | 0 | 11.9 | Off | Clockwise | +11.8
Close | 0 | 11.9 | 0 | On 169 | Anticlockwise | −11.8
circuit. SPDT relay-3 remains off in its normally closed state. Finally, the circuit operates the DC motor in the closing direction, and the lamp turns on. The voltage across relay-2 is 12.10 V, and the voltage across relay-3 is 0 V. The manual mode testing was carried out successfully, and the curtain closing and opening operations worked effectively. Table 3 summarizes the results obtained for the manual mode operation of the curtain.
In auto mode, while the LDR is exposed to light, the circuit operates the DC motor in the opening direction and the lamp remains off. The voltage across relay-1 and relay-2 is 0 V, the voltage across SPDT relay-3 is 12.09 V, and the output voltage is 12.28 V. When the external light is cut off from the LDR, the resistance of the LDR increases according to the intensity of the light falling on it. The circuit then operates the DC motor in the closing direction, and the lamp turns on. The voltage across SPDT relay-3 is 0 V, and the voltages across SPDT relay-1 and DPDT relay-2 are 11.96 and 12.03 V, respectively. The output voltage is −12.22 V.
Table 2 Results obtained in auto mode for daytime
LDR resistance value | Position | LDR Vdc | Q-1 B-C | Q-1 B-E | Q-1 E-C | Q-2 B-C | Q-2 B-E | Q-2 E-C | Relay-1 Vdc | Relay-2 Vdc | Relay-3 Vdc | Lamp Vac | Motor direction | V out Vdc
100 K | Open | 6.17 | – | 1.1 | – | 0.7 | – | – | 0 | 0 | 11.9 | Off | Clockwise | +11.8
500 K | Close | 11.2 | – | 0.4 | – | 5.7 | – | – | 11.1 | 11.9 | 0 | On (169) | Anticlockwise | −11.8
Fig. 8 Testing curtain operation in manual mode in day time
Table 3 Results obtained in manual mode for night time
SP3T SWT-2 | Relay-1 VDC | Relay-2 VDC | Relay-3 VDC | Lamp VAC | V out Vdc
1 | 0 | 0 | 12.16 | 0 | 12.28
2 | 0 | 12.10 | 0 | 236 | −12.25
Figure 9 shows the hardware testing in auto mode. The results show that the auto mode testing was carried out successfully, and the curtain closing and opening processes worked effectively. Table 4 summarizes the results obtained for the auto mode operation of the curtain.
Fig. 9 Testing curtain operation in auto mode for night time
Table 4 Results obtained in auto mode for night time
Light condition | LDR Vdc | Q-1 B-C | Q-1 B-E | Q-1 E-C | Q-2 B-C | Q-2 B-E | Q-2 E-C | Relay-1 Vdc | Relay-2 Vdc | Relay-3 Vdc | Lamp Vac | V out Vdc
Light | 11.46 | 0.69 | 0.74 | 12.23 | 1.33 | 0.76 | 12.09 | 0 | 0 | 12.09 | 0 (Off) | 12.28
Dark | 11.41 | 0.14 | 0.67 | 0.09 | 0.68 | 0.68 | 1.94 | 11.96 | 12.03 | 0 | 233 (On) | −12.22
4 Conclusion

The smart curtain is one of the significant home needs of the modern era. The technology is easy for everyone to use and safe, since it works on a low DC supply. In auto mode, its operation depends on the external light, and it can also be operated manually according to the user's need. This paper presented the block diagram and the main components used for the smart curtain. The smart curtain has also been simulated and tested, and the results obtained are evidence of the project's success.
Machine Learning Assisted Binary and Multiclass Parkinson’s Disease Detection Satyankar Bhardwaj, Dhruv Arora, Bali Devi, Venkatesh Gauri Shankar, and Sumit Srivastava
Abstract Neurodegenerative changes in Parkinson's disease, a disease of the central nervous system, primarily affect the patient's movement and speech, and they may also cause tremors. More than 10 million people are affected worldwide, and research so far has not yielded any concrete remedies or cures. Researchers believe that by studying the past medical data of Parkinson's patients, we can build algorithms that diagnose the disease years before symptoms appear. In this research paper, we have analyzed the signs and symptoms facilitating the early diagnosis of Parkinson's disease by applying classification algorithms to the classical features. The subjects in this study were placed into four classes according to the Unified Parkinson's Disease Rating Scale (UPDRS) score after analyzing the various features considered in the above classes. We have attained 87.83 and 98.63% accuracy using KNN (binary data) and the Decision Tree Classifier (multiclass data).

Keywords Parkinson's disease · Neurodegenerative disease · Machine learning · Classifier · Regression
S. Bhardwaj · D. Arora · B. Devi (B) Computer Science and Engineering, Manipal University Jaipur, Jaipur, Rajasthan, India e-mail: [email protected]
V. G. Shankar · S. Srivastava Information Technology, Manipal University Jaipur, Jaipur, Rajasthan, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_15

1 Introduction

When a person has Parkinson's disease, their movement and speech can be affected, and tremors may also appear. Parkinson's disease is a disease of the brain, caused by damage to the dopamine-producing nerve cells of the substantia nigra. It is one of the most common brain diseases that worsen over time. Because of the high cost of treating the disease, the patients' quality of life is harmed [1, 2]: they have a hard time socializing, and their finances worsen because of the costs. Although the exact cause of this disease is not known, it is much easier to treat if it is
found early, when the symptoms are milder. The four main motor signs are often summarized by the acronym TRAP: tremor at rest, rigidity, akinesia (lack of movement), and postural instability. Parkinson's disease also slows patients down and changes the way the body moves. Beyond anxiety, neurodegenerative patients also suffer from fatigue, depression, sleep problems, and cognitive problems that make daily activities hard [1, 3, 4]. Phonation and speech disorders are also prevalent; they may appear up to five years before a doctor can confirm that the person has Parkinson's disease. The early stages of Parkinson's disease can be hard to identify when patients do not show any critical signs, and many symptoms resemble those of other disorders in the same group. For this reason, a correct clinical diagnosis can take five years or more, and patients should keep an eye on their health during this time; often, there are not many visible signs or symptoms at the beginning of the disease. A statistical illustration of the data related to the problem objective is described in the dataset section (Sect. 4).
2 Identify Research and Literature Survey

Gunjan Pahuja [5] used a model based on the Genetic Algorithm Extreme Learning Machine (GA-ELM) trained on MRI images of Parkinson's disease patients, where the classifier examines voxel-based features of the brain scans. GA-ELM achieved 95% accuracy and was found to be both more accurate and more stable than SVM. Gurpreet Singh [6] and his team proposed a multiclass framework for diagnosing neurodegenerative diseases that uses Principal Component Analysis to reduce dimensionality and the Fisher Discriminant Ratio (FDR) to rank the selected features. The average classification accuracy of this method was more than 95% for binary data and more than 85% for multiclass data, which the authors report as the best accuracy yet for a multiclass Parkinson's disease diagnosis. The study of Wang [7], which used a deep learning model on premotor features to automatically discriminate Parkinson's patients from normal individuals, showed an accuracy of 96.45%, attributed to the model's ability to learn linear and nonlinear features from Parkinson's disease data without any handcrafted feature extraction; it outperformed the twelve machine learning models previously considered. Benmalek [8, 9] diagnosed Parkinson's disease through voice analysis: a dataset of 375 people was divided by severity into four classes, and the Local Learning-Based Feature Selection (LLBFS) algorithm was used for grouping.
The random subspace algorithm was used to perform discriminant analysis for multiclass systems, which helped obtain the best results. Benmalek showed how voice features at different stages of Parkinson's disease can distinguish patients from healthy people; using Mel-frequency Cepstral Coefficients (MFCC) selected by the LLBFS algorithm with discriminant analysis, an accuracy of 86.7% was obtained. Incorporating more acoustic features not used in the model building could improve this further and be more useful in medicine for determining whether someone has Parkinson's disease. In his study, Lei [10] used a sparse feature selection framework to look for early signs of Parkinson's disease, together with a multiclass classification model [11], aimed at clinical use for identifying the disease type. Using multimodality data from the PPMI neuroimaging dataset, the method was shown to classify NC, PD, and SWEDD simultaneously, and ten important ROIs were identified, showing which brain regions matter for PD diagnosis. The hybrid intelligent framework for Parkinson's disease diagnosis built by Ali [12] used many different types of sustained phonation data. The framework has two parts: an LDA model to reduce dimensionality and a neural network model to classify the dataset. Using all the dataset's features, the system achieved a 95% accuracy rate on the training database and 100% on the test database. Because there were disproportionately many women in the dataset, gender-related features were removed; with this change, the proposed framework achieved 80% on the training database and 82% on the testing database. Jefferson S. Almeida [13] proposed a new way to detect Parkinson's disease using a mix of feature extraction and machine learning. Eighteen feature sets and 14 classifiers were used to analyze the phonation dataset in terms of EER and accuracy. The K1 classifier performed best with the audio cardioid (AC) microphone in phonation mode, where the Yaffe (YA) and KTU (KT) individual feature sets were best for this classifier; with the smartphone microphone in phonation mode, K1 again classified Yaffe (YA) and KT best, with accuracy rates of 94.55 and 92.94%, respectively. Senturk [14] proposed using features of voice signals from both healthy people and Parkinson's patients to determine whether someone has the disease. The experiments combined many feature selection (FS) methods and classifiers [15, 16]; using FS and classification methods together on speech signals with many different phonetic characteristics proved effective. The model can be applied when Parkinson's disease is in its early stages and could help prevent the disease's worst effects with high accuracy.
Table 1 Limitations of existing works
S. No. | Existing methods | Limitations
1 | Enhanced fuzzy KNN approach [17] | High error rates in elder population samples
2 | Linear discriminant analysis and genetically optimized neural network [12] | The biased dataset may have low accuracy
3 | Cepstral analysis [9] | Requires heavy preprocessing and feature extraction of voice samples
4 | Sparse feature learning [10] | Dataset requires MRI and DTI images, which need extensive preprocessing
5 | Machine learning framework based on PCA, FDR, and SVM for multiclass diagnosis [6] | The framework was used on a limited dataset and not implemented on a large dataset
Problem Statement

This research addresses early diagnosis of Parkinson's disease (PD) by classifying certain traits. The Unified Parkinson's Disease Rating Scale (UPDRS) score is used to classify the subjects into four groups based on their features. The research gap is summarized in Table 1.

Research Objectives and Novelty of the Work

1. To create a model that can tell whether or not a patient has Parkinson's disease based on clinical observations.
2. To create a multiclass model to predict the severity of Parkinson's disease.
3 Methodology

In our study, the models were applied primarily for binary and multiclass classification. The binary data was collected from the Department of Neurology in the Cerrahpaşa Faculty of Medicine, Istanbul University, Turkey. Since the binary dataset contained highly dimensional data, we implemented Principal Component Analysis (PCA) to reduce execution time and increase accuracy. The multiclass data was created by Athanasios Tsanas and Max Little with the help of 10 medical centers in the United States and Intel Corporation. The target variable, the Unified Parkinson's Disease Rating Scale (UPDRS) score, ranged between 7 and 54; the higher the rating, the more severe the disease. To conduct multiclass classification on the target variable, we scaled these continuous values down into 4 broad categories, ranging from 0 to 3, through unsupervised learning. The dataset contained no missing or repeated values. The data was highly dimensional, i.e., it had a massive number of features, which caused problems with processing time and negatively affected accuracy. To solve this problem, Principal Component Analysis (PCA) was used. PCA is an algorithm that groups together similar-looking features, thereby reducing the number of features, or dimensionality, of the data. As detailed ahead, PCA also yielded higher accuracy while significantly optimizing the process shown in Fig. 1. The basic skeleton of all the algorithms used is shown in Fig. 2.

Fig. 1 Detailed design methodology
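A minimal sketch of this standardize-then-PCA step is shown below, assuming the features are already assembled in a NumPy matrix; the array shape is an arbitrary stand-in, and 100 components is the value reported in Sect. 4:

```python
# Sketch of the dimensionality-reduction step; the random matrix stands in
# for the real feature matrix (samples x features).
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 300))            # stand-in feature matrix

X_std = StandardScaler().fit_transform(X)  # standardize each feature
pca = PCA(n_components=100)                # 100 components (see Sect. 4)
X_reduced = pca.fit_transform(X_std)
print(X_reduced.shape)                     # (500, 100)
```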
4 Dataset

We loaded labeled pickle files into an array and used different algorithms on these models. We applied Principal Component Analysis (PCA) to reduce the dimensionality of the data, with 100 components giving the highest accuracy. To further increase accuracy, we implemented data standardization [18–20].

Fig. 2 Detailed design flow diagram

For Binary Dataset: The data was collected at the Cerrahpaşa Faculty of Medicine, Istanbul University. The study covered 188 people who had the disease, aged 33–87 (107 men and 81 women). The control group consists of 64 healthy individuals (23 men and 41 women) aged 41–82. As part of the data-gathering process, the microphones were set to 44.1 kHz; after examination by a physician, three repetitions of the vowel "a" were collected from each subject [18–20].

For Multiclass Dataset: The data has been sourced from the UCI machine learning repository (Parkinson's telemonitoring dataset). The dataset had the UPDRS value (target variable) in continuous form, which was transformed into 4 broad categories through unsupervised scaling [18–20]. The target variable then had 4 discrete values ranging from 0 to 3, with increasing value indicating a more severe form of Parkinson's disease. Analysis of the correlation of UPDRS with the features showed that age was the most impactful feature, followed by HNR, DFA, sex, PPE, and RPDE. HNR is the ratio between periodic and non-periodic components in a voice sample; DFA is the signal fractal scaling exponent; PPE is a nonlinear measure of fundamental frequency variation; RPDE is a nonlinear dynamical complexity measure. Comparison of age, sex, and UPDRS showed that females were more likely to show symptoms earlier than males, while males tended to have a higher UPDRS (for reference, 0 = "Male", 1 = "Female"), as shown in Fig. 3.
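A hedged sketch of this UPDRS discretization follows; the paper states only that the 7–54 scores were scaled into four classes through unsupervised learning, so the equal-width binning below is an assumption made for illustration:

```python
# Equal-width binning of UPDRS (7-54) into the four severity classes 0-3.
# The paper states only that the scaling was unsupervised; equal-width
# bins are an assumption made for this illustration.
import pandas as pd

updrs = pd.Series([7.0, 18.5, 30.2, 41.7, 54.0])       # example scores
severity = pd.cut(updrs, bins=4, labels=[0, 1, 2, 3])  # 0 = least severe
print(severity.tolist())                               # [0, 0, 1, 2, 3]
```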
Fig. 3 UPDRS data with age

5 Results

5.1 Binary Data

We tested the data on the following models: SVM, KNN, Decision Tree, and Logistic Regression. We observed that KNN provided the highest accuracy among the tested models, with SVM following closely as the second best. The decision tree gave varying accuracy and was not a reliable option for binary classification. PCA increased the accuracy of SVM and Logistic Regression while reducing that of the decision tree and KNN.
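The comparison described in this section can be sketched as follows. The hyperparameters are the ones reported in Sects. 5.1.1–5.1.4 (k = 5 for KNN is an assumption here; it is the value reported for the multiclass run), and the synthetic data is a stand-in for the preprocessed voice features:

```python
# Hedged sketch of the binary-data model comparison in this section.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=600, n_features=100, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

models = {
    "SVM": SVC(kernel="poly", C=1, gamma="auto"),
    "Logistic Regression": LogisticRegression(solver="newton-cg", C=0.1),
    "KNN": KNeighborsClassifier(n_neighbors=5),   # k assumed
    "Decision Tree": DecisionTreeClassifier(random_state=1),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    print(f"{name}: test accuracy {model.score(X_te, y_te):.4f}")
```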
5.1.1 SVM Algorithm
Support vector machine (SVM) is a popular classification algorithm that uses support vectors to classify data points in an n-dimensional space. It plots the data points and tries to find a boundary, or hyperplane, separating the different classes; this hyperplane can be linear or nonlinear. Unlike many other classification algorithms, SVM performs well with highly dimensional data and is relatively robust to overfitting [2, 3]. The accuracy reported here was achieved by setting the parameters Kernel—poly, Param C—1, Param Gamma—auto (Figs. 4 and 5 and Table 2).
5.1.2 Logistic Regression Algorithm
Logistic Regression is a classification algorithm that works on the logistic function to produce a binary-valued output; with more complex solvers, multiclass outputs can also be produced. We used this algorithm because of its popularity, simplicity, and efficiency as a binary classifier, and this also allowed us to apply Logistic Regression to multiclass classification, where it is less commonly used [2, 16, 21].
Fig. 4 Model comparison for binary data
Fig. 5 Confusion matrix SVM (binary data)
Table 2 Accuracy of SVM on binary data
With PCA (110) | Without PCA
87.30% | 79.36%
This accuracy is achieved by tuning the parameters Solver—newton-cg and Param C—0.1 (Fig. 7 and Table 3).

Table 3 Accuracy of logistic regression (binary data)
With PCA (110) | Without PCA
85.71% | 83.59%
Fig. 6 Confusion matrix KNN (binary data)
Fig. 7 Confusion matrix logistic regression (binary data)
5.1.3 KNN Algorithm
K-nearest neighbor (KNN) is one of the simplest classification algorithms: it plots the given data points and classifies an unknown data point according to its distance to the "k" closest known data points. The user can set the value of "k" manually; hence, we could compare the accuracy of the model for different values of "k" and choose the most optimal one. We favored this algorithm because it is simple, robust, and works very well with large datasets [3, 15, 22]. The model showed an accuracy of 87.83%, which did not change with the application of PCA (Fig. 6).
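A small sketch of the k-selection idea mentioned above, scoring a few candidate k values by cross-validation; both the candidate values and the synthetic data are illustrative:

```python
# Sketch of choosing k by comparing cross-validated accuracy.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=600, n_features=100, random_state=0)
for k in (3, 5, 7, 9, 11):
    knn = KNeighborsClassifier(n_neighbors=k)
    score = cross_val_score(knn, X, y, cv=5).mean()
    print(f"k={k}: mean CV accuracy {score:.3f}")
```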
5.1.4 Decision Tree

A Decision Tree Classifier applies a set of learned rules to reach a final classification. It constructs a tree from the training data in which the leaf nodes are the final answers. A decision tree is an excellent classifier because it can work with data that has not been pre-processed and supports categorical data along with numerical data [11, 21–23]. We have used the decision tree classifier model to achieve the result in Table 4 and the confusion matrix in Fig. 8. This accuracy is achieved by tuning the parameter Random State—1.

Table 4 Accuracy of decision tree
With PCA (150) | Without PCA
75.13% | 85.18%

Fig. 8 Confusion matrix decision tree (binary data)
Fig. 9 Model comparison for multiclass data
5.1.5 Comparison Table for All the Classifier Algorithms (Binary Class)

The comparison table for all the classifier algorithms for the binary class is given in Table 5, and the model comparison for binary data is shown in Fig. 4.
5.2 Multiclass Data

The data has been sourced from the UCI machine learning repository: the Parkinson's telemonitoring dataset. The dataset had the target variable in continuous form, which was transformed into 4 broad categories. For multiclass classification (0: Normal, 1: Slight, 2: Mild, 3: Moderate), we use SVM, KNN, and the Decision Tree Classifier. Logistic Regression, which works on the sigmoid function and is excellent for binary classification but expected to perform poorly otherwise, was evaluated separately with a multiclass-capable solver (Sect. 5.2.4). The Decision Tree Classifier gives the best accuracy of 98.63%. The distributions of the continuous and discrete UPDRS are shown in Figs. 10 and 11.

Table 5 Comparison table for all the classifier algorithms (binary class)
Model name | Accuracy (%)
SVM | 87.30
Logistic regression | 85.71
KNN | 87.83
Decision tree | 85.18
Fig. 10 Distribution of continuous UPDRS
Fig. 11 Distribution of discrete UPDRS
Fig. 12 Confusion matrix SVM (multiclass data)
5.2.1 SVM Algorithm
The maximum accuracy of the model, 94.86%, was achieved by scaling the data and tuning the following parameters: Kernel—poly, Degree param—3, C param—100 (Fig. 12).
5.2.2 Decision Tree Classifier
The maximum accuracy of the model, 98.63%, was achieved by scaling the data and tuning the random state parameter to 3, as shown in Fig. 13.
5.2.3 KNN Algorithm
The maximum accuracy of the model, 92.85%, was achieved by scaling the data and setting the value of K to 5, as shown in Fig. 14.
Fig. 13 Confusion matrix decision tree classifier (multiclass data)
Fig. 14 Confusion matrix KNN (multiclass data)
5.2.4 Logistic Regression Algorithm
The maximum accuracy achieved with Logistic Regression was 90.53%, with the newton-cg solver, l1 penalty, and C = 100.
5.2.5 Comparison Table for All the Classifier Algorithms (Multiclass Data)

The comparison table for all the classifier algorithms for multiclass data is given in Table 6, and the model comparison for multiclass data (accuracy per classifier) is shown in Fig. 9.
Table 6 Comparison table for all the classifier algorithms (multiclass data)
Model name | Accuracy (%)
SVM | 94.89
Decision Tree Classifier | 98.63
KNN | 92.85
Logistic Regression | 90.53
6 Conclusion

We have used machine learning models such as SVM, Logistic Regression, Decision Tree, and KNN classifiers to build models on the given datasets that distinguish and classify patient data according to the UPDRS. We have attained 87.83 and 98.63% accuracy using KNN (binary data) and the Decision Tree Classifier (multiclass data). We also concluded that SVM provided consistently high accuracy in every clinical data scenario. Table 7 presents a comparison summary that makes clear the differences between the proposed model and the traditional models.

Acknowledgements We would like to thank Manipal University Jaipur and the CIDCR Lab-1032AB, School of Computing and IT, Manipal University Jaipur, for supporting us in conducting this research. We are also very grateful to UPDRS (Unified Parkinson's Disease Rating Scale) and UCI for sharing the dataset.
Table 7 A comparison summary table makes clear the differences between the proposed model and the traditional model
Paper | Binary classification accuracy (%) | Multiclass classification accuracy (%)
Proposed model | 87.83 | 98.63
Pahuja et al. [5] | 95 | –
Singh et al. [6] | 95 | 85
Wang et al. [7] | 96.45 | –
Benmalek et al. [8] | – | 86.7
Benmalek et al. [9] | – | 86.7
Lei et al. [10] | – | 78.4
Ali et al. [12] | 100 (without gender features)/82 (with gender features) | –
Almeida et al. [13] | 94.55 | –
Senturk [14] | 93.84 | –
Cai et al. [17] | 97.89 | –
References

1. V.G. Shankar, D.S. Sisodia, P. Chandrakar, DataAutism: an early detection framework of autism in infants using data science, in Data Management, Analytics and Innovation. Advances in Intelligent Systems and Computing, vol. 1016, ed. by N. Sharma, A. Chakrabarti, V. Balas (Springer, Singapore, 2020). https://doi.org/10.1007/978-981-13-9364-8_13
2. B. Devi, V.G. Shankar, S. Srivastava, D.K. Srivastava, AnaBus: a proposed sampling retrieval model for business and historical data analytics, in Data Management, Analytics and Innovation. Advances in Intelligent Systems and Computing, vol. 1016, ed. by N. Sharma, A. Chakrabarti, V. Balas (Springer, Singapore, 2020). https://doi.org/10.1007/978-981-13-9364-8_14
3. V.G. Shankar, D.S. Sisodia, P. Chandrakar, A novel discriminant feature selection–based mutual information extraction from MR brain images for Alzheimer's stages detection and prediction. Int. J. Imag. Syst. Technol. 1–20 (2021). https://doi.org/10.1002/ima.22685
4. B. Devi, S. Srivastava, V.K. Verma, Predictive analysis of Alzheimer's disease based on wrapper approach using SVM and KNN, in Information and Communication Technology for Intelligent Systems. ICTIS 2020. Smart Innovation, Systems and Technologies, vol. 196, ed. by T. Senjyu, P.N. Mahalle, T. Perumal, A. Joshi (Springer, Singapore, 2021). https://doi.org/10.1007/978-981-15-7062-9_71
5. G. Pahuja, T.N. Nagabhushan, A novel GA-ELM approach for Parkinson's disease detection using brain structural T1-weighted MRI data, in 2016 Second International Conference on Cognitive Computing and Information Processing (2016)
6. G. Singh, M. Vadera, L. Samavedham, E.C.-H. Lim, Machine learning-based framework for multiclass diagnosis of neurodegenerative diseases: a study on Parkinson's disease. IFAC-PapersOnLine 49(7), 990–995 (2016)
7. W. Wang, J. Lee, F. Harrou, Y. Sun, Early detection of Parkinson's disease using deep learning and machine learning. IEEE Access 8, 147635–147646 (2020). https://doi.org/10.1109/ACCESS.2020.3016062
8. E. Benmalek, J. Elmhamdi, A. Jilbab, Multiclass classification of Parkinson's disease using different classifiers and LLBFS feature selection algorithm. Int. J. Speech Technol. 20(1), 179–184 (2017)
9. E. Benmalek, J. Elmhamdi, A. Jilbab, Multiclass classification of Parkinson's disease using cepstral analysis. Int. J. Speech Technol. 21(1), 39–49 (2017)
10. H. Lei, Y. Zhao, Y. Wen, Q. Luo, Y. Cai, G. Liu, B. Lei, Sparse Feature Learning for Multiclass Parkinson's Disease Classification (IOS Press, 2018)
11. J.I.Z. Chen, P. Hengjinda, Early prediction of coronary artery disease (CAD) by machine learning method—a comparative study. J. Artif. Intell. 3(1), 17–33 (2021)
12. L. Ali, C. Zhu, Z. Zhang, Y. Liu, Automated detection of Parkinson's disease based on multiple types of sustained phonations using linear discriminant analysis and genetically optimized neural network. IEEE J. Transl. Eng. Health Med. (2019)
13. J.S. Almeida, P.P. Rebouças Filho, T. Carneiro, W. Wei, R. Damaševicius, R. Maskeliunas, V.H.C. de Albuquerque, Detecting Parkinson's disease with sustained phonation and speech signals using machine learning techniques. Pattern Recognit. Lett. (2019)
14. Z.K. Senturk, Early diagnosis of Parkinson's disease using machine learning algorithms. Med. Hypotheses 138, 109603 (2020)
15. A. Sungheetha, R. Sharma, Design an early detection and classification for diabetic retinopathy by deep feature extraction based convolution neural network. J. Trends Comput. Sci. Smart Technol. (TCSST) 3(02), 81–94 (2021)
16. J.S. Manoharan, Study of variants of extreme learning machine (ELM) brands and its performance measure on classification algorithm. J. Soft Comput. Paradigm (JSCP) 3(2), 83–95 (2021)
17. Z. Cai, J. Gu, C. Wen, D. Zhao, C. Huang, H. Huang, C. Tong, J. Li, H. Chen, An intelligent Parkinson's disease diagnostic system based on a chaotic bacterial foraging optimization enhanced fuzzy KNN approach. Comput. Math. Methods Med. (2018)
18. Multiclass dataset. https://archive.ics.uci.edu/ml/datasets/Parkinsons+Telemonitoring
19. Image dataset used: Parkinson's Progression Markers Initiative (PPMI) database. www.ppmi-info.org/data
20. UPDRS dataset. https://www.movementdisorders.org/MDS/MDS-Rating-Scales/MDS-Unified-Parkinsons-Disease-Rating-Scale-MDS-UPDRS.htm. Accessed 11 Sept 2021
21. V. Goel, V. Jangir, V.G. Shankar, DataCan: robust approach for genome cancer data analysis, in Data Management, Analytics and Innovation. Advances in Intelligent Systems and Computing, vol. 1016, ed. by N. Sharma, A. Chakrabarti, V. Balas (Springer, Singapore, 2020). https://doi.org/10.1007/978-981-13-9364-8_12
22. V.G. Shankar, B. Devi, A. Bhatnagar, A.K. Sharma, D.K. Srivastava, Indian air quality health index analysis using exploratory data analysis, in Micro-Electronics and Telecommunication Engineering. Lecture Notes in Networks and Systems, vol. 179, ed. by D.K. Sharma, L.H. Son, R. Sharma, K. Cengiz (Springer, Singapore, 2021). https://doi.org/10.1007/978-981-33-4687-1_51
23. V.G. Shankar, B. Devi, U. Sachdeva, H. Harsola, Real-time human body tracking system for posture and movement using skeleton-based segmentation, in Micro-Electronics and Telecommunication Engineering. Lecture Notes in Networks and Systems, vol. 179, ed. by D.K. Sharma, L.H. Son, R. Sharma, K. Cengiz (Springer, Singapore, 2021). https://doi.org/10.1007/978-981-33-4687-1_48
Category Based Location Aware Tourist Place Popularity Prediction and Recommendation System Using Machine Learning Algorithms Apeksha Arun Wadhe and Shraddha Suratkar
Abstract Tourism contributes majorly to the economic growth of a country, and a country like India is a large market for tourism; it is one of the prime sectors contributing highly to the GDP. Hence, it is necessary to build a smart tourism system that helps increase revenue from the tourism industry. In this paper, we propose a novel framework for place popularity prediction and a recommendation system using machine learning algorithms; the novel recommendation system can overcome the cold start problem. The dataset used in this research study has been gathered from popular tourism web sites. In the experiment, the category of each place has been determined using the LDA algorithm, the location of each place has been identified using the K-means clustering algorithm, and sentiment analysis has been used for popularity rating prediction. Sentiment classification has been done using well-known supervised machine learning algorithms, i.e., naive Bayes (NB), decision tree (DT), support vector machine (SVM), and random forest (RF), with performance analyzed over several evaluation factors. From the research, we conclude that random forest gave the highest performance in comparison with the decision tree, naive Bayes, and SVM classifiers. As a result, popularity-based tourist spot classification using RF has been implemented, which gave an accuracy of 88.02% on the testing data used. On the basis of category + location, the top-N popular places are recommended to the end user using a combination of the LDA, RF, and K-means algorithms in a layered approach. This hybrid recommendation system is a combination of content-based and popularity-based recommendation systems, and it gives more precise recommendations than purely content-based or purely popularity-based systems.

Keywords Latent Dirichlet allocation · Linear support vector machine · Multinomial naive Bayes · Bag of words · TF-IDF · K-means · Random forest · Decision tree

A. Arun Wadhe (B) · S. Suratkar Department of Computer Engineering and Information Technology, VJTI, Mumbai 400019, India e-mail: [email protected]
S. Suratkar e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_16
1 Introduction

Tourism contributes majorly to the economic growth of a country, and a country like India is a large market for tourism; it is one of the prime sectors contributing highly to the GDP. Hence, it is necessary to build a smart tourism system that helps increase revenue from the tourism industry. Nowadays, social media is growing rapidly: millions of users post reviews and rate places on tourism web sites and forums, which provides a platform for popularity prediction and recommendation. Various models have been proposed for popularity prediction and recommendation, but such systems face problems like data sparsity and the cold start problem. As tourism data is rare, it suffers from the data sparsity problem, and the cold start problem is one of the major problems in recommendation systems; so there is a need for a system that overcomes these problems. In this research study, tourist place categorization has been performed using a topic modeling algorithm for category-based recommendation, sentiment analysis has been performed for popularity prediction, and tourist place region detection has been performed for location-aware recommendation. Topic modeling has been performed using latent Dirichlet allocation (LDA). Sentiment classification has been performed using popular machine learning algorithms, i.e., multinomial naive Bayes, linear support vector machine, and random forest. The K-means algorithm is used for place location clustering. Comparative analysis of the supervised algorithms used in the study has been performed with the help of several evaluation parameters such as accuracy score, recall, precision, and F1-score. The contents of the paper are outlined as follows: Sect. 2 states the work related to this research study; Sect. 3 explains the terminologies of the machine learning algorithms used; Sect. 4 elaborates the methodology for tourist place popularity prediction and recommendation; Sect. 5 describes the experiments performed; Sect. 6 shows the results; Sect. 7 states the comparative analysis; Sect. 8 describes the advantages of the recommendation system; Sect. 9 depicts the disadvantages; Sect. 10 concludes the research work; and future work is delineated in Sect. 11.
2 Literature Review

The author of [1] has proposed a novel hierarchical framework for point-of-interest popularity prediction using a heterogeneous tourist place dataset. TCHP was estimated using a hierarchical multi-clue fusion technique, and various early- and late-fusion techniques such as EF and SVM2K (SVM + KCCA) were compared for popularity prediction. In hierarchical multi-clue modeling, multi-modal data types are taken into consideration, with the topic modeling layer implemented using latent Dirichlet allocation (LDA); the stated future scope of the research is to design a recommendation system. In [2], online news popularity has been predicted using machine learning techniques on the UCI machine learning repository dataset, including SVM, neural networks, random forest, KNN, naive Bayes, multilayer perceptron, bagging, AdaBoost, logistic regression, etc.; the study found that multilayer perceptron and random forest give the best performance when the parameters are set to their best values. In [3], the author has proposed a new method for sentiment classification using latent Dirichlet allocation (LDA) for topic detection, concluding that the proposed method with naive Bayes outperformed other machine learning techniques such as naive Bayes, SVM, AdaBoost with naive Bayes, and decision tree. In [4], the author has proposed a demographic-based recommendation system using naive Bayes, Bayesian networks, and support vector machines; machine learning algorithms, especially SVM, were found to outperform the baseline methods, and the future scope of the study is to include review text to improve prediction accuracy. In [5], a hybrid recommendation system has been implemented as a combination of content-based and collaborative filtering using Bing data; content-based recommendation is based on the user's content, whereas collaborative filtering is based on similarity in users' choices, and the fusion of the two was found to give better recommendation quality. In [6], the author has proposed a novel system for personalized tourist destination recommendation on the basis of social media check-in data, with friends' check-in data taken into consideration to overcome the cold start problem; using Facebook as the social media platform, the paper concludes that overcoming the cold start problem improves recommendation quality. In [7], the novel GuideMe model for tourist recommendation gives recommendations on the basis of user preferences, the user's present location, and past activity, integrated with social services like Facebook and Twitter; the future scope of the paper is to take user ratings, friends' similarity, and comments into consideration. In [8], the author has proposed a user-based collaborative filtering method for tourist spot recommendation: similarities between users are calculated with the cosine similarity method, and tourist spots are then recommended according to the visiting history of the user's neighbors with similar choices. In [9], the author has performed a detailed survey of recommendation system types, i.e., content-based filtering, collaborative filtering, hybrid recommenders, and demographic recommendation systems, and discussed various challenges along with the advantages and disadvantages of the different systems. In [10], the author has proposed a novel approach for research paper recommendation based on topic modeling, using latent Dirichlet allocation (LDA) for topic analysis to compute a thematic similarity measurement for topic clustering; by adopting the proposed method, problems of traditional recommendation systems such as the cold start problem can be overcome. This paper is an extension of the base paper [11].
3 Terminologies of Machine Learning Algorithms

3.1 Feature Extraction

In machine learning and pattern recognition, feature extraction is a crucial step. It helps in dimensionality reduction, as it removes redundant and irrelevant attributes; as a result, the performance of machine learning algorithms improves drastically. In NLP, a word is used as a feature, and natural language processing offers many algorithms for feature extraction. In this research study, we have used BoW and TF-IDF. BoW (bag of words) is an approach in which a feature vector is formed from word counts: the number of times a token appears in a document gives the feature weight. TF-IDF is an extension of BoW; in addition to the word count, the inverse document frequency is taken into consideration when calculating the weight of a feature in a document. References [12, 13] provide more details. The following equations define TF-IDF:

TF = (number of times word w occurs in a document) / (total number of words in the document)
IDF = log((total number of documents) / (number of documents in which word w is present))
TF-IDF = TF × IDF
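The following sketch illustrates these formulas with scikit-learn's TfidfVectorizer on toy documents; note that scikit-learn applies a smoothed IDF and L2 normalization, a slight variant of the plain formulas above:

```python
# TF-IDF on toy review snippets using scikit-learn.
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "beautiful beach with white sand",
    "historic fort near the beach",
    "quiet hill station with tea gardens",
]
vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(docs)   # sparse (3 x vocabulary) matrix
print(vectorizer.get_feature_names_out())
print(tfidf.toarray().round(2))
```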
3.2 Topic Modeling

Topic modeling is a kind of statistical modeling for finding the hidden, or latent, topics that appear in a set of documents. Latent Dirichlet allocation (LDA) is a topic modeling algorithm useful for assigning the words in a document to specific topics. LDA based on Gibbs sampling is used, as it performs better in the case of data imbalance as well. Gibbs-sampling LDA uses the Bayes algorithm and the Monte Carlo Markov chain (MCMC) principle, whose main aim is to develop a Markov chain whose equilibrium distribution is the posterior probability distribution; however, it takes a long run time to converge. For more insights refer to [14].
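A minimal topic-modeling sketch in this spirit, shown here with gensim's LDA on toy place descriptions; the study itself uses Mallet's Gibbs-sampling LDA, so this is purely an illustration:

```python
# Toy LDA run with gensim (illustration only; the study uses Mallet's
# Gibbs-sampling LDA).
from gensim import corpora
from gensim.models import LdaModel

texts = [
    ["beach", "sand", "sea", "sunset"],
    ["temple", "pilgrimage", "shrine", "festival"],
    ["museum", "art", "gallery", "exhibit"],
]
dictionary = corpora.Dictionary(texts)
corpus = [dictionary.doc2bow(t) for t in texts]

lda = LdaModel(corpus, num_topics=3, id2word=dictionary,
               passes=10, random_state=0)
for topic_id, words in lda.print_topics(num_words=3):
    print(topic_id, words)
```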
3.3 Classification Algorithms

Classification is one of the methods of supervised machine learning. Supervised algorithms are used to train a model when the data is labeled (the label is also called the class). Apart from classification, there is another supervised method, regression: classification determines a discrete class for a data record, whereas regression predicts a continuous value. For this research, we have used classification algorithms for popularity prediction. Below are a few highly popular classification algorithms.
3.3.1 Decision Tree

The decision tree is one of the most popular supervised machine learning algorithms. It follows a rule-based approach, forming a tree that consists of various nodes: the internal nodes are decision points, and the leaf nodes are the classification labels. Rules are formulated from the tree; refer to [15] for more details.
3.3.2 Multinomial Naive Bayes

Naive Bayes is one of the most famous supervised machine learning algorithms, based on a probabilistic approach. The most common naive Bayes models are Bernoulli and multinomial. Multinomial NB, denoted MNB, is more appropriate for text mining: it runs naive Bayes on a multinomially distributed dataset, assuming the features follow a multinomial distribution when calculating the probability of a document belonging to each class [16, 17]. Refer to [17] for more details.
3.3.3 Support Vector Machine

The support vector machine is one of the most powerful supervised algorithms. Its goal is to find the separating hyperplane with the largest margin for generalized classification. The support vector machine has several kernel methods, such as linear, radial basis function, and polynomial; the linear kernel is used when the dataset is linearly separable by a hyperplane [18]. Reference [18] provides more details on SVM.
3.3.4 Random Forest

Random forest is an ensemble predictor with a set of decision trees that grow in randomly selected subspaces of the data. It is a collection of multiple decision trees that predict the class by majority vote, producing the final prediction. References [19, 20] provide more details.
3.4 Cluster Analysis

Cluster analysis is the task of grouping similar kinds of entities together. It is a type of unsupervised machine learning technique. For better clustering, the intra-cluster distance must be minimal and the inter-cluster distance maximal.
3.4.1 K-Means

K-means is one of the most famous and powerful partitioning methods used to cluster data. It is an unsupervised, numerical, non-deterministic, iterative clustering process. In K-means, every cluster is represented by the mean value of the elements present in it: n elements get partitioned into K groups (clusters) so that similarity among the elements within a cluster is high and similarity between clusters is low. Similarity is calculated from the mean value of the elements within a cluster, and the process iterates until the cluster memberships remain constant. Li and Wu [21] give more insights.
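A short sketch of K-means with K = 4, the configuration used later for the region clustering; the latitude/longitude pairs below are made-up examples, not places from the study's dataset:

```python
# K-means with K = 4 on place coordinates (illustrative values).
import numpy as np
from sklearn.cluster import KMeans

coords = np.array([
    [28.61, 77.21],   # roughly northern
    [13.08, 80.27],   # roughly southern
    [22.57, 88.36],   # roughly eastern
    [19.08, 72.88],   # roughly western
    [26.91, 75.79],
    [12.30, 76.65],
])
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(coords)
print(kmeans.labels_)           # cluster index per place
print(kmeans.cluster_centers_)  # centroids, to be mapped to N/S/E/W
```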
4 Methodology

In this paper, category-based location-aware tourist place popularity prediction and recommendation has been performed using the following methodology; refer to Fig. 1 for the proposed framework.
4.1 Dataset

Data is gathered from several heterogeneous tourism review web sites and was converted from its heterogeneous formats into one common format. The data has the features place description, place name, latitude, longitude, review, and rating. Sentiment has been derived from the rating: if rating > 3, the sentiment is +1; if rating < 3, the sentiment is −1; if rating = 3, the sentiment is 0.
4.2 Language Detection and Data Cleansing

Data present on tourism review web sites is multilingual; hence, language detection has been performed and non-English data deleted. Data preprocessing is a crucial process, highly essential because social media data is noisy and raw. Data cleansing has been carried out using steps like tokenization, stop word elimination, lower casing, stemming, and lemmatization. A customized stop word list has been used. Stop words are frequently occurring, irrelevant words that need to be eliminated to improve the performance of machine learning algorithms; the stop word list includes words like a, this, we, you, that, and. In addition to these steps, words shorter than four characters, punctuation characters, numbers, and special symbols have been removed.

Fig. 1 Proposed framework for tourist place popularity prediction and recommendation system
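A hedged sketch of this cleansing pipeline using NLTK follows; the sample review and the extra custom stop words are illustrative assumptions, and the study's own customized stop word list is not reproduced here:

```python
# Illustrative NLTK version of the cleansing steps above; the sample review
# and the extra custom stop words are assumptions, not the study's own list.
import re
import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

nltk.download("stopwords", quiet=True)
nltk.download("wordnet", quiet=True)
nltk.download("omw-1.4", quiet=True)

custom_stops = set(stopwords.words("english")) | {"place", "visited"}
lemmatizer = WordNetLemmatizer()

def clean(text):
    text = re.sub(r"[^a-z\s]", " ", text.lower())       # lower-case, drop digits/punctuation
    tokens = [t for t in text.split() if len(t) >= 4]   # short-word removal
    return [lemmatizer.lemmatize(t) for t in tokens if t not in custom_stops]

print(clean("The beaches were AMAZING!! Visited 2 times :)"))
# e.g. ['beach', 'amazing', 'time']
```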
4.3 Feature Extraction

Text mining offers numerous feature extraction techniques. Feature extraction for topic modeling has been performed using the bag of words (BoW) algorithm, whereas sentiment analysis has been performed using TF-IDF.
4.4 Topic Modeling

Topic modeling has been performed to find the category of a place based on its place description, using latent Dirichlet allocation (LDA) with Gibbs sampling. 3900 records were used for training and 152 records for testing the model.
4.5 Sentiment Analysis

Sentiment analysis has been performed for popularity prediction, using decision tree, naive Bayes, support vector machine, and random forest. Reviews associated with 152 tourist places have been collected; the 3209 reviews were divided into training and test data in an 80:20 ratio.
4.6 Spatial Data Mining

Spatial data mining has been performed for place region detection, which is helpful for location-aware recommendations. Place regions are clustered using the K-means algorithm, and places are categorized into four classes: north, south, east, and west.
4.7 Popularity Prediction

The popularity of the 143 testing places has been predicted. The popularity rating of a tourist place is determined from the number of positive-sentiment reviews and the total number of reviews; the formula used for popularity rating prediction is described in Sect. 5.
4.8 Recommendation System

Top-N recommendations are given to end users based on the user's search query. The place in the search query is analyzed, and places of a similar category, similar region, and similar category + region are shown to the end user. For cold start users, a popularity-based recommendation is given.
4.9 Visualization

Results have been visualized using pie charts, bar charts, and gmaps; with gmap, users can view results on Google Maps. The comparative analysis of the supervised algorithms has been performed and visualized with a matplotlib bar chart.
4.10 Performance Evaluation

Machine learning models have to be evaluated to measure their efficiency. The performance of the machine learning models used in the research study has been evaluated with parameters such as accuracy score, recall, precision, and F1-score.
5 Experiment

In the research experiment, 3900 place description records have been collected for topic modeling training, and 152 records for testing. Reviews associated with the 152 tourist places have been collected, giving 3209 reviews in the dataset. The data has been assembled from numerous tourism forums and sites into a comma separated values file comprising the fields place description, place name, latitude, longitude, review, and rating. On the basis of the rating, sentiment has been assigned to three classes (+1, −1, 0), where +1 denotes the positive class, −1 the negative class, and 0 the neutral class; in this way, a labeled dataset has been gathered. Latitude and longitude data for the tourist places have also been collected for spatial data mining. The dataset is split into 80:20 proportions for training and testing data, respectively. The next step was language detection and data preprocessing. Feature extraction has been performed using BoW for topic modeling, and topic modeling has been performed using LDA Gibbs sampling to find the category of each place among 12 categories: hill station, historical place, beach, pilgrimage place, art gallery, museum, educational place, desert, botanical garden, national park, aquarium, and amusement park. For sentiment analysis, feature extraction has been performed using the TF-IDF algorithm, and sentiment analysis has been performed using machine learning algorithms such as naive Bayes, support vector machine, and random forest, categorizing reviews into positive (+1), negative (−1), and neutral (0). Based on the sentiment analysis, the overall popularity rating of each tourist place has been predicted. Spatial data mining has been performed using the K-means algorithm to categorize places into regions, i.e., north, south, east, and west. Ultimately, the top ten recommendations are given based on category, location, and popularity rating.
5.1 Experimental Framework

The experiment has been executed on a system with Windows 8.1 Home 64-bit OS and an x64-based processor, with the hardware and software configurations below.

Hardware Configuration
• Intel(R) Core(TM) i5-2450M processor @ 2.50 GHz
• 8 GB memory
• 512 GB ROM

Software Configuration
• Python programming language version 3.6
• PyCharm Studio 2018
• Mallet 2.8
• Java Development Kit 1.8
• MySQL Workbench
5.2 Experimental Scenarios

The motive of the experimental scenarios is to estimate the best combination of parameters and hyperparameters for the topic modeling and classification algorithms. Performance evaluation has been performed for every combination of parameters, and the parameters and hyperparameters have been selected after several iterations so that the model neither overfits nor underfits. For topic modeling, LDA Gibbs sampling has been used. The most important parameter in topic modeling is the number of topics, denoted T, whose value should equal the number of classes; T has been set to 12 because the documents needed to be classified into 12 categories. The hyperparameters alpha and beta are set to 50/T and 0.001, respectively, where T is the number of topics; alpha and beta influence topic sparsity per document and word sparsity per topic, respectively. The number of iterations is set to 1000. In sentiment analysis, the classification algorithms decision tree, multinomial naive Bayes, linear support vector machine, and random forest are utilized. For naive Bayes, a multinomial event model was used with alpha set to 0.25. Similarly, for SVM, gamma is set to 0.01 and C to 1 with a linear kernel function. For RF, n_estimator is set to 1000 and the gini criterion is selected. For DT, the criterion is set to entropy and max_features to auto. For K-means, the K-value has been set to four.
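The classifier settings listed above can be sketched as follows; the data here is a synthetic stand-in for the TF-IDF review vectors, and max_features="auto" (the paper's setting) is replaced by its modern scikit-learn equivalent "sqrt", since "auto" was removed in newer releases:

```python
# The classifier settings reported above, sketched with scikit-learn;
# synthetic stand-in data, not the study's review vectors.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, n_features=50, random_state=0)
X = np.abs(X)   # MNB needs non-negative features (TF-IDF is non-negative)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

models = {
    "DT": DecisionTreeClassifier(criterion="entropy", max_features="sqrt"),
    "MNB": MultinomialNB(alpha=0.25),
    "LSVM": SVC(kernel="linear", C=1, gamma=0.01),
    "RF": RandomForestClassifier(n_estimators=1000, criterion="gini"),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    print(f"{name}: test accuracy {model.score(X_te, y_te):.3f}")
```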
5.3 Implementation

We have gathered a dataset from heterogeneous tourism review web sites and converted it into one common format containing place description, place name, latitude, longitude, and comment as feature attributes. As social media data is very raw and noisy, it needs proper preprocessing: non-English records have been deleted to make the data monolingual, and preprocessing consists of tokenization, punctuation mark removal, stop word and short word removal, lemmatization, and stemming, with a customized stop word list used instead of the NLTK stop word list. The next step was feature extraction, performed using bag of words (BoW) for topic modeling and TF-IDF for sentiment analysis. After feature extraction, topic modeling has been performed using LDA Gibbs sampling to find the category of each place among the 12 categories. Sentiment analysis has been performed over the labeled review data using the decision tree (DT), multinomial naive Bayes (MNB), linear support vector machine (SVM), and random forest (RF) algorithms, and the performance of the models has been estimated with several factors such as accuracy, precision, recall, and F1-score. The popularity of a place has been determined using the following equations:

P = (number of positive comments / total number of comments) × 10
Popularity rating = P / 2

Depending on the predicted overall popularity rating, if the rating is greater than or equal to three, the destination is classified into the popular class, else unpopular. During the experiment, we found that RF gave better sentiment classification results, so the popularity prediction and recommendation engine have been developed using RF. Results of the popularity prediction have been visualized using the matplotlib and gmap Python libraries; geographic representation on Google Maps helps end users visualize place details. The next step was to categorize places into the regions north, south, east, and west using the K-means algorithm with a K-value of four. The last step was recommendation: a web site has been developed where top-N recommendations are given to cold start users based on popularity, and similar-region, similar-category, and similar category + region recommendations are given to the end user on the basis of the place in the search query. Experimental results are shown in the Results section.
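A direct transcription of the popularity formula above, with an illustrative review count:

```python
# Direct transcription of the popularity-rating formula above.
def popularity_rating(n_positive, n_total):
    """Map the share of positive reviews onto a 0-5 rating."""
    p = (n_positive / n_total) * 10
    return p / 2

rating = popularity_rating(38, 50)            # 38 of 50 reviews positive
label = "popular" if rating >= 3 else "unpopular"
print(rating, label)                          # 3.8 popular
```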
6 Results Results of the research study are depicted in Figs. 2, 3, 4, and 5. Figure 2 denotes results popularity prediction on gmap. Figure 3 indicates search form for tourists and on bases of search the most popular place similar to the searched place has been
218 Fig. 2 Geographic result of category detection and popularity prediction of places
Fig. 3 Search form
Fig. 4 Similar places to searched place
A. Arun Wadhe and S. Suratkar
Category Based Location Aware Tourist Place Popularity …
219
Fig. 5 Popularity-based recommendation for cold start use
given in Fig. 4. The popularity-based recommendation for a cold start user is given in Fig. 5.
Fig. 6 Popularity statistic report using random forest
Table 1 Comparison of machine learning algorithms using several factors

Parameter     DT (%)   MNB (%)   LSVM (%)   RF (%)
Accuracy      80.37    84.11     85.35      86.29
Precision     81       83        85         86
Recall        80       84        85         86
F1-measure    81       82        84         85
7 Comparative Analysis
Comparative analysis of the machine learning algorithms for sentiment-based popularity prediction using several parameters is delineated in Table 1.
8 Advantages
• This hybrid recommendation system is a combination of content-based and popularity-based systems, so it gives more precise recommendations to travelers in comparison with purely content-based or purely popularity-based systems.
• Also, as the popularity of a place is determined independently of user history, cold start users who have not searched much content can still be given popularity-based place recommendations.
9 Disadvantages
• In this system, social influence has not been taken into consideration.
• Also, for giving location aware recommendations, distance has not been taken into consideration.
10 Conclusion
The research presented in this paper reveals the main problem in recommendation systems and proposes a possible solution to overcome it. We can conclude that category-based, location aware tourist place popularity prediction and recommendation can be beneficial for giving recommendations to cold start users; it has improved the quality of recommendation by giving a more precise recommendation to the end user, as it uses three factors for recommendation, i.e., category, location, and
popularity. In sentiment analysis, random forest (RF) has outperformed decision tree (DT), support vector machine (SVM), and naive Bayes (NB) on the research dataset used and on the basis of the various evaluation parameters used in the research. So, popularity-based tourist spot classification using RF has been implemented, which classifies spots into popular and unpopular classes and gave an accuracy of 88.02%.
11 Future Scope
A major problem in the tourism domain is data sparsity. So, in the future, we will try to extend the dataset to overcome the data sparsity problem. Also, recommendation quality can be improved using a distance-based location aware system; in future work, we will try to incorporate geographic distances in the design of the recommendation system.
Acknowledgements We wish to express our gratitude toward Veermata Jijabai Technological Institute, Mumbai for providing laboratory and resources for conducting this research work.
Maximization of Disjoint K-cover Using Computation Intelligence to Improve WSN Lifetime D. L. Shanthi
Abstract WSNs have been used in different sectors of applications such as industrial, environmental, and social, due to the progress of technology and necessity. Because the network's sensors are restricted by battery power, energy-aware network operation is important. The life extension of a wireless sensor network has been explored in this study by locating a large number of disjoint set covers, where each disjoint group of sensors covers all of the targets. Instead of keeping all sensor nodes in operation, the service life can be prolonged by about K times by using the sensors of one cover while the sensors of the other covers are in sleep mode. This approach saves both energy and time by processing useful data and reducing duplicate data coming from different sensors in a region. Different configurations of sensor networks have been tested using evolutionary computation-based computational intelligence techniques, namely a genetic algorithm and differential evolution. To keep solutions feasible, a local correction operator has been incorporated. With integer encoding of solutions, the genetic algorithm performs better than differential evolution in finding a good number of disjoint set covers. Over a continuous search space, DE is highly efficient, but its efficiency has been hampered by the integer transformation. Keywords Computational Intelligence (CI) · Coverage · Disjoint sets · Differential evolution (DE) · Genetic algorithm (GA) · Lifetime
D. L. Shanthi (B) BMS Institute of Technology and Management, Bengaluru, India; e-mail: [email protected]

1 Introduction
In many applications, wireless sensor networks (WSNs) have demonstrated tremendous usefulness, including battlefield surveillance, environmental monitoring, traffic control, animal tracking, and residential applications. A large number of WSN applications need long-term monitoring in a cost-effective manner; the network needs to be operational for a longer time, so enhancing the network lifetime is a major concern. Exhaustible sensor batteries have prompted the development of techniques
to extend the network's lifetime, which has become one of the most critical and difficult challenges. Consequently, extending the life of WSN networks is an essential study topic. Data processing, routing, device location, topology management, and device control are all challenges that can be addressed to extend the life of WSNs. In a highly dispersed WSN, a fraction of the devices can already address coverage and connection problems [1]. The network coverage problem is one of the issues that has to be solved; from the literature, it has been contemplated that connectivity across the network follows from node coverage, and it is established when each node's communication range is at least twice its sensing range. The device control technique that schedules the devices' sleep/wakeup activities has proved to be promising in a WSN, where the coverage problem affects how well an area of interest is monitored by sensors. Random placement is used to deploy the sensors to the target location (e.g., by dropping from planes). Active and sleep modes are the most common modes of operation for sensors. A sensor needs a lot of energy to perform all of its functions, such as detecting, processing, and communicating. A sensor in sleep mode, on the other hand, uses very little energy and can be activated for complete activities within a predetermined amount of time. In an area where a subset of sensors can entirely cover the target region, the remainder of the sensors may be programmed to enter sleep mode to conserve energy; hence, if there are more such subsets, the lifetime of the WSN may be significantly extended. Another approach to improving network life would be to increase the number of fully covering subsets. The maximum number of full cover subsets is difficult to find, since each subset needs to offer full coverage of the target region, yet only one subset of sensors is active at any moment in the WSN to conduct the monitoring task. The disjoint set cover issue is commonly called the K-COVER problem and is an NP-complete problem in WSNs.
2 Related Work
Wireless sensor networks (WSNs) are known to be energy-constrained, and the endurance of each network sensor is heavily influenced by the battery capacity of the nodes. As a result, WSN research has focused on the network lifespan. Although various energy-efficient solutions for extending network lifespan have been studied, different definitions have been given for lifetime based on different topology settings and procedures. The most common definition of a sensor network's lifespan is the time until the first sensor node fails, which seems to be decidedly unhelpful in many possible deployment scenarios [2]. WSN protocols have been compared using different network lifespan criteria, exploring the implications of these metrics as well as their use in objectively measuring the performance of WSN data delivery techniques. In [3], an overview of current WSN advancements, including applicability, design restrictions, and predictive lifetime techniques, was provided. Yıldız et al. [4] used two linear programming (LP) techniques to evaluate the impact of capturing numerous key nodes on WSN life length; the findings indicate that capturing a large number of key nodes in a WSN considerably reduces network life. The challenge
of prolonging the life of dynamically varied WSNs with energy-harvesting (EH) sensors was investigated in [5]. This issue was defined as finding the highest number of covers, where each cover is a subset of the sensors that covers all of the targets to be tracked. The authors designed a mathematical model of the problem and proposed a search algorithm called the harmony search algorithm with multiple populations and local search (HSAML). Routing is a vital activity in wireless sensor networks, and it depends on the deployment and administration of a flexible and efficient network. Both sensor nodes and the complete network must be optimized for energy consumption and resource management to provide effective routing [6]; to extend the WSN's lifespan, a multicriteria routing system with adaptability was proposed. The clustering method has been shown to extend the lifespan of WSNs. Alghamdi [7] developed a clustering approach to choose an optimum CH based on energy, distance, latency, and safety; the CH is chosen via the hybridization of fireflies and dragonflies [8, 9]. The key contribution of the work there is the transformation of the single-objective disjoint set cover problem into a multi-objective problem (MOP). Maximization of the disjoint set cover (DSC) is addressed by increasing the coverage and optimizing the number of DSCs, and scheduling the nodes into a maximum DSC is an NP-hard optimization issue [10]. The authors experimented on a more realistic WSN model to maximize the set cover depending on the target, and a multi-layer genetic algorithm was proposed to determine the utmost set covers having minimum sensors. Depending on the required configurations for different applications, the design of a WSN is very complex, and its dynamic nature mandates finding an optimum network quickly and adaptively [11]. The authors addressed this issue by designing a genetic algorithm-based self-organizing network clustering (GASONeC) framework, an initiative for providing dynamically optimized paths and cluster formation. In any WSN design, the idea behind extending the network lifetime is to avoid the energy depletion of a node before other nodes in a homogeneous network [12]. To reduce network energy usage, a dynamic cluster model is suggested for selecting a good CH. One of the most essential ways to reduce energy exhaustion is to use the right routing technique, which may be improved by using the right CH based on the threshold specifications. Low Energy Adaptive Cluster Hierarchy (LEACH) selects the CH on a rotation basis, and there exists an equal probability for high energy nodes to become the CH. Ayati et al. [13] presented a three-layer routing strategy called super cluster head election using fuzzy logic in three layers (SCHFTL), in which fuzzy logic is utilized to pick a supercluster head from a group of CHs. Practically, nodes are deployed randomly in the region of interest, which makes the sensor density vary in any application [14]; the problem is optimized by finding more set covers to track the targets in active mode. A greedy-based heuristic technique for scheduling sensors to act on events generated in an environment has been suggested to increase network lifetime [15]. To minimize the delay in data transmission, the authors proposed an Optimal Mobility-based Data Gathering (OMDG) approach with the Time-Varying Maximum Capacity Protocol (TMCP); CHEF is applied for cluster construction, and a variety of sinks were deployed to collect data from various regions of the network.
3 Representation of Disjoint Set Cover Problem
Consider a given finite set of targets T to be covered by disjoint subsets of a collection of sensors S deployed in a given region; the problem is to find the maximum number of covers Ci (Ci ⊆ S) such that every target in T is covered by at least one sensor of each cover Ci, while Ci ∩ Cj = ϕ for any two covers. This is an NP-complete problem. As a sample, Fig. 1 shows a network with a set of five sensors and a set of four targets deployed in a region. The relation between the sensors S1 … S5 and the targets T1 … T4 may be denoted as a bipartite graph G = (V, E), where V = S ∪ T and eij ∈ E if Si covers Tj. Figure 2 depicts the bipartite graph of the WSN seen in Fig. 1, with S1 = {T1}, S2 = {T1, T2}, S3 = {T2, T3, T4}, S4 = {T3}, and S5 = {T4}. In this case, the maximum number K of disjoint covers is two: C1 = {S1, S3}; C2 = {S2, S4, S5}. Lifetime enhancement of a WSN is achieved by solving the network coverage problem, and this can be done by determining more set covers, especially by dividing the sensors into disjoint sets that each cover all the targets.
Fig. 1 Example deployment of WSN
Fig. 2 WSN representation using bipartite graph
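The example of Figs. 1 and 2 can be checked mechanically. The short sketch below (hypothetical helper names) verifies that C1 = {S1, S3} and C2 = {S2, S4, S5} are disjoint covers of all four targets and counts them.

```python
# Coverage relation from Fig. 2: sensor number -> set of covered targets.
coverage = {1: {1}, 2: {1, 2}, 3: {2, 3, 4}, 4: {3}, 5: {4}}
targets = {1, 2, 3, 4}

def is_cover(sensors):
    """A group of sensors is a cover if together they reach every target."""
    return set().union(*(coverage[s] for s in sensors)) >= targets

covers = [{1, 3}, {2, 4, 5}]            # C1 and C2 from the example
assert all(is_cover(c) for c in covers)
assert not (covers[0] & covers[1])       # disjointness: no shared sensors
print("K =", len(covers))                # K = 2, matching the example
```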
Fig. 3 A sample representation of a solution with its fitness, showing the sensor numbers grouped into a 1st and 2nd cover and the targets each sensor covers
4 Methodology
4.1 Fitness Value and Solution Representation
Each sensor is given a separate number from 1 to the maximum number of sensors (NS) in the network area while solutions are determined. Figure 3 illustrates a sample solution representation with ten sensors in the region and 4 targets to be covered; the fitness estimation for the set covers in the region is as specified in Fig. 2. In Fig. 3, it can be noticed that 2 disjoint set covers have been discovered. The total number of disjoint covers formed specifies the fitness of the solution; in the given sample, the solution {7, 5, 1, 9, 10, 4, 2, 8, 6, 3} has a fitness of 2, as there are 2 disjoint covers in the region.
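The paper does not spell out how a permutation is decoded into covers, so the following sketch assumes a greedy left-to-right scan: sensors are taken in the order given by the solution and accumulated until all targets are covered, at which point a new cover is started; the number of complete covers is the fitness.

```python
# Fitness of a permutation-encoded solution under an assumed greedy decoder.
def fitness(perm, coverage, targets):
    covers, current, seen = [], [], set()
    for s in perm:
        current.append(s)
        seen |= coverage.get(s, set())
        if seen >= targets:               # current group covers every target
            covers.append(current)
            current, seen = [], set()
    return len(covers)                     # number of complete disjoint covers

coverage = {1: {1}, 2: {1, 2}, 3: {2, 3, 4}, 4: {3}, 5: {4}}
print(fitness([1, 3, 2, 4, 5], coverage, {1, 2, 3, 4}))  # 2 covers
```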
4.2 Locally Corrected Genetic Algorithm
The work is carried out using a genetic algorithm (GA) in which offspring are generated in a more natural way by giving each member an equal chance of becoming a parent. This approach differs from the conventional type of GA, in which parent selection is biased by fitness. Equal possibilities for all members to become a parent are more natural and provide a greater chance to explore. The solution space is explored with a two-point crossover, while tournament selection performs the exploitation for the following generation, as illustrated in Fig. 4. The initial population is produced by a random generator that permutes the integer values from 1 to NS (NS: total number of sensors). Two random parents were chosen from
Fig. 4 Suggested genetic algorithm with a locally corrected operator
the population, and a two-point crossover was used to generate offspring, based on the equal-opportunity criterion defined above. The integer mutation strategy was then used to further alter the children, with the mutation location ranging from 1 to NS. Following a mutation, offspring with multiple mutated locus points may carry the same sensor at several locations at the same time, resulting in an infeasible solution. This condition is rectified by replacing duplicated entries with sensors randomly selected from those absent in the current solution, restoring a feasible permutation. When the parents and the offspring populations are of the same size, the two populations are merged into a pool, and then a tournament selection is applied over it to determine the population for the following generation. Depending on the termination
condition, the previous population will be replaced by the next generation population, and the process is repeated, or a final solution from the next generation is obtained.
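The locally corrected operator can be sketched as a small repair routine (details assumed): duplicated sensor numbers created by crossover or mutation are replaced by randomly chosen sensors missing from the child, so the child becomes a valid permutation of 1 to NS again.

```python
import random

# Repair a child permutation containing duplicate sensor numbers.
def correct(child, ns):
    missing = list(set(range(1, ns + 1)) - set(child))
    random.shuffle(missing)
    seen = set()
    for i, s in enumerate(child):
        if s in seen:                 # second occurrence -> replace it
            child[i] = missing.pop()
        seen.add(child[i])
    return child

print(correct([3, 1, 3, 5, 1], ns=5))  # e.g. [3, 1, 4, 5, 2]
```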
4.3 Differential Evolution (DE)
DE is currently one of the most powerful evolutionary computation techniques for addressing global optimization problems. The DE population includes NP individuals, each of which is a D-dimensional vector corresponding to the D dimensions of the task. During every generation, mutation strategies are applied to generate a D-dimensional donor vector. Two strategies, namely DE/rand/1 as in Eq. (1) and DE/current-to-best/1 as in Eq. (2), have been used in this research. The crossover operator is applied probabilistically to generate the trial vector, as shown in Eq. (3). CR, a crossover control parameter with a range of [0, 1], represents the probability of taking parameters of the trial vector from the mutant vector, and the index $j_{rand}$ is an integer selected at random from the range [1, D]. The target and corresponding trial vectors are then compared for the following generation using a greedy selection procedure, as described in Eq. (4). Rounding was used to arrive at integer values.

$$V_i^{(G)} = X_{r_1}^{(G)} + F \left( X_{r_2}^{(G)} - X_{r_3}^{(G)} \right) \qquad (1)$$

$$V_i^{(G)} = X_i^{(G)} + F \left( X_{best}^{(G)} - X_i^{(G)} \right) + F \left( X_{r_1}^{(G)} - X_{r_2}^{(G)} \right) \qquad (2)$$

$$u_{ij}^{(G)} = \begin{cases} v_{ij}^{(G)} & \text{if } \mathrm{rand}(0,1) \le CR \text{ or } j = j_{rand} \\ x_{ij}^{(G)} & \text{otherwise} \end{cases} \qquad (3)$$

$$x_i^{(G+1)} = \begin{cases} u_i^{(G)} & \text{if } f\big(u_i^{(G)}\big) \le f\big(x_i^{(G)}\big) \\ x_i^{(G)} & \text{otherwise} \end{cases} \qquad (4)$$
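As one illustration, a single DE generation with the DE/rand/1 strategy of Eq. (1), the binomial crossover of Eq. (3), and the greedy selection of Eq. (4) might be sketched as follows; the array layout is an assumption (the paper's experiments were run in MATLAB), and for the K-cover maximization the selection comparison would be reversed.

```python
import numpy as np

# One DE generation: DE/rand/1 mutation, binomial crossover, greedy selection.
def de_generation(pop, f_obj, F=0.5, CR=0.5, rng=np.random.default_rng()):
    NP, D = pop.shape
    for i in range(NP):
        r1, r2, r3 = rng.choice([k for k in range(NP) if k != i], size=3, replace=False)
        v = pop[r1] + F * (pop[r2] - pop[r3])     # Eq. (1): donor vector
        cross = rng.random(D) <= CR                # Eq. (3): binomial crossover
        cross[rng.integers(D)] = True              # force one donor component
        u = np.rint(np.where(cross, v, pop[i]))    # trial vector, rounded to integers
        if f_obj(u) <= f_obj(pop[i]):              # Eq. (4): greedy selection (minimization)
            pop[i] = u
    return pop
```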
5 Experimental Set-Up and Results
The suggested forms of GA and DE were applied to various configurations of a simulated WSN. Sensors have been randomly deployed in 2D space, and targets have been set; these conditions are reproduced by assigning random (x, y) values to the sensors and targets. In the first step, the coverage matrix records whether each individual sensor covers each target, determined from the sensor's Euclidean distance to the target. The coverage matrix construction has
been demonstrated in Fig. 5. The entire simulation procedure was built in a MATLAB environment. The population size was 100, and the number of generations permitted was 100 for both GA and DE. The mutation probability of GA was set to 0.1, while the tournament size was 10%. In DE, the F value was 0.5 and CR was 0.5.
Case-1
Figure 6 depicts a network set-up with the parameters defined in Table 1 for case-1. Looking into the network, it can be seen that some sensors are redundant, covering no target in the region, so they can be eliminated, and the final
Fig. 5 Formation of coverage matrix
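The coverage-matrix step lends itself to a short sketch. The following Python code (array layout and helper names are assumptions; the paper's experiments were run in MATLAB) marks entry (i, j) as 1 when sensor i lies within the sensing range of target j, using Euclidean distance and the deployment sizes of Table 1.

```python
import numpy as np

def coverage_matrix(sensors, targets, sensing_range=20.0):
    # Pairwise Euclidean distances between every sensor and every target.
    d = np.linalg.norm(sensors[:, None, :] - targets[None, :, :], axis=2)
    return (d <= sensing_range).astype(int)

rng = np.random.default_rng(0)
sns = rng.uniform(0, 100, size=(100, 2))   # 100 sensors in a 100 x 100 area
tar = rng.uniform(0, 100, size=(10, 2))    # 10 targets
C = coverage_matrix(sns, tar)
useful = C.any(axis=1).sum()               # sensors covering at least one target
print(C.shape, useful)
```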
Fig. 6 Deployment of network for case-1 (100 nodes and 10 targets); sensor (SNS) and target (TAR) positions plotted over distance in x and y co-ordinates
Table 1 Network set-up parameters

Number of sensors deployed: 100
Number of useful sensors: 86
Number of targets: 10
Sensing range: 20
coverage matrix with useful nodes is shown in Fig. 7, and the corresponding coverage matrix is shown in Table 2. From the table, it can be seen that every sensor considered covers at least a single target in the network. Figure 8 compares the performance of GA and DE in the 100th (most recent) generation against the first generation. At the start, fitness varied from four to six covers, but DE ultimately created a maximum of eight covers while GA found 10 covers. Convergence in fitness is also seen in Fig. 9. The final covers found by DE and GA are presented in Table 3, where it can be seen that all targets may be covered with relatively small numbers of sensors. In the case of DE, certain unused sensors are left over which cannot be reused if necessary.
Fig. 7 Network after applying coverage matrix under case-1 (sensor and target positions over distance in x and y co-ordinates)
Table 2 Coverage matrix determined for the network in case-1

Sensor   Targets 1–10
1        0 0 0 0 1 1 0 0 0 0
2        0 0 0 0 1 1 0 0 0 0
3        0 0 1 1 0 1 0 0 0 0
4        0 0 1 1 1 0 0 0 0 0
5        0 0 0 0 0 0 0 0 1 0
6        0 0 0 0 0 0 0 0 0 1
7        0 0 0 0 0 0 1 0 0 0
8        0 0 0 0 0 0 0 1 0 0
9        1 0 1 0 0 0 0 0 0 0
10       0 0 1 0 0 0 0 0 0 0
11       0 0 1 0 1 1 0 1 0 1
12       1 0 0 0 0 0 0 0 0 0
13       1 0 0 0 0 0 0 0 0 0
14       0 0 0 0 0 0 1 0 0 0
15       0 1 0 0 0 0 0 0 0 0
16       0 0 0 0 0 0 1 0 1 0
17       1 0 0 0 0 0 0 0 0 0
18       0 0 0 1 0 0 0 0 0 0
19       0 0 0 0 0 0 0 0 1 0
20       0 0 0 0 0 0 1 0 1 0
21       1 0 0 0 0 0 0 0 0 0
22       0 0 0 0 0 0 1 0 0 0
23       0 0 1 0 0 0 0 0 0 0
24       0 1 0 0 0 0 0 0 0 0
25       1 0 0 0 0 0 0 0 0 0
26       0 0 1 0 0 0 0 0 0 0
27       0 0 0 0 0 0 1 0 1 0
28       0 1 0 0 0 0 0 0 0 1
29       0 0 0 1 0 0 0 0 0 0
30       0 0 1 1 1 1 0 0 0 0
31       0 0 0 1 0 0 0 0 0 0
32       0 0 0 0 0 0 0 1 0 1
33       1 0 0 0 0 0 0 0 0 0
34       0 0 0 1 0 0 0 0 0 0
35       0 0 0 0 0 0 1 0 0 0
36       0 0 0 1 1 0 0 0 0 0
37       0 0 0 0 0 0 1 0 0 0
38       0 1 0 0 0 0 0 0 0 0
39       0 0 0 0 0 0 1 0 1 0
40       0 1 0 0 0 0 0 0 0 0
41       0 0 0 0 0 0 0 1 0 1
42       0 0 1 0 0 0 0 0 0 0
43       0 0 0 0 0 0 1 0 1 0
44       0 0 0 0 0 0 0 0 1 0
45       1 0 0 0 0 0 0 0 0 0
46       0 0 0 0 0 0 1 0 1 0
47       1 0 0 0 0 0 0 0 0 0
48       0 1 0 0 0 0 0 0 0 0
49       1 0 0 0 0 0 0 0 0 0
50       0 0 1 0 0 1 0 1 0 1
51       0 0 0 0 0 1 0 1 0 1
52       1 0 0 0 0 0 0 1 0 1
53       1 0 0 0 0 0 0 0 0 0
54       0 0 1 1 1 1 0 0 0 0
55       0 0 0 0 0 1 0 0 0 0
56       0 0 0 1 0 0 0 0 0 0
57       0 0 0 0 0 0 0 0 1 0
58       0 0 1 0 0 0 0 0 0 0
59       0 0 0 0 0 0 0 1 0 1
60       0 1 0 0 0 0 0 0 0 0
61       0 0 0 1 1 1 0 0 0 0
62       0 0 0 0 0 0 1 0 1 0
63       0 0 0 0 0 0 1 0 1 0
64       0 0 0 1 0 0 0 0 0 0
65       0 0 0 0 0 0 1 0 1 0
66       0 1 0 0 0 0 0 0 0 0
67       0 0 0 0 0 0 0 1 0 1
68       0 0 0 0 0 0 0 0 1 0
69       0 0 0 1 0 0 0 0 0 0
70       0 0 1 0 0 0 0 0 0 0
71       1 0 0 0 0 0 0 0 0 0
72       1 0 0 0 0 0 0 0 0 0
73       0 0 0 0 0 0 0 0 1 0
74       0 1 0 0 0 0 0 0 0 0
75       1 0 0 0 0 0 0 0 0 0
76       1 0 0 0 0 0 0 0 0 0
77       0 0 0 0 0 0 0 0 1 0
78       0 0 1 0 1 0 0 0 0 0
79       0 0 0 0 0 0 0 1 0 1
80       0 0 1 1 1 1 0 0 0 0
81       0 0 0 1 1 0 0 0 0 0
82       0 0 0 0 0 0 0 1 0 1
83       0 0 0 0 0 0 0 1 0 1
84       0 1 0 0 0 0 0 0 0 0
85       0 0 1 1 1 1 0 0 0 1
86       0 0 1 0 1 1 0 0 0 0
Fig. 8 Solutions under various generations for population and fitness, case-1 (K-cover value of population members in the 1st generation and in the 100th generation for DE and GA)
Fig. 9 Fitness convergence for best solutions in case-1 (K-cover value of KDE and KGA against generation number)
The proposed algorithm is tested for different experimental set-ups under different network configurations, with the number of sensors varied over 100, 200, and 500 and the number of targets set to 10, 10, and 15, respectively, keeping the same sensing range of 20 for all cases. In all the cases, the proposed GA has performed better by determining the maximum number of disjoint set covers, as shown in Table 4.
6 Conclusion
The adoption of a disjoint cover-based approach can save a lot of battery energy in most WSN applications where sensor placement is random and a high number of sensors are dropped. The disjoint covers are quite difficult to find, although heuristic algorithms such as GA have been highly successful. A further advantage of the GA is its direct integer coding, which avoids the conversion error introduced when mapping a continuous variable to an integer value. Over a continuous search space, DE is highly efficient, but its efficiency has been hampered by the integer transformation.
Table 3 Disjoint set covers under DE and GA (for each of the disjoint covers found by DE and by GA, the table lists the sensor numbers assigned to that cover, together with the sensors left unused)
Table 4 Comparison of algorithms with different network set-ups

Case  Sensors deployed  Useful sensors  Targets  Sensing range  Disjoint set covers (DE)  Disjoint set covers (GA)
1     100               86              10       20             8                         10
2     200               128             10       20             14                        15
3     500               442             15       20             26                        28
References 1. Y. Xu, J. Fang, W. Zhu, Differential evolution for lifetime maximization of heterogeneous wireless sensor networks. Math. Probl. Eng. (2013). https://doi.org/10.1155/2013/172783 2. N.H. Mak, W.K.G. Seah, How long is the lifetime of a wireless sensor network? Int. Conf. Adv. Inf. Netw. Appl. 2009, 763–770 (2009). https://doi.org/10.1109/AINA.2009.138.2016 3. H. Yetgin, K.T.K. Cheung, M. El-Hajjar, L.H. Hanzo, A survey of network lifetime maximization techniques in wireless sensor networks. IEEE Commun. Surv. Tutorials 19(2), 828–854, Second quarter 2017, https://doi.org/10.1109/COMST.2017.2650979 4. H.U. Yıldız, B. Tavlı, B.O. Kahjogh, Assessment of wireless sensor network lifetime reduction due to elimination of critical node sets, in 2017 25th Signal Processing and Communications Applications Conference (SIU), 2017, pp. 1–4. https://doi.org/10.1109/SIU.2017.7960228 5. C.C. Lin, Y.C. Chen, J.L. Chen et al., Lifetime enhancement of dynamic heterogeneous wireless sensor networks with energy-harvesting sensors. Mob. Netw. Appl. 22, 931–942 (2017) 6. F. El Hajji, C. Leghris, K. Douzi, Adaptive routing protocol for lifetime maximization in multiconstraint wireless sensor networks. J. Commun. Inf. Netw. 3, 67–83 (2018). https://doi.org/ 10.1007/s41650-018-0008-3 7. T.A. Alghamdi, Energy efficient protocol in wireless sensor network: Optimized cluster head selection model. Telecommun. Syst. 74, 331–345 (2020). https://doi.org/10.1007/s11235-02000659-9 8. B.A. Attea, E.A. Khalil, S. Özdemir et al., A multi-objective disjoint set covers for reliable lifetime maximization of wireless sensor networks. Wirel. Pers. Commun. 81, 819–838 (2015). https://doi.org/10.1007/s11277-014-2159-3 9. M.K. Singh, Discovery of redundant free maximum disjoint Set-k-covers for WSN life enhancement with evolutionary ensemble architecture. Evol. Intel. 13, 611–630 (2020). https://doi.org/ 10.1007/s12065-020-00374-z 10. M.F. Abdulhalim, B.A. Attea, Multi-layer genetic algorithm for maximum disjoint reliable set covers problem in wireless sensor networks. Wirel. Pers. Commun. 80, 203–227 (2015). https://doi.org/10.1007/s11277-014-2004-8 11. X. Yuan, M. Elhoseny, H.K. El-Minir et al., A genetic algorithm-based, dynamic clustering method towards improved WSN longevity. J. Netw. Syst. Manage. 25, 21–46 (2017). https:// doi.org/10.1007/s10922-016-9379-7 12. M. Elhoseny, A.E. Hassanien, Extending homogeneous WSN lifetime in dynamic environments using the clustering model, in Dynamic Wireless Sensor Networks. Studies in Systems, Decision, and Control, vol. 165 (Springer, Cham, 2019). https://doi.org/10.1007/978-3-319-92807-4_4 13. M. Ayati, M.H. Ghayyoumi, A. Keshavarz-Mohammadiyan, A fuzzy three-level clustering method for lifetime improvement of wireless sensor networks. Ann. Telecommun. 73, 535–546 (2018). https://doi.org/10.1007/s12243-018-0631-x
14. J. Sahoo, B. Sahoo, Solving target coverage problem in wireless sensor networks using greedy approach, in 2020 International Conference on Computer Science, Engineering and Applications (ICCSEA) (2020), pp. 1–4. https://doi.org/10.1109/ICCSEA49143.2020.9132907 15. A.R. Aravind, R. Chakravarthi, N.A. Natraj, Optimal mobility based data gathering scheme for life time enhancement in wireless sensor networks, in 2020 4th International Conference on Computer, Communication and Signal Processing (ICCCSP) (2020), pp. 1–5, https://doi. org/10.1109/ICCCSP49186.2020.9315275
Car-Like Robot Tracking Using Particle Filter Cheedella Akhil, Sayam Rahul, Kottam Akshay Reddy, and P. Sudheesh
Abstract This paper proposes a method that tracks a remote-controlled car-like robot in an outdoor environment using the particle filter algorithm. The proposed model uses a sampling-based recursive Bayesian algorithm, which is implemented in the state estimator particle filter object. A geometrical model tracker distinguishes valid from invalid particles through resampling to make a more precise prediction. Commands are sent to the robot, and the robot pose measurement is provided by an on-board global positioning system (GPS). The proposed algorithm predicts the estimated path, and the error has been calculated to evaluate the performance of the tracking model. Keywords Particle filter · State estimator · Bayesian algorithm · State transition function
C. Akhil · S. Rahul · K. A. Reddy · P. Sudheesh (B) Department of Electronics and Communication Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India; e-mail: [email protected]

1 Introduction
Particle filter techniques are an established approach for producing samples of the desired distribution without requiring assumptions about the state-space model or the state distribution [1]. The particle filter uses a set of samples (particles) to represent the posterior distribution of a stochastic process given noisy and partial observations. Unlike a deterministic process, in which there is only one possible realization, a stochastic process has several possible evolutions and is characterized by probability distributions, even when the starting point is known [2]. It suits time series modeling, with a series of random states or variables and measurements taken at discrete times [3]. In the state space formulation, the state vector consists of all information needed to describe the investigated system and is normally multi-dimensional. The measurement vector represents the observations related to the state vector and is commonly of lower dimension than the state vector [3]. There are known motion commands sent to the robot, but the robot cannot
execute the exact commanded motion due to mechanical slack or model inaccuracy. This paper shows how to use the state estimator particle filter to reduce the effect of noise in the measurement data and get a more accurate estimation of the pose of the robot [4]. This paper is structured as follows. An overview of related works is presented in Sect. 2. The sampling-based particle filter tracking framework for the car-like robot, and the detailed algorithm implemented over the framework, are elaborated in Sect. 3. The experimental results are illustrated in Sect. 4. Finally, the conclusion and future works are addressed in Sect. 5.
2 Related Work
The main goal of this paper is to have a low cost and more informative model that has some advantages over the other tracking algorithms that have been studied widely. Many researchers work on tracking objects from limited information, especially pedestrian tracking. For real vehicle tracking scenarios, some of them employ assumptions such as background subtraction, scenes with fixed entrance and exit points, etc. Car-like robot tracking and its behavior analysis using many samples is summarized in [1]. The diversion of pose caused by a roofed area or other obstacles is quite common, and many algorithms have been proposed to resolve the tracking failure brought about by interference along the moving path. Some of them focus on data association, trying to supplement the positions inside the interference area by optimizing the trajectory, etc. [1]. These can still work when the goal is to acquire greater accuracy; however, such strategies usually depend on a fixed line of sight with no obstacles present along the path of the target's poses. The line graph of the robot's path and its visual representation are at greater risk from a noisy and cluttered background [3]. By introducing online learning into the tracking framework, it is possible to update the appearance model when the appearance of the target changes, although such models lack the semantic information that helps in distinguishing the target from other objects. The sampling-based recursive Bayesian model is very popular for detecting and tracking objects. Without any constraint on the global appearance, when a few parts of a target's path are lost because of obstacles, the visible parts still offer valid observations for tracking. Procedures with sampling-based models are more effective at estimating the object state. The compact version works with a very restricted scene; however, this proves the usefulness of the sampling-based model [1]. In plenty of research, sampling-based models are used in object tracking for a given path, mainly focusing on predicting the accurate path of the target or studying the geometric shape of the samples extracted; however, this work is interested in car-like robot tracking [3]. Bayesian filtering strategies have been widely used for the car tracking problem. Methods using the Kalman filter have a low computational complexity, while the
particle filter can represent a broader class of distributions and model nonlinear transformations. A car-like robot has been tracked using a particle filter by modeling trajectories at intersections, where the predicted particles determine the movement pattern. Particles of feature points in 2D have been tracked on a grid map, where weights can be measured by means of the occupancy grid, which distinguishes static from moving elements. Both seek to track a car-like robot in two-dimensional space [4]. To find the mean square error of tracking, the distance between the actual pose and the estimated pose has been calculated for hypothetical samples, and the RMS of those samples was calculated [3].
3 Procedure
The sampling-based car-like robot framework is illustrated in this section. The particle filter algorithm is ideally suited for estimating the state of the car-like robot system, since the particle filter can deal with the inherent nonlinearities; here $\dot{x}$, $\dot{y}$, $\dot{\theta}$, and $\dot{\phi}$ are velocities, given in Eqs. (1)–(4). The motion model of the robot pose $(x, y, \theta)$, of which noisy measurements are taken, is

$$\dot{x} = v \cos\theta \qquad (1)$$

$$\dot{y} = v \sin\theta \qquad (2)$$

$$\dot{\theta} = (v/L) \tan\phi \qquad (3)$$

$$\dot{\phi} = \omega \qquad (4)$$
´ is The linear and angular velocities which are sent to robot are v and ω, and Ø the wheel orientation which is not included in estimation [5]. Car-like robotic drives and modifies its speed and steerage attitude continuously as shown in Eq. (5). The full state of car from observer’s point of view will be, [x, y, θ, x, ˙ y˙ , θ˙ ]
(5)
The pose of the robot is measured by some noisy external system, e.g., a Vicon or a GPS system. Along the path, the robot drives through a roofed area where no measurement can be made. There is a noisy measurement of the robot's partial pose $(x, y, \theta)$, while no measurement is available for the front wheel orientation $(\phi)$ or for any of the velocities $(\dot{x}, \dot{y}, \dot{\theta}, \dot{\phi})$. The linear and angular velocity commands sent to the robot are vc and ωc, respectively. There will be some difference between the commanded motion and the actual motion of the robot. To estimate the partial pose $(x, y, \theta)$ of the car-like robot from the observer's perspective, the full state of the car, given
in Eq. (5), uses the state estimator PF to process the two noisy inputs and make the best estimation of the current pose. At the predict stage, the states of the particles are updated with a simplified, unicycle-like robot model. The system model used for state estimation is not an exact representation of the actual system; this is acceptable as long as the model difference is well-captured in the system noise, as given in Eqs. (1), (2), and (4). At the correct stage, the importance weight (likelihood) of a particle is determined by its error norm from the current measurement, $\sqrt{(\Delta x)^2 + (\Delta y)^2 + (\Delta\theta)^2}$, as there are measurements only on these three components. The model configures the particle filter with 5000 particles. Initially, all particles are randomly picked from a normal distribution with mean at the initial state and unit covariance. Each particle contains six state variables, as shown in Eq. (5); the third variable is marked as circular since it is the car orientation. It is also very important to specify two callback functions, the state transition function and the measurement likelihood function, since these two functions directly determine the performance of the particle filter. Here, the commanded linear and angular velocities to the robot are arbitrarily picked time-dependent functions, and the fixed-rate timing of the loop is realized through rate control, running the loop at 20 Hz for 20 seconds using fixed-rate support; the fixed-rate object is reset to restart the timer right before running the time-dependent code.
1. State Transition Function. The sampling-based state transition function evolves the particles based on a prescribed motion model so that the particles form a representation of the proposal distribution, as in the sketch of a velocity motion model for a unicycle-like robot given below. The parameters sd1, sd2, and sd3 can be varied to see how the tracking performance deteriorates: sd1 represents the uncertainty in the linear velocity, sd2 represents the uncertainty in the angular velocity, and sd3 is an additional perturbation on the orientation.
2. Measurement Likelihood Function. The measurement likelihood function computes the likelihood for each predicted particle based on the error norm between the particle and the measurement; an importance weight is assigned to each particle based on the computed likelihood. Here, the predicted particles form an N × 6 matrix (N is the number of particles), and the measurement is a 1 × 3 vector.
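The paper's example code itself is not reproduced in this volume (its implementation uses a MATLAB-style state estimator object), so the following Python sketch only illustrates the two callbacks described above; the noise parameters sd1, sd2, sd3, the time step dt, and the inverse-error likelihood are assumptions.

```python
import numpy as np

def state_transition(particles, v_cmd, w_cmd, dt, sd1=0.1, sd2=0.1, sd3=0.05,
                     rng=np.random.default_rng()):
    # particles: N x 6 array [x, y, theta, xdot, ydot, thetadot]
    n = len(particles)
    v = v_cmd + sd1 * rng.standard_normal(n)   # noisy linear velocity
    w = w_cmd + sd2 * rng.standard_normal(n)   # noisy angular velocity
    theta = particles[:, 2]
    particles[:, 3] = v * np.cos(theta)        # unicycle velocity model
    particles[:, 4] = v * np.sin(theta)
    particles[:, 5] = w
    particles[:, 0] += particles[:, 3] * dt
    particles[:, 1] += particles[:, 4] * dt
    particles[:, 2] += w * dt + sd3 * rng.standard_normal(n)
    return particles

def measurement_likelihood(particles, measurement):
    # measurement: [x, y, theta]; weight decays with the pose error norm
    err = particles[:, :3] - np.asarray(measurement)
    norm = np.sqrt((err ** 2).sum(axis=1))
    return 1.0 / (norm + 1e-6)                 # larger error -> smaller weight
```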
4 Results
The tracking performance of the particle filter is evaluated for a car-like robot. The particle filter tracks the car as it drives away from the initial pose, as depicted in Fig. 1. The robot moves across the roofed area, as shown in Fig. 2, where no measurement can be made and the particles evolve based only on the prediction model (marked with orange color). The particles gradually form a horseshoe-like front, and the estimated pose gradually deviates from the actual pose. After the robot has moved out of the roofed area, with the new measurements, the estimated pose gradually merges with the actual path.
Fig. 1 Environment where the car-like model travels
Fig. 2 Path of robot on different surfaces
Table 1 Calculation of RMS from values of actual path and estimated path

X-position (m)   Actual Y-position (m)   Estimated Y-position (m)   Error (m)
0                0                       0                          0
2                0.68                    0.68                       0
4                1                       1                          0
6                1.86                    1.89                       0.03
6.25             1.89                    1.93                       0.04
6.5              1.95                    2.05                       0.1
6.75             1.98                    2.22                       0.24
7                2                       2.05                       0.05
8                2.42                    2.42                       0
10               3.05                    3.05                       0
Actual path and estimated path values are read from the resultant output graphs, and the error is calculated at ten instants and tabulated in Table 1. Root mean square (RMS) error = 0.0871779 m
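As a quick check, the RMS of the tabulated errors can be recomputed directly; the exact figure depends on how the errors were rounded when read off the graphs.

```python
import math

errors = [0, 0, 0, 0.03, 0.04, 0.1, 0.24, 0.05, 0, 0]  # Error column of Table 1
rms = math.sqrt(sum(e * e for e in errors) / len(errors))
print(f"RMS error ~= {rms:.4f} m")
```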
5 Conclusion
To cope with the partial observation and varying viewpoints in car-like robot tracking, this research proposes a sampling-based particle filter and evaluates the error between the actual path and the estimated path. The approach brings the sampling-based Bayesian model into a particle filter framework. The path followed by the robot is estimated from the prediction model and the geometric model, ensuring that the estimate does not latch onto irrelevant objects. During prediction of the path, a roofed area is placed on the way of the robot's path, where the estimate slightly deviates from the actual path; this roofed-area environment was used to calculate the error. The comparison of the estimated path with the actual path, i.e., the error, is of great help in obtaining a better observation and an efficient prediction. Qualitative and quantitative analyses show that the proposed algorithm handles obstacles and changes of route in car-like robot tracking.
References 1. J. Hui., Tracking a Self-Driving Car with Particle Filter. A Survey. Published Online in jonathanhui.medium.com, Apr 2018
2. G.M. Rao, C. Satyanarayana, Visual object target tracking using particle filter: A survey. Published Online in MECS. 5(6), 57–71 May (2013) 3. F. Gustafsson, F. Gunnarsson, N. Bergman, U. Forssell, J. Jansson, R. Karlsson, P.-J. Nordlund. Particle filters for positioning, navigation, and tracking, in Proceeding of the IEEE Transactions on Signal Processing, vol. 50(2) (2002) 4. P.L.M. Bouttefroy, A. Bouzerdoum, S.L. Phung, A. Beghdadi, Vehicle tracking using projective particle filter, in Sixth IEEE International Conference on Advanced Video and Signal Based Surveillance (2009), p. 9 5. S.-K. Weng, C.-M. Kuo, S.-K. Tu, Video object tracking using adaptive Kalman filter, J. Vis. Commun. Image Represent. 1190–1208 (2006) 6. K.R. Li, G.T. Lin, L.Y. Lee, J.C. Juang, Application of particle filter tracking algorithm in autonomous vehicle navigation, in CACS International Automatic Control Conference (CACS) Dec 2013, pp. 250–255 7. Y. Fang, C. Wang, H. Zhao, H. Zha, On-road vehicle tracking using part-based particle filter, in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Sept 2017, pp. 694–711 8. P.D. Moral, Nonlinear filtering using random particles. Theory Probab Appl. 40(4), 690–701 (1995) 9. M.S. Arulampalam, M. Simon, G. Neil, C. Tim, A tutorial onparticle filters for online nonlinear/non-gaussian bayesiantracking. IEEE Trans. Signal Process. 50(2) (2002) 10. C.V. Reddy, K.V. Padmaja, Estimation of SNR, MSE and BER with incremental variation of power for wireless channel. Int. J. Eng. Res. Technol. 3(7), Jul (2014) 11. K. Roy, B. Levy, C.J. Tomlin, Target tracking and estimated time of arrival (ETA) prediction for arrival aircraft, in AIAA Guidance, Navigation and Control Conference and Exhibit (2006), pp. 112–183 12. M.G. Muthukrishnan, P. Sudheesh, M. Jayakumar, Channel estimation for high mobility MIMO system using particle filter. In: International Conference on Recent Trends in Information Technology (ICRTIT) (2016), pp. 1–6 13. S.J. Sreeraj, R. Ramanathan, Improved geometric filter algorithm for device free localization, in International Conference on Wireless Communications, Signal Processing and Networking (2017), pp. 914–918 14. S.K. Megha, R. Ramanathan, Impact of anchor position errors on WSN localization using mobile anchor positioning algorithm, in International Conference on Wireless Communications, Signal Processing and Networking (2017), pp. 1924–1928 15. A.S. Chhetri, D. Morrell, A. Papandreou-Suppappola, The use of particle filtering with the unscented transform to schedule sensors multiple steps ahead, in IEEE International Conference on Acoustics, Speech, and Signal Processing (2004), pp. 301–304 16. G. Ignatius, U. Murali Krishna Varma, N.S. Krishna, P.V. Sachin, P. Sudheesh, Extended Kalman filter based estimation for fast fading MIMO channels, in IEEE International Conference on Devices, Circuits and Systems (ICDCS) (2012), pp. 466–469
Secured E-voting System Through Blockchain Technology Nisarg Dave, Neev Shah, Paritosh Joshi, and Kaushal Shah
Abstract Election is the heart and soul of any democratic country. Hence, it is mandatory that elections be conducted in a safe and secure environment, and it is the duty of voters to present themselves on Election Day. However, there are various security issues with present-day election processes, such as fake voting and vote tampering, and there is also the issue of low voter turnout, especially during a pandemic. Such issues can be addressed by blockchain technology. The architecture of blockchain makes it difficult to alter data or votes, or to attack the blockchain itself, which makes it secure and reliable. Furthermore, voters could cast their vote from the comfort of their home, which is helpful in a pandemic and leads to a healthy voter turnout. Blockchain could very well be the future of how a country conducts its elections. Keywords E-voting · Blockchain technology · Android · Smart contract
N. Dave (B) · N. Shah · P. Joshi · K. Shah Pandit Deendyal Energy University, Gandhinagar, Gujrat, India; e-mail: [email protected]

1 Introduction
1.1 What is Blockchain?
Blockchain is a technology designed to work in a peer-to-peer network. In this network, each user is called a node. The nodes are interconnected; hence, they can make transactions with each other. For this, the nodes' public addresses and the amount of assets to be transferred are specified in the transaction, and to confirm a transaction, the sender node presents its private signature. Transactions are stored in groups, and these groups are called blocks. Each transaction is verified by consensus achieved through proof of work [1]. Each block has a unique key that identifies it and also stores its predecessor's key, forming a chain of sequential blocks. Hence, blocks are connected
in the form of a chain, hence the name. Any change to any single block in the chain will lead to a change in its key, which will break the chain from that block onwards. Blockchain provides transparency as well as anonymity. As the blockchain is public, each node can check that everything is as it is supposed to be. Multiple copies of the chain are present among various nodes of the network [2]. So, for someone to manipulate the chain, they would have to attack or gain access to 51% of the network to obtain control over the blockchain, which is very unlikely.
1.2 Election Process
The election process is a critical part of democracy. Every democratic process requires transparency and individual voters' anonymity. Even if the election is for a university representative, if fair means are not practiced, then it may bring trouble for individual voters. According to [3], only 67% of the voters turned up to vote on the day of the election. In a country with a population of 1 billion, 33% of voters not practicing their basic right is not beneficial for that country.
1.3 Why the Need for E-Voting?
According to the reports, the 33% of voters who did not vote either had doubts about casting their votes or were not able to turn up at the voting booth on time. If the voting were held using electronic media, with an assurance that votes are not going to be manipulated, then the voter turnout could have been higher. If such a system were available, more votes would have been cast, which might even have changed the results of the elections.
1.4 Why Use Blockchain for E-Voting?
An election requires transparency, anonymity, and immutability. All of these are features of blockchain, which is why blockchain is a highly appropriate technology for conducting elections. Blockchain is immutable; hence, it is practically impossible to change the results of elections already held.
2 Literature Review
Before starting with the development of the project, we referred to research papers in this domain and reviewed the related work to gather more ideas in this field. Table 1 shows some observations that we made from these research papers.
3 System Model Figure 1 depicts the system model of the Android application.
4 Proposed Method with Implementation
As a solution to our problem statement, we have developed an Android application. It allows the user to vote for their candidate during the election phase and to view the result after the election phase is over. We have developed a smart contract in a programming language called Solidity on the Remix Ethereum IDE and deployed it on the Ethereum blockchain. We use MetaMask as our crypto wallet to store our test ethers, working on the Ropsten Test Network, which is accessible through MetaMask. We have used Firebase for the database and back end operations, and Infura and Web3j for connecting our smart contract with our Android application.
4.1 Tools and Technologies Used
4.1.1 Remix
Remix IDE is an open source desktop and Web application. Solidity contracts can be written right from the browser, using this open source tool. The Remix IDE contains modules for testing smart contracts, debugging them, and deploying them.
4.1.2 Ropsten
Ropsten is a proof-of-work testnet. This means it is the best like-for-like representation of Ethereum.
Table 1 Literature review

[4] • Suggested SHA256 and ECDSA algorithms for security • Implementation of private key + public key infrastructure for privacy • Implementation of digital signature for non-repudiation
[5] • Secret-ballot election scheme for security • Implementation of the Anonymous Veto Protocol (AV-net) for privacy • Added district node and boot node inside a POA network for verification • Suggested 3 blockchain frameworks, namely Exonum, Quorum, and Geth
[6] • Implemented digital signature for privacy and non-repudiation • A system where one cannot get the data regarding votes, i.e., which voter voted for which candidate, until the counting phase begins • Inclusion of a central authority for overlooking the operations
[7] • Implemented SHA256 and ECDSA algorithms for security • Implementation of public key infrastructure for privacy • Implementation of digital signature for non-repudiation • Inclusion of a central authority for overlooking the operations • Proposed a block structure in which one block contains the transaction (vote transfer) details of one voter
[8] • Implemented SHA256 for security • Proposed candidate confidentiality for privacy of candidates • Inclusion of a central authority for overlooking the operations • Proposed a system where each candidate has their own blockchain and the length of the blockchain determines the number of votes acquired
[9] • Implemented blind signature for security • Proposed vote validation and voter confidentiality for voters' privacy • Inclusion of a trusted third party for keeping a check on operations
[10] • Implementation of SHA256 for security • Proposed voters' anonymity and votes' validation • Inclusion of a central authority for overlooking operations • Used a two-link blockchain
[11] • Implementation of SHA256 for security • Implementation of public key infrastructure for privacy • Inclusion of a trusted third party, like an Election Commission, for voters' confidentiality
[12] • Proposed voters' anonymity and votes' validation • Inclusion of an authority to look over the election process and declare results
[13] • A blockchain-based deniable authentication encryption scheme for image data storage is proposed • Use of the cloud has been suggested for returning the ciphertext and trapdoor • Partial private keys, system master keys, and system parameters are used for deniable authentication encryption
Fig. 1 System model flowchart
4.1.3 MetaMask
MetaMask is a software cryptocurrency wallet used to interact with the Ethereum blockchain. It allows users to access their Ethereum wallet through a browser extension or mobile app, which can then be used to interact with decentralized applications.
4.1.4 Android Studio
Android Studio is the official Integrated Development Environment (IDE) for Android app development, based on IntelliJ IDEA. On top of IntelliJ’s powerful code editor and developer tools, Android Studio offers even more features that enhance your productivity when building Android apps.
4.1.5 Kotlin
Kotlin is a cross-platform, statically typed, general-purpose programming language with type inference. Kotlin is designed to interoperate fully with Java, and the JVM version of Kotlin’s standard library depends on the Java Class Library, but type inference allows its syntax to be more concise.
4.1.6 Git
Git is a free and open source distributed version control system designed to track changes in any set of files. It is designed for coordinating work among the programmers who are collaboratively developing source code during software development. It is fast and efficient. It supports distributed and non-linear workflow and also handles data integrity.
4.1.7 Firebase
Firebase is a platform developed by Google that provides detailed documentation and cross-platform SDKs to build and ship apps on Android, iOS, Web, etc.
4.1.8 Infura
Infura is a set of tools for creating an application that connects to Ethereum blockchain. It interacts with Ethereum blockchain and runs nodes on behalf of its users. It is also used by MetaMask which we have used as a wallet.
4.1.9 Solidity
Solidity is a high-level, object-oriented programming language for developing smart contracts. It is used for implementing smart contracts on blockchain platforms, mainly Ethereum.
4.1.10 Smart Contract
Smart contract is a program, also referred to as transaction protocol for blockchain, which automatically executes and controls relevant events and actions as per its terms.
4.2 Methodology

4.2.1 Smart Contract Development
We used Remix IDE for smart contract development. The smart contract was written in Solidity. We created two structures, one for voters and another for candidates. The Voter structure consists of the following parameters:
• voterNum—stores the voter number
• voterId—stores the voter ID given to them
• voterAddress—stores the wallet account address of the voter
• isvoted—stores a Boolean value of either true or false based on whether the voter has voted in the election or not

The Candidate structure consists of the following parameters:
• candidateNum—stores the candidate number
• candidateId—stores the candidate ID provided to them
• candidateAddress—stores the wallet account address of the candidate
• counter—stores the total number of votes the candidate received.

We also specified some modifiers to keep certain functions in check, that is, to restrict the execution of some functions during a specific state or phase of the election. The modifiers that we specified, which can also be seen in Fig. 2, are as follows:
• onlyOwner—This modifier allows only the owner/admin to execute the particular function. No other person can call that function.
• onlyPreElection—This modifier allows the function call only when the election has not yet started, that is, the pre-election phase. It cannot be called once the election has started or finished, that is, entered a phase other than the pre-election phase.
• onlyActiveElection—This modifier allows the function call only when the election is live, that is, the active/live election phase. It cannot be called in any other phase.
• onlyPostElection—This modifier allows the function call only when the election is finished, that is, the post-election phase.

We have implemented two functions to change the phase of the election. Those functions are as follows:
Fig. 2 A screenshot of Remix Ethereum IDE showing a snippet of the Solidity code
• Activate()—This is an owner/admin-only function. It allows the admin to change the election phase to active, that is, make the election live.
• Completed()—This is also an owner/admin-only function. It allows the admin to change the election to the completed phase, that is, finish the election.

We have implemented two functions to check the phase of the election. Both of them are Boolean functions. Those functions are as follows:
• IsActive()—This function allows us to check whether the election is active or not.
• IsCompleted()—This function allows us to check whether the election is completed or not.

We have also implemented two functions, one for adding voters and another for adding candidates. Following are those two functions:
• addVoter()—This function can be called only before the election, as specified by the onlyPreElection() modifier. It allows us to add a voter by taking in three parameters, namely voterId, voterAddress, and isvoted.
• addCandidate()—This is an admin-only function. It allows the admin to add a candidate by taking in two parameters, namely candidateAddress and candidateId.

We have implemented another function named Vote(), which allows a voter to vote for a candidate. This function can be called only when the election is live, as specified by the onlyActiveElection modifier. It first checks whether the voter has already voted, so as to meet the criterion of allowing one vote per voter. If the voter has not voted yet, the vote is transferred to the candidate of the voter's choice. The voter must confirm this transfer of vote
through his/her account (account on MetaMask) and pay the required fee to perform the transaction. Lastly, we have implemented a result() function to declare the result of the election. This function can be called only when the election is over, as specified by the onlyPostElection modifier. This function will search for the candidate with the highest number of votes and will return the winner candidate’s ID and the number of votes that candidate acquired.
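The full contract is not reproduced in the text (a snippet appears in Fig. 2), so the following minimal Solidity sketch is an illustration only: it is built from the structure, modifier, and function names described above, but the Phase enum, the mapping/array storage layout, and the omission of addVoter(), addCandidate(), and result() are our assumptions, not the paper's exact code.

// Minimal sketch of the described election contract (assumed layout, Solidity ^0.8).
pragma solidity ^0.8.0;

contract Election {
    struct Voter { uint voterNum; string voterId; address voterAddress; bool isvoted; }
    struct Candidate { uint candidateNum; string candidateId; address candidateAddress; uint counter; }

    enum Phase { PreElection, Active, PostElection }  // assumed phase representation
    Phase public phase = Phase.PreElection;
    address public owner;
    mapping(address => Voter) public voters;
    Candidate[] public candidates;

    constructor() { owner = msg.sender; }

    modifier onlyOwner() { require(msg.sender == owner, "admin only"); _; }
    modifier onlyPreElection() { require(phase == Phase.PreElection, "pre-election only"); _; }
    modifier onlyActiveElection() { require(phase == Phase.Active, "live election only"); _; }
    modifier onlyPostElection() { require(phase == Phase.PostElection, "post-election only"); _; }

    function Activate() public onlyOwner onlyPreElection { phase = Phase.Active; }
    function Completed() public onlyOwner onlyActiveElection { phase = Phase.PostElection; }

    // One vote per voter, enforced by the isvoted flag.
    function Vote(uint candidateIndex) public onlyActiveElection {
        require(!voters[msg.sender].isvoted, "already voted");
        voters[msg.sender].isvoted = true;
        candidates[candidateIndex].counter += 1;
    }
}

The modifiers make the phase rules compile-time visible: any function tagged onlyActiveElection, such as Vote(), simply cannot be executed before Activate() or after Completed().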
4.2.2 Smart Contract Deployment
After the smart contract is compiled, the next step is to deploy it. The first step is to select the environment as "Injected Web3," as we will be using web3j for connecting our smart contract with our Android application. Next, it will ask us to connect to our MetaMask wallet so that we can deploy our smart contract on the Ethereum blockchain, for which we will need to pay fees. After the connection is established, we can see our account address (derived from the public key of our account) and the amount of ethers present in our account. Note that these are test (or fake) ethers on the Ropsten Test Network. The Gas Limit is established, as shown in Fig. 3, and we can then click on the Deploy button and deploy our smart contract.
4.2.3 Android Development
Android development is the core part of this entire project, and the Android application is the face of the project. Publicly deployed software is supposed to be easy to use; expecting voters to perform complex tasks would only reduce the adoption of E-voting. We therefore developed a user-friendly Android application that is connected to the blockchain. Kotlin is used for development, and the web3j and Infura dependencies are included for connecting the smart contract. For storing the information of users and candidates, the Android application is connected to the Firebase real-time database, which is a NoSQL database. For authenticating users via OTP, or the admin via email and password, we have used FirebaseAuth. Figure 4 is a screenshot of a code snippet we wrote for the Android application in Android Studio.
Fig. 3 A screenshot of Remix Ethereum IDE showing deployment details
4.2.4 Need for MetaMask and Ethereum
MetaMask: MetaMask is a crypto wallet which allows us to store ethers and conduct transactions. It also supports various test networks, namely Ropsten, Kovan, Rinkeby, and Goerli alongside its Mainnet network. It also allows us to create our own custom network.
Fig. 4 A screenshot of Android Studio showing a snippet of code used for Android application development
Here, for our project, we have used the Ropsten Test Network and some test ethers for carrying out transactions. For deploying our contract on the Ethereum blockchain, we need to pay a certain amount as a fee. Hence, we need to connect our MetaMask account to the Remix Ethereum IDE and confirm this transaction so as to deploy our smart contract. Furthermore, for a voter to vote, he/she is required to pay a certain amount as a fee to confirm the vote and needs a MetaMask account to carry out this transaction. Ethereum: Ethereum provides a great development environment. Solidity, one of the most popular and frequently used programming languages for smart contracts, is supported by Ethereum. It also allows the use of test ethers for deploying smart contracts on the Ethereum blockchain. Since these are test ethers, they have no market value and can be obtained for free, which is quite helpful for projects.
4.2.5 Setting up Backend
We need to manage the voters' and candidates' data. It could be stored on the blockchain too, but doing so would increase blockchain traffic and require gas for every write, which also carries an environmental cost. We therefore implemented the backend in Firebase, which provides excellent tools for user authentication, storage, and databases.
4.2.6 Integrating Application with Backend and Blockchain
Blockchain is the ledger where all the important data and records of transactions are stored. All transactions are recorded and handled by the smart contract developed in Solidity, and the smart contract's methods are called from the Android application. Once the Solidity file is deployed from Remix, it is saved locally inside the Android project. Since Kotlin interoperates with Java, the .sol file is converted into a .java wrapper class. After saving the .sol file inside the project directory, solc, a Node package, is used for extracting data from the Solidity file.
Command: solcjs <filename>.sol --bin --abi --optimize -o <path/to/directory>
After executing the above command, the .bin and .abi files are generated. Using these two files, a .java file is generated containing all the methods inside classes and constructors.
Command: web3j generate solidity -b <path/to/bin/file>.bin -a <path/to/abi/file>.abi -o <path/to/dir> -p <name of package>
After executing the above command, the .java file is generated with all the methods of the smart contract. This Java class is then loaded inside the Kotlin files, and its methods are called accordingly.
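The paper does not show the Kotlin call site, so the following hedged sketch only illustrates how a generated wrapper is typically loaded and invoked with web3j. Election is the hypothetical wrapper class generated above, and the Infura project ID, private key, and contract address are placeholders.

import org.web3j.crypto.Credentials
import org.web3j.protocol.Web3j
import org.web3j.protocol.http.HttpService
import org.web3j.tx.gas.DefaultGasProvider
import java.math.BigInteger

fun castVote() {
    // Connect to the Ropsten network through Infura (placeholder project ID).
    val web3 = Web3j.build(HttpService("https://ropsten.infura.io/v3/<PROJECT_ID>"))
    // The voter's wallet credentials (placeholder private key).
    val credentials = Credentials.create("<PRIVATE_KEY>")
    // Load the wrapper generated from the .bin/.abi files above (hypothetical class).
    val contract = Election.load("<CONTRACT_ADDRESS>", web3, credentials, DefaultGasProvider())
    // Invoke the smart contract's Vote() method; web3j signs and sends the transaction.
    val receipt = contract.Vote(BigInteger.ZERO).send()
    println("Vote recorded in transaction ${receipt.transactionHash}")
}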
4.3 Features

4.3.1 User Authentication
The user is supposed to verify themselves for logging into the application with two-step authentication. Those two steps are as follows:
• Password—The user will have to enter the correct password, known only to the user, to log in to our application. The password is stored in the Firebase real-time database as a hash digest.
• OTP—The user will also have to enter a one-time password, which is sent to their registered mobile number, in order to log in.

After entering these two parameters, the user is allowed to log in to the application and will be able to vote provided the election is live.
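The paper does not reproduce the FirebaseAuth code; as an illustration only, the standard phone-number verification flow looks roughly like the sketch below. The activity, phone number, and callback bodies are placeholders.

import com.google.firebase.FirebaseException
import com.google.firebase.auth.FirebaseAuth
import com.google.firebase.auth.PhoneAuthCredential
import com.google.firebase.auth.PhoneAuthOptions
import com.google.firebase.auth.PhoneAuthProvider
import java.util.concurrent.TimeUnit

fun sendOtp(activity: android.app.Activity, phoneNumber: String) {
    val callbacks = object : PhoneAuthProvider.OnVerificationStateChangedCallbacks() {
        override fun onVerificationCompleted(credential: PhoneAuthCredential) {
            // Auto-retrieval succeeded; sign the voter in with the credential.
            FirebaseAuth.getInstance().signInWithCredential(credential)
        }
        override fun onVerificationFailed(e: FirebaseException) {
            // OTP could not be sent or verified; block the login attempt.
        }
    }
    val options = PhoneAuthOptions.newBuilder(FirebaseAuth.getInstance())
        .setPhoneNumber(phoneNumber)       // registered mobile number of the voter
        .setTimeout(60L, TimeUnit.SECONDS) // SMS code validity window
        .setActivity(activity)
        .setCallbacks(callbacks)
        .build()
    PhoneAuthProvider.verifyPhoneNumber(options) // triggers the OTP SMS
}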
4.3.2 User Confidentiality
User confidentiality and privacy are important factors not just in a blockchain network but in any network [14, 15]. Just as in real-life scenarios, where data such as which voter
voted for which candidate is kept confidential, here also we provide a similar feature. In blockchain technology, transactions are kept anonymous; that is, one cannot trace back the actual person who initiated a transaction. Here also, when a voter transfers a vote to a candidate, there is no way to trace back that transaction and find out who the voter is in real life. The only public details are the public key of the voter and that of the candidate. The public key does not contain any information regarding the actual identity of the voter, like their real name or address, and such information cannot be derived from the public key. Hence, by knowing the public key, one cannot deduce who voted for whom, which is required for maintaining voter confidentiality.
4.3.3 Security Features
Every block of the blockchain is cryptographically linked with its previous block, as it stores the hash of the previous block. The hashing function used is SHA256, a standard cryptographic hash function and hence a one-way function. Once a hash digest of the original data has been created, it is computationally infeasible to reverse the process and obtain the original data, and changing even a single bit of the original data results in a completely different hash digest. Hence, it is not possible to change the data of a block without breaking the chain. Therefore, an attacker cannot change the data of a block and change a voter's vote to some other candidate. This prevents stealing of votes and changing of votes in a particular candidate's favor.
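The avalanche property described above is easy to demonstrate. The following illustrative Kotlin snippet is not part of the paper's code and the ballot strings are made up; it hashes two ballots that differ in a single character and prints two completely different digests.

import java.security.MessageDigest

// Hex-encode the SHA-256 digest of a string.
fun sha256Hex(s: String): String =
    MessageDigest.getInstance("SHA-256").digest(s.toByteArray())
        .joinToString("") { "%02x".format(it) }

fun main() {
    // One changed character yields an unrelated digest, which is why a stored
    // block hash immediately exposes any tampering with a recorded vote.
    println(sha256Hex("voter42 -> candidate1"))
    println(sha256Hex("voter42 -> candidate2"))
}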
5 Summary and Future Works Various research papers have addressed a similar problem statement to ours; they are discussed in the Literature Review section of this report. One of the main notable differences between those proposed works and ours is that, unlike the rest, we have actually developed an application, namely an Android application, for this problem statement. There is one more notable work in this direction, a project by Chrystopher Borba titled "Smart Contract example for elections in blockchain platform." They successfully created and deployed a smart contract to address this problem statement; however, there is no user-end application, like an Android, iOS, or Web application, whereas we have developed an Android application. Following are the future plans regarding this project:
• Conduct a small-scale election through this application before deploying it to the Google Play Store.
• Increase the reach of the project by developing a web-based solution and an iOS-compatible application.

We have developed a working model for conducting a safe election. Voters can cast their vote from the safety and comfort of their home, which is beneficial in a pandemic era and would otherwise be difficult with the traditional election process. The major issue of fake/false voting in the traditional election process is also solved to a great extent through this solution. There are still problems to be addressed: expanding the reach of our project and conducting a large-scale election is a good challenge in itself. Hence, we are working on both challenges and will develop applications for Web and iOS devices too.
References

1. C. Lepore, M. Ceria, A. Visconti, U.P. Rao, K.A. Shah, L. Zanolini, A survey on blockchain consensus with a performance comparison of PoW, PoS and pure PoS. Mathematics 8(10), 1782 (2020)
2. V. Patel, F. Khatiwala, K. Shah, Y. Choksi, A review on blockchain technology: Components, issues and challenges, in ICDSMLA 2019 (Springer, Singapore, 2020), pp. 1257–1262
3. A. Gunasekar, C. Srinivasan, Voter turnout, in Lok Sabha Polls 2019 Highest Ever: Election Commission, 2019. https://www.ndtv.com/india-news/general-elections-2019-record-voter-turnout-of-67-11-per-cent-in-lok-sabha-polls-2041481
4. R. Hanifatunnisa, B. Rahardjo, Blockchain based E-voting recording system design (2017)
5. F.Þ. Hjálmarsson, G.K. Hreiðarsson, Blockchain-based E-voting system
6. F.S. Hardwick, A. Gioulis, R.N. Akram, K. Markantonakis, E-Voting with blockchain: An E-Voting protocol with decentralization and voter privacy (2018)
7. H. Yi, Securing e-voting based on blockchain in P2P network. J. Wirel. Com. Netw. 2019, 137 (2019). https://doi.org/10.1186/s13638-019-1473-6
8. A.B. Ayed, A conceptual secure blockchain-based electronic voting system. Int. J. Net. Secur. Appl. (IJNSA) 9 (2017)
9. Y. Liu, Q. Wang, An E-Voting protocol based on blockchain (2017)
10. F. Fusco, M.I. Lunesu, F. Pani, A. Pinna, Crypto-voting, a blockchain based e-voting system, 223–227 (2018). https://doi.org/10.5220/0006962102230227
11. R. Ganji, B.N. Yatish, Electronic voting system using blockchain, in Dell EMC Proven Professional Knowledge Sharing (2018)
12. R. Taş, Ö.Ö. Tanrıöver, A systematic review of challenges and opportunities of blockchain for E-Voting. Symmetry 12(8), 1328 (2020). https://doi.org/10.3390/sym12081328
13. C.V. Joe, J.S. Raj, Deniable authentication encryption for privacy protection using blockchain. J. Artif. Intell. Capsule Netw. 3(3), 259–271 (2021)
14. K.A. Shah, D.C. Jinwala, Privacy preserving, verifiable and resilient data aggregation in grid-based networks. Comput. J. 61(4), 614–628 (2018)
15. K. Shah, D. Jinwala, Privacy preserving secure expansive aggregation with malicious node identification in linear wireless sensor networks. Front. Comp. Sci. 15(6), 1–9 (2021)
A Novel Framework for Malpractice Detection in Online Proctoring Environment Korrapati Pravallika, M. Kameswara Rao, and Syamala Tejaswini
Abstract As a result of both the pandemic and the advantages of remote examination, online exams and job interviews have become popular and necessary. These systems are being used by the majority of companies and academic institutions for both recruitment and online exams. Conducting exams in a safe environment is one of the main challenges of remote examination systems, which operate without a live proctor at a physical location. In this study, a new virtual assessment system uses deep learning to continuously monitor candidates. This work presents a pipeline for online interview and exam fraud analysis. Only a video of the candidate, recorded during the exam, is required by the system. Face detection, recognition, object detection, and tracking algorithms are part of this work. The proposed work focuses on techniques for identifying malpractice during online exams. Keywords Online learning · Online proctor · Student authentication · Face detection · Face recognition · Multi-person detection
1 Introduction In society, today, online exams and interviews have become popular. A pandemic is one justification for this kind of workforce, but another important reason is that both people and enterprises have the opportunity to properly engage their time and attention. Instead of appearing for interviews in physical offices, candidates may do it at any time and from anywhere in the world by simply browsing online. In this manner, online interviews allow the hiring task to go even more easily and effectively. Exams are a common method of checking people’s knowledge of specific subjects. Therefore, proctoring formative assessments in a safe atmosphere has now become popular and is essential for the exams’ legality. According to [1], approximately 74% of participants believed they can easily cheat in online exams. It simply entails taking a look at a single person or a set of individuals during every exam to K. Pravallika (B) · M. K. Rao · S. Tejaswini Department of Electronics and Computer Engineering, KLEF, Vaddeswaram, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_20
prevent cheating, even if the pandemic is no longer present. Among the biometric approaches employed by the system is facial recognition using the HOG face detector and the OpenCV face recognition algorithm. Recognition of multiple persons using OpenCV-based libraries is one of the key aspects of malpractice detection. The system is tested using SSD approaches and the COCO dataset and is put into practice as a software system.
2 Related Works The field of education has changed dramatically in the last two years. Even the most traditional institutions have been pushed to shift to online-based learning by the COVID-19 outbreak. As a result, the demand for online proctoring has grown. However, due to their hurried development, most software packages lack critical functionalities and settings or are not as user friendly as they should be. Considering that the vast majority of people are familiar with devices, an easily accessible, basic interface is critical. Several vendors sell paid versions of proctoring systems. One reported system continuously verifies the user's face, voice, touch, mouse, and laptop as part of a continuous verification scheme. Another builds and deploys a network firewall system that uses network address translation, a demilitarized zone, a virtual private network, and other firewall techniques for threat detection. Deep learning-based face identification is also a solution for detecting the face in online tests. Another virtual exam system uses fingerprint and eye-gaze technology; it also records the cheating and anti-cheating status of participants and is used to assess the recommended methods. The online proctoring system has recently become a challenge for researchers and engineers because, with COVID-19, the demand for online proctoring increased rapidly, and paid versions of proctoring systems covering the entire process from the start of the test until the end of the exam were promoted by several industrial enterprises. Artificial intelligence (AI) has altered our environment and lifestyle by introducing intelligence into many aspects of everyday life [2, 3], but with a few limitations [4]. Machine learning (ML) has driven innovation in various industries [5], although it has its drawbacks. Education, transportation [6], communication networks, disaster management, smart cities, and many more fields have benefited from machine and deep learning innovations. Virtual exams encounter multiple challenges; [7] discussed the online exam's various difficulties and proposed an alternative that included grouping client hostnames or IPs for a specific area and time, as well as biometric authentication technologies such as face recognition and fingerprints [8], and AI has the ability to completely transform proctoring and online learning.
3 Proposed Model There are two parts to the proposed web-based online proctoring system. The suggested design for the online proctoring system is shown in Fig. 1. During the exam, the online proctoring program verifies the examinee's identity and protects against unethical behavior. Even before the test begins, the application ensures that the participant has access to a screen that combines video and audio recording, and the examination does not begin until the proctors have verified the participant's identity. The main procedure starts with online proctoring, in which a student takes an exam at a monitor while being observed through the student's camera. The webcam captures frames over the HTTPS protocol, and three types of techniques are applied: face detection, which detects the face using facial landmarks; face recognition, which determines whether the face matches the student; and head pose estimation, which estimates the student's head movement and angle using facial landmarks and supporting functions and algorithms. When an event is detected, it is saved in a CSV file. This method is followed throughout the exam, which is proctored continuously during this procedure. The data are saved in a database as a CSV file, and this procedure concludes once the entire examination has been completed.
Fig. 1 Proposed architecture
Algorithm 1 Master algorithm
1: start process onlineProctor
2: while True do
3:   frame ← capture the frame from the student's webcam
4:   faces ← faceDetection(frame, method = "HOG")
5:   if number of faces == 0 then
6:     cancel exam: person not found
7:   else if number of faces > 1 then
8:     stop the exam: another person detected in the frame
9:   else
10:    faceMatch ← faceRecognition(frame)
11:    if faceMatch == False then
12:      cancel exam: unjustified face
13:    else
14:      continue exam
15:    end if
16:  end if
17: end while
18: end process
Face Recognition Face recognition is the most extensively used biometric for Internet authentication. Intel created the OpenCV computer vision library in 1999. Image representation, image operations, and local binary pattern histograms (LBPH) are some of the face recognition techniques supported by OpenCV. In the suggested methodology, a photograph of the participant is taken as the input, and HOG methods are used to recognize the face landmarks in the image. The landmarks of the identified face are estimated, distances such as the spacing of the eyes and the size of the lips are measured, and the photo is compared with the faces already identified and saved in our database. Algorithm 2 shows the facial recognition pseudocode. A face recognition algorithm uses biometrics to map facial features from a photograph or video. The geometry of the face is read by the facial recognition software: the distance between the eyes and the distance from forehead to chin are important considerations. The software recognizes facial landmarks; the system recognizes 68 points on a face, which play a crucial role in differentiating one face from another.
Algorithm 2 Recognition of face
1: procedure FaceIdentification(id, name, password)
2: while True do
3:   frame ← present frame
4:   faceLocations ← get all face locations
5:   faceEncodings ← get all face encodings
6:   faceMatch ← compare all detected faces with the student's stored face
7:   if faceMatch == True then
8:     face is the same
9:   else
10:    face is not the same
11:  end if
12: end while
13: end procedure
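As an illustration of this matching step (not the authors' code), the widely used face_recognition library, which wraps dlib's HOG detector and 68-point landmark model, implements the whole loop in a few lines; the file names below are placeholders.

import face_recognition

# Enrolled photo of the student and the current webcam frame (placeholder files;
# in the live system the frame would come from the webcam stream).
enrolled = face_recognition.load_image_file("student_enrolled.jpg")
frame = face_recognition.load_image_file("current_frame.jpg")

# HOG-based detection, then one 128-dimensional encoding per detected face.
enrolled_encoding = face_recognition.face_encodings(enrolled)[0]
frame_encodings = face_recognition.face_encodings(frame)

if len(frame_encodings) == 0:
    print("Cancel exam: person not found")
elif len(frame_encodings) > 1:
    print("Stop exam: another person detected in the frame")
elif face_recognition.compare_faces([enrolled_encoding], frame_encodings[0])[0]:
    print("Continue exam: face matches")
else:
    print("Cancel exam: unjustified face")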
Object Detection Object detection is a computer vision technique that deals with detecting instances of semantic objects of a certain class (such as individuals, houses, or automobiles) in digital photographs and videos. Human detection and recognition are two well-studied applications of convolutional neural networks, and detection may be used in a wide range of computer vision tasks, such as image retrieval and video surveillance. For object recognition, a pre-trained MobileNet SSD model with three classes (body, mobile phone, and laptop) from the COCO dataset is used to recognize a person, and it is applied to every frame Ft. In a similar way to face detection, object detection finds the person count with the help of MobileNet SSD, given a threshold value, and the COCO dataset is used to find the objects present in the video. Algorithm 3 shows the object detection pseudocode. During training, MobileNet SSD uses a matching phase to associate the appropriate anchor box with each ground-truth object in the frame; generally, the anchor box having the greatest overlap with an object is in charge of predicting that object's class and location.
Algorithm 3 Object detection algorithm
1: procedure ObjectDetection(loginId, name, password)
2: while True do
3:   capture the frame from the participant's webcam
4:   frame ← present frame
5:   face ← detect face in frame
6:   objects ← object recognition on frame
7:   if a forbidden object is found then
8:     cancel exam
9:   else
10:    continue the exam
11:  end if
12: end while
13: end procedure
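A sketch of this detection step with OpenCV's DNN module is shown below, with two caveats: the weight file names are placeholders, and the commonly distributed Caffe MobileNet SSD is VOC-trained, where "person" is class 15; the paper uses a COCO-trained variant, so the class index here is an assumption for illustration.

import cv2

# Load a pre-trained MobileNet SSD (placeholder weight files).
net = cv2.dnn.readNetFromCaffe("MobileNetSSD_deploy.prototxt",
                               "MobileNetSSD_deploy.caffemodel")

frame = cv2.imread("current_frame.jpg")  # placeholder for a webcam frame
blob = cv2.dnn.blobFromImage(cv2.resize(frame, (300, 300)),
                             0.007843, (300, 300), 127.5)
net.setInput(blob)
detections = net.forward()  # shape: [1, 1, N, 7] per detection row

# Count persons above a confidence threshold (class 15 = "person" in VOC).
persons = 0
for i in range(detections.shape[2]):
    confidence = detections[0, 0, i, 2]
    class_id = int(detections[0, 0, i, 1])
    if confidence > 0.5 and class_id == 15:
        persons += 1
print("persons in frame:", persons)

The same loop serves both object detection (flagging a phone or laptop class) and the multi-person check in the next subsection (persons > 1 implies malpractice).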
Multi-person Detection As previously mentioned, the COCO dataset and MobileNet SSD, a deep neural network algorithm, are used to detect multiple objects present in the image. MobileNet SSD (single-shot object detection) uses MobileNet as its base network and can identify objects rapidly. It takes the input images stored in the database during the examination and checks every participant's frame; whenever another person appears on the screen, it is flagged as malpractice. With the help of a deep convolutional neural network, it classifies 80 object categories and is very fast. It has 53 convolutional layers, each followed by a batch normalization layer. SSD needs only a single pass over the image to identify multiple objects in the frame. Algorithm 4 shows the multi-person detection pseudocode.
Algorithm 4 Multi-person detection
1: procedure CountPersons()
2:   cap ← cv2.VideoCapture(0)
3:   detector ← dlib.get_frontal_face_detector()
4:   while True do
5:     frame ← cap.read()
6:     frame ← cv2.flip(frame)
7:     gray ← cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
8:     faces ← detector(gray)
9:     for face in faces do
10:      x, y ← face.left(), face.top()
11:      x1, y1 ← face.right(), face.bottom()
12:      cv2.rectangle(frame, (x, y), (x1, y1), (0, 255, 0), 2)
13:    end for
14:    cv2.imshow("frame", frame)
15:    if cv2.waitKey(1) signals exit then break
16:  end while
17:  cap.release()
18:  cv2.destroyAllWindows()
19: end procedure
4 Results Object Detection (Mobile Phone) and Multiple Person Detection are shown in Figs. 2 and 3.
5 Conclusion In this work, both methodologies were combined to create an effective pipeline for online interviews and assessments. To keep the pipeline fast, the system's inputs are restricted to visual data (audio data is avoided), and the cheating analysis outputs are limited to three malpractice events: the presence of another individual, the presence of a device, and the absence of the candidate. The model is implemented on OpenCV, and the face detection and identification algorithms were developed with the help of dlib and HOG methods. A fraud identification mechanism is presented in this paper whose primary goal is to provide secure and reliable testing; discussions and assessments can be done online in this situation. It requires only a short video of the person, which can be recorded using an integrated webcam, as data. As a consequence, it is a fast and easy method to
Fig. 2 Object Detection (mobile phone)
Fig. 3 Multiple Person Detection
implement: face identification, recognition, monitoring, and object detection are all included within the pipeline. Another person, a device, and absence were identified as the three key cheating acts. Additional work that could enhance the effectiveness of our system would include a speech-processing module, as most cheating instances involve talking or are accompanied by suspicious voice behavior. The addition of eyeball detection, to analyze the participant's eyeball motion, is yet another enhancement; these two modules have the potential to significantly improve the efficiency of our system. The trials were carried out on a unique dataset consisting of three videos depicting real-life cheating acts. As a further step, features like audio analysis and eye prediction will be added to the work.
References

1. F. Alam, A. Almaghthawi, I. Katib, A. Albeshri, R. Mehmood, Response: An AI and IoT-enabled framework for autonomous COVID-19 pandemic management. Sustainability 13(7), 3797 (2021). [Online]. Available: https://www.mdpi.com/2071-1050/13/7/3797
2. E. Bilen, A. Matros, Online cheating amid COVID-19. J. Econ. Behav. Organ. 182, 196–211 (2021)
3. Y. Atoum, L. Chen, A.X. Liu, S.D. Hsu, X. Liu, Automated online exam proctoring. IEEE Trans. Multimedia 19(7), 1609–1624 (2017)
4. S. Prathish, K. Bijlani, An intelligent system for online exam monitoring, in 2016 International Conference on Information Science (ICIS) (IEEE, 2016), pp. 138–143
5. H.S. Asep, Y. Bandung, A design of continuous user verification for online exam proctoring on M-learning, in 2019 International Conference on Electrical Engineering and Informatics (ICEEI) (IEEE, 2019), pp. 284–289
6. A.K. Pandey, S. Kumar, B. Rajendran, B.S. Bindhumadhava, e-Parakh: Unsupervised online examination system, in 2020 IEEE Region 10 Conference (TENCON) (IEEE, 2020), pp. 667–671
7. T. Yigitcanlar, R. Mehmood, J.M. Corchado, Green artificial intelligence: Towards an efficient, sustainable and equitable technology for smart cities and futures. Sustainability 13(16), 8952 (2021). [Online]. Available: https://www.mdpi.com/2071-1050/13/16/8952
8. Y. Atoum, L. Chen, A.X. Liu, S.D. Hsu, X. Liu, Automated online exam proctoring. IEEE Trans. Multimedia 19(7), 1609–1624 (2017)
Frequency Reconfigurable of Quad-Band MIMO Slot Antenna for Wireless Communication Applications in LTE, GSM, WLAN, and WiMAX Frequency Bands B. Suresh, Satyanarayana Murthy, and B. Alekya Abstract This study proposes a frequency-reconfigurable, quad-band MIMO slot antenna covering various L- and S-band frequencies for wireless communication applications. The proposed quad-band antenna comprises 4 PIN diodes and is electronically switchable among the LTE, GSM, WLAN, and WiMAX frequency bands, namely 1.9, 2.6, 2.4, and 3.8 GHz. The antenna is characterized in the operating bands in terms of different performance parameters, i.e., S-parameters, far-field radiation characteristics, voltage standing wave ratio, and surface current distribution. In the operating bands, it has a highest gain of 4.6 dBi. A meander-line resonator is placed between the antennas to achieve high isolation, which improves the working of the quad-band antenna. The proposed antenna was fabricated on an FR4 substrate with 4 PIN diodes. The gain of the proposed antenna is improved from 2.46 to 4.62 dBi, the number of frequency bands is increased from three to four, and the radiation efficiency obtained is 50%. Keywords Frequency reconfigurable · PIN diode · Slot antenna · Wireless communication applications · Quad-band frequency operation · Multiple-input multiple-output (MIMO) · Diversity gain (DG) · Envelope correlation coefficient (ECC) · Total active reflection coefficient (TARC) · Mean effective gain (MEG) · Channel capacity loss (CCL)
1 Introduction The developments in current wireless communication systems have enhanced the use of antenna-based sensor technology in satellite systems, radars, robotics, healthcare, wireless communication [1], etc. Patch antennas have been considered for use as temperature sensors [2] and sensors for soil moisture [3]. MIMO antennas have recently been utilized as microwave sensors in Worldwide Interoperability for Microwave Access (WiMAX), mobile network frequency bands, Bluetooth, and B. Suresh · S. Murthy (B) · B. Alekya Department of ECE, V R Siddhartha Engineering College, Vijayawada, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_21
Wireless Local Area Network (WLAN). To obtain high data rates, this MIMO technology is integrated into Long-Term Evolution (LTE), Global System for Mobile Communications (GSM), WLAN, and WiMAX networks. The operating frequencies of LTE, WLAN, GSM, and WiMAX are 1.9 GHz, 2.4 GHz, 2.6 GHz, and 3.8 GHz, respectively. A compact frequency-reconfigurable antenna may be utilized in place of multiple antennas to work in various frequency bands, considering its applications at various times and places. For integrating MIMO antennas into various electronic systems and devices, the size of compact frequency-reconfigurable MIMO antennas must be small, and the mutual coupling effect between antennas should be low. Different methods can be utilized to lower the mutual coupling effect among antennas, including resonators, metamaterials, parasitic elements, neutralization lines, etc. Reduction of mutual coupling utilizing a meander-line resonator slot [4] and defected ground structures [5] has recently been reported by researchers. Frequency-reconfigurable antennas may be realized utilizing different techniques, such as radio frequency microelectromechanical systems, PIN diodes, varactor diodes [6], etc. Some reconfigurable antennas have recently been reported in [7, 8]. In [7], for single- and double-band operation, a stepped feed line connecting 3 PIN diodes was drawn to feed a slot antenna to obtain frequency reconfigurability. In [8], 4 MEMS switches were placed to obtain a frequency-reconfigurable MIMO antenna with an isolation of over 22 dB. In any case, it is a big challenge to achieve frequency reconfigurability in a compact MIMO slot antenna along with a high isolation ratio. In [9], a frequency-reconfigurable MIMO slot antenna using 4 PIN diodes achieved only two frequency bands and a gain of 2.46 dBi. This study presents an advanced frequency-reconfigurable compact MIMO antenna for wireless communication applications in the LTE, WLAN, GSM, and WiMAX frequency bands. The antenna-based sensor covers the 4 communication frequency bands (1.9, 2.4, 2.6, and 3.8 GHz). The isolation between the two antennas in all four frequency bands is over 28 dB, and the ECC is lower than 0.08, which is well within the desired range for ECC.
2 Antenna Design 2.1 Reconfigurable of Frequency Slot Antenna In this section of the study, we present the analysis and design of a W-shaped frequency-reconfigurable slot antenna (RFSA). Figure 1 presents the RFSA, which can be switched among four communication frequency bands at f1 = 1.9, f2 = 2.6, f3 = 2.4, and f4 = 3.8 GHz. Utilizing a W-shaped slot and 2 PIN diodes, the RFSA is designed on an FR4 substrate with a dielectric constant (εr) of 4.4 and a loss tangent (tan δ) of
Fig. 1 Reconfigurable of frequency slot antenna (L1 = 6 mm, L = 32.5 mm, L2 = 26 mm, L3 = 14 mm, L4 = 2 mm, L5 = 5 mm, L6 = 9 mm, L7 = 10 mm, W1 = 2 mm, W = 20 mm, W2 = 8 mm, W3 = 1 mm)

Fig. 2 Various structures of RFSA at various states of PIN diodes. a Structure 1 for frequency = 1.9 GHz, b Structure 2 for frequency = 2.6 GHz, c Structure 3 for frequency = 2.4 GHz, and d Structure 4 for frequency = 3.8 GHz
0.02, the RFSA being designed on a 0.8-mm-thick FR4 substrate. This antenna is fed by a 50 Ω microstrip feedline.
Table 1 Different equivalent slot shapes for the RFSA antenna

Pin diodes                 Structure I   Structure II   Structure III   Structure IV
D1                         Off           Off            On              On
D2                         Off           On             Off             On
Resonant frequency (GHz)   1.9           2.6            2.4             3.8
Slot length (mm)           58.5          36.25          49.15           27.4
In Fig. 2a–d, the different structures (Structures I–IV) of the slot RFSA are presented. The lengths of the slots are equivalent to λg/2, where λg is the guided wavelength at the resonance frequency. The various structures can be realized in a single layout (Fig. 1) by placing the two PIN diodes D1 and D2 at suitable positions. For instance, by switching D2 ON and D1 OFF, Structure II is obtained. All structures, with the corresponding switching positions of the diodes, are described in Table 1. The antenna is fabricated with BAP65-0,115 PIN diodes, which show low capacitance in the OFF state [10]. In the ON state, the diode is modeled as a resistance R = 1 Ω in series with an inductance L = 0.6 nH; in the OFF state, as a capacitance C = 0.35 pF in parallel with a resistance R = 20 kΩ, in series with L = 0.6 nH [10]. The RFSA was simulated in the Ansys HFSS software. As shown in Table 1, the RFSA can be switched among 4 single frequency bands. Figure 3 shows the return loss plots of the different structures (Structure I to Structure IV). It is evident that the desired impedance matching is obtained at f1 = 1.9, f2 = 2.6, f3 = 2.4, and f4 = 3.8 GHz.
2.2 Reconfigurable of Frequency MIMO Antenna The frequency-reconfigurable MIMO antenna for application in various frequency bands like LTE, GSM, WLAN, and WiMAX has been developed on an FR4 substrate utilizing 2 RFSAs, as presented in Fig. 4. This frequency-reconfigurable MIMO antenna has a 70 mm length, 20 mm width, and 0.8 mm height, with a 5 mm separation distance between its elements. The design progression of this MIMO antenna is presented in Fig. 5, and its S-parameters are shown in Fig. 7. First, two W-shaped RF slot antennas are placed head-to-head on the FR4 substrate of size Lm × Wm and thickness 0.8 mm, as visible in Fig. 5. A maze-shaped meander-line resonator slot is fixed in the center between the two RF slot antennas, at the same separation distance, to improve the isolation. The external current propagation between the antenna components is suppressed by the meander-line
Fig. 3 Return loss of reconfigurable of frequency antenna as indicated by Structures I–IV
Fig. 4 Designed reconfigurable of frequency MIMO antenna (Lm = 70 mm, L1 = 4 mm, L2 = 5 mm, L3 = 33 mm, L4 = 1.75 mm, Wm = 20 mm, Wf = 2 mm); PIN diodes are at D1 to D4. a Top view, b bottom view
resonator slot, which results in low mutual coupling. Thus, at the resonance frequencies f1, f2, f3, and f4, the isolation among the components is improved to 20, 18, 26, and 28 dB, respectively, as shown in Figs. 6 and 7. Furthermore, a novel maze-shaped meander-line resonator (MLR) is applied to the MIMO antenna to improve the isolation, as shown in Fig. 5. In Fig. 4, the MLR is placed at suitable positions on the two sides of the MIMO antenna. The MLR decreases the near-field coupling among the components of the antenna and also suppresses the surface current distribution. Further, Fig. 9 shows the surface current distribution. From the S-parameter plots presented in Fig. 7, it is also evident that the values of isolation are
Fig. 5 Various arrangements of reconfigurable of frequency proposed antenna: a stage 1, b stage 2, c stage 3

Fig. 6 Geometry of meander-line resonator (Lm = 20 mm, Wm = 5 mm, w1 = 5 mm, w2 = 2 mm, w3 = 2 mm, w5 = 0.45 mm, w6 = 0.5 mm, w7 = 0.2 mm)
more than 20, 18, 26, and 28 dB in the corresponding frequencies of 1.9, 2.6, 2.4, and 3.8 GHz, respectively.
Fig. 7 S-parameters simulated and measured a for 1.9 GHz, b for 2.6 GHz, c for 2.4 GHz, d for 3.8 GHz
2.3 S-parameters S-parameters are utilized to characterize electrical networks with matched impedances. Figure 7 presents the simulated S-parameter plots of the MIMO antenna: a resonating at f1, where d1 to d4 are in the OFF state; b resonating at f2, where d1 and d3 are OFF and d2 and d4 are ON; c resonating at f3, where d1 and d3 are ON and d2 and d4 are OFF; and d resonating at f4, where d1 to d4 are in the ON state.
2.4 VSWR Characteristics Voltage standing wave ratio (VSWR) is defined as the ratio between the transmitted and reflected voltage standing waves in an RF electrical transmission network. It is a measure of how efficiently RF power is transmitted from the power source, through a transmission line, into the load. The simulated VSWR characteristics of the proposed antenna are plotted in Fig. 8. They cover the LTE band at 1.9 GHz, the GSM band at 2.6 GHz, the WLAN band at 2.4 GHz, and the WiMAX band at 3.8 GHz.
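For reference, VSWR follows directly from the magnitude of the reflection coefficient (here |S11| for a single-port view); this standard relation is not stated in the paper but links Figs. 7 and 8:

\mathrm{VSWR} = \frac{1 + |\Gamma|}{1 - |\Gamma|}, \qquad |\Gamma| = |S_{11}|

For example, the often-quoted acceptability limit VSWR ≤ 2 corresponds to |S11| ≤ 1/3, i.e., a return loss better than about 9.5 dB.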
Fig. 8 VSWR plot simulated and measured
Fig. 9 Surface current distribution of proposed MIMO antenna (a) at D1 off D2 off condition, (b) at D1 off D2 on condition, (c) at D1 on D2 off condition, (d) at D1 on D2 on condition
2.5 Surface Current Distribution Figure 9a–d shows the surface current distribution of the frequency-reconfigurable slot antenna at 1.9, 2.6, 2.4, and 3.8 GHz, respectively. The current distribution at the slot frequencies is very low, while the current distribution at the
operating frequencies is high and is more concentrated along the radiating element feed line and the corners of the rectangular patch.
2.6 Radiation Pattern Figure 10a–d presents the 3D polar plots, and Fig. 11a–d the simulated and measured radiation patterns (co-polarization and cross-polarization), of the proposed reconfigurable antenna for the four operating frequencies 1.9, 2.6, 2.4, and 3.8 GHz. At 1.9, 2.6, 2.4, and 3.8 GHz, the proposed antenna exhibits directional patterns with maximum gains of 2.65, 2.00, 3.64, and 4.6 dB, respectively. The antenna gain and efficiency were additionally measured in an anechoic chamber, and the outcomes are plotted in Figs. 10 and 12. It should be noted that the integration of the PIN diodes reduces the antenna gain and efficiency because of the additional losses caused by the parasitic impedance of the diodes.
Fig. 10 3D polar gain plot of proposed antenna. a Frequency at 1.9 GHz, b frequency at 2.6 GHz, c frequency at 2.4 GHz, d frequency at 3.8 GHz
Fig. 11 Radiation patterns of proposed antenna (co-polarization phi = 0 and cross-polarization phi = 90): a frequency at 1.9 GHz, b frequency at 2.6 GHz, c frequency at 2.4 GHz, d frequency at 3.8 GHz
Fig. 12 Radiation efficiency of proposed antenna
3 Photograph of Fabricated Antenna This proposed reconfigurable of frequency slot antenna with Quad-bands was fabricated on FR4 substrate having dimensions 70 × 20 mm2 , thickness 0.8 mm and dielectric constant of 4.4.
Fig. 13 Antenna testing: a top view, b bottom view, c experimental set-up with VNA for measuring the antenna
The fabricated prototype and the testing of the proposed MIMO quad-band antenna are presented in Fig. 13, together with the experimental set-up with a VNA for measuring the antenna performance parameters.
4 Experimental Results To validate the current method, a prototype of the proposed MIMO antenna is presented; after placing the 4 PIN diodes at the desired positions with proper DC biasing, the characteristics are recorded. The S-parameters for all these states (F1–F4) are simulated and measured, as shown in Fig. 7. It is evident from the figure that the MIMO antenna [11–18] can be switched among the four bands, i.e., for LTE, GSM, WLAN, and WiMAX applications, and that the isolation obtained among the antenna components is more than 28 dB. At the resonance frequencies, Fig. 11 shows the radiation patterns of the proposed antenna, which are bidirectional in the E plane and omnidirectional in the H plane.
4.1 MIMO Antenna Diversity Performance Analysis Instead of one antenna, two or more antennas can be used to enhance the quality and reliability of a wireless link. This section presents the diversity analysis of the proposed antenna. The following parameters [19, 20], namely MEG (mean effective gain), ECC (envelope correlation coefficient), DG (diversity gain), TARC (total active reflection coefficient), and CCL (channel capacity loss), are determined (simulated and measured) and presented in this part as the figures of merit for the multiple-input multiple-output antenna system.
Fig. 14 Simulated and measured envelope correlation coefficient of proposed antenna
The radiation efficiencies of antenna components 1 and 2 are η1 and η2, respectively. In Fig. 14, the ECC is plotted, which shows that in all the operating frequency bands the ECC values are less than 0.08, well below the desired limit for good diversity performance, i.e., ρe < 0.5 [8]. The ECC is computed from the S-parameters as

\rho_e = \frac{\left|S_{11}^{*}S_{12} + S_{21}^{*}S_{22}\right|^{2}}{\left(1-|S_{11}|^{2}-|S_{21}|^{2}\right)\left(1-|S_{22}|^{2}-|S_{12}|^{2}\right)}  (1)
Diversity gain quantifies the improvement obtained when the signals of the MIMO antenna elements are combined. It can be determined utilizing Eq. (2) and is greater than 9.7 across the working bands, as seen from Fig. 15:

\mathrm{DG} = 10\sqrt{1-\left(0.99\,\rho_e\right)^{2}}  (2)
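As a worked illustration of Eqs. (1) and (2) (not the authors' data; the complex S-parameters below are made up, in linear scale rather than dB), the two figures of merit can be checked numerically:

import numpy as np

# Made-up complex S-parameters of a two-port MIMO antenna (linear, not dB).
S11, S12 = 0.10 + 0.05j, 0.02 + 0.01j
S21, S22 = 0.02 + 0.01j, 0.12 - 0.03j

# Eq. (1): envelope correlation coefficient from the S-parameters.
num = abs(np.conj(S11) * S12 + np.conj(S21) * S22) ** 2
den = (1 - abs(S11) ** 2 - abs(S21) ** 2) * (1 - abs(S22) ** 2 - abs(S12) ** 2)
ecc = num / den

# Eq. (2): diversity gain derived from the ECC.
dg = 10 * np.sqrt(1 - (0.99 * ecc) ** 2)
print(f"ECC = {ecc:.4f}, DG = {dg:.2f}")  # a small ECC yields DG close to 10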
In a multipath environment, the mean effective gain measures the ratio of the signal strength of the antenna under test to that of a reference antenna. MEG is calculated according to Eq. (3), and the plot is shown in Fig. 16:

\mathrm{MEG}_i = 0.5\left(1-\sum_{j=1}^{N}|S_{ij}|^{2}\right)  (3)

Also, |MEG_i − MEG_j| < 3 dB. So, MEG can be written as
Fig. 15 Diversity gain of proposed antenna simulated and measured
Fig. 16 Mean effective gain of proposed antenna simulated and measured
\mathrm{MEG}_1 = \tfrac{1}{2}\left(1-|S_{11}|^{2}-|S_{12}|^{2}-|S_{13}|^{2}-|S_{14}|^{2}\right)
\mathrm{MEG}_2 = \tfrac{1}{2}\left(1-|S_{21}|^{2}-|S_{22}|^{2}-|S_{23}|^{2}-|S_{24}|^{2}\right)
\mathrm{MEG}_3 = \tfrac{1}{2}\left(1-|S_{31}|^{2}-|S_{32}|^{2}-|S_{33}|^{2}-|S_{34}|^{2}\right)
\mathrm{MEG}_4 = \tfrac{1}{2}\left(1-|S_{41}|^{2}-|S_{42}|^{2}-|S_{43}|^{2}-|S_{44}|^{2}\right)

TARC can be computed directly from the scattering matrix. Similar to the active reflection coefficient, TARC is a function of frequency and also depends on scan angle and tapering. TARC relates the total incident power to the total outgoing power in an N-port microwave network and is determined utilizing Eq. (4); it is shown in Fig. 17.
Fig. 17 Total active reflection coefficient of proposed antenna
\mathrm{TARC} = \frac{\sqrt{\sum_{i=1}^{N}\left|S_{i1}+\sum_{m=2}^{N}S_{im}\,e^{j\theta_{m-1}}\right|^{2}}}{\sqrt{N}}  (4)
CCL is computed to quantify the loss of transmission capacity, in bits/s/Hz, during high-data-rate transmission. The maximum acceptable limit of CCL for high-data-rate transmission is 0.4 bits/s/Hz; the CCL achieved across the frequency bands is displayed in Fig. 18. It is computed as

C_{loss} = -\log_{2}\left(\det\left(\alpha^{R}\right)\right)  (5)
Fig. 18 Channel capacity loss of proposed antenna simulated and measured
where

\alpha_{ii} = 1-\left(\sum_{j=1}^{N}|S_{ij}|^{2}\right)  (6)

\alpha^{R} = \begin{bmatrix} \alpha_{11} & \alpha_{12} & \alpha_{13} & \alpha_{14}\\ \alpha_{21} & \alpha_{22} & \alpha_{23} & \alpha_{24}\\ \alpha_{31} & \alpha_{32} & \alpha_{33} & \alpha_{34}\\ \alpha_{41} & \alpha_{42} & \alpha_{43} & \alpha_{44} \end{bmatrix}  (7)

Table 2 Comparison of proposed antennas with other research papers

Ref. No | Structure of antenna | Material | Antenna size (mm3) | Operating frequency (GHz) | Gain (dBi)
[21] | Microstrip slot antenna | Taconic RF35, εr = 3.5 and tanδ = 0.0018 | 50 × 46 × 1.52 | 2.2, 4.7 | 3.7
[22] | U slot antenna | Taconic TLT, εr = 2.55 and tanδ = 0.0025 | 40 × 40 × 0.8 | 2.3, 3.6 | 4.5
[7] | L slot antenna | Rogers RO4350B, εr = 3.48 and tanδ = 0.0037 | 27 × 25 × 0.8 | 2.8, 4.5, 5.8 | 3.6
[9] | C-shaped slot | FR4, εr = 4.4 and tanδ = 0.02 | 60 × 20 × 1.6 | 2.4, 3.6, 5.8 | 2.4
[23] | Rhombus | RT-duroid, εr = 2.2 and tanδ = 0.0009 | 55 × 52 × 1.6 | 4–7.3 | 4
[24] | H-shaped slot | FR4, εr = 4.4 and tanδ = 0.02 | 50 × 50 × 1.6 | 3.5, 5.5 | 2.5, 2.6
[25] | Ring shaped | FR4, εr = 5.4 and tanδ = 0.02 | 22 × 13 × 1.5 | 3.1–10.6 | 4.2
[26] | Square-shaped cell | Rogers RT/Duroid 5880, εr = 2.2 and tanδ = 0.0009 | 80 × 80 × 1.6 | 1.76, 5.71 | 4.2
Proposed antenna | W-shaped slot | FR4, εr = 4.4 and tanδ = 0.02 | 70 × 20 × 0.8 | 1.9, 2.6, 2.4, 3.8 | 2.6, 2, 3.6, 4.6
5 Conclusion This work presents a frequency-reconfigurable MIMO antenna utilizing 4 PIN diodes for wireless sensor network applications in the LTE, GSM, WLAN, and WiMAX frequency bands, namely one LTE band (1.9 GHz), a GSM band (2.6 GHz), a WLAN band (2.4 GHz), and one WiMAX band (3.8 GHz). Simulated and measured values of antenna performance parameters like the VSWR characteristics, surface current distribution, and far-field radiation characteristics are investigated. VSWR values are between 1 and 2 over the whole operating frequency range. This MIMO antenna achieved a highest gain of 4.6 dBi and a maximum radiation efficiency of 50%. Utilizing a meander-line resonator slot and a metallic strip at the ground level, isolations of more than 20, 18, 26, and 28 dB are obtained among the antenna components in the corresponding frequency bands, with ECC lower than 0.08, DG greater than 9.7, MEG difference less than 3 dB, and CCL under 0.4 bits/s/Hz. The proposed MIMO antenna has gains of 2.65, 2.0, 3.64, and 4.6 dBi at f1, f2, f3, and f4, respectively, and its radiation is bidirectional in the E plane and omnidirectional in the H plane. The proposed MIMO antenna is well suited for wireless communication applications with the provision of LTE, GSM, WLAN, and WiMAX services.
References

1. R. Bhattacharyya, C. Floerkemeier, S. Sarma, Low-cost, ubiquitous RFID-tag-antenna-based sensing. Proc. IEEE 98(9), 1593–1600 (2010)
2. J.W. Sanders, J. Yao, H. Huang, Microstrip patch antenna temperature sensor. IEEE Sens. J. 15(9), 5312–5319 (2015)
3. H. Zemmour, G. Baudoin, A. Diet, Effect of depth and soil moisture on buried ultra-wideband antenna. Electron. Lett. 52(10), 792–794 (2016)
4. S. Hwangbo, H.Y. Yang, Y.K. Yoon, Mutual coupling reduction using micromachined complementary meander-line slots for a patch array antenna. IEEE Antennas Wirel. Propag. Lett. 16, 1667–1670 (2017)
5. S. Pandit, A. Mohan, P. Ray, A compact four-element MIMO antenna for WLAN applications. Microw. Opt. Technol. Lett. 60, 289–295 (2018)
6. L. Ge, K.M. Luk, Frequency-reconfigurable low-profile circular monopolar patch antenna. IEEE Trans. Antennas Propag. 62(7), 3443–3449 (2014)
7. L. Han, C. Wang, X. Chen, W. Zhang, Compact frequency-reconfigurable slot antenna for wireless applications. IEEE Antennas Wirel. Propag. Lett. 15, 1795–1798 (2016)
8. S. Soltani, P. Lotfi, R.D. Murch, A port and frequency reconfigurable MIMO slot antenna for WLAN applications. IEEE Trans. Antennas Propag. 64(4), 1209–1217 (2016)
9. S. Pandit, A. Mohan, P. Ray, Compact frequency-reconfigurable MIMO antenna for microwave sensing applications in WLAN and WiMAX frequency bands. IEEE Sens. Lett. (2018)
10. NXP Semiconductors, "BAP65-0, 115 PIN diode," data sheet, 2010 [Online]. https://www.farnell.com/datasheets/1697008.pdf
11. S.M. Nimmagadda, A new HBS model in millimeter-wave beamspace MIMO-NOMA systems using alternative grey wolf with beetle swarm optimization. Wirel. Pers. Commun. 120, 2135–2159 (2021). https://doi.org/10.1007/s11277-021-08696-6
12. S.M. Nimmagadda, Enhancement of efficiency and performance gain of massive MIMO system using trial-based rider optimization algorithm. Wirel. Pers. Commun. 117, 1259–1277 (2021)
13. S.M. Nimmagadda, Optimal spectral and energy efficiency trade-off for massive MIMO technology: analysis on modified lion and grey wolf optimization. Soft. Comput. 24, 12523–12539 (2020). https://doi.org/10.1007/s00500-020-04690-5
14. N.S. Murthy, S.S. Gowri, B.P. Rao, Non-orthogonal quasi-orthogonal space-time block codes based on circulant matrix for eight transmit antennas. Int. J. Appl. Eng. Res. 9(21), 9341–9351 (2014). ISSN 0973-4562
15. N.S. Murthy, S.S. Gowri, Full rate general complex orthogonal space-time block code for 8 transmit antennas, in Elsevier SciVerse ScienceDirect Procedia Engineering, IWIEE 2012 (China, 2012), Jan 9–10, 2012. ISSN: 1877-7058
16. N.S. Murthy, S.S. Gowri, B.P. Rao, Quasi-orthogonal space-time block codes based on circulant matrix for eight transmit antennas, in IEEE International Conference on Communication and Signal Processing—ICCSP 14, Melmaruvathur, 3–5 Apr 2014
17. N.S. Murthy, S.S. Gowri, P. Satyanarayana, Complex orthogonal space-time block codes rates 5/13 and 6/14 for 5 and 6 transmit antennas, in Wireless Communications, Networking and Mobile Computing (WiCOM), 2011 7th International Conference, Sept 23–25, Wuhan, China (IEEE, 2011). https://doi.org/10.1109/wicom.2011.6040107
18. N.S. Murthy, Improved isolation metamaterial inspired MM-wave MIMO dielectric resonator antenna for 5G application. Prog. Electromagnet. Res. C 100, 247–261 (2020). https://doi.org/10.2528/PIERC19112603
19. A. Kayabasi, A. Toktas, E. Yigit, K. Sabanci, Triangular quad-port multi-polarized UWB MIMO antenna with enhanced isolation using neutralization ring. AEU-Int. J. Electron. Commun. 85, 47–53 (2018)
20. M. Naser-Moghadasi, R. Ahmadian, Z. Mansouri, F.B. Zarrabi, M. Rahimi, Compact EBG structures for reduction of mutual coupling in patch antenna MIMO arrays. Prog. Electromagnet. Res. C 53, 145–154 (2014)
21. H.D. Majid, M.K.A. Rahim, M.R. Hamid, M.F. Ismail, A compact frequency-reconfigurable narrowband microstrip slot antenna. IEEE Antennas Wirel. Propag. Lett. 11, 616–619 (2012)
22. Z. Ren, W.T. Li, L. Xu, X.W. Shi, A compact frequency reconfigurable unequal U-slot antenna with a wide tunability range. Prog. Electromagn. Res. Lett. 39, 9–16 (2013)
23. K. Murthy, K. Umakantham, K. Satyanarayana, Reconfigurable notch band monopole slot antenna for WLAN/IEEE-802.11n applications, Aug 2017. http://www.inass.org/2017/2017123118
24. G. Jin, C. Deng, Differential frequency reconfigurable antenna based on dipoles for Sub-6 GHz 5G and WLAN applications. IEEE Antennas Wirel. Propag. Lett. https://doi.org/10.1109/LAWP.2020.2966861
25. M.U. Rahman, CPW fed miniaturized UWB tri-notch antenna with bandwidth enhancement. Adv. Electric. Eng. 2016, Hindawi Publishing Corporation, Article ID 7279056. https://doi.org/10.1155/2016/7279056
26. M. Shirazi, J. Huang, A switchable-frequency slot-ring antenna element for designing a reconfigurable array. IEEE Antennas Wirel. Propag. Lett. https://doi.org/10.1109/LAWP.2017.2781463
Intelligent Control Strategies Implemented in Trajectory Tracking of Underwater Vehicles Mage Reena Varghese and X. Anitha Mary
Abstract Underwater vehicle (UV) technology is of considerable importance for the sustainable use of ocean resources. The ocean environment is harsh and complex, and a vehicle in it is affected by various forces and moments. Hence, it is a tough task to keep a UV stable at a pre-planned static position and heading angle. Trajectory tracking control is a main research area for the above problem. In this paper, various control strategies used in UVs for tracking control, such as classical, robust, and intelligent controllers, are reviewed. Intelligent controllers with novel controlling techniques give extremely good results under complex environmental uncertainties such as waves, ocean currents, and propulsion. Keywords PID control · Sliding mode control · Adaptive control · Fuzzy control · Artificial neural network · Linear quadratic gaussian · H infinity · Intelligent controllers
1 Introduction Nowadays, underwater vehicles are extensively used in many vital applications such as extraction rigs, the military, and gas and oil industries. They are mainly suited to risky conditions where human beings face difficulties in carrying on. The two main classes of underwater vehicles are the remotely operated vehicle (ROV) and the autonomous underwater vehicle (AUV). An ROV is a tethered robot linked through an umbilical cord for communication and power transfer, whereas an AUV is an untethered robot that carries its own power and achieves a pre-planned task. The ROV is simpler and faster to set up because it can be operated on board by a person, while pre-planned missions over vast areas, like the search for MH370, suit the AUV. In M. R. Varghese Department of Robotics Engineering, Karunya Institute of Technology and Sciences, Coimbatore, India X. A. Mary (B) Karunya Institute of Technology and Sciences, Coimbatore, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_22
the twenty-first century, the AUV has become more popular in two categories: hovering vehicles and flight vehicles. The former are used for physical work around fixed objects, detailed inspection, etc., and the latter for object location, object delivery, searches, and surveys. The design of AUVs is tough, however, since it involves many manipulators and sensors. Currently, researchers all over the world are working intensively on the motion control of UVs and the development of control algorithms for them. The research is vital because it has to deal with the complex oceanic habitat and disturbance agents, such as sea waves and ocean currents, that vary across seas and depths. The UV has to track specified paths accurately and efficiently in the most stable manner [1, 2]. The basic performance of an underwater vehicle can be improved by enhancing the trajectory tracking control, which is crucial in UV implementation. Motion control mainly includes trajectory tracking and path following; the former is dependent on time, and the latter is independent of time. UVs have to follow a known reference trajectory to reach the destination and complete a pre-planned task in an underwater environment, such as subsea jacket inspection on offshore oil platforms as well as maintenance and pipeline inspection. Much research is under way in this path tracking area of the AUV. The various control strategies used in former studies can be categorized into classical controllers like PID, sliding mode, and self-adaptive; robust controllers like Linear Quadratic Gaussian and H infinity controllers; intelligent controllers like fuzzy and neural network controllers; combinations of intelligent controllers with other controllers; and various other control techniques. The paper first discusses the classical and robust tracking control algorithms used in UVs by researchers. Secondly, the fast-growing intelligent control techniques and the recent developments in this area are also reviewed.
2 Conventional Trajectory Tracking Controls Used in UV Some of the most common conventional or basic control strategies used in underwater vehicles are classical controllers like PID, sliding mode, and self-adaptive, and robust controllers like Linear Quadratic Gaussian and H infinity controllers. These are discussed in the following section.
2.1 Proportional Integral Derivative (PID) Control The PID controller (Fig. 1) is a classical control that adopts a feedback technique to correct the position error e(t) between the actual trajectory y(t) and the reference trajectory r(t) [3]. Kp, Ki, and Kd are the parameters to be selected appropriately for a fast, accurate, stable, and smooth dynamic process. The PID controller's output [4] is given by

$$u(t) = K_p\,e(t) + K_i \int e(t)\,dt + K_d \frac{d e(t)}{dt}.$$
Fig. 1 PID tracking control for underwater vehicle. Source [3, 4]
The conventional PID controller is a linear controller, and its main task is the tuning of its parameters. Guerrero et al. [5] recommended replacing the PID controller's constant feedback gains with a set of nonlinear functions for trajectory tracking control of UVs; the stability of the resulting nonlinear PID can then be checked using Lyapunov-based design. The work was carried out in real time, and the nonlinear PID showed good tracking performance and robustness [5]. The PID controller is widely used in all control applications as it is simple and reliable. One of its drawbacks is that it does not adapt to changing environmental and working conditions, because its parameters are set manually. Hence, many PID controllers are combined with intelligent and adaptive controllers to obtain more optimal results, and PID is also commonly chosen as the baseline against which the simulations of advanced controllers are compared. Borase et al. [4] reviewed various applications of PID in different fields such as process control, robotic manipulators, electric drives, and mechanical systems.
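To make the control law concrete, here is a minimal discrete-time sketch of the PID update from the equation above; the gains, time step, and first-order toy plant are illustrative assumptions, not values from any of the cited UV studies.

```python
import numpy as np

def pid_step(e, state, Kp, Ki, Kd, dt):
    """One update of u = Kp*e + Ki*integral(e) + Kd*de/dt."""
    integral, e_prev = state
    integral += e * dt                  # accumulate the integral of e(t)
    derivative = (e - e_prev) / dt      # finite-difference approximation of de/dt
    u = Kp * e + Ki * integral + Kd * derivative
    return u, (integral, e)

# Toy tracking loop: first-order plant y' = u following a step reference.
Kp, Ki, Kd, dt = 2.0, 0.5, 0.1, 0.05
state, y, r = (0.0, 0.0), 0.0, 1.0
for _ in range(200):
    u, state = pid_step(r - y, state, Kp, Ki, Kd, dt)
    y += u * dt                         # integrate the toy plant
print(f"final tracking error: {r - y:.4f}")
```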
2.2 Sliding Mode Control Sliding mode control (SMC) has been a frequently used controller in underwater vehicles for three decades. It is a nonlinear robust controller able to withstand the marine environment. Many sliding mode strategies exist for nonlinear systems, such as conventional, integral, dynamic, twisting, super-twisting, terminal, and fast terminal SMC, with different sliding manifolds and control inputs. However, SMC exhibits the chattering phenomenon due to its discontinuous control action, which reduces tracking accuracy in AUVs. The input to the SMC is the position error e(t), and the input to the UV is the force τ produced by the controller (Fig. 2) [6–8].
Fig. 2 SMC tracking control for underwater vehicle. Source [3, 7, 8]
Vu et al. [9] proposed a dynamic sliding mode control (DSMC) to improve system robustness in the motion control of an over-actuated AUV; Lyapunov criteria are used to analyze the stability of the system. Two different methods, quadratic programming and least squares, are used for optimal allocation control in distributing a proper thrust to each of the seven thrusters of the over-actuated AUV [9]. For depth control of the AUV in the vertical plane, a dual-loop control methodology is adopted by Wang et al. [10]: the inner pitch controller is an SMC whose parameter tuning is done with an extreme learning machine (ELM), while the outer depth-control loop is built with a proportional controller. This design guaranteed the stability and robustness of the system [10]. Guerrero et al. [11] presented a high-order sliding mode control with auto-adjustable gain for underwater trajectory tracking under oceanic disturbances and model uncertainty [11].
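As an illustration of the chattering trade-off discussed above, the following sketch implements a conventional sliding mode law for a toy double integrator; the manifold slope, switching gain, and boundary-layer width are assumed values for demonstration only.

```python
import numpy as np

def smc_control(e, e_dot, lam=2.0, k=5.0, phi=0.1):
    """Drive the sliding variable s = e_dot + lam*e to zero; a saturated
    switching term (boundary layer phi) tempers chattering."""
    s = e_dot + lam * e                       # sliding manifold
    return -k * np.clip(s / phi, -1.0, 1.0)   # smooth stand-in for sign(s)

# Toy double integrator x'' = u regulated to the origin.
x, x_dot, dt = 1.0, 0.0, 0.01
for _ in range(2000):
    u = smc_control(x, x_dot)
    x_dot += u * dt
    x += x_dot * dt
print(f"|x| after 20 s: {abs(x):.4f}")
```

Shrinking phi toward zero recovers the discontinuous sign function, and with it the chattering that the text warns about.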
2.3 Adaptive Control Adaptive control is a complicated nonlinear feedback control method with little dependence on mathematical models. Craven et al. observed that 'the majority of successful control studies include some kind of adaptive control strategy' [12]. It does not need a known dynamic model, which is why most researchers combine their proposed controllers with adaptive schemes, especially in underwater vehicles. The four types of adaptive control are model reference adaptive control (MRAC), self-tuning, feed-forward, and feedback, of which the most popular is MRAC. The feedback loop permits an error measure to be calculated between the reference model and the output of the system (Fig. 3); depending on this error measurement, the controller parameters are tuned to minimize the error [7, 8, 12]. For trajectory tracking of an AUV, a novel adaptive control law is implemented in [13] to estimate uncertain hydrodynamic damping parameters; the stability of this law is checked using the direct method of Lyapunov. Bandara et al. [14] addressed a vehicle-fixed-frame adaptive controller for attitude stabilization of a low-speed autonomous underwater vehicle under external disturbances [14]. However, adaptive control finds it difficult to adjust the control parameters and maintain robustness when dealing with the actual motion.
Fig. 3 Model reference adaptive control tracking system for UV. Source [3, 7, 8, 12]
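A minimal sketch of the MRAC loop in Fig. 3 follows, adapting a single feedforward gain with the MIT rule; the scalar plant, reference model, and adaptation rate are toy assumptions, and a full design would also adapt a feedback gain to match the model pole.

```python
# Scalar MRAC sketch: adapt gain theta so the plant y' = a*y + theta*r
# tracks the reference model ym' = am*ym + bm*r (MIT-rule update).
a, am, bm = -1.0, -2.0, 2.0          # plant pole, model pole, model gain
gamma, dt, r = 0.5, 0.01, 1.0        # adaptation rate, step, reference
y = ym = theta = 0.0
for _ in range(5000):
    y += (a * y + theta * r) * dt    # plant with adjustable input gain
    ym += (am * ym + bm * r) * dt    # reference model output
    e = y - ym                       # model-following error (Fig. 3)
    theta += -gamma * e * r * dt     # MIT rule, with de/dtheta ~ r
print(f"adapted gain: {theta:.3f}, final error: {e:.4f}")
```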
2.4 Linear Quadratic Gaussian (LQG) LQG is an optimal controller for linear time-varying and time-invariant systems. It is a combination of a Linear Quadratic Regulator (LQR) and a Kalman filter, and it is used to obtain the optimal parameter values that minimize a quadratic cost function [15]. Hasan and Abbas [15] suggested two controllers for an AUV to overcome sensor noise, which is a disturbance to motion control: an LQG controller and an improved version of a FOPID controller, the latter providing more stability.
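Since LQG combines an LQR gain with a Kalman filter, the regulator half can be sketched in a few lines; the toy depth model and the weights Q and R below are illustrative assumptions, and the Kalman-filter half solves the dual Riccati equation with the noise covariances in place of Q and R.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Toy vertical-plane model: state x = [depth, depth rate], input = thrust.
A = np.array([[0.0, 1.0],
              [0.0, -0.5]])
B = np.array([[0.0],
              [1.0]])
Q = np.diag([10.0, 1.0])    # state weights in the quadratic cost
R = np.array([[0.1]])       # control-effort weight

P = solve_continuous_are(A, B, Q, R)   # algebraic Riccati equation
K = np.linalg.inv(R) @ B.T @ P         # optimal state feedback u = -K x
print("LQR gain K:", K)
```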
2.5 H Infinity The H infinity controller is a robust controller based on the H infinity norm, that is, on the maximum gain over all frequencies and all directions. Linearization and design of the control law are the two main steps of H infinity synthesis [16]. Gavrilina et al. [17] recommend an H infinity design for an underwater vehicle attitude control system to attain lower sensitivity to noise coupled in from the other channels. The outcome of this suggestion improved the quality of existing attitude control systems of UVs [17].
3 Intelligent Control Techniques Used in UV Beyond the basic controllers, artificial intelligence control is currently the trend in tracking control of underwater vehicles, as it gives excellent control. Intelligent control techniques include fuzzy control, artificial neural network strategies, genetic algorithm techniques, machine learning techniques, etc.
3.1 Fuzzy Intelligent Control Fuzzy control is a control method based on the concept of fuzzy logic and falls under the category of intelligent controllers. It is commonly used to control uncertain or strongly nonlinear systems when the precise system model is unknown. A fuzzy controller consists of empirical rules, which is beneficial for operator-controlled plants. The rules take a simple if–then form and are given to a knowledge-based controller; the inference engine evaluates the rules and calculates a control signal based on the measured inputs. The fuzzy controller is built of four parts: knowledge base, fuzzification interface, inference engine, and defuzzification interface (Fig. 4) [18]. Recent studies have shown that fuzzy logic controllers have been implemented with huge success in underwater vehicles. Zhilenkov et al. [19] compared a fuzzy controller design with a PID controller for the motion control system of an autonomous underwater vehicle under uncertainties, and the proposed fuzzy controller showed excellent control quality [19]. To build an efficient obstacle avoidance approach for underwater vehicles in marine environments, Chen et al. [20] used an experimental platform with an ROV equipped with scanning sonar and applied fuzzy logic control to handle linear and nonlinear problems; with the help of an optimum navigation strategy, the fuzzy-logic-equipped ROV was able to avoid obstacles and had magnificent control stability [20]. A self-tuned nonlinear fuzzy PID (FPID) controller [21] has been suggested for speed and position control of a multiple-input multiple-output (MIMO) fully actuated UV with eight thrusters to follow desired trajectories. Here, Mamdani fuzzy rules are used to tune the PID parameters, and the result is compared with a classical PID controller; the fuzzy PID controller has a quicker response and more reliable behavior.
Fig. 4 Fuzzy tracking control for UV. Source [3, 19]
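The sketch below shows the four parts named above on a one-input Mamdani controller: triangular fuzzification, three hand-written if–then rules, min–max inference, and centroid defuzzification. The membership functions and rule base are invented for illustration, not taken from the cited studies.

```python
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function peaking at b."""
    return np.maximum(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0)

def fuzzy_control(e, u=np.linspace(-1, 1, 201)):
    # Fuzzification: membership of the error in Negative / Zero / Positive.
    neg, zero, pos = tri(e, -2, -1, 0), tri(e, -1, 0, 1), tri(e, 0, 1, 2)
    # Rule base + inference: clip each output set by its rule strength (min),
    # then aggregate the rules with max.
    out = np.maximum.reduce([
        np.minimum(neg,  tri(u,  0.0,  0.75, 1.5)),   # error Neg -> push Pos
        np.minimum(zero, tri(u, -0.5,  0.0,  0.5)),   # error Zero -> hold
        np.minimum(pos,  tri(u, -1.5, -0.75, 0.0)),   # error Pos -> push Neg
    ])
    # Centroid defuzzification of the aggregated output set.
    return np.sum(u * out) / (np.sum(out) + 1e-9)

print(f"command for e = 0.4: {fuzzy_control(0.4):.3f}")
```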
3.2 Artificial Neural Network or Artificial Intelligent Control The artificial neural network, or simply neural network (NN), is also an intelligent controller, modeled on the mammalian neuron structure to match the biological nervous system's learning ability. The general mathematical model of a neural network comprises an input layer, hidden layers, and an output layer [22]. NNs are of vital importance in nonlinear underwater vehicle control applications such as tracking control, collision avoidance control, motion control, and target searching. The recent trend is to incorporate neural networks with adaptive, robust, dynamic, reinforcement learning, evolutionary, convolutional, and bioinspired algorithms to obtain optimal control; Fig. 5 illustrates the block diagram of neural network (NN) tracking control for a UV. Muñoz et al. [24] implemented a dynamic neural network (DNN)-based nonparametric identifier for the collected disturbances, blended with parametric identification of the constant input gain given by a parameter adaptive law, known as the Dynamic Neural Control System (DNCS), for an AUV with four degrees of freedom; this resolves the trajectory tracking issue for an AUV in the harsh marine environment [24]. For effective real-time search of multiple targets by multiple AUVs, a bioinspired neural network (BNN) is recommended [23]. Here, a steepest gradient descent rule is used so the AUVs establish a search path autonomously, and fuzzy control is provided to avoid obstacles in the path of movement by optimizing the search path; the speciality of this algorithm is that its parameters require no training or learning [23]. An evolutionary neural network (ENN) control known as assembler encoding, in recurrent and feed-forward variants, was applied to a biomimetic AUV to avoid collisions underwater; the tests inferred that the recurrent ENN control performed better than the feed-forward one [25].
Fig. 5 Neural network (NN) tracking control for UV. Source [3, 16]
3.3 Genetic Algorithm Control The genetic algorithm is a computational tool that follows Darwin's genetic criterion and the natural evolutionary process of biological evolution [26]. In [27], parameter tuning is done with the help of a genetic algorithm (GA) and a harmonic search algorithm (HSA); this is suggested to obtain robust steering control of an autonomous underwater vehicle. Another advanced method of tuning parameters with a GA is the cloud model-based quantum GA (CQGA), which is utilized in tuning fractional-order PID control parameters to increase the performance of motion control for an AUV; a conventional integer-order PID controller can thus be generalized to this newly suggested method [28]. A CQGA-based FOPID controller for motion control is also presented in [29], where the CQGA tunes the optimal parameters of the FOPID controller and the simulations show better results than using the GA alone.
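To show how a GA tunes controller parameters in practice, the sketch below evolves PID gains against a step-response cost on a toy first-order plant; the population size, operators, and the plant itself are simplifying assumptions rather than the CQGA or HSA variants cited above.

```python
import numpy as np
rng = np.random.default_rng(0)

def cost(gains, dt=0.02, steps=300):
    """Integral absolute error of a toy plant y' = -y + u under PID gains."""
    Kp, Ki, Kd = gains
    y = I = e_prev = J = 0.0
    for _ in range(steps):
        e = 1.0 - y                       # unit-step reference
        I += e * dt
        u = Kp * e + Ki * I + Kd * (e - e_prev) / dt
        e_prev = e
        y += (-y + u) * dt
        J += abs(e) * dt
    return J

pop = rng.uniform(0, 5, size=(30, 3))     # 30 candidate [Kp, Ki, Kd] triples
for _ in range(40):
    fitness = np.array([cost(g) for g in pop])
    elite = pop[np.argsort(fitness)[:10]]          # selection of the fittest
    parents = elite[rng.integers(0, 10, (30, 2))]
    pop = parents.mean(axis=1)                     # blend crossover
    pop += rng.normal(0, 0.1, pop.shape)           # Gaussian mutation
    pop = np.clip(pop, 0, 10)
best = min(pop, key=cost)
print("tuned gains [Kp, Ki, Kd]:", np.round(best, 2))
```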
3.4 Machine Learning Control This is a control technique used mainly in complex nonlinear systems. It teaches the machine to control the plant from previous experience, or from examples, by different learning control methods [26]. Deep learning and reinforcement learning are the common methods used in underwater vehicles for trajectory tracking control. Liu et al. [30] recommended the Deep Deterministic Policy Gradient (DDPG), an intelligent control algorithm, for the lower-layer motion control of a vectored-thruster AUV; the main advantage of this design is that a system model is not required, although some input coefficients of the AUV are obtained from the sensors [30]. An adaptive controller based on a deep reinforcement learning structure has been suggested by Carlucho et al. [31] for low-level control of an AUV [31]. They were able to control all six degrees of freedom of a real-time autonomous underwater vehicle by giving low-level commands to the thrusters with this machine learning technique. Different NNs, such as Faster Region-based Convolutional NN and Single Shot Multibox Detector structures, were trained and validated using deep learning technologies for automatic target recognition by an AUV on optical and acoustic subsea imagery [32].
4 Comparison of Conventional and Intelligent Control Algorithms Used in Underwater Vehicles Table 1 shows the pros and cons of the main trajectory tracking control algorithms that are commonly used.
Table 1 Comparison of various control algorithms used in UV

Conventional controllers

PID [3]
Advantages: (1) Simple and reliable. (2) Most widely used. (3) Parameters can be automatically adjusted by self-tuning or intelligent algorithms.
Disadvantages: (1) No adaptability to changes, because of the manual setup of the parameters. (2) Optimal control will not be achieved.

Sliding mode [3]
Advantages: (1) An accurate dynamic model is not necessary. (2) More reliable and robust.
Disadvantages: (1) High-frequency chattering. (2) Hence, intensive heat losses and premature wear in thrusters.

Self-adaptive [3]
Advantages: (1) Highly adaptive. (2) Automatically adjusts the control parameters.
Disadvantages: (1) Based on an accurate mathematical model. (2) Adjusting the control parameters and keeping robustness when dealing with the actual motion is very difficult.

LQG [2]
Advantages: (1) Optimal state feedback controller. (2) Accurate control design.
Disadvantages: (1) Sensitive to model accuracy. (2) Inefficient at handling nonlinearity.

H infinity [16]
Advantages: (1) Robust. (2) Sharp tracking performance. (3) Fast response speed.
Disadvantages: (1) Complexity of design. (2) An experienced designer is required.

Intelligent controllers

Fuzzy logic [2, 19]
Advantages: (1) Accurate knowledge of the system model is not needed. (2) Popularly used for nonlinear and uncertain systems. (3) Easy to design, good stability, faster response. (4) User friendly since it uses natural language.
Disadvantages: (1) Absence of a learning function. (2) Difficult to tune the fuzzy rules. (3) Overshoot prediction has to be smoothed. (4) Time consuming.

Neural network [2, 22]
Advantages: (1) Exact model is not required. (2) Commonly used in nonlinear systems. (3) Self-learning ability is the great strength of NN. (4) Ability to tolerate faults.
Disadvantages: (1) Real-time application of the control system is difficult, as the sample learning process lags. (2) Complex for real-time application.

Genetic algorithm [26]
Advantages: (1) Commonly used as an optimization tool with other controllers. (2) A good range of data types can be processed.
Disadvantages: (1) Cost is high. (2) Few software packages are available.

Machine learning technique [26, 31]
Advantages: (1) Model-based or model-free designs can be used. (2) Used in complicated systems. (3) Different frameworks can be used. (4) Good portability.
Disadvantages: (1) Efficiency depends upon the data. (2) Very time consuming. (3) Computation is complex. (4) High cost.
5 Conclusion The ultimate aim of the trajectory tracking controller in underwater vehicles is to keep the system stable under ocean disturbances, unpredictable disturbances, and model uncertainties. In Sect. 2, the basic tracking controls used in current research were discussed. These are commonly used in linear systems, and many of the traditional controllers require a known model. PID, the most commonly used controller, is difficult to tune; its parameters can be tuned with the help of intelligent controllers, and the result obtained is then highly optimal. Intelligent controllers are best suited to trajectory tracking of UVs: they can be applied to nonlinear models of a complex nature, and no known model is necessary. This is the main advantage of intelligent controllers over traditional controllers, which matters especially for underwater vehicles facing unpredictable disturbances. The comparison in Table 1 clearly points out the limitations of traditional controllers. So, in real-time practice, the basic controllers are commonly combined with intelligent controllers to compensate for their own drawbacks and obtain a better performance; many researchers have pursued this direction to get the best results. Machine learning control is a very good intelligent control technique for the trajectory tracking of autonomous underwater vehicles, and Sect. 3.4 surveyed some research papers based on it; these methods are time consuming but can be used in the most complicated cases. Research is progressing to discover more advanced and innovative intelligent control techniques for the trajectory tracking of underwater vehicles.
References
1. F.U. Rehman, G. Thomas, E. Anderlini, Centralized control system design for underwater transportation using two hovering autonomous underwater vehicles (HAUVs). IFAC-PapersOnLine 52(11), 13–18 (2019). ISSN 2405-8963
2. M. Aras, M. Shahrieel, S. Abdullah, F. Abdul Azis, Review on auto-depth control system for an unmanned underwater remotely operated vehicle (ROV) using intelligent controller. J. Telecommun. Electron. Comput. Eng. 7, 47–55 (2015)
3. W.-Y. Gan, D.-Q. Zhu, W.-L. Xu, B. Sun, Survey of trajectory tracking control of autonomous underwater vehicles. J. Mar. Sci. Technol. (Taiwan) 25, 722–731 (2017). https://doi.org/10.6119/JMST-017-1226-13
4. R.P. Borase, D.K. Maghade, S.Y. Sondkar et al., A review of PID control, tuning methods and applications. Int. J. Dynam. Control 9, 818–827 (2021)
5. J. Guerrero, J. Torres, V. Creuze, A. Chemori, E. Campos, Saturation based nonlinear PID control for underwater vehicles: design, stability analysis and experiments. Mechatronics 61, 96–105 (2019). ISSN 0957-4158
6. M. Mat-Noh, M.R. Arshad, Z.M. Zain, Q. Khan, Review of sliding mode control applications in autonomous underwater vehicles. Indian J. Geo-Mar. Sci. (2019)
7. J.E. Slotine, W. Li, Applied Nonlinear Control (Prentice Hall, 1991)
8. H.K. Khalil, Nonlinear Systems, 3rd edn. (Prentice Hall, 2002)
9. M.T. Vu, T.-H. Le, H.L.N.N. Thanh, T.-T. Huynh, M. Van, Q.-D. Hoang, T.D. Do, Robust position control of an over-actuated underwater vehicle under model uncertainties and ocean current effects using dynamic sliding mode surface and optimal allocation control. Sensors 21(3), 747 (2021)
10. D. Wang et al., Controller design of an autonomous underwater vehicle using ELM-based sliding mode control, in OCEANS 2017 (Anchorage, 2017), pp. 1–5
11. J. Guerrero, E. Antonio, A. Manzanilla, J. Torres, R. Lozano, Autonomous underwater vehicle robust path tracking: auto-adjustable gain high order sliding mode controller. IFAC-PapersOnLine 51(13), 161–166 (2018). ISSN 2405-8963
12. P.J. Craven, R. Sutton, R.S. Burns, Control strategies for unmanned underwater vehicles. J. Navig. 51(1), 79–105 (1998)
13. B.K. Sahu, B. Subudhi, Adaptive tracking control of an autonomous underwater vehicle. Int. J. Autom. Comput. 11, 299–307 (2014)
14. C.T. Bandara, L.N. Kumari, S. Maithripala, A. Ratnaweera, Vehicle-fixed-frame adaptive controller and intrinsic nonlinear PID controller for attitude stabilization of a complex-shaped underwater vehicle. J. Mechatron. Rob. 4(1), 254–264 (2020)
15. M.W. Hasan, N.H. Abbas, Controller design for underwater robotic vehicle based on improved whale optimization algorithm. Bull. Electr. Eng. Inf. 10(2), 609–618 (2021). ISSN 2302-9285
16. K. Vinida, M. Chacko, An optimized speed controller for electrical thrusters in an autonomous underwater vehicle. Int. J. Power Electron. Drive Syst. (IJPEDS) 9(3), 1166–1177 (2018). ISSN 2088-8694
17. E.A. Gavrilina, V.N. Chestnov, Synthesis of an attitude control system for unmanned underwater vehicle using H-infinity approach. IFAC-PapersOnLine 53(2), 14642–14649 (2020). ISSN 2405-8963
18. R.S. Burns, R. Sutton, P.J. Craven, Computational intelligence in ocean engineering: a multivariable online intelligent autopilot design study (2000)
19. A. Zhilenkov, S. Chernyi, A. Firsov, Autonomous underwater robot fuzzy motion control system with parametric uncertainties. Designs 5(1), 24 (2021)
20. S. Chen, T. Lin, K. Jheng, C. Wu, Application of fuzzy theory and optimum computing to the obstacle avoidance control of unmanned underwater vehicles. Appl. Sci. 10, 6105 (2020). https://doi.org/10.3390/app10176105
21. M.M. Hammad, A.K. Elshenawy, M.I. El Singaby, Trajectory following and stabilization control of fully actuated AUV using inverse kinematics and self-tuning fuzzy PID. PLoS One 12(7), e0179611 (2017)
22. Y. Jiang, C. Yang, J. Na, G. Li, Y. Li, J. Zhong, A brief review of neural networks based learning and control and their applications for robots. Complexity 2017, Article ID 1895897 (2017)
23. A. Sun, X. Cao, X. Xiao, L. Xu, A fuzzy-based bio-inspired neural network approach for target search by multiple autonomous underwater vehicles in underwater environments. Intell. Autom. Soft Comput. 27(2), 551–564 (2021)
24. F. Muñoz, J.S. Cervantes-Rojas, J.M. Valdovinos, O. Sandre-Hernández, S. Salazar, H. Romero, Dynamic neural network-based adaptive tracking control for an autonomous underwater vehicle subject to modeling and parametric uncertainties. Appl. Sci. 11(6), 2797 (2021)
25. T. Praczyk, Neural collision avoidance system for biomimetic autonomous underwater vehicle. Soft Comput. 24, 1315–1333 (2020)
26. K.-C. Chang, K.-C. Chu, Y.C. Lin, J.-S. Pan, Overview of some intelligent control structures and dedicated algorithms (2020)
27. M. Kumar, Robust PID tuning of autonomous underwater vehicle using harmonic search algorithm based on model order reduction. Int. J. Swarm Intell. Evol. Comput. 4 (2015)
28. J. Wan, B. He, D. Wang, T. Yan, Y. Shen, Fractional-order PID motion control for AUV using cloud-model-based quantum genetic algorithm. IEEE Access 7, 124828–124843 (2019)
29. M. Wang, B. Zeng, Q. Wang, Study of motion control and a virtual reality system for autonomous underwater vehicles. Algorithms 14(3), 93 (2021)
30. T. Liu, Y. Hu, H. Xu, Deep reinforcement learning for vectored thruster autonomous underwater vehicle control. Complexity 2021, Article ID 6649625 (2021)
31. I. Carlucho, M. De Paula, S. Wang, Y. Petillot, G.G. Acosta, Adaptive low-level control of autonomous underwater vehicles using deep reinforcement learning. Robot. Auton. Syst. 107, 71–86 (2018). ISSN 0921-8890
32. L. Zacchini, A. Ridolfi, A. Topini, N. Secciani, A. Bucci, E. Topini, B. Allotta, Deep learning for on-board AUV automatic target recognition for optical and acoustic imagery. IFAC-PapersOnLine 53(2), 14589–14594 (2020). ISSN 2405-8963
33. H. Tariq et al., A hybrid linear quadratic regulator controller for unmanned free-swimming submersible. Appl. Sci. 11(19), 9131 (2021)
Fused Feature-Driven ANN Model for Estimating Code-Mixing Level in Audio Samples K. Priya, S. Mohamed Mansoor Roomi, R. A. Alaguraja, and P. Vasuki
Abstract Code-mixing spoken language identification (CM-SLI) from the speech signal plays a vital role in many computer-aided voice analysis applications. To identify the level of code-mixing in a speech signal, a powerful feature is needed for training the model. In the proposed work, such a feature is extracted from the speech by mel frequency cepstral coefficients (MFCC), Delta Delta MFCC (D2 MFCC), and pitch. These features are fused and trained by a multilayer perceptron (MLP) neural network (NN) with a Bayesian regularization (BR) function. This classifies the given audio sample into Tamil or English and achieves an accuracy of 97.6%. Then, the level of language mixing is estimated by classifying fragments of the audio, which reveals the speaker's acquaintance with the chosen language. Keywords Code mixing · Language identification · Mel frequency cepstral coefficients · Pitch · Multilayer perceptron · Neural network
1 Introduction Speech is the most important communication modality, conveying about 80% of the information in face-to-face interaction and 100% of the information in telephonic conversation. The process of determining the language of an anonymous speaker's utterance, regardless of gender, accent, or pronunciation, is known as spoken language identification (SLI). SLI plays a significant role in speech processing applications [1, 2] like K. Priya (B) · S. Mohamed Mansoor Roomi · R. A. Alaguraja ECE, Thiagarajar College of Engineering, Madurai, Tamil Nadu, India e-mail: [email protected] S. Mohamed Mansoor Roomi e-mail: [email protected] R. A. Alaguraja e-mail: [email protected] P. Vasuki ECE, Sethu Institute of Technology, Madurai, Tamil Nadu, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_23
automatic speech recognition (ASR), speech coding, speech synthesis, speech-to-text communication, and speaker and language identification. People can distinguish one language from another without prior knowledge of the language's terms, but this is a tedious process for machines. In service centers, SLI systems can be used to route foreign inquiries to an operator who speaks the recognized language proficiently [3]. SLI is also used in voice-controlled information retrieval systems like Amazon Alexa, Google Assistant, and Apple Siri. Several research methods are available to address the SLI problem with acoustic or phonotactic features. The acoustic features include a wide range of vocal tract characteristics such as rhythm, pitch, and stress. Lin and Wang [4] proposed a language identification system using a Gaussian mixture model (GMM) based on pitch contour. Other acoustic features, such as linear prediction (LP), linear prediction cepstral coefficients (LPCC), and mel frequency cepstral coefficients (MFCC), are also used in speech processing applications. In SLI, extra features are fused with these basic features and classified with artificial neural networks (ANN) [5]. A fresh universal acoustic description method for language recognition was developed by Siniscalchi et al. [6], who investigated a universal set of fundamental units that can be defined in any language. Singh et al. [7] presented a deep neural network for language identification that uses the spectrogram of the speech signal as input. In speech processing analysis, MFCC is the dominant acoustic feature: it reproduces the shape of human voice perception and provides speaker-independent features [8, 9]. Venkatesan et al. [10] proposed a framework for language identification by extracting MFCC and classifying these features with a support vector machine (SVM). Zissman [11] presented work to identify language using pitch and syllable timing as features classified by a GMM. SLI is also proposed by Sadanandam [12] by fusing the acoustic features MFCC and fundamental frequency; the fused features were recognized by a hidden Markov model (HMM). Sarthak et al. [13] proposed language identification using attention-module-based convolutional networks operating on log mel spectrogram images of the speech signals in the Voxforge dataset. Multilingual speakers frequently flip between languages in a social media conversation, and automatic language recognition becomes both an essential and a difficult task in such an environment. Code-mixing is a common term in multilingual societies, referring to the intra-sentential switching of two different languages. So, for any given audio sample containing a mix of two languages, the proposed algorithm finds the level of language mixing by classifying fragments of the audio into Tamil or English. This provides information on how the speaker mixes two languages, voluntarily or involuntarily, leading to a gauge of the versatility and acquaintance of the speaker with the chosen language. Major contributions of this work are • Proposal of a multilayer perceptron (MLP) neural network (NN) with Bayesian regularization (BR)-based code-mixing spoken language identification (CM-SLI). • Selection and fusion of acoustic features such as MFCC, D2 MFCC, and pitch. • Classification of the fused features by ANN.
• Estimating the level of code-mixing in spoken language. • Performance comparison on the speech dataset against various ANN classifiers. This proposed work is structured as follows: Sect. 2 describes the speech database collection, Sect. 3 describes the proposed methodology using feature extraction and ANN classification algorithms, Sect. 4 explains the experimental results and their comparison with other ANN models, and Sect. 5 concludes the paper.
2 Database Collection The objective of the proposed work is to identify code-mixed language in speech signals collected from YouTube. The collected speech material consists of samples from two languages, Tamil and English. The total number of files in the language database is 2500, at a sampling frequency of 44.1 kHz, and each class contains 50% of the total files, i.e., 1250. The length of the speech files varies from 5 to 15 s; in this work, speech files of 5 s length from both classes are used to identify the language.
3 Proposed Methodology The flow of the proposed methodology is shown in Fig. 1. It comprises four stages: speech signal detection and segmentation, framing and windowing, feature extraction, and an ANN model to classify the given code-mixed speech signal as Tamil or English.
Fig. 1 Block diagram of code-mixing spoken language identification system
3.1 Speech Signal Detection and Segmentation The speech signal may contain silent parts, and a silent part carries no information, so speech signal analysis requires the removal of silence. Equation 1 represents the input speech signal as silence and speech parts. In this work, voice activity detection is used to remove the silence from the input speech signal I. The silence-removed speech signal y is shown in Eq. 2.

$$I = \text{Silence}(S_s) + \text{Speech}(S_p) \tag{1}$$

$$y = [t_1, t_2, \ldots, t_T] \tag{2}$$
3.2 Framing and Windowing The speech signal is time-varying and non-stationary. Over a long period, the characteristics of the signal change, which is reflected in the different sounds spoken by the same speaker; if the length of a speech segment is short (10–30 ms), it can be considered stationary. Framing and windowing are therefore used to exploit the speech signal characteristics properly. Framing [14] converts the non-stationary signal into stationary segments by splitting the speech signal into short small signals.

$$F(n) = \{y_1 \mid y_2 \mid y_3 \ldots y_n\} \tag{3}$$

$$F_w(n) = F(n) \cdot w(n) \tag{4}$$

$$w(n) = \begin{cases} 0.54 - 0.46 \cos\left(\dfrac{2\pi r}{N-1}\right), & 0 \le r \le N-1 \\ 0, & \text{otherwise} \end{cases} \tag{5}$$

The framed signal F(n) is obtained from Eq. 3. Then, the Hamming window function w(n) (Eq. 5) is applied to the framed signal as shown in Eq. 4, which eradicates spectral distortions in the signal.
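A brief sketch of Eqs. 3–5 in code: split the signal into overlapping frames and weight each by the Hamming window. The 20 ms/10 ms frame and hop lengths follow the choice used later in the experiments; the random signal is a stand-in for real audio.

```python
import numpy as np

def frame_and_window(y, sr, frame_ms=20, hop_ms=10):
    """Overlapping frames (Eq. 3) multiplied by the Hamming window (Eqs. 4-5)."""
    N = int(sr * frame_ms / 1000)     # samples per frame
    hop = int(sr * hop_ms / 1000)
    r = np.arange(N)
    w = 0.54 - 0.46 * np.cos(2 * np.pi * r / (N - 1))   # Eq. 5
    return np.stack([y[s:s + N] * w                     # Eq. 4
                     for s in range(0, len(y) - N + 1, hop)])

sr = 44100
y = np.random.randn(sr)               # 1 s of stand-in audio
print(frame_and_window(y, sr).shape)  # (n_frames, 882)
```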
3.3 Feature Extraction Mel Frequency Cepstral Coefficients (MFCC) Feature extraction is the process of extracting significant information and removing unrelated information. Here, MFCC, Delta Delta MFCC (D2 MFCC), and pitch are extracted to identify the language from the speech signal. MFCC is the most important acoustic feature, representing the short-term power and shape of the speech signal; it provides compressed information about the vocal tract in the form of a small number of coefficients and is based on the linear cosine transform of the log power spectrum on a nonlinear mel scale of frequency. The flow of the MFCC is shown in Fig. 2. First, the windowed signal is converted into the frequency domain by the discrete Fourier transform (DFT) using Eq. 6.

$$X(k) = \sum_{n=0}^{N-1} F_w(n)\, e^{-j 2 \pi n k / N}, \quad 0 \le k \le N-1 \tag{6}$$
where X(k) is the DFT of the signal and N is the number of DFT points computed. The DFT of the signal is applied to mel filter banks to compute the mel spectrum. The mel is a frequency measure that imitates human ear perception; the mel-scale frequency f_mel corresponding to a physical frequency f is calculated using Eq. 7.

$$f_{\text{mel}} = 2595 \log_{10}\left(1 + \frac{f}{700}\right) \tag{7}$$
where f is the physical frequency and f_mel is the perceived frequency. Triangular filters are used to compute the magnitude spectrum envelope s(m) in Eq. 8.

$$s(m) = \sum_{k=0}^{N-1} |X(k)|^2 H_m(k), \quad 0 \le m \le M-1 \tag{8}$$
where M is the total number of weighting filters and H_m(k) denotes the weight of the kth energy spectrum component. The energy levels of adjacent bands are correlated because of the smoothness of the vocal tract; decorrelation is achieved by the discrete cosine transform, which produces the mel cepstral coefficients. The MFCC is computed using Eq. 9.
Fig. 2 Block diagram of MFCC
$$c(n) = \sum_{m=0}^{M-1} \log_{10}(s(m)) \cos\left(\frac{\pi n (m - 0.5)}{M}\right), \quad n = 0, 1, 2, \ldots, C-1 \tag{9}$$
where c(n) are the mel cepstral coefficients and C is the number of coefficients. They provide highly discriminative spectral information about the signal.

Delta Delta (D2) MFCC The second-order derivative of the MFCC gives the D2 MFCC coefficients, which carry temporal dynamics and acceleration information of the speech. The delta is the difference between following and preceding coefficients, as represented in Eq. 10.

$$d_i = \frac{\sum_{j=1}^{g} (c_{i+j} - c_{i-j})}{2 \sum_{j=1}^{g} j^2} \tag{10}$$

where d_i is the delta coefficient of frame i. The D2 MFCC is returned when the derivative order N = 2.

Pitch The periodic excitation of the vocal folds produces a voiced signal whose pitch can be estimated in time or as the fundamental frequency f_0 in the frequency domain. In this methodology, the pitch is calculated as the fundamental frequency f_0, the vibration rate of the vocal cords per second during voiced speech, using the spectral harmonic-to-noise ratio (HNR) method [15]. First, the Hamming window function is multiplied with the framed signal as in Eq. 4; then the fast Fourier transform (FFT) is applied to the windowed signal in Eq. 11. The resulting array has complex values, and all bins except the first are doubled; the first N/2 values are taken as the energy spectrum, computed in Eq. 12. The pitch f_c is then estimated using the HNR in Eq. 13.

$$s(k\omega) = \frac{1}{NT} \sum_{n=0}^{N-1} F_w(nT)\, e^{-j 2 \pi n k / N} \tag{11}$$

$$E(m\omega) = \begin{cases} |s(k\omega)|, & k = 0 \\ |2 s(k\omega)|, & k = 1, 2, \ldots, N/2 \end{cases} \tag{12}$$

$$\text{HNR}(f_c) = \frac{E(f_c) + E(2 f_c) + E(3 f_c)}{E(f < 3 f_c,\ f \ne f_c,\ f \ne 2 f_c)} \tag{13}$$
Table 1 Fused acoustic feature descriptions

Acoustic features   Indices   Number of features
MFCC                1–13      13
D2 MFCC             14–26     13
Pitch               27        1
Total                         27
3.4 Fusion of Acoustic Features The extracted acoustic features are fused into a vector denoted U (Eq. 14) whose length is 27. The first 13 features, U_1 to U_13, are extracted from the MFCC coefficients; the next features, U_14 to U_26, are extracted from the D2 MFCC; and the final feature, U_27, is the pitch. Table 1 summarizes the fused feature descriptions.

$$U = [U_1, U_2, U_3, \ldots, U_{27}] \tag{14}$$
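A sketch of building the 27-dimensional vector U of Eq. 14 is shown below. It leans on librosa for the MFCC and delta computations; librosa's yin pitch estimator is used here as a stand-in for the paper's HNR-based method, and the audio file name is hypothetical.

```python
import numpy as np
import librosa

def fused_features(path):
    """13 mean MFCCs + 13 mean D2-MFCCs + 1 pitch value = U1..U27."""
    y, sr = librosa.load(path, sr=44100)
    y, _ = librosa.effects.trim(y)                      # crude silence removal
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)  # shape (13, n_frames)
    d2 = librosa.feature.delta(mfcc, order=2)           # second-order deltas
    f0 = librosa.yin(y, fmin=50, fmax=400, sr=sr)       # per-frame pitch (Hz)
    return np.concatenate([mfcc.mean(axis=1),
                           d2.mean(axis=1),
                           [np.nanmean(f0)]])

U = fused_features("sample.wav")   # hypothetical audio file
print(U.shape)                     # (27,)
```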
3.5 Artificial Neural Network The ANN is a common technique used in various image and signal processing applications [16–18]. It learns the complex mapping between input and output layers, as shown in Fig. 3. In this work, a multilayer perceptron neural network (MLP-NN) is used for language identification: the input layer size depends on the features, the output layer matches the number of languages to detect, and the number of hidden units is chosen by trial and error. The following equations of the backpropagation algorithm are presented in the order in which they are employed during training for a single training vector V, where w stands for weights, f for activation functions, b for bias, and y for the target.

$$\text{net}_j^h = \sum_{i=1}^{N} W_{ji}^h V_i + b_j^h \tag{15}$$

$$i_j = f_j^h\!\left(\text{net}_j^h\right) \tag{16}$$
Equation 15 applies the input vector to the hidden layer, and Eq. 16 gives the resulting hidden-layer output i_j. The hidden-layer outputs are then propagated to the output layer (Eq. 17).
Fig. 3 Block diagram of artificial neural network
$$\text{net}_k^o = \sum_{j=1}^{L} W_{kj}^o\, i_j + b_k^O \tag{17}$$

$$O_k = f_k^O\!\left(\text{net}_K^O\right) \tag{18}$$
Then, the hidden-layer outputs are applied to the output layer, as represented by Eq. 18, whose result provides the classification output. The error terms are computed in Eqs. 19 and 20.

$$\delta_k^O = (y_k - O_k)\, f_k^{O\prime}\!\left(\text{net}_K^O\right) \tag{19}$$

$$\delta_j^h = f_j^{h\prime}\!\left(\text{net}_j^h\right) \sum_{k=1}^{N} \delta_k^O W_{kj}^o \tag{20}$$
The MLP uses Eqs. 19 and 20 to calculate the error terms for the output layer (δ_k^O) and the hidden layer (δ_j^h), and backpropagates them through the weight updates of Eqs. 21 and 22 to minimize the error.

$$w_{kj}^o(t+1) = w_{kj}^o(t) + \eta\, \delta_k^O\, i_j \tag{21}$$

$$w_{ji}^h(t+1) = w_{ji}^h(t) + \eta\, \delta_j^h\, X_i \tag{22}$$

$$E_p = \frac{1}{2} \sum_{k=1}^{M} \delta_k^2 \tag{23}$$
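For clarity, Eqs. 15–23 can be consolidated into a single numpy training step for the 27–20–2 network with sigmoid units, as sketched below. Note that the paper trains with Bayesian regularization, for which this plain gradient step is only the underlying building block, and the sample vector is randomly generated.

```python
import numpy as np
rng = np.random.default_rng(1)

f = lambda x: 1.0 / (1.0 + np.exp(-x))       # sigmoid activation
Wh, bh = rng.normal(0, 0.1, (20, 27)), np.zeros(20)
Wo, bo = rng.normal(0, 0.1, (2, 20)), np.zeros(2)
eta = 0.1

def train_step(V, y):
    global Wh, bh, Wo, bo
    i = f(Wh @ V + bh)                       # Eqs. 15-16: hidden outputs
    O = f(Wo @ i + bo)                       # Eqs. 17-18: network outputs
    dO = (y - O) * O * (1 - O)               # Eq. 19 (sigmoid derivative)
    dh = i * (1 - i) * (Wo.T @ dO)           # Eq. 20
    Wo += eta * np.outer(dO, i); bo += eta * dO   # Eq. 21
    Wh += eta * np.outer(dh, V); bh += eta * dh   # Eq. 22
    return 0.5 * np.sum((y - O) ** 2)        # Eq. 23: error Ep

V, y = rng.normal(size=27), np.array([1.0, 0.0])  # stand-in Tamil sample
for _ in range(100):
    Ep = train_step(V, y)
print(f"Ep after 100 steps: {Ep:.5f}")
```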
The network is trained until the error E_p (Eq. 23) becomes minimal for each of the input vectors. Finally, the classified speech signal is categorized as Tamil or English, and the level of each spoken language is calculated using the following steps. Let the length of the input speech signal L comprise the length of the silence signal S and the length of the speech signal SP, as shown in Eq. 24.

$$L = S + SP \tag{24}$$

$$S = S_1, S_2, \ldots, S_N \tag{25}$$

$$SP_T = L - \sum_{n=1}^{N} S_n \tag{26}$$
The length of the total speech signal is calculated by subtracting the total silence from the input speech signal; the total silence S and the total speech SP_T are represented in Eqs. 25 and 26, respectively.

$$SP_T = SP_1 + SP_2 + \cdots + SP_m \tag{27}$$

$$SP_{ET} = \sum_{i=1}^{k} SP_{T_i} \tag{28}$$

$$SP_{TT} = \sum_{j=1}^{l} SP_{T_j} \tag{29}$$
The total speech time SP_T consists of the segmented speech times between the silent parts, as shown in Eq. 27. The total English (SP_ET) and Tamil (SP_TT) speech times are obtained from Eqs. 28 and 29.

$$LSP_{ET} = \frac{SP_{ET}}{SP_T} \tag{30}$$

$$LSP_{TT} = \frac{SP_{TT}}{SP_T} \tag{31}$$
The levels of English (LSP_ET) and Tamil (LSP_TT) speech are calculated as the ratio of each language's speech time to the total speech time, as shown in Eqs. 30 and 31.
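The level computation of Eqs. 27–31 reduces to a few lines once each speech segment carries a language label; the function below reproduces the worked example reported later in Table 4.

```python
def code_mix_level(segments):
    """segments: (language, duration_s) pairs for the speech parts
    between silences. Returns each language's share of speech time."""
    sp_t = sum(d for _, d in segments)                           # Eq. 27
    sp_et = sum(d for lang, d in segments if lang == "English")  # Eq. 28
    sp_tt = sum(d for lang, d in segments if lang == "Tamil")    # Eq. 29
    return {"English": sp_et / sp_t,   # Eq. 30
            "Tamil": sp_tt / sp_t}     # Eq. 31

print(code_mix_level([("Tamil", 0.69), ("English", 1.82), ("English", 1.89)]))
# {'English': 0.8431..., 'Tamil': 0.1568...}  -> 84.31% English
```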
4 Experimental Results The CM-SLI system for the collected speech database is developed using MATLAB 2020a on a Core i7 laptop with 8 GB RAM and a GeForce GTX GPU. 70% of the speech files are used for training the network, and 30% for validation and testing. The input speech signal containing a mix of two languages, shown in Fig. 4, is silence-removed and segmented, then framed to 20 ms length with 10 ms overlap; the detected speech signal is shown in Fig. 5. Hamming windowing is applied to the framed signal for feature extraction, and MFCC, D2 MFCC, and pitch are extracted from the windowed signal. The number of MFCC coefficients per frame is 13, and the mean of each coefficient over the windowed signal is used as a feature; likewise, the means of the D2 MFCC coefficients give the next 13 features, and the pitch is calculated as the last feature. These acoustic features are then fused to form the input vector of length 27. The extracted feature vector is the input of the MLP-NN, which is trained with the Bayesian regularization (BR) function to achieve language identification. In this work, the input layer contains 27 features, the number of hidden units is 20, and the output layer size is 2, so the network architecture (input–hidden–output) is 27–20–2. The number of training speech samples is 1750, so the number of training factors is 1750 * 27. The network modeling parameters are shown in Table 2. Fig. 4 Sample input speech signal
Fig. 5 Sample detected speech signal
Table 2 Network modeling parameters

Training algorithm   Transfer function         Network structure   Training data   Verifying data   Testing data
MLP                  Bayesian regularization   27–20–2             1750 * 27       375 * 27         375 * 27
The MLP-NN was trained with the Bayesian regularization (BR) training function for 100 epochs. The number of hidden units is 20, and the training time is 26 s with a performance of 0.000862. After training, the performance function reaches the target for all samples, and the model achieves 97.9% classification accuracy on the testing data, as shown in Table 3. The sample code-mixed input speech signal was segmented into three segments, as shown in Fig. 6: the identified language of the first segment is Tamil with a length of 0.69 s, and the other two segments are English with lengths of 1.82 s and 1.89 s, respectively. This yields a code-mixing level of 84.31%, shown in Table 4.

Table 3 Training parameters

Training algorithm   Number of hidden layers   Epoch   Performance   Training time (s)   Accuracy (%)
MLP-BR               20                        100     0.000862      26                  97.9
Fig. 6 Detected language (Tamil and English)
Table 4 Level of code-mixing spoken language identification

Speech                          (SP1)         (SP2)           (SP3)           Total speech time = SP1 + SP2 + SP3
Language                        Tamil (SPT)   English (SPE)   English (SPE)   –
Time (s)                        0.69          1.82            1.89            4.4
% of language detection         15.69         41.36           42.95           –
Total % of language detection   –             –               –               84.31
4.1 Evaluation Metrics for Language Identification In this section, the evaluation metrics for language identification are discussed. The performance of the MLP-NN classifier is evaluated using precision, recall, and F1-score; the confusion matrix of the MLP-NN-BR is shown in Fig. 7. These performance metrics indicate whether the classifier performs well or not. In this work, the proposed classifier achieves a high accuracy of 97.9%, and the other performance metrics are shown in Table 5. Figure 8 compares the language identification results across various MLP training algorithms: Levenberg–Marquardt (LM), scaled conjugate gradient (SCG), Polak–Ribiere conjugate gradient (CGP), conjugate gradient with Powell/Beale restarts (CGB), BFGS quasi-Newton (BFG), and Fletcher–Powell conjugate gradient (CGF). Among these training functions, BR achieves the best classification accuracy.
Fig. 7 Confusion matrix of the MLP-NN-BR

Table 5 Experimental results of the speech fluency recognition

Database                 Class     Precision %   Recall %   F1-score   Accuracy (%)
Collected from YouTube   Tamil     98.4          97.4       97.89      97.9
                         English   97.3          98.4       97.84

Fig. 8 Comparison results of various MLP algorithms
Table 6 Comparison of the proposed method with state-of-the-art

Database                            Features                               Classifier   Accuracy (%)
Three Indian languages [3]          Revised perceptual linear prediction   GMM          88.75
Constructed database [5]            MFCC                                   SVM          74
Four Indian languages [10]          MFCC                                   SVM          76
Six Indian languages [12]           MFCC                                   HMM          90.63
Voxforge dataset [13]               Log mel spectrogram                    ConvNet      96.3
Collected from YouTube (proposed)   MFCC, D2 MFCC, Pitch                   ANN          97.6
Table 6 compares the performance of the proposed method with state-of-the-art methods. Most of these methods use MFCC alone to identify language on Indian-language databases, whereas the proposed method uses fused features and also estimates the level of language mixing, achieving the best classification accuracy among the compared methods.
5 Conclusions A powerful acoustic-feature-driven MLP-BR NN for code-mixing spoken language identification is proposed. The proposed method estimates the level of language mixing in speech signals through signal detection and segmentation, fused acoustic feature extraction, and classification. The pre-processing step normalizes the speech signal, and the acoustic features MFCC, D2 MFCC, and pitch are extracted from the pre-processed signal. These features are classified with different NN classifiers, among which the MLP-BR NN classifier achieves the highest accuracy for language identification; the level of the identified language is then estimated to support language processing applications. The proposed method achieves 97.9% classification accuracy. As a result, it is highly recommended for estimating the level of code-mixing based on the speech signal. Acknowledgements This work was supported by Thiagarajar Research Fellowship (TRF) in Thiagarajar College of Engineering, Madurai.
References 1. A.P. Pandian, Performance evaluation and comparison using deep learning techniques in sentiment analysis. J. Soft Comput. Paradigm (JSCP) 3(2), 123–134 (2021)
2. M. Tripathi, Sentiment analysis of Nepali COVID19 tweets using NB, SVM AND LSTM. J. Artif. Intell. 3(03), 151–168 (2021) 3. P. Kumar, A. Biswas, A.N. Mishra, M. Chandra, Spoken language identification using hybrid feature extraction methods. arXiv prepr. arXiv:1003.5623 (2010) 4. C.Y. Lin, H.C. Wang, Language identification using pitch contour information. Department of Electrical Engineering, National Tsing Hua University, Hsinchu, Taiwan 2005 5. B. Aarti, S.K. Kopparapu, Spoken Indian language classification using artificial neural network—an experimental study. in 2017 4th IEEE International Conference on Signal Processing and Integrated Networks (SPIN), pp. 424–430 Sept (2017) 6. S.M. Siniscalchi, J. Reed, T. Svendsen, C.-H. Lee, Universal attribute characterization of spoken languages for automatic spoken language recognition. Comput. Speech Lang. 27(1), 209–227 (2013) 7. G. Singh, S. Sahil, V. Kumar, M. Kaur, M. Baz, M. Masud, Spoken language ıdentification using deep learning. Comput. Intell. Neurosci. Article ID 5123671, 12 (2021) 8. S. Jothilakshmi, V. Ramalingam, S. Palanivel, A hierarchical language identification system for Indian languages. Digital Signal Process. 22(3), 544–553 (2012) 9. M.B. Alsabek, I. Shahin, A. Hassan, in Studying the Similarity of COVID-19 Sounds based on Correlation Analysis of MFCC. https://dblp.org/rec/journals/corr/abs-2010-08770.bib (2020) 10. H. Venkatesan, T.V. Venkatasubramanian, J. Sangeetha, Automatic language ıdentification using machine learning techniques, in Proceedings of the International Conference on Communication and Electronics Systems (2018). IEEE Xplore Part Number: CFP18AWO-ART; ISBN:978-1-5386-4765-3 11. M.A. Zissman, Automatic language identification using Gaussian mixture and hidden Markov models, in 1993 IEEE International Conference on Acoustics, Speech, and Signal Processing. (Minneapolis, MN, USA, 1993), pp. 399–402 12. M. Sadanandam, HMM based language identification from speech utterances of popular Indic languages using spectral and prosodic features. Traitement Signal 38(2), 521–528 (2021) 13. Sarthak, S. Shukla, G. Mittal, Spoken Language Identification Using ConvNets, arXiv:1910.04269v1 [cs.CL] 9 Oct (2019) 14. O.K. Hamid, Frame Blocking and Windowing Speech Signal. J. Inf. Commun. Intell. Syst. (JICIS) 4(5) (2018).ISSN: 2413–6999 15. S. Markov, A. Minaev, I. Grinev, D. Chernyshov, B. Kudruavcev, V. Mladenovic, A spectralbased pitch detection method. AIP Conf. Proc. 2188, 050005 (2019). https://doi.org/10.1063/ 1.5138432 16. S.R. Mugunthan, T. Vijayakumar, Design of improved version of sigmoidal function with biases for classification task in ELM domain. J. Soft Comput. Paradigm (JSCP) 3(02), 70–82 (2021) 17. H.K. Andi, An accurate bitcoin price prediction using logistic regression with LSTM machine learning model. J. Soft Comput. Paradigm 3(3), 205–217 (2021) 18. J.L.Z. Chen, K.L. Lai, Deep convolution neural network model for credit card fraud detection and alert. J. Artif. Intell. 3(02),101–112 (2021)
Pre-emptive Caching of Video Content Using Predictive Analysis Rohit Kumar Gupta, Atharva Naik, Saurabh Suthar, Ashish Kumar, and Ankit Mundra
Abstract Pre-emptive caching is a technique to pre-fetch data based on the outcome of algorithmic predictions. In this paper, we use machine learning models that account for time-based trends, instead of metadata, to estimate a popularity score that is integrated with existing caching schemes to pre-fetch trending videos or to replace cached content with newer videos receiving more user requests. The paper primarily focuses on regression models for prediction, as they are faster to train, which is crucial for a resource-intensive task like caching. Keywords Popularity prediction · Video content · Caching · CDN · Regression model
1 Introduction With increasing Internet usage, passive entertainment consumption, and the booming scale of video hosting websites, managing content distribution has become very important in recent years, so that a feed can be streamed fast enough to minimize lags in transmission. To overcome this persistent scaling issue, servers have become more decentralized through content delivery networks (CDNs), which are server proxies that host cached data received from a centralized database. But the upkeep of a CDN [1–3] can be very expensive if resources are not efficiently managed: caching every video is impractical and commercially unfeasible due to the high space and server infrastructure requirements. Instead, CDNs rely on popularity determination to estimate what kind of video content is more in demand and cache that content on multiple edge nodes, which then beam R. K. Gupta (B) · A. Naik · S. Suthar · A. Kumar · A. Mundra (B) Manipal University Jaipur, Jaipur, India e-mail: [email protected] A. Mundra e-mail: [email protected] A. Kumar e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_24
the stream to numerous users across spread-out locations. Commonly used CDN caching techniques rely solely on the number of user requests a particular piece of content receives before caching it; this is an example of non-pre-emptive caching. Newer CDNs use complex models to detect increasing popularity and pre-fetch data before it receives many user requests. We have performed a comparative study using country-wise statistical data from YouTube and machine learning models from Microsoft Azure ML Studio to gauge the best-performing algorithms for predicting video popularity, and we utilized them to set a custom caching policy. Compared to industry-grade caching, pre-emptive policies improve average response time and minimize transmission latency, as content with increasing popularity is pre-cached on CDN edge nodes, thereby limiting network overheads.
2 Related Work Existing approaches rely on the metadata of videos for predicting their future popularity [4, 5]. These methods are robust but consume a lot of time in training the learning models, and most applications are focused on keyword-based prediction models. We intend to use a trend-based methodology instead: the system establishes relationships between the most important metrics of a video [2, 6], which include, but are not limited to, the numbers of dislikes, likes, and comments. By using regression models instead of neural networks, training time and model complexity can be lowered; faster predictions [7] are optimal for pre-emptive caching, where speedy delivery of content is a major priority. A custom caching policy [2, 8] can be created to override a cached item when another video has a higher request count that is not captured by the prediction model. In [1], human perception models are used to consider network parameters like topology, data links, and strategies using mixed-linear functions; this approach utilizes image data and continuous feeds from video sources to train models, but thumbnails are user defined and may not represent the real video content. In [3], an LSTM approach is used, which is highly suitable for predictive analysis and provides high accuracy, but it requires time-series data that is unavailable in the datasets used here. In [9], CDNs are shown to play a major role in delivering content at reduced latency and providing a high quality of experience to users. Due to the limited storage on caching servers [10], it is of utmost importance to manage data transmission on these servers, which requires optimizing the caching policies to maximize cache hits.
3 Methodology

We use a real-world YouTube dataset [11] taken from Kaggle for predictive analysis of views, and videos are cached in the content delivery network (CDN) based on their predicted score. We have compared predictions using different models: Poisson regression [12], linear regression [13], decision forest regression [14], boosted decision tree regression [15], and neural network-based regression [16]. The learning model with the highest accuracy across the multiple countries' video content in the dataset is selected for use in the caching scheme.
3.1 Dataset

The dataset used for the proposed approach includes the following features:

• video_id
• trending_date
• title
• channel_title
• category_id
• publish_time
• tags
• views
• likes
• dislikes
3.2 Data Pre-processing

Data pre-processing includes cleaning the dataset by removing null values and outliers, analysing the data distribution, converting categorical data into numeric data (the machine learning models only accept numeric data for training), and performing normalization or standardization according to the use case. The taken dataset includes irrelevant features like video_id and category_id, which do not affect video popularity. The trending date is converted to the day of the week; since the day is a categorical variable, it is converted to dummy variables, creating a column for every category. We do not use features like title and channel_title, because the new columns added when such categorical data is converted to numeric form would increase the computational complexity. The data is densely populated near the centre, so we normalize it using log normalization.
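A minimal pandas sketch of these pre-processing steps follows; the column names match the dataset features listed in Sect. 3.1, while the date format and the exact cleaning order are our assumptions rather than the paper's exact pipeline.

```python
import numpy as np
import pandas as pd

df = pd.read_csv("USvideos.csv")
df = df.dropna()                                    # remove null values
df = df.drop(columns=["video_id", "category_id",    # irrelevant features
                      "title", "channel_title"])

# trending_date -> day of week, then one-hot (dummy) encode the category;
# the yy.dd.mm format is an assumption about the Kaggle file layout
day = pd.to_datetime(df["trending_date"], format="%y.%d.%m").dt.day_name()
df = pd.concat([df, pd.get_dummies(day, prefix="day")], axis=1)

# counts are densely packed near zero, so apply log normalization
for col in ["views", "likes", "dislikes", "comment_count"]:
    df[col] = np.log1p(df[col])
```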
Fig. 1 Distribution of comment_count versus frequency after normalization
3.3 Normalization

Initial visualizations of the dataset indicated that most parameters were positively skewed. As demonstrated for one of the parameters, viz. comment_count for the Canadian YouTube dataset, videos with fewer than 100,000 comments outnumber every other category. The dataset was normalized [17] using a log transform to reduce the skewness of the data distribution. The post-normalization distributions are shown in Figs. 1 and 2.
3.4 Correlation Matrix

An initial correlation matrix indicated many features with low correlation to the view counts. We dropped these features in favour of three highly correlated ones, viz. likes, dislikes, and comment_count (Fig. 3).
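Continuing the pre-processing sketch above, the correlation-based feature selection could look as follows; the "keep the three most correlated attributes" choice mirrors the text, everything else is illustrative.

```python
# correlation of each candidate attribute with the view count
corr = df[["views", "likes", "dislikes", "comment_count"]].corr()
print(corr["views"].sort_values(ascending=False))

# retain the three attributes most correlated with views
features = ["likes", "dislikes", "comment_count"]
```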
Fig. 2 Scatterplot distribution of comments versus views after normalization
Fig. 3 Correlation matrix showing relationships between the four finalized attributes—likes, dislikes, comment count and views
3.5 Predictive Analysis Methods

a. Linear regression: Linear regression [13] predicts outcomes based on dependent variables by establishing a linear correlation between the predictor parameters and the independent variables, which determines the output of the model. Simple linear regression is one of the most common regression approaches for finite continuous data.

b. Poisson regression: Poisson regression [12] predicts outcomes based on count data and contingency tables. The output or response variable is assumed to follow a Poisson distribution.

c. Decision forest regression: Decision forest regression is an ensemble learning technique that works well on nonlinear data distributions by creating decision trees. The output of each tree is a Gaussian distribution, which is aggregated to find a resultant value.

d. Boosted decision tree: Boosted decision tree regression utilizes the MART gradient boosting algorithm to efficiently build a regression tree in a series of steps. Each step has a predefined loss function that is corrected in the next step. It is optimal for linear as well as nonlinear distributions.

e. Neural networks: A neural network [16] is a network of small computing units, i.e. neurons, which perform the computation. It can be used for tasks with big data, which will eventually increase the accuracy of the model. But sometimes classical machine learning methods perform better than neural networks [16] on regression problems.
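As an illustration of this comparison, the sketch below fits scikit-learn equivalents of the five model types on the three selected features and reports R2 and MAE; the module mapping (e.g. decision forest ≈ random forest, neural network ≈ MLP) is our assumption about local stand-ins for the Azure ML Studio modules, not the exact configuration used in the study.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression, PoissonRegressor
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import r2_score, mean_absolute_error

df = pd.read_csv("USvideos.csv").dropna()
X = np.log1p(df[["likes", "dislikes", "comment_count"]])   # log-normalized
y = np.log1p(df["views"])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=42)

models = {
    "Linear regression": LinearRegression(),
    "Poisson regression": PoissonRegressor(),
    "Decision forest": RandomForestRegressor(n_estimators=100),
    "Boosted decision tree": GradientBoostingRegressor(),
    "Neural network": MLPRegressor(hidden_layer_sizes=(64,), max_iter=500),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    pred = model.predict(X_te)
    print(f"{name}: R2={r2_score(y_te, pred):.4f}, "
          f"MAE={mean_absolute_error(y_te, pred):.4f}")
```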
3.6 Caching Methodology

The prediction model can be coupled with existing caching policies; for faster pre-emption, a score-gated least recently used (SG-LRU) [8] caching policy can be used. It caches videos based on their prediction score from the highest-accuracy model, i.e. the boosted decision tree. The CDN caches videos according to the highest prediction score and the space available in the CDN. View counts are then accumulated from the hits (user requests) received on a particular video. Pre-fetched videos with a higher hit ratio are retained in the CDN, while those with a lower hit ratio are replaced by pre-fetched content estimated by the learning algorithm. After a preset time frame, if a video gets less than 80% of its predicted views at that instant, the video is allotted a lower priority and eventually replaced by another video with a higher estimated popularity score (Fig. 4).
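A minimal sketch of how such a score-gated policy could be realized is given below; the priority tuple, eviction rule, and 80% demotion threshold follow the description above, while all names and the demotion factor are illustrative assumptions, not the paper's implementation.

```python
from collections import OrderedDict

class ScoreGatedLRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = OrderedDict()   # video_id -> (score, hits)

    def prefetch(self, video_id, predicted_score):
        """Pre-emptively cache a video ranked by its predicted popularity."""
        if len(self.cache) >= self.capacity:
            # evict the entry with the lowest (score, hits) priority
            victim = min(self.cache, key=lambda v: self.cache[v])
            del self.cache[victim]
        self.cache[video_id] = (predicted_score, 0)

    def request(self, video_id):
        """Record a user request; returns True on a cache hit."""
        if video_id in self.cache:
            score, hits = self.cache.pop(video_id)
            self.cache[video_id] = (score, hits + 1)   # LRU refresh
            return True
        return False

    def demote_underperformers(self, predicted_views):
        """After a preset time frame, lower the priority of videos that
        received fewer than 80% of their predicted views."""
        for vid, (score, hits) in list(self.cache.items()):
            if hits < 0.8 * predicted_views.get(vid, 0):
                self.cache[vid] = (score * 0.5, hits)   # assumed demotion
```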
Fig. 4 Content caching methodology (content request → popularity predictor → score → apply caching policy → pre-emptive cache → hit/miss)
Table 1 Error metrics for the different models

| Metric | Linear regression | Poisson regression | Decision tree | Boosted decision tree | Neural network |
|---|---|---|---|---|---|
| Mean absolute error | 0.507864 | 0.202652 | 0.465561 | 0.451615 | 0.475683 |
| Root mean-squared error | 0.7072567 | 0.689224 | 0.612788 | 0.583466 | 0.616534 |
| Relative absolute error | 0.437201 | 0.435296 | 0.400783 | 0.388778 | 0.408233 |
| Relative squared error | 0.219402 | 0.21147 | 0.166911 | 0.15132 | 0.168054 |
| Coefficient of determination | 0.780596 | 0.78853 | 0.833089 | 0.84868 | 0.831946 |
4 Results and Analysis

We used five models to predict the popularity of a video: linear regression, Poisson regression, decision tree, boosted decision tree, and neural network regression. We use the coefficient of determination to measure the accuracy [18] of each model (Table 1). Figure 5 shows the accuracy (coefficient of determination) of the different models; the highest accuracy is achieved with the boosted decision tree. The boosted decision tree is the most accurate model and also has a low training time (Fig. 6). The only other model that trains within 5 s is linear regression, but at the cost of accuracy, which is the lowest of all models at 78.05%. Therefore, the boosted decision tree is the preferable choice for popularity prediction (Table 2).
5 Conclusion

Out of the different learning models used, boosted decision tree regression showed the highest accuracy, with an R2 score of 84.86%. Combining Azure ML Studio [19] and local Python resources, we have used this algorithm to predict popularity scores, which are transformed into numbers of views using an antilog transform.
Fig. 5 Accuracy results after training the five types of regression models
Fig. 6 Graphical representation of training times for different learning models (lower is better)
Unlike metadata-based predictions that rely on complex topologies with interrelated learning techniques, the proposed model relies on changes in engagement rates, capturing real-time trends and thereby eliminating the need for a constant live data feed or image processing, since predictions can be performed on statistical snapshots taken at regular intervals. The proposed prediction methodology can be seamlessly integrated with existing caching policies, which makes the approach versatile. Pre-fetching can be further optimized using a combination of pre-emptive analysis and a demand-based priority system.
Table 2 Accuracy metrics along with training time for the different regression algorithms

| Model | Accuracy (R2 score) | Training time (in s) |
|---|---|---|
| Linear regression | 78.05 | 3 |
| Poisson regression | 78.88 | 8 |
| Decision forest | 83.30 | 9 |
| Boosted decision tree | 84.86 | 5 |
| Neural network | 83.22 | 18 |
When combined with the score-gated LRU caching scheme demonstrated in [8], this technique achieves a higher hit ratio, resulting in more efficient and faster content delivery through advanced prefetching, and it handles dynamic requests from the predictive model without a higher cache memory footprint. The SG-LRU policy and the learning model can be further customized as per network requests and scaling demand.
References

1. S.M.S. Tanzil, W. Hoiles, V. Krishnamurthy, Adaptive scheme for caching YouTube content in a cellular network: a machine learning approach. IEEE Access 5 (Apr 2017). https://doi.org/10.1109/ACCESS.2017.2678990
2. A. Masood, T.V. Nguyen, S. Cho, Deep regression model for videos popularity prediction in mobile edge caching networks, in 2021 International Conference on Information Networking (Feb 2021). https://doi.org/10.1109/ICOIN50884.2021.9333920
3. R. Viola, A. Martin, J. Morgade, S. Masneri, M.Z.P. Angueira, J. Montalbán, Predictive CDN selection for video delivery based on LSTM network performance forecasts and cost-effective trade-offs. IEEE Trans. Broadcast. (Nov 2020). https://doi.org/10.1109/TBC.2020.3031724
4. J. Chorowski, J. Wang, J.M. Zurada, Review and performance comparison of SVM- and ELM-based classifiers. Sci. Dir. Neurocomput. 128, 507–516 (Mar 2014)
5. W. Ding, Y. Shang, L. Guo, X. Hu, R. Yan, T. He, Video popularity prediction by sentiment propagation via implicit network, in ACM International Conference on Information and Knowledge Management (Oct 2015), pp. 1621–1630
6. H. Zhu, Y. Cao, W. Wang, T. Jiang, S. Jin, Deep reinforcement learning for mobile edge caching: review, new features, and open issues. IEEE Netw. 32(6) (2018). https://doi.org/10.1109/MNET.2018.1800109
7. A. Bielski, T. Trzcinski, Understanding multimodal popularity prediction of social media videos with self-attention. IEEE Access 6 (Dec 2018). https://doi.org/10.1109/ACCESS.2018.2884831
8. G. Hasslinger, K. Ntougias, F. Hasslinger, O. Hohlfeld, Performance evaluation for new web caching strategies combining LRU with score based object selection. Science Direct (12 Apr 2017)
9. R.K. Gupta, R. Hada, S. Sudhir, 2-Tiered cloud based content delivery network architecture: an efficient load balancing approach for video streaming. Int. Conf. Signal Proc. Commun. (ICSPC) 2017, 431–435 (2017). https://doi.org/10.1109/CSPC.2017.8305885
10. R.K. Gupta, V.K. Verma, A. Mundra, R. Kapoor, S. Mishra, Improving recommendation for video content using hyperparameter tuning in sparse data environment, in ed. by P. Nanda, V.K. Verma, S. Srivastava, R.K. Gupta, A.P. Mazumdar, Data Engineering for Smart Systems. Lecture Notes in Networks and Systems, vol. 238 (Springer, Singapore, 2022). https://doi.org/10.1007/978-981-16-2641-8_38
11. Trending YouTube video statistics dataset. https://www.kaggle.com/datasnaek/youtube-new?select=USvideos.csv. Accessed 18 Feb 2021, 19:36 IST
12. P.C. Consul, F. Famoye, Generalized Poisson regression model. Commun. Stat.-Theory Methods, 89–109 (Jul 2007). https://doi.org/10.1080/03610929208830766
13. O.O. Aalen, A linear regression model for the analysis of life times. Stat. Med. (first published Aug 1989; online issue Oct 2006). https://doi.org/10.1002/sim.4780080803
14. W. Tong, H. Hong, H. Fang, Q. Xie, R. Perkins, Decision forest: combining the predictions of multiple independent decision tree models. Am. Chem. Soc. 525–531 (Feb 2003). https://doi.org/10.1021/ci020058s
15. A. Poyarkov, A. Drutsa, A. Khalyavin, G. Gusev, Boosted decision tree regression adjustment for variance reduction in online controlled experiments, in 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (Aug 2016), pp. 235–244. https://doi.org/10.1145/2939672.2939688
16. S.C. Wang, Artificial neural network, in Interdisciplinary Computing in Java Programming (2003)
17. T. Jayalakshmi, A. Santhakumaran, Statistical normalization and back propagation for classification. Int. J. Comput. Theor. Eng. 3(1), 1793–8201 (2011)
18. D.M. Allen, Mean square error of prediction as a criterion for selecting variables. Technometrics 13, 469–475 (Apr 2012). https://doi.org/10.1080/00401706.1971.10488811
19. Microsoft Azure Machine Learning Studio. https://studio.azureml.net/. Platform used for creating learning models and visualizing data
Information Dissemination Strategies for Safety Applications in VANET: A Review Mehul Vala and Vishal Vora
Abstract The intelligent transportation system (ITS) aims to improve the performance of transportation systems. Vehicular ad hoc networks (VANETs) are the potential mechanism by which ITS can realize its goal. In a VANET, moving vehicles form ad hoc networks through wireless connections to exchange critical information, a process known as information dissemination. Safety-related information dissemination is multicast or broadcast communication, and it must be fast and reliable. This requirement draws researchers' focus to developing efficient dissemination schemes. This review paper discusses safety-related message dissemination strategies, along with a comprehensive classification, challenges, and future research directions.

Keywords VANET · Ad hoc network · Broadcasting · Multi-hop · Data dissemination
1 Introduction

The ultimate goal of the intelligent transportation system (ITS) is to improve the performance of transportation systems [8]. The vehicular ad hoc network (VANET) is the key enabling technology by which ITS can realize its goal. Next-generation vehicles are intelligent in the sense that they are equipped with processing and communication technologies. VANET supports the idea of communication among moving vehicles [12]: moving vehicles form ad hoc networks through wireless connections to exchange critical information. Such information exchange is called information dissemination. Standards have been developed to govern this kind of communication, known as wireless access in vehicular environments (WAVE). The WAVE standards are a combination of the dedicated short-range communication (DSRC) and IEEE 1609 standards [13].
Fig. 1 DSRC protocol stack (application layer: safety applications, traffic management and other applications; message sublayer: SAE J2735; network and transport layer: IEEE 1609.3 (WSMP), TCP/UDP, IPv6; security: IEEE 1609.2; LLC sublayer: IEEE 802.2; MAC sublayer extension: IEEE 1609.4; MAC and PHY layers: IEEE 802.11p)
Figure 1 shows the DSRC protocol stack. Wireless connectivity can be categorized as vehicle-to-vehicle (V2V) or vehicle-to-infrastructure (V2I), depending on whether the connection is between two moving vehicles or between a vehicle and a stationary node [7]. The DSRC standards support both V2V and V2I communications with ranges up to 1000 m and data rates from 3 to 27 Mb/s over a bandwidth of 10 MHz [2]. Though DSRC supports V2I communication, the installation of roadside infrastructure is a costly affair, so to make the technology practically viable, infrastructure-less pure ad hoc communication is preferred among researchers [18, 35]. The practical transmission range is less than 1 km, and in certain situations safety messages need to be sent over longer distances. In such situations, multi-hop broadcasting is crucial; hence, it is drawing the attention of many researchers toward developing efficient and reliable dissemination schemes that reach beyond the transmission range of the sender [24]. The structure of the paper is as follows: Sect. 2 provides an overview of VANET technology, Sect. 3 describes the classification of data dissemination strategies, Sect. 4 covers safety data dissemination methods, Sect. 5 presents the discussion and future scope, and Sect. 6 concludes the paper.
2 VANET Overview

The vehicular ad hoc network (VANET) is a special case of the mobile ad hoc network (MANET), in which every moving vehicle forms wireless connections with other moving vehicles for information sharing purposes [13].
2.1 VANET Architecture

The devices that constitute the VANET architecture are defined as follows:

– On Board Unit (OBU): equipment installed in vehicles. It establishes wireless connections with other OBUs and RSUs while on the move.
– Road Side Unit (RSU): RSUs are installed at regular intervals along roads and constitute the infrastructure in a VANET. Technically, RSUs are similar to OBUs but stationary in nature; they establish wireless connections with moving vehicles and may work as a bridge to an Internet connection. The maximum DSRC range is 1 km, so to realize a fully connected network, RSUs need to be placed at every kilometre, which raises the cost.

Smart vehicles are equipped with many sensors and processing devices, which can collect and process crucial information. Through V2V and V2I communication, as shown in Fig. 2, they can share it with other vehicles. For example, a vehicle can share its location, speed, and direction with other vehicles to realize cooperative safety applications [10].
Fig. 2 VANET architecture [4]
Fig. 3 VANET applications
2.2 VANET Standard

The DSRC standards are designed for short- to medium-range communication, and their aim is to offer the least delay and a high data rate in VANETs. The US Federal Communications Commission (FCC) has allocated 75 MHz of spectrum at 5.9 GHz (5.85–5.925 GHz) for V2V and V2I communication [14]. The DSRC standards are composed of two standards, IEEE 802.11p and IEEE 1609: IEEE 802.11p governs the operation of the medium access control (MAC) and physical (PHY) layers, while IEEE 1609 governs the higher-layer functions for vehicular communication [36].
2.3 VANET Applications

Overall, VANET applications can be broadly classified into three categories, as shown in Fig. 3 [2, 6, 26].

Active safety applications The main aim of active safety applications is to reduce life-threatening accidents by providing warnings to drivers so as to avoid collisions. Information like vehicle positions, speeds, and braking events can be shared with other vehicles; by processing the collective information, vehicles can locate hazards. A few representative active safety applications are shown in Fig. 4.

Traffic management applications This category of applications attempts to reduce road congestion, increase fuel efficiency, and support cooperative navigation. Example applications are speed limit warning, optimal speed for green light, cruise control, and platooning.

Infotainment applications This class of applications covers local as well as global services offered to drivers, for example, the nearest fueling station and Internet access.

Three different classes of VANET applications are presented above. The goal of VANET is to provide all three classes of services with their respective QoS requirements.
Fig. 4 Active safety applications
Active safety applications are time-sensitive, and speedy propagation through the network is crucial. The information may need to propagate beyond the transmission range of the sending vehicle, which is where multi-hop information dissemination strategies need to be implemented [17].
3 Message Dissemination Strategies

In VANET safety-related applications, the shared data is usually important to a group of nodes. Due to the highly dynamic topology and short wireless link lifetimes, traditional routing strategies are ill-suited to VANET applications; hence, most research work explores broadcasting-based data dissemination strategies. These strategies can be classified into two broad categories, single-hop broadcast and multi-hop broadcast [24], which differ in the way information disseminates through the network.
3.1 Single-hop Broadcast

In this method, the sender shares information with its immediate neighbour vehicles. Receiving vehicles keep this information for their own use, and periodically some of it is broadcast to their single-hop neighbours. Many safety-related applications are implemented through single-hop broadcast, for example, braking event warning, blind spot warning, and lane change warning. Based on the frequency of broadcast, single-hop strategies can be divided into fixed broadcast and adaptive broadcast [21].

Fixed broadcast In fixed broadcast, a vehicle periodically broadcasts crucial information to its immediate neighbours. The vehicles that receive this information update their databases with it, and at some fixed interval they also share information with their own neighbours. By cooperatively sharing information with single-hop neighbours, they ultimately enhance transport safety. Since the broadcast interval is fixed, the key design interest lies in information selection and information aggregation. The fixed interval must be chosen optimally: it should neither promote congestion in the network nor create a scarcity of data [23].

Adaptive broadcast In adaptive broadcasting, the broadcast interval is selected based on need, as illustrated in the sketch below; for example, if congestion is detected in the network, the broadcast rate is reduced. Single-hop broadcast schemes utilize a store-and-forward strategy to convey information. Hence, they are best suited to applications where information needs to be shared over short distances and the timing criteria are not very strict [29].
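A minimal sketch of such an interval adaptation rule follows; the thresholds and doubling/halving policy are illustrative assumptions, not the specification of any particular protocol.

```python
MIN_INTERVAL = 0.1   # seconds, shortest allowed beacon interval (assumed)
MAX_INTERVAL = 1.0   # seconds, longest allowed beacon interval (assumed)

def next_beacon_interval(current, channel_busy_ratio):
    """Grow the broadcast interval under congestion, shrink it when idle."""
    if channel_busy_ratio > 0.6:            # congested channel
        return min(current * 2.0, MAX_INTERVAL)
    if channel_busy_ratio < 0.3:            # idle channel
        return max(current / 2.0, MIN_INTERVAL)
    return current
```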
3.2 Multi-hop Broadcast

In the DSRC standard, the proposed transmission range is 1 km, but experimental results show that the practical range is not more than 300 m. In such cases, multi-hop message forwarding schemes are needed to propagate safety-related messages over longer distances. In an ad hoc network there is no central coordination, so establishing multi-hop message dissemination in a VANET is a challenging task, and the severity of the problem increases in the extremely dense and sparse networks that are typical of vehicular communication [34]. The broadcast mechanism in its native sense is simple flooding: the sender broadcasts the data to all single-hop neighbours, and in multi-hop broadcasting this data is further propagated to the receivers' neighbours, and so on. In simple flooding, many vehicles broadcast the same packets and waste bandwidth; in a dense network, such flooding easily creates congestion, sometimes referred to as the broadcast storm problem. Plain flooding leads to the following problems in information dissemination [27]:
– Excessive redundant data
– Channel contention
– Large packet drops
– Delay in message delivery.
A summary comparison of the single-hop and multi-hop broadcast techniques is shown in Table 1.
4 Safety Message Dissemination Methods

As discussed above, plain broadcasting is very inefficient and leads to the broadcast storm problem. To alleviate this, methods of selective broadcasting are practiced: upon reception of a packet at one-hop distance, one or a few of the receiving nodes are selected as relay candidates to further broadcast the packet, while the other nodes keep the data for their own use. The popular relay node selection strategies in the literature are distance-dependent, link quality-based, probability-based, counter-based, cluster-based, network coding-based, neighbour knowledge-based, and hybrid strategies, as shown in Fig. 5.

Distance-dependent: Based on the distance between sender and receiver, the farthest node is selected to relay the message. By selecting the farthest node for relaying, the largest area can be covered with the minimum hop count.

Link quality-based: Realistic channel conditions are considered for next-hop selection. The next broadcasting node is selected based on the received RSSI value or other channel conditions.
Table 1 Comparative analysis of single-hop and multi-hop broadcast schemes

| | Single-hop broadcast | Multi-hop broadcast |
|---|---|---|
| Characteristics | Message exchange between immediate neighbours only [24]; message exchange rate can be fixed or adaptive as per design | Message exchange beyond 1-hop distance [17]; can cover a large area through multi-hop message propagation; broadcast storm problem |
| Advantage | Delay-tolerant scheme; less redundancy [2]; avoids broadcast storm problem | Long-distance propagation of critical messages [17] |
| Application | Cooperative awareness applications like blind spot alert, lane change and collision warning [26] | Emergency applications such as post-crash alert, road condition alert, etc. [13] |
Fig. 5 Complete broadcast classification
Probability-based: Among the nodes available for relaying, a different probability is assigned to every node. The node with the highest probability broadcasts the message, and the other nodes discard their scheduled broadcasts when they hear the relay node's broadcast. The probability assignment depends on parameters such as distance, vehicle density, direction, and speed.

Counter-based: In the counter-based scheme, whenever a node receives a broadcast packet, it first sets a random wait time before relaying it further. During the wait time, it counts the number of retransmissions of the same packet. If the total is less than a predetermined threshold, the node rebroadcasts the packet; otherwise it discards the scheduled broadcast. A sketch of this scheme is given after these definitions.

Cluster-based: In this method, a group or cluster is formed among neighbouring vehicles having common features, which include, but are not limited to, relative velocity, acceleration, position, direction, vehicle density, and transmission range. A cluster head (CH) is selected from among the cluster members (CMs); on behalf of all cluster members, only the cluster head broadcasts the message toward other clusters.

Neighbour knowledge-based: In this method, vehicles exchange key information such as position, direction, and speed. By processing this information, every vehicle forms knowledge about the condition of its surrounding network and chooses the optimum node as the relay candidate.

Network coding-based: In this method, transmitted data is encoded and decoded to enhance network throughput. Relay nodes combine several received packets before transmitting, with the aim of reducing the net number of transmissions compared to broadcasting without network coding.

Hybrid: To improve performance and alleviate the limitations of the above methods, researchers sometimes combine more than one method in the relay node selection process. All such methods belong to the hybrid category.
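The counter-based suppression logic above can be sketched as follows; the threshold, wait-time bound, and the radio interface are illustrative assumptions rather than parameters of any specific protocol.

```python
import random

COUNTER_THRESHOLD = 3   # max duplicate receptions tolerated (assumed)
MAX_WAIT = 0.05         # upper bound of the random wait time, in seconds

class CounterBasedNode:
    def __init__(self, radio):
        self.radio = radio          # assumed object with a send(pid) method
        self.counters = {}          # packet id -> receptions heard so far

    def on_receive(self, packet_id):
        """Returns a wait time on first reception; duplicates just count."""
        first_time = packet_id not in self.counters
        self.counters[packet_id] = self.counters.get(packet_id, 0) + 1
        if first_time:
            return random.uniform(0.0, MAX_WAIT)
        return None

    def on_timer_expired(self, packet_id):
        # rebroadcast only if few neighbours have already relayed the packet
        if self.counters[packet_id] < COUNTER_THRESHOLD:
            self.radio.send(packet_id)
```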
4.1 Beacon-Assisted Versus Beacon-Less

All the methods used for relay node selection can be either beacon-assisted or beacon-less. Beacon-assisted methods require the periodic exchange of hello messages, while beacon-less methods have no such requirement [25]. The periodic exchange of beacons increases overhead, but at the same time it improves performance. Bandwidth is a very precious resource in VANETs, so beacon-less methods can be utilized to reduce bandwidth wastage [11]. The following section reviews and classifies papers based on beacon-less and beacon-assisted data dissemination strategies.

Beacon-Assisted Protocols DV-CAST exchanges periodic messages with one-hop neighbours and builds local topology knowledge, and it stands robust against diverse traffic conditions. Each node continuously checks the local topology to find a node in the same or opposite direction to broadcast to. It applies a store-carry-forward mechanism when no node is available in a sparse network; otherwise, it applies rebroadcast suppression and efficiently forwards the packet. A weighted p-persistent suppression scheme is used to reduce the broadcast storm problem. Beacons carrying direction and position information must be exchanged continuously, and in diverse dense and sparse scenarios the optimum frequency of these beacon messages is crucial in deciding the performance of the proposed work [31].

Inter-vehicle geocast (IVG) shares position, direction, and acceleration information to calculate the area of interest, from which it selects the best forwarding nodes. A timer-based approach is used for broadcasting: whenever a message is received for the first time, the node waits for a specific time, and upon expiration of the timer it retransmits the message. This timer-based next-forwarder selection scheme reduces redundant transmissions [3].

In the distributed optimized time (DOT)-based approach, beacon-assisted timeslot density control is provided, addressing the scalability issue for dense traffic by reducing the density of vehicles in each time slot. One-hop neighbourhood information is exchanged through beacons to select the farthest vehicle for rebroadcast [28].

MOZO is a clustering-based protocol [19] in which vehicles collaborate through hello messages to form dynamic moving zones. A moving zone consists of vehicles with similar moving patterns connected by one-hop links. The captain vehicle maintains a combined location and velocity tree (CLV-tree) to estimate the positions of vehicles in the cluster, and whenever a vehicle leaves the cluster, this is recorded in the leaving event queue (LE). Although less data needs to be exchanged compared to position sharing, it still needs neighbour information to perform data dissemination.

The major problems beacon-assisted protocols face are frequent contention and broadcast storms. AddP adjusts the retransmission rate based on node density to reduce the broadcast storm problem. In addition, AddP selects the most suitable candidate to relay the packet based on local density and distance. To alleviate the hidden node problem, it proposes a transmitted-packet monitoring mechanism to
confirm whether the relay node has transmitted the message or not. A network coding-based data aggregation mechanism is utilized to reduce the number of duplicate packets propagating in the network [22].

Zhang et al. [37] proposed an adaptive link quality-based safety message (ALQSM) forwarding scheme for vehicular networks, in which a physical channel connectivity checking method is introduced. Based on the calculated connectivity probability among vehicles, different scores are assigned to potential forwarders, and a score-oriented priority method selects the optimal forwarder. This method aims to reduce contention among vehicles during broadcasting.

The data dissemination scheme presented in [20] is based on clustering and probabilistic broadcasting (CPB). A clustering algorithm forms clusters of vehicles moving closely in the same direction, which allows vehicles to exchange received messages with the cluster head. During this phase, probabilistic forwarding is used, where the probability is calculated based on how many times the message is received during a defined interval. Only the cluster head forwards the received message toward its transmission direction.

The enhanced counter-based broadcast protocol in urban VANETs (ECUV) improves data dissemination for urban VANETs using a road topology-based approach to select the best relay nodes and improve coverage in urban vehicle-to-vehicle (V2V) scenarios. This protocol avoids the broadcast storm problem by reducing the transmission probability at high vehicle density, while increasing coverage in low-density scenarios [15].

Beacon-Less Protocols SEAD utilizes a beacon-less method to estimate node density and, based on the estimated density, dynamically defines the probability of rebroadcast. The redundancy ratio R is computed at each node to estimate the node density as per the following equation [1]:
R = Total received messages (original + duplicated) / Total new messages (original)
This locally measured metric yields a beacon-less and adaptive dissemination scheme that helps in reducing the broadcast storm problem. The distance between sending and receiving nodes is used to compute the wait time, while the node density is used to compute the retransmission probability; in this sense, it is a hybrid beacon-less protocol.

The range-based relay node selection (RBRS) protocol describes an emergency warning dissemination protocol. The receiving node refrains from immediate broadcasting and waits for a random time before retransmission, with the wait time inversely proportional to the distance between the sending and receiving vehicles. In this way, the chosen relay vehicle is the farthest vehicle from the sender. When no boundary vehicles are available, the chosen relay vehicle waits unnecessarily long, and the covered area is smaller due to its short distance from the sender. The scheme helps in reducing the broadcast storm problem by discarding a scheduled transmission whenever the node hears the same message transmitted by another relay node [16].
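The two locally computed quantities just described can be sketched in a few lines; the constants are illustrative assumptions, and only the inverse-distance relation and the ratio formula come from the text.

```python
MAX_WAIT = 0.1      # seconds, wait time of a receiver at distance 0 (assumed)
TX_RANGE = 300.0    # metres, practical transmission range from the text

def rbrs_wait_time(distance_to_sender):
    """Farther receivers wait less, so the farthest node relays first."""
    d = min(distance_to_sender, TX_RANGE)
    return MAX_WAIT * (1.0 - d / TX_RANGE)

def sead_redundancy_ratio(total_received, new_messages):
    """R = total received (original + duplicated) / total new (original)."""
    return total_received / new_messages if new_messages else float("inf")
```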
The SAB protocols estimate traffic conditions through speed observation, exploiting the negative correlation between speed and density. Three versions of the speed adaptive broadcast (SAB) protocol are provided, namely Probabilistic-SAB, Slotted-SAB, and Grid-SAB, of which Grid-SAB provides the lowest packet redundancy. Without extra beacon overhead, this work addresses the issues of scalability and reliability [5].

In [30], the authors present a novel way to use bandwidth optimally by reducing the large number of data packets, thus reducing bandwidth wastage. A fuzzy-based beacon-less probabilistic broadcasting algorithm (FBBPA) is proposed, in which the broadcasting probability is calculated by considering distance, direction, angular orientation, and buffer load. The packet with the highest probability in the buffer is transmitted first.

DRIVE aims to mitigate the broadcast storm problem and network partitions by disseminating data within an area of interest (AoI). It does not require vehicles to maintain a neighbour table; instead, it uses a sweet spot to alleviate the broadcast storm problem and increase the coverage range. A vehicle located within the sweet spot is more likely to disseminate data and enhances coverage compared to distance-based broadcasting [32].

In [33], Wang designed a distributed relay selection method that considers the locations, channel quality, velocities, and message reception statuses of vehicles to improve performance in highly mobile vehicular ad hoc networks. Instantly decodable network coding is used by the next relay vehicle to retransmit packets, resulting in significant improvements in both network throughput and transmission delay. Simulation results show that the proposed strategy effectively reduces the delay of data dissemination in highway scenarios.

In [9], a beacon-less traffic-aware geographical routing protocol (BTA-GRP) is proposed, which tries to eliminate mobility-induced unreliability in VANETs. BTA-GRP is an improved geographic routing strategy adapted to high mobility and link disconnection issues. It considers traffic density, distance, and direction when choosing the next broadcast node, and it is suitable for dense as well as sparse traffic.

Table 2 summarizes all reviewed papers based on the methods used, the objective of the research work, and the evaluation scenario.
5 Discussion and Future Scope

The message dissemination process depends heavily on the type of traffic, the type of application, and its QoS requirements. The forwarding strategy may be single-hop or multi-hop, depending on the distance between sender and receiver as well as the performance criteria. The selected scheme needs to ensure that all neighbour nodes receive crucial information through broadcast without network congestion or excessive delay, and with good efficiency. Single-hop communication can provide acceptable throughput, but the data delivery time is large due to the store-and-forward nature of the communication; hence, it is suitable for delay-tolerant applications, while performing poorly in delay-sensitive applications.
Table 2 Information dissemination approaches

| Protocol | Strategy | Forwarding method | Objective | Scenario | Simulator |
|---|---|---|---|---|---|
| IVG [3] | Beacon-assisted | Distance-based | Broadcast storm | Highway | GloMoSim |
| DV-CAST [31] | Beacon-assisted | Neighbour knowledge | Broadcast storm, disconnected network | Highway and urban | NS-2 |
| AddP [22] | Beacon-assisted | Density and distance-based | Broadcast storm, hidden node | Highway and urban | OMNeT++ |
| DOT [28] | Beacon-assisted | Location-based | Redundancy reduction | Highway | – |
| ALQSM [37] | Beacon-assisted | Link quality-based | Redundancy reduction | Urban | OMNeT++ |
| CPB [20] | Beacon-assisted | Clustering and probability-based | Delay reduction, improve coverage | Highway | NS-2 |
| MoZo [19] | Beacon-assisted | Cluster-based | Broadcast storm | Highway and urban | NS-2 |
| ECUV [15] | Beacon-assisted | Counter-based | Broadcast storm | Highway and urban | – |
| RBRS [16] | Beacon-less | Distance-based | Delay reduction | Highway | NS-2 |
| FBBPA [30] | Beacon-less | Fuzzy-based | Delay reduction | Highway and urban | NS-3 |
| SEAD [1] | Beacon-less | Probability-based | Broadcast storm | Highway | OMNeT++ |
| SAB [5] | Beacon-less | Density-based | Scalability, redundancy reduction | Highway and urban | OMNeT++ |
| DRIVE [32] | Beacon-less | Location-based | Overhead reduction | Highway and urban | – |
| NCRS-NC [33] | Beacon-less | Network coding-based | Delay reduction | Highway | NS-2 |
| BTA-GRP [9] | Beacon-less | Position-based | Delay reduction, disconnection issue | Highway and urban | – |
Due to the limitations of single-hop communication, considerable research activity is directed toward multi-hop data dissemination schemes. A good multi-hop dissemination strategy elects only a subset of neighbour nodes to rebroadcast the message; the redundancy rate and congestion in the network depend on the elected dissemination scheme.
6 Conclusion

This paper provides a review of VANET technology, with a discussion of the VANET architecture, the VANET protocol stack, and applications, and highlights the importance of VANETs in establishing ITS applications. Broadcasting is the basic mechanism for information dissemination in vehicular networks; due to high mobility and the absence of centralized coordination, the task of message dissemination becomes very challenging. Safety-related applications are the most important of all and need special consideration, so a comprehensive classification of safety message dissemination is provided. The choice between beacon-less and beacon-assisted strategies is a trade-off between reliability and bandwidth saturation; to utilize the available bandwidth efficiently, beacon-less schemes are suitable.
References 1. I. Achour, T. Bejaoui, A. Busson, S. Tabbane, Sead: a simple and efficient adaptive data dissemination protocol in vehicular ad-hoc networks. Wirel. Netw. 22(5), 1673–1683 (2016) 2. S. Al-Sultan, M.M. Al-Doori, A.H. Al-Bayatti, H. Zedan, A comprehensive survey on vehicular ad hoc network. J. Netw. Comput. Appl. 37, 380–392 (2014) 3. A. Bachir, A. Benslimane, A multicast protocol in ad hoc networks inter-vehicle geocast, in The 57th IEEE Semiannual Vehicular Technology Conference, 2003. VTC 2003-Spring, vol. 4, pp. 2456–2460. IEEE (2003) 4. R. Chandren Muniyandi, M.K. Hasan, M.R. Hammoodi, A. Maroosi, An improved harmony search algorithm for proactive routing protocol in vanet. J. Adv. Transp. 2021 (2021) 5. M. Chaqfeh, A. Lakas, A novel approach for scalable multi-hop data dissemination in vehicular ad hoc networks. Ad Hoc Netw. 37, 228–239 (2016) 6. F.D. Da Cunha, A. Boukerche, L. Villas, A.C. Viana, A.A. Loureiro, Data Communication in VANETs: A Survey, Challenges and Applications. Ph.D. thesis, INRIA Saclay, INRIA (2014) 7. K.C. Dey, A. Rayamajhi, M. Chowdhury, P. Bhavsar, J. Martin, Vehicle-to-vehicle (v2v) and vehicle-to-infrastructure (v2i) communication in a heterogeneous wireless networkperformance evaluation. Transp. Res. Part C: Emerg. Technol. 68, 168–184 (2016) 8. G. Dimitrakopoulos, P. Demestichas, Intelligent transportation systems. IEEE Veh. Technol. Mag. 5(1), 77–84 (2010) 9. S. Din, K.N. Qureshi, M.S. Afsar, J.J. Rodrigues, A. Ahmad, G.S. Choi, Beaconless trafficaware geographical routing protocol for intelligent transportation system. IEEE Access 8, 187671–187686 (2020)
10. Y.P. Fallah, C.L. Huang, R. Sengupta, H. Krishnan, Analysis of information dissemination in vehicular ad-hoc networks with application to cooperative vehicle safety systems. IEEE Trans. Veh. Technol. 60(1), 233–247 (2010) 11. R. Fracchia, M. Meo, D. Rossi, Vanets: to beacon or not to beacon?. in IEEE Globecom Workshop on Automotive Networking and Applications (AutoNet 2006) (2006) 12. H. Hartenstein, L. Laberteaux, A tutorial survey on vehicular ad hoc networks. IEEE Commun. Mag. 46(6), 164–171 (2008) 13. G. Karagiannis, O. Altintas, E. Ekici, G. Heijenk, B. Jarupan, K. Lin, T. Weil, Vehicular networking: a survey and tutorial on requirements, architectures, challenges, standards and solutions. IEEE Commun. Surv. Tutorials 13(4), 584–616 (2011) 14. J.B. Kenney, Dedicated short-range communications (dsrc) standards in the united states. Proc. IEEE 99(7), 1162–1182 (2011) 15. L. Khamer, N. Labraoui, A.M. Gueroui, S. Zaidi, A.A.A. Ari, Road network layout based multihop broadcast protocols for urban vehicular ad-hoc networks. Wirel. Netw. 27(2), 1369–1388 (2021) 16. T.H. Kim, W.K. Hong, H.C. Kim, Y.D. Lee, An effective data dissemination in vehicular adhoc network, in International Conference on Information Networking, pp. 295–304. Springer (2007) 17. S. Latif, S. Mahfooz, B. Jan, N. Ahmad, Y. Cao, M. Asif, A comparative study of scenariodriven multi-hop broadcast protocols for vanets. Veh. Commun. 12, 88–109 (2018) 18. W. Liang, Z. Li, H. Zhang, S. Wang, R. Bie, Vehicular ad hoc networks: architectures, research issues, methodologies, challenges, and trends. Int. J. Distrib. Sens. Netw. 11(8), 745303 (2015) 19. D. Lin, J. Kang, A. Squicciarini, Y. Wu, S. Gurung, O. Tonguz, Mozo: a moving zone based routing protocol using pure v2v communication in vanets. IEEE Trans. Mob. Comput. 16(5), 1357–1370 (2016) 20. L. Liu, C. Chen, T. Qiu, M. Zhang, S. Li, B. Zhou, A data dissemination scheme based on clustering and probabilistic broadcasting in vanets. Veh. Commun. 13, 78–88 (2018) 21. M. Naderi, F. Zargari, M. Ghanbari, Adaptive beacon broadcast in opportunistic routing for vanets. Ad Hoc Netw. 86, 119–130 (2019) 22. R. Oliveira, C. Montez, A. Boukerche, M.S. Wangham, Reliable data dissemination protocol for vanet traffic safety applications. Ad Hoc Netw. 63, 30–44 (2017) 23. B. Pan, H. Wu, J. Wang, Fl-asb: a fuzzy logic based adaptive-period single-hop broadcast protocol. Int. J. Distrib. Sens. Netw. 14(5), 1550147718778482 (2018) 24. S. Panichpapiboon, W. Pattara-Atikom, A review of information dissemination protocols for vehicular ad hoc networks. IEEE Commun. Surv. Tutorials 14(3), 784–798 (2011) 25. B. Paul, M. Ibrahim, M. Bikas, A. Naser, Vanet Routing Protocols: Pros and Cons. arXiv preprint arXiv:1204.1201 (2012) 26. A. Rasheed, S. Gillani, S. Ajmal, A. Qayyum, Vehicular ad hoc network (vanet): a survey, challenges, and applications, in Vehicular Ad-Hoc Networks for Smart Cities, pp. 39–51. Springer (2017) 27. T. Saeed, Y. Mylonas, A. Pitsillides, V. Papadopoulou, M. Lestas, Modeling probabilistic flooding in vanets for optimal rebroadcast probabilities. IEEE Trans. Intel. Transp. Syst. 20(2), 556–570 (2018) 28. R.S. Schwartz, K. Das, H. Scholten, P. Havinga, Exploiting beacons for scalable broadcast data dissemination in vanets, in Proceedings of the Ninth ACM International Workshop on Vehicular Inter-Networking, Systems, and applications, pp. 53–62 (2012) 29. C. Sommer, O.K. Tonguz, F. Dressler, Traffic information systems: efficient message dissemination via adaptive beaconing. 
IEEE Commun. Mag. 49(5), 173–179 (2011) 30. A. Srivastava, A. Prakash, R. Tripathi, Fuzzy-based beaconless probabilistic broadcasting for information dissemination in urban vanet. Ad Hoc Netw. 108, 102285 (2020) 31. O.K. Tonguz, N. Wisitpongphan, F. Bai, Dv-cast: a distributed vehicular broadcast protocol for vehicular ad hoc networks. IEEE Wirel. Commun. 17(2), 47–57 (2010) 32. L.A. Villas, A. Boukerche, G. Maia, R.W. Pazzi, A.A. Loureiro, Drive: an efficient and robust data dissemination protocol for highway and urban vehicular ad hoc networks. Comput. Netw. 75, 381–394 (2014)
33. S. Wang, J. Yin, Distributed relay selection with network coding for data dissemination in vehicular ad hoc networks. Int. J. Distrib. Sens. Netw. 13(5), 1550147717708135 (2017) 34. L. Wu, L. Nie, J. Fan, Y. He, Q. Liu, D. Wu, An efficient multi-hop broadcast protocol for emergency messages dissemination in vanets. Chinese J. Electron. 26(3), 614–623 (2017) 35. Z. Xu, X. Li, X. Zhao, M.H. Zhang, Z. Wang, Dsrc versus 4g-lte for connected vehicle applications: a study on field experiments of vehicular communication performance. J. Adv. Transp. 2017 (2017) 36. S. Zeadally, R. Hunt, Y.S. Chen, A. Irwin, A. Hassan, Vehicular ad hoc networks (vanets): status, results, and challenges. Telecommun. Syst. 50(4), 217–241 (2012) 37. X. Zhang, Q. Miao, Y. Li, An adaptive link quality-based safety message dissemination scheme for urban vanets. IEEE Commun. Lett. 22(10), 2104–2107 (2018)
Tech Stack Prediction Using Hybrid ARIMA and LSTM Model Radha SenthilKumar, V. Naveen, M. Sri Hari Balaji, and P. Aravinth
Abstract A tech stack is the set of tools developers use to make an application. It consists of the software applications, frameworks, and programming languages that realize particular aspects of the program. With the advent of the future tech world, student communities and the developing computer society are eager to lay their hands on new frameworks and tech stacks, and learning the right technology will increase their chances of industrial growth and career enhancement. It will also channel their productivity into the right choices for better outcomes. We therefore develop a hybrid model using the autoregressive integrated moving average (ARIMA) and long short-term memory (LSTM) to forecast the growth of popular tech stacks over the upcoming decade, making it feasible to identify technologies on a rising curve with high accuracy. The outcome of the prediction will further pave the way for exciting opportunities and paths for growth. The reliability of these predictions rests on data from popular search engines and developer communities such as Stack Overflow.

Keywords Time series forecasting · Tech stack analysis · Machine learning · Grid search method
1 Introduction

Tech stack prediction focuses on bridging the gap between college education and industrial demand in terms of developing students' skills. Learning the right thing will channel a student's career goals in the right manner, so we focus on forecasting the growth curves of developing tech stacks over the next decade. Thus, the prospects of a specific developing tech stack can be foreseen before investing effort in it. Many forecasting models have been built for use cases like weather forecasting and stock price prediction, but giving the student community a helping hand on what to learn will bring rapid change among students and lead to higher productivity.
Fig. 1 For perspective, the growth of the open source projects and technologies in the past decade
Figure 1 shows the growth of open source projects and technologies in the past decade. Currently, there is no existing system that tells the student community which developing technology to focus on so as to put their productivity on the right track for their careers. There are, however, many other prediction systems focused on stock marketing, gold rate prediction, and financial systems. Taking those approaches into consideration, we built a system that predicts the most popular technology to learn in a given domain with the help of search tags from popular search engines. Thus, students' productivity can be improved and industrial knowledge can be obtained through effective self-learning.
2 Related Work

A method to check whether data exhibits a trend and whether it is stationary is proposed in [1]; performance is estimated through in-sample and out-of-sample forecasting of the predicted values. The work in [2] describes observation and analysis to find trends and the use of Box–Jenkins forecasting to project trends for future years; trend levels over a period of time are categorized as high, low, medium, and no trend. Parameter estimation and checking whether data is stationary are indicated in [3]: since an ARIMA model only accepts stationary values, a conversion to a stationary process is carried out. In [4], LSTM is used to analyse data and forecast values for the population of a specific province; upon collecting past data, the
growth trend of the population is identified and forecast for the next set of years. An idea of modelling and forecasting demand in a food company using a time series approach is discussed in [5], in which historical demand data is utilized to forecast demand, and these forecasts feed predictions for the supply chain. Stock price prediction is illustrated in [6]; this work uses the autoregressive integrated moving average (ARIMA) model because of its wide acceptability, applying it to various previous data to predict stock prices accurately, and focuses on a good fit of the model for stocks from various sectors. Parameter selection for ARIMA using an iterative method, together with the performance metric used to determine the accuracy of the model, is discussed in [7]. In [8], an LSTM technique is applied over decomposed sub-components for model fitting, and short-term future values are forecast from the obtained model. For our model, the dataset collection proposed in [9] includes the popular search engines, Stack Overflow, and GitHub, from which data has been fetched based on the respective tags. Various methodologies to forecast data and to predict with high accuracy and the least error rate are discussed in [10]. The LSTM model uses a decomposable time series model with three major components, namely trend, seasonality, and holidays, and tries to fit linear and nonlinear functions as components [11, 12]. In [13], the proposed Bitcoin dataset is trained and deployed using LSTM machine learning to obtain more accurate in-sample and out-of-sample forecasts of the Bitcoin price. In [14], a combination of ensembles is carried out, and model performance is evaluated to determine the best-fitted model with respect to its parameters. The work in [15] illustrates that an upgraded version of the extreme learning machine (ELM) can obtain good accuracy while reducing the classification error.
3 Proposed Work

3.1 Tech Stack Architecture

In predicting tag counts, we consider only past data over the years, so no external factors contribute. Past values have an effect on future values (autoregression), but a sudden variation in the past data can affect the forecast to a large extent. To overcome the impact of short-term fluctuations, the moving average part of the ARIMA model contributes by analysing the data points with respect to averages over subsets of the data. For data recorded at regular intervals, measuring events that happen over a period of time, a hybrid model is a sophisticated choice. The entire architecture of the hybrid tech stack prediction model is shown in Fig. 2. It includes the processes for converting non-stationary data to stationary data, estimating the parameters of the model, fitting the model, and forecasting the predicted values.
Fig. 2 Tech stack architecture for hybrid ARIMA and LSTM model
3.2 Data Collection

The main resource for building our model lies in fetching past years' datasets of search tags from the popular developer site Stack Overflow, which ranks as a top developer community in the emerging programming world. Programmers frequently turn to Stack Overflow when learning new tech skills and tools. The tech domains the developer community concentrates on can be observed from the tags linked to every question and answer posted in the community. Tag counts give an idea of how many developers are currently posting questions or answers on a tech topic, which supports our goal of visualizing and forecasting the growth of a specific tech domain. These tags can thus give a precise overview of how many developers are currently working in a specific domain, yielding a dataset of how much the community engaged with each domain over past years; from this, we can define the popularity of a particular domain in upcoming years. So far, we have collected 146 rows and 30 columns of data for the initial level of analysis, covering the years 2009 to 2020, and we chose the four most used and popular tech domains.
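As a hedged illustration, yearly tag counts could be reproduced through the public Stack Exchange API as sketched below; the endpoint, filter, and date handling are our assumptions about one possible collection script, not the exact pipeline used to build the dataset.

```python
import datetime as dt
import requests

API = "https://api.stackexchange.com/2.3/questions"

def yearly_tag_count(tag, year):
    """Count Stack Overflow questions tagged `tag` created in `year`."""
    start = int(dt.datetime(year, 1, 1).timestamp())
    end = int(dt.datetime(year + 1, 1, 1).timestamp())
    params = {"site": "stackoverflow", "tagged": tag,
              "fromdate": start, "todate": end, "filter": "total"}
    return requests.get(API, params=params).json()["total"]

# assumed usage: one count per year from 2009 to 2020 for one tag
counts = {y: yearly_tag_count("python", y) for y in range(2009, 2021)}
```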
3.3 Model Selection

From the data obtained for past years, we need to forecast the growth of a specific technology for the next few years. We tried other models such as long short-term memory (LSTM), Holt–Winters, and ensembles, but we finally settled on the hybrid ARIMA and LSTM model because it has a lower error rate than the others and also yields greater accuracy. For this purpose, we take a hybrid combination of the autoregressive integrated moving average (ARIMA) and long short-term memory (LSTM) models.

ARIMA Model ARIMA models are a class of models that forecast a time series using the series' own past values. We first analyse the data by plotting charts to see its trend over the years. After that, testing the data for stationarity is a mandatory task before feeding the dataset to the ARIMA model. After making sure each attribute is stationary, we identify seasonality in the dependent series and extract knowledge from the autocorrelation, partial autocorrelation, and inverse autocorrelation plots to decide whether any autoregressive or moving average component should be used in the model. We also test whether the estimated model conforms to the specifications of a stationary univariate process: the residuals should be independent of each other and constant in mean and variance over the period.

LSTM Model LSTM is an artificial recurrent neural network technique. It is used because it has comparatively longer-term memory than a plain RNN, and it helps mitigate the vanishing gradient problem commonly seen in neural networks. It uses a series of gates contained in memory blocks connected through layers. The three types of gates are the input gate, which writes input to the cell; the forget gate, which resets the old cell value; and the output gate, which reads output from the cell. A number of hyperparameters must be specified for the LSTM model to predict optimally: one layer suffices for simple problems, two for complex features, and more than two layers make the dataset harder to train.

Hybrid Model (ARIMA-LSTM) To successfully identify the behaviour of a time series, it is essential to control its two main components: the trend and the cyclical component. The first describes the overall movement, while the second points to periodic fluctuations. The idea is therefore to transform the original time series into a smooth function by isolating it from its seasonal component. The residuals from the ARIMA model are trained using an LSTM, an artificial recurrent neural network technique; based on the performance of the two models, ARIMA is optimal for the seasonal component and LSTM is effective for the trend. Once this is done, we can combine the predicted trend and the reproduced cyclical component to construct a forecast of the original series, as shown in the sketch below. The predictions from the three models, viz. ARIMA, LSTM, and the ARIMA-LSTM hybrid, are finally tested on the collected 10-year dataset through evaluation metrics like RMSE and MAE, and the one with the better performance is finalized for further development of the system.
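A compact sketch of this hybrid scheme follows, assuming statsmodels for the ARIMA part and Keras for the LSTM trained on the ARIMA residuals; the order (p, d, q), window size, and network shape are illustrative assumptions, not the tuned values used in the study.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

def fit_hybrid(series, order=(2, 1, 2), window=4, epochs=200):
    arima = ARIMA(series, order=order).fit()          # linear component
    resid = np.asarray(arima.resid, dtype="float32")  # nonlinear remainder

    # supervised sliding windows over the residual series
    X = np.array([resid[i:i + window] for i in range(len(resid) - window)])
    y = resid[window:]
    X = X.reshape((-1, window, 1))

    lstm = Sequential([LSTM(32, input_shape=(window, 1)), Dense(1)])
    lstm.compile(optimizer="adam", loss="mse")
    lstm.fit(X, y, epochs=epochs, verbose=0)
    return arima, lstm

def forecast_next(arima, lstm, window=4):
    linear = float(np.asarray(arima.forecast(steps=1))[0])
    last = np.asarray(arima.resid, dtype="float32")[-window:]
    nonlinear = float(lstm.predict(last.reshape(1, window, 1), verbose=0)[0, 0])
    return linear + nonlinear                         # hybrid forecast
```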
Fig. 3 Observing data as non-stationary data
3.4 Data Processing If our dataset is non-stationary, we first transform it to stationary data, because non-stationary data are unpredictable and cannot be modeled or forecasted. To make a series stationary, we difference it; that is, we subtract the previous value from the current value. If the data is still not stationary, we may apply a nonlinear transformation such as a log transformation or a square root transformation. Test for Stationarity To check our dataset for stationarity, we look at plots for trends or seasonality, and we look at the results of a statistical test (the ADF test). We use the augmented Dickey–Fuller test through the adfuller() function of the statsmodels package. From the p-value returned by the adfuller() function, we can tell whether the series is stationary or not. If the p-value obtained is greater than the significance level of 0.05, the given time series is non-stationary, as shown in Fig. 3; since ARIMA must be fed with stationary data, such a series is differenced first. If the p-value is less than the significance level of 0.05, the time series is considered stationary, as shown in Fig. 4.
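A minimal sketch of this check, assuming a pandas Series of yearly tag counts for one domain:

```python
# Stationarity check with the augmented Dickey-Fuller test described above.
from statsmodels.tsa.stattools import adfuller

def adf_pvalue(series):
    # adfuller() returns (statistic, p-value, ...); index 1 is the p-value.
    return adfuller(series.dropna())[1]

def difference_until_stationary(series, alpha=0.05, max_diff=2):
    d = 0
    while adf_pvalue(series) >= alpha and d < max_diff:
        series = series.diff()   # subtract the previous value from the current
        d += 1
    return series.dropna(), d    # d is a candidate for the ARIMA 'd' order
```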
Fig. 4 Observing data as stationary data
3.5 Model Building and Parameter Estimation Before proceeding to the ARIMA model, we examine the autocorrelation of the series indexed in time. A stationary process has the property that its mean, variance, and autocorrelation structure do not change over time; from this we can conclude that the data is stationary once its autocorrelation no longer changes over time. Then, we proceed with building the ARIMA model for the selected technology domains and forecast the further tag counts to get an overview of the best tech stack to learn for building promising applications.
1. For building ARIMA, the parameters p, q, and d are pre-determined to feed the model with the ARIMA order:
i. p is the order of the autoregressive (AR) term
ii. q is the order of the moving average (MA) term
iii. d is the number of differencing steps required to make the time series stationary
2. Order p is the lag value, which can be obtained by analyzing the PACF (partial autocorrelation) plot: it is the lag at which the plot crosses the upper confidence interval for the first time.
3. Order q is obtained analogously from the ACF (autocorrelation) plot: the lag at which the plot crosses the upper confidence interval for the first time.
Fig. 5 In-sample forecasting for Python technology in programming domain
4. A grid search is also utilized to find the ARIMA order values that yield the least mean squared error; these order values are then used to build the ARIMA model (see the sketch after this list).
5. Once the ARIMA model has produced its forecast, the residuals are evaluated, and these actual differences are fed to the LSTM part of the hybrid model. The overall forecast is then formed from the ARIMA forecast together with the modeled residuals.
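A minimal sketch of this order search, assuming a univariate pandas Series of tag counts and small illustrative search ranges (the paper does not report its exact ranges):

```python
# Grid search over (p, d, q): pick the order with the lowest mean squared
# error on a held-out tail of the series.
import itertools
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from sklearn.metrics import mean_squared_error

def grid_search_order(series, test_size=4, p_range=3, d_range=2, q_range=3):
    train, test = series[:-test_size], series[-test_size:]
    best_order, best_mse = None, np.inf
    for p, d, q in itertools.product(range(p_range), range(d_range), range(q_range)):
        try:
            fit = ARIMA(train, order=(p, d, q)).fit()
            mse = mean_squared_error(test, fit.forecast(steps=test_size))
        except Exception:
            continue              # some orders fail to converge; skip them
        if mse < best_mse:
            best_order, best_mse = (p, d, q), mse
    return best_order, best_mse
```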
4 Experimental Analysis With the help of past values, the model is able to predict the upcoming two years using the hybrid model (ARIMA-LSTM). The model is trained up to the previous value to make the next prediction when forecasting. Figure 5 represents the in-sample forecasting data for the Python technology in the programming language domain. Figure 6 represents the out-of-sample forecasting data for Python in the same domain. Figure 7 represents the in-sample and out-of-sample forecasting data for the Java technology in the programming language domain. Figure 8 represents the overall comparison of forecasting data for all technologies in the programming language domain; it clearly indicates that Python will be the most dominant language compared to the other languages in the upcoming months. Figure 9 represents the overall comparison of forecasting data for all technologies in the frontend domain. Based on the forecast, React.js
Fig. 6 The raising curve for Python in programming domain through out-sample forecasting
Fig. 7 In-sample and out-sample forecasting for Java technology in programming domain
will be the most widely used in the upcoming months compared with the other technologies.
5 Performance Metrics For the analysis of the tech stack prediction, the evaluation metrics we used are mean square error (MSE), root mean square error (RMSE), and mean absolute error (MAE). Compared with the other models, the hybrid model (ARIMA-LSTM) has the lowest error rate and also yields greater accuracy.
Fig. 8 Comparison of all technologies in programming domain including past and forecasting data
Fig. 9 Comparison of all technologies in frontend domain including past and forecasting data

Table 1 Performance metrics for Python—programming domain

Metrics | ARIMA model | LSTM model | Hybrid model (ARIMA-LSTM)
MAE     | 1129.36     | 1133.54    | 1013.83
MSE     | 1936264.65  | 1986264.46 | 1653274.09
RMSE    | 1391.49     | 1409.34    | 1285.79
The parameters of the ARIMA model are chosen based on the above metrics, and the model is fitted and forecasted with the obtained parameters. Table 1 presents the comparison of all the evaluation metrics for the hybrid model's forecast of the Python technology. It is clearly evident that the hybrid model yields a lower error rate than the other models. The technologies considered per domain are:
Frontend: Angular, Node.js, Vue.js, jQuery, React
Backend: MongoDB, PostgreSQL, MySQL, SQLite, Elasticsearch
Machine Learning: NumPy, Pandas, MATLAB, PyTorch, Keras
Programming Language: Python, Java, R, C++, Kotlin
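The three metrics in Table 1 can be computed for any model's forecasts against the held-out observations; a minimal sketch:

```python
# MAE, MSE, and RMSE for forecasts y_pred against held-out observations y_true.
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

def report_metrics(y_true, y_pred):
    mae = mean_absolute_error(y_true, y_pred)
    mse = mean_squared_error(y_true, y_pred)
    rmse = np.sqrt(mse)          # RMSE is the square root of MSE
    return {"MAE": mae, "MSE": mse, "RMSE": rmse}
```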
6 Limitations of the Model The hybrid model, judged on its performance and accuracy, worked well on a small amount of data, and its predictions on unseen data correlate well with the tuned model's predictions. The limitation, however, lies in the collection of data: even though the model's forecasts fit reality quite well, we have a very small volume of data. As of now, the hybrid model can predict accurately up to 2 years ahead; once we collect more data from different sources and communities, we can feed the model with multivariate inputs for the further development process.
7 Conclusion In this paper, we dealt with forecasting the growth of popular technologies in each domain. Predicting which technology is going to flourish in the next decade allows student developer communities to spend their free time learning it, which also increases their skill set and job opportunities. Through our proposed hybrid model, we can pick out the one technology in each domain that is going to be trendy by evaluating its current state. The proposed system is based on a specific search engine's search tag counts analyzed over a specific time. In addition, an asynchronous task-processing system can be implemented to run the forecasting models toward the front end on user request.
Deceptive News Prediction in Social Media Using Machine Learning Techniques Anshita Malviya and Rajendra Kumar Dwivedi
Abstract Our society has witnessed numerous incidents concerning fake news, and social media has always played a significant role in spreading it. People with notorious mindsets are often the generators and spreaders of such incidents; these mischievous people spread fake news without even realizing the effect it has on naive people, who believe the fake news and start behaving accordingly. Fake news appeals to our emotions: it plays with our feelings, can make us angry, happy, or scared, and can incite hatred or anger toward a specific person. Nowadays, people easily befool each other using social media as a tool to spread fake news. In this paper, machine learning-based models are proposed for the detection of such deceptive news, which creates disturbances in our society. The models are implemented in Python using logistic regression, multinomial naive Bayes, the passive-aggressive classifier, and a multinomial classifier with hyperparameter tuning. Results show that the logistic regression algorithm outperforms the others in terms of accuracy in detecting fake news. Keywords Logistic regression · Multinomial NB algorithm · Passive-aggressive classifier · Multinomial classifier with hyperparameter · Count vectorizer · Tfidf vectorizer · Hash vectorizer
1 Introduction Social media is an ineluctable part of our community, yet the news found on it cannot be trusted. Nowadays, these social platforms are the medium of spreading fake news consisting of forged stories and vague quotes, facts, and sources. Such hypothetical and spurious stories are used to influence people's opinions on any issue. The term fake news amalgamates different concepts such as misinformation, disinformation, and mal-information. For some years now, fake news has spread chiefly through social media platforms like Twitter, Facebook, A. Malviya (B) · R. K. Dwivedi Department of Information Technology and Computer Application, MMMUT Gorakhpur, Gorakhpur, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_27
WhatsApp, YouTube, etc., in the form of videos, memes, advertisements, imposing content, and much more. This has become a serious issue, as it leads to serious crimes and affects the peace and brotherhood among people. Fake news comprises misinformation, disinformation, and mal-information, and such information goes viral because of the overload of information on social media and its users; the intent of the people or the medium of transfer determines which form the fake news takes. Misinformation means false information: news which is not true. Public opinion is changed using this form of fake news. Intentionally changing real news into fake news is referred to as disinformation; it can be dangerous on social platforms because a large amount of news is present there and many people use social media. Mal-information refers to true news that, when it goes viral, causes harm to a society, organization, or community. All three together constitute fake news, and they are circulated for fun or for some political or business interest. People do not give a second thought to the news they read on social media; the processing capacity of our brain is limited, so it makes judgements according to previous issues or facts. During the COVID-19 pandemic, much misinformation circulated on social media, including unauthentic home remedies, false advisories, and conspiracy theories. There was news of a financial crisis about to be foisted on India during the pandemic, which was fact-checked by the Press Information Bureau on 24 March 2020; even our Prime Minister requested people not to believe these rumours and to stay calm and motivated. A swamp of fake news was also seen on social media when the government introduced the CAA act in 2019: people thought that their citizenship would be taken away because of this act, which was not true, and the Supreme Court of India asked the government to explain the act to citizens and remove their doubts and misconceptions. The spread of fake news is also seen at election time. There are various ways in which fake news can be identified. Readers' emotions are influenced by fake news, which leads to anger and disputes. If the website address is fake, the author is anonymous, the source of the news is misrepresented, or the article is grammatically incorrect, these clearly indicate that the news is fake. People should also check the publication date of an article to establish the authenticity of the news. We readers can also take action against misinformation spread on social media: if we see a post trending with misinformation, we should report it, or we can report the people who are spreading the information and creating disturbance in the society. We could act as editors, find the truth behind articles, and protect our community from this problem. The rest of the paper is organized as follows. Section 2 presents the literature survey related to deceptive news prediction using machine learning algorithms. The proposed methodology is presented in Sect. 3. Section 4 gives an overview of the machine learning techniques used in the prediction of fake news. The experimental work is explained in Sect. 5. The conclusion and future directions are given in Sect. 6.
2 Literature Survey This section presents a detailed review of the detection of fake news and the classification of fake and real news on social media platforms. It has been observed that researchers have shown great interest in this area over the past few years. Mandical et al. [1] discussed the problem of fake news spreading worldwide and the importance of machine learning in its detection. The authors proposed a fake news classification system using three machine learning algorithms, namely naïve Bayes, the passive-aggressive classifier, and deep neural networks, on eight datasets; with the correct approach, the task of identifying fake news can be made tractable. Ksieniewicz et al. [2] described two methods of identifying fake news: the first is fact checking, which involves volunteers, and the second is based on intelligent systems. They used a different method for the detection of false news involving a stream data classification approach and concept drift, and benchmark data was used to evaluate the approach. Bhogade et al. [3] focused on the growth and popularity of social media, which contributes to the spread of false news and disturbs people's positive mindset. The authors collected various news stories and applied natural language processing, machine learning, and artificial intelligence, discussing the use of different machine learning models for predicting fake news and checking the authenticity of news stories. Ashtaputre et al. [4] proposed a model involving machine learning techniques and natural language processing for the detection of fake news. They compared different classification models and techniques to determine which gave the best result, using TFIDF vectorization for data preprocessing. Baarir and Djeffal [5] described the current threat to mankind posed by the wide spread of illegal information in the form of news on social media; the detection and prevention of fake news is among the most interesting research topics of the present time. A tfidf vectorizer was used as the data preprocessing technique, and a support vector machine classifier was used to construct the detection system. Bharath et al. [6] explored five machine learning algorithms, including logistic regression, support vector machines, naïve Bayes, and recurrent neural network models, and examined their effectiveness in solving the fake news detection task. They also performed sentiment analysis and concluded that naïve Bayes and SVM are the best approaches among the methods used. Sharma et al. [7] elaborated on the consequences and disadvantages of using social media platforms, where tendentious opinions on any issue are created among people. The authors performed binary classification of different news articles using artificial intelligence, natural language processing, and machine learning. Ahmed et al. [8] proposed techniques to check the authenticity of news. They concluded that machine learning methods are reliable for detecting fake news, and they evaluated the accuracy of their proposed model against other systems by applying
various combinations of machine learning algorithms such as support vector machine, the passive-aggressive classifier, logistic regression, and naïve Bayes. Kumar et al. [9] presented the significance of natural language processing techniques for classifying news as fake or real. They used text classification with various classification models to predict the result and concluded that the LSTM model is the best among those used. Waghmare and Patnaik [10] showed their concern for developing a technique that would predict fake news and control its widespread transmission, given its positive and negative impact. They used a blockchain framework together with machine learning methods for the detection and classification of social news; the blockchain framework is important for revoking the illegal gains of spreading false news. Khivasara et al. [11] proposed a web-based augmentation to assure readers of the authenticity of content. Deep learning models are used for the web augmentation; LSTM and GPT-2 are the two algorithms used by the authors to distinguish between real and fake news articles. Mishra [12] found influence associations among readers by proposing a HiMap model for the detection of fake news. Direct and indirect associations among readers can be captured using this approach; experiments were performed on two Twitter datasets, and accuracy was calculated against state-of-the-art models. Shaikh and Patil [13] measured the accuracy of detecting fake social news with the help of various machine learning approaches. Many fields, such as politics and education, are influenced by the spread of fake news, and because resources are limited, the deletion of fake news becomes complicated; TFIDF vectorization was used for feature extraction. Chokshi and Mathew [14] noted the share of users tricked by the spread of fake news: in 2020, 95% of the people protesting against the CAA law thought that their citizenship would be taken away, becoming victims of this problem. Different deep learning methods were used by the authors to address this problem, including artificial neural network and convolutional neural network architectures. Wang et al. [15] investigated the issues related to the prediction of fake news. They proposed a novel set-up consisting of a fake-article detector, an annotator, and a reinforced selector, because existing deep learning models were not capable of handling the dynamic nature of articles on social media; this set-up improved the efficiency of finding fake news. Lee et al. [16] proposed a deep learning architecture for detecting fake news in Korean articles. Sentences written in Korean are shorter than English sentences, which creates problems, and therefore different CNN-based deep learning architectures were used to resolve the issue. Qawasmeh et al. [17] accepted that, compared with traditional text-based analysis approaches, the detection of fake news is tougher. It has been observed that classical machine learning methods are less efficient than neural network models; they used the latest machine learning approaches for the automatic identification of fake news.
Agarwalla et al. [18] discussed people's intentions in spreading fake news and creating disturbance in society. They sought an efficient and relevant model for the prediction of fake news using classification algorithms and natural language processing. Han and Mehta [19] used naïve Bayes and hybrid CNN and RNN approaches to mitigate the problem of fake news. They compared machine learning and deep learning approaches to find accurate systems for the detection of false news. Manzoor and Singla [20] described the success of fake news detection using various machine learning algorithms, but noted that the ongoing change in the features and orientation of fake news on different social media platforms can be addressed by various deep learning approaches such as CNN, the deep Boltzmann machine, deep neural networks, natural language processing, and many more.
3 Proposed Methodology The methodology used to build models based on different machine learning algorithms, analyse their accuracy, and separate fake from true news consists of the following algorithm, also depicted in Fig. 1.
Algorithm Deceptive news prediction
Input: News dataset
Output: Accuracy of different models for predicting fake news
Begin
Step 1: Take a news dataset consisting of fake and real news articles
Step 2: Preprocess the data
Step 3: Split the data into train and test datasets
Step 4: Design and train the models with machine learning algorithms like logistic regression, multinomial naïve Bayes, etc.
Step 5: Test the models
Step 6: Analyse the models
End
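As an illustration, the six steps above can be sketched as the following scikit-learn pipeline; the file name news.csv is a placeholder, and the vectorizer/model choices here are one of the combinations described in Sects. 4 and 5, not the complete experiment.

```python
# Minimal end-to-end sketch of the six algorithm steps.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

df = pd.read_csv("news.csv").dropna(subset=["text", "label"])    # Step 1
vec = TfidfVectorizer(stop_words="english")                       # Step 2
X = vec.fit_transform(df["text"])
X_train, X_test, y_train, y_test = train_test_split(              # Step 3
    X, df["label"], test_size=0.33, random_state=42)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)   # Step 4
pred = model.predict(X_test)                                      # Step 5
print("accuracy:", accuracy_score(y_test, pred))                  # Step 6
```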
4 Selection of Machine Learning Techniques Machine learning, a subset of artificial intelligence, is so pervasive today that we use it several times a day without being aware of it. We cannot imagine this world without machine learning: we have already gained so many things from it and will gain more in the future.
Fig. 1 Methodology for fake news detection (flowchart: Start → Get the news dataset → Preprocessing of dataset → Split the dataset in train and test datasets → Design and train the models → Test the models → Analyze the accuracy of models → Stop)
Learning is a native behaviour of living beings: they gain new knowledge from their surroundings and modify it through experiences, both happiness and the hurdles that come their way. Simulating this learning ability of living beings in machines is what we know as machine learning. The machine learning algorithms used in the experimental work are discussed below.
4.1 Logistic Regression (LR) Logistic regression is a supervised machine learning technique and one of the most widely used algorithms. The output of a categorical dependent attribute is predicted from a set of independent attributes, and the prediction takes the form of a probabilistic value (between 0 and 1). It is used to solve classification problems. An 'S'-shaped logistic (sigmoid) function takes the place of the regression line for predicting the two values (0 or 1). Data classification can be performed on continuous and discrete datasets, and the concept of a threshold value is used in logistic regression. Its two main assumptions are that the dependent attribute is categorical and that no multi-collinearity is present among the independent variables. There are three types of logistic regression: binomial, multinomial, and ordinal. The equation of logistic regression is given below.

log[y/(1 − y)] = b0 + b1x1 + b2x2 + ··· + bnxn
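The equation above can be evaluated directly: the log-odds are a linear function of the features, and the sigmoid maps them back to a probability between 0 and 1. A minimal sketch (the coefficient values below are illustrative):

```python
import numpy as np

def predict_proba(x, b0, b):
    log_odds = b0 + np.dot(b, x)              # b0 + b1*x1 + ... + bn*xn
    return 1.0 / (1.0 + np.exp(-log_odds))    # inverse of log[y/(1 - y)]

# Example: two features with assumed coefficients.
p = predict_proba(np.array([0.5, 1.2]), b0=-1.0, b=np.array([0.8, 0.3]))
print(p)  # probability of the positive class
```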
4.2 Multinomial Naïve Bayes (MNB) The multinomial naïve Bayes algorithm is a probabilistic learning approach commonly used in natural language processing, built on Bayes' theorem. The probability of each class for a given item is calculated, and the output is the class with the highest probability. Many algorithms fall under the naïve Bayes family, all sharing the principle that each attribute is independent of the others. The Bayes theorem formula is as follows.

P(A|B) = P(A) × P(B|A)/P(B)

The advantages of multinomial NB are that it is easy to implement, can be used for continuous and discrete data, predicts real-time applications simply, and can handle huge datasets. The algorithm is not suitable for regression; it is suitable for textual data classification rather than predicting numerical data.
4.3 Passive-Aggressive Classifier (PAC) The passive-aggressive classifier is a machine learning algorithm that is not very popular among enthusiasts, yet it is very efficient for applications such as the detection of fake news on social media. The algorithm is similar to a perceptron model: it uses a regularization parameter and does not use a learning rate. Passive refers
to making no change to the model when the prediction is correct, and aggressive refers to changing the model when the prediction is incorrect. It is an online-learning classification algorithm; online learning is one of the categories of machine learning, alongside supervised, unsupervised, batch, instance-based, and model-based learning. In the passive-aggressive classifier, a system can be trained incrementally by feeding it instances continuously and sequentially, individually or in small batches.
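The incremental training described above maps onto scikit-learn's partial_fit() interface; a minimal sketch, assuming documents arrive in batches (the streaming setup is an assumption, not from the paper):

```python
# Passive-aggressive classification as online learning: the model is updated
# one mini-batch of documents at a time.
from sklearn.linear_model import PassiveAggressiveClassifier
from sklearn.feature_extraction.text import HashingVectorizer

vec = HashingVectorizer(n_features=2**18)   # stateless, so safe for streams
pac = PassiveAggressiveClassifier()

def update(batch_texts, batch_labels):
    X = vec.transform(batch_texts)
    # classes must be supplied on the first partial_fit call so that all
    # labels (0 = fake, 1 = real) are known up front.
    pac.partial_fit(X, batch_labels, classes=[0, 1])
```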
4.4 Multinomial Classifier with Hyperparameter (MCH) The multinomial classifier with hyperparameter is a naïve Bayes algorithm that is not very popular among machine learning enthusiasts. It is mostly suitable for text data: it is a multinomial naïve Bayes classifier whose hyperparameter is tuned.
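One common reading of this setup is multinomial naïve Bayes with a tuned smoothing parameter alpha; the grid below is an assumption, since the paper does not list its search values.

```python
# Multinomial NB with hyperparameter tuning via cross-validated grid search.
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import MultinomialNB

grid = GridSearchCV(
    MultinomialNB(),
    param_grid={"alpha": [0.01, 0.1, 0.5, 1.0]},   # Laplace/Lidstone smoothing
    cv=5, scoring="accuracy",
)
# grid.fit(X_train, y_train); grid.best_params_ then gives the chosen alpha.
```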
5 Experimental Work 5.1 Data Collection The first step in developing the classification model is collecting data. The goodness of the predictive model depends on the quality and quantity of the data collected, which makes this one of the most important steps in developing a machine learning model. The news dataset is taken from the Kaggle repository. It consists of 20,800 instances and five attributes, namely ID, title, author, text, and label. The ID attribute is a unique ID for the news article, the title attribute gives the title of the news, the author attribute gives the name of the author of the article, the text attribute contains the entire news story, and the label attribute indicates the authenticity of the article as zero or one. Five instances (a portion) of the dataset are shown in Fig. 2.
Fig. 2 Five instances of the dataset
5.2 Preprocessing of Fake News Dataset In data preprocessing, we take the text attribute from the dataset, which comprises the actual news articles. To make the model more predictive, we modify this text attribute so that more information can be extracted; this is done using the nltk library. First, we remove the stopwords present in the articles. Stopwords are words used to connect sentences and mark their tense; they carry little importance in the context of a sentence and can be removed. Tokenization is then performed, followed by vectorization. Vectorization is a natural language processing technique in which words are mapped to vectors of real numbers for semantic prediction. Three vectorization techniques are used in this paper, namely count, hash, and tfidf vectorizers; they extract features from the text with the aim of building the models. Tfidf stands for term frequency–inverse document frequency. This vectorizer transforms text into a meaningful numeric representation and is very commonly used to fit machine learning algorithms for prediction. The count vectorizer transforms a given text into a vector based on the number of occurrences of each word in the text and is helpful in the case of multiple texts. The hashing vectorizer uses a hashing technique to map string tokens to integers; it transforms a collection of documents into a sparse matrix containing the token counts.
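The three vectorizers named above are available in scikit-learn; a minimal sketch on toy articles:

```python
# Each vectorizer maps cleaned article text to a numeric feature matrix.
from sklearn.feature_extraction.text import (
    CountVectorizer, HashingVectorizer, TfidfVectorizer)

texts = ["fake news spreads fast", "real news is verified"]   # toy articles
vectorizers = {
    "tfidf": TfidfVectorizer(stop_words="english"),
    "count": CountVectorizer(stop_words="english"),
    "hash":  HashingVectorizer(stop_words="english", n_features=2**10),
}
features = {name: v.fit_transform(texts) for name, v in vectorizers.items()}
# HashingVectorizer is stateless: fit_transform() just hashes tokens to columns.
```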
5.3 Design and Train/Test the Models Before building the models, we divided the dataset into two parts, train and test. The train dataset consists of 67% of the instances and the test dataset of 33%. The independent attribute taken for training the models is the text attribute, and the label attribute is taken as the dependent attribute.
5.4 Analyse the Models We evaluated the models using the confusion matrix and the accuracy report. Table 1 compares the accuracy of the models under the three types of vectorizer. With tfidf preprocessing, the highest accuracy is achieved by the passive-aggressive classifier and logistic regression models. With the count vectorizer and the hash vectorizer, the logistic regression model gives the best performance, at about 95% and 93%, respectively.
Table 1 Comparison of accuracy

S. No. | Machine learning algorithms                | Tfidf vectorizer | Count vectorizer | Hash vectorizer
1.     | Multinomial naïve Bayes                    | 0.900            | 0.898            | 0.876
2.     | Passive-aggressive classifier              | 0.951            | 0.935            | 0.925
3.     | Multinomial classifier with hyperparameter | 0.900            | 0.898            | 0.876
4.     | Logistic regression                        | 0.950            | 0.949            | 0.926
Figure 3 depicts the accuracy report for each algorithm, in which the X-axis represents the machine learning techniques and the Y-axis their accuracy. Figure 4 shows the accuracy of the algorithms versus the vectorizers, in which the X-axis shows the different vectorization techniques and the Y-axis their accuracy.
Fig. 3 Accuracy of ML techniques

Fig. 4 Accuracy of algorithms versus vectorizers
Table 2 Confusion matrix

S. No. | Vectorization      | Machine learning algorithm                 | True label \ predicted label | Fake news | Real news
1.     | Tfidf vectorizer   | Multinomial naïve Bayes                    | Fake news | 3238 | 151
       |                    |                                            | Real news | 453  | 2193
       |                    | Passive-aggressive classifier              | Fake news | 3233 | 156
       |                    |                                            | Real news | 142  | 2504
       |                    | Multinomial classifier with hyperparameter | Fake news | 3237 | 152
       |                    |                                            | Real news | 448  | 2198
       |                    | Logistic regression                        | Fake news | 3259 | 130
       |                    |                                            | Real news | 170  | 2476
2.     | Count vectorizer   | Multinomial naïve Bayes                    | Fake news | 3136 | 253
       |                    |                                            | Real news | 364  | 2282
       |                    | Passive-aggressive classifier              | Fake news | 3168 | 221
       |                    |                                            | Real news | 170  | 2476
       |                    | Multinomial classifier with hyperparameter | Fake news | 3136 | 253
       |                    |                                            | Real news | 364  | 2282
       |                    | Logistic regression                        | Fake news | 3236 | 153
       |                    |                                            | Real news | 155  | 2491
3.     | Hashing vectorizer | Multinomial naïve Bayes                    | Fake news | 3297 | 92
       |                    |                                            | Real news | 659  | 1987
       |                    | Passive-aggressive classifier              | Fake news | 3155 | 234
       |                    |                                            | Real news | 216  | 2430
       |                    | Multinomial classifier with hyperparameter | Fake news | 3293 | 96
       |                    |                                            | Real news | 656  | 1990
       |                    | Logistic regression                        | Fake news | 3183 | 206
       |                    |                                            | Real news | 242  | 2404
Table 2 presents the confusion matrices. A confusion matrix is used to assess the performance of a classification model on test data for which the true labels are known.
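The quantities in Tables 1 and 2 can be computed for any of the fitted models with scikit-learn's metrics; a minimal sketch:

```python
# Accuracy plus the 2x2 confusion matrix of true vs. predicted labels.
from sklearn.metrics import accuracy_score, confusion_matrix

def evaluate(model, X_test, y_test):
    pred = model.predict(X_test)
    # Rows: true (fake, real); columns: predicted (fake, real).
    return accuracy_score(y_test, pred), confusion_matrix(y_test, pred)
```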
6 Conclusions and Future Directions In this paper, machine learning models are developed, namely the logistic regression model, the multinomial naïve Bayes model, the passive-aggressive classifier model, and the
multinomial classifier with hyperparameter, to predict fake news. Results show that, overall, the logistic regression model gives the best accuracy among the proposed models across all the data preprocessing algorithms used for detecting deceptive news spreading on social platforms and influencing human behaviour. It is also found that with the tfidf vectorizer, the passive-aggressive classifier model gives the best result, with 95.1% accuracy. As future work, more experiments can be done with additional machine learning algorithms, preprocessing techniques, and datasets to find an efficient system for detecting fake news.
References 1. R.R. Mandical, R. Monica, N. Mamatha, A.N. Krishna, N. Shivakumar, Identification of Fake News Using Machine Learning (IEEE, 2020). ISBN: 978-1-7281-6828-9/20 2. P. Ksieniewicz, P. Zyblewski, M. Choras, R. Kozik, A. Giełczyk, M. Wozniak, Fake News Detection from Data Streams (IEEE, 2020). ISBN: 978-1-7281-6926-2/20 3. M. Bhogade, B. Deore, A. Sharma, O. Sonawane, M.S. Changpeng, A research paper on fake news detection. Int. J. Adv. Sci. Res. Eng. Trends 6(6) (2021). ISSN (Online) 2456-0774. https://doi.org/10.51319/2456-0774.2021.6.0067 4. P. Ashtaputre, A. Nawale, R. Pandit, S. Lohiya, A machine learning based fake news content detection using NLP. Int. J. Adv. Sci. Technol. 29(7), 11219–11226 (2020) 5. N.F. Baarir, A. Djeffal, Fake news detection using machine learning, in 2020 2nd International Workshop on Human-Centric Smart Environments for Health and Well-being (IHSH). ISBN: 978-1-6654-4084-4/21 6. G. Bharath, K.J. Manikanta, G.B. Prakash, R. Sumathi, P. Chinnasamy, Detecting fake news using machine learning algorithms, in 2021 International Conference on Computer Communication and Informatics (ICCCI—2021), Coimbatore, 27–29 Jan 2021 (IEEE). ISBN: 978-1-7281-5875-4/21/2021 7. U. Sharma, S. Saran, S.M. Patil, Fake news detection using machine learning algorithms. Int. J. Eng. Res. Technol. (IJERT). ISSN: 2278-0181. Special issue—(2021), NTASU–2020 conference proceedings 8. S. Ahmed, K. Hinkelmann, F. Corradini, Development of fake news model using machine learning through natural language processing. World Acad. Sci. Eng. Technol. Int. J. Comput. Inf. Eng. 14(12) (2020). ISNI: 00000000 919 5 0263 9. K.A. Kumar, G. Preethi, K. Vasanth, A study of fake news detection using machine learning algorithms. Int. J. Technol. Eng. Syst. (IJTES) 11(1), 1–7 (2020). ISSN: 0976-1345 10. A.D. Waghmare, G.K. Patnaik, Fake news detection of social media news in blockchain framework. Indian J. Comput. Sci. Eng. (IJCSE) 12(4) (2021). https://doi.org/10.21817/indjcse/2021/ v12i4/211204151. e-ISSN: 0976-5166, p-ISSN: 2231-3850 11. Y. Khivasara, Y. Khare, T. Bhadane, Fake news detection system using web-extension, in 2020 IEEE Pune Section International Conference (PuneCon), Vishwakarma Institute of Technology, Pune, India, 16–18 Dec 2020 (IEEE, 2020). ISBN: 978-1-7281-9600-8/20 12. R. Mishra, Fake news detection using higher-order user to user mutual-attention progression in propagation paths, in Computer Vision Foundation 2020 Workshop 13. J. Shaikh, R. Patil, Fake news detection using machine learning, in International Symposium on Sustainable Energy, Signal Processing and Cyber Security (iSSSC) (IEEE, 2020). ISBN: 978-1-7281-8880-5/20/ ©2020 IEEE. https://doi.org/10.1109/iSSSC50941.2020.93588 14. A. Chokshi, R. Mathew, Deep learning and natural language processing for fake news detection: a survey, in International Conference on IoT Based Control Networks and Intelligent Systems. ICICNIS (2020)
Deceptive News Prediction in Social Media Using Machine …
367
15. Y. Wang, W. Yang, F. Ma, J. Xu, B. Zhong, Q. Deng, J. Gao, Weak supervision for fake news detection via reinforcement learning. Assoc. Adv. Artif. Intell. (2020). www.aaai.org 16. D.H. Lee, Y.R. Kim, H.J. Kim, S.M. Park, Y.J. Yang, Fake news detection using deep learning. J. Inf. Process Syst. 15(5), 1119–1130 (2019). ISSN 1976-913X (Print), ISSN: 2092-805X (Electronic). https://doi.org/10.3745/JIPS.04.0142 17. E. Qawasmeh, M. Tawalbeh, M. Abdullah, Automatic identification of fake news using deep learning, in Sixth International Conference on Social Networks Analysis, Management and Security (SNAMS) (IEEE, 2019). ISBN: 978-1-7281-2946-4/19 18. K. Agarwalla, S. Nandan, V.A. Nair, D.D. Hema, Fake news detection using machine learning and natural language processing. Int. J. Recent Technol. Eng. (IJRTE) 7(6) (2019). ISSN: 2277-3878 19. W. Han, V. Mehta, Fake news detection in social networks using machine learning and deep learning: performance evaluation, in International Conference on Industrial Internet (ICII) (IEEE, 2019). ISBN: 978-1-7281-2977-8/19 ©2019 IEEE. https://doi.org/10.1109/ICII.2019. 00070 20. S.I. Manzoor, D.J.N. Singla, Fake news detection using machine learning approaches: a systematic review, in Proceedings of the Third International Conference on Trends in Electronics and Informatics (ICOEI 2019). IEEE Xplore Part Number: CFP19J32-ART; ISBN: 978-1-5386-9439-8
Several Categories of the Classification and Recommendation Models for Dengue Disease: A Review Salim G. Shaikh, B. Suresh Kumar, and Geetika Narang
Abstract Dengue fever is becoming more familiar with each passing year. To control the disease, it is necessary to conduct a complete analysis of the dengue-affected regions and the condition's symptoms. Dengue fever is spread by mosquitos: it is caused by a family of viruses called Flaviviridae, with four genetic variants, transmitted through the bite of infected Aedes mosquitoes. Dengue fever puts over 2.5 billion people worldwide at risk, with approximately 100 million new cases reported each year, and its global prevalence has risen considerably in recent years. Upwards of 100 countries in the Americas, East Asia, the western Pacific, Africa, and the eastern Mediterranean now have the disease. In this paper, the various signs and symptoms of the dengue virus are discussed, and the dengue classification and recommendation-based techniques are surveyed and analyzed. Different recommendation and classification models, such as artificial neural network (ANN), support vector machine (SVM), ensemble learning, random forest, and decision tree, are compared with the help of performance metrics such as accuracy, specificity, and sensitivity. Keywords Dengue fever · Artificial neural network · Support vector machine
1 Introduction Dengue fever is a mosquito-borne infection caused by the DENV virus and spread by the Aedes aegypti mosquito. According to the World Health Organization (WHO), around 4.2 million suspected dengue cases were identified globally in 2019. Earlier, the same organization released an advisory designating dengue fever S. G. Shaikh (B) Department of CSE, Amity University, Jaipur, Jaipur, India e-mail: [email protected] B. Suresh Kumar Department of CSE, SGU Kolhapur, Kolhapur, India G. Narang Department of CSE, TCOER, Pune, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_28
as one of the year's most dangerous infections. A large outbreak of dengue fever occurred throughout Brazil in 2019, with a 149% increase in prevalence in certain regions, the majority of cases attributable to one particular variant of the virus (DENV-2) [1]. The Aedes aegypti mosquito replicates in wet conditions and prefers to breed in tropical environments with high rainfall. To prevent infection, governments fund public information campaigns encouraging people to properly dispose of tires and bottles left outdoors, because these can collect water and provide a suitable long-term habitat for mosquitos. Dengue fever puts over 2.5 billion people worldwide at risk, with approximately 100 million new cases reported each year. Its global prevalence has risen considerably in recent years: upwards of 100 countries in the Americas, East Asia, the western Pacific, Africa, and the eastern Mediterranean now have the disease, with the eastern Mediterranean, Africa, the West Pacific, the Americas, and Southeast Asia perhaps the most severely affected [2]. Dengue fever is also known as dandy fever, dengue hemorrhagic fever, or breakbone fever. It is an infectious disease caused by a virus spread by mosquitoes; the mosquito species that transmits dengue also spreads chikungunya, yellow fever, and the Zika virus. Dengue is spread through the bite of a dengue virus-infected Aedes mosquito: whenever a mosquito bites an individual with the infectious agent in their bloodstream, the mosquito becomes contaminated. The disease is not contagious and cannot be passed directly from one individual to the next. Signs and symptoms of dengue often appear 4–6 days after infection and can continue for up to ten days. Sudden fever with high temperature is the most common sign of dengue, accompanied by unbearable headaches, ache behind the eyes, excruciating joint and muscle pain, fatigue, nausea, and vomiting; 2–5 days following the commencement of the fever, rashes emerge on the skin [3], along with mild bleeding signs such as easy bruising, nosebleeds, and bleeding gums. Signs and symptoms can be mild and misinterpreted as the flu or other conditions. The virus affects small children and persons who have never had it before more gently than it does older children and young adults; yet, significant complications can arise. Dengue hemorrhagic fever is a clinical syndrome marked by a high temperature, damage to lymphatic blood vessels, bleeding from the nostrils and mouth, swelling of the liver, and failure of the circulatory system. Excessive bleeding, shock, and mortality may follow these signs; this condition is known as dengue shock syndrome (DSS). Dengue hemorrhagic fever is more common in immunocompromised patients and in those with a second dengue infection [4]. Over the last two decades, several machine learning (ML) approaches have been applied in various areas, including geography, the ecosystem, and health care, to extract significant results from complex and heterogeneous datasets. Machine learning, unlike classical statistical approaches, can include a substantial number of explanatory variables, model the interrelationships among different factors, and fit complex systems without assuming a functional form (e.g., linear, exponential, or logistic) of the underlying process.
In dengue-forecasting studies, support vector machines, artificial neural networks, decision trees, ensemble learning, and random forests are the common machine learning-based prediction techniques.
The sections of this paper are organized as follows: a brief introduction and history of dengue are given in Sect. 1. In Sect. 2, various existing methods for dengue prediction are surveyed. Problems with dengue prediction, along with the different types of data sources for dengue classification and prediction systems, are described in Sect. 3. Section 4 explains the various categories of recommendation and classification models for dengue disease.
2 Literature Review Lee et al. [5] studied the virus's dynamic dissemination structure from the perspective of a stochastic process. Considering cross-infection of humans and mosquitoes across multiple disease phases, the authors estimated the parameters of an epidemiological compartment system and investigated the atmospheric and insect factors that might influence an epidemic, in order to anticipate dengue cases and outbreaks. They used time-series data from multiple metropolitan areas of 4.7 million inhabitants, building a finite-difference approximation of the compartment framework together with variance decomposition, a Markov chain Monte Carlo technique, and an integral boundary technique to analyze the epidemic intervals of dengue infection under the influence of environmental variables. Gangula et al. [6] designed a classification model to identify the key traits that propagate dengue fever. The authors noted that machine learning is among the essential methodologies in modern assessment, with numerous strategies in clinical use. Since dengue fever is among the most dangerous infections, it necessitates reliable predictions by a well-trained model; in a hybrid implementation, the authors used an ensemble machine learning approach to uncover the variables linked to the transmission of dengue fever and to boost efficiency. Silitonga et al. [7] presented a random forest classification technique within a tenfold cross-validation framework, used to generate a more accurate and consistent model that predicts the severity level of dengue within the critical phase when laboratory test results are known, since it produced the highest accuracy (58%) among the classification models. The goals of that study were to measure the performance of models built to identify the proper category within a particular dataset, using an artificial neural network classifier and a random forest classifier independently, and to discover the best-performing classifier; the findings would be employed in the creation of machine learning systems that can forecast the clinical severity of dengue fever in the critical phase when testing parameters are available. Chakraborty and Chandru [8] developed a strategic framework for dengue detection. Several contemporary dengue-forecasting models take advantage of established correlations between environmental and socio-demographic parameters and transmission rates, but they are not dynamic enough to account for rapid and
abrupt growth and decline in occurrence count data. The authors therefore developed a nonparametric, adaptable Gaussian process-based regression model built on previous dengue occurrence counts and potential climatological confounders. They demonstrated that the proposed method outperforms other methodological approaches and showed that it is a robust strategy and framework for health experts. Balamurugan et al. [9] designed an innovative feature selection method, the entropy weighted score-based optimal ranking algorithm. The proposed framework proved to be a valuable and effective technique for healthcare prediction and diagnosis, with outstanding feature selection for quickly identifying the attributes (components) responsible for the condition's primary causes. The dengue dataset in that investigation was constructed by gathering clinical laboratory test results of adult and paediatric patients as real-time samples from several medical clinics in Tamil Nadu's Thanjavur region. The prediction model followed a statistical methodology, and the project's outcomes surpass conventional systems. Mussumeci and Coelho [10] presented a machine learning approach to anticipate weekly dengue prevalence in 790 Brazilian cities. Their approach includes feature selection with the least absolute shrinkage and selection operator (LASSO), random forest regression, and a deep recurrent neural network with long short-term memory (LSTM). To identify the geographic dimension of illness propagation, the authors employed a multidimensional statistical model as classifier along with time-series data from comparable cities. The LSTM recurrent neural network outperformed all the others in forecasting future dengue outbreaks in towns of various shapes and sizes. In Table 1, the various existing techniques are listed with their research gaps and performance metrics. The different methods of dengue prediction, with merits and demerits, are illustrated in Table 2.
3 Issues Occurring in Dengue Fever Dengue fever is a significant global emerging sickness that places a major strain on the medical systems of affected countries, necessitating the development of a high-performance machine learning prediction model. It is hard to distinguish dengue from several other common febrile conditions before complications appear, so a fast and inexpensive approach is desperately required to promote timely detection, both to improve patient treatment and to enable the efficient use of available resources, as well as to flag patients at elevated risk of adverse effects. It would also be advantageous to analyze early disease attributes with commonly available diagnostic procedures across the wide variety of dengue illnesses encountered in the field, in order to build a robust case base for constructing predictive classifiers [11, 12]. Timely identification of infections with an expanded surveillance focus and an early response to looming epidemics may be tremendously beneficial in reducing the number of dengue illnesses globally.
Table 1 Dengue prediction and recommendation existing models

Author's year | Proposed method | Gap/problem definition | Performance metrics
Lee et al. (2021) [5] | Vector compartment technique for dengue prediction | Computational time and memory consumption is high | Accuracy
Gangula et al. (2021) [6] | Ensemble machine learning-based framework | Overfitting issues and taking more time to train | Accuracy
Silitonga et al. (2021) [7] | Machine learning-based approach | Overlapping matters in a large dataset | Accuracy
Chakraborty and Chandru (2020) [8] | Gaussian process-based regression methodology | Overfitting issues | Root mean square error; mean absolute deviation
Balamurugan et al. (2020) [9] | Entropy weighted score-based optimal ranking technique | Limited dataset and need to enhance feature selector technique | Accuracy; recall; precision; receiver operating characteristic (ROC); true positive rate; false positive rate
Mussumeci and Coelho (2020) [10] | Large scale-based multivariate prediction model | Training time is enormous; requires large memory for training | Mean squared log error; mean squared error
When combined with risk mapping to determine places prone to dengue infection, these strategies might deliver significant health benefits by preventing emergence and spread, reducing cases, and lowering overall illness and dengue fatality.
3.1 Data Sources for Dengue Classification and Prediction Systems The various data sources for dengue classification and prediction systems are depicted in Fig. 1.
3.2 Traditional Data Sources Health care and epidemiology information from conventional health record systems (such as hospitals), ecological and climatic data from meteorological organizations, and geographic and demographic statistics from other relevant government resources are examples of traditional data sources.
Table 2 Existing techniques of dengue prediction models with merits and demerits

Author's year | Techniques | Merits | Demerits
Lee et al. (2021) [5] | Markov chain-based Monte Carlo technique | The unknown parameters can be reconstructed, with parameter estimation using the technique's entire probability distribution function | Convergence issues
Gangula et al. (2021) [6] | Decision tree; support vector machine (SVM); naive Bayes | Reduces the dispersion issues and provides robustness | High deployment cost
Silitonga et al. (2021) [7] | Artificial neural network (ANN); random forest | Handles high-dimensional data | High computational cost
Chakraborty and Chandru (2020) [8] | Generalized additive model (GAM); random forest | As the number of possible determinants is considerable, it is capable of simulating massively complex exponential associations | Nonparametric, and its computational complexity is high
Balamurugan et al. (2020) [9] | Support vector machine; naive Bayes; multilayer perceptron | Provides efficient and effective results | High computing power
Mussumeci and Coelho (2020) [10] | Long short-term memory (LSTM); random forest regression technique | Low complexity to update the weights | Overfitting issues; highly sensitive to random weights
3.3 Modern Data Sources Due to advancements in technology, new data sources that can predict dengue outbreaks have become accessible, with large volumes of information now available on the Web. Nowadays, epidemiology investigators pay more attention to this reasonably new and distinctive class of data sources [13].
Fig. 1 Various data sources for dengue classification and prediction systems (traditional data sources: WHO, hospitals, department of meteorology; modern data sources: phone calls, biosensors/sensors, social networks)
4 Different Categories of Recommendation and Classification Models for Dengue Disease Dengue is a significant universal medical concern that affects and harms people worldwide, and various recommendation and classification models have been developed for its early detection. This section explains several categories of recommendation and classification models, such as artificial neural network (ANN), ensemble learning, random forest, and support vector machine (SVM), for dengue disease. The dengue recommendation and classification methods are also compared in the analysis.
4.1 Ensemble Learning Ensemble learning is used for classification and discards useless features that are not essential in model training; it can serve as the classifier in a dengue disease recommendation model. Ensemble learning is built on the premise that combining the outcomes of multiple models produces better results than using a single model. The logic rests on the idea of constructing a set of hypotheses using several approaches and afterward combining
them to achieve better performance than acquiring only a single hypothesis from a particular method [6].
4.2 Artificial Neural Network An artificial neural network (ANN) is a technique modeled on the workings of the central nervous system. Such frameworks are inspired by the biological nervous system, although they employ only a subset of the principles found in biological nervous systems; in particular, ANN models mimic the electrical impulses of the central nervous system. Components, often referred to as nodes or perceptrons, are linked to one another. Neural network models are a class of computing methods motivated by the biological central and peripheral nervous systems, trained to identify complex features and solve classification tasks without explicit programming [14]. The algorithms recognize distinctive traits in the evaluated instances autonomously; artificial neurons are the nodes that make up these computational models. In dengue recommendation systems, an ANN can be used as a dengue detector and classifier.
4.3 Support Vector Machine
A support vector machine (SVM) is one of the simplest ways to classify two or more data classes. SVM is a machine learning algorithm based on supervised learning. It helps in classification as well as regression. An SVM is a linear margin classification model that may also be used in nonlinear situations. SVMs are powerful and demanding data categorization technologies. An SVM divides the dataset into two segments using hyperplanes to classify the data. It is a sophisticated process that outlines the association between attributes and outcomes using multivariate levels. Despite its complexity, it may be used for real-world issues requiring categorization and forecasting [15]. There are several symptoms of dengue, as mentioned in the previous section; to classify such symptoms, an SVM classifier can be used, and it has provided efficient results in detection and classification.
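For instance, an SVM could be trained on symptom indicator features as in the brief sketch below; the binary symptom encoding and labels are hypothetical illustrations, not the setup of the cited study [15].

# Hypothetical sketch: classifying dengue vs. non-dengue from binary
# symptom flags (fever, joint pain, rash, headache)
from sklearn.svm import SVC

X = [[1, 1, 1, 1],  # all four symptoms present
     [1, 0, 0, 1],
     [0, 0, 0, 1],
     [0, 0, 0, 0]]
y = [1, 1, 0, 0]    # 1 = dengue, 0 = non-dengue (toy labels)

clf = SVC(kernel="rbf")  # RBF kernel allows a nonlinear decision boundary
clf.fit(X, y)
print(clf.predict([[1, 1, 0, 1]]))  # prediction for a new symptom profile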
4.4 Random Forest
Random forest is a flexible, straightforward computational model that, in the vast majority of circumstances, produces strong results with or without hyper-parameter tuning. Along with its simplicity and versatility, it has become one of the most commonly used approaches for the classification of dengue disease, and it can be used for both classification and regression tasks. Essential characteristics of the random forest algorithm are that it can handle datasets with both categorical and continuous attributes, as in regression and classification problems. Random forest is a supervised machine learning (ML) method that combines many decision trees, typically trained with the "bagging" approach, into a "forest." The essential idea of the bagging process is that integrating many techniques and algorithms boosts output considerably [16].

Table 3 Comparison of different recommendation and classification models
Recommendation and classification models | Accuracy (%) | Specificity | Sensitivity
Artificial neural network [14] | 96 | 97% | 96%
Support vector machine (SVM) [15] | 82 | 87% | 76%
Ensemble learning [6] | 95 | Nil | Nil
Random forest [16] | 92.3 | 92.1% | 94%
Decision tree [17] | 99 | 84% | 90%
4.5 Decision Tree
Decision trees are a data mining method that combines computational and statistical approaches to aid in the characterization, classification, and refinement of a database. Tree-shaped arrangements that reflect sets of decisions are known as decision trees. Such choices result in classification rules for the dataset. The decision tree's principal goal is to reveal the hidden patterns and relationships in the structure [17]. In dengue classification models, decision trees are used in two ways: the first is classification and the second is regression.
The survey's comparison analysis of the several dengue recommendation and classification models is depicted in Table 3. The accuracy, specificity, and sensitivity performance metrics are used for the comparison analysis. The graphs in Figs. 2 and 3 present the results of the different dengue recommendation and classification models. The graphical comparison analysis shows that the decision tree methodology attained the maximum accuracy rate, while the ANN model achieved the maximum sensitivity and specificity, because the ANN model used the maximum number of iterations in the neural network and obtained high performance metrics.
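To make the classification use of decision trees concrete, the short sketch below fits a tree on toy case records; the features and labels are invented placeholders rather than data from the cited study [17].

# Toy sketch: decision tree classification of dengue-like cases
# (placeholder features: temperature in degrees C, joint pain 0/1, platelet count)
from sklearn.tree import DecisionTreeClassifier, export_text

X = [[39.5, 1, 90000], [38.8, 1, 110000], [37.0, 0, 250000], [36.8, 0, 300000]]
y = [1, 1, 0, 0]  # 1 = dengue, 0 = non-dengue (toy labels)

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
print(export_text(tree, feature_names=["temp", "joint_pain", "platelets"]))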
Fig. 2 Performance analysis of dengue recommendation and classification models: accuracy
Fig. 3 Parameter analysis with specificity and sensitivity rate

5 Conclusion and Future Scope
There is no specific diagnosis for dengue fever, and viable vaccinations are still under research. The most efficient and straightforward method of managing dengue infection is to disrupt pathogen circulation through mosquito control. In this paper, problems with dengue classification models are discussed. One shortcoming of dengue viral prediction methods is that individuals without technical experience, training, or skill may find it challenging to evaluate the predicted outcomes. Even healthcare experts find it extremely difficult to grasp clinical records, and most individuals lack knowledge or expertise about what is, to so many, an obscure and complicated issue; machine learning algorithms are likewise hard for them to comprehend. There are various signs and symptoms of dengue infection, such as high fever and joint and body pain. The several data sources for dengue classification and prediction systems are discussed. The traditional data sources include the World Health Organization (WHO), hospitals, etc. Sensors, social networks, biosensors, phone calls, etc., are modern data sources for dengue prediction systems. Several recommendation and classification models are compared with evaluation parameters such as accuracy, specificity, and sensitivity. The comparison analysis showed that the decision tree methodology attained the maximum accuracy, while the ANN model achieved the maximum sensitivity and specificity. In the future, more classification and prediction models of dengue will be compared for better analysis. Recommendation and prediction systems depend on data sources; therefore, in the future, more data will be collected from different data sources for training and testing of the recommendation and prediction models. A novel model will be developed to reduce the issues of existing systems.
References
1. E.D. de Araujo Batista, F.M. Bublitz, W.C. de Araujo, R.V. Lira, Dengue prediction through machine learning and deep learning: a scoping review protocol. Res. Square 2(04), 1–9 (2020)
2. V.R. Louis, R. Phalkey, O. Horstick, P. Ratanawong, A. Wilder-Smith, Y. Tozan, P. Dambach, Modeling tools for dengue risk mapping: a systematic review. Int. J. Health Geogr. 13(1), 1–14 (2014)
3. A. Wilder-Smith, E.E. Ooi, O. Horstick, B. Wills, Dengue. The Lancet 393(10169), 350–363 (2019)
4. C. Cobra, J.G. Rigau-Pérez, G. Kuno, V. Vomdam, Symptoms of dengue fever in relation to host immunologic response and virus serotype, Puerto Rico, 1990–1991. Am. J. Epidemiol. 142(11), 1204–1211 (1995)
5. C.H. Lee, K. Chang, Y.M. Chen, J.T. Tsai, Y.J. Chen, W.H. Ho, Epidemic prediction of dengue fever based on vector compartment model and Markov chain Monte Carlo method. BMC Bioinform. 22(5), 1–11 (2021)
6. R. Gangula, L. Thirupathi, R. Parupati, K. Sreeveda, S. Gattoju, Ensemble machine learning based prediction of dengue disease with performance and accuracy elevation patterns. Mater. Today Proc. (2021)
7. P. Silitonga, B.E. Dewi, A. Bustamam, H.S. Al-Ash, Evaluation of dengue model performances developed using artificial neural network and random forest classifiers. Procedia Comput. Sci. 179, 135–143 (2021)
8. A. Chakraborty, V. Chandru, A robust and non-parametric model for prediction of dengue incidence. J. Indian Inst. Sci. 1–7 (2020)
9. S.A. Balamurugan, M.M. Mallick, G. Chinthana, Improved prediction of dengue outbreak using combinatorial feature selector and classifier based on entropy weighted score based optimal ranking. Inform. Med. Unlocked 20, 100400 (2020)
10. E. Mussumeci, F.C. Coelho, Large-scale multivariate forecasting models for dengue-LSTM versus random forest regression. Spat. Spatio-Temporal Epidemiol. 35, 100372 (2020)
11. J.D. Mello-Román, J.C. Mello-Román, S. Gomez-Guerrero, M. García-Torres, Predictive models for the medical diagnosis of dengue: a case study in Paraguay. Comput. Math. Methods Med. (2019)
12. A.L. Buczak, B. Baugher, L.J. Moniz, T. Bagley, S.M. Babin, E. Guven, Ensemble method for dengue prediction. PLoS ONE 13(1), e0189988 (2018)
13. P. Siriyasatien, S. Chadsuthi, K. Jampachaisri, K. Kesorn, Dengue epidemics prediction: a survey of the state-of-the-art based on data science processes. IEEE Access 6, 53757–53795 (2018)
14. N. Zhao, K. Charland, M. Carabali, E.O. Nsoesie, M. Maheu-Giroux, E. Rees, K. Zinszer, Machine learning and dengue forecasting: comparing random forests and artificial neural networks for predicting dengue burden at national and sub-national scales in Colombia. PLoS Negl. Trop. Dis. 14(9), e0008056 (2020)
15. N.I. Nordin, N.M. Sobri, N.A. Ismail, S.N. Zulkifli, N.F. Abd Razak, M. Mahmud, The classification performance using support vector machine for endemic dengue cases. J. Phys. Conf. Ser. 1496(1), 012006 (2020). IOP Publishing
16. G.M. Hair, F.F. Nobre, P. Brasil, Characterization of clinical patterns of dengue patients using an unsupervised machine learning approach. BMC Infect. Dis. 19(1), 1–11 (2019)
17. D.S.R. Sanjudevi, D. Savitha, Dengue fever prediction using classification techniques. Int. Res. J. Eng. Technol. (IRJET) 6(02), 558–563 (2019)
Performance Analysis of Supervised Machine Learning Algorithms for Detection of Cyberbullying in Twitter Nida Shakeel and Rajendra Kumar Dwivedi
Abstract These days, the use of social media is inevitable. Social media is beneficial in several ways, but it also has serious negative influences. A crucial difficulty that needs to be addressed is cyberbullying. Social media, especially Twitter, raises numerous concerns due to misunderstandings concerning the notion of freedom of speech. One of those problems is cyberbullying, which affects both individual victims and societies. Harassment by cyberbullies is a big issue on social media. Cyberbullying affects a person both mentally and emotionally, so there is a need to design a technique to detect and inhibit cyberbullying in social networks. To overcome cyberbullying, numerous methods have been devised. This paper helps to comprehend the methods and procedures, such as logistic regression (LR), naïve Bayes (NB), support vector machine (SVM), and term frequency-inverse document frequency (TF-IDF), which are used by numerous social media websites, especially Twitter. In this paper, we have worked on the accuracy of the SVM, LR, and NB algorithms to detect cyberbullying. We observed that SVM outperforms the others.

Keywords Cyberbullying · Social networks · Machine learning · Twitter · Victims · Logistic regression (LR) · Naïve Bayes (NB) · Support vector machine (SVM)
1 Introduction
Due to the rapid improvement of Internet technology, social media websites such as Twitter and Facebook have become famous and play a massive role in transforming human life. Millions of young people spend their time on social media devotedly and exchange records online. Social media has the potential to connect and share information with everyone at any time, with many people simultaneously. Cyberbullying exists via the internet, where cell phones, video game
applications, or other mediums send or post text, photos, or videos to hurt or embarrass another person deliberately. Cyberbullying can happen at any time of the day, any day of the week, and can reach a person anywhere via the internet. Cyberbullying texts, pictures, or videos may be posted anonymously and distributed immediately to a very large audience. Twitter is among the most frequently used social networking applications; it permits people to microblog about an extensive variety of topics. It is a community platform for communication, creativity, and public participation with nearly 330 million active monthly users, more than 100 million daily active users, and about 500 million tweets produced on average every day. Conversely, with Twitter becoming a prominent, real-time communication network, a study has stated that Twitter serves as a "cyberbullying playground". This paper focuses on the detection of cyberbullying, which is one of the critical problems. The methodologies used in this paper are SVM, LR, NB, and TF-IDF. SVM is one of the most famous strategies used for classification and regression in machine learning. TF-IDF is a term used in information retrieval; it determines the frequency of words in a document as well as their inverse document frequency. NB is a family of classification procedures based on Bayes' theorem, in which every pair of features being classified is independent of each other.
1.1 Motivation
Today, social media has become part of everyone's life, which makes it easy for bullies to pressure people, since many take what is posted very seriously. Many instances have been reported in the past few years, and the rate of cyberbullying has risen over the last 4–5 years. As the rate increases, many people have committed suicide because they were worn down by bullies' hate messages and could find no other way to overcome them. Given the increase in these cases, it is very important to take action against bullies.
1.2 Contribution
This paper makes the following contributions:
• First, it informs about the effects and causes of cyberbullying and how fast it is growing day by day.
• It discusses a few technologies that are very useful for detecting positive and negative comments, feedback, or posts on social networks.
• It implements and compares the accuracy of SVM, LR, and NB.
1.3 Organization
The rest of the paper is organized as follows. Section 2 presents the background. Section 3 presents related work. Section 4 presents the proposed approach. Section 5 presents the novelty of the proposed approach, and Sect. 6 presents the conclusion and future work.
2 Background
Bullying is commonly described as repeated antagonistic behavior. Bullying has traditionally included physical acts, verbal abuse, and social exclusion. The growth of electronic communications technology exposes teens and children to a brand-new form of bullying. There are many different kinds of cyberbullying, including harassment, flaming, exclusion, outing, and masquerading. Reducing cyberharassment is vital because numerous negative health outcomes have been observed among people affected by cyberbullying, including depression, anxiety, loneliness, and suicidal behavior.
The major contributors to cyberbullying are social networking websites. While social media systems offer excellent communication opportunities, they also increase the exposure of young people to intimidating circumstances online. Cyberbullying on a social media network is a worldwide phenomenon due to the vast volume of active users. The trend suggests that cyberbullying in social networks is growing rapidly every day. The dynamic nature of these websites aids the growth of online aggressive behavior, and the anonymous nature of user profiles increases the difficulty of identifying the intimidator. Social media owes its reach to its connectivity in the form of networks, but this becomes dangerous when rumors or intimidating posts spread through the community in ways that cannot easily be controlled. Twitter and Facebook can be taken as examples that are popular among numerous social media websites. Facebook users share more than 150 billion connections, which gives a clue as to how intimidating content can spread within the network in a fraction of the time. Manually identifying these intimidating messages in such a massive system is hard.
It has been recognized that, because of cyberbullying, victims become dangerously timid and may develop violent thoughts of revenge or even suicidal thoughts. They suffer from depression, low self-worth, and anxiety. It is worse than physical bullying because cyberbullying is "behind-the-scenes" and "24/7". The bully's tweets or remarks do not vanish; they stay for a long period and continually affect the victim mentally. It is almost like ragging, except that it happens in front of thousands of mutual friends, and the scars stay forever because the messages live forever on the net. The hurtful and tormenting messages embarrass the victims to a degree that cannot be imagined, and the outcomes are even worse and more severe. In most instances, that is, nine out of ten, the young victims do not inform their parents or guardians out of embarrassment and fall into depression or, worse, commit suicide.
2.1 Categories of Cyberbullying
Since the internet is inclusive and worldwide, there are many ways to use it for any purpose you want. There exist websites, media, and many platforms that can be used for this motive. Some of the most common kinds of cyberbullying are:
• Threatening messages through email, WhatsApp, and SMS.
• Making fun of the victim by posting private material on social media platforms like Facebook, Instagram, WhatsApp, Twitter, and so on.
• Creating and using false profiles to gain attention, collect information, and then embarrass the targeted person.
• Creating a bad image of the intimidated person by posting such material.
• Using false pictures and scripts to embarrass the individual.
2.2 Cause and Effect of Cyberbullying
The cause of cyberbullying lies in the motivation of the bully. Why do they choose to intimidate, and what makes them so confident? It all begins when an individual chooses to cross boundaries to disgrace another. So, what are the motivating factors of cyberbullying? Figure 1 depicts the causes of cyberbullying, which include an absence of understanding, the belief that victims deserve it, self-loathing, and obsession. The causes of cyberbullying are as follows.
a. An Absence of Understanding
While technology has opened up the world, it has also given everyone the right to state a view and criticize anybody while sitting at home. It is very easy to distance yourself from harsh situations on the net by simply shutting down. That is why people who do not appreciate the level of pain they may cause the other person become aggressors. This makes them experience power.
b. The Victim Deserves It
The belief in having the right to decide who deserves what is one of the major causes of cyberbullying. When it happens around school, teens often feel they need to do something to make themselves feel superior. For this, they are inclined to dishonor or persecute other people to create a sense of inferiority. Somehow, they think that it is acceptable to bully others because of their status.
Fig. 1 Causes of cyberbullying
c. Self-loathing
Studies have found a strong link between people who were bullied before and those who are the tyrants now. Individuals who were once victims may return as bullies to vent the anger they carry. Somehow the cycle continues, and they end up hurting innocent children as well.
d. It Becomes an Obsession
If you have used social media systems like Facebook or Instagram, you realize how hard it is to ignore the messages and notifications. Consequently, once an intimidator starts something on such platforms, non-stop engagement gets him hooked on it.
Cyberbullying comes in numerous forms, some of which include flaming, maltreatment, criticism, impersonation, cyberstalking, and threats. Cyberbullying acts include making personal threats, sending scary abuses or tribal offenses, trying to load the victim's computer with viruses, and spamming his/her inbox with emails. The victim can cope with cyberbullying up to a certain level by limiting his/her computer connection time, ignoring intimidating messages, and ignoring emails from unknown sources. Other measures include swapping ISPs, swapping mobile accounts, and trying to trace the source. As we all understand, this generation is socializing the world and will soon adopt new technologies. We can get connected to the world very easily; however, alongside the positive aspects there are negative ones, as the internet is not only distracting children: many crimes are committed over the internet every day, and one of the most committed crimes over the internet is cyberbullying.
Cyberbullying is a practice of victimization or harassment that is carried out through electronic means. Cyberbullying and cyberharassment are also called online bullying. It is common among youths; that is, when somebody annoys another person on the internet or some specific social media website, it leads to certain harmful oppressive behaviors like posting gossip, threats, and so forth. Bullying or harassment through the internet is very dangerous for a large number of people, who suffer mental harm; harassers may threaten others that they will post their pictures or nudes on the internet unless the victims fulfill the culprits' demands. Cyberbullying is an unlawful and illicit activity. The internet is a place where cyberbullying is very common, on social media sites like Facebook, Snapchat, Instagram, and other popular websites.
3 Related Work
Different cyberbullying detection techniques have been invented over the last few years. Each technique has shown better results on certain specific datasets. There are several strategies for identifying cyberbullying; this section offers a short survey of such strategies.
Nurrahmi [1] described cyberbullying as a repetitive action that annoys, disgraces, threatens, or bothers other people via digital gadgets and online social networking websites. Cyberbullying via the internet is riskier than traditional bullying, as it can potentially spread the disgrace to a vast online audience. The authors aimed to detect cyberbullying actors based on texts and the credibility analysis of users, and to inform them about the harm of cyberbullying. They applied SVM and KNN to learn and detect cyberbullying texts. Prasanna Kumar et al. [2] described the fast growth of the internet and the development of communication technology. With the rise in popularity of social networking, offensive behaviors have emerged, one of the critical problems being cyberbullying. The authors noted that cyberbullying has become a menace in social networks and requires extensive research on identification and detection for web users. The authors used technologies such as semi-supervised targeted event detection (STED) and the Twitter-based event detection and analysis system (TEDAS). Pradheep et al. [3] explained that social networking platforms have become very famous in the last few years. Through social media, people interact, share, communicate, and disseminate knowledge for the benefit of others using multimodal capabilities like multimedia text, photographs, videos, and audio. Cyberbullying affects a person both mentally and emotionally; therefore, it is essential to conceive a technique for detecting and inhibiting cyberbullying in social networks. Cyberbullying images can be detected using computer vision algorithms, which incorporate methods like image similarity and optical character recognition (OCR). Cyberbullying videos can be detected using the shot boundary detection algorithm, where the video is broken into frames and analyzed using various strategies. The proposed framework also helps identify cyberbullying audio in the social network. Mangaonkar et al. [4] explained that the usage of Twitter data is growing day by day, and so are the unwanted behaviors of its users. One of the unwanted behaviors is cyberbullying, which may even result in a suicide attempt. Different collaborative paradigms are suggested and discussed in their paper, using techniques like Naive Bayes (NB). Al-Ajlan and Ykhlef [5] explained that tools are ruling our lives; we now rely on technology to perform most of our everyday tasks. The anonymous nature of social networks, in which users adopt nicknames in place of their real names, makes their actions very hard to trace and has led to a growing range of online crimes like cyberbullying. The authors propose optimized Twitter cyberbullying detection based on deep learning (OCDD); in the classification phase, deep learning is used, along with a metaheuristic optimization algorithm for parameter tuning. Agrawal and Awekar [6] described that internet devices have affected every aspect of human existence, providing ease in linking people around the globe and making information available to massive sections of society with the click of a button. Harassment by cyberbullies is an enormous phenomenon on social media; in their paper, a deep neural network (DNN) technique has been used. Meliana and Fadlil [7] explained that nowadays social media is very crucial for certain people; by its nature it can make people waste time, community interaction has been reduced, and physical activity has decreased. Social media can be positive or negative: positive if used to reconnect with old friends who have not met for long, but negative if used for crime or inappropriate matters. Keni et al. [8] described that today's youngsters have grown up in an era dominated by new technologies, where communication is typically accomplished through social media. The rapidly growing use of social networking sites by young adults has made them susceptible to bullying. Cyberbullying is the usage of technology as a medium to bully a person; although it has been a difficulty for many years, its effect has grown due to the rapid use of social media. Wade et al. [9] explained that in recent years social networking sites have been used as platforms for leisure, job opportunities, and advertising; however, they have also given rise to cyberbullying. Generally, cyberbullying uses technology as a medium to bully someone. Nirmal et al. [10] described how the growth in the use of the internet and easy access to online communities such as social media have brought about the emergence of cybercrime; cyberbullying is very common nowadays. Patidar et al. [11] described the speedy growth of social networking websites.
The authors first cited the positive aspects of social networking sites and then cited some of the downsides: on social media, people can be humiliated, insulted, bullied, and pressured by anonymous users, outsiders, or
friends. Cyberbullying uses these tools to embarrass social media users. Ingle et al. [12] described that in recent years Twitter has emerged as a rich source through which users show their daily events, thoughts, and emotions via text and snapshots. Because of the rapid use of social networking websites like Twitter, cases of cyberbullying have also boomed. Desai et al. [13] describe that the use of the internet and social media accounts leads to the sharing, receiving, and posting of negative, dangerous, false, or mean content about another person, which in effect constitutes cyberbullying. Cyberbullying has caused intense growth in mental health problems, mainly among the younger generation. Khokale et al. [14] noted that the internet has affected every phase of social life, bringing ease in linking people around the globe and making information available to large sections of society at the click of a button. Cyberbullying is a practice of electronic communication that harms the reputation or privacy of an individual, or threatens or teases, leaving a long-lasting effect. Mukhopadhyay et al. [15] defined social media as the usage of a digital platform for connecting, interacting, and distributing content and opinions around the globe. Shah et al. [16] described that in a modern, technologically sound world, the use of social media is inevitable; along with the blessings of social media, there are serious negative influences as well, and a crucial problem that needs to be addressed is cyberbullying. Zhang et al. [17] explained that cyberbullying can have a deep and lasting effect on its victims, who are frequently youth; precisely detecting cyberbullying helps prevent it. Conversely, the noise and errors in social media posts and messages make detecting cyberbullying accurately hard. Dwivedi et al. [18] explained that nowadays IoT-based systems, which contain diverse forms of wireless sensor networks, are growing very rapidly; the authors addressed anomaly or outlier detection in their paper. Singh and Dwivedi [19] described human identification as part of providing security to societies; the authors worked on techniques to find which methodology has better overall performance in order to provide human identification protection. Sahay et al. [20] noted that cyberbullying impacts more than half of young social media users globally, who suffer from prolonged and/or coordinated digital harassment. Dwivedi et al. [21] presented a machine learning-based scheme for outlier detection in a smart healthcare sensor cloud; the authors used numerous performance metrics to evaluate the proposed work. Shakeel and Dwivedi [22] explained that social media plays an important role in today's world; the authors described influence maximization as a problem and how to conquer it in order to find the most influential node on social media. Malpe and Vaikole [23] depicted cyberbullying as an activity in which a person or a group of people uses social networking websites on the internet through smartphones, computers, and tablets to cause misfortune, depression, injury, or damage to another person. Dwivedi et al. [24] described how a smart information system based on sensors generates a massive amount of records; these generated records can be stored in the cloud for additional processing. Further, in their paper, the authors described a healthcare monitoring sensor cloud and the integration of various body sensors of different patients with the cloud. Chatzakou
et al. [25] defined cyberbullying and cyber-aggression as increasingly worrisome phenomena affecting people across all demographics; the authors further noted that more than half of young people face cyberbullying and cyber-aggression because of the use of social media. Rai and Dwivedi [26] noted that many technologies have made credit cards common for both online and offline purchases, so protection is needed to prevent fraudulent transactions; the authors therefore worked on different methodologies and found the accuracy of each. Shakeel and Dwivedi [27] explained the positive and negative impacts of using social media, focusing mainly on negative impacts such as cyberbullying, because of which many people have committed suicide. Zhao et al. [28–30] described that the rate of cyberbullying is increasing because of the heavy use of social media; the authors worked on how to reduce and stop cyberbullying and designed software to automatically detect bullying content on social media. Pasumpon Pandian [31] described some common applications of deep learning, such as sentiment analysis, which possess better-performing and more efficient automatic feature-extraction techniques compared with conventional methodologies such as surface approaches. Andi [32] described how the demand for machine learning and AI-assisted trading has increased; the author proposed an algorithm for predicting the bitcoin price based on the present global stock market. Chen and Lai [33] explained how, with the use of the internet, numerous industries, including the financial industry, have grown exponentially; their paper addresses several challenges that have compounded despite emerging technological growth and reliance. Tripathi [34] explained the lack of physical contact with each other due to the COVID-19 lockdown, which led to an increase in social media communication; platforms like Twitter became the most popular places for people to express their opinions and communicate with each other.
3.1 Research Gap
The existing work's results concluded that the voting classifier has the best accuracy and the support vector machine has the lowest accuracy. Thus, in the proposed paper, we have worked on the accuracy of SVM, NB, and LR and compared the accuracy of these three classifiers. Table 1 presents a comparative study of related work; based on the survey, it lists the authors' names, year of the paper, title of the paper, and methodology used.
Table 1 Comparative study of related work
S. No. | Author's | Year | Title | Algorithm used | Limitations
1. | Nurrahmi [1] | 2016 | Indonesian Twitter cyberbullying detection using text classification and user trustworthiness | SVM and KNN | There are some wrong tags in the Indonesian POS tagger
2. | Prasanna Kumar et al. [2] | 2017 | A survey on cyberbullying | Semi-supervised targeted event detection (STED), Twitter-based event detection and analysis system (TEDAS) | Not any
3. | Pradheep et al. [3] | 2017 | Automatic multimodel cyberbullying detection from social networks | Naïve Bayes | Sometimes the proposed model is not able to control the stop words
4. | Mangaonkar et al. [4] | 2018 | Collaborative detection of cyberbullying behavior in Twitter data | Naive Bayes (NB), logistic regression, and support vector machine (SVM) | When the true negatives increase, the models do not work
5. | Al-Ajlan and Ykhlef [5] | 2018 | Improved Twitter cyberbullying detection based on deep learning | Convolutional neural network (CNN) | The metaheuristic optimization algorithm is incorporated to find the optimal or near-optimal value
6. | Agrawal and Awekar [6] | 2018 | Deep learning for detecting cyberbullying across multiple social media platforms | Deep neural network (DNN) | Some models do not work properly
7. | Meliana and Fadlil [7] | 2019 | Identification of cyberbullying by using clustering approaches on social media Twitter | Naïve Bayes and decision tree | Naïve Bayes was not able to find the hate comments as well as decision tree J48 did
8. | Keni et al. [8] | 2020 | Cyberbullying detection using machine learning algorithms | Principal component analysis (PCA) and latent semantic analysis (LSA) | The performance of other classifiers is not good
9. | Wade et al. [9] | 2020 | Cyberbullying detection on Twitter mining | Convolutional neural network (CNN) and long short-term memory (LSTM) | CNN-based models do not perform better than DNN models
10. | Nirmal et al. [10] | 2020 | Automated detection of cyberbullying using machine learning | Naïve Bayes model, SVM model, and DNN model | Difficult to detect some hate words because of specific code
11. | Ingle et al. [12] | 2021 | Cyberbullying monitoring system for Twitter | Gradient boosting | Naïve Bayes and logistic regression do not give good results
12. | Desai et al. [13] | 2021 | Cyberbullying detection on social media using machine learning | BERT | The accuracy of SVM and NB is not as good as that of pre-trained BERT
13. | Khokale et al. [14] | 2021 | Review on detection of cyberbullying using machine learning | Support vector machine (SVM) classifier, logistic regression | The authors did not find the use of K-folds across the techniques
14. | Mukhopadhyay et al. [15] | 2021 | Cyberbullying detection based on Twitter dataset | Convolutional neural network (CNN) | Not good performance
15. | Malpe et al. [23] | 2020 | A comprehensive study on cyberbullying detection using machine learning technique | Deep neural network (DNN) | Not any
16. | Chatzakou et al. [25] | 2019 | Detecting cyberbullying and cyber-aggression in social media | LDA | Effective tools for detecting harmful actions are scarce, as this type of behavior is often ambiguous
4 Proposed Approach
In this paper, we develop the system using Python. First, we search for and download the dataset used to train the models. After downloading, we preprocess the data and transform it with TF-IDF. Then, with the help of the NB, SVM, and LR algorithms, we train on the dataset and generate models one after another. We then develop a web-based application using the Anaconda framework. We fetch real-time tweets from Twitter, apply the generated models to those fetched tweets, and test whether the text or images constitute cyberbullying. Figure 2 shows the proposed framework of this paper.
Algorithm 1 Detection of cyberbullying and non-cyberbullying words
Input: Twitter datasets
Output: identifies cyberbullying or non-cyberbullying
Begin
Step 1: Take the input data from Twitter
Step 2: Start preprocessing
Step 3: Divide the processed data into individual comments
Step 4: Classify the feature selection
Step 5: Apply machine learning algorithms (i) NB (ii) SVM (iii) LR
Step 6: If cyberbullying words occur
Then, identify and classify cyberbullying words
Fig. 2 Proposed framework
Else, calculate the non-cyberbullying words
End
In Algorithm 1, we first take datasets from Twitter. Then, we preprocess the data and transform it with TF-IDF. After preprocessing, we divide the processed data into cyberbullying or non-cyberbullying comments. After that, we perform feature selection, apply machine learning algorithms such as NB, SVM, and LR, and identify the cyberbullying and non-cyberbullying words.
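As a concrete illustration of Algorithm 1, the following minimal sketch assembles such a pipeline with scikit-learn; the dataset file name, column names, and label encoding are hypothetical, and the paper's actual implementation details may differ.

# Minimal sketch of the proposed pipeline (file and column names are hypothetical)
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Step 1: load labeled tweets (1 = cyberbullying, 0 = non-cyberbullying)
df = pd.read_csv("tweets.csv")
X_text, y = df["tweet"], df["label"]

# Steps 2-4: preprocessing and feature extraction via TF-IDF
vectorizer = TfidfVectorizer(lowercase=True, stop_words="english")
X = vectorizer.fit_transform(X_text)

# 70/30 train-test split, as described in Sect. 4.1
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Step 5: train the three classifiers and compare their accuracy
for name, clf in [("NB", MultinomialNB()),
                  ("SVM", LinearSVC()),
                  ("LR", LogisticRegression(max_iter=1000))]:
    clf.fit(X_train, y_train)
    print(name, "accuracy:", accuracy_score(y_test, clf.predict(X_test)))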
4.1 Preprocessing
In the proposed model, preprocessing is used to eliminate and clean undesirable noise for text detection. After data cleaning, the dataset is divided into two groups, a training set and a testing set, where every instance is labeled as cyberbullying or non-cyberbullying. The first part includes 70% of the tweets, used for training purposes, and the second part comprises 30%, used for prediction purposes. The data has been fetched from Twitter. The information collected should encompass three attributes: user attributes, class, and format. The user attributes are used for the identification of a user, the class attribute is used to identify groups, and the format expresses the user comment on various statuses/groups. Once the dataset has been organized, it must be broken up into texts, which encompass remarks, conversations, and so on. The selected attributes to classify the tweets are shown in Table 2.
4.2 Feature Extraction
Subsequent to cleaning the dataset in the above stages, tokens can be extracted from it. The process of extracting tokens is known as tokenization, wherein we take the extracted records as sentences or passages and then output the incoming text as separate words, characters, or subwords in the form of a list. These words need to be transformed into numerical representations so that every dataset can be represented in the form of numerical data.
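To make the tokenization-to-numbers step concrete, the small snippet below shows how tokenized tweets become numerical TF-IDF vectors; the example sentences are invented.

from sklearn.feature_extraction.text import TfidfVectorizer

docs = ["you are so stupid", "have a great day"]  # invented example tweets
vec = TfidfVectorizer()
X = vec.fit_transform(docs)

print(vec.get_feature_names_out())  # tokens extracted from the corpus
print(X.toarray())                  # one numerical TF-IDF vector per tweet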
4.3 Feature Engineering and Feature Selection
One of the most common ways to improve cyberbullying detection is to carry out feature engineering, and the most widely shared features that improve the overall performance of a cyberbullying detection classifier are textual, social, user, sentiment, and word-embedding features. We tried to construct features based on the textual context and their semantic alignment.
Table 2 Selected attributes to classify the tweets
Attributes | Class | Format
Noun | CB/non-CB | Text
Pronoun | CB/non-CB | Text
Adjective | CB/non-CB | Text
Local features | The basic features extracted from a tweet | Text
Contextual features | Professional, religious, family, legal, and financial factors specific to CB | Text
Sentiment features | Positive or negative (foul words specific to CB) or direct or indirect CB; polite words, modal words, unknown words, number of insults and hateful blacklisted words | Text
Emotion features | Harming with detailed description, power differential, any form of aggression, targeting a person, targeting more persons, intent, repetition, one-time CB, harm, perception, reasonable person/witness, and racist sentiments | Text
Gender specific | Male/female | Text
User features | Network information, user information, his/her activity information, tweet content, account creation time, and verified account time | Numeric
Twitter basic features | Number of followers, number of mentions, number of following, favorite count, popularity, number of hashtags, and status count | Numeric
Linguistic features | Other-language words, punctuation marks, and abbreviated words rather than abusive sentence judgments | Text
4.4 Classification Techniques
In this section, several classifiers are used to categorize whether a tweet is cyberbullying or non-cyberbullying. The classifier models constructed are NB, LR, and SVM.
a. Naïve Bayes (NB)
Naïve Bayes is widely used for document/text classification problems. In the cyberbullying detection area, Naïve Bayes has been the most commonly used model for cyberbullying prediction.
b. Logistic Regression (LR)
Logistic regression is one of the most common machine learning algorithms and comes under the supervised learning approach. Logistic regression can be used to categorize opinions using dissimilar forms of records and can easily determine the most effective variables for the classification.
c. Support Vector Machine (SVM)
An SVM model represents the data as points in a space, mapped so that the samples of the distinct classes are separated by a clear gap that is as wide as possible. SVMs can effectively perform nonlinear classification, implicitly mapping their inputs into a high-dimensional feature space.
4.5 Performance Evaluation
The proposed work of this paper covers the performance measurement, the dataset, and the results. For performance measurement, we have used metrics such as accuracy, recall, precision, F1-score, and specificity. We have taken Twitter datasets to find the accuracy, F1-score, specificity, and other metrics of the classifiers.
A. Performance Measurement
To evaluate our classifiers, several evaluation metrics are used. We adopt the most commonly used criteria, namely accuracy, precision, recall, F1-score, and specificity. These criteria are defined as follows:
i. Accuracy: the ratio of the number of correctly predicted opinions to the total number of opinions present in the corpus.
Accuracy = (TP + TN)/(TP + TN + FP + FN) (1)
ii. Precision: the correctness of the classifier; the ratio of the number of correctly predicted positive opinions to the total number of opinions predicted as positive.
Precision = TP/(TP + FP) (2)
iii. Recall: the ratio of the number of correctly predicted positive opinions to the actual number of positive opinions present in the corpus.
Recall = TP/(TP + FN) (3)
iv. F1-Score: the harmonic mean of precision and recall. F1-score has a best value of 1 and a worst value of 0.
F-measure = 2 * (Recall * Precision)/(Recall + Precision) (4)
v. Specificity: the ratio of true negatives to the sum of true negatives and false positives.
Specificity = TN/(TN + FP) (5)
where true positive (TP) is a hit, correctly classified as positive; true negative (TN) is a rejection, correctly classified as negative; false positive (FP) is a false alarm, falsely classified as positive; and false negative (FN) is a miss, falsely classified as negative.

Table 3 Confusion matrix corresponding to several classifiers
Approaches used | True negative (TN) | False positive (FP) | False negative (FN) | True positive (TP)
Naïve Bayes (NB) | 10,501 | 1569 | 1409 | 11,088
Support vector machine (SVM) | 10,812 | 1688 | 1398 | 11,102
Logistic regression (LR) | 10,772 | 1298 | 1161 | 10,499
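For concreteness, Eqs. (1)-(5) can be computed directly from confusion-matrix counts; the snippet below applies them to the NB counts of Table 3 purely as an illustration.

# Computing Eqs. (1)-(5) from confusion-matrix counts
def metrics(tp, tn, fp, fn):
    accuracy = (tp + tn) / (tp + tn + fp + fn)          # Eq. (1)
    precision = tp / (tp + fp)                          # Eq. (2)
    recall = tp / (tp + fn)                             # Eq. (3)
    f1 = 2 * recall * precision / (recall + precision)  # Eq. (4)
    specificity = tn / (tn + fp)                        # Eq. (5)
    return accuracy, precision, recall, f1, specificity

# NB counts from Table 3: TN = 10,501, FP = 1569, FN = 1409, TP = 11,088
print(metrics(tp=11088, tn=10501, fp=1569, fn=1409))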
B. Dataset
Detecting cyberbullying in social media through cyberbullying keywords, and using machine learning for detection, poses both theoretical and practical challenges. In this paper, we used an international dataset of 37,373 tweets to evaluate classifiers that are typically utilized in detecting cyberbullying content.
C. Confusion Matrix
A confusion matrix is a matrix that characterizes the overall performance of a classifier. The confusion matrices of the NB, SVM, and LR classifiers used in cyberbullying detection are shown in Table 3. LR gives the lowest false positives, lowest false negatives, and lowest true positives; SVM provides the maximum true negatives, false positives, and true positives; while NB yields the highest false negatives and the lowest true negatives.
D. Results
The limitation of the existing work is that the accuracy of SVM is lowest compared to that of the voting classifier. The proposed work therefore revisits the accuracy of the support vector machine and establishes that SVM has the best accuracy among the classifiers considered, including NB and LR. The cyberbullying detection algorithm is carried out with three classifiers: NB, SVM, and LR. Their overall performance comparison is shown in Table 4. It can be seen from Table 4 that the accuracy of SVM is the best among the three classifiers.
Table 4 Performance comparison of several classifiers
Metrics | Naïve Bayes (NB) | Support vector machine (SVM) | Logistic regression (LR)
Precision | 0.85 | 0.87 | 0.88
Recall | 0.87 | 0.89 | 0.86
Specificity | 0.88 | 0.88 | 0.89
F1-score | 0.86 | 0.87 | 0.90
Accuracy | 0.87 | 0.94 | 0.89
Fig. 3 Comparison of precision
Figure 3 gives a comparison of the precision of the classifiers, viz. NB, SVM, and LR; SVM has the best precision. Figure 4 shows the comparison of recall for these classifiers, and the recall of SVM is the best. Figure 5 provides the evaluation of the specificity of these classification methods, and the specificity of LR is the highest. Figure 6 describes the F1-score of these schemes, and LR has the highest F1-score. Figure 7 depicts the accuracy of NB, SVM, and LR, where SVM gives the highest accuracy. Thus, we can say that SVM is the best scheme for cyberbullying detection.
Table 5 compares the existing and proposed work on metrics such as precision, recall, specificity, F1-score, and accuracy. The precision of the existing work is 0.87, while the precision of the proposed work is 0.88, so the proposed work is better on precision. The recall of the existing work is 0.81, whereas the recall of the proposed work is 0.86, so the proposed work is better on recall. The specificity improves from 0.87 to 0.88. The F1-score of the existing work is 0.88, while the F1-score of the proposed work is 0.90, again in favor of the proposed work. Finally, the accuracy of SVM in the existing work is 0.93 and the accuracy of the proposed work is 0.94, which shows that the proposed work outperforms the existing work.

Fig. 4 Comparison of recall
Fig. 5 Comparison of specificity
Fig. 6 Comparison of F1-score
Fig. 7 Comparison of accuracy

Table 5 A comparison study of existing and proposed work
Metrics | Existing work | Proposed work
Precision | 0.87 | 0.88
Recall | 0.81 | 0.86
Specificity | 0.87 | 0.88
F1-score | 0.88 | 0.90
Accuracy | 0.93 | 0.94
5 Novelty of the Proposed Approach
The existing work's results concluded that the voting classifier has the best accuracy and the support vector machine has the lowest accuracy. Thus, in the proposed paper, we have worked on the accuracy of SVM, NB, and LR to compare the accuracy of these three classifiers, and we determined that the accuracy, F1-score, specificity, and recall of SVM are the highest among the three classifiers.
6 Conclusion and Future Work
Although the social media platform has become an essential entity for everyone, cyberbullying has numerous negative influences on a person's life, including sadness, nervousness, irritation, worry, trust issues, low self-esteem, withdrawal from social activities, and occasionally suicidal behavior. Cyberbullying incidents do not take place only through texts; audio and video features also play an essential role in spreading cyberbullying. This study has presented a detailed and comprehensive review of the preceding research completed in the field of cyberbullying. The limitation of the existing work is that the accuracy of SVM is lowest compared to that of the voting classifier. Further, in this paper, we have worked on the accuracy, F1-score, specificity, recall, and precision of NB, SVM, and LR and observed that the accuracy, F1-score, specificity, and recall of SVM are better in comparison to those of NB. The accuracy of SVM is 0.94, which outperforms the existing work. In the future, this work may be extended to examine different Twitter groups or community pages to identify unfamiliar or violent posts by communities against government organizations or others.
References
1. H. Nurrahmi, D. Nurjanah, Indonesian Twitter cyberbullying detection using text classification and user credibility, in International Conference on Information and Communications Technology (ICOIACT) (2016), pp. 542–547
2. G. Prasanna Kumar et al., Survey on cyberbullying. Int. J. Eng. Res. Technol. (IJERT) 1–4 (2017)
3. T. Pradheep, J.I. Sheeba, T. Yogeshwaran, Automatic multimodal cyberbullying detection from social networks, in International Conference on Intelligent Computing Systems (ICICS) (2017), pp. 248–254
4. A. Mangaonkar, A. Hayrapetian, R. Raje, Collaborative detection of cyberbullying behavior in Twitter, in IEEE (2018)
5. M.A. Al-Ajlan, M. Ykhlef, Optimized cyberbullying detection based on deep learning (2018)
6. S. Agrawal, A. Awekar, Deep learning for cyberbullying across multiple social media platforms (2018), pp. 2–12
7. N. Meliana, A. Fadlil, Identification of cyberbullying by using clustering method on social media Twitter, in The 2019 Conference on Fundamental and Applied Science for Advanced Technology (2019), pp. 1–12
8. A. Keni, Deepa, M. Kini, K.V. Deepika, C.H. Divya, Cyberbullying detection using machine learning algorithms. Int. J. Creat. Res. Thoughts (IJCRT) 1966–1972 (2020)
9. S. Wade, M. Parulekar, K. Wasnik, Survey on detection of cyberbullying. Int. Res. J. Eng. Technol. (IRJET) 3180–3185 (2020)
10. N. Nirmal, P. Sable, P. Patil, S. Kuchiwale, Automated detection of cyberbullying using machine learning. Int. Res. J. Eng. Technol. (IRJET) 2054–2061 (2021)
11. M. Patidar, M. Lathi, M. Jain, M. Dharkad, Y. Barge, Cyber bullying detection for Twitter using ML classification algorithms. Int. J. Res. Appl. Sci. Eng. Technol. (IJRASET) 24–29 (2021)
12. P. Ingle, R. Joshi, N. Kaulgud, A. Suryawanshi, M. Lokhande, Cyberbullying monitoring system for Twitter. Int. J. Sci. Res. Publ. 540–543 (2021)
13. A. Desai, S. Kalaskar, O. Kumbhar, R. Dhumal, Cyberbullying detection on social media using machine learning. ITM Web Conf. 2–5 (2021)
14. S. Khokale, V. Gujrathi, R. Thakur, A. Mhalas, S. Kushwaha, Review on detection of cyberbullying using machine learning. J. Emerg. Technol. Innov. Res. (JETIR) 61–65 (2021)
15. D. Mukhopadhyay, K. Mishra, L. Tiwari, Cyber bullying detection based on Twitter dataset. ResearchGate 87–94 (2021)
16. R. Shah, S. Aparajit, R. Chopdekar, R. Patil, Machine learning-based approach for detection of cyberbullying tweets. Int. J. Comput. Appl. 52–57 (2020)
17. X. Zhang, J. Tong, N. Vishwamitra, E. Whittaker, Cyberbullying detection with a pronunciation based convolutional neural network, in 15th IEEE International Conference on Machine Learning and Applications (2016), pp. 740–745
18. R.K. Dwivedi, A.K. Rai, R. Kumar, Outlier detection in wireless sensor networks using machine learning techniques: a survey, in IEEE International Conference on Electrical and Electronics Engineering (ICE3) (2020), pp. 316–321
19. A. Singh, R.K. Dwivedi, A survey on learning-based gait recognition for human authentication in smart cities, in Part of the Lecture Notes in Networks and Systems, Series 334 (Springer, 2021), pp. 431–438
20. K. Sahay, H.S. Khaira, P. Kukreja, N. Shukla, Detecting cyberbullying and aggression in social commentary using NLP and machine learning. Int. J. Eng. Technol. Sci. Res. 1428–1435 (2018)
21. R.K. Dwivedi, R. Kumar, R. Buyya, A novel machine learning-based approach for outlier detection in smart healthcare sensor clouds. Int. J. Healthc. Inf. Syst. Inform. 4(26), 1–26 (2021)
22. N. Shakeel, R.K. Dwivedi, A learning-based influence maximization across multiple social networks, in 12th International Conference on Cloud Computing, Data Science & Engineering (2022)
23. V. Malpe, S. Vaikole, A comprehensive study on cyberbullying detection using machine learning approach. Int. J. Futur. Gener. Commun. Netw. 342–351 (2020)
24. R.K. Dwivedi, R. Kumar, R. Buyya, Gaussian distribution based machine learning scheme for anomaly detection in wireless sensor network. Int. J. Cloud Appl. Comput. 3(11), 52–72 (2021)
25. D. Chatzakou, I. Leontiadis, J. Blackbum, E. De Cristofaro, G. Stringhini, A. Vakali, N. Kourtellis, Detecting cyberbullying and cyber aggregation in social media. ACM Trans. Web 1–33 (2019)
26. A.K. Rai, R.K. Dwivedi, Fraud detection in credit card data using machine learning techniques, in Part of the Communications in Computer and Information Science (CCIS), no. 1241 (2020), pp. 369–382
27. N. Shakeel, R.K. Dwivedi, A survey on detection of cyberbullying in social media using machine learning techniques, in 4th International Conference on Intelligent Communication Technologies and Virtual Mobile Networks (ICICV) (2022)
28. R. Zhao, A. Zhou, K. Mao, Automatic detection of cyberbullying on social networks based on bullying features, in International Conference on Distributed Computing and Networks (ICDCN) (2019)
29. S.M. Ho, D. Kao, M.-J. Chiu-Huang, W. Li, Detecting "hotspots" on twitter: a predictive analysis approach. Forensic Sci. Int. Digit. Investig. 3, 51–53 (2020)
30. S.R. Mugunthan, T. Vijayakumar, Design of improved version of sigmoidal function with biases for classification task in ELM domain. J. Soft Comput. Paradigm (JSCP) 3(02), 70–82 (2021)
31. A. Pasumpon Pandian, Performance evaluation and comparison using deep learning techniques in sentiment analysis. J. Soft Comput. Paradigm (JSCP) 3(02), 123–134 (2021)
32. H.K. Andi, An accurate bitcoin price prediction using logistic regression with LSTM machine learning model. J. Soft Comput. Paradigm 3(3), 205–217 (2021)
33. J.I.-Z. Chen, K.-L. Lai, Deep convolution neural network model for credit card fraud detection and alert. J. Artif. Intell. 3(02), 101–112 (2021)
34. M. Tripathi, Sentiment analysis of Nepali COVID19 tweets using NB, SVM, AND LSTM. J. Artif. Intell. 3(03), 151–168 (2021)
Text Summarization of Legal Documents Using Reinforcement Learning: A Study Bharti Shukla, Sonam Gupta, Arun Kumar Yadav, and Divakar Yadav
Abstract Studying and analyzing judicial documents are challenging tasks for the common person. The basic reason for this complexity is the documents' long length and complex language. A summarized document in simple language that a common person can understand is therefore required. Manual summarization of legal documents is tedious, which calls for an automatic legal document text summarization technique. In the current scenario, deep learning and machine learning play a crucial role in text processing and text summarization. This paper presents a detailed survey of legal document summarization and the datasets used for various legal documents. It also discusses the various machine learning models and analyzes their evaluation metrics to identify the best summarization model. Secondly, this study discusses the quality of summarization with and without reinforcement learning. The analysis concludes that the rationale-augmented (RAG) model with deep reinforcement learning performs best for legal document text summarization, with 79.8% accuracy, 70.5% precision, 90.75% recall, and 75.09% F-measure, respectively. Without deep reinforcement learning, a long short-term memory network model performs best, with 93.1% precision and 98.8% recall. Keywords Text summarization · Deep reinforcement learning · Legal documents · Text extraction · Deep learning
1 Introduction In the digital world, big data is growing day by day on a large scale. A massive amount of data is available in English or Hindi on social media, the World Wide Web, etc. However, users need summarized information in an optimal form. B. Shukla · S. Gupta (B) Ajay Kumar Garg Engineering College, Ghaziabad, India e-mail: [email protected] A. K. Yadav · D. Yadav National Institute of Technology, Hamirpur, H.P., India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_30
Text summarization is a big challenge in the world of legal documents, because if summarized information is available for a particular document, it is easy to predict the judgment and retrieve the exact information [1]. In other words, text summarization condenses the original legal document into easily understandable and readable content, so that the judgment of a legal document can be predicted without reading the entire document [2]. Automatic text summarization is most important in the legal domain for retrieving legal text. However, summarizing legal text is especially challenging because it is stored in various formats, i.e., structured, unstructured, different styles, etc. Automatic summarization of legal text from legal documents can be done with the help of natural language processing (NLP) and machine learning. Natural language processing plays a crucial role in text summarization, information retrieval, automatic extraction, automatic retrieval, data classification, etc. In the legal domain, keywords play an important role in creating the index and help to predict the judgment of legal text. Previously, supervised and unsupervised techniques were also used to retrieve text from the legal domain [3, 4]. Reinforcement learning also performs well for automatic text summarization: it works as an agent while summarizing the legal text, and with its help we can decide on a better summarization of the documents. The summarized information of original documents helps in understanding the brief idea of legal cases [5]. Nowadays, deep learning methods are also used for document summarization and can perform better. In this regard, researchers have observed that the bidirectional long short-term memory network (BI-LSTM) approach can analyze the whole original document to summarize data and can explore more contextual information from the legal domain. This model is executed after retrieval or extraction of the legal text [6, 7]. Figure 1 represents the overall working of text summarization in legal documents. Fig. 1 Text summarization process [8]
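As a minimal, purely illustrative sketch of the extractive side of this process (TF-IDF sentence scoring in Python; this is not the reinforcement learning or BI-LSTM models surveyed here, and the example sentences are invented):

from sklearn.feature_extraction.text import TfidfVectorizer

def extractive_summary(sentences, k=2):
    tfidf = TfidfVectorizer().fit_transform(sentences)  # one row per sentence
    scores = tfidf.sum(axis=1).A1                       # TF-IDF mass per sentence
    top = sorted(sorted(range(len(sentences)), key=lambda i: -scores[i])[:k])
    return [sentences[i] for i in top]                  # top-k in original order

judgment = ["The court heard the appeal against the lower court's judgment.",
            "The appellant raised two grounds of challenge.",
            "Costs were awarded to the respondent.",
            "The appeal was dismissed with detailed reasoning on both grounds."]
print(extractive_summary(judgment))

A learning-based system replaces the fixed TF-IDF score with a trained scorer, and a reinforcement learning agent can further optimize which sentences to keep against a summary-quality reward.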
There are some other challenges of automatic text summarization for legal documents, classified as follows:
a. USAGE
• Multi-document summarization
• User-specific summarization
• Application of text summarization
b. INPUT
• Input and output formats
• Length of input documents
• Supported language
c. METHOD
• Text summarization approaches
• Statistical and linguistic features
• Using deep learning for text summarization
d. OUTPUT
• Stop criteria of the summarization process
• Quality of the generated summary
• Evaluation of the generated summary.
After understanding the importance of summarizing legal documents, the authors were motivated to study the challenges, working strategies, and methods of summarizing, and to evaluate the best method for obtaining the optimal summarized document.
• If text summarization is applied in the legal document area, the legal judgment can be predicted easily.
• In this research paper, the authors suggest the best text summarization technique, with and without reinforcement learning, for legal documents.
The rest of the paper is organized as follows: Sect. 2 presents the methods used for text summarization. The datasets used in the study for legal documents are analyzed in Sect. 3. In Sect. 4, a performance comparison of the proposed methods for legal documents is carried out. After the comparison, we discuss the best technique in Sect. 5, and the conclusion of the research paper is given in Sect. 6.
2 Approaches/Methods Used for Text Summarization In this section, the authors describe a detailed study of the approaches used for the text summarization of legal documents. For a detailed survey on text summarization, the authors searched for documents on Google with different combinations of keywords
such as “legal document text summarization”, “legal document summarization using machine learning”, and “legal document summarization using deep learning”. The authors collected many research papers from various sources for the literature survey: around 30 papers related to the research topic were found, of which 21 were judged relevant and selected to analyze the work done so far in the topic domain. The paper [9] applies XML technology and a transductor for text summarization and text extraction from legal documents. The authors used this idea on an information management platform for organizing legal documents. The authors proposed two classifiers, i.e., a catchphrase extraction task and a catchphrase prediction task, and the model achieved 70% accuracy. The paper [10] introduced a multi-model architecture for text summarization, classification, and prediction problems. The authors try to reduce the lack-of-data problem for the legal domain. The model shows the best performance in the state of the art. In the paper [11], the researchers provide a detailed survey of automatic text summarization (ATS) techniques. The authors introduced various algorithms used by previous researchers, as well as the issues and challenges of automatic text summarization. In the paper [12], the researchers provide a detailed survey of extractive text summarization techniques. The authors reviewed various challenges and datasets of extractive summarization techniques and provide details of the most popular benchmarking datasets. In the paper [13], the authors implement the LetSum model for text summarization in the legal domain. The LetSum model produces a summary of a legal document in a table style for an improved text view and ease of reading, and it achieves 57.5% accuracy. After surveying the literature in Table 1 on text summarization, we found several important points. One significant observation is that the text summarization field still attracts interest among various researchers, so there is room for further improvement in the text summarization of legal documents. Another important point is that not much work has been done on legal document text summarization using machine learning and deep learning. Only a few techniques have been discussed for text summarization with reinforcement learning for legal documents, and an accurate approach for text summarization with reinforcement learning is still required. Based on the above research gaps, we defined the following research questions (RQs) for our research work. (i) RQ1: Which technique gives the best comparison result for legal document text summarization with reinforcement learning? (ii) RQ2: Which technique gives the best comparison result for text summarization without reinforcement learning?
3 Datasets Used in the Study Various researchers have used multiple datasets for text summarization in legal documents. The datasets are in multiple languages such as English, Chinese, Japanese, etc.
Table 1 Comparison of various techniques of text summarization

L1. Publication (year): A comparative study of summarization algorithms applied to legal case judgments (2019) [14]. Methodology/finding: summarization algorithms (supervised and unsupervised). Proposed classifiers: summarization algorithms. Performance/result: best performance shown on a real-world dataset.

L2. Publication (year): A deep learning approach to contract element extraction (2017) [15]. Methodology/finding: (1) the authors implement a BILSTM-LR model for linear regression that operates on a fixed-size window without involving any manual rules; (2) a BILSTM-CRF model is introduced for extracting multiple tokens of contract elements. Proposed classifiers: (1) BILSTM-LR, (2) BILSTM-CRF. Performance/result: the BILSTM-CRF model performs best compared to the BILSTM-LR model.

L3. Publication (year): An automatic system for summarization and information extraction of legal information (2010) [9]. Methodology/finding: the authors elaborated XML technologies and a transductor on an information management platform, organizing linguistic cues and semantic rules to achieve precise information extraction in different fields. Performance/result: better performance on the basis of parameters.

L4. Publication (year): Overview of the FIRE 2017 IRLeD track: information retrieval from legal documents (2017) [3]. Methodology/finding: this model is used for retrieving information from legal documents. Proposed classifiers: (i) catchphrase extraction task and (ii) precedence retrieval task. Performance/result: better performance.

L5. Publication (year): Text summarization from legal documents: a survey [16]. Methodology/finding: a detailed survey on text summarization for legal text. Proposed classifiers: text summarization techniques. Performance/result: survey of all summarization algorithms.
L6. Publication (year): Interpretable rationale augmented charge prediction system (2018) [17]. Methodology/finding: the authors proposed an RA model for predicting with high accuracy. Proposed classifiers: rationale-augmented classification model. Performance/result: the performance is good, with comparable accuracy.

L7. Publication (year): Multi-task deep learning for legal document translation, summarization and multi-label classification (2018) [10]. Methodology/finding: the authors developed a multi-model architecture to decrease the problem of data scarcity in the legal domain. Proposed classifiers: multi-model architecture. Performance/result: the multi-task deep learning model performed best, with state-of-the-art results on all tasks.

L8. Publication (year): Automatic text summarization: a comprehensive survey [11]. Methodology/finding: a comprehensive survey of ATS. Proposed classifiers: ATS. Performance/result: the paper provides a systematic review of ATS approaches.

L9. Publication (year): A survey on extractive text summarization [12]. Methodology/finding: the various techniques, popular benchmarking datasets, and challenges of extractive summarization have been reviewed. Proposed classifiers: ETS. Performance/result: the paper interprets extractive text summarization methods with a less redundant summary.

L10. Publication (year): Legal texts summarization by exploration of the thematic structures and argumentative roles [13]. Methodology/finding: the model builds a LetSum table-style summary to improve the coherency and readability of the text. Performance/result: the preliminary evaluation results are very promising.

L11. Publication (year): Robust deep reinforcement learning for extractive legal summarization [5, 18]. Methodology/finding: deep summarization models are trained with the help of reinforcement learning. Proposed classifiers: ELS. Performance/result: the model improves the performance of text summarization in the legal domain.
Table 2 lists the research studies based on their datasets. The PESC dataset [18] is the best dataset for text summarization in legal documents. The PESC dataset is used with deep reinforcement learning for summarizing the text, and it achieved the best accuracy compared to the other datasets in the deep reinforcement learning setting.
Table 2 Datasets used for text summarization

L1. Dataset: 17,347 legal case documents [14]. Domain: Supreme Court of India. URL: https://www.westlawasia.com/

L2. Dataset: 3500 English contracts [15]. Domain: UK legal documents. URL: http://nlp.cs.aueb.gr/publications.html

L3. Dataset: legal information [9]. Domain: (1) Canadian federals, (2) QuickLaw, (3) Westlaw-Carswell. URL: http://www.lexisnexis.ca, http://www.carswell.co

L4. Dataset: (1) a collection of legal case documents with their catchphrases; (2) a collection of legal case documents and prior cases [3]. Domain: LII of India. URL: www.liiofindia.org

L5. Dataset: Text Retrieval Conference (TREC), Message Understanding Conference (MUC), Document Understanding Conference (DUC4), Text Analysis Conference (TAC5), and Forum for Information Retrieval Evaluation (FIRE6) [16]. Domain: MEAD open-source dataset for summarization. URL: http://www.trec.nist.gov, http://www-nlpir.nist.gov/related_projects/muc/

L6. Dataset: Chinese legal dataset [17]. Domain: CAIL2018. URL: (1) http://wenshu.court.gov.cn/, (2) https://github.com/hankcs/HanLP

L7. Dataset: Digital Corpus of the European Parliament (DCEP) and Joint Research Centre—Acquis Communautaire (JRC-Acquis) [10]. Domain: European Parliament. URL: https://mediatum.ub.tum.de/1446650, https://mediatum.ub.tum.de/1446648, https://mediatum.ub.tum.de/1446655, https://mediatum.ub.tum.de/1446654, https://mediatum.ub.tum.de/1446653

L8. Dataset: the most common benchmarking datasets [11]. Domain: Essex Arabic summaries corpus (EASC) dataset. URL: https://www.lancaster.ac

L10. Dataset: 3500 judgments of the Federal Court of Canada [13]. Domain: corpus. URL: http://www.canlii.org/ca/cas/fct/

L11. Dataset: legal snippets containing approximately 595 characters [5, 18]. Domain: PESC dataset. URL: NA
4 Performance Comparison of Proposed Methods in the Study In this section, we compare the previous techniques based on various parameters. The techniques use different datasets in different languages. Some of the previous techniques provide the best results for text summarization of legal documents with reinforcement learning, while other techniques are used for text summarization without applying reinforcement learning. Table 3 shows the performance comparison of the techniques based on these parameters. Figure 2 presents the comparison graph of text summarization techniques for legal documents; it combines all techniques of text summarization, with and without reinforcement learning, for performance comparison. All techniques try to achieve the best parameters, but some techniques cannot summarize the text for legal documents. The LSTM model achieves the highest value among all models: 98.8% recall without deep reinforcement learning. At the same time, the rationale-augmented model achieved 90.75% recall with deep reinforcement learning.
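The four parameters reported here can be reproduced from a model's sentence-selection decisions; a hedged sketch with scikit-learn follows (the labels below are invented, not taken from the surveyed studies):

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 1, 0, 1]  # 1 = sentence belongs in the reference summary
y_pred = [1, 0, 1, 0, 0, 1, 1, 1]  # sentences the model selected
print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
print("F-measure:", f1_score(y_true, y_pred))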
5 Discussion As per the findings of the performance comparison, we can say that text summarization plays a crucial role for legal documents. A text summary condenses the original legal document into easily understandable and readable content, removing the need to read the individual documents; moreover, it is helpful for predicting the judgment of legal documents. In Table 3, we have summarized the comparison of previous techniques to address our research questions. We come to the following conclusions: RQ1: Which technique gives the best comparison result for text summarization with reinforcement learning?
After analyzing Fig. 3, we can say that the rationale-augmented (RAG) model is the best model for summarizing data, because the RAG model extracts the information from a legal document and is also helpful for predicting the judgment in a legal document. The RAG model achieves the best results compared to other models: its accuracy, precision, recall, and F-measure are 79.8%, 70.5%, 90.75%, and 75.09%, respectively. The LetSum model is also used with deep reinforcement learning for text summarization. After analyzing the performance of the LetSum model, we conclude that the RAG model is best for text summarization in legal documents. RQ2: Which technique gives the best comparison result for text summarization without reinforcement learning?
After analyzing Table 3, we can say that the long short-term memory network model is the best model for text summarization without reinforcement learning.
Table 3 Performance comparison of text summarization techniques

L1. Technique: summarize algorithm [14]. Proposed model: NA. Accuracy: NA; precision: NA; recall: 50.5%; F-measure: 37.10%. Remark: the model achieves better performance.

L3. Technique: XML technologies and transductor [9]. Proposed model: XML (Elison). Accuracy: 70%; precision: NA; recall: NA; F-measure: NA. Remark: better performance on the basis of parameters.

L4. Technique: long short-term memory network (catchphrase) [3]. Proposed model: LSTM. Accuracy: NA; precision: 93.1%; recall: 98.8%; F-measure: NA. Remark: the model outperforms the other models.

L5. Technique: text summarization [16]. Proposed model: hybrid automatic summarization system. Accuracy: NA; precision: NA; recall: 62.4%; F-measure: 54.9%. Remark: the model achieves better performance.

L7. Technique: multi-task deep learning [10]. Proposed model: multimodal architecture. Accuracy: NA; precision: NA; recall: NA; F-measure: 82%. Remark: the model outperforms the other models, but the remaining parameters could not be calculated.

L11. Technique: PESC dataset [18]. Proposed model: ELS. Accuracy: 25.70%; precision: NA; recall: NA; F-measure: NA. Remark: better performance on the basis of parameters.

L10. Technique: deep reinforcement learning—summarization approaches on legal documents [13]. Proposed model: LetSum. Accuracy: 57.5%; precision: NA; recall: NA; F-measure: NA. Remark: better performance on the basis of parameters.

L6. Technique: deep reinforcement learning [17]. Proposed model: rationale-augmented model. Accuracy: 79.8%; precision: 70.5%; recall: 90.75%; F-measure: 75.09%. Remark: the model achieves all parameters.
Fig. 2 Performance comparison of accuracy, F-measure, and recall in text summarization techniques: (a) performance comparison of accuracy parameters, (b) performance comparison of F-measure parameters, (c) performance comparison of recall parameters
The LSTM model achieves 93.1% precision and 98.8% recall. The remaining models also try to achieve the best results, but compared to the other models, the LSTM model performs best for text extraction without applying reinforcement learning.
6 Conclusion Data summarization provides brief information about original documents and is most important in the legal domain.
Fig. 3 Text summarization with deep reinforcement learning: accuracy comparison of the rationale-augmented model (79.8%) and LetSum (57.5%)
After summarizing a document, it is straightforward to predict the judgment of the legal document. We analyzed the different datasets used by researchers in multiple languages. We also compared the previous baseline models based on their advantages, findings, and outcomes. After comparing the performance of the various baseline models, we conclude that the rationale-augmented (RAG) model with reinforcement learning is the best model for data summarization of legal documents. Likewise, the long short-term memory network model is the best model for text summarization without deep reinforcement learning; the LSTM model achieves 93.1% precision and 98.8% recall. In future work, we will use the automatic text summarization technique with reinforcement learning to predict the judgment of legal documents, because predicting the judgment of lengthy documents is a very costly and time-consuming process. Hence, we can apply the text summarization model with reinforcement learning to predict the judgment of legal documents. Acknowledgements This research is supported by Council of Science and Technology, Lucknow, Uttar Pradesh via Project Sanction letter number CST/D-3330.
References 1. S.P. Singh, A. Kumar, A. Mangal, S. Singhal, Bilingual automatic text summarization using unsupervised deep learning, in 2016 International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT) (2016), pp. 1195–1200. https://doi.org/10.1109/ICEEOT. 2016.7754874 2. S. Ryang, T. Abekawa, Framework of automatic text summarization using reinforcement learning, in Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (2012)
3. A. Mandal et al., Overview of the FIRE 2017 IRLeD track: information retrieval from legal documents, in FIRE (Working Notes) (2017) 4. T.N. Le, M. Le Nguyen, A. Shimazu, Unsupervised keyword extraction for Japanese legal documents. JURIX (2013) 5. D.-H. Nguyen et al., Robust deep reinforcement learning for extractive legal summarization, in International Conference on Neural Information Processing (Springer, Cham, 2021) 6. F.A. Braz et al., Document classification using a Bi-LSTM to unclog Brazil’s Supreme Court. arXiv preprint arXiv:1811.11569 (2018) 7. J.S. Manoharan, Capsule network algorithm for performance optimization of text classification. J. Soft Comput. Paradigm (JSCP) 3(01), 1–9 (2021) 8. https://marketbusinessnews.com/automatic-text-summarization-in-business/267019/ 9. E. Chieze, A. Farzindar, G. Lapalme, An automatic system for summarization and information extraction of legal information, in Semantic Processing of Legal Texts (Springer, Berlin, Heidelberg, 2010), pp. 216–234 10. A. Elnaggar et al., Multi-task deep learning for legal document translation, summarization and multi-label classification, in Proceedings of the 2018 Artificial Intelligence and Cloud Computing Conference (2018) 11. W.S. El-Kassas et al., Automatic text summarization: a comprehensive survey. Expert Syst. Appl. 165, 113679 (2021) 12. N. Moratanch, S. Chitrakala, A survey on extractive text summarization, in 2017 International Conference on Computer, Communication and Signal Processing (ICCCSP) (2017), pp. 1–6. https://doi.org/10.1109/ICCCSP.2017.7944061 13. A. Farzindar, G. Lapalme, Legal text summarization by exploration of the thematic structure and argumentative roles, in Text Summarization Branches Out (2004) 14. P. Bhattacharya et al., A comparative study of summarization algorithms applied to legal case judgments, in European Conference on Information Retrieval (Springer, Cham, 2019) 15. I. Chalkidis, I. Androutsopoulos, A deep learning approach to contract element extraction. JURIX (2017) 16. A. Kanapala, S. Pal, R. Pamula, Text summarization from legal documents: a survey. Artif. Intell. Rev. 51(3), 371–402 (2019) 17. X. Jiang et al., Interpretable rationale augmented charge prediction system, in Proceedings of the 27th International Conference on Computational Linguistics: System Demonstrations (2018) 18. L. Manor, J.J. Li, Plain English summarization of contracts. arXiv preprint arXiv:1906.00424 (2019)
Use of Near-field Communication (NFC) and Fingerprint Technology for Authentication of ATM Transactions K. Renuka, R. P. Janani, K. Lakshmi Narayanan, P. Kannan, R. Santhana Krishnan, and Y. Harold Robinson
Abstract The automated teller machine (ATM) is a handy solution for users to meet their banking needs. However, using a bank card or check card during an ATM cash transaction or withdrawal has some drawbacks, such as vulnerability to ATM rigging, destruction of the card's magnetic strip, card manufacturing and transportation costs, and a longer time to authenticate customers. This study focuses on near-field communication (NFC) card-emulation mode and fingerprint technology as cash card alternatives that can be employed at the user's discretion. NFC requires a very short distance between the two devices (usually less than 4 cm), making it well suited for transactions involving important data. To collect user information, a fingerprint sensor can also be utilized instead of an NFC tag and reader. Although a cash card is not mandatory for authentication, the system will nonetheless be more secure than the current approach, which uses an ATM card. This ensures a high level of security during authentication. Keywords ATM · Card cloning · Fingerprint technology · Secure authentication · OTP · Rigging · Near-field communication
K. Renuka · R. P. Janani Francis Xavier Engineering College, Tirunelveli, Tamil Nadu, India K. Lakshmi Narayanan (B) · P. Kannan ECE Department, Francis Xavier Engineering College, Tirunelveli, Tamil Nadu, India e-mail: [email protected] P. Kannan e-mail: [email protected] R. Santhana Krishnan ECE Department, SCAD College of Engineering and Technology, Tirunelveli, Tamil Nadu, India Y. Harold Robinson School of Information Technology and Engineering, Vellore Institute of Technology, Vellore, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_31
1 Introduction Due to the increased use of plastic money in today's world, the use of ATM counters has also increased a lot. But in reality, the transactions that we do at ATM counters are not safe. Users are provided with a card that contains a magnetic strip, and every time the user makes a transaction, he/she must enter the personal identification number (PIN) into the ATM, which, if revealed to others, creates a great risk. Repeated usage of the card during every transaction also damages the magnetic strip and can make the card malfunction. Moreover, there are many critical problems, like ATM skimming, card cloning, etc., which make transactions even more threatening. The ultimate aim of this paper is to ensure completely secure ATM transactions without any fear of the user's details being stolen. The two technologies used in this work for secured authentication of ATM transactions are near-field communication (NFC) technology and fingerprint technology. NFC technology permits data transmission to a receiving device in real time [1]. With the technology embedded into mobile phones to make people's lives easier, it is not necessary to carry cash or a credit card to perform bank transactions. NFC payments make payments faster and easier, and they are more convenient than the old ways [2]. Fingerprint technology is used as follows: the fingerprint module scans the customer's biometric data, generates a four-digit code as a message, and sends it to the registered customer's mobile number via a GSM modem connected to the microcontroller. Therefore, the proposed system will be very effective for safe ATM transactions [3, 4]. Say, for example, a person loses the ATM card: if hackers come to know the PIN, money can easily be taken from the account; or, if the ATM machine is fitted with a skimming machine, there is a high risk of money theft from the user's account, and the user is left helpless, as the user's details are collected without their knowledge. To solve this problem, our system comes with NFC and fingerprint technology. Even if the user loses his/her card with the NFC tag, no one can take money from the user's account, as an OTP is sent to the user's mobile number registered with the bank account, and money can be withdrawn only with the OTP [5, 6]. The other option is to use the fingerprint sensor; here, even the card is not needed, since the fingerprint and the OTP are enough to access the bank account [7, 8]. So, a high level of safety can be ensured by the proposed system. The main reason for choosing fingerprints for ATM transactions is that every user has a unique fingerprint pattern that cannot be the same as any other's. This lets users make their ATM transactions more secure than other types of transactions. This system is one of a kind in its own way. To access the system, a user can use a card with an NFC tag or a fingerprint; the system will link to the user's details in either of the two ways. The system asks the user to enter a four-digit number to determine whether or not they are human. Following the identification of the user's information, an OTP will be delivered to the user's
registered mobile number. The user is next requested to enter the needed amount. The transaction will then be processed.
1.1 Motivation The primary goal of our proposed work is the replacement of conventional ATM cards with NFC tag readers and fingerprint technology to provide users with the best ATM transaction experience. NFC technology has a variety of applications, like contactless payment, easy pairing of a Bluetooth device using an NFC tag and reader, record keeping, secure exchange of contact details and other confidential data, etc. The NFC reader supports four modes, namely reader/writer mode, peer-to-peer mode, card-emulation mode, and wireless charging mode. Of these modes, the card-emulation mode acts as a contactless smart card that can be used for secure and contactless ATM transactions. Any NFC-enabled card or NFC-enabled smartphone can be held above the NFC reader for the user's data to be read. Fingerprint technology makes use of a fingerprint sensor to scan the fingerprint of the user and gather the information that is registered with the user's fingerprint. Because fingerprint technology eliminates the need for ATM cards, ATM transactions are more secure and safer.
2 Related Works Hassan et al. [9] have suggested a system in which the card is replaced with a fingerprint that is linked to the bank account, and the PIN is input on a shuffled keypad. The technology was created in such a way that it prevents the misuse of actual ATM cards and allows for secure transactions. Christian et al. [2] have shown that e-service quality influences perceived usefulness, which influences intention to use, and that indicators influence perceived usefulness while NFC indicators influence perceived ease of use. Mahansaria and Roy [10] have discussed the security analysis and threat modeling, which highlight the system's security strength during authentication. Deelaka Ranasinghe and Yu [11] have presented in their paper a unique concept design for a device that serves as an RFID or NFC tag with fingerprint authentication. Govindraj et al. [12] have proposed and addressed the challenges that such systems experience by lowering authentication time with the use of a biometric fingerprint sensor and adding an extra layer of security by generating OTP authentication using a local server. In comparison to existing methodologies, their implementation has produced better outcomes and a greater performance rate. Kolev [13] has used the advent of NFC modules to lower the cost of access control of commercial and office buildings and has devised a system that is meant to be utilized for patrol team control. Embarak [14] has presented a two-step strategy for completing ATM transactions
utilizing a closed end-to-end fraud prevention system, adding a smartphone as an additional layer for ATM transactions, and employing authentic user smartphone ID numbers to safeguard ATM transactions using current technology. Lazaro et al. [15] have investigated the feasibility of employing an NFC-enabled smartphone as a reader to read implanted sensors based on battery-free near-field communication (NFC) integrated circuits. Gupta et al. [16] have presented a USB fingerprint login key for user authentication.
3 Technologies Used 3.1 Near-field Communication (NFC) Technology Near-field communication (NFC) technology is widely used for short-range contactless communication. It uses electromagnetic radio fields to enable communication between any two electronic devices within a short range using an NFC tag and an NFC reader. Because transactions take place over such a short distance, NFC technology requires an NFC reader on the receiving device and an NFC tag/chip on the transmitting device. To transfer data, NFC-enabled devices must be physically touching or within a few centimeters of each other; the receiving device reads the data as soon as it is sent. Near-field communication establishes communication between two electronic devices (or a device and an NFC tag) by bringing them close to each other. NFC builds on radio frequency identification (RFID) and offers benefits over Bluetooth for carrying out secure transactions. One benefit of the NFC tag is that it does not require a power supply, because it is self-contained [10].
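For illustration only, a tag's identifier can be read from a desktop with the nfcpy Python library and a USB contactless reader (the proposed system instead wires a dedicated NFC reader to a PIC microcontroller, so this sketch only mirrors the read step):

import nfc  # nfcpy

def on_connect(tag):
    print("tag UID:", tag.identifier.hex())  # unique identifier of the NFC tag
    return False                             # stop after a single read

clf = nfc.ContactlessFrontend("usb")          # open the first USB NFC reader
clf.connect(rdwr={"on-connect": on_connect})  # block until a tag is presented
clf.close()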
3.2 Fingerprint Biometrics There are two approaches for recognizing a fingerprint: minutiae-based and image-based methods. In the minutiae-based technique, the fingerprint is represented by local features such as terminations and bifurcations. The other method is image-based: it matches fingerprints based on the global properties of the entire image. It is a sophisticated strategy in which all of the images are saved. The use of fingerprint technology makes ATM transactions extremely safe and straightforward. Because fingerprints are unique to each individual, no one else will be able to access an individual's information [17]. Physical ATM cards can be replaced with fingerprint technology to avoid issues such as ATM card skimming and card cloning.
A fingerprint verification system uses a one-to-one comparison to establish a person's genuine identity by comparing their fingerprint to the fingerprint that was acquired and saved initially. To recognize an individual, an identification system performs one-to-many comparisons and searches throughout the whole template database [9, 18, 19].
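The difference between the two lookup modes can be sketched as follows (match_score is a toy cosine-similarity stand-in for a real minutiae or image matcher, and the threshold is illustrative):

def match_score(a, b):
    # toy similarity between fixed-length fingerprint feature vectors
    num = sum(x * y for x, y in zip(a, b))
    den = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
    return num / den

def verify(probe, enrolled_template, threshold=0.9):
    # verification: one-to-one comparison against the template saved at enrollment
    return match_score(probe, enrolled_template) >= threshold

def identify(probe, template_database, threshold=0.9):
    # identification: one-to-many search across the whole template database
    best = max(template_database, key=lambda t: match_score(probe, t))
    return best if match_score(probe, best) >= threshold else None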
3.3 Hardware and Specifications Table 1 lists the hardware components and their specifications (Fig. 1).

Fig. 1 Proposed system

Table 1 Hardware components
S. No. | Hardware component | Specification
1 | PIC microcontroller | –
2 | NFC reader and NFC tag | –
3 | Fingerprint sensor | –
4 | GSM module | –
5 | Relay | 12 V
6 | DC motor | 5 V
7 | LCD | 16 × 2 display
8 | Keypad | 4 × 3 matrix
9 | Connecting wires | As required
4 Proposed System The PIC microcontroller [20] has a RISC design that allows for quick performance as well as simple programming and interfacing. It is the central component to which all other hardware is attached. A fingerprint sensor that scans the user's fingerprint and an NFC reader that reads the NFC tag are the two major devices used as input sources [21]. To provide input to the system, the fingerprint sensor and NFC reader are connected to the PIC microcontroller. The output is produced by the sequential operation of three devices: the GSM module, the LCD, and the motor. Using the GSM phone network, the GSM module is utilized to receive data from a remote place; in this system, it is used to validate the OTP. The transaction status is displayed on the LCD (16 × 2). To signify that the transaction is taking place, a DC motor (5 V) is employed. Finally, for power, the entire system is connected to a nearby power socket. Table 2 compares the existing system with the proposed system.
4.1 Working The PIC microcontroller is the ultimate processing unit of the entire system. It receives the input through an NFC reader and fingerprint sensor, processes it, and gives the output with the help of a GSM module, LCD display, and motor. It is the user’s wish to use either an NFC tag or fingerprint. If the user wants to use the NFC reader, he/she must show the ATM card with an NFC tag above the NFC reader at a distance of less than 4 cm [22] as shown in Fig. 3. Once the NFC reader reads Table 2 Existing system versus proposed system Existing system
Proposed system
ATM hacking is a serious issues. Nowadays, online fraud, cloning, and counterfeiting are commonplace
Fingerprint technology and NFC are employed; therefore, hacking is impossible
ATM card is a mandatory requirement for transaction
Fingerprint technology is also incorporated, so there is no need to always carry an ATM card for money transactions
There is a greater risk of the user’s personal information being misused
Users’ information will be highly safeguarded and safe in both NFC and fingerprint technology scenarios
If an ATM card is used repeatedly, the magnetic The NFC tag does not need to be in contact strips will be damaged with the system If the PIN is found by card skimming and the user’s card is stolen, money can be taken from the user’s account without their knowledge
If a transaction is requested, an OTP will be issued to the user’s mobile number, letting them know if their card is being used by someone else
Fig. 2 Experimental setup
the NFC tag and gathers the user information and the information is valid, the PIC microcontroller continues the further process by transmitting the signal to the GSM module (Fig. 2). The GSM module sends a four-digit OTP to the user’s mobile number that is registered with the user’s bank account [23] as shown in Fig. 4a, b. The user should enter the OTP by typing it into the keypad. Then, the system will ask the user to enter a four-digit number to check whether the user is a human or robot. After the number is entered, the transaction will be processed as shown in Figs. 5 and 6. This ensures the security of every ATM transaction made using this system [24]. If the user wants to use the fingerprint sensor instead of NFC, he/she must keep their finger on the fingerprint sensor for the collection of user information [25]. The fingerprint of the user should be connected to the user’s details by registering it with the bank account [11]. After the user’s information is collected, the same process should be repeated as it is done while using NFC. The main advantage of the fingerprint sensor is that there is no use of a physical ATM card in this process. The motor and motor driver work and indicate that the transaction is being processed [26]. The LCD displays transaction types in the display once the GSM initialization is completed to 100%. It also asks us to choose between other types of transactions,
Fig. 3 NFC tag and reader
Fig. 4 a Verification of user with a four-digit number. b OTP is sent to registered mobile
Fig. 5 Enter the received OTP
such as contactless and cardless. When we choose a contactless transaction (i.e., one that does not require inserting a card), the LCD displays a message such as SHOW YOUR CARD. The card must next be shown to the NFC reader, as illustrated in Fig. 3. The NFC card reader displays a blue light when the card is shown to it and properly identified. When the card is properly recognized, the system prompts us to verify by displaying ANY 4 DT 4 HUMAN on the LCD, requesting us to enter any four digits. It then sends a one-time password
Fig. 6 Keypad for entering the OTP
to the registered mobile phone and displays a message reading CHECK YOUR MOBILE to confirm that the user is the correct owner of the account. As soon as the OTP is received on the registered mobile number, the LCD displays the message ENTER UR DYNAMIC NO, asking for the one-time password to be typed in. Following the entry of the OTP, successful authentication is signaled by the operation of a motor coupled to a relay; when the user has been correctly validated, the relay flashes a blue light [12]. When we choose a cardless transaction (i.e., a transaction that uses a fingerprint), the LCD displays a notification that says PUT YOUR FINGER IN THE DISPLAY. The finger must then be placed on the fingerprint reader. If the fingerprint is correctly identified, the fingerprint controller displays a green light; otherwise, it displays a red light [27]. When a fingerprint is correctly identified, the system requests human verification, and the procedure is repeated in the same way as with the NFC tag reader.
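The overall flow can be condensed into a short, purely illustrative Python simulation; the prompt strings mirror the LCD messages above, while send_sms and read_keypad are hypothetical stand-ins for the GSM module and keypad:

import secrets

def atm_session(user_record, send_sms, read_keypad):
    # user_record comes from the NFC tag lookup or the fingerprint match
    if user_record is None:
        return "INVALID USER"
    human = read_keypad("ANY 4 DT 4 HUMAN")       # human-or-robot check
    if len(human) != 4 or not human.isdigit():
        return "VERIFICATION FAILED"
    otp = f"{secrets.randbelow(10000):04d}"       # four-digit one-time password
    send_sms(user_record["mobile"], otp)          # GSM module texts the user
    if read_keypad("ENTER UR DYNAMIC NO") != otp:
        return "WRONG OTP"
    return "TRANSACTION PROCESSED"                # relay/motor signal dispensing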
5 Conclusion This paper went over every aspect of near-field communication (NFC) technology. NFC’s range can be extended by combining it with current infrared and Bluetooth technologies. NFC is a safe and convenient means to transfer data between two
electronic devices. NFC's interoperability with RFID technology is another benefit: NFC is based on RFID technology, which uses magnetic field induction to establish communication between electronic devices in close proximity. NFC works at a frequency of 13.56 MHz and has a maximum data transfer rate of 424 kbps. ATMs are a convenient way for users to meet their banking demands. ATM machines are found all over the world and are utilized by a vast number of people. As a result, it is critical that ATM transactions be secure and rapid. The usage of a debit card or other type of card during ATM transactions has a number of drawbacks, including the possibility of ATM skimming, destruction of the card's magnetic strip, card production and transit costs, and a longer time to identify users. The advantages of utilizing a smartphone in NFC card-emulation mode over an ATM card have been discussed, and it is clear that it is a viable option [14]. The suggested system's security robustness against vulnerable attacks during authentication is highlighted by security analysis and threat modeling. In the future, we intend to replace the PIC microcontroller with a Raspberry Pi to increase the system's speed and networking capability. Also, by using a smartphone to replace the NFC card, cardless ATM transactions become possible.
5.1 Future Scope
• Fingerprint technology keeps the customer's information safe, as it is the most advanced technology to safeguard any kind of user information.
• The security analysis and threat modeling shown in this work highlight the security strength of the system during authentication.
• As tags and readers are integrated into NFC, privacy can be maintained more easily than with other tags.
• ATM transactions in this project do not necessitate the use of a card, which improves the experience for users by introducing cardless ATM transactions.
The objective of our future work is to concentrate on the security features of the custom authentication software that will be installed on the phone [28]. In addition, an in-depth examination of probable security attacks during ATM transactions will be conducted as part of the scope of this project. Acknowledgements This work was supported in part by the Embedded and IoT Applied Laboratory at Francis Xavier Engineering College, Tamil Nadu, India. Also, we would like to thank the anonymous reviewers for their valuable comments and suggestions.
References 1. C. Shuran, Y. Xiaoling, A new public transport payment method based on NFC and QR code, 2020 IEEE 5th International Conference on Intelligent Transportation Engineering (ICITE) (2020), pp. 240–244. https://doi.org/10.1109/ICITE50838.2020.9231356 2. L. Christian, H. Juwitasary, Y.U. Chandra, E.P. Putra, Fifilia, Evaluation of the E-service quality for the intention of community to use NFC technology for mobile payment with TAM, in 2019 International Conference on Information Management and Technology (ICIMTech) (2019), pp. 24–29. https://doi.org/10.1109/ICIMTech.2019.8843811 3. Shakya, S., Smys, S.: Big data analytics for improved risk management and customer segregation in banking applications. J. ISMAC 3(03), 235–249 (2021) 4. Ityala, S., Sharma, O., Honnavalli, P.B.: Transparent watermarking QR code authentication for mobile banking applications, in International Conference on Inventive Computation Technologies (Springer, Cham, 2019), pp. 738–748 5. M. Satheesh, M. Deepika, Implementation of multifactor authentication using optimistic fair exchange. J. Ubiquitous Comput. Commun. Technol. (UCCT) 2(02), 70–78 (2020) 6. Manoharan, J.S., A novel user layer cloud security model based on chaotic Arnold transformation using fingerprint biometric traits. J. Innov. Image Process. (JIIP) 3(01), 36–51 (2021) 7. Joe, C.V., Raj, J.S.: Deniable authentication encryption for privacy protection using blockchain. J. Artif. Intell. Capsule Netw. 3(3), 259–271 (2021) 8. Pathak, B., Pondkule, D., Shaha, R., Surve, A.: Visual cryptography and image processing based approach for bank security applications, in International Conference on Computer Networks and Inventive Communication Technologies (Springer, Cham, 2019), pp. 292–298 9. A. Hassan, A. George, L. Varghese, M. Antony, K.K. Sherly, The biometric cardless transaction with shuffling keypad using proximity sensor, in 2020 Second International Conference on Inventive Research in Computing Applications (ICIRCA) (2020), pp. 505–508. https://doi.org/ 10.1109/ICIRCA48905.2020.9183314 10. D. Mahansaria, U.K. Roy, Secure authentication for ATM transactions using NFC technology, in 2019 International Carnahan Conference on Security Technology (ICCST) (2019), pp. 1–5. https://doi.org/10.1109/CCST.2019.8888427 11. R.M.N. Deelaka Ranasinghe, G.Z. Yu, RFID/NFC device with embedded fingerprint authentication system, in 2017 8th IEEE International Conference on Software Engineering and Service Science (ICSESS) (2017), pp. 266–269. https://doi.org/10.1109/ICSESS.2017.8342911 12. V.J. Govindraj, P.V. Yashwanth, S.V. Bhat, T.K. Ramesh, Smart door using biometric NFC band and OTP based methods, in 2020 International Conference for Emerging Technology (INCET) (2020), pp. 1–4. https://doi.org/10.1109/INCET49848.2020.9153970 13. S. Kolev, Designing a NFC system, in 2021 56th International Scientific Conference on Information, Communication and Energy Systems and Technologies (ICEST) (2021), pp. 111–113. https://doi.org/10.1109/ICEST52640.2021.9483482 14. O.H. Embarak, A two-steps prevention model of ATM frauds communications, in 2018 Fifth HCT Information Technology Trends (ITT) (2018), pp. 306–311. https://doi.org/10.1109/CTIT. 2018.8649551 15. A. Lazaro, M. Boada, R. Villarino, D. Girbau, Feasibility study on the reading of energyharvested implanted NFC tags using mobile phones and commercial NFC IC, in 2020 IEEE MTT-S International Microwave Biomedical Conference (IMBioC) (2020), pp. 1–3. https://doi. org/10.1109/IMBIoC47321.2020.9385033 16. R. Gupta, G. 
Arora, A. Rana, USB fingerprint login key, in 2020 8th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO) (2020), pp. 454–457. https://doi.org/10.1109/ICRITO48877.2020.9197785 17. Amruth, Y., Gopinatha, B.M., Gowtham, M.P., Kiran, M., Harshalatha, Y., Fingerprint and signature authentication system using CNN, in 2020 IEEE International Conference for Innovation in Technology (INOCON) (2020), pp. 1–4. https://doi.org/10.1109/INOCON50539.2020. 9298235
426
K. Renuka et al.
18. A. Sathesh, Enhanced soft computing approaches for intrusion detection schemes in social media networks. J. Soft Comput. Paradigm (JSCP) 1(02), 69–79 (2019) 19. S.R. Mugunthan, Soft computing based autonomous low rate DDOS attack detection and security for cloud computing. J. Soft Comput. Paradigm (JSCP) 1(02), 80–90 (2019) 20. S.S. Devi, T.S. Prakash, G. Vignesh, P.V. Venkatesan, Ignition system based licensing using PIC microcontroller, in 2021 Second International Conference on Electronics and Sustainable Communication Systems (ICESC) (2021), pp. 252–256. https://doi.org/10.1109/ICESC51422. 2021.9532920 21. A. Mandalapu, V. Daffney Deepa, L.D. Raj, J. Anish Dev, An NFC featured three level authentication system for tenable transaction and abridgment of ATM card blocking intricacies, in 2015 International Conference and Workshop on Computing and Communication (IEMCON) (2015), pp. 1–6. https://doi.org/10.1109/IEMCON.2015.7344491 22. A. Albattah, Y. Alghofaili, S. Elkhediri, NFC technology: assessment effective of security towards protecting NFC devices & services, in 2020 International Conference on Computing and Information Technology (ICCIT-1441) (2020), pp. 1–5. https://doi.org/10.1109/ICCIT-144 147971.2020.9213758 23. A. Khatoon, M. Sharique, Performance of GSM and GSM-SM over α-μ fading channel model, in TENCON 2019—2019 IEEE Region 10 Conference (TENCON) (2019), pp. 1981–1985. https://doi.org/10.1109/TENCON.2019.8929377 24. S. Sridharan, K. Malladi, New generation ATM terminal services, in 2016 International Conference on Computer Communication and Informatics (ICCCI) (2016), pp. 1–6. https://doi.org/ 10.1109/ICCCI.2016.7479928 25. P. Poonia, O.G. Deshmukh, P.K. Ajmera, Adaptive quality enhancement fingerprint analysis, in 2020 3rd International Conference on Emerging Technologies in Computer Engineering: Machine Learning and Internet of Things (ICETCE) (2020), pp. 149–153. https://doi.org/10. 1109/ICETCE48199.2020.9091760 26. J. Li, Research on DC motor driver in automobile electric power steering system, in 2020 International Conference on Intelligent Transportation, Big Data & Smart City (ICITBS) (2020), pp. 449–453. https://doi.org/10.1109/ICITBS49701.2020.00097 27. R.B.S. Prajeesha, N. Nagabhushan, T. Madhavi, Fingerprint-based licensing for driving, in 2021 6th International Conference for Convergence in Technology (I2CT) (2021), pp. 1–6. https://doi.org/10.1109/I2CT51068.2021.9418134 28. Y. Kim, M. Jun, A design of user authentication system using QR code identifying method, in 2011 6th International Conference on Computer Sciences and Convergence Information Technology (ICCIT) (2011), pp. 31–35
Light Gradient Boosting Machine in Software Defect Prediction: Concurrent Feature Selection and Hyper Parameter Tuning Suresh Kumar Pemmada, Janmenjoy Nayak, H. S. Behera, and Danilo Pelusi Abstract Predicting software defects is critical for ensuring software quality. Many supervised learning approaches have been used to detect defect-prone instances in recent years. However, the efficacy of these supervised learning approaches is still inadequate, and more sophisticated techniques are required to boost the effectiveness of defect prediction models. In this paper, we present a light gradient boosting methodology based on ensemble learning that uses simultaneous feature selection (Recursive Feature Elimination (RFE)) and hyperparameter tuning (random search). Our proposed LGBM + Randomsearch + RFE method is evaluated using the AEEEM dataset, including Apache Lucene, Eclipse JDT Core, Equinox, Mylyn, and Eclipse PDE UI. The experimental findings demonstrate that the proposed approach outperforms LGBM + Randomsearch, LGBM, and the top classical machine learning algorithms on all performance criteria considered. Keywords Light gradient boosting machine · Recursive feature elimination · Software defect prediction · Ensemble learning
S. K. Pemmada (B) Department of Computer Science and Engineering, Aditya Institute of Technology and Management (AITAM), Tekkali 532201, India e-mail: [email protected] J. Nayak Department of Computer Science, Maharaja Sriram Chandra Bhanja Deo University, Baripada, Odisha 757003, India S. K. Pemmada · H. S. Behera Department of Information Technology, Veer Surendra Sai University of Technology, Burla 768018, India D. Pelusi Communication Sciences, University of Teramo, Teramo, Italy e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_32
1 Introduction In the software development life cycle, software testing is a vital but costly procedure. Defects are inevitable in contemporary software development due to its complexity. Defect-prone software projects may have unintended repercussions when deployed, resulting in massive losses for businesses or even endangering people's lives. Most approaches for assessing defects rely on historical information and pertinent data to build a prediction model. They allow the software industry to inspect source files and code before release, so that any defects can be identified and corrected in time [1]. Currently, defect fixing consumes more than 80% of software maintenance and development expenditure. If these defects could be caught early in the software development cycle, the cost would be significantly reduced. As a consequence, a number of academics have sought to build defect prediction models to assist programmers in discovering probable defects ahead of time [2]. Software defect prediction (SDP) has attracted a lot of attention as a quality factor that may assist software developers in finding more defects in their systems. It entails using machine learning (ML) approaches on software metrics generated from software system repositories to anticipate a software system's quality and dependability [3]. Software engineers may utilize the knowledge acquired from SDP procedures to improve software development processes and manage constrained software resources [4]. Due to the constraints of software testing abilities and performance, defect prediction methodologies are still unsatisfactory in practice. The software may be harmed if the prediction model deviates from the norm or has low prediction performance. Because most instances are defect-free and only a few are defect-prone, class imbalance is a major factor impacting prediction performance in SDP; that is, the dataset is imbalanced. As a result, a significant number of researchers have focused on imbalanced learning for SDP, and several earlier studies [5, 6] have presented various ways of dealing with dataset imbalance. The problem of class imbalance has been addressed in this article using a technique known as the Synthetic Minority Over-sampling Technique (SMOTE). SMOTE is an over-sampling approach in which synthetic samples are created for the minority class: it concentrates on the feature space and produces new examples by interpolating between positive instances that are close together. However, the performance of such models alone is still insufficient to accomplish the intended outcomes. Feature selection is the process of selecting features from a large number of features in a dataset. It is one of the most important fields of study in high-dimensional data analysis and is very important for constructing highly effective machine learning models. There are three types of feature selection approaches: filter, wrapper, and embedded methods. In this article, a wrapper method is used. Wrapper approaches choose subsets of features from the whole feature set and then train the model; the features to be removed from or added to the feature subset are determined depending on the results of the preceding model. Recursive Feature Elimination (RFE) is a wrapper feature selection method that employs a greedy algorithm in its execution. The RFE algorithm begins with the
Light Gradient Boosting Machine in Software Defect …
429
entire features. The feature set has been selected based on the classification accuracy. Feature sets have been ranked at the end of each iteration, and the feature with the least importance is eliminated. This method is repeated until only the relevant features remain in the feature set [7]. Aiming at improving the performance of the defect prediction, Logistic Regression (LR) [8], Naive Bayes (NB) [9], Support Vector Machine (SVM) [10], and Random Forest (RF) [11] have been effectively implemented and predicting different defects in software systems. However, in skewed and duplicated defect datasets, these techniques are sub-optimal. The prediction performance of these approaches deteriorates when the defect datasets include missing or irrelevant information. Individual classifiers, such as support vector machines (SVM) and artificial neural networks (ANN), are biased toward the majority class and disregard the minority class, resulting in a high false negative rate (FNR) [12]. It is worth noting that ensemble learning models are excellent for dealing with the data difficulties stated above. Ensembles are models that incorporate predictions from two or more different models. Ensemble learning approaches lower the spread in a predictive model’s average skill, enhance average prediction performance across any contributing classifiers in the ensemble, and frequently reduce the variance component of prediction mistakes generated by the contributing models. Despite the fact that ensemble learning techniques increase computational cost and complexity, they may provide superior predictions and performance than any individual contributing model. The following is the article’s main contribution: (A)
(A) This article demonstrates the utilization of a light gradient boosting machine, an ensemble learning methodology that integrates distinct machine learning algorithms, for identifying defect-prone and defect-free modules.
(B) The class imbalance problem is addressed by employing SMOTE to generate synthetic minority samples (see the sketch after this list).
(C) Recursive feature elimination is employed to identify significant features and improve performance.
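SMOTE as described above is implemented in the imbalanced-learn Python library. The following is a minimal illustrative sketch, not the authors' code; a synthetic dataset stands in for a real defect dataset such as AEEEM.

```python
# Minimal sketch of SMOTE balancing on a synthetic imbalanced dataset
# (make_classification stands in for a real defect dataset such as AEEEM).
from collections import Counter
from sklearn.datasets import make_classification
from imblearn.over_sampling import SMOTE

X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=25)
print("before:", Counter(y))             # heavily skewed toward class 0

# SMOTE interpolates between nearby minority-class samples to create
# synthetic defect-prone instances until the classes are balanced.
X_bal, y_bal = SMOTE(random_state=25).fit_resample(X, y)
print("after:", Counter(y_bal))          # both classes the same size
```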
The rest of this work is organized as follows: the literature on software defect prediction using different machine learning and ensemble learning techniques is presented in Sect. 2. The proposed methodology is presented in Sect. 3, together with the framework of the proposed method. Section 4 includes the empirical data, simulation environment, model parameter configuration, and analysis of results. Finally, Sect. 5 concludes the work and suggests potential future directions.
2 Literature Study

This section studies the literature on software defect prediction dealing with class imbalance and feature selection.

Catherine and Djodilatchoumy [13] investigated the usage of a Multi-Layer Perceptron Neural Network for effective prediction of defects. They used MLP-NN as an attribute evaluator with a subset of features selected by utilizing a collection-based multi-filter selection technique and correlation-based feature selection. Five datasets available in AEEEM were used to test the model. The results were then compared to those of other well-known classifiers such as Logistic Regression, Random Tree, and MLP-NN. The results show that the proposed approach outperformed the others and that feature selection significantly enhanced prediction accuracy.

In order to deal with the unbalanced data in software defect prediction, Guo et al. [14] employed a random over-sampling strategy to construct minority class instances from a high-dimensional sample space. To provide a robust approach to generating new synthetic samples, two limits were imposed: scaling the random over-sampling scope to a sufficient region and distinguishing the majority class samples in a critical position. They experimentally validated the proposed technique on the software project datasets ivy-2.0, log4j-1.1, xalan-2.5, velocity-1.4, redaktor, synapse-1.1, arc, lucene-2.4, and MW1, and the results outperform typical unbalanced processing algorithms. Table 1 summarizes some further literature on software defect prediction.
3 Proposed Method

This section outlines the proposed intelligent method, a light gradient boosting machine with random search and RFE, for predicting defect-free and defect-prone modules; it is built on tree learning algorithms based on the boosting concept [23]. It enables hyperparameter tweaking and feature selection for the light gradient boosting model simultaneously. Random search is utilized to tune the model parameters, while a ranking feature selection method, recursive feature elimination, is employed to enhance the proposed method's performance. LGBM accelerates the training phase and reduces memory use by using a leaf-wise growth process with depth restrictions and histogram-based techniques. The level-wise growth process of a decision tree is inefficient because it handles all leaves of the same layer, consuming a large amount of superfluous memory. Leaf-wise growth is a more effective method that finds the leaf with the greatest branching advantage and proceeds through the branching cycle. Therefore, LGBM adds a maximum depth limit on top of the leaf-wise growth, preventing overfitting while maintaining high performance [24]. The framework of the proposed method is presented in Fig. 1.
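As a rough illustration of this tuning loop (not the authors' exact code), the sketch below runs a random search over a LightGBM classifier. The parameter ranges are assumptions chosen for illustration; Table 3 lists the values the search actually selected per dataset, and the synthetic data are placeholders.

```python
# Illustrative sketch: tuning a leaf-wise LightGBM classifier by random
# search. Parameter ranges are assumptions; Table 3 gives the selected
# values per AEEEM project.
import numpy as np
from lightgbm import LGBMClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV

X, y = make_classification(n_samples=500, n_features=25, random_state=25)
param_dist = {
    "boosting_type": ["gbdt", "goss"],
    "learning_rate": np.linspace(0.01, 0.3, 30),
    "max_depth": [6, 8, 10, 12],        # depth cap on leaf-wise growth
    "num_leaves": [30, 50, 100, 150],   # limits overfitting
    "n_estimators": [100, 150, 200],
}
search = RandomizedSearchCV(LGBMClassifier(random_state=25),
                            param_distributions=param_dist,
                            n_iter=25, cv=5, scoring="accuracy",
                            random_state=25)
search.fit(X, y)
print(search.best_params_)
```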
4 Experimental Setup and Result Discussion

This section contains information on the dataset and simulation environment, as well as a discussion of the findings of the proposed and compared methods.
Table 1 Literature on software defect prediction

S. No | Dataset (source) | Method | Performance | Evaluation factor | References
1 | PROMISE, AEEEM | Support vector machine—SMOTE | Mylyn: F-measure, AUC: 0.74 | F-measure, AUC | [15]
2 | NASA, AEEEM | Sampling with the majority (SWIM) boost | G-mean: 0.72, AUC: 0.74 | G-mean, AUC | [16]
3 | AEEEM | Extended nearest neighbor algorithm, SMOTE, simulated annealing | Accuracy: 87.19 | Accuracy | [17]
4 | PROMISE | Naïve Bayes | Accuracy: 98.7 | Accuracy | [18]
5 | NASA | Linear regression, Naïve Bayes | Linear regression, Naïve Bayes: 98.1 | Accuracy | [19]
6 | PROMISE | Random forest | Precision: 0.893, recall: 0.919, accuracy: 0.946, and F-measure: 0.919 | Precision, recall, accuracy, and F-measure | [20]
7 | NASA, PROMISE, ReLink, and AEEEM | Naïve Bayes, decision tree | Relink AUC—95% | AUC | [21]
8 | AEEEM's JDT, PDE, Mylyn, jm1, kc1, PR, and ECLIPSE | Multiview transfer learning for SDP | Eclipse AUC 0.8964 | AUC | [1]
9 | NASA, PROMISE, AEEEM, and Relink | Deep forest using defect prediction | NASA AUC—0.92 | AUC | [2]
10 | AEEEM, NASA, PROMISE | Distance-based classifiers, connectivity-based classifiers | AEEEM AUC—0.83 | AUC | [22]
4.1 Empirical Data

D'Ambros et al. collected the AEEEM dataset [25] based on Apache and Eclipse projects. AEEEM contains 61 different metrics for each program, integrating numerous traditional source code metrics that characterize the software with metrics based on change history, entropy, and churn of source code metrics [26].

Fig. 1 The framework of the proposed method

Table 2 shows the AEEEM dataset's precise details, including the total number of modules, the number of software metrics, the number of defect-prone modules, the number of defect-free modules, and the imbalance rate (IR) for each dataset.

Table 2 Detailed statistics of AEEEM repository

Dataset program | Modules | Features | Majority samples | Minority samples | Imbalance rate
Apache Lucene | 691 | 61 | 627 | 64 | 9.8
Equinox | 324 | 61 | 195 | 129 | 1.5
Eclipse JDT Core | 997 | 61 | 791 | 206 | 3.8
Mylyn | 1862 | 61 | 1617 | 245 | 6.6
Eclipse PDE UI | 1497 | 61 | 1288 | 209 | 6.2
Table 3 LGBM parameters in AEEEM

Dataset | LGBM parameters (RFE + random search)
Apache Lucene | boosting_type = 'goss', learning_rate = 0.29788814813308806, max_depth = 10, n_estimators = 150, num_leaves = 30, random_state = 25
Equinox | boosting_type = 'goss', class_weight = none, colsample_bytree = 1.0, importance_type = 'split', learning_rate = 0.29923603106613095, max_depth = 12, min_child_samples = 20, min_child_weight = 0.001, min_split_gain = 0.0, n_estimators = 150, n_jobs = −1, num_leaves = 30, objective = none, random_state = 25, reg_alpha = 0.0, reg_lambda = 0.0, silent = true, subsample = 1.0, subsample_for_bin = 200,000, subsample_freq = 0
Eclipse JDT Core | boosting_type = 'goss', class_weight = none, colsample_bytree = 1.0, importance_type = 'split', learning_rate = 0.2975322352484806, max_depth = 12, min_child_samples = 20, min_child_weight = 0.001, min_split_gain = 0.0, n_estimators = 150, n_jobs = −1, num_leaves = 150, objective = none, random_state = 25, reg_alpha = 0.0, reg_lambda = 0.0, silent = true, subsample = 1.0, subsample_for_bin = 200,000, subsample_freq = 0
Mylyn | learning_rate = 0.25589949734941975, max_depth = 10, n_estimators = 150, num_leaves = 50, random_state = 25
Eclipse PDE UI | learning_rate = 0.20777294930484105, max_depth = 12, n_estimators = 150, num_leaves = 100, random_state = 25
4.2 Simulation Environment and Parameter Setting

The proposed approach and all comparative machine learning methods have been implemented on a Lenovo IdeaPad Flex 5 machine with the following system settings: processor 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80 GHz, 16 GB RAM, Intel(R) Iris(R) Xe Graphics GPU, and Windows 11 Home, using Python modules such as NumPy, Pandas, scikit-learn, lightgbm, and matplotlib. Recursive Feature Elimination is used to choose features, while random search is utilized to tune the parameters of the light gradient boosting machine for software defect prediction. In the recursive feature elimination process, we selected 25 of the 61 features with a step of 1, continuing the process for up to 22 iterations. Table 3 shows the parameter configuration used to evaluate the proposed approach on the various AEEEM projects.
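As a rough sketch of the RFE configuration just described (25 retained features out of 61, step size 1), scikit-learn's RFE wrapper around the LightGBM estimator could be set up as below; the synthetic data are placeholders.

```python
# Sketch of the stated RFE setup: keep 25 of the 61 software metrics,
# eliminating the least important feature (step=1) in each iteration.
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from lightgbm import LGBMClassifier

X, y = make_classification(n_samples=500, n_features=61, random_state=25)
rfe = RFE(estimator=LGBMClassifier(random_state=25),
          n_features_to_select=25, step=1)
X_sel = rfe.fit_transform(X, y)   # rankings come from feature_importances_
print(X_sel.shape)                # (500, 25)
```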
4.3 Result Discussion

The dataset mentioned above contains a number of software projects whose modules have been categorized as defect-free or defect-prone. A light gradient boosting approach has been proposed to detect the defect-prone modules in the AEEEM data. According to the literature review, most research papers regarded accuracy as the significant measure in the classification problem. Apart from accuracy, various metrics such as true positive rate, false positive rate, precision, true negative rate, F1-score, and ROC-AUC might be useful in gaining a better understanding of the simulated outcomes. The proposed method has been validated by comparison against different ML algorithms such as KNN, RF, SGD, DT, LR, GNB, LDA, and QDA using performance measures such as accuracy, true positive rate, false positive rate, precision, true negative rate, F1-score, and ROC-AUC [27]. Table 4 presents the performance of the proposed LGBM + Randomsearch + RFESHAP, LGBM + Randomsearch, and LGBM, together with several machine learning algorithms such as KNN, RF, SGD, DT, LR, GNB, LDA, and QDA, on the AEEEM data.

In Apache Lucene, LGBM + Randomsearch + RFE and LGBM + Randomsearch have a testing accuracy of 0.952, whereas LGBM, KNN, RF, SGD, DT, GNB, LDA, LR, and QDA have testing accuracies of 0.932, 0.793, 0.916, 0.713, 0.900, 0.685, 0.785, 0.761, and 0.737, respectively. In terms of TPR, LGBM + Randomsearch and LGBM did well with 0.945. However, the proposed technique LGBM + Randomsearch + RFE outperformed the others in TNR, F1, ROC-AUC, FPR, and precision. In the case of Equinox, LGBM + Randomsearch + RFESHAP is ranked first with 0.872 testing accuracy, followed by LGBM + Randomsearch and LDA with 0.859 and 0.821 testing accuracy, respectively. In terms of TPR, LGBM performed well, while the proposed approach did better in precision, FPR, TNR, ROC-AUC, and F1-score. In the case of Eclipse JDT Core, the proposed method obtained a testing accuracy of 0.912, and the proposed LGBM + Randomsearch + RFESHAP outperformed all other approaches, achieving 0.920, 0.097, 0.909, 0.903, 0.914, and 0.911 for TPR, FPR, precision, TNR, F1, and ROC-AUC, respectively. LGBM + Randomsearch + RFESHAP produced better results in the case of Mylyn, followed by LGBM, which is equal to the proposed LGBM + Randomsearch. KNN, SGD, RF, DT, GNB, LR, LDA, and QDA achieved accuracies of 0.833, 0.669, 0.918, 0.867, 0.631, 0.767, 0.757, and 0.675, respectively. In the case of Eclipse PDE UI, LGBM + Randomsearch + RFESHAP is most accurate, with a score of 0.928, followed by 0.926, 0.923, and 0.919 for LGBM + Randomsearch, LGBM, and random forest, respectively. The accuracies of KNN, SGD, DT, GNB, LR, LDA, and QDA are 0.841, 0.611, 0.866, 0.640, 0.723, 0.738, and 0.690, respectively. The proposed technique outperformed the other techniques in terms of all considered performance criteria. This research reveals that the proposed approach is a reliable model that outperforms all others in terms of the considered performance measures.

The ROC curves of the proposed approach for both classes, along with their coverage for analyzing the difference between classes on the Apache Lucene, Equinox, Eclipse JDT Core, Mylyn, and Eclipse PDE UI datasets, are shown in Fig. 2a–e. The proposed method has a higher coverage area in both micro and macro average than all of the ROC curves generated for the ensemble and machine learning-based models, demonstrating a better capacity to distinguish between defect-free and defect-prone modules.

Table 5 shows a comparison of the proposed approach's performance with that of several earlier articles; the results reveal that the proposed method outperformed them in
Table 4 Performance of the proposed and several comparison models in AEEEM (training accuracy, testing accuracy, precision, true negative rate, F1-score, true positive rate, false positive rate, and ROC-AUC of each model on the five AEEEM projects; the key values are quoted in the discussion above)

Proposed Method: LGBM + Randomsearch + RFESHAP; LGBM + RS: LGBM + Randomsearch; LU: Apache Lucene; JD: Eclipse JDT Core; EQ: Equinox; My: Mylyn; PD: Eclipse PDE UI
Fig. 2 ROC-AUC curve of the proposed method with a Apache Lucene, b Equinox, c Eclipse JDT Core, d Mylyn, e Eclipse PDE UI
Table 5 Performance comparison of the proposed method with previous articles

Project name | Proposed method (LGBM + Randomsearch + RFESHAP)
Apache Lucene | Accuracy 0.952, AUC 0.955, F1-score 0.959
Equinox | Accuracy 0.872, AUC 0.872, F1-score 0.872
Eclipse JDT Core | Accuracy 0.912, AUC 0.911, F1-score 0.914
Mylyn | Accuracy 0.9212, AUC 0.9211, F1-score 0.9224
Eclipse PDE UI | Accuracy 0.926, AUC 0.927, F1-score 0.927

The previous methods compared for each project include DPCMM and DPSAM [28], MLP + FS and LR [13], MDA-O and ManualDown [29], MTDT [1], DPDF and the NB, SVM, DBN, and LR baselines [2], and SC [22]; the best of these results are quoted in the discussion below.
all performance measures. In Apache Lucene, the proposed approach scored 0.952 accuracy, 0.955 AUC, and 0.959 F1-score, which is better than the 0.93 accuracy, 0.82 AUC, and 0.761 F1-score of the previous studies. In Equinox, the proposed method achieved the same accuracy, AUC, and F1-score of 0.872, while the prior articles' greatest accuracy, AUC, and F1-score were 0.80, 0.85, and 0.75, respectively. The accuracy, AUC, and F1-score of the proposed technique in Eclipse JDT Core are 0.912, 0.911, and 0.914, respectively, while the maximum accuracy, AUC, and F1-score in prior publications were 0.85, 0.86, and 0.699, respectively. The proposed technique's accuracy, AUC, and F1-score in Mylyn are 0.9212, 0.9211, and 0.9224, respectively, while previous publications' greatest accuracy, AUC, and F1-score were 0.87, 0.82, and 0.63, respectively. The proposed approach scored 0.926 accuracy, 0.927 AUC, and 0.927 F1-score in Eclipse PDE UI, whereas the prior papers scored 0.87 accuracy, 0.82 AUC, and 0.66 F1-score. These findings indicate that the proposed approach performed much better than earlier publications in terms of accuracy, AUC, and F1-score.
5 Conclusion

Despite various machine learning-based algorithms, detecting and identifying software defects has always been difficult. To address the current research gap, this work proposes an ensemble learning-based LGBM model. This research aims to resolve high dimensionality and choose the best hyperparameters for the learning algorithm. The efficacy of various filter feature selection techniques varies, making it difficult to choose an appropriate and relevant filter feature selection approach to utilize in SDP. On the proposed light gradient boosting approach, recursive feature elimination is utilized for feature selection while random search is employed for hyperparameter tuning simultaneously. The effectiveness of the proposed method LGBM + Randomsearch + RFESHAP has been validated against LGBM + Randomsearch, LGBM, and several ML techniques such as SGD, KNN, RF, GNB, DT, LR, LDA, and QDA. Based on various performance measures and extensive research, it is clear that the proposed approach is effective in detecting software defects. The proposed model identified defects in software modules with 0.952, 0.872, 0.912, 0.921, and 0.926 accuracy for Apache Lucene, Equinox, Eclipse JDT Core, Mylyn, and Eclipse PDE UI, respectively. In the future, we will look at the possibility of using our proposed technique for cross-project defect prediction on additional projects. Finally, we would like to investigate the possibility of the SMOTUNED technique to handle the class imbalance issue in defect prediction, or conduct comprehensive research comparing with the SMOTE/SMOTUNED method on unbalanced datasets.
References

1. J. Chen, Y. Yang, K. Hu, Q. Xuan, Y. Liu, C. Yang, Multiview transfer learning for software defect prediction. IEEE Access 7, 8901–8916 (2019). https://doi.org/10.1109/ACCESS.2018.2890733
2. T. Zhou, X. Sun, X. Xia, B. Li, X. Chen, Improving defect prediction with deep forest. Inf. Softw. Technol. 114, 204–216 (2019). https://doi.org/10.1016/j.infsof.2019.07.003
3. P. Suresh Kumar, H.S. Behera, J. Nayak, B. Naik, A pragmatic ensemble learning approach for effective software effort estimation. Innov. Syst. Softw. Eng. (2021). https://doi.org/10.1007/s11334-020-00379-y
4. P. Suresh Kumar, H.S. Behera, J. Nayak, B. Naik, Bootstrap aggregation ensemble learning-based reliable approach for software defect prediction by using characterized code feature. Innov. Syst. Softw. Eng. 17(4), 355–379 (2021). https://doi.org/10.1007/s11334-021-00399-2
5. R. Shatnawi, Improving software fault-prediction for imbalanced data, in 2012 International Conference on Innovations in Information Technology (IIT), Mar 2012, pp. 54–59. https://doi.org/10.1109/INNOVATIONS.2012.6207774
6. R. Chen, S.-K. Guo, X.-Z. Wang, T.-L. Zhang, Fusion of multi-RSMOTE with fuzzy integral to classify bug reports with an imbalanced distribution. IEEE Trans. Fuzzy Syst. 27(12), 2406–2420 (2019). https://doi.org/10.1109/TFUZZ.2019.2899809
7. S. Mehta, K.S. Patnaik, Improved prediction of software defects using ensemble machine learning techniques. Neural Comput. Appl. 33(16), 10551–10562 (2021). https://doi.org/10.1007/s00521-021-05811-3
8. V.U.B. Challagulla, F.B. Bastani, I.-L. Yen, R.A. Paul, Empirical assessment of machine learning based software defect prediction techniques, in 10th IEEE International Workshop on Object-Oriented Real-Time Dependable Systems (2005), pp. 263–270. https://doi.org/10.1109/WORDS.2005.32
9. Ö.F. Arar, K. Ayan, A feature dependent Naive Bayes approach and its application to the software defect prediction problem. Appl. Soft Comput. 59, 197–209 (2017). https://doi.org/10.1016/j.asoc.2017.05.043
10. X. Rong, F. Li, Z. Cui, A model for software defect prediction using support vector machine based on CBA. Int. J. Intell. Syst. Technol. Appl. 15(1), 19 (2016). https://doi.org/10.1504/IJISTA.2016.076102
11. H. Lu, B. Cukic, M. Culp, Software defect prediction using semi-supervised learning with dimension reduction, in Proceedings of the 27th IEEE/ACM International Conference on Automated Software Engineering—ASE 2012 (2012), p. 314. https://doi.org/10.1145/2351676.2351734
12. I.H. Laradji, M. Alshayeb, L. Ghouti, Software defect prediction using ensemble learning on selected features. Inf. Softw. Technol. 58, 388–402 (2015). https://doi.org/10.1016/j.infsof.2014.07.005
13. J.M. Catherine, S. Djodilatchoumy, Multi-layer perceptron neural network with feature selection for software defect prediction, in 2021 2nd International Conference on Intelligent Engineering and Management (ICIEM), Apr 2021, pp. 228–232. https://doi.org/10.1109/ICIEM51511.2021.9445350
14. S. Guo, J. Dong, H. Li, J. Wang, Software defect prediction with imbalanced distribution by radius-synthetic minority over-sampling technique. J. Softw. Evol. Process 33(7), 1–21 (2021). https://doi.org/10.1002/smr.2362
15. R. Malhotra, V. Agrawal, V. Pal, T. Agarwal, Support vector based oversampling technique for handling class imbalance in software defect prediction, in 2021 11th International Conference on Cloud Computing, Data Science & Engineering (Confluence), Jan 2021, pp. 1078–1083. https://doi.org/10.1109/Confluence51648.2021.9377068
16. J. Zheng, X. Wang, D. Wei, B. Chen, Y. Shao, A novel imbalanced ensemble learning in software defect predication. IEEE Access 9, 86855–86868 (2021). https://doi.org/10.1109/ACCESS.2021.3072682
17. Y. Liu, F. Sun, J. Yang, D. Zhou, Software defect prediction model based on improved BP neural network, in 2019 6th International Conference on Dependable Systems and Their Applications (DSA), Jan 2020, pp. 521–522. https://doi.org/10.1109/DSA.2019.00095
18. A. Rahim, Z. Hayat, M. Abbas, A. Rahim, M.A. Rahim, Software defect prediction with Naïve Bayes classifier, in 2021 International Bhurban Conference on Applied Sciences and Technologies (IBCAST), Jan 2021, pp. 293–297. https://doi.org/10.1109/IBCAST51254.2021.9393250
19. A. Arya, S. Kumar, V. Singh, Prediction of defects in software using machine learning classifiers (2021), pp. 481–494
20. K.V. Kumar, P. Kumari, A. Chatterjee, D.P. Mohapatra, Software fault prediction using random forests, in Smart Innovation, Systems and Technologies, vol. 194 (2021), pp. 95–103. https://doi.org/10.1007/978-981-15-5971-6_10
21. A.O. Balogun et al., Impact of feature selection methods on the predictive performance of software defect prediction models: an extensive empirical study. Symmetry (Basel) 12(7), 1147 (2020). https://doi.org/10.3390/sym12071147
22. F. Zhang, Q. Zheng, Y. Zou, A.E. Hassan, Cross-project defect prediction using a connectivity-based unsupervised classifier, in Proceedings of the 38th International Conference on Software Engineering—ICSE '16, 14–22 May 2016, pp. 309–320. https://doi.org/10.1145/2884781.2884839
23. G. Ke et al., LightGBM: a highly efficient gradient boosting decision tree, in 31st Conference on Neural Information Processing Systems (NIPS 2017) (2017), pp. 3147–3155. [Online]. Available: https://proceedings.neurips.cc/paper/2017/file/6449f44a102fde848669bdd9eb6b76fa-Paper.pdf
24. J. Fan, X. Ma, L. Wu, F. Zhang, X. Yu, W. Zeng, Light gradient boosting machine: an efficient soft computing model for estimating daily reference evapotranspiration with local and external meteorological data. Agric. Water Manag. 225, 105758 (2019). https://doi.org/10.1016/j.agwat.2019.105758
25. M. D'Ambros, M. Lanza, R. Robbes, An extensive comparison of bug prediction approaches, in 2010 7th IEEE Working Conference on Mining Software Repositories (MSR 2010), May 2010, pp. 31–41. https://doi.org/10.1109/MSR.2010.5463279
26. A.O. Balogun et al., Impact of feature selection methods on the predictive performance of software defect prediction models: an extensive empirical study. Symmetry (Basel) 12(7) (2020). https://doi.org/10.3390/sym12071147
27. J. Nayak, P.S. Kumar, D.K. Reddy, B. Naik, Identification and classification of hepatitis C virus: an advance machine-learning-based approach, in Blockchain and Machine Learning for e-Healthcare Systems (Institution of Engineering and Technology, 2020), pp. 393–415
28. T. Yu, C.-Y. Huang, N.C. Fang, Use of deep learning model with attention mechanism for software fault prediction, in International Conference on Dependable Systems and Their Applications (2021), pp. 161–171. https://doi.org/10.1109/DSA52907.2021.00025
29. Y. Sun, X.Y. Jing, F. Wu, Y. Sun, Manifold embedded distribution adaptation for cross-project defect prediction. IET Softw. 14(7), 825–838 (2020). https://doi.org/10.1049/iet-sen.2019.0389
A Three-Level Active NPC Inverter Open-Circuit Fault Diagnosis Using SVM and ANN

P. Selvakumar and G. Muthukumaran
Abstract This paper proposes a combination of Support Vector Machine (SVM) and Artificial Neural Network (ANN) for the diagnosis of single-battery and inverter-switch faults of a three-level active neutral-point clamped (ANPC) inverter. A 3L-ANPC inverter is capable of retaining controllability of the EV's power train and need not halt even after the occurrence of a fault. Hence, an efficient fault diagnosis methodology is required, in which the battery fault is identified by an SVM, a machine learning model trained on sets of labeled data for regression and classification problems. Finally, when a fault arises in the ANPC inverter, the location of the faulty switch is identified by an ANN; determining the weights and thresholds of the ANN reduces the training time while increasing efficiency and accuracy.

Keywords ANPC inverter · EV · SVM · ANN · Fault diagnosis
P. Selvakumar (B) · G. Muthukumaran
Department of Electrical and Electronics Engineering, School of Electrical Sciences, Hindustan Institute of Technology and Science, Chennai 603 103, India
e-mail: [email protected]
G. Muthukumaran
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_33

Nomenclature

NPC: Neutral-Point Clamped Inverter
3L-NPC: Three-Level Neutral-Point Clamped Inverter
SVM: Space Vector Modulation
ANPC: Active Neutral-Point Clamped Inverter
MLI: Multi-Level Inverters
IPMSM: Interior Permanent Magnet Synchronous Machine
NPPF: Neutral-Point Potential Fluctuation
JADE: Joint Approximative Diagonalization of Eigen matrix
ICA: Independent Component Analysis
NN: Neural Network
CBM: Carrier-Based Modulation
SVPWM: Space Vector Pulse Width Modulation
IZSV: Injected Zero Sequence Voltage
DPWM: Discontinuous PWM
PMSM: Permanent Magnet Synchronous Motor
PWM: Pulse Width Modulation
CB-PWM: Carrier-Based PWM
SHE: Selective Harmonic Elimination
SiC: Silicon-Carbide
IGBT: Insulated Gate Bipolar Transistor
SVM: Support Vector Machine
ANN: Artificial Neural Network
1 Introduction

The development of electric vehicles (EVs) offers many opportunities and challenges for the development of traction motor drives, resulting in increased fuel efficiency, lower emissions, and better vehicle performance. Normally, a three-phase NPC-MLI is connected to a high-voltage DC source followed by series-connected capacitors, with many benefits as well as drawbacks [1–3]. To improve reliability, the NPC inverter uses IGBTs as switching devices; if one of the IGBTs fails, the inverter stops functioning. The faults can be classified into short-circuit faults and open-circuit faults [4, 5]. In [6], SiC switches are used instead of Si-based switches, which reduces losses significantly, or a three-level inverter is used, which reduces overall converter losses by about 60% compared to a two-level inverter at higher switching frequencies. Furthermore, fault detection that requires additional switching injection burdens the circuit with complexity [7]. There are three types of PWM for three-level NPC converters: CB-PWM, SVPWM, and SHE methods, of which CB-PWM is the most effective, generating the switches' duty ratios directly from the reference voltage vector [7, 8]. The implementation of CB-PWM is therefore very simple, but it has a major demerit: the switches' power losses due to the various switching operations reduce system efficiency [9]. Likewise, a single-phase 3L-NPC operating in rail vehicles may cause power devices to break down [10], and to ensure safety and reliability, a battery management system has been proposed [11]. During the engine start-up process, the electric starter generator accelerates to high speed, so a 3L-NPC was used because of its reduced switching loss [12], high switching frequency, and improved power quality [13, 14]. Moreover, the high switching frequency minimizes the magnetic component size
well as torque ripple. The insertion of switching signals’ dead time and the nonideal property associated with the two DC-link capacitances will cause neutral-point voltage oscillation. Hence, to address this issue, a carrier-assisted SVPWM technique is developed [15, 16]. The major contribution of this research work is as follows: • Introduces two machine learning models in which an SVM model for identifying the battery fault and an ANN model for determining the location of open-circuit switch fault. • Carries out performance analysis by comparing the presented model using distinct testing ratios for diagnostic accuracy rates and so on. The following is the layout of the paper: The reviews can be seen in Sect. 2. Section 3 portrays the system model of an three-level active NPC inverter under fault condition. Section 4 portrays open-circuit fault identification and localization using SVM and ANN, Sect. 6 portrays the results, and the work is concluded by Sect. 7.
2 Literature Review

2.1 Related Works

In 2019, Kersten et al. [17] established an approach that focuses on the detection of an inverter switch fault in an ANPC. It uses a current estimator for the fault detection algorithm and two fault localization algorithms using an adapted SVM, which helps to drive the vehicle with limited maximum power. Moreover, a fault in a switch or diode of the ANPC inverter's clamping path loses current controllability by affecting the current commutation. For this reason, no additional hardware is needed for the implementation of NPC or ANPC inverter-driven vehicles.

In 2019, Choudhury and Pillay [18] developed a novel virtual space vector aided DC-link capacitor voltage balancing topology for traction drive applications. Mainly for traction applications, an IPMSM is connected to the system. For the nearby voltage vectors, the duty cycles are calculated, which helps to reduce the NPPF and thus the torque pulsation.

In 2020, Hu et al. [19] presented a new fault feature extraction approach based on JADE-ICA for NPC inverters. By obtaining three-phase voltage fault characteristics, an NN method is proposed for fault diagnosis. Moreover, JADE-ICA can overcome the effects of nonlinearity and time difference, reducing the time taken for training the NN and hence improving accuracy.

In 2017, Wu et al. [20] suggested a new hybrid PWM for a 3L-NPC converter with uneven DC-links and an unsymmetrical control strategy. The proposed methodology is a blend of the SVPWM and CBM techniques. This work utilizes the CBM with IZSV, which helps generate the pulses exactly without computing the duty cycles of the used voltage vectors.
In 2018, Mukherjee et al. [21] presented a flexible discontinuous modulation strategy for the 3L-NPC inverter which ensures minimum switching losses across power factor and modulation depth. To mitigate the DC-link capacitor voltage imbalance, a hybrid voltage balancing approach was proposed. Hence, the proposed strategy generates a uniform neutral current, which results in a predictable voltage balancing strategy covering SPWM and DPWM.
3 System Model of an ANPC Inverter Under Fault Condition

3.1 Neutral-Point Connected ANPC Inverter

Each phase leg of the ANPC inverter is operated with one of the three switching states $S_y \in \{1, 0, -1\}$, resulting in the output voltage

$$V_{y\text{-}NP} = \frac{V_{dc}}{2} S_y \qquad (1)$$

Moreover, averaging the switching compositions over one sample duration produces the desired output voltage. In the absence of low-order harmonics, the maximal rms line voltage is generated as in Eq. (2):

$$V_{L,\mathrm{rms}} = \frac{V_{dc0} + V_{dc1}}{\sqrt{2}} = 0.707\,V_{dc} \qquad (2)$$

As a result, if one of the voltage sources fails, the maximal output voltage is halved. In the NPC inverter, the NP current is a third-harmonic current component conducted over the positive and negative DC-link rails, closing through the capacitors and the midway interconnection of the inverter. Moreover, the capacitor voltages oscillate at three times the output voltage's frequency. Because of this high voltage oscillation, the capacitor reactance is relatively significant at low frequencies and standstill conditions. Generally, the oscillation is caused by the DC capacitor unbalance, forcing the switches to endure high voltages during normal operation [17]. On the other hand, an NP connection with two battery packs eliminates voltage fluctuations in the capacitors by using the batteries to ensure that the third-harmonic current has a low-impedance path. Another significant benefit of the NP connection is that it allows the powertrain to be operated with just a solo battery or after a solo switch failure, known as "limp home" mode.
3.2 Three-Level Active NPC Inverter

In comparison with the 3L-NPC inverter, the 3L-ANPC inverter has extra active anti-parallel switches instead of clamping diodes, connected to a common neutral point. The 3L-ANPC inverter circuit topology is shown in Fig. 1. The source voltage V_DC comprises two serially connected secondary sources. Moreover, the structure of the 3L-ANPC has six two-way switches, each of which can supply half of V_DC, i.e., V_DC/2 [22]. Furthermore, the active switches are made of MOSFETs, and the 3L-ANPC inverter uniformly distributes the losses and device junction temperatures through zero switching states [23]. Therefore, the three-level ANPC inverter can switch either positive or negative phase current through the upper neutral path or the lower neutral path. Each phase of the ANPC inverter consists of six switches. Among the various switching methods, a relatively simple procedure has been chosen to reduce complexity. The upper half switches Sw1, Sw2, and Sw5 are active during the positive half cycle, and the lower half switches Sw4, Sw6, and Sw3 are active over the negative half cycle [24]. Moreover, Sw2 and Sw3 are slow switches that connect the inductor, in the positive and negative half cycles respectively, to either the upper high-frequency switching pair Sw1, Sw5 or the lower pair Sw4, Sw6. During its respective half cycle, each high-frequency switch pair operates as a synchronous buck converter [25].
Fig. 1 Circuit topology of 3L-ANPC inverter (dual battery with V_DC1 = V_DC2 = V_DC/2 and NP connection; MOSFET switches Sw1–Sw6; phase currents Ia, Ib, Ic)
The operation of 3L-ANPC inverters with a faulty device is investigated in this paper, and fault diagnosis strategies are proposed for continuous operation of the inverter and drive system under open-circuit failure conditions. Moreover, an ANPC inverter is capable of retaining power train control of the EV and need not halt even after the occurrence of a fault. Hence, an efficient fault diagnosis methodology is required, in which the battery fault is identified by an SVM and the inverter switch fault by an ANN; determining the weights and thresholds of the neural network reduces the training time while increasing efficiency and accuracy. Additional analysis reveals that 3L-ANPC inverters can continue to operate even if failures occur in various devices in one or more phases at the same time. As a result, the reliability and robustness of the inverters and electrical drives are greatly improved.
4 Open-Circuit Fault Identification and Localization Using SVM and ANN

4.1 Detection of Battery Fault Using SVM

After extracting the signal's feature information, the battery fault can be identified by an SVM classifier, a machine learning algorithm trained on sets of labeled data for regression and classification problems [24]. Due to its robustness and good judgment capability, SVM has been widely used in classification and regression problems. The main objective of SVM is to create a decision surface with a hyperplane that maximizes the separation distance between the data samples of two classes. In this proposed work, SVM is used for identifying the battery fault; the dataset is split into a training set and a testing set. The SVM is then trained using the training data, resulting in the trained model. Eventually, the model predicts the labels for the testing samples, which are then compared to the actual labels to determine the rate of correct fault diagnosis. Single-battery or unsymmetrical voltage operation may result from battery problems or charge imbalances. The fault could be caused by an external short circuit of a single battery, the batteries' distinct SOC/SOH, and so on. The battery management system should detect external short-circuit problems in any of its battery packs and trip the relevant battery relay. As a result, the power train can only run at half of its rated power because of the limited output voltage. Figure 2a and b shows the identification of the battery fault using SVM based on the phase current and speed of the ANPC inverter. The open-circuit battery fault is seen to occur after t = 1.6 s. Because of the significant distortion in the three-phase current, instantaneous fault identification can easily be achieved.
Fig. 2 Detection of battery fault of the ANPC inverter using SVM, a phase current, b speed
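The authors implement the diagnosis in MATLAB/Simulink; purely as an illustrative sketch of the train/test/score workflow just described, an equivalent Python version could read as follows (the feature and label arrays are hypothetical random stand-ins for features extracted from the simulated phase currents and speed).

```python
# Illustrative sketch of the SVM battery-fault classifier workflow:
# split labeled data, train, predict, and score. X is a hypothetical
# stand-in for extracted current/speed features; y: 0 = healthy,
# 1 = battery fault.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = (X[:, 0] + 0.1 * rng.normal(size=200) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = SVC(kernel="rbf").fit(X_tr, y_tr)   # hyperplane decision surface
print("diagnosis accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```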
4.2 Localization of Inverter Switch Fault Using ANN

Inverter open-circuit failures are more difficult to identify and locate than short-circuit problems. Furthermore, it is necessary to differentiate between an open-circuit fault caused by a switch failure and one caused by the failure of the control circuit [26]. It is important to note that an open-circuit condition caused by a switch breakdown results in overvoltage and most probably affects the circuit. Nonetheless, the drive circuit is the source of the majority of open-circuit faults [27, 28]. Therefore, open-circuit failures in semiconductor devices such as clamping diodes or MOSFETs and in the control circuitry have to be identified precisely to avoid these issues. If the average phase-current condition is matched by a response, then every conducting switch is in good condition and the case response is graded as "1"; otherwise it is graded as "0". The fault is localized using a blend of the six distinct responses, depending on the results, as described in Table 1. The 3L-ANPC inverter has six switching states, represented as 1, 2, 3, 4, 5, and 6.

Table 1 Switching state of 3L-ANPC inverter

Switching states | Switching sequence: Sw1 | Sw2 | Sw3 | Sw4 | Sw5 | Sw6
1 | 0 | 1 | 1 | 1 | 1 | 1
2 | 1 | 0 | 1 | 1 | 1 | 1
3 | 1 | 1 | 0 | 1 | 1 | 1
4 | 1 | 1 | 1 | 0 | 1 | 1
5 | 1 | 1 | 1 | 1 | 0 | 1
6 | 1 | 1 | 1 | 1 | 1 | 0
In this proposed work, when the open-circuit fault occurs at switch 1, it is represented by "0" and all other switches by "1". The same procedure is then repeated for the next five switches; Table 1 shows the resulting switching states of the 3L-ANPC inverter. For localizing the open-circuit switch fault, an ANN classifier is proposed. Although ANNs are fundamentally different from biological networks, an ANN is a type of nonlinear processing system that is well suited to a variety of tasks for which no existing solution is available. Using a training approach and sample data, an ANN can be trained to address specific problems; hence, precisely built ANNs can be utilized to accomplish various tasks based on the training acquired. With appropriate training, an ANN can generalize or recognize similarities between different input patterns, especially those that have been distorted by noise. ANNs are taught to match a function by exposing them to a diverse set of input/output patterns of operation. For all given training patterns, the backpropagation training technique with input data samples can predict the location of a switch fault by determining the weights and thresholds of the neural network, which leads to a low training time with an increase in efficiency and accuracy. Thus, the gap between the actual output and the desired output is minimized.
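Again as an illustrative sketch rather than the authors' MATLAB implementation, a backpropagation-trained multi-layer perceptron for the six-class switch-localization task could be set up as below; the feature data are hypothetical random placeholders.

```python
# Illustrative sketch of a backpropagation-trained MLP that maps
# phase-current features to the faulty-switch class (1..6, per Table 1).
# The training data here are hypothetical random placeholders.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(600, 8))        # stand-in current-derived features
y = rng.integers(1, 7, size=600)     # faulty switch labels Sw1..Sw6

ann = MLPClassifier(hidden_layer_sizes=(20,), max_iter=2000, random_state=1)
ann.fit(X, y)                        # backpropagation adjusts the weights
print("predicted faulty switch:", ann.predict(X[:1]))
```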
4.3 Open-Circuit Fault at Sw1

In this case, as shown in Fig. 3a and b, the open-circuit fault at Sw1 arises at t = 1 s. An open circuit was considered at switch Sw1 because, by itself, it would have no impact on the inverter's operation. In current-regulated motor or grid-feeding inverter applications, a current estimator is used for finding open-circuit failures. Loop shaping is widely utilized while developing the controller for the direct and quadrature current values of an interior PMSM.
Fig. 3 Localization of switch fault of the ANPC inverter using ANN, a phase current, b speed
From Fig. 3a, it is seen that during the time interval 1 to 1.2 s a distortion occurs in the phase current. In Fig. 3b, the fault reaches the steady state at t = 1 s at a speed of 40 rad/s. The current of the faulty phase is negative at the start of the fault, so the fault cannot be identified until the current is positive, and the situation becomes even worse. Therefore, the training and testing are performed to determine the location of the switch fault, and the result indicates faulty switch Sw1, with the error occurring in the time interval 1–1.2 s. Figure 3a and b shows the detection of the switch fault using ANN based on the phase current and speed of the ANPC inverter.
5 Simulation Setup

The presented model was executed in MATLAB, and the simulation was done for fault identification and fault localization of a three-level active NPC inverter using SVM and ANN classifiers. The synthetic dataset is collected using the MATLAB/Simulink model of the 3L-ANPC for EV. The performance analysis was carried out based on accuracy, precision, specificity, and so on.
6 Performance Analysis

The performance of the proposed work with the SVM is depicted in Table 2. Observing the outcomes at the 15% testing ratio, the accuracy is 0.745, sensitivity is 1, specificity is 0.5, precision is 0.66667, FPR is 0.5, F1_score is 0.8, MCC is 0.57735, FNR is zero, NPV is 0.5, and FDR is 0.33333. Similarly, at 30%, the accuracy, sensitivity, specificity, precision, FPR, F1_score, MCC, FNR, NPV, and FDR are 0.75, 1, 0.5, 0.66667, 0.5, 0.8, 0.57735, 0, 0.5, and 0.33333, respectively. Table 3 shows the performance of the suggested work in terms of the Artificial Neural Network. At 50%, the performance of the proposed method in terms of accuracy, sensitivity, specificity, precision, FPR, F1_score, MCC, FNR, NPV, and FDR is 0.83333, 0.88889, 0.93333, 0.83333, 0.066667, 0.82222, 0.77985, 0.11111, 0.93333, and 0.16667, respectively. Similarly, at 80%, the corresponding values are 0.9, 0.93333, 0.97778, 0.9, 0.022222, 0.89333, 0.88609, 0.066667, 0.97778, and 0.1, respectively.
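For reference, the tabulated measures follow the standard confusion-matrix definitions; the small helper below is an illustrative aid (not from the paper), with hypothetical counts chosen to reproduce the sensitivity 1, specificity 0.5, precision 0.66667, and F1 0.8 pattern of the low-ratio SVM rows.

```python
# Standard confusion-matrix definitions of the tabulated measures, shown
# with hypothetical counts consistent with sensitivity 1, specificity 0.5.
def rates(tp, fp, tn, fn):
    sensitivity = tp / (tp + fn)          # true positive rate (1 - FNR)
    specificity = tn / (tn + fp)          # true negative rate (1 - FPR)
    precision   = tp / (tp + fp)          # 1 - FDR
    accuracy    = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return sensitivity, specificity, precision, accuracy, f1

print(rates(tp=10, fp=5, tn=5, fn=0))  # -> (1.0, 0.5, 0.667, 0.75, 0.8)
```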
7 Conclusion

In this proposed work, the combination of SVM and ANN classifiers is used for the detection and localization of open-circuit faults in a 3L-ANPC inverter.
Table 2 Overall performance analysis of support vector machine

Testing ratio (%) | Accuracy | Sensitivity | Specificity | Precision | FPR | F1_score | MCC | FNR | NPV | FDR
15 | 0.745 | 1 | 0.5 | 0.66667 | 0.5 | 0.8 | 0.57735 | 0 | 0.5 | 0.33333
30 | 0.75 | 1 | 0.5 | 0.66667 | 0.5 | 0.8 | 0.57735 | 0 | 0.5 | 0.33333
45 | 0.83333 | 1 | 0.5 | 0.8 | 0.5 | 0.88889 | 0.63246 | 0 | 0.5 | 0.2
60 | 0.875 | 1 | 0.5 | 0.85714 | 0.5 | 0.92308 | 0.65465 | 0 | 0.5 | 0.14286
70 | 0.9 | 1 | 0.5 | 0.88889 | 0.5 | 0.94118 | 0.66667 | 0 | 0.5 | 0.11111
85 | 0.91667 | 1 | 0.5 | 0.90909 | 0.5 | 0.95238 | 0.6742 | 0 | 0.5 | 0.09091
100 | 0.92857 | 1 | 0.5 | 0.92308 | 0.5 | 0.96 | 0.67937 | 0 | 0.5 | 0.07692
Table 3 Overall performance analysis of neural network

Testing ratio (%) | Accuracy | Sensitivity | Specificity | Precision | FPR | F1_score | MCC | FNR | NPV | FDR
15 | 0.5 | 1 | 0 | 0.5 | 1 | 0.66667 | 0.05964 | 0 | 0 | 0.5
30 | 0.75 | 1 | 0.67 | 0.5 | 0.33 | 0.66667 | 0.57735 | 0 | 0.7 | 0.5
50 | 0.8333 | 0.89 | 0.93 | 0.8333 | 0.07 | 0.82222 | 0.77985 | 0.1 | 0.9 | 0.1666
65 | 0.875 | 0.92 | 0.96 | 0.875 | 0.04 | 0.86667 | 0.85 | 0.1 | 1 | 0.125
80 | 0.9 | 0.93 | 0.98 | 0.9 | 0.02 | 0.89333 | 0.88609 | 0.1 | 1 | 0.1
100 | 0.8333 | 0.78 | 0.97 | 0.8475 | 0.03 | 0.72063 | 0.81137 | 0.2 | 1 | 0.1459
In the field of vehicle traction, three-level neutral-point clamped (NPC) inverters are widely used for industrial medium-voltage applications, offering redundancy with voltage reduction, low distortion, low switching stress, etc. When operating under high-current or high-temperature conditions, the operating switches may get damaged. Therefore, to detect the battery fault of an NPC inverter, an SVM classifier is proposed in which the training samples are the same as those of the NN. Furthermore, for localizing the open-circuit fault of a switch, an ANN is used, which determines the weights and thresholds of the NN and reduces the training time with an increase in efficiency and accuracy. Thus, the superiority of the presented model has been validated effectively.
References

1. A. Sheir, M.Z. Youssef, A novel power balancing technique in neutral point clamping multilevel inverters for the electric vehicle industry under distributed unbalance battery powering scheme, in 2019 IEEE Applied Power Electronics Conference and Exposition (APEC), pp. 3304–3308 (2019). https://doi.org/10.1109/APEC.2019.8722183
2. K. Kandasamy, D.M. Vilathgamuwa, K.J. Tseng, Double star chopper cell converter for battery electric vehicles with inter-module SoC balancing and fault tolerant control, in IECON 2014—40th Annual Conference of the IEEE Industrial Electronics Society (Dallas, TX, 2014), pp. 2991–2996
3. J. Rodriguez, J.-S. Lai, F.Z. Peng, Multilevel inverters: a survey of topologies, controls, and applications. IEEE Trans. Ind. Electron. 49(4), 724–738 (2002)
4. G.S. Lakshmi, O. Rubanenko, M.L. Swarupa, K. Deepika, Analysis of ANPCI & DCMLI fed to PMSM drive for electric vehicles, in 2020 IEEE India Council International Subsections Conference (INDISCON) (2020). https://doi.org/10.1109/indiscon50162.2020.00059
5. X. Wan, H. Hu, Y. Yu, Open-circuit fault diagnosis for grid-connected NPC inverter based on independent component analysis and neural network. TELKOMNIKA (Telecommun. Comput. Electron. Control) 15, 36 (2017). https://doi.org/10.12928/telkomnika.v15i1.3677
6. A. Choudhury, P. Pillay, S.S. Williamson, Modified DC-bus voltage balancing algorithm based three-level neutral point clamped (NPC) IPMSM drive for electric vehicle application, in IECON 2014—40th Annual Conference of the IEEE Industrial Electronics Society (2014). https://doi.org/10.1109/iecon.2014.7048941
7. S.-H. Kim, D.-Y. Yoo, S.-W. An, Y.-S. Park, J.-W. Lee, K.-B. Lee, Fault detection method using a convolution neural network for hybrid active neutral-point clamped inverters. IEEE Access 8, 140632–140642 (2020). https://doi.org/10.1109/ACCESS.2020.3011730
8. A. Choudhury, P. Pillay, S.S. Williamson, A hybrid-PWM based DC-link voltage balancing algorithm for a 3-level neutral-point-clamped (NPC) DC/AC traction inverter drive, in 2015 IEEE Applied Power Electronics Conference and Exposition (APEC), pp. 1347–1352 (2015). https://doi.org/10.1109/APEC.2015.7104523
9. M. Farhadi, M. Abapour, M. Sabahi, Failure analysis and reliability evaluation of modulation techniques for neutral point clamped inverters—a usage model approach. Eng. Fail. Anal. 71, 90–104 (2017). ISSN 1350-6307. https://doi.org/10.1016/j.engfailanal.2016.06.010
10. J. Lee, R. Kwak, K. Lee, Novel discontinuous PWM method for a single-phase three-level neutral point clamped inverter with efficiency improvement and harmonic reduction. IEEE Trans. Power Electron. 33(11), 9253–9266 (2018). https://doi.org/10.1109/TPEL.2018.2794547
11. X. Ge, J. Pu, B. Gou, Y. Liu, An open-circuit fault diagnosis approach for single-phase three-level neutral-point-clamped converters. IEEE Trans. Power Electron. 33(3), 2559–2570 (2018). https://doi.org/10.1109/TPEL.2017.2691804
12. A. Nabae, I. Takahashi, H. Akagi, A new neutral-point-clamped PWM inverter. IEEE Trans. Ind. Appl. 17(5), 518–523 (1981)
13. M.T. Fard, M. Abarzadeh, K.A. Noghani, J. He, K. Al-Haddad, Si/SiC hybrid 5-level active NPC inverter for electric aircraft propulsion drive applications. Chin. J. Electr. Eng. 6(4), 63–76 (2020). https://doi.org/10.23919/CJEE.2020.000031
14. Baghli, C. Delpha, D. Diallo, Hallouche, D. Mba, W. Tianzhen, Three-level NPC inverter incipient fault detection and classification using output current statistical analysis. Energies 12, 1372 (2019). https://doi.org/10.3390/en12071372
15. S. Monge, B. Bordonau, D. Boroyevich, S. Somavilla, The nearest three virtual space vector PWM—a modulation for the comprehensive neutral-point balancing in the three-level NPC inverter. IEEE Trans. Power Electron. 2(1), 11–15 (2004)
16. J. Weidong, L. Wang, J. Wang, X. Zhang, P. Wang, A carrier-based virtual space vector modulation with active neutral point voltage control for neutral point clamped three-level inverter. IEEE Trans. Ind. Electron. 65(11), 8687–8696 (2018)
17. A. Kersten et al., Fault detection and localization for limp home functionality of three-level NPC inverters with connected neutral point for electric vehicles. IEEE Trans. Transp. Electrification 5(2), 416–432 (2019). https://doi.org/10.1109/TTE.2019.2899722
18. A. Choudhury, P. Pillay, Space vector based capacitor voltage balancing for a three-level NPC traction inverter drive. IEEE J. Emerg. Sel. Top. Power Electron. 1–1 (2019). https://doi.org/10.1109/jestpe.2019.2953183
19. H. Hu, F. Feng, T. Wang, Open-circuit fault diagnosis of NPC inverter IGBT based on independent component analysis and neural network. Energy Rep. 6(Supplement 9), 134–143 (2020). ISSN 2352-4847. https://doi.org/10.1016/j.egyr.2020.11.273
20. X. Wu, G. Tan, G. Yao, C. Sun, G. Liu, A hybrid PWM strategy for three-level inverter with unbalanced DC links. IEEE J. Emerg. Sel. Top. Power Electron. 6(1), 1–15 (2018). https://doi.org/10.1109/jestpe.2017.2756999
21. S. Mukherjee, S. Kumar Giri, S. Kundu, S. Banerjee, A generalized discontinuous PWM scheme for three-level NPC traction inverter with minimum switching loss for electric vehicles. IEEE Trans. Ind. Appl. 55(1), 516–528 (2019). https://doi.org/10.1109/TIA.2018.2866565
22. D. Floricau, E. Floricau, G. Gateau, Three-level active NPC converter: PWM strategies and loss distribution, in 2008 34th Annual Conference of IEEE Industrial Electronics (2008). https://doi.org/10.1109/iecon.2008.4758494
23. R. Katebi, J. He, N. Weise, An advanced three-level active neutral-point-clamped converter with improved fault-tolerant capabilities. IEEE Trans. Power Electron. 33(8), 6897–6909 (2018). https://doi.org/10.1109/tpel.2017.2759760
24. 6.6 kW three-phase, three-level ANPC inverter/PFC bidirectional power stage reference design, in TIDUEZ0 (2021)
25. J. Li, A.Q. Huang, Z. Liang, S. Bhattacharya, Analysis and design of active NPC (ANPC) inverters for fault-tolerant operation of high-power electrical drives. IEEE Trans. Power Electron. 27(2), 519–533 (2012). https://doi.org/10.1109/tpel.2011.2143430
26. Y. Yu, S. Pei, Open-circuit fault diagnosis of neutral point clamped three-level inverter based on sparse representation. IEEE Access 1–1 (2018). https://doi.org/10.1109/access.2018.2883219
27. V. Balasubramaniam, Fault detection and diagnosis in air handling units with a novel integrated decision tree algorithm. J. Trends Comput. Sci. Smart Technol. 3(1), 49–58 (2021)
28. T. Vijayakumar, Posed inverse problem rectification using novel deep convolutional neural network. J. Innov. Image Process. (JIIP) 2(03), 121–127 (2020)
Hybrid Control Design Techniques for Aircraft Yaw and Roll Control System

A. C. Pavithra and N. V. Archana
Abstract Presently, controlling an aircraft system is quite complex due to huge variations in the number of aircraft flying in the airspace, the environment, etc. The control system and aircraft communities work hard together to stabilize and control the aircraft system under various operating conditions. Following the works of the control system community, the current paper concentrates on designing hybrid control techniques, namely (i) a switched linear control system and (ii) coordinated and uncoordinated control inputs, for the aircraft yaw and roll control system. The modelling begins with the derivation of a suitable mathematical model to describe the lateral dynamic motion of the aircraft. Later, the proposed control techniques are validated by simulation under various conditions in the MATLAB/SIMULINK platform, and the results are compared with each other and tabulated.

Keywords Linear Quadratic Regulator (LQR) · Fuzzy Logic Controller (FLC) · Linear Quadratic Gaussian (LQG) · Roll · Pitch and Yaw
1 Introduction In the real world, almost all practical control systems involve both analogue and discrete behaviour. Hybrid or switched control system is a generic term for such systems, in which the continuous and discrete parts of the dynamical system interact with each other and generate a mixed signal (a combination of continuous and discrete). Switched or hybrid control is a recent and very active research area which involves the fundamentals of both control theory and computer science. Currently, the aircraft system relies on automatic control to continuously monitor and control the various subsystems and their associated variables. Flying aircraft A. C. Pavithra (B) Electronics and Communication Department, ATMECE, Mysore, Karnataka 570 028, India e-mail: [email protected] N. V. Archana Electrical and Electronics Department, NIEIT, Mysore, Karnataka 570 028, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_34
attitude or orientation has to be controlled in three dimensions (roll, pitch and yaw) because the world is three dimensional. The aircraft control community uses the control inputs ailerons, elevator and rudder for the control of roll, pitch and yaw, respectively. The present paper concentrates on implementing hybrid control techniques along with coordinated/uncoordinated control inputs (ailerons and rudder) for the aircraft control system. Nair et al. [1] presented an advanced and intelligent control scheme for the aircraft yaw dynamics using a Linear Quadratic Regulator (LQR) and a Fuzzy Logic Controller (FLC). The steady-state error and peak overshoots of both proposed controllers are observed, and the results show that the LQR has relatively better performance compared to the FLC. An autopilot automatic aircraft control system is modelled and simulated to control the roll of the aircraft by Akyazi et al. [2]; a self-tuning FLC is designed for the developed model, and to show its effectiveness, the simulation results are compared with the LQR and FLC. Dahiya and Singh [3] developed a comparative assessment between Proportional Integral Derivative (PID), Artificial Neural Network (ANN), Fuzzy, Fuzzy-PID and Adaptive Neuro-Fuzzy Inference System (ANFIS) controllers for the roll motion of aircraft dynamics; the simulation results reveal that the combined PID-Fuzzy compensation approach provides an improved response, in terms of time-response specifications, compared to all other approaches. Chrif and Kadda [4] applied LQR and Linear Quadratic Gaussian (LQG) controllers to the lateral and longitudinal dynamics of the aircraft control system; the simulation results reveal that the LQG control provides better performance by reducing the steady-state error when a disturbance acts on the system. A controller for tracking of the aircraft when all the states are not available is developed by Lichota et al. [5] using LQR output feedback control. Ashraf et al. [6] proposed simple control techniques using linear feedback (LFB) and LQR for the lateral dynamics of the F-16 aircraft; the LFB control is designed using the pole placement technique, and for the LQR design, the weighting matrices Q and R are tuned by a trial-and-error approach. The simulation results for the state variables sideslip angle, roll rate and yaw rate are compared for the proposed controllers. Vo and Seshagiri [7] designed a sliding mode control technique for the F-16 aircraft lateral dynamics; the robustness of the proposed controller is demonstrated through simulation of the transient and steady-state performance of the aircraft system. Rasheed [8] proposed LQR and Linear Matrix Inequality (LMI) controllers for the longitudinal model of the aircraft system, and the results show that the performance and handling qualities match the reference model satisfactorily over a huge range of nonlinear flight conditions. The above literature survey indicates that the optimal control theory of LQR has been applied rigorously to aircraft systems and shows improved performance. Hence, it motivates implementing a hybrid control mechanism using optimal LQR control for the aircraft roll and yaw control of the lateral dynamics.
2 Modelling of Yaw and Roll Control System

Presently, the aircraft system has two types of dynamical equations, lateral and longitudinal, representing the dynamics of the aircraft with respect to the lateral and longitudinal axes, respectively. The state variables yaw, roll and sideslip motions come under the first category of lateral dynamics [1], whereas the longitudinal dynamics includes the pitch motion. The current section explains the modelling of the yaw control system. Figure 1 represents the control surfaces of the aircraft, and Fig. 2 represents the forces, moments and velocity components, respectively, where L, M, N are the aerodynamic moment components; p, q, r are the angular rate components of the roll, pitch and yaw axes; and u, v are the velocity components of the roll and pitch axes. The lateral equations are derived assuming that the aircraft is in steady state with constant altitude and velocity. The linearized form of the equations is as below:

$$\left(\frac{d}{dt} - Y_v\right)v - Y_p\,p + (u_0 - Y_r)\,r - (g\cos\theta_0)\,\phi = Y_{\delta_r}\,\delta_r \quad (1)$$

Fig. 1 Aircraft motions: yaw, roll and pitch
Fig. 2 Definitions of force, moments and velocity components in a body fixed frame
$$-L_v\,v + \left(\frac{d}{dt} - L_p\right)p - \left(\frac{I_{xz}}{I_x}\frac{d}{dt} + L_r\right)r = L_{\delta_a}\,\delta_a + L_{\delta_r}\,\delta_r \quad (2)$$

$$-N_v\,v - \left(\frac{I_{xz}}{I_z}\frac{d}{dt} + N_p\right)p + \left(\frac{d}{dt} - N_r\right)r = N_{\delta_a}\,\delta_a + N_{\delta_r}\,\delta_r \quad (3)$$
In the current paper, the sideslip angle β is considered, and the side velocity v is neglected. The relationship between these two quantities is:

$$\beta = \tan^{-1}(v/u_0) \approx v/u_0 \quad (4)$$
The above lateral equations are put into state-space form:

$$\dot{x}(t) = A\,x(t) + B\,u(t) \quad (5)$$

with the state and input vectors

$$x = \begin{bmatrix} \beta \\ p \\ r \\ \phi \end{bmatrix}, \qquad u = \begin{bmatrix} \delta_r \\ \delta_a \end{bmatrix}$$

$$A = \begin{bmatrix} Y_\beta/u_0 & Y_p/u_0 & -(1 - Y_r/u_0) & g\cos\theta_0/u_0 \\ L_\beta & L_p & L_r & 0 \\ N_\beta & N_p & N_r & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}$$
Table 1 The lateral dynamics stability derivatives

Quantity | Y-force derivatives | Yawing moment derivatives | Rolling moment derivatives
Sideslip angle | Yβ = −44.665 | Nβ = 4.549 | Lβ = −15.969
Rolling rate | Yp = 0 | Np = −0.349 | Lp = −8.395
Yawing rate | Yr = 0 | Nr = −0.76 | Lr = 2.19
Rudder deflection | Yδr = 12.433 | Nδr = −4.613 | Lδr = 23.09
$$B = \begin{bmatrix} Y_{\delta_r}/u_0 & Y_{\delta_a}/u_0 \\ L_{\delta_r} & L_{\delta_a} \\ N_{\delta_r} & N_{\delta_a} \\ 0 & 0 \end{bmatrix}, \qquad C = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

where δa, δr are the aileron and rudder deflections; β, φ are the sideslip and roll angles; and p, r are the roll and yaw rates. The lateral dynamic stability criteria employed in the current paper are given in Table 1 [1]. Substituting the numerical values of Table 1 into the state-space matrices yields:

$$A = \begin{bmatrix} -0.254 & 0 & -1 & 0.183 \\ -15.969 & -8.395 & 2.19 & 0 \\ 4.549 & -0.349 & -0.76 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}$$

$$B = \begin{bmatrix} B_{\delta_r} & B_{\delta_a} \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 23.09 & -28.916 \\ -4.613 & -0.224 \\ 0 & 0 \end{bmatrix}$$
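For illustration only (this sketch is not part of the original paper), the numerical model above can be simulated directly in Python; the unit rudder step, the ten-second horizon and the use of SciPy's ODE solver are assumptions:

```python
import numpy as np
from scipy.integrate import solve_ivp

# States: x = [beta (sideslip), p (roll rate), r (yaw rate), phi (roll angle)];
# inputs: u = [delta_r (rudder), delta_a (aileron)].
A = np.array([[-0.254, 0.0, -1.0, 0.183],
              [-15.969, -8.395, 2.19, 0.0],
              [4.549, -0.349, -0.76, 0.0],
              [0.0, 1.0, 0.0, 0.0]])
B = np.array([[0.0, 0.0],
              [23.09, -28.916],
              [-4.613, -0.224],
              [0.0, 0.0]])

def lateral_dynamics(t, x, u):
    # Open-loop lateral dynamics x_dot = A x + B u of Eq. (5).
    return A @ x + B @ u

u_step = np.array([1.0, 0.0])   # assumed unit step on the rudder, aileron held at zero
sol = solve_ivp(lateral_dynamics, (0.0, 10.0), np.zeros(4),
                args=(u_step,), max_step=0.01)
print(sol.y[:, -1])             # state deviations at t = 10 s
```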
2.1 Hybrid Control Techniques

In this section, the proposed hybrid control techniques are explained along with their implementation procedure, considering the following scenarios:

Scenario I: Two optimal feedback controllers, (Kα, K′α) and (Kβ, K′β), are designed for the control inputs Bδr and Bδa of the aircraft lateral dynamics, respectively, to switch between one another as follows [9–11]:

$$\dot{x}(t) = A_{\sigma(t)}\,x(t), \qquad \dot{x}(t) = \begin{cases} A_1\,x(t) & \text{if } \sigma(t) = 1 \\ A_2\,x(t) & \text{if } \sigma(t) = 2 \end{cases} \quad (6)$$

where σ(t) is the switching signal; for the rudder input, A1 = A − BδrKα and A2 = A − BδrK′α, and for the aileron input, A1 = A − BδaKβ and A2 = A − BδaK′β. The switching model is defined as

$$\dot{x}(t) = \begin{cases} A_1\,x(t) & \text{if } x^T S x \le 0 \\ A_2\,x(t) & \text{if } x^T S x > 0 \end{cases} \quad (7)$$

The switching matrix S is calculated as in [11], and the simulation results are compared with the two individual controllers (without switching). The numerical values of the feedback controller gains Kα, K′α, Kβ, K′β and the switching matrices S1 and S2 for the two control inputs Bδr (rudder deflection control input) and Bδa (aileron deflection control input), for the two state variables, namely the deviations in roll rate P and yaw rate r, are as follows:

$$K_\alpha = \begin{bmatrix} 0.2096 & 0.6711 & -0.6348 & 1.1047 \end{bmatrix}, \qquad K'_\alpha = \begin{bmatrix} 1.6591 & 2.5349 & -2.1856 & 3.5090 \end{bmatrix}$$

$$K_\beta = \begin{bmatrix} 0.2389 & -0.7874 & 0.1129 & -1.0397 \end{bmatrix}, \qquad K'_\beta = \begin{bmatrix} -0.1356 & -2.9276 & 0.5486 & -3.3015 \end{bmatrix}$$
$$S_1(P) = \begin{bmatrix} -0.3930 & 0.3540 & 0.1593 & -0.3091 \\ 0.3540 & 1.5600 & -0.7145 & 1.0279 \\ 0.1593 & -0.7145 & 0.1090 & -0.1025 \\ -0.3091 & 1.0279 & -0.1025 & 0.0559 \end{bmatrix}$$

$$S_1(r) = \begin{bmatrix} 0.3101 & 0.1644 & -0.6693 & 0.3590 \\ 0.1644 & -0.0899 & -0.6099 & 0.0729 \\ -0.6693 & -0.6099 & 1.0772 & -0.9439 \\ 0.3590 & 0.0729 & -0.9439 & 0.3377 \end{bmatrix}$$

$$S_2(P) = \begin{bmatrix} -0.1092 & -0.1384 & 0.0803 & -0.3312 \\ -0.1384 & 1.9848 & -0.1035 & 1.0406 \\ 0.0803 & -0.1035 & -0.0401 & 0.1058 \\ -0.3312 & 1.0406 & 0.1058 & -0.0173 \end{bmatrix}$$

$$S_2(r) = \begin{bmatrix} 0.0960 & 0.2757 & -0.1122 & 0.3041 \\ 0.2757 & 0.0162 & -0.3238 & 0.0896 \\ -0.1122 & -0.3238 & -0.1312 & -0.3569 \\ 0.3041 & 0.0896 & -0.3569 & 0.1713 \end{bmatrix}$$
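As an illustration of how the switching law (7) operates (this is not the paper's simulation code), the sketch below runs the rudder-channel switched system with the gains Kα, K′α and the matrix S1(P) above; the initial condition, the step size and the forward-Euler integration are assumptions:

```python
import numpy as np

A = np.array([[-0.254, 0.0, -1.0, 0.183],
              [-15.969, -8.395, 2.19, 0.0],
              [4.549, -0.349, -0.76, 0.0],
              [0.0, 1.0, 0.0, 0.0]])
b_dr = np.array([0.0, 23.09, -4.613, 0.0])          # rudder column of B

K1 = np.array([0.2096, 0.6711, -0.6348, 1.1047])    # K_alpha
K2 = np.array([1.6591, 2.5349, -2.1856, 3.5090])    # K'_alpha
A1 = A - np.outer(b_dr, K1)                         # closed loop under K_alpha
A2 = A - np.outer(b_dr, K2)                         # closed loop under K'_alpha

S = np.array([[-0.3930, 0.3540, 0.1593, -0.3091],
              [0.3540, 1.5600, -0.7145, 1.0279],
              [0.1593, -0.7145, 0.1090, -0.1025],
              [-0.3091, 1.0279, -0.1025, 0.0559]])  # S1(P) from above

x = np.array([0.1, 0.0, 0.0, 0.0])   # assumed initial sideslip perturbation
dt, J = 1e-3, 0.0
for _ in range(int(10.0 / dt)):
    Ai = A1 if x @ S @ x <= 0 else A2   # switching law of Eq. (7)
    x = x + dt * (Ai @ x)               # forward-Euler integration step
    J += dt * x[1] ** 2                 # accumulate output energy of roll rate P
print(J)
```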
Scenario II: The aircraft industry uses the control inputs elevator, rudder and aileron deflection for pitch, yaw and roll control, respectively. For the current research, the two control inputs rudder (δr) and aileron (δa) are considered. In this subsection, the two control inputs δr and δa are coordinated by designing an optimal LQR control as follows:

$$B_\gamma = \begin{bmatrix} B_{\delta_r} & B_{\delta_a} \end{bmatrix}, \qquad A_\gamma = A - B_\gamma K_\gamma$$

The simulation results are compared with the uncoordinated control inputs Bδr and Bδa. The numerical values of the feedback controller gains Kγ, Kδr and Kδa for the proposed scenario are as follows:

$$K_\gamma = \begin{bmatrix} 0.1207 & 0.4719 & -0.8096 & 0.5393 \\ 0.2485 & -0.6657 & -0.4132 & -0.8653 \end{bmatrix}$$

$$K_{\delta_r} = \begin{bmatrix} 0.2096 & 0.6711 & -0.6348 & 1.1047 \end{bmatrix}, \qquad K_{\delta_a} = \begin{bmatrix} 0.2389 & -0.7874 & 0.1129 & -1.0397 \end{bmatrix}$$
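A gain of the same form as Kγ can be reproduced in spirit with a standard LQR computation; since the paper does not state its weighting matrices, the identity weights Q and R below are assumptions:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[-0.254, 0.0, -1.0, 0.183],
              [-15.969, -8.395, 2.19, 0.0],
              [4.549, -0.349, -0.76, 0.0],
              [0.0, 1.0, 0.0, 0.0]])
B_gamma = np.array([[0.0, 0.0],
                    [23.09, -28.916],
                    [-4.613, -0.224],
                    [0.0, 0.0]])        # [B_dr  B_da], coordinated inputs

Q, R = np.eye(4), np.eye(2)             # assumed LQR weights (not from the paper)
P = solve_continuous_are(A, B_gamma, Q, R)
K_gamma = np.linalg.solve(R, B_gamma.T @ P)   # optimal feedback, u = -K_gamma x
A_cl = A - B_gamma @ K_gamma                  # coordinated closed-loop matrix
print(np.round(K_gamma, 4))
```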
2.2 Simulation Results

The MATLAB/SIMULINK simulations are presented in this section for the two proposed scenarios I and II, for the state-variable deviations in roll and yaw rates, P and r. The digital simulations are carried out for two conditions: (i) by applying initial conditions and (ii) by applying a reference input to the proposed system, using the legends of Figs. 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13 and 14. Figures 3, 4, 5, 6, 7, 8, 9 and 10 show the Scenario I responses for the initial conditions and for a reference step input, respectively. The comparison of the performance index $J = \int_0^\infty y^2\,dt$ for the proposed scenarios is tabulated for the simulated conditions (initial conditions and reference input) in Tables 2, 3, 4, 5, 6 and 7.
Fig. 3 Deviation in roll rate for the control input δr
Fig. 4 Deviation in yaw rate for the control input δr
Fig. 5 Deviation in roll rate for the control input δa
Fig. 6 Deviation in yaw rate for the control input δa
Fig. 7 Deviation in roll rate for the control input δr
3 Discussion

Figures 3 and 4 show the responses of the proposed Scenario I controllers, under initial conditions, for the state variables P and r with the control input Bδr, followed
Fig. 8 Deviation in yaw rate for the control input δr
Fig. 9 Deviation in roll rate for the control input δa
Fig. 10 Deviation in yaw rate for the control input δa
by the same responses in Figs. 5 and 6 for the control input Bδa. The responses to the reference inputs (Scenario I) for the control inputs Bδr and Bδa are plotted in Figs. 7, 8, 9 and 10. The Scenario II responses for both initial conditions and reference input are depicted in Figs. 11, 12, 13 and 14. For clarity, the plotted figures (Figs. 3,
Fig. 11 Deviation in roll rate for Scenario II
Fig. 12 Deviation in yaw rate for Scenario II
Fig. 13 Deviation in roll rate for Scenario II
4, 5, 6, 7, 8, 9, 10, 11, 12, 13 and 14) are listed in Table 8 with respect to scenario, control input and simulation condition (initial conditions/step input).
Fig. 14 Deviation in yaw rate for Scenario II

Table 2 Scenario I: $J = \int_0^\infty y^2\,dt$ for Bδr (initial conditions)

Proposed controllers | P | r
BδrKα | 0.02248 | 0.2112
BδrK′α | 0.0111 | 0.2011
Switch BδrKα/BδrK′α | 0.01097 | 0.1948

Table 3 Scenario I: $J = \int_0^\infty y^2\,dt$ for Bδr (step input)

Proposed controllers | P | r
BδrKα | 0.2114 | 0.2155
BδrK′α | 0.02439 | 0.02489
Switch BδrKα/BδrK′α | 0.0241 | 0.02489

Table 4 Scenario I: $J = \int_0^\infty y^2\,dt$ for Bδa (initial conditions)

Proposed controllers | P | r
BδaKβ | 0.01602 | 0.4519
BδaK′β | 0.005358 | 0.4515
Switch BδaKβ/BδaK′β | 0.005356 | 0.4486

Table 5 Scenario I: $J = \int_0^\infty y^2\,dt$ for Bδa (step input)

Proposed controllers | P | r
BδaKβ | 0.4413 | 0.2338
BδaK′β | 0.04689 | 0.02377
Switch BδaKβ/BδaK′β | 0.04517 | 0.02377

Table 6 Scenario II: $J = \int_0^\infty y^2\,dt$ (initial conditions)

Proposed controllers | P | r
BδrKα | 0.02263 | 0.2112
BδaKβ | 0.01602 | 0.4504
BδrKα + BδaKβ | 0.01303 | 0.1113

Table 7 Scenario II: $J = \int_0^\infty y^2\,dt$ (step input)

Proposed controllers | P | r
BδrKα | 0.2242 | 0.3553
BδaKβ | 0.4413 | 0.2338
BδrKα + BδaKβ | 0.1065 | 1.527
Scenario I: From Figs. 3, 4, 5 and 6 and Tables 2, 3, 4 and 5, it can be inferred that the response of the switched system (switching between two feedback controllers), for both control inputs Bδr and Bδa and with respect to the state variables P and r, provides better performance compared to the individual feedback controllers (Kα, K′α) and (Kβ, K′β). The performance index J (output energy) is also minimized for the system employing switching between the two feedback controllers compared to the system without switching (individual feedback controllers).

Scenario II: The response of the coordinated aircraft yaw and roll control system for the two control inputs, rudder δr and aileron δa, is showcased in Figs. 11, 12, 13 and 14 and compared against the individual control inputs. The simulation results show that the proposed coordination between the two control inputs (BδrKδr + BδaKδa) provides better performance than either individual control input with respect to the prominent control-system parameters peak overshoot and settling time. The performance index $J = \int_0^\infty y^2\,dt$, tabulated in Tables 6 and 7, serves as evidence for the proposed coordinated system compared with the individual control inputs; the list of plotted figures is given in Table 8.

Table 8 List of plotted figures

Figure | Scenario | Control input | State variable | Response
3 | I | Bδr | P | Initial condition
4 | I | Bδr | r | Initial condition
5 | I | Bδa | P | Initial condition
6 | I | Bδa | r | Initial condition
7 | I | Bδr | P | Step input
8 | I | Bδr | r | Step input
9 | I | Bδa | P | Step input
10 | I | Bδa | r | Step input
11 | II | Bδr + Bδa | P | Initial condition
12 | II | Bδr + Bδa | r | Initial condition
13 | II | Bδr + Bδa | P | Step input
14 | II | Bδr + Bδa | r | Step input
4 Conclusion

Hybrid control techniques are implemented for the aircraft yaw and roll control system to improve the performance of the roll- and yaw-rate deviations with respect to prominent control-system parameters such as peak overshoot and settling time. The proposed control is simulated under two scenarios: one simulation set-up with uncoordinated (aileron and rudder) control inputs under various switching and non-switching conditions, and another simulation set-up with coordinated control inputs. The simulation results of all the proposed scenarios are compared with their existing counterparts, namely the individual feedback control inputs (without switching) and the uncoordinated aileron and rudder control inputs. The performance index $J = \int_0^\infty y^2\,dt$ is also tabulated for the state-variable deviations in yaw and roll rate, for both the initial-condition and step-input simulation conditions. The simulation results and tabulations conclude that the proposed controllers have better output performance compared to the individual (without switching) feedback controllers and the uncoordinated control inputs.
References

1. V.G. Nair, M.V. Dileep, V.I. George, Aircraft yaw control system using LQR and fuzzy logic controller. Int. J. Comput. Appl. 45(9) (2012). ISSN: 0975-8887
2. O. Akyazi, M. Ali Usta, A self-tuning fuzzy logic controller for aircraft roll control system. Int. J. Control Sci. Eng. 2(6), 181–188 (2012)
3. R. Dahiya, A.K. Singh, Performance analysis of control techniques for roll movement of aircraft. Int. J. Eng. Comput. Sci. 5(11), 19212–19226 (2016). ISSN: 2319-7242
4. L. Chrif, Z.M. Kadda, Aircraft control system using LQG and LQR controller with optimal estimation-Kalman filter design, in 3rd International Symposium on Aircraft Airworthiness. Proc. Eng. 80, 245–257 (2014)
5. P. Lichota, F. Dul, A. Karbowski, System identification and LQR controller design with incomplete state observation for aircraft trajectory tracking. Energies 13(5354), 1–27 (2020)
6. A. Ashraf, W. Mei, L. Gaoyuan, Z. Anjum, Design linear feedback and LQR controller for lateral flight dynamics of F-16 aircraft, in International Conference on Control, Automation and Information Sciences (ICCAIS), pp. 367–371 (2018)
7. H. Vo, S. Seshagiri, Robust control of F-16 lateral dynamics, in 34th Annual Conference of IEEE Industrial Electronics, pp. 343–348 (2008)
8. A. Rasheed, LQR and LMI based optimal control design for aircraft. J. Space Technol. 7(1), 97–103 (2017)
9. L. Yathisha, K. Davoodi, S. Patil Kulkarni, Optimal switching control strategy for UPFC for wide range of operating conditions in power system, in 3rd Indian Control Conference (Indian Institute of Technology (IIT), Guwahati, 2017), pp. 225–232. https://doi.org/10.1109/INDIANCC.2017.7846479
10. L. Yathisha, S. Patil Kulkarni, Application and comparison of switching control algorithms for power system stabilizer, in IEEE International Conference on Industrial Instrumentation and Control (ICIC) (Pune, 2015), pp. 1300–1305
11. L. Yathisha, S. Patil Kulkarni, LQR & LQG based optimal switching techniques for PSS and UPFC in power systems. Control Theory Technol. 16(1), 25–37 (2018)
12. S. Shankar, K.T. Veeramanju, L. Yathisha, Multi-stage switching control of multi-LQR's for STATCOM operating over wide range of operating conditions in power system. Int. J. Recent Technol. Eng. 7(6s), 371–379 (2019). ISSN: 2277-3878
13. L. Yathisha, S. Patil Kulkarni, Optimum LQR switching approach for the improvement of STATCOM performance. Springer LNEE 150, 259–266 (2013). https://doi.org/10.1007/978-1-4614-3363-7_28
14. J.L. Aravena, L. Devarakonda, Performance driven switching control, in IEEE International Symposium on Industrial Electronics (2006). https://doi.org/10.1109/ISIE.2006.295564
15. Z.J. Wang, S.J. Guo, W. Li, Modelling, simulation and optimal control for an aircraft of aileronless folding wing. WSEAS Trans. Syst. Control 3(10), 869–878 (2008)
A Review of the Techniques and Evaluation Parameters for Recommendation Systems S. Vijaya Shetty, Khush Dassani, G. P. Harish Gowda, H. Sarojadevi, P. Hariprasad Reddy, and Sehaj Jot Singh
Abstract The explosion of the Internet and the boom of social media have enabled many organizations to successfully collect huge amounts of data about their customers. These data are among the most valuable assets an organization can own at the present time, and organizations can use them to generate very high profits. One can gather insights from these data by using various mining techniques, and based on these insights an organization can make interesting recommendations to its customer base, which can bring in more sales and profits. In this paper, we provide a comparative analysis of some of the most successful techniques that are used to build such recommendation systems, an overview of the techniques and evaluation parameters that can be used to evaluate recommender systems, and a brief account of some of the most common problems faced in building a recommender system. We also include information about possible modifications of these techniques for improved performance. Keywords Recommendation systems · Techniques for building recommendation systems · Evaluation metrics · Matrix factorization · Deep learning
1 Introduction Recommendation systems are beneficial tools for organizations wanting to gain a financial and competitive edge over their rivals using the large amounts of data gathered from their customers. Many large companies like Amazon, Netflix, and YouTube have invested heavily in building their own recommendation systems, which have helped them become the giants we know them as today. Recommendation systems not only help financially, but also help in catering to a company's customer base more S. Vijaya Shetty (B) · K. Dassani · G. P. Harish Gowda · H. Sarojadevi · P. Hariprasad Reddy · S. J. Singh Nitte Meenakshi Institute of Technology, Bengaluru, India e-mail: [email protected] H. Sarojadevi e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_35
efficiently, acting as an extra layer of service to the customers. Building a recommendation system is not an easy task: recommender systems must deal with factors like changing customer behaviour, time relevancy, demographics, sparse data, the cold start problem, etc. Besides, it is also very difficult to calculate the accuracy of recommendation systems, as it is not certain whether the recommendations made are useful or not; it is said that building a recommendation system is more of an art than a science. In this paper, we discuss the types of recommendation systems, various methods for building a recommendation system, problems associated with building a recommender system and ways of evaluating a recommender system.
2 Types of Recommendation Systems 2.1 Collaborative Filtering Systems Collaborative filtering systems use ratings provided by users to make recommendations. These ratings can be explicit or implicit. Explicit ratings are given on different scales; the problem with this type of rating is that users do not provide ratings for most of the items. Implicit ratings are based on user interactions such as visits and clicks: an item is rated as 1 if the user has looked at it and 0 otherwise. This information can be extracted from web logs. There are two ways of building a collaborative filtering recommender system: memory-based approaches and model-based approaches. Memory-based approaches are relatively simple and involve locating users with preferences similar to those of the target user, whereas model-based approaches first build a model out of the rating matrix and then provide recommendations. There are two types of such approaches: user based and item based. In a user-based recommender, each user is represented as a vector of ratings given to each item in the dataset, and distances are measured between the user vectors; users with small distances are assumed to have similar preferences. In item-based recommender systems, items are represented as vectors, and similarities are found in the same way as user-user similarity. The main idea of item-item similarity is to calculate a weighted sum of user ratings and use it to make recommendations. Item-based models are preferred over user-based models as they provide better scalability. The collaborative filtering approach is greatly affected by the changing preferences of users, popular item lists and the addition of new users or new items [1–3].
2.2 Content-Based Systems Content-based systems take into consideration all the item features to make recommendations. They recommend items based on the information provided about the
items. They can also be useful in recommending new items based on the item specification. To work with content-based recommender systems, domain knowledge is an added benefit. These recommender systems have less diversity compared to collaborative recommender systems but can work even without user ratings. These types of recommender systems have a limited ability to expand and can provide personal recommendations only based on the current interests of the user [4, 5].
2.3 Knowledge-Based Recommendation Systems The main advantage of collaborative and content-based recommenders is that they allow the collection of knowledge at a very low cost. A knowledge-based recommender system does not depend on a ratings matrix but on information acquired from the user. It helps the user select possible candidates among a large pool of objects, provided the user is ready to modify the supplied information when no items match his or her interests. Knowledge-based recommendations can aid in removing the cold start problem for new users [1].
2.4 Demographic Recommendation Systems These recommendation systems take into account the demographic attributes of users to make recommendations. They basically work on the logic that users having similar demographic attributes and similar item ratings will have similar interests. Demographic attributes include age, gender, demographic area, education, interests, etc. Such systems have the capability of making recommendations before the user gives any ratings [1, 6].
2.5 Hybrid Recommender Systems These systems combine two or more recommender systems so that the combination can take advantage of the benefits provided by each system and deliver better recommendations. Sometimes multiple recommender systems are wrapped together to form a hybrid recommender system where each constituent system has a different purpose. Generally, collaborative filtering and content-based units are wrapped together to avoid their individual constraints. Although a hybrid recommender system takes the benefits of different algorithms, building one is not an easy task, and there can be different approaches to doing so. Hybrid recommender systems also face the issue of data sparsity and can fail to recommend new items or make recommendations to a new user [6].
3 Popular Techniques for Building Recommender Systems 3.1 K-Nearest Neighbour (KNN) Recommender Systems The KNN algorithm is a memory-based approach used to make recommendations by grouping users and items into clusters based on similarity metrics like cosine similarity and Pearson similarity. In user-based KNN, we group together the k users who have rated similar items into the same cluster to predict the score of an item for the target user. In item-based KNN, to predict the score of an item, we find the top k similar items which have been rated by the user. The KNN algorithm is a very powerful and easy-to-use tool for making recommendations, but such recommender systems are prone to recommending only popular items, the cold start problem, and the data sparsity problem. The cold start problem can be solved by combining KNN with a context-aware recommender system. The data sparsity problem arises when the number of users and items in the system is very large, because of which only a small portion of the data is related; this reduces the reliability of the similarity measure. The data sparsity problem can be alleviated by taking the following factors into account: the size of the intersection, the average over all items, the time interval of scoring, as well as the scoring interval of items [2].
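As a concrete, hedged sketch of item-based KNN (the toy rating matrix, k = 2 and the zero-means-unrated convention are assumptions, not data from the paper):

```python
import numpy as np

# Item-based KNN on a small user-item rating matrix (rows: users, cols: items).
R = np.array([[5., 3., 0., 1.],
              [4., 0., 0., 1.],
              [1., 1., 0., 5.],
              [0., 1., 5., 4.]])

def cosine_sim(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return a @ b / denom if denom else 0.0

n_items = R.shape[1]
sim = np.array([[cosine_sim(R[:, i], R[:, j]) for j in range(n_items)]
                for i in range(n_items)])   # item-item cosine similarity matrix

def predict(user, item, k=2):
    # Weighted sum of the user's ratings over the top-k items most similar to `item`.
    rated = [j for j in range(n_items) if R[user, j] > 0 and j != item]
    top = sorted(rated, key=lambda j: sim[item, j], reverse=True)[:k]
    w = np.array([sim[item, j] for j in top])
    return (w @ R[user, top]) / w.sum() if w.sum() else 0.0

print(predict(user=1, item=2))   # estimated rating of item 2 for user 1
```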
3.2 Matrix Factorization Methods This is a model-based approach which applies matrix factorization techniques to complete the sparse rating matrix. The idea of this technique is to group similar items such that they are represented as a single feature, and to calculate the similarity between users based on these features. Matrix factorization is very useful in solving the data sparsity problem. Singular value decomposition (SVD) is one of the most popular matrix factorization techniques. In SVD, the rating matrix M represents the users in terms of items. The rating matrix is decomposed into three lower-rank matrices (U, Σ, V), and the estimates of the lower-rank matrices are then used to compute the missing ratings. The columns of U represent the eigenvectors of MM^T, the columns of V represent the eigenvectors of M^T M, and Σ is a diagonal matrix used for scaling. The dimensions of the matrices U, Σ, and V play an important role in dimensionality reduction. Here, m is the number of users, n is the number of items, and r is the rank of the user-item matrix. The rank represents the number of linearly independent columns in M; if the users who liked item A also liked item B, then both items contribute a single rank. The user-item matrix can then be computed from the matrix M as

$$M_{(m,n)} = U_{(m,r)}\,\Sigma_{(r,r)}\,V^{T}_{(r,n)} \quad (1)$$
Thus, the SVD algorithm reduces the number of unnecessary parameters in the dataset by decomposing a user rating matrix of higher rank into lower-rank matrices U and V. SVD tries to find correlations in the rating matrix. The performance of SVD can be improved by replacing the missing values in the rating matrix with the median of the user ratings and the median of the item ratings before the user-item matrix is decomposed [7]. There are other matrix factorization techniques, such as probabilistic matrix factorization (PMF) and non-negative matrix factorization (NMF), that can be used to populate the sparse user-item matrix. PMF, too, decomposes the rating matrix into two lower-rank matrices and uses Bayesian inference to calculate the missing data. NMF decomposes a positive matrix into two positive latent matrices [8]; NMF gives better results for a very sparse data matrix. Matrix factorization (MF) techniques can be provided with side information about the items to avoid cold start problems. This information can be used in two ways. In the first method, item similarities are first calculated and then extended to be used with MF techniques [9, 10]. In the second, item features are directly included in one of the lower-rank matrices, which reduces computational complexity and scales linearly with the dataset [11]. The MF technique can also be modified to focus the user's attention on certain item features [12]. The user-latent matrix generated by matrix factorization techniques can be used in clustering techniques like fuzzy C-means and K-means to accommodate similar users, based on the latent factors, in the same cluster and generate recommendations [8]. MF techniques can also be used with encryption algorithms to preserve user privacy while trying to use knowledge from one domain to recommend items in another domain [13]. Table 1 details a comparison of the different matrix factorization techniques discussed.
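A minimal sketch of the truncated SVD of Eq. (1), combined with the median pre-filling suggested in [7]; the toy matrix and the target rank r = 2 are assumptions:

```python
import numpy as np

# Rating matrix with 0 meaning "not rated"; values are illustrative.
M = np.array([[5., 3., 0., 1.],
              [4., 0., 0., 1.],
              [1., 1., 0., 5.],
              [0., 1., 5., 4.]])

# Pre-fill missing entries with the item (column) medians before decomposing.
filled = M.copy()
for j in range(M.shape[1]):
    col = M[:, j]
    med = np.median(col[col > 0]) if (col > 0).any() else 0.0
    filled[col == 0, j] = med

# Truncated SVD: keep only the r strongest latent factors, M ~ U_r S_r V_r^T.
U, s, Vt = np.linalg.svd(filled, full_matrices=False)
r = 2                                        # assumed target rank
approx = U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]
print(np.round(approx, 2))                   # estimated ratings, gaps included
```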
3.3 Deep Learning Networks Deep learning networks mainly consist of complex matrix operations which can capture complex features. They are similar to matrix factorization techniques in the way they are used to predict ratings. The restricted Boltzmann machine (RBM) is one of the most famous techniques used in recommender systems. RBMs are undirected graphical models consisting of two layers: a visible layer and a hidden layer. The visible layer passes the inputs to the hidden layer in the forward propagation, adjusting the weights and forward bias; the outputs of the hidden-layer neurons come from an activation function such as the logistic sigmoid. During the backward propagation, the input matrix is reconstructed. Since the initial weights are random, the difference between the original inputs and the reconstructed inputs is initially large; this gives rise to bias terms on the visible layer. The process is repeated for all the users in an epoch, and after the weights and biases converge, ratings can be predicted for a new user using the obtained weights and biases [14]. RBMs are mostly used for learning user preferences and modelling item-rating correlations by latent features. They can also be modified to include side information
Table 1 Comparison of different techniques discussed in matrix factorization section

Paper title | Techniques employed | Advantages | Limitations
Collaborative filtering item recommendation methods based on matrix factorization and clustering approaches | An overview of various matrix factorization techniques like SVD, NMF, and PMF used in conjunction with clustering techniques like fuzzy C-means and K-means is provided | Solves the data sparsity problem | Can include information about user-similarities matrix
New algorithm for recommender systems based on singular value decomposition method | An improved SVD architecture is proposed where the missing values in the rating matrix are filled with appropriate values | Solves the data sparsity problem and recommendation problem | Cannot be used for complex networks
Heterogeneous information network embedding for recommendation | A MF-based architecture is proposed to better understand the semantic and structural information from textual data | Solves the data sparsity problem and improves recommendation | Can include deep learning methods to include embeddings from multiple data paths
Capsmf: a novel product recommender system using deep learning-based text analysis model | An improved PMF matrix factorization is proposed which uses latent features from textual information | Solves the data sparsity problem and improves recommendation | Deep learning methods can be refined for better representation of text information
Attentive matrix factorization for recommender system | An improved factorization technique is proposed which works by adding an attention layer to better understand the user tastes | Solves the data sparsity problem and improves recommendation | User attention to PMF can improve the quality of recommendations
FeatureMF: an item feature enriched matrix factorization model for item recommendation | A MF architecture is proposed where item features can directly be included into one of the lower-rank matrices, which reduces computation complexity and scales linearly with the dataset | Solves the data sparsity problem and provides a scalable architecture | Richer sources of information can be used in place of categorical information
Privacy-preserving matrix factorization for cross-domain recommendation | A MF technique is proposed where an encryption algorithm is used to preserve user privacy while trying to use knowledge from one domain to recommend items in another domain | Solves the recommendation problem and preserves user privacy | Requires expensive computational resources
such as user demographic information, item categorization, and other features to improve the efficiency of the model and handle the cold start problem [15]. An RBM can also be modified to better learn user preferences by projecting the user-item ratings as three different views: examinations, positive feedback, and negative feedback [16]. RBMs can also be used with datasets containing implicit data, such as the time spent by a user on a website or the watch time of a video. An RBM trained with implicit data can provide recommendations to a new user, provided that the user gives feedback for a few items; for such an architecture, the user need not be in the dataset, as recommendations can be provided through user feedback only. This type of architecture preserves user privacy and is computationally less expensive than traditional matrix factorization techniques, which require the new user to be added to the dataset and the model retrained in order to provide recommendations [17].

CNNs or convolutional networks can be used for feature extraction. They can be used in semantic analysis, which is beneficial for gathering insights from the text data given by users in the form of reviews [5]. They can be used to learn user behaviours and item properties from textual reviews [18]. Attention-based models can be used with CNNs to extract the most eye-catching features or hashtags from textual information, which can be used in recommendation systems [19]. Attention-based deep learning networks can be used to generate concise reviews from the actual reviews, and these concise reviews can be used by a CNN to model user-preference features and business performance features separately, which helps in rating generation [20].

RNNs or recurrent neural networks can operate on time-series or sequential data. The architecture of RNNs is such that they use the output of previous iterations to make calculations. However, they do not retain the results of iterations over the long term and focus only on short-term iteration results; this limitation of RNNs is called the vanishing gradient problem and can be overcome by using LSTM or GRU cells as neurons. Such technology can be used to make clickstream or session-based recommendations [21]. They operate on sequential data such as browsing history or sequences of user actions and can be used to capture the changing interests of users [22]. RNNs can be organised in a hierarchical fashion to include information related to the sequence and time intervals of the user's
rating to make recommendations. The hierarchical structure consists of two layers: a layer for short-term events and a layer for long-term events [23]. Deep learning networks can also be combined with matrix factorization techniques to make rating predictions from both linear and nonlinear viewpoints; here, MF learns linear interactions from the user-item interaction data, and the deep learning model learns nonlinear interactions between latent features [24]. Deep learning networks can be used to extract the most important user-preference features and the most important item features to make rating predictions. These features can then be used to explain why a particular recommendation is made to a particular user [25]. Deep learning methods can also be used to learn users' interests by fusing multiple kinds of user-interest representations, such as user embeddings (latent factors), item-level representations, neighbour-assisted representations, and category-based representations. Such representations are then fused using either of two techniques, early fusion or late fusion, to better integrate the diversity of user interests [26]. A comparison of various deep learning architectures used for building recommendation systems is shown in Table 2.
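To make the RBM training loop described above concrete, the following is a generic Bernoulli RBM trained with one-step contrastive divergence (CD-1) on toy implicit-feedback data; the sigmoid units, layer sizes, learning rate and random data are standard textbook choices, not the exact models of [14–17]:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Toy implicit feedback: V[u, i] = 1 if user u interacted with item i.
V = rng.integers(0, 2, size=(100, 20)).astype(float)
n_vis, n_hid, lr = V.shape[1], 8, 0.05
W = 0.01 * rng.standard_normal((n_vis, n_hid))
b_vis, b_hid = np.zeros(n_vis), np.zeros(n_hid)

for epoch in range(50):
    # Positive phase: hidden activations driven by the data.
    ph = sigmoid(V @ W + b_hid)
    h = (rng.random(ph.shape) < ph).astype(float)
    # Negative phase: reconstruct the visible layer, then re-infer hidden units.
    pv = sigmoid(h @ W.T + b_vis)
    ph2 = sigmoid(pv @ W + b_hid)
    # CD-1 updates for weights and both bias vectors.
    W += lr * (V.T @ ph - pv.T @ ph2) / len(V)
    b_vis += lr * (V - pv).mean(axis=0)
    b_hid += lr * (ph - ph2).mean(axis=0)

# Scoring a new user: probability of interaction per item given their feedback.
new_user = rng.integers(0, 2, n_vis).astype(float)
scores = sigmoid(sigmoid(new_user @ W + b_hid) @ W.T + b_vis)
print(np.argsort(scores)[::-1][:5])   # top-5 recommended item indices
```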
4 Testing on Recommender Systems 4.1 K-Fold Cross-Validation K-fold cross-validation is a data-partitioning strategy used on a dataset to build a more generalized model. The data are divided into a training set and a testing set. The training data are further divided into k random subsamples; the model is trained on each combination of k−1 subsamples and validated on the remaining one, and finally evaluated on the testing data. We take the average of the accuracy scores to see how well the recommendation system is learning. This prevents our model from overfitting.
4.2 Leave One Out Cross-Validation LOOCV is used to evaluate the performance of algorithms when we have a small dataset or when an accurate measure of model performance is important. It has a greater computational cost than k-fold cross-validation: if there are n rows in our dataset, we have to complete n folds or iterations of our model, each time training on n−1 rows and testing on the left-out row. If we expect large variations between participants, a leave-one-out cross-validation approach may be the best way to validate our model.
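Both partitioning strategies are available off the shelf, for example via scikit-learn's splitters; the toy array and the omitted model-fitting step below are placeholders:

```python
import numpy as np
from sklearn.model_selection import KFold, LeaveOneOut

X = np.arange(20).reshape(10, 2)   # toy feature matrix with 10 rows

# K-fold: split the data into k random subsamples.
for train_idx, test_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    pass  # fit the recommender on X[train_idx], score it on X[test_idx]

# Leave-one-out: n folds, each trained on n-1 rows and tested on the held-out row.
for train_idx, test_idx in LeaveOneOut().split(X):
    pass  # fit on the n-1 rows, test on the single left-out row
```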
Table 2 Comparing the various architectures discussed in the deep learning section

Paper title | Techniques | Advantages | Limitations
Restricted Boltzmann machines for collaborative filtering | A stochastic neural net is used to learn user tastes and model correlations by latent factors | Solves the scalability and data sparsity problem | Does not include any side information about the user or item
Content-boosted restricted Boltzmann machine for recommendation | A modified RBM is used that includes side information such as user's demographic information | Solves cold start and data sparsity problem | Content information and RBM are used separately
Conditional restricted Boltzmann machine for item recommendation | A conditional RBM is used to better understand the user preferences by projecting the user ratings as three different views such as examinations, positive feedback, and negative feedback | Solves the data sparsity problem, preserves user privacy, and tries to better learn the user preferences | Complex networks such as autoencoders can be used to learn data features
Restricted Boltzmann machines for recommender systems with implicit feedback | A modified RBM is used which works on implicit data, such as the time spent on a website or the watch time of a video, to make recommendations | Solves the cold start problem and is less computationally expensive | Does not take into account the results of accidental data
A review semantics-based model for rating prediction | A CNN-based model is proposed to extract review semantics which helps in rating prediction | Solves data sparsity problem | Semantic extraction can be improved by combining aspect keywords with users and items
Joint deep modelling of users and items using reviews for recommendation | A deep neural net is proposed which makes recommendations by generating user preferences and item properties from textual data | Solves the data sparsity problem and improves recommendation | Does not include information about user-item ratings
Session-based recommendations with recurrent neural networks | A deep learning model is proposed to make session-based recommendations through RNN by looking at click-stream data of a user | Solves the cold start problem and works well when the user information is not known | Automatically generated item features can be used
Recurrent co-evolutionary latent feature processes for continuous-time recommendation | A RNN-based model is proposed which works on time-series datasets to learn the co-evolving preferences of the users and provide the right recommendations at the right time | Solves the challenges related to changing user preferences and provides a scalable architecture | Modelling dynamics on additional data can be beneficial
DNR: a unified framework of list ranking with neural networks for recommendation | A combined architecture of MF and MLP is used to make rating predictions combining both user-item interactions and user-item reviews | Solves the recommendation problem | Additional information such as comments can be used to develop the personalized ranking performance
Neural explicit factor model based on item features for recommendation systems | An explainable model is proposed to extract an attention-based user-preference matrix and an item-feature matrix to make rating predictions | Solves the recommendation problem and provides answers as to why a recommendation is made | Information from user comments can be used to improve reliability
Learning and fusing multiple user-interest representations for micro-video and movie recommendations | A deep learning model which learns to integrate the diversity in user interests by fusing multiple user-interest representations is proposed | Solves the recommendation problem and learns the diversity in user interests | Demographic and social information can be included
Recommendation system with hierarchical recurrent neural network for long-term time series | A hierarchical structure is provided to combine effects of both interim and life-long information about the sequences and timing of the user's rating history | Solves the challenges related to changing user preferences | Trust information can be used to improve recommendations
Rating prediction based on merge-CNN and concise attention review mining | A combined architecture of an attention-based model and a CNN is proposed where ratings can be predicted from latent features of concise text information | Solves the recommendation problem | Systems should be established to use and find credibility of text information
4.3 A/B Online Test The A/B online test is a practical method to test a recommender system and one of the most widely used concepts nowadays. It is a basic randomized controlled experiment: a comparison between two versions of a system is done to find out which performs better in a real-world environment. The basic idea is to have two versions of the system, A and B. Whenever a change is to be made, B is subjected to that change while A remains the same. Then, based on the responses from customer groups A and B, strategies can be made to bring those changes into action. This kind of testing plays an important role in decision making for recommender systems. It is a better way of testing a system because, no matter what the offline score of a model may be, if it does not provide recommendations that are useful to the users, it is of no use.
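For instance, a two-proportion z-test is one common way to judge whether variant B's observed lift over A is statistically meaningful; the counts below are illustrative:

```python
from math import sqrt
from statistics import NormalDist

# Click-through counts for the two variants (illustrative numbers).
clicks_a, users_a = 120, 2000
clicks_b, users_b = 150, 2000

p_a, p_b = clicks_a / users_a, clicks_b / users_b
p_pool = (clicks_a + clicks_b) / (users_a + users_b)   # pooled proportion
se = sqrt(p_pool * (1 - p_pool) * (1 / users_a + 1 / users_b))
z = (p_b - p_a) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))           # two-sided test
print(z, p_value)   # a small p-value suggests B's lift is unlikely to be chance
```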
5 Evaluation Metrics

5.1 Mean Absolute Error or MAE

MAE is used to measure the model accuracy [27]. It is the average of the absolute values of the errors:

$$\text{MAE} = \frac{1}{n}\sum_{i=1}^{n} |y_i - K(x_i)| \quad (2)$$

It is basically an arithmetic average of the absolute errors, where $y_i$ is the actual value, $K(x_i)$ is the predicted value for test instance $x_i$, and $n$ is the number of test instances. From the perspective of a recommendation system, the formula can be written as

$$\text{MAE} = \frac{1}{|R|}\sum_{p(r)\in R} |r_{ui} - p(r)_{ui}| \quad (3)$$
484
S. Vijaya Shetty et al.
Here, r ui is actual rating, p(r)ui is predicted rating, and R is size of the test set.
5.2 Root Mean Squared Error or RMSE

RMSE calculates the standard deviation of the prediction errors [27, 28]:

$$\text{RMSE} = \sqrt{\frac{1}{|R|}\sum_{p(r)\in R} (r_{ui} - p(r)_{ui})^2} \quad (4)$$

Here, $r_{ui}$ is the actual rating, $p(r)_{ui}$ is the predicted rating, and $R$ is the test data. It calculates how far these errors lie from the line of best fit. RMSE penalizes bad predictions heavily due to the square term, hence it is strongly affected by bad predictions. RMSE always has a value at least as high as MAE.
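Both error metrics reduce to a few lines of NumPy; the rating arrays are illustrative:

```python
import numpy as np

actual = np.array([4.0, 3.5, 5.0, 2.0])   # r_ui: held-out true ratings
pred = np.array([3.8, 3.0, 4.5, 2.5])     # p(r)_ui: model predictions

mae = np.mean(np.abs(actual - pred))              # Eq. (3)
rmse = np.sqrt(np.mean((actual - pred) ** 2))     # Eq. (4)
print(mae, rmse)   # RMSE >= MAE always holds
```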
5.3 Hit Rate

The hit rate of a system is calculated by dividing the number of hits by the size of the test data. It measures how often we recommend an item actually watched by the user: the higher the hit rate, the better the recommendations, while a low hit rate may indicate that we need a larger dataset. To find the hit rate, we take all the items in a user's training data, remove one item, and feed the rest of the data to the recommender system to generate top-N recommendations. If the left-out item is one of the recommended items, we count it as a hit. The hit rate is then calculated as

$$\text{Hit Rate} = \frac{\#\,\text{hits in test data}}{\#\,\text{users}} \quad (5)$$
5.4 Average Reciprocal Hit Rate or ARHR

The average reciprocal hit rate is similar to the hit rate. It is a commonly used metric for evaluating the ranking of top-N recommender systems that considers only where the first relevant result occurs. We get more credit for recommending an item near the top of the ranked list than at the bottom; a higher rate is better. This is a user-focused metric, as people tend to concentrate on what they see at the beginning of top-N lists, so we give more weight to those hits that are
showing up at the top. To calculate ARHR, we take the reciprocal of the rank of each hit to account for the position of the recommendation:

$$\text{ARHR} = \frac{1}{\#\,\text{users}}\sum_{i=1}^{\#\,\text{hits}} \frac{1}{\text{position}_i} \quad (6)$$
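A compact sketch of Eqs. (5) and (6) over toy leave-one-out results; the rank list below is illustrative:

```python
# For each user, `rank` is the 1-based position of the held-out item in that
# user's top-N list, or None if the item was not recommended at all.
ranks = [1, None, 3, None, 2]   # toy results for five users

hits = [r for r in ranks if r is not None]
hit_rate = len(hits) / len(ranks)                  # Eq. (5)
arhr = sum(1.0 / r for r in hits) / len(ranks)     # Eq. (6)
print(hit_rate, arhr)
```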
5.5 Recall

Recall measures how many true positives the model correctly identifies [27]. Recall is the percentage of relevant items selected out of all the relevant items; in simple words, it is how many relevant items appear in the top-N results. If recall at 10 is 60% in our top-10 recommendation system, this means that 60% of the total number of relevant items appear in the top-10 results.

$$\text{Recall} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Negatives}} \quad (7)$$

A true positive is an outcome where the model correctly estimates a true value, whereas a false negative is an outcome where the model wrongly estimates a false value. For recommendation systems:

$$\text{Recall} = \frac{\#\,\text{of our recommendations that are relevant}}{\#\,\text{of all the possible relevant items}} \quad (8)$$

Recall tries to answer: "What proportion of actual positives was identified correctly by the model?"
5.6 Precision

Precision is the fraction of items in the recommendation list that are watched by the user [27].

$$\text{Precision} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Positives}} \quad (9)$$

For recommendation systems:

$$\text{Precision} = \frac{\#\,\text{of our recommendations that are relevant}}{\#\,\text{of items we recommended}} \quad (10)$$
5.7 F1-Score

The F1-score is a combination of precision and recall. We can give more importance to either precision or recall, but if we want to give them equal importance, we use the F1-score:

$$F_1 = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \quad (11)$$
where precision and recall are computed using the formulas given in this section. F1 gives a score between 0 and 1.
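Equations (8), (10) and (11) computed over a toy top-N list and relevance set (both illustrative):

```python
recommended = {"a", "b", "c", "d"}   # top-N list shown to the user
relevant = {"b", "d", "e"}           # items the user actually liked

tp = len(recommended & relevant)     # true positives: relevant items we recommended
precision = tp / len(recommended)                      # Eq. (10)
recall = tp / len(relevant)                            # Eq. (8)
f1 = 2 * precision * recall / (precision + recall)     # Eq. (11)
print(precision, recall, f1)
```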
5.8 Other Parameters

Diversity in recommendations is a useful feature for introducing customers to products they might not have thought of; it captures the variety in a system. Very high diversity amounts to a list of random recommendations, whereas low diversity fails to draw the user's attention to new items. Novelty is another useful metric that helps us determine how popular the recommended items are. Very low novelty amounts to a popularity-based recommender system, whereas very high novelty may cause distrust among users [6].
6 Challenges in Implementing Recommender Systems 6.1 Changing User Preferences This is one of the major challenges in a recommendation system, since a user may have different intentions at different points in time, which makes it relatively tough to recommend products. For example, one day a particular user may browse the site for new books, but the next day he may be searching for something else. It is therefore difficult to recommend products under changing user preferences, which in turn affects the user experience through inappropriate suggestions [3].
6.2 Cold Start Problem The cold start problem arises when new users or new products are added to the database. In situations like these, neither can recommendations be made to the new users, nor can the new products be rated. The problem caused by the addition
of a new user or new product is difficult to handle, as it is hard to find a similar user without knowing the previous interests of the new user, or without knowledge about the new product. This problem can be solved in a variety of ways, some of which are:

• Enquiring the new users about their tastes explicitly.
• Making recommendations based on demographic information.
• Asking the user to rate some of the popular items at the beginning itself [6].
6.3 Time Relevancy Whenever new products are added, the recommender still suggests only items that have already been rated. To prevent this situation, we can gradually stop recommending old products by adopting different approaches. We can also decrease the waiting period before recommending new products by using techniques like collaborative filtering, but these may introduce overspecialization.
6.4 Sparse Data In many cases, most users do not rate or review the products they have purchased, and thus the rating matrix becomes very sparse, which can lead to data sparsity problems. Moreover, a user purchases only a small number of items from a large catalogue. This leads to many zero entries in the dataset, which would wrongly suggest that the user did not like the products. Many zeroes also mean that the recommendation system must unnecessarily deal with a large dataset [6]. As the number of items and users increases, the number of common items rated by two users decreases; as a result, the reliability of the similarity measure decreases. To increase reliability, we need to consider the size of the intersection [2].
6.5 Data Insufficiency Data insufficiency is also one of the major challenges for designing a recommender system since these systems require a lot of data to make accurate and effective recommendations. It is a no-brainer that companies having excellent recommendations also have a huge amount of user data. The bigger the dataset, the better the recommendations a system can make.
6.6 Shilling Attacks Situations sometimes arise when an anonymous user or a malicious attacker enters a system and starts entering false ratings to affect item popularity. This type of attack can degrade the performance of the system and reduce the recommendation quality. Such attacks can be identified in many ways, for instance through the hit ratio and prediction shift [6]. To prevent such attacks, only trusted users, each with a single account, should be allowed to vote.
7 Conclusion In this paper, we have given a brief review of some of the most popular techniques used in recommender systems and an overview of the most recent advancements in the fields of matrix factorization and deep learning. We have highlighted the fact that different techniques can be combined to overcome the challenges associated with building a recommender system. In addition, we have provided an overview of the techniques and evaluation parameters that can be used to test recommender systems, as well as some of the most common problems faced in building one. Finally, we would like to highlight that not all complex algorithms lead to better results: one should choose an architecture suitable for the system, and not based on what is state-of-the-art, as such architectures are complex and may be computationally expensive for the requirement.
References

1. Y.G. Patel, V.P. Patel, A survey on various techniques of recommendation system in web mining (2015)
2. B. Li, S. Wan, H. Xia, F. Qian, The research for recommendation system based on improved KNN algorithm, in 2020 IEEE International Conference on Advances in Electrical Engineering and Computer Applications (AEECA), pp. 796–798 (2020). https://doi.org/10.1109/AEECA49918.2020.9213566
3. J. Yu et al., Collaborative filtering recommendation with fluctuations of user's preference, in 2021 IEEE International Conference on Information Communication and Software Engineering (ICICSE), pp. 222–226 (2021). https://doi.org/10.1109/ICICSE52190.2021.9404120
4. M. Kwak, D.-S. Cho, Collaborative filtering with automatic rating for recommendation, in ISIE 2001. 2001 IEEE International Symposium on Industrial Electronics Proceedings (Cat. No. 01TH8570), vol. 1, pp. 625–628 (2001). https://doi.org/10.1109/ISIE.2001.931866
5. R. Cao, X. Zhang, H. Wang, A review semantics based model for rating prediction. IEEE Access 8, 4714–4723 (2020). https://doi.org/10.1109/ACCESS.2019.2962075
6. M. Mohamed, M. Khafagy, M. Ibrahim, Recommender systems challenges and solutions survey (2019). https://doi.org/10.1109/ITCE.2019.8646645
7. Z. Sharifi, M. Rezghi, M. Nasiri, New algorithm for recommender systems based on singular value decomposition method. ICCKE 2013, 86–91 (2013). https://doi.org/10.1109/ICCKE.2013.6682799
8. Ifada, M.K. Sophan, M.N. Fitriantama, S. Wahyuni, Collaborative filtering item recommendation methods based on matrix factorization and clustering approaches, in 2020 10th Electrical Power, Electronics, Communications, Controls and Informatics Seminar (EECCIS), pp. 226–230 (2020). https://doi.org/10.1109/EECCIS49483.2020.9263450
9. C. Shi, B. Hu, W.X. Zhao, P.S. Yu, Heterogeneous information network embedding for recommendation. IEEE Trans. Knowl. Data Eng. 31(2), 357–370 (2019). https://doi.org/10.1109/TKDE.2018.2833443
10. R. Katarya, Y. Arora, Capsmf: a novel product recommender system using deep learning based text analysis model. Multimedia Tools Appl. 79(47), 35927–35948 (2020)
11. H. Zhang, I. Ganchev, N.S. Nikolov, Z. Ji, M. O'Droma, FeatureMF: an item feature enriched matrix factorization model for item recommendation. IEEE Access 9, 65266–65276 (2021). https://doi.org/10.1109/ACCESS.2021.3074365
12. J. Zhu, W. Ma, Y. Song, Attentive matrix factorization for recommender system, in 2020 13th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), pp. 932–936 (2020). https://doi.org/10.1109/CISP-BMEI51763.2020.926355
13. T.B. Ogunseyi, C.B. Avoussoukpo, Y. Jiang, Privacy-preserving matrix factorization for cross-domain recommendation. IEEE Access 9, 91027–91037 (2021). https://doi.org/10.1109/ACCESS.2021.3091426
14. R. Salakhutdinov, A. Mnih, G. Hinton, Restricted Boltzmann machines for collaborative filtering, in Proceedings of the 24th International Conference on Machine Learning (Association for Computing Machinery, New York, USA, 2007), pp. 791–798. https://doi.org/10.1145/1273496.1273596
15. Y. Liu, Q. Tong, Z. Du, L. Hu, Content-boosted restricted Boltzmann machine for recommendation (2014). https://doi.org/10.13140/2.1.1424.8963
16. Z. Chen, W. Ma, W. Dai, W. Pan, Z. Ming, Conditional restricted Boltzmann machine for item recommendation. Neurocomputing 385, 269–277 (2020)
17. F. Yang, Y. Lu, Restricted Boltzmann machines for recommender systems with implicit feedback, in 2018 IEEE International Conference on Big Data (Big Data), pp. 4109–4113 (2018). https://doi.org/10.1109/BigData.2018.8622127
18. L. Zheng, V. Noroozi, P.S. Yu, Joint deep modeling of users and items using reviews for recommendation, in Proceedings of the Tenth ACM International Conference on Web Search and Data Mining (WSDM '17) (Association for Computing Machinery, New York, USA, 2017), pp. 425–434. https://doi.org/10.1145/3018661.3018665
19. Y. Gong, Q. Zhang, Hashtag recommendation using attention-based convolutional neural network, in Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (AAAI Press, 2016), pp. 2782–2788
20. Y.-C. Chou, H.-Y. Chen, D.-R. Liu, D.-S. Chang, Rating prediction based on merge-CNN and concise attention review mining. IEEE Access 8, 190934–190945 (2020). https://doi.org/10.1109/ACCESS.2020.3031621
21. B. Hidasi et al., Session-based recommendations with recurrent neural networks. CoRR abs/1511.06939 (2016)
22. H. Dai, Y. Wang, R. Trivedi, L. Song, Recurrent co-evolutionary latent feature processes for continuous-time recommendation, in Proceedings of the 1st Workshop on Deep Learning for Recommender Systems (DLRS 2016) (Association for Computing Machinery, New York, USA, 2016), pp. 29–34. https://doi.org/10.1145/2988450.2988451
23. B. Choe, T. Kang, K. Jung, Recommendation system with hierarchical recurrent neural network for long-term time series. IEEE Access 9, 72033–72039 (2021). https://doi.org/10.1109/ACCESS.2021.3079922
24. C. Wei, J. Qin, W. Zeng, DNR: a unified framework of list ranking with neural networks for recommendation. IEEE Access 9, 158313–158321 (2021). https://doi.org/10.1109/ACCESS.2021.3130369
490
S. Vijaya Shetty et al.
25. H. Huang, S. Luo, X. Tian, S. Yang, X. Zhang, Neural explicit factor model based on item features for recommendation systems. IEEE Access 9, 58448–58454 (2021). https://doi.org/ 10.1109/ACCESS.2021.3072539 26. X. Chen, D. Liu, Z. Xiong, Z.-J. Zha, Learning and fusing multiple user interest representations for micro-video and movie recommendations. IEEE Trans. Multimedia 23, 484–496 (2021). https://doi.org/10.1109/TMM.2020.2978618 27. M. Lerato, O.A. Esan, A. Ebunoluwa, S.M. Ngwira, T. Zuva, A survey of recommender system feedback techniques, comparison and evaluation metrics, in 2015 International Conference on Computing, Communication and Security (ICCCS), pp. 1–4 (2015). https://doi.org/10.1109/ CCCS.2015.7374146 28. C. Jian, Y. Jian, H. Jin, Automatic content-based recommendation in e-commerce, in 2005 IEEE International Conference on e-Technology, e-Commerce and e-Service, pp. 748–753 (2005). https://doi.org/10.1109/EEE.2005.37
Fostering Smart Cities and Smart Governance Using Cloud Computing Architecture
Lubna Ansari, M. Afshar Alam, Mohd Abdul Ahad, and Md. Tabrez Nafis
Abstract Smart cities are required in today's world. A smart city is a complex concept comprising many components, such as smart transportation, smart energy, smart governance, and more; among these, smart governance is the most significant. Many technological advancements have been made to transform a city into a smart city. Among them, cloud computing is a new wave of technological revolution and economic development. This paper aims to identify the factors that contribute to the slow performance of e-governance systems when compared to the use of cloud technology in supporting e-governance implementation; it also examines the main factors influencing cloud computing technology adoption and argues that cloud computing technology can be recommended as a new avenue to support smart governance implementation with various cloud techniques. To reduce cost, shorten time to market, and enhance on-demand applications, we have proposed a cloud computing architecture as a savior of smart cities and smart governance.
Keywords Smart cities · Cloud computing · Smart governance · Sustainability · E-governance · Security · Privacy
1 Introduction

Recently, smart cities have become a new trend spanning the world [1]. The smart city is a transformative concept, changing people's lives in a positive way in different aspects [1]. But "smart city" is a vague term that has different meanings for different nations. A smart city is like a high-rise building constructed on three main pillars, viz. smart technology, smart citizens, and smart governance [2]. Each pillar is equally important, but smart governance has a particularly significant value [1]. Being responsible for implementing rules and policies, smart governance is a very significant element of smart cities [3]. Smart governance is also responsible for all communication with the citizens using ICT [3]. A successful governance of any nation should not only be
smart but also sustainable [4]. If a smart city does not follow sustainability, it will no longer be smart [4]. So, in order to make the governance of any nation more effective, more powerful, and more efficient, we have to make it both sustainable and smart [4]. In this way, we can foster smart cities. Sustainable development, whether in underdeveloped or developed countries, is a concern and a problem for society today. Good governance is essential for effectively managing all types of resources, especially natural resources, for the benefit of present and future generations [5]. If governments wish to improve openness, accountability, and efficiency, digital transformation may be a significant driver of change. E-governance makes it easier to implement integrated policies and public services that support long-term economic growth, social development, and environmental preservation [6]. Including cloud computing in various government organizations has many advantages. As per the National Institute of Standards and Technology (NIST), cloud computing can be defined as a way of providing ubiquitous, on-demand access to a huge network of sharable computer resources (such as servers, networks, data warehouses, services, and applications) that can be provisioned rapidly and delivered with minimal service provider involvement [7]. At both the hardware and software levels, the cloud service models provide varying levels of service customization. Suppliers price them differently because they serve various functions. There are three service models of cloud computing, viz. Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS) [8]. Customers can choose the way they wish to consume the service according to their current subscription and maintenance expenses, IT infrastructure, and privacy policies. These choices have a direct influence on data security, as well as on the way computations are performed and data are stored. In terms of deployment, we differentiate between private cloud, hybrid cloud, public cloud, and community cloud hosting [9]. Many direct and indirect factors influence the adoption of new technologies, including cloud computing, in developing nations: culture, society, the enabling environment, a lack of resources, and a lack of expertise [10]. There are plenty of obstacles that might stymie cloud computing adoption. These problems may be divided into three categories: technological, organizational, and environmental. In underdeveloped nations, technological and organizational issues might be regarded as failures in IT governance that must be handled by top management [11]. IT management is concerned with the cloud computing provider's security rules, which means they must be cautious while implementing new technology. As a result, choices must be made to guarantee that cloud computing adoption in the e-governance system is used effectively [12]. According to research, from a technological perspective, individuals can gain advantages such as easy accessibility of government services if they accept the emerging trends [13]. Another concern in underdeveloped nations is the lack of attention paid to the protection of data on e-government Web sites. E-government sites should be updated frequently to reflect procedural and policy changes; as a result, interactive features and processes on the site should be checked regularly to ensure that users can access services without difficulty [13]. The on-demand cloud feature separates cloud solutions from their various offline equivalents and other network services.
On-demand access means the ability for the user to tailor
cloud resources like network bandwidth, computing power, and disk space when required [14]. Resources can be added automatically whenever the number of users or their demand grows. Virtual machines can also be scaled up and scaled down [15]. When a particular user's need for computing power decreases, the acquired resources are released and allocated to other users. Throughout the process, the supervision of VMs is handled by the load balancer software and the hypervisor [16]. Nowadays, the Internet is used to enable mobility. Users have access to resources from anywhere, at any time, and on any system, which allows them to work from any device that has an Internet connection [7]. Resource pooling integrates resources so that they are automatically shared and synchronized. Virtual machines, servers, storage space, CPU units, networks, and RAM are examples of resources that can be pooled. The measurability of the service is assured by continuous monitoring of resource use. To determine the cost of the service, data is gathered, reported, and made available. Customers are given information that allows them to select a suitable service based on current demand [17]. Cloud computing makes hardware resources instantly available without requiring the user to make any further investments. It converts capital IT expenses (hardware investments) into operating expenses (cloud subscriptions). Users may work from anywhere, at any time, and on any device thanks to the cloud. Access to a large amount of data and the capacity to collaborate greatly boost productivity [18]. Government enterprises have their own executive leadership and a significant amount of power when making decisions regarding their investments in information systems (IS), mainly cloud computing technologies [19]. Government agencies may acquire different cloud services if there are no strong restrictions, so the quality of the policy-making processes around these investments can differ. Additionally, government agencies must create an environment that fosters technology use while simultaneously serving the needs of small businesses [20]. Cloud services face many difficulties, such as data privacy, international contract law, data security, and accountability obligations, and this makes the amalgamation of cloud computing with governance quite difficult [21]. Security is crucial, as cloud computing needs supervision of the whole environment on two levels, viz. virtual and physical resources. If a physical server is compromised, all related virtual machines become vulnerable, and a hacked virtual machine may damage the server it is running on.
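As an illustration of the elastic, on-demand provisioning described above, the following minimal sketch shows how a load-based scaling decision could be made. The thresholds, VM limits, and the injected `provision_vm`/`release_vm` helpers are hypothetical stand-ins for a real cloud provider API, not part of any cited system.

```python
# Hypothetical thresholds: scale out above 75% average load, scale in below 25%.
SCALE_OUT_THRESHOLD = 0.75
SCALE_IN_THRESHOLD = 0.25
MIN_VMS, MAX_VMS = 1, 10


def average_load(vms):
    """Mean CPU utilization (0.0-1.0) across the pooled virtual machines."""
    return sum(vm["cpu"] for vm in vms) / len(vms)


def autoscale(vms, provision_vm, release_vm):
    """One monitoring cycle: acquire or release VMs based on measured demand."""
    load = average_load(vms)
    if load > SCALE_OUT_THRESHOLD and len(vms) < MAX_VMS:
        vms.append(provision_vm())   # acquire a resource on demand
    elif load < SCALE_IN_THRESHOLD and len(vms) > MIN_VMS:
        release_vm(vms.pop())        # release it back to the shared pool
    return vms
```

In a real deployment, a cycle like this would run inside the load balancer or hypervisor management layer, which the paper identifies as the components supervising the VMs.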
2 Literature Survey

Previously, the Indian tax system had a complicated structure in which both the central and state governments imposed their own taxes on a particular item. Hence, the same item had different prices in different locations. To eradicate this problem, many solutions have been suggested (Joseph et al. [17]). Cloud computing comprises a huge sharable pool of computer resources. These resources have features like pay-as-you-use pricing and on-demand scalability. As a result of
these features, many countries shifted from their costly conventional e-governance models to this new scalable and cost-efficient cloud-based e-governance model (Sadiku et al. [18]). People around the globe use this virtual world to put up their opinions on any topic, like entertainment, sports, and politics, making it a perfect platform. More than 33% of social media content consists of posts, comments, or discussions about politics (Hossain et al. [19]). To control and monitor public policies, social media combined with cloud-based e-governance can be a powerful and influential model. A massive amount of data is produced when people all over the country take part in conversations related to government policies on social media. There is a paramount requirement for a cloud-based system by which we can use this massive data, convert public opinion into concerns, issues, solutions, advantages, and proposals, and make amendments in the policies to satisfy the public at the conception phase itself (Androutsopoulou et al. [20]). Yimam and Fernandez [21] spotted many issues in cloud computing related to privacy. It is quite a task to inculcate security and privacy, and as a result, it is becoming very difficult to compare security, privacy, and compliance among cloud service providers. Governments have the same requirements of security, integrity, privacy, and enforcement, and if there is any breach in security or a lawsuit, the organizations are held responsible. Bhalaji [22] discussed an integrated deep learning technique. The performance of the network is improved by correct workload and resource-allocation prediction using time series. By using a logarithmic operation, the standard deviation is reduced first, and after that, with the help of strong filters, the extreme points and noise interference are removed. Additionally, an integrated deep learning algorithm is used to forecast the time series. This technique forecasts not only the workload and resource sequences correctly but also the time series. However, there is scope for improving the performance through optimization algorithms. Andi [23] explains the concept of serverless cloud computing and its various advantages and uses in the IT industry. Conventionally, it is the developer's task to handle resource allocation, ownership, and management of the server, following the three models SaaS, IaaS, and PaaS. In serverless cloud computing, however, tasks like managing, owning, or maintaining servers are handled by the cloud service provider, so the developer need not take care of any of these tasks. Furthermore, this is cost-efficient, as the time to market is highly reduced. Currently, however, serverless computing is not an answer for many problems in the IT sector. A secure framework for data migration and analysis of data security is provided in [24] by Shakya et al. Installation of Secure Socket Layer (SSL) and migration tickets with fewer privileges are provided. To encrypt data, Prediction-Based Encryption is used. This framework is beneficial in systems like e-commerce or healthcare, where sensitive data such as credit card details must be handled. In addition, the sensitive information is encrypted, thereby keeping insensitive and sensitive data separate. However, the encryption is somewhat time-consuming and needs to be made faster.
In [25], Mugunthan discussed the Distributed Denial of Service (DDoS) attack. The exploitation of resources and cloud architecture services has recently become a major concern. These DDoS attacks are very advanced and keep growing at an exponential rate, because of which detecting and resolving them is quite problematic. The authors proposed soft computing-based detection to find DDoS assaults. A Markov model is used to observe network traffic, and a Random Forest is used to classify detected attacks. However, if the time span is short, even the Markov model is not very effective. Chen and Yeh [26] discussed the round trips for graph-based data representation. The cost of data delivery is lowered by combining cloud services with an efficient web framework, so as to achieve effective management of data and conservation of energy in IoT sensing applications. If integrated with any potential technology, these aims can be accomplished. For creating a testbed, RIOT OS, the Graphene web framework, Google cloud services, and Z1 IoT motes are employed in applications at the initial layer. Still, to empower this proposed approach, new technologies and bigger networks are required.
3 Sustainable Smart Governance

Today's governments, especially the Indian government, lack overall controllability and access under various unexpected conditions. The year 2020 added a huge number of problems, especially for governments. The COVID-19 epidemic and the ensuing educational, economic, and national security challenges affected every country. In addition, natural catastrophes, such as climate change issues, fires, droughts, and storms, became even more dangerous than they already were. Geopolitical instability became a very common experience within and outside countries, affecting countries that had not been stable for long and also those that had been seen as models of democracy and stability. There is a need to build a smart governance system to deploy various schemes that would reach the whole nation in a very short period at a much reduced economic cost.
3.1 Limitations of Cloud

Cloud computing provides the ability to create a virtual platform that allows users to store, serve, and manage data across many virtual clouds. Furthermore, cloud computing improves end-user productivity while lowering the cost of constructing physical servers. The deployment of such a cloud computing facility over a wide range of demographics, platforms, and servers faces three major challenges:
• Adherence to mandates: Regulations are monotonous and daunting to comprehend, which adds to the difficulty of understanding and accumulating data in our systems. In cloud computing services, service providers are frequently confronted with overlapping concerns. There are no architecture guidelines in existence for cloud computing systems. Data kept in the cloud may conflict with the laws and regulations of different states.
• Illicit handling: Malware can be introduced into the rest of the system by using cloud servers that send a malicious email to a user in order to obtain sensitive information. To avoid inbound and outbound assaults being detected, attackers exploit security flaws in other network hosts.
• Security and privacy: Cloud computing allows users to access all of their digital assets, such as emails, documents, and sensitive information, from a remote location, which can result in privacy and security breaches throughout the platform. As a result, users of cloud computing should examine these problems in order to avoid security breaches on the platform.
So, there is a high need to overcome the issues of adherence to mandates, illicit handling, and security and privacy; thereby, a new framework to implement smart sustainable governance for developing countries is proposed in this system.
3.2 Digital Governance Framework

There are four essential phases in forming a digital sustainable governance model, as described below:
• Information Phase: The government can use electronic methods such as the Internet to distribute relevant information to the public, which is the purpose of this phase. All the published information is gathered in the cloud, which serves as a backup of the history of government announcements and functioning.
• Interaction Phase: This phase attempts to make interaction and communication between government officials and the general public easier. The system enables the opinion of the public to reach the authorities easily. For example, in case of any query to government authorities, instead of processing it via documents, it is very easy to upload the details to the relevant section, and thus action can be taken easily.
• Transaction Phase: Customers and consumers may execute transactions without having to visit the office, boosting their value through the transaction phase. This enables all kinds of transactions to be completed easily, and a record of the transactions is easy to collect in one place via the cloud.
• Transformation Phase: With phase-wise development, the final destination must have the ultimate objective of providing all services at a single counter. This phase combines all kinds of services and controls them appropriately.
To apply the cloud computing infrastructure for sustainable governance, the following cloud actors may be used.
3.2.1 Cloud Consumer
A cloud consumer can be defined as a firm that has a contract or agreement with a cloud provider. A cloud consumer can avail of all the IT resources that are provided by the cloud provider. The diagrammatic representation of a cloud consumer is depicted in Fig. 1. The cloud consumer can be categorized as a SaaS consumer, IaaS consumer, or PaaS consumer [28]. The SaaS consumer can be an organization, an end-user, or a software application administrator [27]. The PaaS consumer uses tools and resources issued by cloud providers in order to test, develop, manage, and deploy applications hosted in a cloud [27]. A PaaS consumer can be a tester, developer, application deployer, or application administrator [27]. An IaaS consumer can be a system administrator, system developer, or IT manager [27]. IaaS consumers can access network storage, virtual computers, or network infrastructure components [27].
Fig. 1 Cloud consumer [27, 28]
Fig. 2 Cloud provider [29]
3.2.2 Cloud Provider
Using a cloud provider, as depicted in Fig. 2, can save time and effort by giving access to resources that you would otherwise have to supply yourself, such as [29]:
• Infrastructure: Infrastructure is the bedrock of every computing environment. Database services, networks, cloud storage, data management, and servers might all be part of this architecture.
• Platforms: Tools that are needed to develop and distribute apps, like middleware, operating systems, and runtime environments.
• Software: Ready-to-use applications. This software can be customized or standard programs offered by third-party developers.
The new trend in the cloud is data control. It means creating data tiers and allocating data to the correct type of storage. It is also important to know when to send workloads to spinning disks or flash drives.
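As a rough illustration of this data-tiering idea, the sketch below assigns an object to a storage tier based on its access frequency and sensitivity. The tier names, thresholds, and the `DataObject` structure are illustrative assumptions, not part of any specific provider's API.

```python
from dataclasses import dataclass


@dataclass
class DataObject:
    name: str
    accesses_per_day: float  # observed access frequency
    sensitive: bool          # e.g., citizen records vs. public notices


def choose_tier(obj: DataObject) -> str:
    """Map an object to a storage tier using hypothetical thresholds."""
    if obj.sensitive:
        return "private-cloud-flash"   # keep sensitive data on dedicated, fast storage
    if obj.accesses_per_day > 100:
        return "public-cloud-flash"    # hot data goes to flash/SSD
    if obj.accesses_per_day > 1:
        return "public-cloud-disk"     # warm data on spinning disks
    return "public-cloud-archive"      # cold data in cheap archival storage


print(choose_tier(DataObject("tender-notices.pdf", 250, False)))  # public-cloud-flash
```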
3.2.3 Cloud Carrier
A cloud carrier acts as a connection between cloud providers and cloud users. It provides delivery of and connectivity to various cloud services. Users can access cloud carriers via telecommunications, networks, and other access devices. Cloud customers, for example, can avail of cloud services via network access devices like laptops, PCs, mobile Internet devices (MIDs), and mobile phones. Cloud services are distributed via a transport agent or a network and telecommunication carriers, where a transport agent is a company that provides physical transportation of storage media like hard drives.
Although public cloud models have proven to be effective and are here to stay, they still operate on a "best-effort" basis. Aside from minimal SLAs (if any), they do not provide stringent SLAs based on guaranteed bandwidth, latency, QoS, downtime penalties, and, most importantly, data security guarantees. Carrier clouds provide all of this and more: they give developers the ability to create applications based on open APIs, in addition to stringent SLAs, high QoS, and low latency. Mobile carriers, for example, may use their carrier clouds to deliver online gaming and multiplayer services, as well as build new apps for their consumers. Fixed operators can provide VPN services to their consumers with a tight SLA.
3.2.4 Cloud Auditor
A cloud auditor is a third party who may conduct a fair review of the services offered by a cloud in order to form an opinion, as shown in Fig. 3. Audits are conducted to ensure that standards are being followed by reviewing objective evidence. A cloud auditor can assess a cloud provider's services in terms of security measures, privacy implications, performance, and other factors [30]. With the industry debating whether or not to pursue "Network Virtualization," the carrier cloud might serve as a stepping stone toward that goal. Carriers are now in a better position to provide premium cloud services than public cloud providers. They are close to their consumers. Their POPs are located close to end consumers,
Fig. 3 Cloud auditor [30]
ensuring low latency. They have operational teams in place, as well as defined maintenance protocols, to ensure that their clients receive the best possible support for their services. In summary, they can provide the SLA, which is a critical component missing from today's public clouds.
3.2.5 Cloud Broker
A cloud broker serves as a middleman between a cloud computing service provider and a cloud computing service customer. A broker is like a person who acts as a connection between two or more parties while a transaction is in progress. The broker's job may be as simple as saving the buyer time by researching services from various suppliers and giving advice on how to use cloud computing to achieve corporate objectives. In this case, the broker collaborates with the customer to learn about their work processes, provisioning needs, budgeting, and data management needs. The following steps need to be focused on to implement cloud computing for sustainable governance (a minimal sketch of the data classification step is given after this list):
• Restricting any data localization policies that may exist
• Creating cross-border data transmission procedures
• Putting in place a data classification system that allows various types of data to be managed differently, and
• Developing an interoperable cloud system for the government that incorporates an iterative policy mechanism for adjusting and harmonizing government policies in the event of policy disputes.
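To make the data classification step concrete, here is a minimal sketch of how classification levels could drive deployment and cross-border handling decisions. The levels, rules, and the `handling_policy` helper are illustrative assumptions rather than an established government standard.

```python
# Hypothetical classification levels, ordered from least to most restricted.
RULES = {
    "public":       {"deployment": "public cloud",  "cross_border": True},
    "internal":     {"deployment": "hybrid cloud",  "cross_border": True},
    "confidential": {"deployment": "private cloud", "cross_border": False},
    "restricted":   {"deployment": "private cloud", "cross_border": False},
}


def handling_policy(level: str) -> dict:
    """Return the storage and transfer rules for a classification level."""
    if level not in RULES:
        raise ValueError(f"unknown classification level: {level}")
    return RULES[level]


print(handling_policy("confidential"))
# -> {'deployment': 'private cloud', 'cross_border': False}
```

Keeping such rules in one table mirrors the paper's call for an iterative policy mechanism: when regulations change, only the table needs to be updated and re-harmonized.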
4 Trends and Discussion

Figure 4 depicts the yearly increase in the cloud computing market size [31]. Figure 5 depicts the growing trend of organizations toward cloud computing [32]. Within the last year, most organizations have adopted cloud computing. Due to its various advantages, there is a growing trend toward the adoption of cloud computing.
5 Conclusion

Cloud computing is now commonly used in various online platforms. The term e-governance varies from nation to nation and is connected to a variety of problems that organizations encounter when it comes to growing adoption. The purpose of this article is to investigate the notion of e-governance. However, there are still hurdles to the widespread adoption of cloud computing in the public sector, such as the need for government policymakers to better grasp cloud computing features and
Fig. 4 Increase in cloud computing market size yearly [31]
Fig. 5 Investment in cloud computing yearly [32]
additional training to establish data protection and security for the government cloud. Hence, it is suggested that governments should work to create favorable regulatory circumstances that encourage the use of cloud computing in the public sector. To avoid fragmentation of policy approaches, it is ideal if these laws are iterated regularly by a central authority. This involves regional coordination to establish more uniformity in accountability requirements, as well as the implementation of international technical standards controlling information security, which would permit cross-border data exchange and encourage interoperability.
References

1. L. Ansari, M.A. Alam, R. Biswas, S.M. Idress, Adaptation of smart technologies and E-waste: risks and environmental impact, in Smart Technologies for Energy and Environmental Sustainability, ed. by P. Agarwal, M. Mittal, J. Ahmed, S.M. Idress (Springer Nature, 2021), pp. 208–227. https://doi.org/10.1007/978-3-030-80702-3
2. T. Nam, T.A. Pardo, Conceptualizing smart city with dimensions of technology, people, and institutions, in Proceedings of the 12th Annual International Digital Government Research Conference: Digital Government Innovation in Challenging Times—dg.o'11, p. 282 (2011). https://doi.org/10.1145/2037556.2037602
3. H.J. Scholl, M.C. Scholl, Smart governance: a roadmap for research and practice, in iConference 2014 Proceedings (2014). https://doi.org/10.9776/14060
4. M.A. Alam, L. Ansari, R. Biswas, S.I. Hassan, Z. Mohammad, Indian outlook of sustainable smart governance: a paramount for smart cities. Int. J. Tech. Innov. Mod. Eng. Sci. (2019)
5. O. Ali, V. Osmanaj, The role of government regulations in the adoption of cloud computing: a case study of local government. Comput. Law Secur. Rev. 36 (2020). https://doi.org/10.1016/j.clsr.2020.105396
6. P. Singh, Y.K. Dwivedi, K.S. Kahlon, R.S. Sawhney, A.A. Alalwan, N.P. Rana, Smart monitoring and controlling of government policies using social media and cloud computing. Inf. Syst. Front. 22, 315–337 (2020). https://doi.org/10.1007/s10796-019-09916-y
7. K. Alzadjali, A. Elbanna, Smart institutional intervention in the adoption of digital infrastructure: the case of government cloud computing in Oman. Inf. Syst. Front. 22, 365–380 (2020). https://doi.org/10.1007/s10796-019-09918-w
8. H. Sallehudin, A.H.M. Aman, R.C. Razak, M. Ismail, N.A.A. Bakar, A.F.M. Fadzil, R. Baker, Performance and key factors of cloud computing implementation in the public sector. Int. J. Bus. Soc. 21, 134–152 (2020)
9. D.S.S.B.P.K.A. Kamel, Advancing e-Government using Internet of Things, in Mobile Computing and Sustainable Informatics, ed. by S. Shakya, R. Bestak, R. Palanisamy, K.A. Kamel (Springer International Publishing, 2022), pp. 123–137
10. W.A. Wanjiru, M. Yusuf, Cloud computing and performance of county governments in Kenya; a case of the county Government of Nyandarua. Glob. J. Manage. Bus. (2020)
11. C. Pettit, B. Stimson, J. Barton, X. Goldie, P. Greenwood, R. Lovelace, S. Eagleson, Open access, open source and cloud computing: a glimpse into the future of GIS. Handb. Plan. Support Sci. 56–71 (2020). https://doi.org/10.4337/9781788971089.00011
12. T. Staunton, Cloud computing: adoption barriers and enablers for government. https://openresearch-repository.anu.edu.au/handle/1885/201826 (2020)
13. A.I. Aripin, A. Abimanyu, F.S. Prabowo, B. Priandika, B. Sulivan, A. Zahra, Mobile cloud computing readiness assessment framework in upstream oil and gas using RAMI 4.0, in International Conference on Information Management and Technology (ICIMTech) 2020, pp. 130–135 (2020). https://doi.org/10.1109/ICIMTech50083.2020.9211193
14. P.V.B. Reddy, ASAFE G-cloud based framework to improve government healthcare services. Mukt Shabd J. IX, 1437–1441 (2020)
15. A.M. Ahmed, O.W. Allawi, A review study on the adoption of cloud computing for higher education in Kurdistan Region—Iraq. UHD J. Sci. Technol. 4, 59–70 (2020). https://doi.org/10.21928/uhdjst.v4n1y2020
16. M.J. Ahn, Y.C. Chen, Artificial intelligence in government: potentials, challenges, and the future. PervasiveHealth Pervasive Comput. Technol. Healthc. 243–252 (2020). https://doi.org/10.1145/3396956.3398260
17. N. Joseph, P. Grover, P.K. Rao, P.V. Ilavarasan, Deep analyzing public conversations: insights from twitter analytics for policy makers, in e-Business, e-Services, and e-Society (Springer International Publishing, Delhi, 2017), pp. 276–288. https://doi.org/10.1007/978-3-319-68557-1
18. M.N.O. Sadiku, S.M. Musa, Cloud computing opportunities and challenges (2014)
19. M.A. Hossain, Y.K. Dwivedi, C. Chan, C. Standing, A.S. Olanrewaju, Sharing political content in online social media: a planned and unplanned behaviour approach. Inf. Syst. Front. 20, 485–501 (2018). https://doi.org/10.1007/s10796-017-9820-9
20. A. Androutsopoulou, Y. Charalabidis, E. Loukis, Policy informatics in the social media era: analyzing opinions for policy making. Lecture Notes in Computer Science, vol. 11021 LNCS, 129–142 (2018). https://doi.org/10.1007/978-3-319-98578-7_11
21. D. Yimam, E.B. Fernandez, A survey of compliance issues in cloud computing. J. Internet Serv. Appl. 7 (2016). https://doi.org/10.1186/s13174-016-0046-8
22. N. Bhalaji, Cloud load estimation with deep logarithmic network for workload and time series optimization. J. Soft Comput. Paradig. 3, 234–248 (2021). https://doi.org/10.36548/jscp.2021.3.008
23. H.K. Andi, Analysis of serverless computing techniques in cloud software framework. J. ISMAC 3, 221–234 (2021). https://doi.org/10.36548/jismac.2021.3.004
24. S. Shakya, An efficient security framework for data migration in a cloud computing environment. J. Artif. Intell. Capsul. Networks 01, 45–53 (2019). https://doi.org/10.36548/jaicn.2019.1.006
25. S.R. Mugunthan, Soft computing based autonomous low rate DDoS attack detection and security for cloud computing. J. Soft Comput. Paradig. 2019, 80–90 (2019). https://doi.org/10.36548/jscp.2019.2.003
26. J.I.Z. Chen, L.T. Yeh, Graphene based web framework for energy efficient IoT applications. J. Inf. Technol. Digit. World 3, 18–28 (2021). https://doi.org/10.36548/jitdw.2021.1.003
27. W. Bumpus, NIST cloud computing standards roadmap. NIST Cloud Comput. Stand. 1–113 (2013)
28. M. Bousmah, O. Labouidya, N. El Kamoun, MORAVIG: an android agent for the project mobile e-learning session. Int. J. Comput. Appl. 113, 12–19 (2015). https://doi.org/10.5120/19901-2006
29. P. Pedamkar, EDUCBA "Public cloud providers". https://www.educba.com/public-cloud-providers/
30. M. Sookhak, A. Gani, H. Talebian, A. Akhunzada, S.U. Khan, R. Buyya, A.Y. Zomaya, Remote data auditing in cloud computing environments: a survey, taxonomy, and open issues. ACM Comput. Surv. 47, 1–34 (2015). https://doi.org/10.1145/2764465
31. Cloud Computing Market Share. https://www.t4.ai/industry/cloud-computing-market-share. Last updated: Feb 2021
32. Enterprise spending on cloud and data centers by segment from 2009 to 2020. https://www.statista.com/statistics/1114926/enterprise-spending-cloud-and-data-centers/
IoT-Enabled Smart Helmet for Site Workers
D. Mohanapriya, S. K. Kabilesh, J. Nandhini, A. Stephen Sagayaraj, G. Kalaiarasi, and B. Saritha
Abstract On construction sites, the fatality rates of workers are increasing day by day. Currently, no rescue measures are in place to reduce death rates among construction workers. During working hours, workers must be monitored continuously to ensure their safety and to take remedial measures in the event of an emergency. Wearing a helmet is one of the rescue methods to prevent accidents, but the traditional construction helmet does not help to reduce the fatality rate drastically. A smart helmet that monitors the physical condition of the workers is designed in this paper with sensors including MEMS sensors, heartbeat sensors, temperature sensors, IR sensors, and vibration sensors. A panic button is included, which enables the worker to send an alert message to the owner. A GPS and GSM module is included to track the location of the workers and send a message to the owner. Workers are continuously monitored through IoT.
Keywords Smart helmet · Arduino · Internet of Things · Global positioning system · Global system for mobile communication · ThingSpeak
1 Introduction

Safety is a major issue for workers at the construction site, such as during excavation, material handling, scaffolding, working at heights, formwork, stacking, and housekeeping works [1]. A normal helmet does not provide protection from construction accidents, illness, and similar hazards. When workers are at height, they may face problems such as fear, tension, trembling, giddiness, headache, and leg pain. These may affect the workers both mentally and physically.
India has the world's highest accident rate among construction workers; a survey shows that 165 out of every 1000 workers are injured during construction work [2]. Construction workers are not the only sufferers; their families and children are also affected. According to an Occupational Safety and Health Administration survey, 39.9% of deaths were caused by falls, 8.4% by objects, 1.4% by accidents, and 8.5% by electrocution. The recent advancement of technology has contributed to improved industry performance, and at the same time, the work has become more challenging and unsafe [2]. Most large-scale companies have proper safety measures for their construction projects, but small-scale projects do not have enough safety measures to prevent site accidents [2]. The current situation of the construction worker is analyzed in this paper; if any abnormality occurs, the site manager and worker will be informed via IoT. This helps to ensure the safety of each worker at the construction site.
2 Related Work

Construction sites use IoT to detect falls by workers and prevent accidents. Mehta et al. developed wearable devices, like a band, to monitor the safety and health parameters of workers by using various sensors to reduce the death rate of workers [3, 4]. Different types of wearable devices have been developed for monitoring health parameters [5, 6]. Gerald Pirkl et al. developed a wearable sensor system (light detection and ranging (LiDAR) and inertial measurement unit (IMU) sensors) that helps to maintain proper documentation and work support. The device is capable of capturing the current position of a worker, measuring the dimensions of a room with an error rate of less than 6%, and enabling user-driven documentation [7]. According to Aliyev and co-authors, the smart helmet platform HeadgearX, which consists of ten types of sensors, feedback mechanisms, and Bluetooth connectivity, is a great way to improve the safety of construction workers and monitor them in real time from afar [8]. Handling instruments such as hammers, cutting machines, big axes, chisels, and other sharp equipment carelessly causes severe pain, wounds, and blood loss, and may even cut nerves and fingers. Workers also need to operate heavy machinery like concrete machines and power tools, work in undesirable weather conditions, and perform other unsafe tasks on a daily basis [4]. In this physically difficult line of work, injuries are common, and sadly, deaths are a reality. Another system has been discussed to detect the orientation and rotation of coal mine workers; if any abnormality occurs, the system automatically gives an alert message to the owner/engineer. In case of emergency, however, the exact location of the worker cannot be determined [8, 9]. Artificial intelligence techniques were introduced into a smart helmet by Campero-Jurado et al., where convolutional neural networks (CNN) are used to detect hazards on the job [10]. The system has a high accuracy when compared to static neural networks, Naive Bayes classifiers, and support vector machines [11–13]. Kuhar et al.
invented a system that enables the supervisor to detect the location of the worker and continuously monitor various field tasks. The worker can also send information to the supervisor room in case of any emergency through an HC12 module [11, 14]. Air quality and hazardous event detection are also monitored in the system developed by Nandhini et al., where Wi-Fi technology is used to transfer information between the device and the supervisor system [15]. Data collection is performed, and IoT technology can be used to store the data for continuous monitoring [16].
3 Proposed System

The image shown in Fig. 1 illustrates the future construction worker, who will wear a smart helmet with sensors that monitor their physiological state. The sensors measure parameters such as body temperature, blood pressure, pulse rate, muscle strain, motion, electrocardiogram data, and other biomedical information using various biomedical pads. Communication between the system components forms a network. The components also update the data on the workers' smart phones, as well as the computers on site, so management can visualize any issues that need attention. Whenever unacceptable conditions arise, emergency services or medical support
Fig. 1 Construction worker with smart helmet [17]
Fig. 2 Block diagram
services will be notified. Telemedicine will be linked to the data through external connections via cellular networks or the Internet [17]. The IoT-enabled smart helmet is designed to decrease the death rate of construction workers. The block diagram of the system is shown in Fig. 2, and the flow diagram is illustrated in Fig. 3. Sensors embedded in the smart helmet include MEMS sensors, heartbeat sensors, temperature sensors, vibration sensors, and infrared sensors. The GPS and GSM module is included to track the location of the workers at any time and to send a short message to the predefined mobile number of the owner/civil engineer.
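To make this monitoring-and-alert flow concrete, the following minimal sketch shows the decision logic (in Python for readability; the actual helmet firmware, per Sect. 4, runs in Embedded C on an Arduino). The threshold values and the sensor-sample structure are illustrative assumptions, not measured specifications.

```python
# Hypothetical safety thresholds for the helmet's decision logic.
MAX_TEMP_C = 39.0        # body temperature alarm level
MAX_HEARTBEAT_BPM = 140  # heart rate alarm level
FALL_G_THRESHOLD = 2.5   # MEMS acceleration spike suggesting a fall


def check_worker(sample, panic_pressed):
    """Return an alert string if any reading is abnormal, else None."""
    if panic_pressed:
        return "PANIC button pressed"
    if sample["temp_c"] > MAX_TEMP_C:
        return f"High body temperature: {sample['temp_c']} C"
    if sample["heartbeat_bpm"] > MAX_HEARTBEAT_BPM:
        return f"Abnormal heartbeat: {sample['heartbeat_bpm']} bpm"
    if sample["accel_g"] > FALL_G_THRESHOLD:
        return "Possible fall detected"
    return None


alert = check_worker({"temp_c": 39.4, "heartbeat_bpm": 92, "accel_g": 0.9}, False)
if alert:
    # In the real system, the GSM module would SMS the site engineer,
    # appending the GPS latitude/longitude (see Sect. 5).
    print("ALERT:", alert)
```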
4 Hardware Description

The hardware design of the system is illustrated in Fig. 4 and consists of the following sensors and units. The specifications of the sensors are given in Table 1.
Arduino Uno: The Arduino Uno is a microcontroller (ATmega328P)-based development board. The Embedded C programming language is used to control the process of the smart helmet. Various sensors acquire data from the environment and are connected to the Arduino board, which controls the entire process of this system. There are significantly more microcontrollers than the few running some "flavor" of the Arduino board family, yet the Arduino board family …
• has a common IDE that is really simple to use
• has a lot of peripheral support, including third-party libraries
• has a great number of example applications to begin with
Fig. 3 Flow diagram
• scales easily between different members of the family, including changing the microcontroller family that the individual boards are based on.
In contrast, more general microcontrollers usually come with their own "ecosystem," often including some IDE. However, there has not been much work to make projects portable, and the capabilities of the individual approaches may differ. As a general assessment, while the Arduino ecosystem may not extract the maximum from the microcontroller(s) used, it offers high portability and ease of programming at the expense of some raw performance. Conversely, programming general microcontrollers on "bare metal" requires much more effort and offers little portability and less comfort; doing so, however, unleashes the full performance of the individual microcontroller (with no concerns about portability, a unique configuration scheme, and so on).
Fig. 4 Smart helmet
Table 1 Specifications of the hardware used

S. No.  Hardware                         Specification
1       Heartbeat sensor                 SparkFun pulse oximeter and heart rate sensor—MAX30101 & MAX32664 (Qwiic)
2       Temperature sensor               MLX90614
3       Vibration sensor                 HVM200
4       Barometric pressure sensor       BMP180
5       PIR sensor                       HC-SR501
6       Serial Wi-Fi transceiver module  ESP-01 ESP8266
7       Arduino Uno                      ATmega328P
IR Sensor: The infrared obstacle avoidance sensor module consists of an infrared transmitting tube and a receiving tube. IR waves emitted by the transmitting tube are reflected by an obstacle and picked up by the receiving tube. The onboard comparator circuitry indicates detection by lighting a green LED. There are three wires connecting the module to the I/O: Vcc, GND, and OUTPUT. The module is suitable for 3.3–5 V levels. Reflectance results in a digital signal being produced on the output pin. An onboard preset allows the detection range to be adjusted; the effective distance ranges from 2 to 80 cm.
MEMS Sensor: The micro-electro-mechanical systems (MEMS) sensor is used to measure angular velocity. The MEMS accelerometer sensor measures the person's motion along all three axes in case of fall detection from heights.
Temperature Sensor: In order to record, monitor, or signal changes in temperature, a temperature sensor measures the temperature of its surroundings and converts that information into electronic data. Direct contact is necessary between the temperature sensor and the object being monitored (the human body).
Heartbeat Sensor: The heartbeat sensor measures the heart rate. The heartbeat is measured in beats per minute, which indicates the number of times the heart contracts or expands in a minute [5, 18].
Vibration Sensor: An accelerometer measures the vibration or acceleration of motion of a human body. It contains an electromechanical element that converts the mechanical force caused by vibration, or a change in motion, into an electrical signal.
GSM: An electronic device or module that uses GSM mobile telephone technology to create a connection to a remote network is known as a GSM module. The mobile network views it the same way it views a normal mobile phone, which is why it needs a SIM to connect. The Arduino GSM shield permits an Arduino board to connect to the Internet, send and receive SMS, and make voice calls using the GSM library. The shield works with the Arduino Uno out of the box.
GPS Module: GPS is one of the global navigation satellite systems (GNSS) that provides geolocation and time data to a GPS receiver anywhere on or near the Earth. GPS does not require the user to transmit any data, and it operates independently of any telecommunication or Internet reception, although these technologies can enhance the usefulness of the GPS positioning data.
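To illustrate how an alert SMS carrying GPS coordinates could be composed, the sketch below drives a GSM modem with standard AT commands over a serial line; it is written in Python with pyserial for brevity, whereas the helmet itself would issue the same AT commands from its Arduino firmware. The serial port, recipient number, and coordinates are placeholder assumptions.

```python
import time

import serial  # pyserial

PORT = "/dev/ttyUSB0"              # placeholder serial port for the GSM module
ENGINEER_NUMBER = "+910000000000"  # placeholder recipient number


def send_alert_sms(lat: float, lon: float) -> None:
    """Send a location alert using standard GSM AT commands."""
    with serial.Serial(PORT, 9600, timeout=2) as gsm:
        gsm.write(b"AT+CMGF=1\r")                          # switch to SMS text mode
        time.sleep(0.5)
        gsm.write(f'AT+CMGS="{ENGINEER_NUMBER}"\r'.encode())
        time.sleep(0.5)
        body = f"Worker alert! Location: {lat:.6f},{lon:.6f}"
        gsm.write(body.encode() + b"\x1a")                 # Ctrl+Z terminates the message
        time.sleep(3)                                      # wait for the network to accept


send_alert_sms(12.971599, 77.594566)  # example coordinates
```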
5 Experimental Result

The experimental setup is shown in Fig. 5. The sensors in the smart helmet acquire data from the environment, and the live data are stored in the cloud. ThingSpeak is used to visualize and monitor the data. ThingSpeak is an open-source Internet of Things (IoT) application that allows users to aggregate, analyze, and visualize live data streams in the cloud. Users can send data directly from their devices, create instant visualizations of live data, and send alerts [19, 20]. In this work, ThingSpeak is used to visualize and monitor the current status of the construction workers; the site engineer/owner can monitor the work progress in the field at any time. Figure 6 shows the physical parameters of the worker, such as temperature, vibration, heartbeat, and humidity, through the IoT platform ThingSpeak. In case of emergency, the worker can use the panic button in the smart helmet to alert the site engineers, who can track the worker through GPS for remedial action. The location of the worker is shown in Fig. 7. Additionally, an alert message is sent via GSM to the predefined mobile numbers. The message contains the exact location, i.e., latitude and longitude information, as shown in Fig. 8.
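For reference, updating a ThingSpeak channel is a single HTTP request to its public update endpoint. The sketch below is a minimal example; the write API key and the mapping of sensor readings to channel fields are assumptions chosen for illustration.

```python
import requests

THINGSPEAK_URL = "https://api.thingspeak.com/update"
WRITE_API_KEY = "YOUR_WRITE_API_KEY"  # placeholder channel key


def upload_readings(temp_c, heartbeat_bpm, vibration, humidity):
    """Push one set of helmet readings to a ThingSpeak channel."""
    params = {
        "api_key": WRITE_API_KEY,
        "field1": temp_c,        # assumed field assignments
        "field2": heartbeat_bpm,
        "field3": vibration,
        "field4": humidity,
    }
    resp = requests.get(THINGSPEAK_URL, params=params, timeout=10)
    # ThingSpeak returns the new entry id, or 0 if the update failed.
    return int(resp.text)


print(upload_readings(36.8, 84, 0.2, 61))
```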
Fig. 5 Experimental setup
Fig. 6 ThingSpeak visualization
Fig. 7 Current location of the construction worker
Fig. 8 Message received
6 Conclusion

The proposed system ensures the well-being of construction site workers in the field. The workers are tracked and monitored through the smart helmet to save workers' lives at construction sites. The main objective of this IoT-based smart helmet is to improve the overall performance of workers on a construction site, including safety and work management. This paper discusses how the use of IoT on the construction site can be supported by the various instruments and electrical components used in building an IoT-based smart helmet. With the help of this smart helmet, one can not only ensure maximum safety for the workers on the construction site but also help to finish the work within the stipulated time. In the future, the smart helmet may be equipped with more sensors or modified to perform more functions to improve its overall efficiency.
References

1. https://theconstructor.org/practical-guide/construction-site-safety-issues/5684/
2. S. Kanchana, P. Sivaprakash, S. Joseph, Studies on labour safety in construction sites. Sci. World J. 2015, 6 (2015). Article ID 590810. https://doi.org/10.1155/2015/590810
3. K.M. Mehata, S.K. Shankar, N. Karthikeyan, K. Nandhinee, P.R. Hedwig, IoT based safety and health monitoring for construction workers, in 2019 1st International Conference on Innovations in Information and Communication Technology (ICIICT), pp. 1–7 (2019). https://doi.org/10.1109/ICIICT1.2019.8741478
4. V. Jayasree, M.N. Kumari, IOT based smart helmet for construction workers, in 2020 7th International Conference on Smart Structures and Systems (ICSSS), pp. 1–5 (2020). https://doi.org/10.1109/ICSSS49621.2020.9202138
5. J.S. Raj, Optimized mobile edge computing framework for IoT based medical sensor network nodes. J. Ubiquit. Comput. Commun. Technol. (UCCT) 3(01), 33–42 (2021)
6. S. Madhura, IoT based monitoring and control system using sensors. J. IoT Soc. Mobile Analytics Cloud 3(2), 111–120 (2021)
7. G. Pirkl, P. Hevesi, O. Amarislanov, P. Lukowicz, Smart helmet for construction site documentation and work support, in Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing: Adjunct (UbiComp '16) (Association for Computing Machinery, New York, USA, 2016), pp. 349–352. https://doi.org/10.1145/2968219.2971378
8. A. Aliyev, B. Zhou, P. Hevesi, M. Hirsch, P. Lukowicz, HeadgearX: a connected smart helmet for construction sites, in Adjunct Proceedings of the 2020 ACM International Joint Conference on Pervasive and Ubiquitous Computing and Proceedings of the 2020 ACM International Symposium on Wearable Computers (UbiComp-ISWC '20) (Association for Computing Machinery, New York, USA, 2020), pp. 184–187. https://doi.org/10.1145/3410530.3414326
9. P. Hazarika, Implementation of smart safety helmet for coal mine workers, in 2016 IEEE 1st International Conference on Power Electronics, Intelligent Control and Energy Systems (ICPEICES), pp. 1–3 (2016). https://doi.org/10.1109/ICPEICES.2016.7853311
10. N. Gomathi, S. Karthikkumar, S. Brindha, J. Paruvathavardhini, S. Menaga, A study depicting the advent of artificial intelligence in health care. Eur. J. Mol. Clin. Med. 7(11), 131–146 (2020)
11. I. Campero-Jurado, S. Márquez-Sánchez, J. Quintanar-Gómez, S. Rodríguez, J.M. Corchado, Smart helmet 5.0 for industrial internet of things using artificial intelligence. Sensors 20(21), 6241 (2020). https://doi.org/10.3390/s20216241
12. A.S. Sagayaraj, S.K. Kabilesh, D. Mohanapriya, A. Anandkumar, Determination of soil moisture content using image processing—a survey, in 2021 6th International Conference on Inventive Computation Technologies (ICICT) (IEEE, 2021), pp. 1101–1106
13. I.J. Jacob, P. Ebby Darney, Design of deep learning algorithm for IoT application by image based recognition. J. ISMAC 3(03), 276–290 (2021)
14. P. Kuhar et al., J. Phys.: Conf. Ser. 1950, 012075 (2021)
15. M. Nandhini, G.V. Padma Priya, S.R. Nandhini, K. Dinesh, IoT based smart helmet for ensuring safety in industries. Int. J. Eng. Res. Technol. 6(04), 1–4 (2018)
16. J.I.Z. Chen, L.-T. Yeh, Graphene based web framework for energy efficient IoT applications. J. Inf. Technol. 3(01), 18–28 (2021)
17. R. Edirisinghe, Digital skin of the construction site: smart sensor technologies towards the future smart construction site. Eng. Constr. Architect. Manage. (2019)
18. S.K. Kabilesh, K.C. Sivashree, S. Sumathiiswarya, G. Narmathadevi, R. Panjavarnam, Self-regulated anaesthesia feeder for surgical patients. Res. Appl.: Embed. Syst. 4(3), 1–10 (2021). https://doi.org/10.5281/zenodo.5545715
19. https://en.wikipedia.org/wiki/ThingSpeak
20. A. Bashar, Agricultural machine automation using IoT through android. J. Electr. Eng. Autom. 1(2), 83–92 (2019)
Efficient Direct and Immediate User Revocable Attribute-Based Encryption Scheme
Tabassum N. Mujawar and Lokesh B. Bhajantri
Abstract Nowadays, many organizations are adopting cloud computing for storing large amounts of the organization's important and private data. Here, it becomes important to manage appropriate access rights to these data, as they are stored outside the organization's boundary and are handled by third-party service providers. The Ciphertext Policy Attribute-based Encryption (CPABE) scheme is the most widely utilized technique that offers encrypted access control. In the existing implementations of the CPABE scheme, one of the significant issues that needs to be addressed is an efficient revocation mechanism. In this paper, a direct and immediate user revocation approach for the CPABE scheme is presented. The proposed method offers direct user revocation by maintaining a revocation list, and to keep the revocation list smaller, a validity time is embedded in the user's secret key. Revoked users would still be able to access previously generated ciphertexts; hence, a ciphertext update process is incorporated, and a separate immediate revocation list is maintained so that the revoked users' access is restricted. Also, in the proposed system, the revocation information is embedded in the ciphertext as a separate part, so the update process of the ciphertext is more efficient.
Keywords Attributes · Revocation · ABE · CPABE · Revocation list · Validity period · Cloud · Access structure
1 Introduction

The idea of Attribute-based Encryption (ABE) was first introduced in [1] as a fuzzy identity-based encryption mechanism. The main idea is that the decision of whether the ciphertext can be decrypted or not depends on the attributes and policies. The ABE scheme provides an efficient way to encrypt data for multiple users by using the attributes of users and predefined access policies. It provides a one-to-many encryption mechanism along with the collusion resistance feature. There are two different ways to connect the attributes and policies with the ciphertext and the secret keys, giving two types of ABE schemes: Ciphertext Policy Attribute-based Encryption (CPABE) [2] and Key Policy Attribute-based Encryption (KPABE) [3]. In the CPABE method, the attributes are bound to the secret key, and the access policy, in the form of an access structure, is embedded in the ciphertext. In the KPABE scheme, the attributes are merged into the ciphertext, and the access policies are embedded in the secret key. In the literature, many variations are presented for efficient implementation of the CPABE and KPABE schemes. One of the important issues related to ABE schemes is how to handle the revocation mechanism. The revocation mechanism deals with how to prevent users who have left the system from accessing the stored ciphertext. Whenever any user is revoked from the system, every organization's prime concern is that he/she should not be able to access any data. It is also possible that some attributes of a user are updated; in this case, access rights must be granted according to the new attributes. Thus, revocation in an ABE system deals with user-level revocation and attribute-level revocation. User revocation signifies that the user has left the organization, hence the user's access rights must be revoked. In case of attribute revocation, the attribute values are changed, hence that specific attribute for a particular user must be revoked. There are different approaches to implementing the revocation mechanism, such as direct, indirect, and multiple-authorities-based. The indirect approach includes a secret key update process for all non-revoked users. The secret keys of all non-revoked users are updated so that the existing state of the system will be associated with the ciphertext and with these updated secret keys. In this way, the revoked users will not be able to access newly generated ciphertexts, but they will still have access to all previously created ciphertexts. Also, if there is a time gap between user revocation and key update algorithm execution, then the revoked users are able to access all ciphertexts newly generated during this gap. Thus, immediate user revocation is not possible here, and this is a major concern that needs to be addressed. A ciphertext update process is also included so that the revoked users will not be able to access the old ciphertexts either. The other possible approach for user revocation is the direct approach; here, the list of revoked users is embedded in the ciphertext. Hence, the users who are present in the revocation list cannot decrypt the ciphertext. The users who are not present in the revocation list and whose attributes satisfy the access policy can decrypt the ciphertext. The direct approach supports immediate revocation and does not need any updates of the secret key. The main issue with this approach is that the size of the revocation list will increase over time, which puts overhead on the encryption process.
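For orientation, the following sketch shows the basic CPABE workflow (setup, key generation over attributes, encryption under a policy, decryption) using the BSW07 construction from the open-source charm-crypto library, assuming its documented API; it illustrates the general CPABE flow, not the authors' specific revocable scheme.

```python
# Requires the charm-crypto library (https://github.com/JHUISI/charm).
from charm.toolbox.pairinggroup import PairingGroup, GT
from charm.schemes.abenc.abenc_bsw07 import CPabe_BSW07

group = PairingGroup('SS512')
cpabe = CPabe_BSW07(group)

(pk, mk) = cpabe.setup()                       # public key and master key

# CPABE: the user's attributes are bound to the secret key.
sk = cpabe.keygen(pk, mk, ['MANAGER', 'FINANCE'])

# The access policy (access structure) is embedded in the ciphertext.
msg = group.random(GT)
ct = cpabe.encrypt(pk, msg, '(MANAGER and FINANCE)')

assert cpabe.decrypt(pk, sk, ct) == msg        # attributes satisfy the policy
```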
The important issues in existing revocation schemes that need to be addressed include: how to handle immediate revocation, the size of the revocation list, and the overhead incurred by the ciphertext update and secret key update processes. In this paper, an efficient revocation approach for the CPABE scheme that supports direct and immediate revocation is presented. Our major contributions are:
– Direct revocation: Whenever a user is revoked from the system, the user's identity is immediately added to the revocation list, so the revoked user cannot decrypt any newly generated ciphertext.
– Small revocation list: A validity time is embedded in the user's secret key, and whenever this time expires the user is removed from the revocation list. This keeps the revocation list small.
– Immediate revocation: If a user is revoked before the validity time and old ciphertexts have not yet been updated, the user would still have access to those ciphertexts. Hence, a separate immediate revocation list is maintained, and a user revoked before the validity time is added to this list immediately. This provides immediate user revocation and also protects old ciphertexts from illegitimate access.
– Efficient ciphertext update: The encryption process embeds the revocation information in the ciphertext as a separate component. The ciphertext update process therefore has to modify only the revocation-related part instead of all components.
The rest of the paper is organized as follows: existing methods that address the user revocation issue are elaborated in Sect. 2. The proposed revocation scheme with direct and immediate revocation is explained in Sect. 3. The implementation of the proposed scheme is compared with existing methods and the results are presented in Sect. 4. The conclusion is presented in Sect. 5.
2 Related Work

A time-based revocable scheme built on the ciphertext policy attribute-based encryption mechanism is presented in [4]. It uses a direct method for user revocation that maintains a list of revoked users and attaches a validity time to each key. The scheme has the advantages of a shorter revocation list, and revoked users cannot decrypt the ciphertext after the validity time of the key expires. The scheme incorporates efficient group management and construction capability into the original CPABE method. An identity-based revocable ciphertext policy attribute-based encryption method is presented in [5]. Here the access policy is built from both user attributes and unique user identities, and the identities are also embedded in the private keys; an identity is revoked whenever the corresponding user is revoked from the system. A revocation technique based on hierarchical identity-based encryption is presented in [6]. In this method, a direct revocation approach
that incorporates the revocation list in the ciphertext is proposed. In order to keep the revocation list small, an expiry time is added to the key so that users with expired keys can be removed from the revocation list; such users cannot decrypt the ciphertext even though they are no longer part of the list. The method also includes a key update algorithm that refreshes the keys periodically. A tree-based attribute and user revocation scheme is presented in [7]. This scheme outsources the computation of costly pairing operations and also integrates a revocation mechanism; it is flexible and efficient in terms of computation time. A dynamic ciphertext policy attribute-based encryption method for user and attribute revocation is proposed in [8]. The scheme includes a ciphertext update facility triggered whenever an attribute is revoked. This ensures that only a user whose attributes satisfy the access policy, and whose attributes have not been revoked, can decrypt the ciphertext and apply the key update algorithm. The authors in [9] present a ciphertext policy attribute-based scheme that applies hidden access policies and supports a revocation mechanism. The scheme also supports a traceability feature to identify malicious users. The revocation information is maintained as a binary tree and a revocation list; it is encrypted during the encryption process and forms a separate part of the ciphertext. If any user is revoked, only the part related to the revocation information needs to be updated. An immediate revocation-based access control method for an e-health system is presented in [10]. In this scheme, an ordered binary decision diagram-based access structure is utilized to specify the access policy, and the user identities are integrated with the user keys. The proposed scheme also supports forward and backward security along with collusion resistance. A revocable access structure-based scheme including multiple authorities for the named data network is presented in [11]. The scheme includes attribute and user revocation mechanisms based on the ciphertext policy attribute-based encryption scheme. Once a user is revoked from the system, the user revocation process is initiated and the ciphertext is re-encrypted; there is no need to update the keys of users who have not been revoked. The authors in [12] present an integrated approach that includes outsourced decryption, hidden access policies and a revocation mechanism, intended mainly for resource-constrained devices. The method utilizes dual pairing vector spaces for the pairing operation; as only two vector spaces are used by the scheme, the ciphertext and key sizes remain small. An integrated ciphertext policy attribute-based encryption approach that includes online and offline encryption, an efficient revocation mechanism and outsourced decryption is presented in [13]. The scheme mainly addresses security issues in the Internet of Medical Things ecosystem. It provides encryption in two phases, online and offline, so that complex operations are performed offline and encryption time is saved. The scheme supports both user and attribute revocation, and the decryption process is outsourced to make it faster and more efficient in resource-constrained environments. A data sharing scheme based on the ciphertext policy attribute-based encryption mechanism and consisting of different authorities is presented in [14].
The method proposed in this paper mainly addresses the security issues for marine data shared with Internet of Things devices for smart oceans. The scheme includes multiple authorities to distribute components of the
key and an outsourced decryption facility. An effective attribute revocation approach is also integrated into the proposed scheme. The authors in [15] present a CPABE scheme that supports efficient access policies including multi-valued attributes. The scheme also provides a revocation strategy and a verified outsourced decryption process. It utilizes a version number strategy and proxy re-encryption, ensures that the ciphertext length remains constant, and provides a ciphertext update facility. A CPABE-based access control method that provides a revocation mechanism for users who are removed from or leave the system is proposed in [16] for the dynamic cloud environment. This work utilizes the concepts of user registration and a revocation list to implement the revocation strategy. Along with revocation, the proposed method supports efficient decryption with the help of a trusted third-party server and also guarantees constant ciphertext length. A CPABE method with efficient revocation and data deletion, along with a verification approach, is proposed in [17]. The method utilizes an attribute association tree for implementing revocation: the access policies are reconstructed with the help of the association tree and the data is re-encrypted so that revoked users cannot access it. A fast and verifiable data deletion algorithm is used to remove expired data from the system. An attribute revocation scheme that provides immediate revocation using the attribute group key concept is presented in [18]. The scheme includes multiple authorities and also provides an outsourcing facility; the keys are immediately updated upon revocation of attributes. The complex parts of the encryption and decryption phases are outsourced to fog nodes, which improves the efficiency of the overall system.

The majority of the existing approaches support user revocation by maintaining a revocation list and updating the ciphertext or keys after some time interval. The major issues in existing schemes are the growing size of the revocation list and the overhead caused by the ciphertext update and key update processes. Hence, this paper proposes a scheme that requires a smaller revocation list and incurs less overhead for the ciphertext update process. The proposed scheme also supports immediate user revocation so that old ciphertexts are also protected from revoked users.
3 Proposed Work

3.1 Proposed Scheme Overview

In this paper, a user revocation mechanism for the ciphertext policy attribute-based encryption scheme is proposed. The proposed system provides direct and immediate user revocation with minimal changes to the ciphertext and the secret key. All users in the system are issued unique identities, and the revocation list consists of the identities of the revoked users. Revoked users cannot access the data even though their attributes satisfy the access structure. Direct revocation is implemented by maintaining the revocation list: the ciphertext carries the revocation
list and the secret key holds the user's identity. Whenever a user is revoked from the system, his/her identity is added to the revocation list; newly generated ciphertexts then carry the updated revocation list, and the revoked user cannot access them. The main issue with maintaining a revocation list is that its size, and with it the size of the ciphertext, grows over time. To address this problem, a time period describing the validity of the key is embedded in the user's secret key. Users cannot decrypt the ciphertext once the validity period of their key has expired, and such users are removed from the revocation list, which keeps the list small. While decrypting the ciphertext, if the user is not present in the revocation list, the expiry date of the user's key is compared with the time period mentioned in the ciphertext; only if the user's key is still within its validity period can that user decrypt the ciphertext. In this way, a user with an expired key cannot decrypt the ciphertext even though the access structure is satisfied by his/her attributes.

Revoked users would, however, still be able to access previously generated ciphertexts. To address this issue, an update mechanism for the ciphertext is proposed. The revocation information is embedded in the ciphertext as an independent part; hence, whenever ciphertexts are updated, only the part related to the revocation information is modified while the part related to the access structure remains unchanged. This reduces the overhead incurred by the ciphertext update process. Ciphertexts are updated after a predefined time period decided by the organization. The proposed system also supports an immediate revocation approach. If a user leaves the system before the validity time of the user's key, the user's id is immediately added to the revocation list. But such a user could still try to access previously encrypted ciphertexts before they are updated, and in that case the ciphertext could be decrypted by the revoked user. Hence, to prevent the user from accessing previously generated ciphertexts, a separate Immediate Revocation List (IRL) is maintained. This list contains the identities of users revoked from the system before their validity time. The list is publicly published, and the decryption algorithm checks whether the identity associated with the user's secret key is present in it; if so, the user is not allowed to decrypt the ciphertext. Identities are removed from the list whenever the ciphertext update algorithm is executed. The IRL holds only the identities of users who left the system in the interim, so the list stays quite small and is easily manageable. Thus, the proposed system supports both immediate and direct revocation.

The architecture representing the functionality of the proposed system is depicted in Fig. 1. The major entities of the proposed system are the data user, data owner, trusted authority, and cloud storage, along with the update ciphertext module and the revocation module. The data owner encrypts the data by applying the necessary access structure and also associates the current revocation list with the ciphertext. The data owner utilizes the cloud storage to store the ciphertext.
The trusted authority is responsible for generating the necessary public parameters, keys and unique identities for all users. The data user can request the ciphertext from the cloud storage.
Fig. 1 An architecture of the proposed scheme
Access to the requested data is granted if the access structure embedded in the ciphertext is satisfied by the user's attributes and the user's identity is not present in the revocation list. The revocation module is associated with the data user side and is responsible for implementing the proposed revocation scheme. The update ciphertext module takes all older ciphertexts and the updated revocation list as input and returns the updated ciphertexts to the cloud storage.

The revocation information is represented using a binary tree approach similar to [9]. In this tree, each user identity is represented by a leaf node, and each node is labeled in breadth-first order as shown in Fig. 2, with the root labeled 0. Let n be the total number of users in the system; the tree then consists of nodes labeled up to 2n − 2. The function path(x) returns the path from the root node to the leaf node x. The min_cover function returns the minimum set of nodes required to represent the user identities that are not present in the revocation list. The important feature of this tree is that, for a non-revoked user x, there is exactly one node common to path(x) and the list returned by min_cover. This feature is used to embed the revocation information during the encryption phase and to implement the revocation process. For example, consider the revocation list RL = {u2, u5}. Then min_cover(RL) = {7, 4, 6, 12}. The path of the user with identity u6 is path(u6) = {0, 2, 5, 12}; the user u6 is not revoked, and hence there is exactly one node common to min_cover(RL) and path(u6). The validity period is mentioned along with the key, and a tree-like structure is utilized to represent the time period, as in [4].
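To make the tree mechanics concrete, the following is a minimal Python sketch of path(x) and min_cover, assuming a complete binary tree labeled breadth-first from 0 (node i has children 2i + 1 and 2i + 2) and the n user identities u1, ..., un sitting on the leaves n − 1, ..., 2n − 2. The helper names mirror the paper, but the implementation details are illustrative, not the authors' code.

```python
# Illustrative sketch of the breadth-first-labeled binary revocation tree.
# Node i has children 2*i + 1 and 2*i + 2; leaves occupy labels n-1 .. 2n-2.

def path(node):
    """Labels on the path from the root (0) down to `node`."""
    p = [node]
    while node != 0:
        node = (node - 1) // 2          # parent of node i is (i - 1) // 2
        p.append(node)
    return list(reversed(p))

def min_cover(revoked_leaves, n_leaves):
    """Minimal set of subtree roots covering every non-revoked leaf.

    A node is in the cover iff its subtree contains no revoked leaf
    while its parent's subtree does (complete-subtree method)."""
    first_leaf = n_leaves - 1
    revoked_paths = set()
    for leaf in revoked_leaves:
        revoked_paths.update(path(leaf))

    def cover(node):
        if node not in revoked_paths:   # clean subtree: keep it whole
            return [node]
        if node >= first_leaf:          # revoked leaf: nothing to cover
            return []
        return cover(2 * node + 1) + cover(2 * node + 2)

    return cover(0)

def is_revoked(leaf, revoked_leaves, n_leaves):
    """A user is NOT revoked iff path(leaf) meets min_cover in exactly one node."""
    common = set(path(leaf)) & set(min_cover(revoked_leaves, n_leaves))
    return len(common) != 1

# Example with 8 users on leaves 7..14 (u1..u8): revoke u2 (leaf 8) and u5 (leaf 11).
print(min_cover({8, 11}, 8))        # -> [7, 4, 12, 6], matching the paper's example
print(path(12))                     # leaf of u6: [0, 2, 5, 12]
print(is_revoked(12, {8, 11}, 8))   # False: exactly one common node
```

With this leaf numbering, the example reproduces the paper's min_cover(RL) = {7, 4, 6, 12} and path(u6) = {0, 2, 5, 12} exactly, differing only in the order in which the cover nodes are listed.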
Fig. 2 The binary tree to represent the revocation details

Fig. 3 The tree to represent the validity time associated with user's key
In this paper, a scenario of an organization is considered, where important data is stored in cloud storage and employees can access it as per the defined access policies. The validity time period corresponds to employees' contracts, which generally run for one year. The time period is represented by a tree-like structure as shown in Fig. 3: the root represents the year, the next level the month, and the next level the day. The nodes in the tree are labeled as follows: the root is 0, the months are 1–12, and the days are numbered from one to the last day of the month. A time period is represented by the path from the root node to the leaf node of the respective day. As shown in Fig. 3, the date May 31st, 2022 is represented as {0 − 5 − 31}. The user or employee can decrypt the corresponding ciphertext if the validity time period is completely covered by the encryption time. Consider an employee whose contract runs until November 30th, 2022; the validity time period associated with the secret key is then {0 − 11 − 30}. If the encryption time period mentioned with the ciphertext is December 31st, 2022, i.e., {0 − 12 − 31}, the user can decrypt the ciphertext, as the validity time period is completely covered by the encryption time.
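As an illustration of this coverage test, the sketch below reduces the one-year time tree to a lexicographic comparison of (root, month, day) tuples; this reproduces the November/December example above, though the paper's actual tree-based test may differ in detail.

```python
from datetime import date

def time_path(d):
    """Tree path for a date within the year: root 0, then month, then day."""
    return (0, d.month, d.day)

def covered(key_validity, enc_time):
    """True iff the key's validity path is completely covered by the
    encryption time, following the paper's November/December example."""
    return time_path(key_validity) <= time_path(enc_time)

# Contract valid until 30 Nov 2022, ciphertext stamped 31 Dec 2022 -> decryptable.
print(covered(date(2022, 11, 30), date(2022, 12, 31)))  # True
print(covered(date(2022, 12, 31), date(2022, 11, 30)))  # False
```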
3.2 Proposed Scheme Construction

The proposed revocable CPABE scheme has different phases: initial system setup, encryption, key generation, decryption and ciphertext update. The revocation list (RL) is associated with the ciphertext, and the revocation information is embedded as a separate ciphertext component. The Immediate Revocation List (IRL) is publicly available and accessible to all; the validity time period is specified in the secret key, and the encryption time is specified during the encryption phase.

System Setup In this phase, the trusted authority generates all the public parameters and issues the Public Key (PK) and Master Key (MK). The scheme uses a bilinear group G of prime order p with generator g, together with a bilinear map e: G × G → G_T. Random elements a, b ∈ Z_p are selected. Let the time period be represented by the tree T with a total of |T| nodes; randomly select t_i ∈ Z_p for all i ∈ (0, |T|) and compute T_i = g^{t_i}. Let there be n users in the system, with the user revocation information represented by the tree R whose nodes are labeled up to 2n − 2; randomly select r_j ∈ Z_p for all j ∈ (0, 2n − 2) and compute R_j = g^{r_j}. Finally, the PK is published as PK = {G, g, g^b, e(g, g)^a, {T_i}, {R_j}}, ∀i ∈ (0, |T|), ∀j ∈ (0, 2n − 2), and the MK is kept as MK = {g^a, {t_i}, {r_j}}, ∀i ∈ (0, |T|), ∀j ∈ (0, 2n − 2).

Encryption The access structure is represented as a tree in which intermediate nodes are logic gates such as OR and AND, and leaf nodes represent the attributes. The inputs for the encryption phase are the access structure A, the Public Key (PK), the Revocation List (RL) and the message m to be encrypted. Every node in the access tree has an associated index value and threshold value. Indices are assigned to the children of each node sequentially, starting at 1. A leaf node has threshold value 1, an OR gate has threshold 1, and an AND gate has threshold 2. A polynomial of degree d, one less than the corresponding threshold value, is associated with each node in the access tree. The ciphertext generated by the encryption phase has two independent parts: one related to the access structure and the other related to the revocation list. The ciphertext components are computed as follows. A secret element s ∈ Z_p is selected at random and associated with the root node. The polynomial q_r for the root node is set so that q_r(0) = s, with the remaining points of the polynomial selected at random according to the degree of the root node. The polynomial of any other node z is set as q_z(0) = q_{parent(z)}(index(z)), where the index function returns the index of a node and the parent function returns its parent. Then compute C̃ = m · e(g, g)^{as} and C* = (g^b)^s. Let there be k leaf nodes in total; for every leaf node l compute C_l = g^{q_l(0)} and C_l* = H(attribute(l))^{q_l(0)}. The set min_cover(RL) is associated with the revocation list RL; compute R̃_j = R_j^s for all j ∈ min_cover(RL). The encryption time period T_e is also associated with the ciphertext. The ciphertext is generated as CT = {A, C̃, C*, {C_l, C_l*}, {R̃_j}, RL, T_e}, ∀l ∈ (1, k), ∀j ∈ min_cover(RL).
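The top-down polynomial assignment over the access tree is ordinary threshold secret sharing; the sketch below reproduces it over a toy prime field in Python. The prime, the dict-based tree encoding and the helper names are illustrative, not part of the scheme's specification.

```python
import random

P = 2**31 - 1   # a toy prime standing in for the group order p

def rand_poly(constant, degree):
    """Random polynomial over Z_p with poly(0) = constant."""
    return [constant] + [random.randrange(P) for _ in range(degree)]

def poly_eval(coeffs, x):
    return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P

def assign_shares(node, secret, shares):
    """Top-down assignment: q_root(0) = s, q_z(0) = q_parent(index(z)).

    `node` is a dict with 'threshold' (1 for leaves/OR, 2 for AND) and
    either 'attr' (leaf) or 'children' (gate)."""
    q = rand_poly(secret, node['threshold'] - 1)   # degree = threshold - 1
    if 'attr' in node:
        shares[node['attr']] = q[0]                # q_l(0) goes into C_l, C_l*
        return
    for index, child in enumerate(node['children'], start=1):
        assign_shares(child, poly_eval(q, index), shares)

# (A AND B): both shares are needed to reconstruct s by interpolation.
tree = {'threshold': 2, 'children': [
    {'threshold': 1, 'attr': 'A'},
    {'threshold': 1, 'attr': 'B'},
]}
s = random.randrange(P)
shares = {}
assign_shares(tree, s, shares)
# Lagrange interpolation at 0 from the points (1, share_A) and (2, share_B):
recovered = (2 * shares['A'] - shares['B']) % P
assert recovered == s
```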
Key Generation In this phase, the user's secret key (SK) is generated and the validity time period is associated with the key; the user's identity U_id is also specified along with the key. The inputs for this phase are the user's attribute set A = {a_1, a_2, ..., a_l}, the master key MK and the time period represented by the tree T. The algorithm first randomly selects an element x ∈ Z_p and computes SK' = g^{(a+x)/b}. It also randomly selects x_k ∈ Z_p for all k ∈ (1, l) and then computes S_k = g^x · H(a_k)^{x_k} and S_k' = g^{x_k}, ∀k ∈ (1, l). The validity time period is represented by the array T, and for every component of the time period the key component T̃_i = T_i^x is computed. The secret key is generated as SK = {SK', {S_k, S_k'}, {T̃_i}, U_id}, ∀k ∈ (1, l), ∀i ∈ (1, t).

Decryption The decryption algorithm takes as input the ciphertext (CT), the secret key (SK), the revocation list (RL) and the immediate revocation list (IRL). The algorithm does not decrypt the ciphertext in the following cases:
– If the access structure is not satisfied by the user's attributes.
– If U_id ∈ RL or U_id ∈ IRL, even though the access structure is satisfied by the user's attributes.
– If U_id is not present in either revocation list but the validity time of the key (T) is not completely covered by the encryption time (T_e) associated with the ciphertext.
If none of the above cases holds, the ciphertext is decrypted as m = C̃ / (e(C*, SK') / e(g, g)^{xs}). The secret s associated with the root node is recovered by satisfying the access structure; this secret is then used to recover the original message m.

Update Ciphertext In order to keep the revocation list small, users are removed from the revocation list once the validity time of their key has expired. Such users should not be able to access ciphertexts generated in the past, so an update algorithm is implemented to embed the current revocation information. Every ciphertext contains a separate part for the revocation information, and hence only that part needs to be updated. The inputs for this phase are the ciphertext (CT) and the updated revocation list (RL'). The set min_cover(RL') is computed for RL'. Then λ ∈ Z_p is selected at random, r_j* = λ · r_j is computed for all j ∈ (0, 2n − 2), followed by R_j* = g^{r_j*}, and finally R̃_j* = (R_j*)^s for all j ∈ min_cover(RL'). The updated ciphertext is generated as CT' = {A, C̃, C*, {C_l, C_l*}, {R̃_j*}, RL', T_e}, ∀l ∈ (1, k), ∀j ∈ min_cover(RL').
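A compact sketch of the decryption gating implied by the three cases above. The user/ciphertext record layout and the lambda-based policy check are illustrative stand-ins for the real access-structure and pairing machinery.

```python
def may_decrypt(user, ct, RL, IRL):
    """Gate checks run before the pairing-based decryption.

    `user` and `ct` are illustrative dicts; validity/enc_time reuse the
    (root, month, day) tuple comparison from the earlier time-tree sketch."""
    if not ct["policy"](user["attrs"]):           # access structure unsatisfied
        return False
    if user["uid"] in RL or user["uid"] in IRL:   # direct or immediate revocation
        return False
    if not user["validity"] <= ct["enc_time"]:    # key validity not covered by T_e
        return False
    return True   # recover s via the access tree, then m = C~ / e(g,g)^{as}

user = {"uid": "u6", "attrs": {"doctor", "cardiology"}, "validity": (0, 11, 30)}
ct = {"policy": lambda a: {"doctor"} <= a, "enc_time": (0, 12, 31)}
print(may_decrypt(user, ct, RL=set(), IRL=set()))   # True
```

Note how the ciphertext update phase touches only the {R̃_j, RL} component of such a record: the access-structure components {C_l, C_l*} are never regenerated, which is the source of the update-time savings reported in the next section.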
4 Results and Discussion

The proposed revocable mechanism is implemented using the jPBC library [19] and a standard implementation of CPABE. An organizational scenario is considered, where the data is stored in cloud storage in encrypted form and employees are given access to the respective data by defining access policies over their attributes.
Fig. 4 The comparison of ciphertext update time by varying the number of attributes (y-axis: CT update time in seconds; x-axis: number of attributes, 5–25; series: R-CP-ABE [4] vs. the proposed scheme)
The secret keys for all employees are generated by the trusted authority, and the attributes are embedded in the secret keys. Unique identities are also generated whenever an employee joins the organization. The validity time period is represented using the tree structure explained in the previous section, with a one-year time period; the ciphertext update time period is also kept at one year. The Immediate Revocation List (IRL) is updated whenever the ciphertext update process is executed. The revocation schemes presented in [4] and [9] are also implemented, and the performance of the proposed scheme is compared with them. The performance is compared in terms of the time required for encryption, ciphertext update and key generation, varying the number of attributes and considering different scenarios.

Figure 4 shows the comparison of the time required for the ciphertext update process between the proposed scheme and the scheme presented in [4]. The ciphertext update process in the proposed scheme needs to modify only the component related to the revocation information; therefore, the time taken by the proposed scheme is much lower than that of the existing scheme. Figure 5 shows the comparison of encryption time with a varying number of revoked users. The access policies are built using 10 different attributes, and the total number of users considered is 30. The proposed scheme removes users from the revocation list whenever the validity time of their key expires; hence, the size of the revocation list does not grow even as the number of revoked users in the system increases. Here, the proposed scheme is compared with the scheme in [9]: because the revocation list in the proposed scheme is smaller, it consumes less time for encryption. Figure 6 shows the comparison of the time taken to update the user's secret key; here, too, the proposed method performs better than [4]. The keys include a separate component related to revocation, and hence only that part needs to be updated whenever keys are reissued to users to reflect the current state of the system.
Fig. 5 The comparison of encryption time by varying the number of revoked users (y-axis: encryption time in seconds; x-axis: number of revoked users, 5–25; series: TR-AP-CPABE [9] vs. the proposed scheme)
Fig. 6 The comparison of key generation time by varying the number of attributes (y-axis: key generation time in seconds; x-axis: number of attributes, 5–25; series: R-CP-ABE [4] vs. the proposed scheme)
5 Conclusion

In this paper, a user revocable CPABE scheme with direct and immediate revocation is presented. A revocation list is maintained to support direct user revocation, and immediate user revocation is implemented using an immediate revocation list. The concept of a validity time is incorporated into the key so that the size of the revocation list does not keep growing. The paper also presents an efficient ciphertext update algorithm that splits the ciphertext into two components, one related to the access policy and the other related to revocation; during the update process, only the revocation components are modified, which reduces the overall burden on the system. The results also show that the time required for the ciphertext update and encryption processes is lower than in existing methods. In the future, an attribute revocation approach can be added to the proposed scheme.
References

1. A. Sahai, B. Waters, Fuzzy identity based encryption. Adv. Cryptol. Eurocrypt 3494, 457–473 (2005)
2. J. Bethencourt, A. Sahai, B. Waters, Ciphertext policy attribute based encryption, in IEEE Symposium on Security and Privacy (2007), pp. 321–334
3. V. Goyal, O. Pandey, A. Sahai, B. Waters, Attribute based encryption for fine-grained access control of encrypted data, in Proceedings of the 13th ACM Conference on Computer and Communications Security (2006), pp. 89–98
4. L. Zhe, W. Fuqun, C. Kefei, T. Fei, A new user revocable ciphertext-policy attribute-based encryption with ciphertext update. Secur. Commun. Netw. 2020, 1–11 (2020)
5. W. Weijia, W. Zhijie, L. Bing, D. Qiuxiang, H. Dijiang, IRCP-ABE: identity revocable ciphertext-policy attribute-based encryption for flexible secure group based communication. IACR Cryptol. ePrint Arch. 1100, 1–14 (2017)
6. J.K. Liu, T.H. Yuen, P. Zhang, K. Liang, Time-based direct revocable ciphertext-policy attribute-based encryption with short revocation list. Appl. Cryptogr. Netw. Secur. 516–534 (2018)
7. Z.L. Jiang, R. Zhang, Z. Liu, S.M. Yiu, L.C.K. Hui, X. Wang, J. Fang, A revocable outsourcing attribute-based encryption scheme, in Lecture Notes of the Institute for Computer Sciences (2018)
8. W. Guangbo, W. Jianhua, Research on ciphertext-policy attribute-based encryption with attribute level user revocation in cloud storage. Math. Probl. Eng. 2017, 1–12 (2017)
9. H. Dezhi, P. Nannan, K. Li, A traceable and revocable ciphertext-policy attribute-based encryption scheme based on privacy protection. IEEE Trans. Depend. Secur. Comput. (2020)
10. E. Kennedy, J. Beakcheol, W.K. Jong, Collaborative ehealth privacy and security: an access control with attribute revocation based on OBDD access structure. IEEE J. Biomed. Health Inform. 24(10) (2020)
11. W. Zhijun, Z. Yun, X. Enzhong, Multi-authority revocable access control method based on CP-ABE in NDN. Future Internet 12(1) (2020)
12. Z. Dominik, M. Alexander, Efficient revocable attribute-based encryption with hidden policies, in IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom) (2020), pp. 1638–1645
13. G. Rui, Y. Geng, S. Huixian, Z. Yinghui, Z. Dong, O3-R-CP-ABE: an efficient and revocable attribute-based encryption scheme in the cloud-assisted IoMT system. IEEE Internet Things J. 8(11), 8949–8963 (2021)
14. Y. Yao, G. Lei, L. Shumei, Z. Jian, W. Hongli, Privacy protection scheme based on CP-ABE in crowdsourcing-IoT for smart ocean. IEEE Internet Things J. 7(10), 10061–10071 (2020)
15. Z. Yang, X. Xin, Z. Xing, D. Yi, A revocable storage CP-ABE scheme with constant ciphertext length in cloud storage. Math. Biosci. Eng. 16(5), 4229–4249 (2019)
16. H. Yong-Woon, L. Im-Yeong, CP-ABE access control that blocks access of withdrawn users in dynamic cloud. KSII Trans. Internet Inf. Syst. 14(10), 4136–4156 (2020)
17. M. Jun, W. Minshen, X. Jinbo, H. Yongjin, CP-ABE-based secure and verifiable data deletion in cloud. Secur. Commun. Netw. 2021, 1–14 (2021)
18. S. Tu, M. Waqas, F. Huang, G. Abbas, Z.H. Abbas, A revocable and outsourced multi-authority attribute-based encryption scheme in fog computing. Comput. Netw. 195, 108196 (2021)
19. A. De Caro, V. Iovino, jPBC: Java pairing based cryptography, in Proceedings of the 2011 IEEE Symposium on Computers and Communications (2011), pp. 850–855
Comparative Analysis of Deep Learning-Based Abstractive Text Summarization Techniques Dakshata Argade and Vaishali Khairnar
Abstract In this digital era, a vast amount of information is generated, and it is difficult to obtain the needed information quickly and efficiently. Everyone today needs more information from existing data in a shorter amount of time. This calls for a good mechanism that extracts important information from raw data or provides an abstract view of the source document. In this survey, we describe different recent deep learning-based approaches used for abstractive text summarization. We explain the different approaches along with their working, effectiveness, and drawbacks, and analyze them based on their architecture, training algorithm, and the dataset used. We further discuss how text summarization can be improved in the future.

Keywords Abstractive text summarization · Recurrent neural network · Long short-term memory
1 Introduction

Text summarization is the process of representing a larger document in an accurate and precise way without altering the meaning of the original text. Text summarization is broadly divided based on the input, domain knowledge, and the output produced. Based on input, summarization may operate on single or multiple documents; based on context, it may be domain-specific, generic, or query-driven; and based on output type, it may be extractive or abstractive. Here the focus is on the output-based division, i.e., abstractive versus extractive. Extractive summarization extracts important sentences and words from the original text, so that parts of the original document are reused to generate the summary. Abstractive summarization instead creates new sentences from the original text, which requires understanding of the language. Many real-time
applications of text summarization are available, such as summarization of news and blogs, headline generation, summarization of financial and medical reports, and summarization of web-based documents, meetings, books, and scientific articles. Due to the linguistic complexity and the semantic and syntactic constraints of text documents, summarization is quite challenging in the field of NLP [1]. In past years, the majority of the work was done on extractive summarization due to the complexity and difficulty of abstractive summarization. The extractive approach is easier because copying parts of the text from the input document ensures that the created summary is accurate and grammatically correct; it needs no extra effort for checking grammar and syntax. Extraction-based summaries are syntactically correct, but they suffer from problems of cohesion, ambiguity, and reference identification. Another problem with extractive summaries is the lack of balance in topic coverage in the case of multi-document summarization, and the extractive summary often does not look logically linked because of weak linkage between sentences. The issues associated with the extractive approach can be overcome by the abstractive approach, which pays attention to the semantic view of the text and uses natural language generation techniques. Unlike extractive methods, a summary created using an abstractive approach is more cohesive, grammatically correct, and readable. The following approaches are used for abstractive text summarization:

1. Seq-2 Seq
2. Bidirectional Seq-2 Seq
3. Encoder decoder
4. Attention
5. Re-feed decoder
6. Copy from input
7. Reinforcement learning

Abstractive summarization is very similar to the way humans summarize: our brain builds an internal semantic representation of the text and then produces a summary in our own words. Generating new words that do not appear in the original document is quite difficult, so researchers have to use advanced techniques such as deep learning models. In recent years, natural language processing has used deep learning-based models for mapping an input sequence to an output sequence, called sequence to sequence models. The sequence to sequence model has been very successful in machine translation and speech recognition [2]. In machine translation, every input word is mapped to an output word, but the summarization task is different: the output text is shorter than the input text and its length does not depend on the length of the input, which makes it a challenging task. In the current work, we focus on abstractive summarization methods, presenting an overview of some of the most dominant approaches in this category along with their limitations and the datasets used.
2 Techniques and Methods Used for Abstractive Text Summarization

2.1 Recurrent Neural Network (RNN)

In a recurrent neural network (RNN), the current output depends on the previous output; this interdependency lets an RNN summarize text more like a human. The encoder-decoder architecture is required to treat sequence problems where both input and output are sequences of words of different lengths. As depicted in Fig. 1, the RNN encoder-decoder architecture is based on the sequence to sequence model, in which a sequence of inputs is provided and a sequence of outputs is generated [3]. It is divided into two parts, an encoder and a decoder: the encoder reads the input, and the decoder predicts the next word. The encoder state is computed from the input at each timestep and the previous hidden state [2]. Both the encoder and the decoder maintain hidden states. The encoder consumes all inputs, and its final hidden state contains the information of the entire input; this produces a context vector which is passed to the decoder as input. The decoder then generates the output sequence based on the context vector. For text summarization, the original document to be summarized is the input sequence, and the summary is the output sequence [2, 4]. An RNN is a DL-based model used to handle data in a sequential manner, with the output of one state influencing the input of the next [4]. In a sentence, the words are tightly coupled with each other, i.e., the current word depends on the previous word's context, and the hidden states of the RNN learn to remember the previous words. In the RNN encoder-decoder model, the vector representation of the present input and the outputs of the hidden states of all previous inputs are merged and supplied to the next hidden state of the encoder [3]. For example, the outputs obtained from hidden state-1 and hidden state-2 are merged with the vector representation of the word X3 and passed together to hidden state-3 as input. Once all the words have been provided as input to the encoder, the output of the encoder's last hidden state is supplied as a vector to the decoder, known as the context vector [2]. Along with the context vector, the start-of-sequence token (SOS) is used to construct the summary's initial word (Y1 in Fig. 1). Y1 is then passed as input to the next decoder hidden state, and every generated word is provided as input to the next decoder hidden state to produce the summary's next word; the end-of-sequence token (EOS) is the last generated word. The output of the decoder is represented as a probability distribution via a softmax layer, and an attention module is used to construct the rest of the summary [4] (Fig. 2).
Fig. 1 Block diagram for text summarization using RNN encoder decoder
534
D. Argade and V. Khairnar
Fig. 2 RNN encoder decoder
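A minimal PyTorch sketch of this encoder-decoder pipeline, assuming a GRU for both sides and omitting attention for now; the vocabulary size, dimensions and tensor shapes are illustrative.

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """Minimal GRU encoder-decoder: the encoder's final hidden state is the
    context vector handed to the decoder (no attention yet)."""
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)   # softmax over the vocabulary

    def forward(self, src, tgt):
        _, context = self.encoder(self.embed(src))      # context vector
        dec_states, _ = self.decoder(self.embed(tgt), context)
        return self.out(dec_states)                     # logits per summary token

model = Seq2Seq(vocab_size=10_000)
src = torch.randint(0, 10_000, (2, 40))   # two documents, 40 tokens each
tgt = torch.randint(0, 10_000, (2, 12))   # shorter target summaries
logits = model(src, tgt)                  # shape (2, 12, 10_000)
```

Note the asymmetry the text describes: the summary side is much shorter than the document side, and its length is fixed by the generated EOS token, not by the input length.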
2.2 Bidirectional Recurrent Neural Network

The bidirectional recurrent neural network is a combination of two independent RNNs. Instead of running a single RNN forward from the first word to the last, a second RNN is run backward from the last token to the first. The forward RNN creates a hidden state sequence by reading the input from the first word to the last, whereas the backward RNN creates a hidden state sequence by reading the input from the last word to the first; the forward and backward RNNs are concatenated to represent the input sequence [5]. The output layer can thus receive information from both past (backward) and future (forward) states at the same time, so the prediction of the current word uses both the preceding and the following context, which improves performance. There is less chance of a wrong prediction with a bidirectional RNN: in a unidirectional RNN, prediction is based on past context only, so a mistake in the last prediction is carried forward into the following predictions, an issue the bidirectional RNN resolves [5].
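In PyTorch this forward/backward combination is a single flag; the snippet below shows that each position's output concatenates both directions (the dimensions are illustrative).

```python
import torch
import torch.nn as nn

# The forward RNN reads tokens first-to-last, the backward RNN last-to-first;
# PyTorch concatenates both hidden sequences, so each position sees past and
# future context (output size is 2 * hidden_size).
birnn = nn.GRU(input_size=128, hidden_size=256, batch_first=True,
               bidirectional=True)
x = torch.randn(2, 40, 128)    # embedded input sequence
states, _ = birnn(x)
print(states.shape)            # torch.Size([2, 40, 512])
```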
2.3 Long Short-Term Memory (LSTM)

The long short-term memory addresses the issues of long-term information preservation and short-term input skipping; it is an RNN variant. It has four gate-level transformations going on inside each unit. It takes a word sequence as input and has the ability to recall those words, forgetting far less than a basic RNN does. There are various gates: input, memory, forget, and output gates. Terms that the model does not need to remember are erased by triggering the forget gate [6], i.e., the forget gate lets the LSTM know what is relevant and what is not. The repeating unit of the LSTM architecture is made up of the following gates: input, memory, forget, and output gates [3], although the connection arrangement is similar to the RNN. All four gates exchange information, allowing information to flow inside
loops for an extended time period. The LSTM unit consists of four gates, as shown in Fig. 4, which are described here.

Fig. 3 Bidirectional recurrent neural network

Fig. 4 Structure of LSTM unit

1. Input gate: The input gate is the first gate of the LSTM. In the first timestep, the input is a randomly initialized vector, whereas in subsequent steps the present step's input is the previous step's output (memory cell content). The input is multiplied elementwise with the forget gate's output, and the multiplication result is added to the output of the current memory gate [6].
2. Forget gate: This is a single-layer neural network with a sigmoid as the activation function. The sigmoid function's outcome determines whether the information from the previous state should be forgotten or remembered: if the sigmoid output is one, the prior state is maintained, and if the output is zero, the previous state is forgotten. The forget gate has four inputs: the input vector, the bias, the previous block's output, and the remembered information from the previous block [7].
3. Memory gate: The memory gate controls the influence of remembered information on the newly generated information. Two neural networks are used to build it: the first is structurally similar to the forget gate but has a different bias, while the second produces new information using the tanh activation function. The new information is created by combining the previous information with the result of the elementwise multiplication of the two memory gate outputs [6].
4. Output gate: This regulates the quantity of new data delivered to the next LSTM unit. The output gate is a sigmoid-activated neural network which takes the input vector, the bias, the prior hidden state, and the new information as input. The current block's output is produced by multiplying the sigmoid function's output by the tanh of the incoming information.
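A minimal sketch of one LSTM timestep with the gate computations written out explicitly; the parameter shapes and initialization are illustrative, and production code would simply use torch.nn.LSTM. The tanh candidate branch g corresponds to what the text above calls the memory gate.

```python
import torch

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM timestep; W, U, b stack the parameters of the input (i),
    forget (f), candidate/memory (g) and output (o) gates."""
    z = x @ W + h_prev @ U + b
    i, f, g, o = z.chunk(4, dim=-1)
    i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
    g = torch.tanh(g)                  # new candidate information
    c = f * c_prev + i * g             # forget old memory, admit new memory
    h = o * torch.tanh(c)              # output gate scales the block output
    return h, c

d, hid = 8, 16
W = torch.randn(d, 4 * hid); U = torch.randn(hid, 4 * hid); b = torch.zeros(4 * hid)
h, c = torch.zeros(1, hid), torch.zeros(1, hid)
h, c = lstm_step(torch.randn(1, d), h, c, W, U, b)
```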
2.4 Gated Recurrent Unit (GRU)

The GRU is an advancement of the RNN and a simplified version of the LSTM, having only two gates, a reset gate and an update gate, and no explicit memory cell. If all elements of the reset gate's value reach zero, the previous hidden state information is not remembered and only the input vector affects the candidate hidden state. The update gate decides how much past information is passed further. The training time of the LSTM is large because it comprises a memory unit that provides extra control; it is widely used for abstractive summarization, but the computation time of the GRU is lower [8]. Furthermore, parameter tuning is easier in the LSTM, while the training time of the GRU is less [6] (Fig. 5).
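For contrast with the LSTM sketch, one GRU timestep needs only the reset and update gates and no separate memory cell; biases are omitted for brevity, and the weight layout is illustrative.

```python
import torch

def gru_step(x, h_prev, Wz, Wr, Wh):
    """One GRU timestep: reset (r) and update (z) gates only."""
    xh = torch.cat([x, h_prev], dim=-1)
    z = torch.sigmoid(xh @ Wz)   # how much of the new candidate replaces the state
    r = torch.sigmoid(xh @ Wr)   # how much past state feeds the candidate
    h_cand = torch.tanh(torch.cat([x, r * h_prev], dim=-1) @ Wh)
    return (1 - z) * h_prev + z * h_cand   # if r -> 0, only x shapes the candidate

d, hid = 8, 16
Wz = torch.randn(d + hid, hid)
Wr = torch.randn(d + hid, hid)
Wh = torch.randn(d + hid, hid)
h = gru_step(torch.randn(1, d), torch.zeros(1, hid), Wz, Wr, Wh)
```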
Fig. 5 GRU

2.5 Attention-Based Model

With attention, special importance is given to key words rather than to the entire input sequence uniformly. Before being used for NLP applications like text summarization [9], the attention mechanism was used for neural machine translation [10]. The basic encoder-decoder architecture is not capable of handling long sentences, as the size of the encoding is fixed and not all elements of long sentences are taken into consideration. The attention mechanism is used to give importance to the main words that can become part of the summary [9]. Attention computes a weight between each input and output, and the sum of all weights equals one; the advantage of these weights is that they indicate which input word requires special attention in relation to the output word. After each input word is passed, the softmax layer receives the weighted average of the decoder's previous hidden layers along with the final hidden layers [10].
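A sketch of one step of attention, showing weights that sum to one over the input positions and the resulting context vector; plain dot-product scoring is used here, which is one of several common choices rather than the specific variant of any one cited model.

```python
import torch
import torch.nn.functional as F

def dot_attention(dec_state, enc_states):
    """Weights over the input words for one decoding step; softmax makes
    them sum to one, highlighting which inputs matter for the next word."""
    scores = enc_states @ dec_state.unsqueeze(-1)      # (batch, src_len, 1)
    weights = F.softmax(scores.squeeze(-1), dim=-1)    # sums to 1 over src_len
    context = (weights.unsqueeze(-1) * enc_states).sum(dim=1)
    return context, weights

enc = torch.randn(2, 40, 256)   # encoder hidden states
dec = torch.randn(2, 256)       # current decoder hidden state
context, weights = dot_attention(dec, enc)
assert torch.allclose(weights.sum(dim=-1), torch.ones(2))
```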
2.6 Reinforcement Learning

Reinforcement learning (RL) creates a summary and compares it to the reference summary to determine how good the generated summary is; the resulting score determines how the model should be updated. It increases overall recall because all pertinent information is summarized. RL is a feedback-based technique in which an agent learns to interact with the environment by taking an action and observing its result. Reinforcement learning addresses several problems, such as the non-differentiability of language generation and the inability of supervised learning to handle some objectives [11]. Many of the metrics used to evaluate summarization, such as ROUGE, BLEU, and METEOR, are not differentiable; reinforcement learning can nevertheless optimize these metrics because it does not require differentiable objectives. To this end, some researchers employed the REINFORCE algorithm to train various RNN-based models for sequence generation tasks, resulting in significant improvements over previous supervised learning methods [12]. To estimate the expected reward and stabilize the gradients of the objective function, an extra neural network called a critic model is often used. Rennie et al. [13] created a self-critical sequence training strategy that does not require a critic model and improves image captioning performance [13] (Tables 1 and 2).
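A minimal sketch of the self-critical REINFORCE objective described above, with the greedy decode's score as the baseline; the tensor shapes and scores are toy placeholders, and the ROUGE values would be computed by an external, non-differentiable routine.

```python
import torch

def self_critical_loss(sample_logprobs, sample_rouge, greedy_rouge):
    """REINFORCE with the greedy decode as baseline (self-critical training):
    sampled summaries scoring above the greedy baseline are reinforced,
    others suppressed, with no separate critic network. The ROUGE scores
    enter only as scalars, so the metric itself need not be differentiable."""
    reward = sample_rouge - greedy_rouge                  # advantage per sequence
    return -(reward * sample_logprobs.sum(dim=-1)).mean()

# Toy batch: per-token log-probs of four sampled 12-token summaries.
logp = -torch.rand(4, 12, requires_grad=True)
loss = self_critical_loss(logp,
                          sample_rouge=torch.tensor([0.31, 0.28, 0.40, 0.22]),
                          greedy_rouge=torch.tensor(0.30))
loss.backward()
```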
Table 1 Comparison of abstractive text summarization models based on review

1. Attentive RNN-based abstractive text summarization [10]. Encoder: CNN + attention. Decoder: Elman RNN or LSTM. Limitation: limited to sentence level.
2. Model for abstractive text using deep learning-based techniques (LSTM-CNN) [7]. Encoder: convolutional phrase encoder. Decoder: LSTM decoder. Limitation: difficulty in finding semantic similarity.
3. Attention-based seq2seq RNN (GRU) model for text summarization [12]. Encoder: bidirectional GRU + hierarchical attention. Decoder: GRU + LVT + pointer switch. Limitation: multi-sentence summarization on different datasets.
4. Text summarization with pre-trained encoders [14]. Encoder: pre-trained encoder. Decoder: Transformer. Limitation: language generation can be a focus.
5. Model for abstractive summarization based on deep reinforcement [11]. Encoder: bidirectional LSTM + intra-attention. Decoder: LSTM + pointer switch + intra-attention. Limitation: readability enhancement.
6. Model for abstractive text summarization with dual encoder [15]. Encoder: GRU + double encoder. Decoder: GRU + attention mechanism. Limitation: dynamic mechanism for decoding length.
7. Model for text summarization using double attention mechanism with pointer network [16]. Encoder: bidirectional LSTM + self-attention. Decoder: LSTM + dual pointer + coverage mechanism. Limitation: limited results over short-summary databases.
8. Pointer and generator network-based abstractive text summarization [17]. Encoder: bidirectional LSTM + attention. Decoder: LSTM + pointer switch + coverage mechanism. Limitation: higher-level abstraction is challenging.
Table 2 Datasets used for summarization

1. CNN: News articles with their summaries. News articles average 781 tokens; summaries average 56 tokens (3.75 sentences).
2. Gigaword: Used for the task of generating headlines for short documents. Input documents average 31.4 tokens; summaries average 8.3 tokens.
3. DUC 2004: Meant for sentence summarization. Input documents average 35.6 tokens; summaries average 10.4 tokens.
3 Discussion

The ability to capture the semantic meaning of phrases and texts is one of the most significant parts of summarization, and neural-based models outperform conventional models by extracting these feature representations automatically. On the other hand, deep neural network models are not transparent, and they do not integrate prior knowledge effectively; more research into and understanding of neural network-based models is required. Furthermore, current neural-based models suffer from the following shortcomings: (1) they are unable to process longer sequences due to memory constraints; (2) they need more time to train due to their complexity; and (3) they are not suitable for small datasets due to the large number of parameters these models contain [18]. Future text summarization research could focus on a variety of exciting and promising topics; we presented two options in this review: (1) using reinforcement learning methods to train neural-based models, and (2) applying text processing techniques before summarizing [18].
4 Conclusion

In this survey, we looked at a variety of neural network-based abstractive summarization algorithms; such approaches perform well in the summarization task. We began with basic encoder-decoders, followed by bidirectional RNNs, LSTMs, GRUs, attention mechanisms, and finally reinforcement learning, covering a variety of strategies as well as their drawbacks and solutions. A few issues have yet to be resolved; in the future, performance may be improved by utilizing newly emerging approaches such as reinforcement learning, and future text summarization research could focus on a variety of exciting and promising topics.
References

1. S. Syed, Abstractive summarization of social media posts: a case study using deep learning. Master's thesis, Bauhaus University, Weimar, Germany (2017)
2. T. Shi, Y. Keneshloo, N. Ramakrishnan, C.K. Reddy, Neural abstractive text summarization with sequence-to-sequence models. ACM Trans. Data Sci. 2(1), 1–37 (2021)
3. D. Suleiman, A.A. Awajan, Deep learning based extractive text summarization: approaches, datasets and evaluation measures, in 2019 Sixth International Conference on Social Networks Analysis, Management and Security (SNAMS) (IEEE, 2019), pp. 204–210
4. E. Jobson, A. Gutiérrez, Abstractive text summarization using attentive sequence-to-sequence RNNs, p. 8 (2016)
5. M. Schuster, K.K. Paliwal, Bidirectional recurrent neural networks. IEEE Trans. Signal Process. 45(11), 2673–2681 (1997)
6. S. Hochreiter, J. Schmidhuber, Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
7. S. Song, H. Huang, T. Ruan, Abstractive text summarization using LSTM-CNN based deep learning. Multimedia Tools Appl. 78(1), 857–875 (2019)
8. P.K. Rachabathuni, A survey on abstractive summarization techniques, in 2017 International Conference on Inventive Computing and Informatics (ICICI) (IEEE, 2017), pp. 762–765
9. S. Chopra, M. Auli, A.M. Rush, Abstractive sentence summarization with attentive recurrent neural networks, in Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (2016), pp. 93–98
10. A.M. Rush, S. Chopra, J. Weston, A neural attention model for abstractive sentence summarization. arXiv preprint arXiv:1509.00685 (2015)
11. R. Paulus, C. Xiong, R. Socher, A deep reinforced model for abstractive summarization. arXiv preprint arXiv:1705.04304 (2017)
12. M.A. Ranzato, S. Chopra, M. Auli, W. Zaremba, Sequence level training with recurrent neural networks. arXiv preprint arXiv:1511.06732 (2015)
13. S.J. Rennie, E. Marcheret, Y. Mroueh, J. Ross, V. Goel, Self-critical sequence training for image captioning, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2017), pp. 7008–7024
14. Y. Liu, M. Lapata, Text summarization with pretrained encoders. arXiv preprint arXiv:1908.08345 (2019)
15. K. Yao, L. Zhang, D. Du, T. Luo, L. Tao, Y. Wu, Dual encoding for abstractive text summarization. IEEE Trans. Cybern. 50(3), 985–996 (2018)
16. T. Young, D. Hazarika, S. Poria, E. Cambria, Recent trends in deep learning based natural language processing. IEEE Comput. Intell. Mag. 13(3), 55–75 (2018)
17. A. See, P.J. Liu, C.D. Manning, Get to the point: summarization with pointer-generator networks. arXiv preprint arXiv:1704.04368 (2017)
18. Y. Dong, A survey on neural network-based summarization methods. arXiv preprint arXiv:1804 (2018)
Crop Disease Prediction Using Computational Machine Learning Model Rupali A. Meshram and A. S. Alvi
Abstract Crop disease identification plays a significant role in improving overall crop production in the agriculture field. Sustainable production in agriculture depends on environmental change, soil quality and global warming. All prominent parts of cultivated plants can be affected by various diseases, which can appear in the growing, flowering and fruiting phases of the plant. In this paper, an approach for training a model to accurately detect the various diseases occurring over a plant's life span is proposed. In machine learning, properly training on the dataset is essential for obtaining precise accuracy, and during training various factors are considered to achieve better performance of the experimental model. Training and validation are implemented on a cultivated crop disease dataset following the presented approach.

Keywords Machine learning · Cultivated crop diseases · Training–testing ratio · Augmentation and overfitting
1 Introduction

Nowadays, farming is divided into traditional and modern farming. In traditional farming, irrigation depends on reservoir water and rainfall; water is pumped out and sent along smaller canals or pipes to the farms. In modern farming, biotechnology produces crops that can resist pests and diseases and improves the nutritional content of crops. Chemical fertilizers are widely employed to increase soil fertility, and effective pesticides, insecticides and fungicides are used to control pests and diseases. Future world farming systems face huge challenges: a rising population and declining arable land. Machine learning techniques help farmers analyze the health of crops, soil content, quality of land, etc. The accuracy of a model mostly depends on the size of the dataset, and data preprocessing plays a vital
role in enhancing the quality of the raw dataset. Preprocessing handles missing values, noise/errors and disorganized data; in the preprocessing phase, batch normalization is used to increase the learning rate [1]. Mathematically, image processing treats the image as a function and transfers it into a two-dimensional array, so that the data is converted into a numeric format useful for further processing [2]. Well-labeled and organized data provided to a model yields accurate results: the MAF model [3] with a CNN framework, deep feature extraction with a CNN [4] and a transfer learning approach with a pre-trained CNN [5] all give strong performance. A capsule neural network has been designed by the authors of [6]. This network consists of an input, hidden and output layer; the hidden layer is further divided into more layers, such as a convolutional layer and primary capsules (lower and higher layers). Compared to a CNN, the capsule neural network is useful for analyzing, partitioning, localizing, detecting and quantifying, but capsules are not compatible with big datasets and their power requirement is high. A CNN with deep networks has been proposed by the authors of [7] for the feature extraction phase, where a dense deep evaluation mechanism is performed using average cross-pooling and 2-max pooling; these pooling methods are used to enhance the unique features of the image. Other pooling methods are also available, but which pooling is applicable for feature extraction typically depends on the dataset. The author of [8] has proposed an alternative technique that integrates image processing with a classification algorithm for accurate prediction of the Fusarium oxysporum-specific disease that mostly appears on tomato plant leaves; the algorithm uses a two-factor identification method (two classifying and two identifying phases) to obtain precise results. Several issues arise in crop disease identification, such as disease in various parts of plants, diseases with similarities, and disease at various stages of plant growth. Crop diseases identified at early stages result in crop yield improvement, and it is very important to recommend treatment to farmers to help them take preventive measures; all of these factors should be considered to improve overall crop quality. This research paper describes the numerous factors that must be considered during training and validation on a raw dataset.
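A typical preprocessing step of the kind described above, sketched with PIL and NumPy; the file name and target size are illustrative placeholders.

```python
import numpy as np
from PIL import Image

# A leaf photograph becomes an (H, W, 3) array of pixel intensities; scaling
# to [0, 1] and channel-wise standardization (batch normalization plays a
# similar role inside the network) keep training numerically well behaved.
img = Image.open("leaf_sample.jpg").resize((224, 224))   # hypothetical file
x = np.asarray(img, dtype=np.float32) / 255.0
x = (x - x.mean(axis=(0, 1))) / (x.std(axis=(0, 1)) + 1e-7)
print(x.shape)   # (224, 224, 3)
```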
2 Literature Survey

Standard variants of Extreme Learning Machine (ELM) brands are studied by the author of [9] to progressively improve the performance of various classification algorithms. To reduce the computation time of a model, the ELM learning algorithm properly adjusts the bias values and weights. ELM methods are extremely useful for fast learning. As reported by the author, accurately determining the number of neurons in the hidden layer for complex problems is a challenging job for current researchers. The author of [10] has carried out a survey on deep neural networks with their advantages and
disadvantages. A deep neural network needs a broad range of data for training and significant computing resources. Cloud computing and cluster computing with DNNs would make efficient use of power consumption and training time; in future, this will be considered to increase the optimal efficiency of a system.
2.1 Condition for Farming

The reasons for crop diseases are multifaceted and can result from worsening weather cycles, bad farming practices, unfavorable climatic conditions and deteriorating soil quality. In the majority of regions worldwide, farmers are still employing conventional methods. The production potential of fields can be reduced by degradation in soil condition, which results in poor crop yields. A crop management system [11] is used to properly manage variations in the temporal and spatial domains of the soil in precision agriculture, but the rate of development and deployment of precision agriculture tools is not rapid [12]. The existing techniques and the field protocols [13] in practice have a gap in efficiently handling diseases. The gaps in knowledge and methodology in some of the areas highlighted above are identified by [14], together with details on techniques to address such gaps. Dependable and economically viable plant health supervision and monitoring systems, designed to detect potential diseases as early and as effectively as possible, are described in [14]. Most such systems comprise the necessary sub-systems and a multitude of sensors, which together enable farmers to efficiently and reliably capture the signatures and patterns of diseases in time. These systems also include vision and image sensors and are much easier to use and deploy, benefiting farmers and agricultural professionals and assisting in the use of technological advances. The limitations and advantages of the various techniques previously adopted, the usefulness of the results obtained, and possible future work suggested by existing research are also addressed in [14]. The categorization technique proposed by [15] is employed by [14]; the techniques are categorized into three main groups, namely detection of diseases, quantification of disease severity and classification.
2.2 Augmentation and Overfitting

Oversampling in "classical" machine learning models can lead to overfitting, while convolutional neural networks (CNNs) are less prone to it [16]. Overfitting appears whenever a statistical or machine learning model describes random error or noise instead of the underlying relationship [17]. Affine transformations, perspective transformations and simple image rotations are effective techniques of image augmentation. In augmentation, to express translations and possible rotations in terms of linear transformations and vector addition, affine transformations
were applied [18]. Ensemble learning techniques are also being developed; for example, a new algorithm (HIBoost) uses a discount factor to limit the updating of weights, reducing the risk of overfitting [19]. In the case of an insufficient training dataset, data augmentation is generally applied [3]. The authors adopted simple and experimental amplification techniques for data augmentation by applying image translation, rotation, cutting and similar operations to the dataset.
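A minimal sketch of such augmentation, using Keras' ImageDataGenerator as one plausible implementation (the paper does not specify the exact tool), applying rotation, translation and shear to a dummy batch of images:

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rotation_range=25,       # random rotations up to +/-25 degrees
    width_shift_range=0.1,   # horizontal translation
    height_shift_range=0.1,  # vertical translation
    shear_range=0.2,         # shear ("cutting") transformation
    horizontal_flip=True,
)

# Dummy stand-in for a batch of 8 RGB leaf images, 64x64 pixels.
x = np.random.rand(8, 64, 64, 3)
y = np.zeros(8)

# flow() yields endlessly augmented batches suitable for model.fit().
batch_x, batch_y = next(datagen.flow(x, y, batch_size=8))
print(batch_x.shape)  # (8, 64, 64, 3), randomly transformed
```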
2.3 Training and Validation Accuracy

The method in [20] was designed by authors who employed images captured in laboratory conditions for training and real-world images for testing; the system attained a 33% accuracy rate under field conditions. The source of images (real-time or laboratory conditions) needs to be considered when developing an efficient machine learning system that can identify diseases from data collected across different datasets [21]. Researchers [22] examined their model with images different from those in the training dataset and obtained 31.40% accuracy on dataset 1 and 31.69% on dataset 2. According to the authors of [23], the split ratio of the training and testing sets, as well as the quality of samples, had a substantial impact on multiclass classification performance; XGBoost gave the best performance, while the naïve Bayes classifier was outperformed by all the other models.
2.4 Research Gap

Table 1 lists various research issues together with the corresponding research gaps. Some authors focus only on a particular infected part of a cultivated plant, and effective treatments are not recommended. Our proposed approach is designed to accurately predict disease on various parts of a cultivated plant, and a proper treatment is recommended to the farmers.
3 Methodology

In the proposed research work, a system is designed and methodologies are developed for the identification of cultivated crop diseases. The dataset used for this research work contains images of healthy and diseased crops; healthy crop images are added to improve the validation of the training dataset. Figure 1 describes the detailed working of the proposed work. The main focus of this paper is training the dataset. As shown in Fig. 2, the preprocessing phase includes augmentation techniques applied before training a dataset.
Table 1 Research issues addressed and research gaps of existing methodologies

| Methodology | Research issues addressed | Research gap |
|---|---|---|
| IoT and deep learning-based fine-grained crop disease detection [24] | (1) Detection of crop diseases and transmission of diagnostic results to farmers | The quality of images can be improved to improve the accuracy. Proper features are not incorporated for disease identification. Treatments are not recommended for detected diseases |
| Fine-tuning of deep convolutional neural networks (DCNN) [25] | (1) A generative adversarial network with dual-attention and topology-fusion mechanisms is developed to improve image quality. (2) A maize leaf feature enhancement framework is designed | Only maize leaves and their features are used. Treatments are not recommended. More types of maize pests and diseases can be identified in future |
| Data augmentation by generative adversarial networks [26] | (1) Improving the recognition accuracy of tomato leaf diseases | In future, a better data augmentation method can be designed. Image-to-image models can be used to solve the problem of data imbalance. Differences in disease at different stages are not handled |
| Leaf generative adversarial networks [27] | (1) Lack of training images of grape leaf diseases. (2) Feature extraction capability on grape leaf lesions | Treatments are not recommended. Differences in disease at different stages are not handled. It cannot be used for other crops |
| A fungal effector predictor [28] | (1) A genetic algorithm with granular support vector-based under-sampling (GSV-US) for majority class sampling. (2) Fungal effector identification | Treatments are not recommended. Differences in disease at different stages are not handled. It cannot be used for identification of a variety of diseases |
| PlantVillage Nuru Mobile Application [29] | (1) Early detection of plant diseases. (2) Sustainable food production | It is used only for crop health monitoring at an early stage. Crop diseases are not identified. Measures to improve crop health are not recommended |
| IoT, machine learning and drone technology [30] | (1) Crop health monitoring with multi-modal data | It is used only for crop health monitoring. Crop diseases are not identified. Measures to improve crop health are not recommended |
| K-means clustering and random forest classifier [31] | (1) Rapid assessment of the severity of FHB. (2) Wheat ears are counted, the disease severity of the wheat ear groups is graded, and the efficacy of six fungicides is evaluated | It can only be used for FHB disease identification. Treatments are not recommended |
Fig. 1 Block diagram of the methodology: dataset → preprocessing (labelling, augmentation) → training/testing datasets → feature extraction → machine learning algorithm → infected? → diseased crop (treatment recommendation) / healthy crop (added to a new dataset)
Some contributing factors are typically considered during the training phase. (i) The augmentation process plays a significant role during training; in augmentation, a rotation technique is applied to the images, which helps reduce the overfitting that appears in the training phase. (ii) If the model is trained for too long or on an unrepresentative dataset, an overfitting problem occurs; a number of techniques are available to reduce overfitting. (iii) A number of methods exist to split the dataset for training and testing; the most familiar is the holdout method, in which the given dataset is divided into two partitions, train and test. To study the effect on accuracy, the dataset of the proposed research is split into different ratios such as 80–20 (80% of the whole dataset for training and 20% for testing), 70–30, 60–40, etc., as sketched below.
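A minimal holdout split along these lines, sketched with scikit-learn on dummy data (the 80–20 ratio and random seed are illustrative):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Dummy feature matrix (200 samples, 10 features) and binary labels.
X = np.random.rand(200, 10)
y = np.random.randint(0, 2, size=200)

# Holdout method: 80% of the data for training, 20% for testing.
# Changing test_size to 0.3 or 0.4 gives the 70-30 / 60-40 splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

print(len(X_train), len(X_test))  # 160 40
```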
Fig. 2 Accuracy result of a model for epoch 19–25
According to the proposed methodology, preprocessing is applied to the raw dataset to remove noise and unwanted data. Let A be the plant data containing an infected and a disinfected area:

$$A = I_{\text{area}} + D_{\text{area}} \quad (1)$$

The infected area is found by considering various features. Let X(A) be the set of n features extracted from A:

$$X(A) = \sum_{i=0}^{n} x_i \quad (2)$$
All the extracted features are provided to the model for prediction of plant diseases. After applying the classification model to the extracted features, the final output of the model is either 0 (healthy plant) or 1 (diseased plant). Based on this result, a treatment appropriate to the plant disease is recommended to the farmers, as in the sketch below.
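The overall predict-then-recommend flow could look like the following hedged sketch; the random forest stands in for whatever classifier is trained on X(A), and the treatment lookup table is purely hypothetical:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Train any classifier on the extracted feature vectors X(A); a random
# forest is used here purely as a placeholder model on dummy data.
X = np.random.rand(100, 5)             # dummy feature vectors
y = np.random.randint(0, 2, size=100)  # 0 = healthy, 1 = diseased
model = RandomForestClassifier().fit(X, y)

# Hypothetical lookup table; a real system would map each predicted
# disease class to expert-curated treatment advice.
treatment = {0: "No action needed - plant is healthy",
             1: "Diseased - apply the recommended remedy"}

sample = np.random.rand(1, 5)
label = int(model.predict(sample)[0])
print(treatment[label])
```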
4 Training and Validation Result

A crop disease dataset containing various images of healthy and diseased crops is used for training and validation. Naturally cultivated crops can be affected by
a pathogenic organism such as a virus, fungus, infectious bacteria, mycoplasma or nematode. For the experiments to detect infected crops, the Anaconda 3.8 Python distribution is used, and the IPython interpreter is used to execute the code. Evaluations are carried out on an Intel i3 8th-generation processor with 4 GB RAM and a 1 TB HDD running the Windows 10 OS. The Python modules NumPy, pickle, cv2, sklearn, keras and Matplotlib are used to carry out the practical implementation. The dataset is used to identify diseases such as pepper bell bacterial spot; potato early blight and late blight; and tomato target spot, mosaic virus, yellow leaf curl virus, bacterial spot, early blight, late blight, leaf mold, septoria leaf spot and spider mites (two-spotted spider mite). These common diseases are identified in pepper bell, potato and tomato crops at various stages. The sequential model is trained for 25 epochs (see Fig. 2); an epoch is one pass over the training dataset, and within the training process a given dataset needs to be divided into batches. To avoid overfitting, the proposed training model stops at 25 epochs.
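A minimal Keras Sequential model trained for 25 epochs is sketched below on dummy data; the layer sizes, image dimensions and class count are illustrative assumptions, not the authors' exact architecture:

```python
import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

NUM_CLASSES = 15  # e.g., pepper bell, potato and tomato disease classes

model = Sequential([
    Conv2D(32, (3, 3), activation="relu", input_shape=(64, 64, 3)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation="relu"),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation="relu"),
    Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Dummy data standing in for the preprocessed leaf images; training is
# stopped at 25 epochs, mirroring the paper's choice to limit overfitting.
X = np.random.rand(100, 64, 64, 3)
y = np.random.randint(0, NUM_CLASSES, size=100)
history = model.fit(X, y, epochs=25, validation_split=0.2, verbose=0)
```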
5 Discussion

Leading researchers have designed various models to improve accuracy in predicting plant diseases, and this paper is useful for new researchers developing a training model. The accuracy of a model is determined not only by the classification model but also by factors such as the appropriate size of the dataset, whether the dataset is laboratory-based or field-based, augmentation techniques, overfitting problems, and the training–testing split ratio. In a training model, the number of epochs directly creates overfitting issues, and some researchers have focused on the epoch value to improve the performance of the training model. Some authors also considered precision, recall and F1-score values for a robust model.
6 Conclusion

Preventing several specific diseases can significantly increase food production. Existing agricultural disease detection methodologies are typically designed for a specific part of the plant, such as the leaf, flower or seed. The proposed research focuses on diseases at different stages, diseases in various parts of the crop and similarity between diseases, and finally advises remedies to prevent the impact on yields. The overall accuracy of the training and validation model is 63.28%. According to the training and validation results, augmentation, overfitting and the training–testing ratio are key contributing factors in the training phase for achieving higher accuracy.
References

1. F. Akhtar, N. Partheeban, A. Daniel, S. Sriramulu, S. Mehra, N. Gupta, Plant disease detection based on deep learning approach, in 2021 International Conference on Advance Computing and Innovative Technologies in Engineering (ICACITE), pp. 74–77 (2021). https://doi.org/10.1109/ICACITE51222.2021.9404647
2. R. Bayraktar, B. Haznedar, K.S. Bayram, M.F. Hasoğlu, Plant disease detection by using adaptive neuro-fuzzy inference system. Tamap J. Eng. 2021, 1–10 (2021). https://doi.org/10.29371/2021.3.125
3. Y. Zhang, S. Wa, Y. Liu, X. Zhou, P. Sun, Q. Ma, High-accuracy detection of maize leaf diseases CNN based on multi-pathway activation function module. Remote Sens. 13, 4218 (2021). https://doi.org/10.3390/rs13214218
4. Y. Altuntaş, F. Kocamaz, Deep feature extraction for detection of tomato plant diseases and pests based on leaf images. Celal Bayar Univ. J. Sci. 17(2), 145–157 (2021)
5. M. Chohan, A. Khan, R. Chohan, S. Hassan, M. Mahar, Plant disease detection using deep learning. Int. J. Recent Technol. Eng. 9(1), 909–914 (2020)
6. T. Vijayakumar, Comparative study of capsule neural network in various applications. J. Artif. Intell. Capsule Netw. 01(01), 19–27 (2019). https://doi.org/10.36548/jaicn.2019.1.003
7. A. Sungheetha, R. Sharma, Design an early detection and classification for diabetic retinopathy by deep feature extraction based convolution neural network. J. Trends Comput. Sci. Smart Technol. 3(2), 81–94 (2021). https://doi.org/10.36548/jtcsst.2021.2.002
8. R. Dhaya, Flawless identification of fusarium oxysporum in tomato plant leaves by machine learning algorithm. J. Innovative Image Process. (JIIP) 02(04), 194–201 (2020). https://doi.org/10.36548/jiip.2020.4.004
9. D.J.S. Manoharan, Study of variants of extreme learning machine (ELM) brands and its performance measure on classification algorithm. J. Soft Comput. Paradigm 3(2), 83–95 (2021). https://doi.org/10.36548/jscp.2021.2.003
10. D.A. Bashar, Survey on evolving deep learning neural network architectures. J. Artif. Intell. Capsule Netw. 1(2), 73–82 (2019). https://doi.org/10.36548/jaicn.2019.2.003
11. J.V. Stafford, Implementing precision agriculture in the 21st century. J. Agric. Eng. Res. 76(3), 267–275 (2000)
12. A. McBratney, B. Whelan, T. Ancev et al., Future directions of precision agriculture. Precis. Agric. 6(1), 7–23 (2015)
13. C. Hillnhutter, A.K. Mahlein, Remote sensing to detect plant stress. Field Crops Res. 60(4), 143–149 (2011)
14. A. Sinha, R.S. Shekhawat, Review of image processing approaches for detecting plant diseases. IET Image Process. 14(8), 1427–1439 (2020)
15. J.G. Arnal-Barbedo, Digital image processing techniques for detecting, quantifying and classifying plant diseases. SpringerPlus 2(1) (2013)
16. M. Buda, A. Maki, M.A. Mazurowski, A systematic study of the class imbalance problem in convolutional neural networks. Neural Netw. 106, 249–259 (2018)
17. D.M. Hawkins, The problem of over-fitting. J. Chem. Inf. Comput. Sci. 44(1), 1–12 (2004)
18. C.C. Stearns, K. Kannappan, Method for 2-D affine transformation of images. US Patent No. 5,475,803 (1995)
19. Q. Wu, Y. Lin, T. Zhu, Y. Zhang, HIBoost: a hubness-aware ensemble learning algorithm for high-dimensional imbalanced data classification. J. Intell. Fuzzy Syst. 39, 133–144 (2020)
20. K.P. Ferentinos, Deep learning models for plant disease detection and diagnosis. Comput. Electron. Agric. 145, 311–318 (2018)
21. S.M. Hassan, A.K. Maji, M. Jasiński, Z. Leonowicz, E. Jasińska, Identification of plant-leaf diseases using CNN and transfer-learning approach. Electronics 10, 1388 (2021). https://doi.org/10.3390/electronics10121388
22. S.P. Mohanty, D.P. Hughes, M. Salathé, Using deep learning for image-based plant disease detection. Front. Plant Sci. 7, 1419 (2016)
23. A. Rácz, D. Bajusz, K. Héberger, Effect of dataset size and train/test split ratios in QSAR/QSPR multiclass classification. Molecules 26, 1111 (2021). https://doi.org/10.3390/molecules26041111
24. W.-J. Hu, J. Fan, Y.-X. Du, B.-S. Li, N. Xiong, E. Bekkering, MDFC-ResNet: an agricultural IoT system to accurately recognize crop diseases. IEEE Access 8, 115287–115298 (2020)
25. X. Cheng, Y. Zhang, Y. Chen, Y. Wu, Y. Yue, Pest identification via deep residual learning in complex background. Comput. Electron. Agric. 141, 351–356 (2017)
26. Q. Wu, Y. Chen, J. Meng, DCGAN-based data augmentation for tomato leaf disease identification. IEEE Access 8, 98716–98728 (2020)
27. B. Liu, C. Tan, S. Li, J. He, H. Wang, A data augmentation method based on generative adversarial networks for grape leaf disease identification. IEEE Access 8, 102188–102198 (2020)
28. C. Wang, P. Wang, S. Han, L. Wang, Y. Zhao, L. Juan, Fun effector-pred: identification of fungi effector by activate learning and genetic algorithm sampling of imbalanced data. IEEE Access 8, 57674–57683 (2020)
29. A. Coletta, N. Bartolini, G. Maselli, A. Kehs, P. McCloskey, D.P. Hughes, Optimal deployment in crowd sensing for plant disease diagnosis in developing countries. IEEE Internet Things J. 20(4), 34–49 (2020)
30. U. Shafi, R. Mumtaz, N. Iqbal, S.M.H. Zaidi, S.A.R. Zaidi, I. Hussain, Z. Mahmood, A multi-modal approach for crop health mapping using low altitude remote sensing, Internet of Things (IoT) and machine learning. IEEE Access 8, 112708–112724 (2020)
31. D. Zhang, Z. Wang, N. Jin, C. Gu, Y. Chen, Y. Huang, Evaluation of efficacy of fungicides for control of wheat fusarium head blight based on digital imaging. IEEE Access 8, 109876–109890 (2020)
A Survey on Design Issues, Challenges, and Applications of Terahertz based 6G Communication Selvakumar George, Nandalal Vijayakumar, Asirvatham Masilamani, Ezhil E. Nithila, Nirmal Jothi, and J. Relin Francis Raj
Abstract Fifth-generation (5G) wireless technology has grown dramatically in the last few years because of the increasing demand for faster data connections with lower latency. At the same time, several researchers believe that 5G will be insufficient in the coming years due to the rapid increase in machine-type connections. So, beyond the fifth generation (B5G), there will obviously be the sixth generation (6G), in which mobile users demand ultra-high speeds in gigabits per second (Gbps) for the Internet of Everything (IoE). To satisfy such requirements, we need a wideband spectrum for communication, and terahertz (THz) waves are the best solution for wideband applications. This paper outlines the advantages, challenges, and applications of terahertz waves.

Keywords THz band · 6G · Millimeter waves · Wideband
1 Introduction

Due to rapid advancements in handheld smart terminals, multimedia services are rapidly gaining popularity in modern wireless communication.

S. George (B) · A. Masilamani · E. E. Nithila · N. Jothi · J. Relin Francis Raj
Department of ECE, SCAD College of Engineering and Technology, Tirunelveli, Tamilnadu 627414, India
e-mail: [email protected]
A. Masilamani e-mail: [email protected]
E. E. Nithila e-mail: [email protected]
N. Jothi e-mail: [email protected]
J. Relin Francis Raj e-mail: [email protected]
N. Vijayakumar
Department of ECE, Sri Krishna College of Engineering and Technology, Coimbatore, Tamilnadu 641008, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_41
Fig. 1 Structure of electromagnetic spectrum
As of 2021, 20 billion wireless devices are connected to the wireless internet, and the count is expected to increase to 75 billion by 2030 [1]. This is due not only to population growth but also to the use of multiple devices by a single user. To accommodate such a huge crowd over the air, we need a wide spectrum. Since the currently used radio and microwave carriers are highly saturated with a massive number of internet users, researchers are focusing on the terahertz (THz) band, or sub-millimeter wave, for future wireless communication. The IEEE 802.15 working group established the "THz Interest Group" in 2008 as a step toward standardizing THz communications in frequency bands ranging from 275 to 3000 GHz. Figure 1 shows the structure of the electromagnetic spectrum [2].
1.1 What Is Terahertz Communication?

The electromagnetic spectrum consists of audio waves, radio waves, microwaves, millimeter (mm) waves, infrared (IR) waves, visible light communication (VLC) waves, ultraviolet (UV) waves, X-rays, gamma waves, and cosmic waves, as shown in Fig. 1. According to IEEE, the terahertz frequency range is defined as 0.3–10 THz; the prefix tera denotes 10^12. The proposed tera band lies exactly between the millimeter waves (30–300 GHz) and the infrared waves (10–430 THz). Some authors refer to these terahertz (THz) waves as sub-millimeter (sub-mm) waves. Before examining terahertz waves, we recall the most commonly used electromagnetic bands and their behavior in Table 1 [3], which shows some characteristics of high-frequency waves.

Millimeter (mm) Wave versus Terahertz Even though there has been increased interest in millimeter-wave analysis, the allotted bandwidth of 7–9 GHz dilutes the attraction of researchers toward millimeter waves. Due to increasing customer demand, the use of millimeter waves will eventually limit the overall throughput to an unacceptable level. Hence, due to limited bandwidth, mm waves fail to be an alternative to existing radio/microwave communication.
Table 1 Characteristics of high-frequency waves

| EM waves | Spectrum range | Power consumption | Suitable topology | Source of noise |
|---|---|---|---|---|
| Millimeter waves | 30–300 GHz | Medium | Point to multi-point | Thermal noise |
| THz band | 100 GHz–10 THz | Medium | Point to multi-point | Thermal noise |
| Infrared | 10–430 THz | Low | Point to point | Sun/ambient light |
| VLC | 430–790 THz | Low | Point to point | Sun/ambient light |
| UV | 790 THz–30 PHz | Expected to be low | Point to multi-point | Sun/ambient light |
Infrared (IR) Wave versus Terahertz Laser sources with wavelengths of 750–1600 nm provide data speeds up to 10 Gbps in infrared technology. Infrared emissions are restricted to a room since they do not pass through walls or other opaque obstacles. Under critical climate conditions such as fog, dust, and turbulence, THz waves are a better choice than infrared waves.

Visible light communication (VLC) versus Terahertz The visible light waves are the VIBGYOR colors, with wavelengths between 390 and 750 nm. Visible light communication is a method of transmitting data by modulating light in the visible spectrum (390–750 nm). In order to obtain high data rates in a VLC network, line of sight (LoS) is needed, which means both the transmitter and receiver need to be aligned linearly. This LoS issue makes VLC waves unfit for long-distance communication.

Ultraviolet (UV) versus Terahertz Since ultraviolet (UV) waves have ultra-high frequencies (790 THz–30 PHz) and lie in the ionizing band, UV communication has health and safety implications for both the eyes and the skin. THz waves, on the other hand, are in the non-ionizing range; thus, there are no health dangers associated with them [4, 5]. Thus, Elayan et al. in [6] justify that the THz wave is a strong candidate over other high-frequency waves for future wireless communication.
2 Highlights of THz Communication

THz waves, or sub-millimeter waves, have some unique properties over traditional radio and microwaves. The features are listed below [2]:

• Since all high-frequency applications need line of sight (LoS), THz waves also have high directivity.
• Terahertz waves can be used in the medical field to analyze the water content of tissue for cancer treatment.
• Terahertz waves have excellent spatial resolution for imaging applications.
• Since the penetration power of terahertz waves is high, they can be used for imaging in smoky environments such as fire rescue fields and deserts.
• Detection of skin diseases and detection of biological samples are possible with low photon energy (millielectronvolt) terahertz waves.
• The physical and chemical analysis of a material can be done with terahertz waves since they carry excellent spectral information about the material.
3 Applications of THz Communication

The applications include wireless cognition, sensing, imaging, communication, and positioning. Wireless cognition is the process of establishing a communication link for performing huge computations remotely from a machine. The nano-level applications include precision medicine, on-chip communication, plant monitoring and mobile access, while the macro-level applications are air quality detection, information showers, gesture detection, explosive detection, gas sensing, and positioning [2, 7]. Apart from these applications, the polymer industry uses THz waves for dispersion quality control of polymeric compounds, plastic weld joint inspection, and material characterization [8].
Fig. 2 Applications of terahertz waves
Figure 2 broadly categorizes the applications of THz waves in the communication field, the biomedical field, security, agriculture, and industry.
4 Design Considerations of THz Communication

4.1 Signal Generation

THz waves can be generated using both electronics and photonics, because these waves fall between the millimeter-wave and optical frequency bands. In terms of electronic devices, recent advances in nanofabrication methods have supported the development of THz-frequency semiconductor devices, such as those based on gallium arsenide and indium phosphide. Fujishima et al. in [9] use silicon-based electronics with a carrier frequency of 240 GHz to get a data rate of 10 Gbps. Kallfass et al. in [10] use GaAs with a carrier frequency of 300 GHz to get a data rate of 64 Gbps. By using photonic devices with a 500 GHz carrier signal, THz links can reach 160 Gbps [11]. The conventional oscillators on the market are not promising for THz signal generation; although these issues are discussed in [4], the solutions are very expensive and complex. Hence, THz signal generation needs more investigation.
4.2 Channel Estimation

Channel estimation in the THz range is extremely difficult. In mobile applications, precise channel state information (CSI) is required for beamforming mechanisms when there is a lack of a line of sight (LoS) link. Furthermore, for fixed LoS point-to-point THz networks, frequent channel estimation could be essential. Fast channel tracking methods, lower-frequency channel approximations, compressive sensing-based approaches, and learning-based techniques are all alternatives for reducing the complexity of THz band channel estimation [12]. The success of compressive sensing techniques in mm-wave communications prompted the adoption of similar approaches for sparse channel recovery in THz channel estimation. At higher dimensionalities, learning-based systems are most effective; [13] shows that deep kernel learning based on Gaussian process regression is efficient for multiuser channel estimation in MIMO systems in the 0.06–10 THz range. Furthermore, THz transceivers need low noise, high power, and high sensitivity to overcome the path loss in THz communication [14, 15].
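To make the compressive sensing idea concrete, the toy sketch below recovers a sparse (few-path) channel from a small number of noisy pilot measurements using orthogonal matching pursuit. Real THz channels are complex-valued and use structured sensing matrices, so this real-valued example is only illustrative:

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

rng = np.random.default_rng(0)

# Sparse channel: only a few propagation paths out of many candidate
# angular directions are active - typical of THz channels.
n_directions, n_paths, n_measurements = 256, 4, 64
h = np.zeros(n_directions)
h[rng.choice(n_directions, n_paths, replace=False)] = rng.normal(size=n_paths)

# Random measurement (sensing) matrix modelling pilot observations.
A = rng.normal(size=(n_measurements, n_directions)) / np.sqrt(n_measurements)
y = A @ h + 0.01 * rng.normal(size=n_measurements)  # noisy pilots

# OMP recovers the sparse channel from far fewer measurements than unknowns.
omp = OrthogonalMatchingPursuit(n_nonzero_coefs=n_paths,
                                fit_intercept=False).fit(A, y)
h_hat = omp.coef_
print("recovery error:", np.linalg.norm(h - h_hat) / np.linalg.norm(h))
```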
4.3 Thermal Stability

Thermal runaway is a process of cumulative temperature increase in a system; it is common not only in low-frequency circuits but also in high-frequency networks, and it makes the system unreliable. In THz-wave communication, the thermal equilibrium of a miniature antenna should control the overheating problem. Possible solutions are listed in [16].
4.4 Beam Forming

Beamforming is an important requirement in conventional radio/microwave communication. When it comes to the THz scenario, it becomes highly challenging because of the heterogeneous users in densely populated urban areas [17, 18]. Reconfigurable metasurface technology may help to address this challenge, and new tools and more computation are needed to analyze the heterogeneous environment [19].
4.5 High Directivity

The power at the receiver is directly proportional to the transmitted power and inversely proportional to the path loss. The path loss depends strongly on frequency: when the frequency increases, the path loss also increases, which decreases the received power of a system [18]. Since the gain of an antenna is inversely proportional to its beamwidth, a THz antenna needs a narrow beamwidth [20]. This narrow-beamwidth requirement leads to the use of directional antennas in the proposed THz regime. The main disadvantage of a directional antenna is that the position of the receiver must be known to the transmitter, which is more difficult in densely populated areas. Hence, highly directive beam alignment is important for efficient THz communication.
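The frequency dependence of the path loss can be illustrated with the free-space path loss (FSPL) formula. The short sketch below uses the free-space model only, ignoring the molecular absorption that further penalizes THz links; it shows the loss growing by 20 dB per decade of carrier frequency:

```python
import math

def fspl_db(distance_m: float, freq_hz: float) -> float:
    """Free-space path loss in dB: 20log10(d) + 20log10(f) + 20log10(4*pi/c)."""
    c = 3e8  # speed of light, m/s
    return (20 * math.log10(distance_m)
            + 20 * math.log10(freq_hz)
            + 20 * math.log10(4 * math.pi / c))

# Path loss over a 10 m link at microwave, mm-wave and THz carriers.
for f in (2.4e9, 60e9, 300e9, 1e12):
    print(f"{f / 1e9:7.1f} GHz: {fspl_db(10, f):6.1f} dB")
```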
4.6 Material Selection

Copper is the most frequently used material in antenna construction. At THz frequencies, due to the decrease in skin depth and conductivity, copper antennas suffer from large propagation losses and degraded radiation efficiency. These facts stimulate the use of carbon materials such as carbon nanotubes (CNT) and graphene for the manufacture of THz antennas. When compared to normal copper, a carbon nanotube has better conductivity and kinetic inductance at THz frequencies. Therefore, the performance of a carbon nanotube antenna may be better than that of a conventional
copper antenna at THz frequencies. The most recent material for THz antenna design is graphene. Because of its superior material properties at THz frequencies, graphene THz antennas may outperform traditional copper and carbon nanotube THz antennas [21]. Dash and Patnaik concluded that THz antennas made of graphene are quite versatile and can be used in a variety of applications. Table 2 shows the material comparison for THz waves.

Table 2 Comparison of material characterization

| Parameter | Copper | Graphene | Carbon nanotube |
|---|---|---|---|
| Electronic mobility | 32 cm² V⁻¹ s⁻¹ | 2×10⁵ cm² V⁻¹ s⁻¹ | 8×10⁴ cm² V⁻¹ s⁻¹ |
| Current density | 10⁶ A cm⁻¹ | 10⁹ A cm⁻¹ | 10⁹ A cm⁻¹ |
| Tensile strength | 587 MPa | 1.5 TPa | 50–500 GPa |
| Thermal conductivity | 400 W m⁻¹ K⁻¹ | 5000 W m⁻¹ K⁻¹ | 3000 W m⁻¹ K⁻¹ |
| Density | 2700 kg m⁻³ | 2.26 g m⁻³ | 2.26 g m⁻³ |
| Surface area | 637 cm² g⁻¹ | 2360 m² g⁻¹ | 387 m² g⁻¹ |
5 Conclusion

This article has taken a wide and in-depth look at the basic prospects, difficulties, and techniques for developing a future communication standard. We exhibited a wide range of possible opportunities for THz frequencies. This survey shows that THz positioning will allow centimeter-level precision and may even allow imaging in non-line-of-sight (NLoS) scenarios. This study also offered some suggestions for potential remedies for a few issues in THz communication. As the world looks to 6G and beyond, further rigorous research should be conducted to establish the impact of THz radiation on living things. Finally, we conclude that THz-based communication provides wideband and high-speed data communication for the next generation.
References

1. Statista 2019, Internet of Things (IoT) Connected Devices Installed Base Worldwide from 2015 to 2025. Accessed: Mar. 1, 2020 [Online]. Available: https://www.statista.com/statistics/471264/iot-number-ofconnected-devices-worldwide
2. T.S. Rappaport, Y. Xing, O. Kanhere, S. Ju, A. Madanayake, S. Mandal, A. Alkhateeb, G.C. Trichopoulos, Wireless communications and applications above 100 GHz: opportunities and challenges for 6G and beyond. IEEE Access 7, 78729–78757 (2019)
3. H. Elayan, O. Amin, R.M. Shubair, M.S. Alouini, Terahertz communication: the opportunities of wireless technology beyond 5G, in 2018 International Conference on Advanced Communication Technologies and Networking (CommNet). https://doi.org/10.1109/commnet.2018.8360286
4. S. Mumtaz, J.M. Jornet, J. Aulin, W.H. Gerstacker, X. Dong, B. Ai, Terahertz communication for vehicular networks. IEEE Trans. Veh. Technol. 66(7), 5617–5625 (2017)
5. A. Anand, G. Selvakumar, Reliable and efficient multicast protocol for MPEG-4 transmissions over IEEE 802.11n (2015)
6. H. Elayan et al., Terahertz band: the last piece of RF spectrum puzzle for communication systems. IEEE Open J. Commun. Soc. 1, 1–32 (2020)
7. Technology Trends of Active Services in the Frequency Range 275–3000 GHz (International Telecommunication Union, Geneva), Recommendation ITU-R, document SM.2352-0 (Nov 2015)
8. V. Sharma, D. Arya, M. Jhildiyal, Terahertz technology and its applications, in IEEE International Conference on Advanced Computing Communication Technologies (ICACCT) (2011)
9. M. Fujishima, S. Amakawa, K. Takano, K. Katayama, T. Yoshida, Terahertz CMOS design for low-power and high-speed wireless communication. IEICE Trans. Electron. 98(12), 1091–1104 (2015)
10. I. Kallfass, I. Dan, S. Rey, P. Harati, J. Antes, A. Tessmann, S. Wagner, M. Kuri, R. Weber, H. Massler et al., Towards MMIC-based 300 GHz indoor wireless communication systems. IEICE Trans. Electron. 98(12), 1081–1090 (2015)
11. X. Yu, S. Jia, H. Hu, M. Galili, T. Morioka, P.U. Jepsen, L.K. Oxenløwe, 160 Gbit/s photonics wireless transmission in the 300–500 GHz band. APL Photon. 1(8), 081301 (2016)
12. A. Alkhateeb, J. Mo, N. Gonzalez-Prelcic, R.W. Heath, MIMO precoding and combining solutions for millimeter-wave systems. IEEE Commun. Mag. 52(12), 122–131 (2014)
13. S. Nie, I.F. Akyildiz, Deep kernel learning-based channel estimation in ultra-massive MIMO communications at 0.06–10 THz, in 2019 IEEE Globecom Workshops (GC Wkshps) (Dec. 2019), pp. 1–6
14. P. Mukherjee, B. Gupta, Terahertz (THz) frequency sources and antennas—a brief review. Int. J. Infrared Millimeter Waves 29(12), 1091–1102 (2008). https://doi.org/10.1007/s10762-008-9423-0
15. V. Nandalal, G. Selvakumar, Power optimization in OFDM networks using various peak to average power ratio techniques. Asian J. Appl. Sci. Technol. (AJAST) 1(2), 185–199 (2017)
16. V. Petrov, A. Pyattaev, D. Moltchanov, Y. Koucheryavy, Terahertz band communications: applications, research challenges, and standardization activities, in Proceedings 8th International Congress Ultra Modern Telecommunications and Control Systems and Workshops (ICUMT) (Oct. 2016), pp. 183–190
17. M.A. Jamshed, A. Nauman, M.A.B. Abbasi, S.W. Kim, Antenna selection and designing for THz applications: suitability and performance evaluation: a survey. IEEE Access 8, 113246–113261 (2020). https://doi.org/10.1109/ACCESS.2020.3002989
18. J.C. Pujol, J.M. Jornet, J.S. Pareta, PHLAME: a physical layer aware MAC protocol for electromagnetic nanonetworks, in IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), pp. 431–436 (Apr. 2011)
19. X. Fu et al., Terahertz beam steering technologies: from phased arrays to field-programmable metasurfaces. Adv. Opt. Mater. 8(3), 1900628 (2020)
20. M. Biabanifard, S.J. Hosseini, A. Jahanshiri, Design and comparison of terahertz graphene antenna: ordinary dipole, fractal dipole, spiral, bow-tie and log-periodic. Eng. Technol. Open Access J., to be published
21. S. Dash, A. Patnaik, Material selection for THz antennas. Microw. Opt. Technol. Lett. 60(5), 1183–1187 (2018)
A Study of Image Characteristics and Classifiers Utilized for Identify Leaves Dipak Pralhad Mahurkar and Hemant Patidar
Abstract This paper provides an overview of various leaf identification techniques together with their classifiers. Leaf identification methods are based on various leaf characteristics such as shape, color and texture features, and the classifiers used include the K-nearest neighbor, probabilistic neural network, support vector machine, decision tree classifier and artificial neural network. We propose a method to identify leaf pictures and their species using the open-source computer vision library, because automatically identifying plant leaves is a challenging task in computer vision.

Keywords Contour · Shape · Color · Texture features · K-nearest neighbor · Probabilistic neural network · Support vector machine · Decision tree classifier · Artificial neural network
1 Introduction

Ayurveda, a system of medicine based on medicinal plants, is effective in the treatment of some chronic diseases. Around the world, Ayurveda is regarded as a substitute for allopathic medicine. This Indian medical system has a long history, attested in ancient epigraphic literature. Since many countries are embracing Ayurveda, it generates a significant amount of foreign exchange revenue for India through the export of Ayurvedic medicines. Due to a significant decline in the population of certain medicinal plant species, we must cultivate these plants in India [1]. Plant identification has become a difficult task and a hot topic of study. Trees are the primary oxygen source, releasing oxygen via photosynthesis. Aside from that, plants are used in several industrial applications, including herbs and Ayurvedic medicine products, biofuels, biomass, and so on [2]. Plants have long been used as a therapeutic source in India; this science is called Ayurveda [3]. According to Ayurveda, every plant on the planet has some medicinal value, and worldwide it is regarded as an alternative to allopathic medicine.

D. Pralhad Mahurkar (B) · H. Patidar
Electronics and Communication Engineering, Oriental University, Indore, India
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_42
One of the most significant benefits is that it has no negative side effects. Taxonomists must identify these medicinal plants manually, which is vulnerable to human error in many situations; an automated system for plant leaf recognition prevents this. Medicinal plant identification from leaf photographs has been attempted by several researchers using a variety of methods: some use shape features, while others use texture features, and they often employ various classifiers. Leaf identification usually depends on the morphological characteristics of leaves, but there are vast numbers of leaf types around the world, so it is difficult even for professional botanists to classify all the plants. In this situation, it is beneficial to create a computer-based plant recognition system to identify the class of a leaf. With the advancement of image processing, it is now possible to identify plants automatically. Many studies demonstrate that leaves contain a wealth of information, such as color, texture and shape. It is important to organize medicinal plants by distinguishing features, to recognize unfamiliar plants and to discriminate plants that have similar characteristics. Medicinal plants come in a wide range of shapes and sizes, which are used to classify them. For identification, we use morphological features, which are physical characteristics. Leaf shape analysis is an essential phenotyping parameter, and the shape of leaves varies greatly between species. Classifiers such as KNN, ANN and SVM divide the plants into their relevant species. The rest of this work is organized as follows: Sect. 2 presents a literature survey, concluding with graphs of the various current algorithms available for identifying leaf species. The feature extraction methods used by existing systems are explained in Sect. 3. Section 4 explains how the proposed system works. Section 5 concludes with the result that our proposed method gives better results than present systems.
2 Literature Survey

Plants are extremely crucial to human life, as well as to other life forms on the planet. Plant identification can be based on leaf photographs. Some schemes use botanists' definitions; however, automatically extracting such features and transferring them to a computer is difficult. The widespread use of digital cameras and laptops has made such devices a reality in today's world. Through image processing and machine learning experiments, researchers develop non-manual plant classification systems. The detection of discriminative features for different organisms is a challenge in designing such methods: feature extraction is used to identify properties that are common in one category but rare in others. The camera records the photographs of the plant leaf. Different preprocessing techniques are considered for removing noise or other artifacts from images (e.g., image clipping, image smoothing). The term "segmentation" refers to the division of an image into different sections with similar features. When it comes to identifying an item, feature extraction is extremely critical; extracted features are utilized in a
variety of image processing applications. The properties used for leaf identification include morphology, color, texture, and edges. The database images are labeled after feature extraction is complete. The experimental results reveal that the suggested methodology accurately classifies trees, since the edge and color properties of trees are readily distinguished from those of plants and shrubs. Because the bulk of plants are green, classification based on color histogram characteristics has lower accuracy; furthermore, the shades shift periodically, resulting in poor color feature consistency. As a result, the combination of color and texture (edge) elements produces good outcomes, and the classification accuracy of the SVM classifier is found to be higher than that of the neural network classifier [1]. Leaves were used to identify the plants, and the leaves were identified based on their shape and textural characteristics [2]. One suggested system employs image processing to identify a specific leaf and, as a result, retrieve the plant's medicinal characteristics [3]. A method for describing plant leaves incorporates a ridge filter and statistical measures to characterize textural information, and also curvelet coefficients and invariant moments to model shape [4]; a related approach is limited to photos of matured plant leaves [5]. A new method detects plant leaf pictures using a combination of HOG and LBP features and then classifies the leaves using a multiclass support vector machine [6]. For plant species recognition and categorization, the SVR method and two combination approaches are used [7]. A feature extraction method for shape characterization and a statistical classifier for distinct feature dimensions are presented in one study: the suggested method beats Zernike moments and curvature scale space because the retrieved features are scale- and rotation-invariant. Several leaf templates are recommended for constructing the species leaf model if the form of leaves within a species differs significantly; in a future investigation, more characteristics will be extracted from leaf vein patterns as well as the positions of petioles to increase recognition performance [8]. The use of a genetic algorithm (GA) to improve the classification performance of KNN is offered as an innovative strategy [9]. Leaf photos loaded from a mobile phone camera allow a smartphone application to automatically recognize and classify different types of plants [10]. A multiple linear regression study was undertaken on 102 different active ingredients isolated from medicinal plants [11]. A probabilistic neural network can use more than one output to classify instances, but it has a long training time and a more complex network structure [12]. A support vector machine has the ability to generalize well, but limited speed and size in training and testing [13]. Global and local feature descriptors and classifiers are employed in the leaf recognition algorithm of [14]. A method for computer-assisted segmentation and classification is proposed in [15]. Decision trees are very easy to use and understand and give fast predictions, but small variations in input data may at times cause huge changes in the tree [16]. An artificial neural network requires less formal statistical training but imposes a huge computational load [17]. A method for determining the correct leaf species among six different types is presented in [18]; in the future, a greater number of leaf classes will be handled with enhanced efficiency.
Because of the two-factor identification procedure, more precise answers were obtained [19]. By executing dense deep feature extraction on retinal pictures, the suggested framework was able to detect the location of HE
in the blood vessels [20]. One review presents the deep learning neural network as an accurate way of categorizing and forecasting unlabeled and unstructured data using sorting and ordering in a feature hierarchy [21]. With varied datasets in complex domains, determining an optimal number of neurons in the hidden layer is a difficult issue on which researchers are still working [22]. In contrast to convolution networks, the CapsuleNet has a better learning process, which allows for higher performance [23].
3 Existing Systems

Existing systems extract different leaf features and then identify the leaf species.
3.1 Shape Features

• Aspect Ratio: The aspect ratio of a leaf is the proportion of its length to its breadth: Aspect Ratio = Width/Height.
• Leaf Length: The length of the leaf's main vein between its two ends.
• Leaf Breadth: The measurement from a leaf's leftmost point to its rightmost point.
• Leaf Diameter: The greatest distance between any two points inside the leaf's coverage area.
• Leaf Extent: The proportion of the contour area to the area of the bounding rectangle.
• Convex Hull: Using the convex hull method, we can get the coordinates of the points enclosing the whole area occupied by the leaf, i.e., its outer shape.
• Solidity: The ratio of the contour area to the convex hull's surface area.
• Leaf Area: Calculated using a smoothed leaf picture. Multiple leaves are photographed from the top, side, and obliquely using a tight bounding region. Leaf region segmentation, region filling, and area computation are used to generate average values for each view [5].
• Leaf Perimeter: Determined by counting the number of pixels that make up the leaf margin.
• Compactness: The square of the perimeter divided by the product of the area and 4π; also called roundness.
• Eccentricity: The ratio of the length of the region-of-interest (ROI) major inertia axis to the length of the ROI's minor inertia axis.
• Rectangularity (R): The ratio between the area of the ROI and the area of the minimum bounding rectangle (MBR).
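Several of these shape features can be computed directly from a leaf contour with OpenCV (version 4 API assumed); the ellipse below is a synthetic stand-in for a segmented leaf mask:

```python
import cv2
import numpy as np

# Synthetic binary "leaf" mask: a filled white ellipse on black background.
mask = np.zeros((200, 200), np.uint8)
cv2.ellipse(mask, (100, 100), (80, 40), 30, 0, 360, 255, -1)

contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnt = max(contours, key=cv2.contourArea)  # largest contour = the leaf

area = cv2.contourArea(cnt)
perimeter = cv2.arcLength(cnt, True)
x, y, w, h = cv2.boundingRect(cnt)

aspect_ratio = float(w) / h                        # width / height
extent = area / (w * h)                            # contour area / bounding box
hull = cv2.convexHull(cnt)
solidity = area / cv2.contourArea(hull)            # contour area / hull area
diameter = np.sqrt(4 * area / np.pi)               # equivalent diameter
compactness = perimeter ** 2 / (4 * np.pi * area)  # roundness

print(aspect_ratio, extent, solidity, diameter, compactness)
```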
3.2 Color Features

• Arithmetic Mean: Arithmetic mean filtering is used to eliminate short-tailed noise from a picture, such as uniform and Gaussian noise, at the cost of softening the image. The arithmetic mean filter averages all pixels in a certain region of an image.
• Standard Deviation: A statistic that expresses the amount of variation in a measured characteristic. In particular, it determines how much an individual measurement deviates from the mean on average. A lower standard deviation is associated with greater accuracy, predictability, and efficiency.
• Skewness: A deviation from the symmetrical bell curve, or normal distribution, in a set of data; the curve is considered skewed if it is shifted to the left or right.
• Kurtosis: In digital image processing, kurtosis values are interpreted in conjunction with noise and resolution measurements: high kurtosis values tend to accompany low noise and low resolution. The kurtosis value of images with moderate amounts of salt-and-pepper noise is likely to be high.
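These four color moments can be computed per channel with NumPy and SciPy, as in the hedged sketch below (the random image is a stand-in for a real leaf photograph):

```python
import numpy as np
from scipy.stats import skew, kurtosis

# Random stand-in for a BGR leaf photograph.
img = (np.random.rand(128, 128, 3) * 255).astype(np.uint8)
pixels = img.reshape(-1, 3).astype(np.float64)  # one row per pixel (B, G, R)

# First four moments per color channel: mean, standard deviation,
# skewness and kurtosis - the color features described above.
features = []
for ch in range(3):
    v = pixels[:, ch]
    features += [v.mean(), v.std(), skew(v), kurtosis(v)]

print(np.round(features, 3))  # 12-dimensional color feature vector
```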
3.3 Texture Features

• Energy: The term "energy" is commonly used in optimization: an energy function captures the desired solution, and gradient descent is used to find its lowest value, yielding, for example, an image segmentation solution. More generally, "energy" refers to the quantity being minimized or maximized, depending on the target; traditional object detection or segmentation tasks are presented as energy minimization problems.
• Contrast: The difference in luminance or color that distinguishes one entity from another. In visual perception of the real world, contrast is determined by the difference in color and brightness of an object within the same field of view.
• Entropy: In the context of images, entropy characterizes the distribution of intensity levels that individual pixels can take. The entropy value is used in the quantitative analysis and estimation of image information because it allows for a clearer comparison of image details.

The above-mentioned features are extracted from the Flavia and Swedish datasets; we exhibit and discuss these datasets and their accompanying recognition results. There are 32 classes in the Flavia dataset, each with a different number of samples per species; we have chosen 50 photos from each species [24]. The Swedish dataset has 1125 leaves from 15 different species, each with 75 pictures; we have chosen 25 photos from each species [25]. According to the testing results, Shape Context (SC) performs somewhat better than the Bag of Features (BoF) algorithm, whereas the network algorithm's performance is almost equal to SC; however, the Skel algorithm returns the fewest valid matches, despite the fact that SC is frequently utilized on a wide range of shapes. Figure 1 shows the percentage of correct matches for the various current algorithms for identifying leaf species on the Flavia and Swedish databases.

Fig. 1 Percentage of correct matches for various current algorithms (SC, Skel, BoF, Network) on the Swedish and Flavia datasets
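Energy and contrast are commonly derived from a gray-level co-occurrence matrix (GLCM); the sketch below uses scikit-image (version ≥ 0.19 naming) on a random stand-in patch and computes entropy from the normalized co-occurrence probabilities — one plausible realization, not necessarily the exact features used by the surveyed systems:

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

# 8-bit grayscale leaf patch; random data stands in for a real image.
patch = (np.random.rand(64, 64) * 255).astype(np.uint8)

# Gray-level co-occurrence matrix at distance 1, horizontal direction.
glcm = graycomatrix(patch, distances=[1], angles=[0],
                    levels=256, symmetric=True, normed=True)

energy = graycoprops(glcm, "energy")[0, 0]
contrast = graycoprops(glcm, "contrast")[0, 0]

# Entropy from the normalized co-occurrence probabilities.
p = glcm[:, :, 0, 0]
entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))

print(energy, contrast, entropy)
```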
4 Proposed Work

In our proposed system, the main aim is to identify leaf species from two publicly available databases, namely the Flavia and Swedish datasets. Figure 2 shows the flowchart of the proposed approach.
4.1 Image Acquisition

In image processing, image acquisition is the process of retrieving an image from a source, usually hardware-based, so that it can be passed through whatever processes are needed afterward.
4.2 Image Preprocessing

In the preprocessing stage, the image's scale and orientation are normalized before feature computation. Typically, an unprocessed image is a color image with an unknown angle and height. The image is first transformed into binary as well as grayscale forms.
Fig. 2 Block diagram of the proposed system
The picture is used to determine the angle of the leaf's main axis, and the leaf is then rotated so that the major axis is horizontal [4].

• Grayscale conversion: The method of transforming an image to grayscale. The grayscaled pictures are treated using image contrast and intensity enhancement methods before being stacked as slices for further processing.
• Binary conversion: Thresholding is a technique for converting a grayscale image to a binary image. A binary image is a digital image in which each pixel has only two possible values, most commonly black and white.
• Noise removal: Noise can appear in digital images in several forms; noisy pixel values originate from inaccuracies in the digital picture capturing process. Certain forms of noise can be removed using linear filtering, where filters such as the Gaussian filter or other low-pass filters come in handy: an averaging filter, for example, can be used to remove grain noise from an image. To eliminate salt-and-pepper noise from an image, averaging and median filters are utilized.
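A minimal preprocessing pipeline covering these steps — grayscale conversion, median filtering for salt-and-pepper noise, and Otsu-based binarization — might look like the following OpenCV sketch (the random image is a stand-in for an acquired leaf photograph):

```python
import cv2
import numpy as np

# Random stand-in for an acquired BGR leaf photograph.
img = (np.random.rand(128, 128, 3) * 255).astype(np.uint8)

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # grayscale conversion
denoised = cv2.medianBlur(gray, 5)            # removes salt-and-pepper noise

# Otsu's method picks the threshold automatically, producing the
# two-valued (binary) image described above.
thresh, binary = cv2.threshold(denoised, 0, 255,
                               cv2.THRESH_BINARY + cv2.THRESH_OTSU)
print("Otsu threshold:", thresh, "binary values:", np.unique(binary))
```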
4.3 Image Segmentation

Image segmentation is the technique of separating a digital image into pixel groupings called image objects, which decreases image complexity and simplifies image analysis.
4.4 Feature Extraction

Image features such as shape, color, and texture are extracted.
4.5 Classification

Image classification is the method of categorizing and naming groups of pixels or feature vectors within an image according to a specific rule.
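As a concrete example of this step, a K-nearest neighbor classifier — one of the classifiers surveyed above — can be trained on the extracted feature vectors; all data below are dummy placeholders (feature dimensions and class count are illustrative):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Dummy data: 150 leaves x 20 features (shape + color + texture),
# labelled with one of 5 hypothetical species.
X = np.random.rand(150, 20)
y = np.random.randint(0, 5, size=150)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=1)

knn = KNeighborsClassifier(n_neighbors=3).fit(X_tr, y_tr)
print("accuracy:", knn.score(X_te, y_te))
```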
4.6 Result

Using the Otsu algorithm, the inter-class variance (between classes/segments) and intra-class variance (within a single segment) are measured, and the leaf species is identified.
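Otsu's method chooses the threshold that maximizes exactly this inter-class (between-class) variance; a small self-contained sketch of that computation (equivalent in spirit to cv2's THRESH_OTSU) is:

```python
import numpy as np

def otsu_threshold(gray: np.ndarray) -> int:
    """Return the threshold maximizing the inter-class (between-class) variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    best_t, best_var = 0, 0.0
    for t in range(1, 256):
        w0, w1 = prob[:t].sum(), prob[t:].sum()   # class weights
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (np.arange(t) * prob[:t]).sum() / w0        # class-0 mean
        mu1 = (np.arange(t, 256) * prob[t:]).sum() / w1   # class-1 mean
        between = w0 * w1 * (mu0 - mu1) ** 2      # inter-class variance
        if between > best_var:
            best_var, best_t = between, t
    return best_t

gray = (np.random.rand(64, 64) * 255).astype(np.uint8)  # dummy grayscale image
print("Otsu threshold:", otsu_threshold(gray))
```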
5 Conclusion and Future Scope

After studying the above classification techniques, we concluded that the nearest neighbor approach is the simplest classification technique, although the extra time required for predictions is a drawback of the KNN system. We can increase the leaf species recognition rate and the algorithm's efficiency by extending the run time. Experiments will be conducted on databases such as Flavia and Swedish. This work can be extended to a larger number of leaf classes, improving efficiency in the future.
References

1. B.S. Anami, S.S. Nandyal, A combined color, texture and edge features based approach for identification and classification of Indian medicinal plants. Int. J. Comput. Appl. 6(12), 0975–8887 (2010)
2. D. Tomar, S. Agarwal, Leaf recognition for plant classification using direct acyclic graph filter and curvelet transform with neuro-fuzzy based multi-class least squares twin support vector machine. Int. J. Image Graph. 16(03) (2016)
3. D. Venkataraman, S. Narasimhan, N. Shankar, S. Sidharth, D. Prasath, Leaf recognition algorithm for retrieving medicinal information, in Intelligent Systems Technologies and Applications Conference, 177–191 (2016)
4. J. Chaki, R. Parekh, S. Bhattacharya, Plant leaf recognition using ridge filter and curvelet transform with neuro-fuzzy classifier, in International Conference on Advanced Computing, Networking, and Informatics, vol. 43, 37–44 (2015)
5. S. Kumar, Leaf color, area and edge features based approach for identification of Indian medicinal plants. Indian J. Comput. Sci. Eng. 3(3), 436–442 (2012)
6. M.A. Islam, Md.S.I. Yousuf, M.M. Billah, Automatic plant detection using HOG and LBP features with SVM. Int. J. Comput. 33(1), 26–38 (2019)
7. S.S. Kumar, Plant species identification using SIFT and SURF technique. Int. J. Sci. Res. 6(3) (2017)
8. C.-Y. Gwo, C.-H. Wei, Plant identification through images: using feature extraction of key points on leaf contours. Appl. Plant Sci. J. 1(11), 1–9 (2013)
9. N. Suguna, K. Thanushkodi, An improved k-nearest neighbor classification using genetic algorithm. Int. J. Comput. Sci. 7(2) (2010)
10. S. Shejwal, P. Nikale, A. Datir, Automatic plant leaf classification on mobile field guide. Int. J. Comp. Sci. Technol. (2015)
11. C.X. Xue, X.Y. Zhang, M.C. Liu, Z.D. Hu, B.T. Fan, Study of probabilistic neural networks to classify the active compounds in medicinal plants. J. Pharm. Biomed. Anal. 38, 497–507 (2005)
12. S.S. Sawant, P.S. Topannavar, Introduction to probabilistic neural network used for image classifications. Int. J. Adv. Res. Comput. Sci. Softw. Eng. 05 (2015)
13. Y. Zhang, Support vector machine classification algorithm and its application, in International Conference on Information Computing and Applications, 179–186 (2012)
14. A.V. Sethulekshmi, K. Sreekumar, Ayurvedic leaf recognition for plant classification. Int. J. Comput. Sci. Inf. Technol. 5(6) (2014)
15. S.S. Panchal, R. Sonar, Pomegranate leaf disease detection using support vector machine. Int. J. Eng. Comput. Sci. (2016)
16. B. Patel, K. Rana, A survey on decision tree algorithm for classification. Int. J. Eng. Dev. Res. 2(1) (2014)
17. P. Kumar, P. Sharma, Artificial neural networks: a study. Int. J. Emerg. Eng. Res. Technol. 2(2), 143–148 (2014)
18. R. Janani, A. Gopal, Identification of selected medicinal plant leaves using image features and ANN, in International Conference on Advanced Electronic Systems (2013)
19. R. Dhaya, Flawless identification of Fusarium oxysporum in tomato plant leaves by machine learning algorithm. J. Innovative Image Proc. 02(04), 194–201 (2020)
20. A. Sungheetha, R. Sharma, Design an early detection and classification for diabetic retinopathy by deep feature extraction based convolution neural network. J. Trends Comput. Sci. Smart Technol. 03(02), 81–94 (2021)
21. A. Bashar, Survey on evolving deep learning neural network architectures. J. Artif. Intell. 1(02), 73–82 (2019)
22. J. Samuel Manoharan, Study of variants of extreme learning machine (ELM) brands and its performance measure on classification algorithm. J. Soft Comput. Paradigm 03(02), 83–95 (2021)
23. T. Vijayakumar, Comparative study of capsule neural network in various applications. J. Artif. Intell. Capsule Netw. 01(01), 19–27 (2019)
24. S.G. Wu, F.S. Bao, E.Y. Xu, Y. Wang, Y. Chang, Q. Xiang, A leaf recognition algorithm for plant classification using probabilistic neural network, in Proceedings of IEEE International Symposium on Signal Processing and Information Technology, 11–16 (2007)
25. O.J.O. Soderkvist, Computer vision classification of leaves from Swedish trees. M.S. thesis, Department of Electrical Engineering, Linkoping Univ., Sweden (2001)
COVID-19 Detection Using X-Ray Images by Using Convolutional Neural Network S. L. Jany Shabu, S. Bharath Vinay Reddy, R. Satya Ranga Vara Prasad, J. Refonaa, and S. Dhamodaran
Abstract The 2019 novel coronavirus disease is otherwise called COVID-19. It is brought about by Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), a betacoronavirus. The seriousness of the illness can be seen in the enormous number of deaths and infections all over the planet. If diagnosed early, the illness can be effectively controlled. It is feasible to perform laboratory tests for analysis, but they are limited by the available test hardware. Computed tomography (CT) can also be used to diagnose the illness. Specifically, the infection detection pipeline is designed to predict, from a standard image input, the normal, viral pneumonia, and COVID-19 classes. A comparative study is also conducted against existing techniques. Keywords COVID-19 prediction · CNN model · Deep learning · Image processing using CNN · X-rays
1 Introduction In-depth CT imaging studies are utilized to diagnose COVID-19. Some researchers have also performed population-based chest X-ray studies of COVID-19 patients. To recognize COVID-19 infection, a method called COVID-Net has been developed and applied to a local database. Utilizing in-depth imaging from chest X-rays gives the best outcomes. Deep learning methods are widely utilized in clinical imaging, and pneumonia detection is performed utilizing such a network. This paper gives an automated method for assessing COVID-19 from a deep learning viewpoint. The proposed network utilizes a multi-functional analysis. Interconnecting firmly established pipelines has many benefits, although a great deal of reduction occurs within the network. The filter utilized is not that of the conventional convolutional neural network (CNN); this function utilizes an altogether different network. S. L. Jany Shabu (B) · S. Bharath Vinay Reddy · R. Satya Ranga Vara Prasad · J. Refonaa · S. Dhamodaran Sathyabama Institute of Science and Technology, Chennai-600119, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_43
2 Literature Survey The outbreak of the acute respiratory condition known as COVID-2019 (SARS-CoV-2) has threatened the whole world [1]. The world is doing its best to battle the spread of this dangerous infection across infrastructure, finance, news markets, insurance, medical care and many other sectors. Researchers have applied their insight into mathematical modeling to analyze the state of the epidemic, utilizing data from across the nation (Punn et al. [2]). The sickness has spread to 212 nations and areas and has infected (confirmed) several million individuals. In India, the infection was first reported on January 30, 2020 in a student returning from Wuhan to Kerala. Across India (as of May 3, 2020), the number of individuals infected is over 37,000. Most investigations and journals center on the number of infected individuals across the nation (Ghosh et al. [3]). For predicting the worldwide COVID-19 epidemic, epidemiological and statistical models have been important to specialists and notable in the media. Because of the high degree of uncertainty and absence of significant data, the standard models are not adequate over the long term (Ardabili et al. [4]). However, prediction requires adequate data. At the same time, the future is not as predictable as it used to be. What is more, the forecast depends on the credibility of the news media and the anticipated changes. Psychological factors additionally play a significant part in how individuals perceive and react to the impacts of the sickness and what it can mean for them (Petropoulos and Makridakis [5]). A similar observation holds elsewhere: prediction requires adequate data while the future remains uncertain; the hypothesis depends on the credibility of the news media and the anticipated changes, and psychological factors likewise play a significant part in how individuals perceive and react to the danger of illness and the fear that it could affect them (Bhatnagar [6]). The data from March 15, 2020 to April 30, 2020 is utilized to validate the model, and the internal rate is kept constant. In some Indian states, for example, Maharashtra, Gujarat and Delhi, individuals are infected every day. The underlying infection count is on the rise and is expected to keep growing; sudden outbreaks are explained by differences in the way of life of these three states (Shekhar [7]). The abrupt rise in COVID-19 is putting a strain on worldwide health services. At this stage, it is critical to diagnose the illness rapidly, precisely and early. To support decision making and planning in health programs, the review utilized blood tests from 485 infected patients in Wuhan, China, to identify indicators of mortality from the sickness (Yan et al. [8]). Based on the official data transmission framework, this report inspects how the 2019 epidemic propagated (Li et al. [9]). In Jany Shabu et al. [10], classification models are created during the training phase. Using the classification model, classification is done in the testing phase.
Vijayakumar [11] observed that convolutional neural networks and capsule networks face the same problem of high power consumption.
3 Objective The design is done with the help of the Python programming language and PyTorch to produce and train a ResNet-18 model and apply it to an X-ray radiography dataset. A custom Dataset and DataLoader are produced in PyTorch. A ResNet-18 model is trained in PyTorch to perform image classification; with this, one is able to produce convolutional neural networks and train them to classify X-ray images with reasonably high accuracy. This project is meant for the scanning of X-rays such that they can be separated by disease. The technology can be used in hospitals for disease identification and risk assessment.
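A minimal sketch of the setup described above, assuming the torchvision package and the three-class label set used later in this paper (normal, COVID-19, viral pneumonia); the learning rate is an illustrative assumption, not a value reported here.

import torch
import torch.nn as nn
from torchvision import models

# Load a ResNet-18 backbone pre-trained on ImageNet.
model = models.resnet18(pretrained=True)

# Replace the final fully connected layer so the network predicts
# three classes: normal, COVID-19, viral pneumonia.
model.fc = nn.Linear(model.fc.in_features, 3)

# Standard objective and optimizer for multi-class image classification.
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)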
4 Methodology The review utilized a COVID-19 chest X-ray dataset and a pneumonia data bundle. Utilizing these two sources, we set up balanced data overall for the normal, viral pneumonia and COVID-19 classes. The COVID-19 class is composed of chest X-rays, and these pictures are utilized in the search. The pictures have been resized because of their various sizes, and the input pictures are standardized prior to handling; normal pictures are added so that the images are free of errors due to lighting. This section presents essential data preparation procedures for AI that are suitable for machine design and training. We utilized three stores to set up the data: two packages from GitHub and one more package from Kaggle. Folder 2 belongs to the COVID-19 and standard media, and the data was organized as normal versus COVID-19. We pack the data and targets before CNN training. Prior to training, we initialize a data generator to read the pictures from the root directory rather than labeling them individually; by default, the data generator identifies the class-based pictures and shows the dataset. To feed the CNN with standard computations, the grayscale values should be normalized. Then, chest radiographs can be categorized as one of three classes: normal, COVID-19, or viral pneumonia (Figs. 1 and 2). Algorithm A neural network might be considered a type of computing model, and CNNs are a type of neural network. CNN (ConvNet/CNN) is used in current vision systems. The approach is frequently used, utilizing relevant filters in a flexible manner. The model extracts higher-quality input features as we go deeper into the CNN layers.
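A minimal sketch of such a directory-based data generator in PyTorch, assuming the images sit under one root folder with a sub-folder per class; the folder name xray_data/, the 224 x 224 input size, and the ImageNet normalization constants are illustrative assumptions, not details from the paper.

from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# Resize and normalize every image so all inputs share one standard size and scale.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),                 # uniform input size
    transforms.Grayscale(num_output_channels=3),   # X-rays as 3-channel input
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# ImageFolder labels each image from its sub-directory name,
# so no per-image labeling step is needed.
dataset = datasets.ImageFolder(root="xray_data/", transform=preprocess)
loader = DataLoader(dataset, batch_size=32, shuffle=True)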
Fig. 1 Normal person
Fig. 2 Person with COVID +ve
CNN is split into several layers. With the assistance of filters, the convolution layer extracts the main features from the photographs. The convolution filter employs the convolve operation to run across the input in the two dimensions x and y, extracting image elements and providing a 2D matrix output of lower dimension (a feature map). The activation function ReLU (Rectified Linear Unit) is then applied.
• The dataset used is a publicly available dataset. It has chest X-ray images of non-COVID-19 and COVID-19 patients. These dataset images are resized.
• The CNN was trained using this database. The dataset images are divided into 70% for training and 30% for testing.
• For training and testing the model, we divide the dataset in the ratio 7:3, as shown in the sketch after this list.
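A sketch of that 7:3 split with PyTorch's random_split, reusing the dataset object from the loading sketch above; the fixed seed is an assumption added for reproducibility.

import torch
from torch.utils.data import random_split

# 70% of the images train the CNN; the remaining 30% test it.
n_total = len(dataset)
n_train = int(0.7 * n_total)
train_set, test_set = random_split(
    dataset, [n_train, n_total - n_train],
    generator=torch.Generator().manual_seed(42))  # fixed seed for reproducibility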
Fig. 3 General architecture diagram
• Our GoogLeNet model consists of inception modules embedded into convolutional neural networks to reduce computational cost and perform convolution on an input with different-sized filters, hence extracting different features (Fig. 3).
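A minimal sketch of adapting torchvision's GoogLeNet, whose inception modules convolve the input with differently sized filters in parallel, to the three X-ray classes; dropping the auxiliary classifier heads is a simplifying assumption for fine-tuning, not a detail stated in the paper.

import torch.nn as nn
from torchvision import models

# GoogLeNet's inception blocks apply 1x1, 3x3 and 5x5 filters in parallel,
# keeping computational cost low while extracting multi-scale features.
googlenet = models.googlenet(pretrained=True)

# Drop the auxiliary classifier heads so only the main output is trained.
googlenet.aux_logits = False
googlenet.aux1 = None
googlenet.aux2 = None

# Re-target the final classifier: normal, COVID-19, viral pneumonia.
googlenet.fc = nn.Linear(googlenet.fc.in_features, 3)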
5 Result Once created, the model is trained using GoogLeNet; the dataset is divided into 70% for training and 30% for testing, the images are resized, and we acquired the best accuracy by training the model. To increase the performance of the model, we searched for hyperparameter values during the training process. PyTorch provides torchvision models (Fig. 4). Pre-training: once the model has learned the objects and textures in the dataset, it can be applied to one's own images and recognition problem. We get an accuracy of 90% (Fig. 5).
Fig. 4 Training model
Fig. 5 Pre-trained model
6 Conclusion The project utilizes X-rays to decide whether an individual has COVID-19, viral pneumonia or is normal. By employing stable neural networks in deep learning, we can detect these conditions from X-rays. It is important to note that the outcomes shown should not be assumed to hold in all other settings. For instance, the primary data came from European patients; processing of data from other patients all over the planet would be needed if a better level is required using worldwide data. What is more, sex differences in the data provided would give more insight about the model, as breast tissue can obscure the lung region, and it is not clear whether this affects the model prediction. Future Works To find the right solution, we have distinguished two potential central issues. The first concerns the definition of pneumonia signatures, particularly ground-glass opacities. The subsequent step is to expand the COVID-19 archive to resolve the issues of data integration and the utilization of various sources.
References

1. Features, evaluation and treatment coronavirus (COVID-19). Available: https://www.ncbi.nlm.nih.gov/books/NBK554776/, May 18 (2020)
2. N.S. Punn, S.K. Sonbhadra, S. Agarwal, COVID-19 epidemic analysis using machine learning and deep learning algorithms. medRxiv, June 1 (2020)
3. P. Ghosh, R. Ghosh, B. Chakraborty, COVID-19 in India: state-wise analysis and prediction. medRxiv, https://doi.org/10.1101/2020.04.24.20077792, May 19 (2020)
4. S.F. Ardabili, A. Mosavi, P. Ghamisi, F. Ferdinand, A.R. Varkonyi-Koczy, U. Reuter, T. Rabczuk, P.M. Atkinson, COVID-19 outbreak prediction with machine learning. Available at SSRN: https://ssrn.com/abstract=3580188 or https://doi.org/10.2139/ssrn.3580188, April 19 (2020)
5. F. Petropoulos, S. Makridakis, Forecasting the novel coronavirus COVID-19. https://doi.org/10.1371/journal.pone.0231236, March 31 (2020)
6. A.M.K.K. Bhatnagar, Modeling and predictions for COVID-19 spread in India. Published: April (2020)
7. H. Shekhar, Prediction of spreads of COVID-19 in India from current trend. medRxiv, May 06 (2020)
8. L. Yan, H. Zhang, J. Goncalves et al., An interpretable mortality prediction model for COVID-19 patients. Nat. Mach. Intell. 2, 283–288, May 14 (2020)
9. L. Li, Z. Yang, Z. Dang, C. Meng, J. Huang, H. Meng, D. Wang, G. Chen, J. Zhang, H. Peng, Y. Shao, Propagation analysis and prediction of the COVID-19. Infec. Dis. Modell. 5
10. S.L. Jany Shabu, C. Jayakumar, Brain tumor classification with MRI brain images using 2-level GLCM features and sparse representation based segmentation, in Proceedings of the Third International Conference on Intelligent Sustainable Systems [ICISS 2020] (IEEE)
11. T. Vijayakumar, Comparative study of capsule neural network in various applications. J. Artif. Intell. 1(01), 19–27 (2019)
Polarimetric Technique for Forest Target Detection Using Scattering-Based Vector Parameters Plasin Francis Dias and R. M. Banakar
Abstract Synthetic aperture radar signal energy penetrates deeper into forest area. The extraction of the information of the forest area is more feasible by synthetic aperture radar. Several research studies are found on extracting data parameters such as forest biomass, tree height and its density. The biomass depicts the healthiness and surrounding conditions of the forest area. Through synthetic aperture radar it is possible to know the ecological conditions of the forest. This paper analyzes the vector parameters related to polarimetry decomposition methods. These vector parameters depict the scattering involved in the target. The role of the forest in climate control is also discussed in this paper. The image data is modeled using system level model. Keywords Polarimetry · Polarization · Scattering mechanism · Synthetic aperture radar · Vector parameters
1 Introduction Synthetic aperture radar plays a vital role in forest parameter analysis. Forest is one of the greatest assets of every country. The role of the forest is essential for the life cycle as well as the ecosystem of the earth's surface, so in today's world, saving the forest has become an essential task. There are several threats from which the forest should be protected; the major ones are fire scars and wind, so forest observation is important in this regard. Most research-oriented study is ongoing in forest cover mapping and biomass mapping. Synthetic aperture radar is an advanced remote sensing technology that plays a crucial role in the analysis of forest parameters. The synthetic aperture radar signal is more sensitive to the structure of P. Francis Dias (B) Department of Electronics and Communication Engineering, KLS VDIT, Haliyal, India e-mail: [email protected] R. M. Banakar Department of Electronics and Communication Engineering, BVBCET, Hubli, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_44
the plant. The forest consists of several elements such as trees, plants, water resources, animals and timber. Saving the forest helps sustain the life cycle and the ecosystem. The requirement for ayurvedic and natural medicine is also fulfilled by the forest area. A major function of the forest is to reduce soil erosion by restricting the flow of water; the forest also reduces the speed of that flow. One more important feature is that it releases oxygen and absorbs carbon dioxide from nature. The forest basically helps to maintain the fertility of the soil. Around 33% of the earth's land surface is covered by forest. The forest plays a major role in the regulation of the carbon cycle; it absorbs much of the carbon dioxide near the earth's surface. Forest resource management is therefore an essential factor, also called forest ecology. Biomass estimation is the prime concern for the classification of the forest area and is done through the measurement of trees. The biomass calculation normally involves two basic components: the first is trunk diameter, and the second is tree height. The information on forest biomass normally supports forest resource management methods. The four basic polarimetric radar parameters of importance for forests are scattering amplitude, amplitude ratios, relative phase angles and coherence. This paper accordingly enriches the information regarding the covariance and coherence involved. Several factors make clear the need for observation of forest biomass: the first is to know the quantity in the global carbon cycle; the second is to understand the effect of global warming conditions; the third is to understand the quantity of carbon stocks in the forest; and further, to find external sources of carbon. When we look at environmental problems, deforestation and forest regrowth are of prime concern. Monitoring these two issues has become essential, as the forest plays a major role in the carbon cycle and global warming. The major role of forest parameter information extraction by synthetic aperture radar is to find the carbon stock. The carbon cycle study is normally done considering the tree biomass existing above ground. Classification of the various tree species is comparatively difficult for synthetic aperture radar compared with other optical sensors [1]. The understanding of backscatter information from the forest is compared with some existing models of vegetation; in particular, it is compared with the water cloud model, which is used for vegetation representation. The model consists of two parameters, the first being vegetation canopy and the second backscatter. The vegetation canopy is normally considered a random distribution of water droplets, used to know the water content present. The second parameter, backscatter, carries information related to the moisture content as well as the height of the plant; the moisture content is measured from soil and vegetation. The same principle can be applied to the backscatter information of forests. These numerical models fall short on some parameters related to branches, size and direction and are hence not found beneficial. So, in practice, empirical- and model-based approaches have been used in recent years for forest parameter analysis. Among the four approaches, the first uses the relation between normalized radar cross section and biomass; the second is based on information obtained from synthetic aperture radar interferometry; the third uses data obtained from PolSAR and Pol-InSAR; and the fourth uses the relation between biomass and image texture [1, 2].
One more consideration in forest analysis is the implementation of models. Models cannot be applied to every type of forest; their application depends on the type of forest and its location. Forest types include conifers, deciduous forest, rain forest, broad- and needle-leaved forest and varying ground topography, to name a few, and depending on the forest type, the model functions are calculated. The author Kazuo Ouchi presents details of various approaches for the implementation of classification models related to forest information extraction, describing the importance of these models, the advanced techniques of implementation, and texture-based approaches [1]. Another work presents information regarding wetland targets and species of the forest area; the importance of the wetland ecosystem is featured, wetland targets are identified using scattering mechanisms, and the scattering behavior of the target area for L- and P-band is presented [3]. The authors in [4, 5] present details regarding SAR applications in forestry, including the backscatter components of the forest area and the importance of the various bands of operation for identifying parts of the tree structure. According to the data presented, in K-band only leaves are identified. In X-band, leaves, twigs and small branches are identified. C-band operation helps in identifying the leaves and the primary and secondary branches. L-band recognizes primary and secondary branches, trunks and ground. P-band provides information regarding the main branches, trunk, ground and the interaction between ground and trunk. This clearly shows that backscatter information depends on the order of the wavelength of the operating band. Various tree parameter measurements are presented for biomass calculation; the physical parameter extraction depends on the polarization method with which the backscatter is measured [6, 7]. Further work presents the decomposition methods used for the assessment of forest degradation; for L- and P-band data, the levels of degradation of the forest are represented. The target is basically the reflecting object, and to understand its physical properties, decomposition methods are used. The authors analyze polarimetric SAR data using the Cloude–Pottier decomposition method and present the entropy variation for healthy as well as degraded forest; the forest exhibits a volume scattering mechanism [8, 9]. Another study presents details related to forest biomass, where the calculation related to carbon accounting is done through regression models; four types of regression models, namely linear, polynomial, support vector and random forest, are used to analyze the data [10]. Various types of forest classification are explained in this context. Machine learning approaches are applied to improve the biomass estimation calculation; nowadays most of the analysis is done by machine learning methods, and SAR is also becoming part of such data analysis. The basic aspect of correlation is explored while dealing with forest biomass measurement parameters. Five different types of forest are discussed, and bamboo species analysis is taken into consideration by the author to support the parameter of study. The correlation
coefficient varied for each classification method based on the type of forest under study, showing the sensitivity of the improvement in the correlation coefficient. The regression models are based on predictor variables [11]. In recent studies, the measuring parameter of interest is the biomass estimation of the forest area. Considering the carbon cycle and global warming threats, it is essential to identify the location of forest targets on the earth's surface. In recent years, SAR remote sensing has been covering most forestry applications. Sections 2 and 3 discuss the forest parameter analysis and the impact of spotlight and strip map modes in more detail. Section 4 depicts the result analysis. Sections 5 and 6 present the discussion and the conclusions.
2 Forest Parameter Analysis The preferred bands of operation for radar over forest areas are generally P-band, L-band and C-band. P-band has a wavelength of 30–100 cm, L-band has a 23 cm wavelength, and C-band has a wavelength of around 5 cm. The sensitivity to forest structure is greatest for C-band backscatter. The biophysical parameters of the forest depend on the polarization direction of the radar signal. From the research survey, the dynamic range of P-band data is around 15 dB for HV polarization, 11 dB for HH and 4.8 dB for VV. For L-band, the dynamic range falls around 8.6 dB, with about 5.3 dB at HH and 4.6 dB at VV polarization. For C-band, it is as low as 4 dB for HV polarization and around 2 dB for HH and VV. This information is extracted from Le Toan's representation of pine stands for dynamic range and from Anup Kumar Das's forest application research. The measurable biomass level is one more parameter that depends on the band of operation in synthetic aperture radar polarimetry: from the survey, it is about 80 ton/ha for C-band, 100 ton/ha for L-band and 200 ton/ha for P-band. The tree structure of the forest contributes the various scattering mechanisms involved in the object: the branches provide single bounce and volume scattering, the tree trunk allows surface scattering, the branch–ground and trunk–ground interactions produce double bounce scattering, and the direct reflection from the ground contributes surface scattering. The polarimetric data reveals such scattering mechanism details and canopy structure information [12, 13]. The acquisition of the image during the various seasons also plays a major role in identifying the object [14, 15]; dry season images are better than wet season images for object identification.
Table 1 Parts of target identification

Forest characteristics and scattering mechanisms

Band | Polarization | Image acquisition mode | Scattering mechanism | Parts of target identification
L    | HH/HV/VV     | SLC/MLC                | Surface scattering   | Ground, tree trunk
P    | HV           | SLC/MLC                | Double bounce        | Tree trunk and ground; branch and ground
C    | HV/VV        | SLC/MLC                | Volume scattering    | Leaf
The other parameters of importance are incidence angle, polarization and mode of imaging. The polarization discriminates the features of the reflecting object. Dry season data is preferred by most researchers. The next parameter of consideration is frequency, which decides the penetration of the signal into the object; this also depends on the needs of the application and the analysis to be carried out on the reflecting object. The information in Table 1 provides details of the characteristics of the forest and the scattering mechanisms involved. The scattering vector parameter identifies the reflection property of the target or object; it is a directional parameter. The three different scattering vector parameters are represented as k1, k2 and k3, and they indicate the scattering features involved. The first parameter, k1, indicates the characteristics of the surface scattering mechanism and normally involves the polarization of the wave. Scattering-based vector parameters have co-polarization and cross-polarization components: k1 holds the HH polarization, k2 holds the HV polarization, and k3 holds the VV polarization. In HH polarization, transmit and receive are in the same plane of operation; VV polarization likewise depicts single-plane polarization, while HV polarization depicts dual-plane polarization. To obtain the values of these parameters, the polarization wave information is accessed. The parameters also depend on the method of decomposition used; these decomposition methods are namely the linear and the Pauli decomposition. When k1 is taken for linear decomposition analysis, it has the Shh element involved; k2 has the Shv information; and k3 has the Svv information. These values are mapped to the RGB components of the images. The vertical polarization is not considered in the Pauli decomposition method. When electromagnetic wave analysis is considered, the scattering parameters in the first method, the linear decomposition, occupy the horizontal, vertical and cross-polarized movement of the electromagnetic wave; in the second method, the electromagnetic wave is related in horizontal and cross-polarization movement, and movement of the electromagnetic wave in the vertical direction is not allowed. The modes of movement of the electric and magnetic fields are taken in two forms, known as single look and multi look. Polarization occurs due to scattering, which is similar to reflection. The roles of polarization in the two different image acquisition methods are related below.
Considering the seasonal aspect of image acquisition, the preferred season for forest parameters is the dry season. Understanding image acquisition in single look and multi look also has a notable effect. In single look, the image is observed in one plane and one position, whereas in multi look the image is observed from different positions as well as different directions. For multi look data, the information is retrieved from all directions of the tree structure; to access the information, the interaction of leaves, stem and the ground or surface of the forest is taken into account. The reflection direction from the reflecting object is based on the viewing angle of the radar signal, so the different positions and orientations give well-organized interaction features. The mode of analysis for forest structure basically depends on the canopy structure involved, so the information obtained through this way of acquisition helps in processing effectively. Multi look data is related to the spotlight mode method.
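To make the preceding description concrete, the following is a minimal NumPy sketch of forming the linear (lexicographic) and Pauli scattering vectors from the Shh, Shv and Svv channels and mapping their magnitudes to RGB for display; the sqrt(2) normalization follows the standard polarimetry convention, and the array names are illustrative rather than taken from this chapter.

import numpy as np

def scattering_vectors(S_hh, S_hv, S_vv):
    """Build the lexicographic and Pauli target vectors per pixel."""
    # Lexicographic (linear) basis: [Shh, sqrt(2)*Shv, Svv]
    k_lex = np.stack([S_hh, np.sqrt(2) * S_hv, S_vv])
    # Pauli basis: [Shh + Svv, Shh - Svv, 2*Shv] / sqrt(2)
    k_pauli = np.stack([S_hh + S_vv, S_hh - S_vv, 2 * S_hv]) / np.sqrt(2)
    return k_lex, k_pauli

def to_rgb(k):
    """Map the magnitudes |k1|, |k2|, |k3| to R, G, B channels."""
    mag = np.abs(k)
    mag /= mag.max(axis=(1, 2), keepdims=True)   # scale each channel to [0, 1]
    return np.transpose(mag, (1, 2, 0))          # H x W x 3 display image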
3 Spotlight Mode and Strip Map Mode Analysis When a target like a forest is to be observed, the image is accessed as a whole component for polarimetric data retrieval, so the spotlight mode is preferred. In comparison, in strip map mode the acquisition of the image is restricted to a strip-like zone. Strip map mode is useful for detecting targets that have a single structure, namely bare land, surfaces, water or roads. In strip map mode a single antenna is focused, while in spotlight mode more than one antenna view is considered. In strip map analysis, the acquired image is calibrated in one long structure, so targets having a similar or single structure can be studied in one singular direction, and the featured component adopts the same physical nature. Observing the strip map, one finds that spotlight mode acquisition is well suited for forest or volume scattering targets. The strip map mode normally gives flexibility in the swath width for the target; this is basically achieved by changing the incidence angle. Normally, the swath is the area covered under the beam of incidence. SARs like ERS-1/2 and JERS-1 had fixed antennas and hence were not used for swath selection. The newer generation SARs like RADARSAT-1/2, ENVISAT ASAR, ALOS PALSAR, TerraSAR-X and COSMO-SkyMed have the facility of selecting various swath modes. In spotlight mode, finer azimuth resolution is achieved compared to strip map mode; multiple viewing angles are possible, and multiple scenes of the images are obtained. These three basic features characterize the spotlight mode relative to the strip map mode. Synthetic aperture radar signal energy penetrates deep into the forest area, and the extraction of information from the forest area is more feasible with SAR. Several research studies are ongoing for extracting data parameters such as forest biomass, height and density. This paper analyzes the vector parameters related to decomposition methods; these vector parameters depict the scattering involved in the target. The role of the forest in climate control is also discussed. The biomass depicts the healthiness and surrounding conditions of the forest area, and through SAR it is possible to know the ecological conditions of the forest. The SAR signals are
sensitive to plant canopy, structure, size, orientation and moisture content of leaves, branches and trunks.
4 Results The image analysis is carried out on a sample image of a forest in Puerto Rico. The study involved understanding features of the forest, such as biomass and the scattering mechanisms involved in the target, through entropy and anisotropy. Figure 1 represents the image used for the analysis of certain observed forest parameters. This image is referred from NASA's space radar flight over Puerto Rico to assess forest damage after hurricanes. The El Yunque National Forest, earlier known as the Caribbean National Forest, is situated in northeastern Puerto Rico. It is the only tropical rainforest in the United States National Forest System and United States Forest Service. The forest covers flora and fauna, trees and tall mountains. Figure 1 represents the forest area affected after the hurricane. To extract the information, accessing an image of the particular season is also a considerable factor when it comes to a target like a forest. The two basic decomposition methods, linear and Pauli decomposition, are applied to the sample forest image and are represented in Figs. 2 and 3, respectively. Fig. 1 Sample forest image
Fig. 2 Linear decomposed Puerto Rico forest image
Fig. 3 Pauli decomposed forest image
Figures 2 and 3 show the linear and Pauli decompositions applied to Fig. 1. The selection of the data depending on the season is also an important consideration. As per the survey, dry season images give better clarity; wet season images have less contrast compared to dry season images. The second aspect is the time of acquisition of the image: normally, there are daily changes in the moisture content of the forest vegetation. The third prime concern is the leaf-on and leaf-off situation of the forest. Seasonal variation is normally found in all forest areas.
5 Discussions The analysis parameters for the forest target are represented using the fish bone diagram shown in Fig. 4. In this fish bone diagram, the center part depicts the process of target identification; the side bones represent the parameters of representation, entropy and anisotropy. The two parameters depend on the eigenvalues of the image involved. The coherency matrix diagonal elements are measured through the scattering mechanism, which basically depicts the polarization; the polarization is based on the co- and cross-polarization. The fish bone diagram depicts the input and output relation of the parameters extracted for forest feature analysis. Through this diagram, one can study the variation of the entropy and anisotropy parameters in the identification of the target as forest. In polarimetry, SAR images are captured from spaceborne radar, and a mechanism is needed to illustrate the spread of the scattering. This random scattering can be quantified mathematically by statistical analysis. The parameter used is entropy, which illustrates this feature of spread with respect to the wave. The image is quantified through the log probability; taking the log probability and its summation makes the computation much easier. The probability of wave polarization accounts for a result in the range of 0 and 1. The probability can be called the core function, which assists in predicting the threshold value and the
Fig. 4 Fish bone diagram for target identification
range for the forest as the target image. Each of these probabilities depends on the eigenvalues, which are directly obtained from the image pixel values. These eigenvalues represent the peculiar and characteristic behavior of the target image. Taking the matrix as input, the eigenvalues are found through this mathematical step. For a large target image, say m × n, a 3 × 3 window matrix is taken from the original acquired PolSAR image. So eventually, there is a distinct relationship between the target image, the eigenvalues, the probabilities computed from the eigenvalues, and the sum of the products of the log probabilities. Two specific models are needed to characterize the target image. The sensor data evaluation parameters are represented on the left side of the flow diagram in Fig. 5, and the scattering randomness evaluation parameters on the right side. The entropy-based target identification is represented in Fig. 5. The scattering mechanism involved in the forest area is modeled through the coherency matrix elements; the coherency matrix is represented by Tcohe. The data in Table 2 supports the entropy and anisotropy analysis method adopted through the coherency and covariance features, wherein the values obtained relate to the particular area. SAR polarimetry analysis basically helps in the computation of parameters such as entropy and anisotropy, which are useful in understanding the mechanism of the scattering process.
Fig. 5 Entropy-based target identification
Table 2 Forest feature parameters

Entropy and anisotropy analysis

Sample image | Entropy | Anisotropy
Forest image | 0.91    | 3.4
$$T_{\mathrm{cohe}} = \begin{bmatrix} T_{11} & T_{12} & T_{13} \\ T_{21} & T_{22} & T_{23} \\ T_{31} & T_{32} & T_{33} \end{bmatrix} \tag{1}$$
The coherency matrix is denoted as $T_{\mathrm{cohe}}$, as shown in Eq. (1). The diagonal elements of the coherency matrix are represented by $T_{11}$, $T_{22}$ and $T_{33}$; they precisely represent the co- and cross-polarization components. $T_{11}$, $T_{22}$ and $T_{33}$ are the diagonal scattering matrix elements, as represented by Eqs. (2), (3) and (4).
$$T_{11} = S_{hh} + S_{vv} \tag{2}$$
$$T_{22} = S_{hh} - S_{vv} \tag{3}$$
$$T_{33} = S_{hv} \tag{4}$$
The parameter component $S_{hh} + S_{vv}$ relates to single bounce scattering; normally, a rough surface contributes to single bounce scattering. Double bounce scattering is indicated by $S_{hh} - S_{vv}$, and the scattering known as volume scattering is represented by $S_{hv}$. The physical interpretation of the object is possible through the polarimetric parameters, which depict the basic property of the object in terms of the scattering involved. The entropy depicts the measure of the scattering mechanism of the object; its value lies between 0 and 1 for particular scatterers. Anisotropy depicts whether the secondary scattering of the target is equal (isotropic) or anisotropic. Polarimetric parameters are useful for recognizing the target, and the analysis of synthetic aperture radar data basically depends on this computational information. The polarimetric analysis is useful for identifying the scattering involved in the forest area, but when it comes to finding the height of the scattering centers, interferometry is used. The scattering involved in the interior of the forest is obtained through the L- and P-bands, which have long penetration depths.
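A minimal sketch of the entropy and anisotropy computation described above, in the spirit of the Cloude–Pottier decomposition cited earlier; it assumes T is the 3 × 3 spatially averaged Hermitian coherency matrix of a pixel window, and the base-3 logarithm and eigenvalue ordering follow the standard definitions rather than details stated in this chapter.

import numpy as np

def entropy_anisotropy(T):
    """Entropy H and anisotropy A from a 3x3 Hermitian coherency matrix."""
    eigvals = np.linalg.eigvalsh(T)              # real eigenvalues, ascending order
    lam = np.clip(eigvals[::-1], 1e-12, None)    # descending; guard against zeros
    p = lam / lam.sum()                          # pseudo-probabilities p1..p3
    H = -np.sum(p * np.log(p)) / np.log(3)       # entropy in [0, 1], log base 3
    A = (lam[1] - lam[2]) / (lam[1] + lam[2])    # anisotropy from lambda2, lambda3
    return H, A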
6 Conclusions The role of SAR sensors and their peculiar functions has a variety of applications in the remote sensing field. In this paper, the classification of forest areas has been discussed in detail. The features related to forests in the study of polarimetry are described, and the various scattering mechanisms related to forests are investigated. Identification of different targets or objects is studied based on the characteristics of the scattering mechanisms. The entropy variations in the image are analyzed. A sample image is considered for feature analysis purposes; the sample forest images are modeled using a Python system model for the decomposition methods. The coherency matrix elements are analyzed during the process, and the probability model is evaluated for the entropy computation.
References

1. K. Ouchi, Recent trend and advance of synthetic aperture radar with selected topics. Rev. Remote Sens. 1(1), 716–765 (2013). ISSN: 2072-4292
2. C. Thiel, SAR theory and applications to forest cover and disturbance mapping and forest biomass assessment, in ESA PECS, SAR Remote Sensing Course, Cesis (2016), pp. 1–227
3. P. Patel, H.S. Srivastava, R.R. Navalgund, Use of synthetic aperture radar polarimetry to characterize wetland targets of Keoladeo National Park, Bharatpur, India. Curr. Sci. 97(4), 529–537 (2009)
4. A.K. Das, C. Patnaik, Monitoring forest above ground biomass of Gujarat state using multi temporal synthetic aperture radar data. Asian Conf. Remote Sens. 1(1), 1–10 (2017)
5. A.K. Das, SAR applications in forestry. 1(1), 1–47 (2018)
6. M.E. Arrigada, Performance of scattering matrix decomposition and color space for synthetic aperture radar imagery, M.S. thesis (2010), pp. 1–73
7. H. Sun, M. Shimada, F. Xu, Recent advances in synthetic aperture radar remote sensing: systems, data processing and applications. IEEE Geosci. Remote Sens. Lett. 14(11), 2013–2016 (2017)
8. T.T.C. Tuong, H. Tani, X. Wang, N.Q. Thang, H.M. Bui, Combination of SAR polarimetric parameters for estimating tropical forest aboveground biomass. Polish J. Environ. Stud. 29(5), 3353–3365 (2020)
9. Shashikumar, Advances in polarimetry, in SPIE Asia Pacific Remote Sensing APRS Symposium Tutorial (2016), pp. 1–23
10. B.H. Trisasongko, The use of polarimetric SAR data for forest disturbance monitoring. Sens. Imag. 11(1), 1–13 (2010)
11. B. Scheuchi, R. Caves, I. Cumming, G. Staples, Automated sea ice classification using spaceborne polarimetric SAR data, in IGARSS 2001: Scanning the Present and Resolving the Future 1(1), 1–3 (2001)
12. A.O. Varghese, A. Suryavanshi, A.K. Joshi, Analysis of different polarimetric target decomposition methods in forest density classification using C-band SAR data. Int. J. Remote Sens. 37(37), 694–709 (2016)
13. M. Ouarzeddine, B. Souissi, A. Belhadj-Aissa, Classification of polarimetric SAR images based on scattering mechanisms. Univ. Sci. Technol. Houari Boumediene 1(1), 1–6 (2007)
14. S.-W. Chen, Y.-Z. Li, X.-S. Wang, S.-P. Xiao, M. Sato, Modelling and interpretation of scattering mechanisms in polarimetric synthetic aperture radar. IEEE Signal Process. Mag. 1(1), 79–89 (2014)
15. A. Sungheetha, R. Sharma, A novel caps net based image reconstruction and regression analysis. J. Innov. Image Process. (JIIP) 2(03), 156–164 (2020)
Multilingual Identification Using Deep Learning C. Rahul and R. Gopikakumari
Abstract The language identification (LID) problem is one of the main applications of natural language processing (NLP). It is used to identify the languages present in a given corpus. This paper proposes a deep neural network-based model that can perform language identification of English, Sanskrit, Malayalam, Tamil, Kannada, and Hindi in parallel, with an overall accuracy of 97.3%. The model can recognize the input character as Sanskrit, Malayalam, Tamil, Kannada, Punjabi, or Hindi. Recurrent neural network (RNN), long short-term memory (LSTM) network, and gated recurrent unit (GRU) architectures are used to formulate the problem. Keywords Natural language processing · Language identification · Recurrent neural network · Long short-term memory · Gated recurrent unit
1 Introduction Automatic LID refers to the method of automatically identifying the language of a given text from a corpus. Each language in this world has different syntax, semantics and distinguishing features. Automatic language detection is considered the basic step toward achieving varied NLP tasks like text prediction, spell correction, speech recognition, tagging of parts of speech, machine translation, handwriting detection, and mistake correction; among other things in the field of natural language processing, ambiguous word detection is a difficult task. Word frequency-based classifiers fail to recognize such words. Language models are mainly classified into two types: statistical language models and neural language models. The statistical method uses n-gram techniques, and the neural language model uses neural networks for the language identification problem. This paper presents a deep neural network-based model for predicting Sanskrit, Malayalam, Tamil, Kannada, Punjabi, and Hindi in India. The model is implemented using RNN, LSTM, and GRU architectures. C. Rahul (B) · R. Gopikakumari Division of Electronics Engineering, School of Engineering, CUSAT, Cochi, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_45
The Sanskrit language has a powerful grammar, completely defined in the form of the Aṣṭādhyāyī by Panini Maharshi [1]. Keralites speak Malayalam as their first language; it is one of India's official languages and one of the 22 scheduled languages spoken in the country. There are many similarities between the Sanskrit and Malayalam languages. Malayalam was influenced by the Sanskrit and Prakrit languages, which were brought into Kerala by Brahmins. A unique mixture of the native languages of Kerala and the Sanskrit language, known as 'Manipravalam' [2], served as the medium of literary expression after the eleventh century. Malayalam retained a great deal from the Sanskrit language, not only at the lexical level but additionally at the phonemic, morphemic, and syntactic levels of language. Tamil belongs to the Dravidian family of Indian languages and is the official language of the Indian state of Tamil Nadu. It is additionally an official language of the union territory of Puducherry and of the Andaman and Nicobar Islands, Sri Lanka, Malaysia, and Singapore. Tamil language history can be arranged into three periods, namely Old Tamil (300 BC–700 CE), Middle Tamil (700–1600), and Modern Tamil (1600–present). Kannada is the official language of the Indian state of Karnataka and is designated as a classical language of India. Hindi is one of the two official languages of the Government of India and is written in the Devanagari script; it is an official language in nine states and three union territories. This paper proposes a deep neural network-based model to predict the Indian languages Sanskrit, Malayalam, Tamil, Kannada, and Hindi. RNN, LSTM network, and GRU are utilized to formulate the problem, as these are considered efficient architectures for character-level sequential labeling problems [3]. Sanskrit, Malayalam, Tamil, Kannada, Punjabi, and Hindi texts are given as input to the model. DNN-based models are used to build a language classifier [4, 5]. This model should be able to identify the input character as Sanskrit, Malayalam, Tamil, Kannada, or Hindi. The paper is organized as follows: Sect. 2 presents the literature review. In Sect. 3, data preparation for the Sanskrit, Malayalam, Tamil, Kannada, and Hindi languages is discussed. The overall deep learning architecture is discussed in Sect. 4. Sect. 5 discusses the experimental results, followed by the conclusion.
2 Related Work A character-level convolutional neural network [6] for text classification uses the n-gram method to extract character-level features, resulting in improvements in part-of-speech tagging and information retrieval. Unsupervised automatic language identification [7] can be used for multilingual textual data processing. A Polish language identification deep learning model [8] is used to improve speech processing and recognition systems; an LSTM model with two hidden layers of 100 neurons each is used for language modeling and avoids the long-term dependency problem. Various methods like Naive Bayes [9], SVM [10], and n-gram [11] have been applied, but the accuracy achieved is less than 90%. Babhulgaonkar [5] proposed a deep learning-based language identification framework.
Table 1 Number of words and characters in the dataset

                     | Sanskrit  | Malayalam | Tamil     | Kannada   | Hindi
Number of words      | 3,62,512  | 1,86,890  | 1,10,754  | 1,15,642  | 1,17,337
Number of characters | 50,91,099 | 28,25,403 | 22,35,469 | 28,67,532 | 28,75,084
Smitha [12] introduced various mesh generation techniques used for engineering applications. Jacob [13] described the design of a deep learning algorithm for an IoT application using image-based recognition. Vijayakumar [14] explained the capsule network for font style classification. Sathesh [15] described a hybrid parallel image processing algorithm for binary images with an image thinning technique. Sungheetha [16] proposed the classification of remote sensing image scenes using a double feature extraction hybrid deep learning method. Work on deep learning-based language identification is limited; to date, no direct comparison of deep learning models for Sanskrit, Malayalam, Tamil, Kannada, and Hindi language identification has been done.
3 Description of Dataset Experiments are conducted using corpora of Sanskrit, Malayalam, Tamil, Kannada, and Hindi. The Sanskrit corpora are obtained from the Digital Corpus of Sanskrit (DCS) (http://kjc-fs-cluster.kjc.uni-heidelberg.de/dcs/index.php), developed by Oliver Hellwig, and from Computational Linguistics R&D at JNU-India (http://sanskrit.jnu.ac.in/sbg/index.jsp). Malayalam, Tamil, Kannada, and Hindi words are manually constructed by consulting with linguistic experts and from the Internet. The total number of words and characters found in the corpus is shown in Table 1.
4 Methodology A deep neural language identification model is implemented for Sanskrit, Malayalam, Tamil, Kannada, and Hindi data. Figure 1 depicts the model’s overall architecture.
4.1 Proposed Model The goal of this work is to build an automatic language identification system for the Sanskrit, Malayalam, Tamil, Kannada, and Hindi languages. Deep neural network-based models are used to build a language classifier. This model should be able to identify the input character as Sanskrit, Malayalam, Tamil, Kannada, or Hindi automatically.
Fig. 1 Proposed model
The possibility of using RNN, LSTM network, and GRU architectures for this purpose is explored. All these models can capture temporal dependencies within the input sequence; hence, they can be used for NLP tasks. The input sequences of these models are characters of a particular language. A tokenizer is implemented for splitting the corpus into characters. The proposed model is shown in Fig. 1.
4.2 Training The whole dataset is partitioned into 70% as a training set, 20% for testing, and 10% for validation. The model is trained with the datasets mentioned in Table 1. Once trained, the performance of the model is assessed using the validation dataset. The number of training epochs is fixed at 10; the performance of the model degrades beyond 10 epochs. The number of neurons in the input layer is fixed at 500, and the number of hidden layers at 3, using the trial-and-error technique (Sect. 5.1). The combinations (360, 0.5), (240, 0.5), and (128, 0.5) for the number of hidden neurons and dropout, respectively, gave the best performance on the validation set (Sect. 5.1). Softmax is utilized as the activation function for mapping the output into classes. This model uses cross-entropy as the cost function, and an Adam optimizer is utilized to update the parameters in the language classifier. The Word2Vec [17] algorithm is used for character embedding; these embeddings are used as inputs for the next stage of classification. Following the representation of each letter by its
related vector, created by the Word2Vec model, the sequence of characters {C1 … Cn} is fed to the LSTM one by one. The value of 'n' depends on the number of characters present in the corpora. This is shown in Fig. 1. The LSTM model is used to process these characters. The model is also implemented using the RNN and GRU architectures; the results are compared and tabulated in Table 4 (Sect. 5.1). F1-score, recall, precision, and accuracy [3] are standard evaluation metrics used to assess classification performance. Section 5 discusses all of the outcomes. The algorithm of the model is explained as follows:

Step 1: Take a text file as input.
Step 2: Preprocess the text file.
Step 3: Take each word, tokenize it, and save the tokens in an array.
Step 4: Split the words from the array into characters.
Step 5: Convert each word to a vector using the Word2Vec algorithm.
Step 6: Make features from each vector and identify the feature set.
Step 7: Develop the deep neural language identification model using LSTM, RNN, and GRU.
Step 8: Predict the language of the given input.
Step 9: Compare the results using accuracy, precision, recall, and F1-score.
The pseudo-code in support of the model is given after Table 2.

Table 2 Performance measures of language identification module

Number of neurons: hidden layer 1 / hidden layer 2 / hidden layer 3 | Accuracy (%) | Time (h)
300 / 200 / 100 | 92.989 | 540
310 / 210 / 105 | 93.112 | 542
320 / 215 / 108 | 93.234 | 548
325 / 218 / 109 | 94.898 | 552
328 / 220 / 110 | 95.345 | 556
330 / 222 / 111 | 95.991 | 558
335 / 225 / 112 | 96.876 | 562
340 / 230 / 115 | 96.924 | 565
345 / 232 / 120 | 97.387 | 571
350 / 235 / 122 | 97.411 | 576
355 / 238 / 125 | 97.422 | 581
360 / 240 / 128 | 97.623 | 586
370 / 250 / 150 | 97.611 | 590
380 / 260 / 150 | 97.521 | 594
400 / 270 / 150 | 96.112 | 598
# Develop corpora of English, Sanskrit, Malayalam, Tamil, Kannada, and Hindi,
# then split them: 70% as a training set, 20% for testing, and 10% for validation.

# --- Tokenize English, Sanskrit, Malayalam, Tamil, Kannada, and Hindi ---
def character_split(words):
    """Split every tokenized word into its individual characters."""
    final_character_set = []
    for word in words:
        final_character_set.append(list(word))
    return final_character_set

characters = character_split(corpus_words)   # corpus_words: tokenized corpus from Step 3

# --- Character-word embedding ---
num_features = 100     # dimensionality of the character vectors
min_word_count = 1     # minimum count required
num_workers = 4        # number of parallel threads to run
context = 10           # size of the context window
downsampling = 1e-3    # downsample setting for frequent characters

from gensim.models import Word2Vec   # gensim 4.x API
model = Word2Vec(characters, vector_size=num_features, min_count=min_word_count,
                 workers=num_workers, window=context, sample=downsampling)
model.save("char2vec.model")

# --- Make feature vectors ---
import numpy as np

def makeFeatureVec(chars, model, num_features):
    """Average the embedding vectors of the characters in one sequence."""
    featureVec = np.zeros((num_features,), dtype="float32")
    num_chars = 0
    index2word_set = set(model.wv.index_to_key)
    for c in chars:
        if c in index2word_set:
            num_chars += 1
            featureVec = np.add(featureVec, model.wv[c])
    return np.divide(featureVec, max(num_chars, 1))

def getAvgFeatureVecs(char_seqs, model, num_features):
    featureVecs = np.zeros((len(char_seqs), num_features), dtype="float32")
    for counter, chars in enumerate(char_seqs):
        featureVecs[counter] = makeFeatureVec(chars, model, num_features)
    return featureVecs
# --- Deep neural network model ---
from keras.models import Sequential
from keras.layers import LSTM, Dense

drop_out = 0.4
recurrent_drop_out = 0.4
activation_fun = 'relu'   # among the searched hyperparameters (Sect. 5.1)

model = Sequential()
# Initial layer: 500 neurons, matching the input layer size in Table 3.
model.add(LSTM(500, return_sequences=True,
               dropout=drop_out, recurrent_dropout=recurrent_drop_out))
# Three hidden layers; the neuron counts are the best combination found
# by the layer/neuron search described in Sect. 5.1 (see Table 3).
model.add(LSTM(360, return_sequences=True))
model.add(LSTM(240, return_sequences=True))
model.add(LSTM(128))
model.add(Dense(5, activation='softmax'))    # one output per language class
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')
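Step 9 of the algorithm compares the classifiers on accuracy, precision, recall, and F1-score; a short sketch of that evaluation is given below, assuming y_true and y_pred hold the true and predicted language labels for the test split, with scikit-learn as an assumed dependency not named in the paper.

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# y_true, y_pred: integer language labels for the 20% test split.
print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred, average="macro"))
print("Recall   :", recall_score(y_true, y_pred, average="macro"))
print("F1-score :", f1_score(y_true, y_pred, average="macro"))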
5 Experimental Result The experiments are conducted using the following system specification.
• CPU platform: Intel Haswell
• Machine type: 8 vCPUs, 52 GB memory
• NVIDIA Tesla K80, 2 GPUs
• Standard persistent disk: 1 TB
• OS: Debian GNU/Linux 9.9 (stretch) (GNU/Linux 4.9.0-9-amd64 x86_64)
5.1 Language Identification Model The language ID model is implemented with deep learning structures using RNN, LSTM, and GRU. The performance of a deep learning algorithm depends on the identification of the optimal hyperparameters. The trials are done with various combinations of loss functions, optimizers, learning rate, number of epochs, and batch sizes. The learning rate is chosen from the set {0.001, 0.01, 0.1, 1}, and it is observed that the algorithm gives good accuracy with a learning rate of 0.01; when the learning rate is increased to 0.1 or 1, the accuracy score decreases by 3–4%. Thus, the learning rate is set to 0.01 with the Adam optimizer, and the number of epochs is fixed at 10 with a batch size of 64. Different combinations of the number of hidden layers and the number of neurons in each hidden layer are attempted; the tests are carried out on the different combinations to finally arrive at the model giving the most productive result. The model is trained initially with a single hidden layer, with the number of neurons varied from 1 to 400. Thereafter, the model is trained with two hidden layers, with the number of neurons in each layer varied from 1 to 400.
Table 3 Parameters of language identification model

Input layer neurons | 500
Number of hidden layers | 3
First hidden layer neurons | 360
Second hidden layer neurons | 240
Third hidden layer neurons | 128
Output layer neurons | 2
Activation function | Softmax
Optimizer | Adam
Dropout | 0.5
Size of the batch | 64
The rate of learning | 0.01
The total number of training epochs | 100
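A minimal Keras sketch instantiating the Table 3 settings — illustrative only, assuming stacked LSTM layers over 100-dimensional Word2Vec inputs; this is not the authors' exact code:

# Illustrative model built from the Table 3 parameters (500 input-layer
# units, hidden layers of 360, 240 and 128 units, softmax output, Adam
# with learning rate 0.01, dropout 0.5).
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout
from tensorflow.keras.optimizers import Adam

model = Sequential([
    LSTM(500, return_sequences=True, input_shape=(None, 100)),
    Dropout(0.5),
    LSTM(360, return_sequences=True),
    LSTM(240, return_sequences=True),
    LSTM(128),
    Dense(2, activation="softmax"),   # two output neurons per Table 3
])
model.compile(
    loss="sparse_categorical_crossentropy",
    optimizer=Adam(learning_rate=0.01),
    metrics=["accuracy"],
)
# model.fit(X_train, y_train, batch_size=64, epochs=100, validation_split=0.1)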
Accuracy is calculated in each case, and a sample set of results together with the total running time is shown in Table 2. The LSTM architecture with three hidden layers of 360, 240, and 128 neurons shows the highest accuracy of 98.32%. The model is evaluated using RNN and GRU for all the combinations specified in Table 3, and the result is displayed in Table 4. The parameters of the language identification model are given in Table 3. 10% of this dataset is used for validation. In the beginning, the validation loss goes down, and then it increases rapidly at epoch 2. The training loss goes down and almost reaches zero at epoch 10. The results are shown in Fig. 2. The classification performance of the language identification models is evaluated using the standard evaluation metrics: accuracy, precision, recall, and F1-score. The results are shown in Table 4. From Table 4 and Fig. 2, it can be inferred that the language identification model implemented with the LSTM network gives the best performance. It has an accuracy of 97.623% when classifying Sanskrit and Malayalam texts.
Table 4 Performance measures of Sanskrit–Malayalam language identification model
Techniques | Accuracy (%) | Precision | Recall | F1-score
RNN | 97.491 | 0.98 | 0.95 | 0.97
LSTM | 97.623 | 0.99 | 0.94 | 0.96
GRU | 97.585 | 0.98 | 0.95 | 0.97
Fig. 2 Training and validation loss
6 Conclusion This paper details the implementation of deep learning-based LID for Sanskrit, Malayalam, Tamil, Kannada, and Hindi. A very large corpus is needed for the parallel model. The model proposed in this paper can be adopted for developing deep learning-based language identification for any language. As future work, the proposed model can be applied to semantically similar languages, namely Punjabi and Gujarati. Thereafter, a hybrid language identification model can be developed for Sanskrit, Malayalam, Tamil, Kannada, Punjabi, Gujarati, and Hindi.
AI-Based Career Counselling with Chatbots Ajitesh Nair, Ishan Padhy, J. K. Nikhil, S. Sindhura, M. L. Vandana, and B. S. Vijay Krishna
Abstract The days of connecting with a service solely through a keyboard have passed. Users are increasingly using voice assistants and chatbots to communicate with systems. A chatbot is a piece of computer software that converses with humans via messaging platforms by utilising artificial intelligence. Every time the chatbot receives user input, it remembers both the input and the response, allowing chatbots with limited baseline knowledge to grow based on the responses they receive. The chatbot’s precision improves as more responses are provided. In this paper, we will discuss an intelligent chatbot that could replace the traditional method of presenting questionnaires to users to collect data in order to accurately understand the user’s interests and recommend appropriate courses and colleges on the Website TheStudentSuccessapp.com by nSmiles. Based on the user’s previous inputs, this chatbot would intelligently select questions from its question bank, attempting to predict his/her interests and career possibilities with as few questions as possible. When the nSmiles psychometric engine finds an interest, the chatbot will deliver a short intermittent status report to the user, along with the option to examine the recommended list of courses and colleges, before moving on to explore other interests. Finally, a complete report with a list of courses and colleges recommended to the user based on all of his/her interests and choices would be presented. Several other features like email generation and graph visualisations are also included. Keywords Chatbot · Rasa · Artificial intelligence · Machine learning · Career counselling
A. Nair (B) · I. Padhy · J. K. Nikhil · S. Sindhura · M. L. Vandana Computer Science, PES University, Bengaluru, India e-mail: [email protected] M. L. Vandana e-mail: [email protected] B. S. Vijay Krishna nSmiles, Bengaluru, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_46
1 Introduction These days, chatbots are present on every Website for a wide variety of applications. We built a chatbot with Rasa which helps the user take up a career counselling assessment within the chat window. The chatbot would replace the traditional method of presenting questionnaires to the user to collect data to accurately understand the user's interests and recommend appropriate courses and colleges on TheStudentSuccessapp.com by nSmiles. This chatbot would intelligently select questions from its question bank based on the previous inputs given by the user and try to estimate his/her interests and career choices with as few questions as possible. As soon as the nSmiles psychometric engine detects an interest, a short intermittent status report would be provided to the user along with an option to view the recommended list of courses and colleges, and then the chatbot would move on to explore other interests. Finally, a detailed report would be provided, including the list of courses and colleges recommended to the user based on all his/her interests and choices. An email containing the report would be sent to the email id extracted from the user response using regular expressions and NLP techniques. The chatbot makes use of machine learning and AI in the form of pretrained models to respond to a user input. Further, the bot falls back to a default response if it is not able to generate an output with a sufficient confidence level. Bar graph visualisations are also added within the chat window to better represent the user's interests and skills. We have also discussed the various platforms and technologies one could use for creating a chatbot and the challenges one could face while doing so.
2 Literature Survey In order to find the most suitable framework for our requirements, we explored several state-of-the-art chatbots. They are listed below.
2.1 Jollity Chat-Bot-A Contextual AI Assistant [1] The main purpose of the jollity chatbot is to talk with people and aid them by proposing blogs, videos and photographs. The bulk of chatbots are retrieval-based; however, the jollity chatbot is generative. The jollity chatbot is aware of the flow of the interaction and can ease the user's discomfort by offering acceptable or precise solutions. Chatbots are not only used for bookings, education, and hospitality; this chatbot is unusual in that it can operate as a virtual buddy for an individual anytime he needs it, especially during bad moments, by recommending videos, posts, and photographs to make him feel better. This makes it a cost-effective alternative for people who cannot afford a high-priced therapeutic session.
The model is tested using a variety of criteria, including intent accuracy, story-line accuracy and a confusion matrix. Experiment findings show that the system can identify intentions and retrieve suitable replies with a 90% accuracy rate. Objective/Methods. The authors have created a chatbot to converse with human users, ensuring that it entertains and provides suggestions and motivation through difficult moments. It is implemented in Rasa [2], an open-source chatbot framework, and deployed on Telegram. They have incorporated 12 separate intents, each with more than eight text instances, for a total of 100 input samples. Advantages. All the components of the chatbot can be deployed on our own server, thereby increasing security. Since the chatbot uses machine learning, artificial intelligence and interactive learning, it keeps learning on its own as it interacts with people. It can be deployed on any Website, Facebook, Slack, Telegram, etc. Limitations. Development is a little complex, and prior knowledge of the topic is required. Understanding the models used requires deep knowledge of machine learning and artificial intelligence. Frontend development is necessary if deploying on a Website. The frontend must be able to communicate and transfer data between the chat window and the backend.
2.2 Nsmav-Bot: Intelligent Dual Language Tutor System [3] The bot was created with educational restrictions in mind, since many children are unable to receive the proper upbringing and skills that they need as they grow up. It addresses both the rural and urban populations by using two of the country's most widely spoken languages. The bot currently teaches at the primary school level, covering a few chapters from each of the four topics. All of the information is delivered in a logical order and at a pace that is comfortable for the user. Resources such as instructional films, related material and frequently asked questions will be available on the page to which the user will have access. Objective/Methods. The writers have designed a bot that serves as a traditional education tutor for people of all ages, engages with its student user and keeps him or her up to date with the rest of the population. The bot is written with Node.js and linked with Facebook Messenger using the Microsoft Bot framework. Advantages. Microsoft Bot framework [4] is open source. Hence, it is freely available. It has pre-built entities. It is very user-friendly and easy to develop with. It also provides the feature of speech-to-text translation using machine learning. It is multilingual: it supports a variety of languages and can be integrated with Websites, Cortana, Microsoft Teams, Skype, Slack, Facebook Messenger, Kik, etc. Limitations. Only C# or Node.js platforms can be used for development; it does not provide any other development platform. It uses a pay-as-you-use model for the paid version with premium features. All features are not available for free.
2.3 Ticketing Chatbot Service Using Serverless NLP Technology [5] A personal assistant that relies on a human operator can take considerable time to respond to a specific request, such as reserving a ticket, an order or a service request. Many requests for information on the Internet may be combined into a single search. Since time and productivity are valued in business, chatbots must be seen as an alternative method of accepting requests. Chatbots can provide support 24 h a day, which can be beneficial. In a conversation, a chatbot serves as a routing agent, classifying the user's intent. The chatbot uses natural language processing (NLP) to interpret the request and extract some keyword information. Morphological analysis and part-of-speech (POS) tagging are two essential NLP processes. Centred on a series of rules, POS tagging assists in parsing the context of chat language. The rule base is tailored to a certain language and is intended to capture all of the keywords used in chat content. The departure and destination areas, as well as the flight date, are used as keywords in the booking conversation. The system must distinguish between the consumer specifying the city and the date. The role of natural language processing in booking confirmation is to examine different patterns representing ordering requests such as city and date. Based on the conducted experiment, this study found that chatbots can assist with customer care, with an F-measure score of 89.65. Objective/Methods. The authors have built a chatbot to help with ticket purchasing/sales. The system can be divided into three parts: a Node.js webhook, Wit.ai NLP services and the Ticket.com Order API. This bot is built with the Wit AI framework and Node.js. Advantages. Wit AI [6] is open source, which indicates that it has a large developer community base. The NLP engine in the Wit AI chatbot framework is undeniably amongst the best. It offers SDKs in iOS, Node.js, Python and Ruby, and can be integrated with apps, Websites, Slack, Facebook Messenger, wearable devices, home automation, etc. Limitations. Every triggered intent requires a webhook call, which makes it time intensive. A development UI is not provided; hence, development is difficult. Retrieving missing parameters in Wit AI is difficult.
2.4 Effectiveness of an Empathic Chatbot in Combating Adverse Effects of Social Exclusion on Mood [7] Researchers have demonstrated that strategies like emotional service animals can help alleviate the negative consequences of social exclusion. When individuals are removed from social media channels, new ways to intervene using technologies become available. As a result, an anonymous instant messaging chat with a stranger
has been shown to increase self-esteem and mood after social exclusion. In this paper, we see a more open variant of a related intervention: an empathetic chatbot. They investigated whether an empathetic chatbot could be used to counteract the harmful effects of social isolation. Objective/Methods. In this study, the authors investigated whether an empathetic chatbot could protect people from the detrimental impacts of social exclusion. After experiencing social media isolation, participants were randomly picked to interact with an empathetic chatbot about it. It was implemented using the Botkit framework. Advantages. Botkit framework [8] is open source. Hence, it is freely available. It can be customised according to our needs. Fast building of bots is possible and can be integrated with Slack, Facebook Messenger, Twilio SMS, Cisco Webex, Cisco Jabber, Microsoft Teams, Google Hangouts, etc. Limitations. It has no built-in support for NLP. Hence, it cannot process free flowing text. It suffers from many bugs since it is relatively new.
2.5 Exploring and Promoting Diagnostic Transparency and Explainability in Online Symptom Checkers [9] The authors of this paper have proposed a chatbot which interacts with the user and tries to extract the user's symptoms. The chatbot then predicts whether the user has COVID-19 or not. It also provides the users with information regarding diagnosis and the precautions/steps the user must follow for a speedy recovery. Objective/Methods. In this paper, they have attempted to enhance diagnostic transparency by augmenting online symptom checkers with explanations. A COVID-19 OSC is designed which gives three types of explanations. This bot is built with the BotMan framework, PHP and Laravel. Advantages. BotMan framework [10] is open source. Hence, the code is available to be customised. BotMan is the only PHP framework that completely supports PHP chatbot development. BotMan Studio is a BotMan and Laravel installation that works together. BotMan has a bespoke chat widget that you can use right away to add a BotMan-powered chatbot onto your Website. It works with Hangouts, Slack, Telegram, Cisco Spark, Facebook, Websites, WeChat and other platforms. Limitations. We can choose only PHP platforms for development. It does not support any other platform. Conclusions from Literature Survey. The top five open-source frameworks were analysed through various papers on different chatbot projects. They are: Rasa framework, Microsoft Bot framework, Wit AI, Botkit framework and BotMan framework. After analysing the advantages and limitations of the various frameworks, our requirements and suitability, we decided that the Rasa framework is the most suitable
for our needs. We also explored the general architectures of chatbots and their platforms [11]. According to the authors, Eleni Adamopoulou and Lefteris Moussiades, there are seven types of chatbots, namely open domain, closed domain, interpersonal, intrapersonal, rule-based, retrieval-based and human-aided. The different types are explained in detail along with their advantages and disadvantages. This also helped us in designing the architecture of our chatbot. While Rasa makes use of ML models, there are many other possibilities. One of them is deep learning neural networks [12], explored by the author Abul Bashar. In this paper, the author has described the various applications of deep learning neural networks. Although they could be used for document classification, ML models would provide higher accuracies. One of the things deep learning models are best at is speech recognition. Thus, deep learning could be used when a new speech-to-text feature is added to the chatbot, so that the user could speak his response instead of typing it. The other type of model that could possibly replace ML is the capsule neural network, explored by the author Vijayakumar [13]. According to the author, capsule networks are proved to be good at tasks such as intent classification and slot filling, which are essential tasks in any chatbot, with error rates as low as 0.0531. However, training capsule networks is both time- and power-consuming. Another model that does intent classification is the extreme learning machine (ELM). According to Samuel Manoharan [14], ELM can be used for prediction without an iterative tuning procedure. The learning efficiency is also faster than other traditional methods. However, once the dataset becomes large, the success rate of classification reduces due to the computation that is required. Hence, it would not be suitable for a real-time and interactive application like a chatbot. Finally, the challenges that one could face while creating a chatbot are discussed by Rahman et al. [15]. According to the authors, the first challenge is natural language processing, which means understanding what the user is saying. It involves extracting keywords from the message and identifying its meaning. The second challenge is machine learning. It involves predicting the correct output in response to the input given. All these papers have given us immense knowledge of the advantages, disadvantages and challenges of designing a chatbot.
3 Rasa and Its Merits Rasa is easy to integrate and customise, and it is not a state machine. It can be easily integrated with existing systems, and it can be run on the platform of our interest. Rasa supports various intents. It can be deployed in multiple environments. Rasa provides options to check various analytics and data and also provides role-based access control. In an enterprise environment, Rasa provides an option to create different roles, define their permissions and control access to chat logs, NLU training data, stories, responses, models, etc. By default, Rasa provides three roles, namely admin, annotator and tester. Admin users can customise these roles and assign users to them. Rasa framework is open source, highly customisable, makes use of
AI and ML and most importantly can be deployed on our backend server with all components in-house.
4 Data To solve the user's problem, chatbot datasets necessitate a massive amount of data, on which the model is trained using several examples. However, training chatbots with inaccurate or inadequate data produces unfavourable outcomes. Since chatbots do not only answer questions, but also converse with customers, it is critical that the right data is used to train the datasets. A chatbot converts unstructured data into a conversation. This data is commonly unstructured and comes from a variety of sources. A chatbot requires data for two reasons: to understand what people are saying to it and to react appropriately. The different types of training data for Rasa are NLU training data, stories, rules and entities. The NLU training data consists of intent-categorised examples of user utterances. Entities may also be used as training examples. Entities are logically organised chunks of data that can be extracted from a user's post. Extra details can also be added to the training data, such as regular expressions and lookup tables, to help the model correctly recognise intents and entities. Stories can be used to train models that can generalise to unseen conversation paths; they are a type of training data that the assistant uses to learn how to manage its conversations. Rules refer to short sections of dialogue that should always proceed in the same direction. The chatbot will need a general idea of the types of questions it will be asked, as well as the relevant answers. For entity extraction to work, you must either specify training data or build regular expressions to extract entities using the RegexEntityExtractor based on a character pattern; a conceptual sketch of such pattern-based extraction follows Fig. 1. All of the data in this scenario is training data (Fig. 1).
Fig. 1 Rasa architecture
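To make the pattern-based extraction concrete — a plain-Python sketch of what a regex entity extractor does conceptually, not Rasa's actual RegexEntityExtractor implementation; the entity names and patterns are illustrative assumptions:

# Conceptual sketch of regex-based entity extraction.
import re

# Illustrative patterns; a real assistant would declare these in its
# NLU training data for the RegexEntityExtractor.
ENTITY_PATTERNS = {
    "email": r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}",
    "grade": r"\b(?:grade|class)\s*\d{1,2}\b",
}

def extract_entities(message: str) -> list:
    """Return (entity, value) pairs found in the user message."""
    found = []
    for entity, pattern in ENTITY_PATTERNS.items():
        for match in re.finditer(pattern, message, flags=re.IGNORECASE):
            found.append((entity, match.group(0)))
    return found

print(extract_entities("I am in class 12, mail me at a.b@example.com"))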
5 Framework and Method Rasa is a tool used to build custom AI chatbots using Python and natural language understanding. NLU and conversation models are trained and run through the Rasa Open Source service. When a model handles a conversation, it adds events to a conversation tracker. It also uses API calls to conduct custom actions on the action server. The tracker store, which can be any type of database or in-memory store, stores conversation trackers. Events stored in the tracker store are also published to an event broker if one is set up. When there are numerous Rasa Open Source nodes, the lock store ensures that only one node can work on a single conversation tracker at a time. The intended audience is individual students/users who wish to take career counselling on TheStudentSuccessapp.com. Their primary goal is to take the assessment and realise their interests and possible future career paths. For companies that provide their employees with wellness and counselling packages, the primary goal is to upskill or change their career track. The current architecture is a form-based questionnaire approach; we propose a chatbot-based architecture. This chatbot will provide multiple-choice options to take a career counselling assessment. Based on the report generated by the existing nSmiles psychometric engine, the report is presented in the bot interface with options to get more details after each interest is detected. More details for each section shall include a list of colleges and courses to select from. This framework shall also return each response type along with a confidence level. Finally, the complete report provided by the psychometric engine will be presented in the bot interface.
6 System Analysis 6.1 Functional Requirements The bot should respond to any input it gets. If the bot recognises the interest from the input, it should react with the psychometric engine’s intermittent status report. If the bot needs more information to determine the interests, it should ask more questions, but the number of questions asked should be kept to a minimum. The bot should be able to communicate data to users as a text message or as an image/graph with text. It must also be able to query the API’s data and generate an intermittent status from the results. If the bot is successful in matching an interest, the confidence level must be returned. It must produce an error message if it is unable to process the request and match a suitable intent. If the connection fails, it must return an error.
6.2 Non-Functional Requirements The response time of the bot must be short since it is an interactive chatbot, and listed interests must be accurate to a high degree. When the amount of data grows, so does the need to secure it. The data provided by the clients in the form of answers is used solely for the purpose of testing them, and we must ensure that they remain anonymous and are not used for any other unauthorised purposes. Only users who have enrolled will be able to view the Website’s content. Users with their credentials will be able to access and respond; nevertheless, it must be ensured that no third party will interfere with the usage or assessment of the client’s responses. The information supplied must be kept secure and used solely for the purpose of evaluation. Only the client will have access to the curated findings, and no one else will be able to access the report without the client’s credentials. As a result, data transmission between the frontend and backend must be totally secure.
6.3 Advantages • Rasa is easy to integrate and customise. It is an open-source framework for conversational AI development. Open-source platforms are programs whose source code is available for anyone to examine, alter or improve. Because it is open source, developers will be able to add new features and functionality to meet your needs. The platform is easy to configure and versatile, and it may be tailored to your specific requirements. It is so simple to integrate and configure, and it saves your company money while simultaneously ensuring that you get precisely what you want. • Learning is interactive. Rasa’s AI chatbot is based on the interactive learning approach. Even if you do not have enough data to train the AI chatbot, you may quickly create one by interacting with the Rasa chatbot example during the creation phase. Because the chatbot will be in the demo stage, any errors it makes will be easy to fix. Developers of AI chatbots can feed it data and build on the bot using an interactive learning technique.
6.4 Disadvantages Many people assume that chatbots are difficult to utilise and that learning what you want from your consumers takes a long time. It might also irritate the consumer if they are slow or have difficulty screening responses. The purpose of installing chatbots is to increase consumer contact and speed up responses. Due to the limited availability of data and the time required for self-updating, this strategy may be time-consuming and costly. As a result, rather than serving numerous customers at once, chatbots may grow confused and deliver poor customer care. Chatbots are time-saving tools
that help you save time and effort by remaining available at all times and serving several customers at the same time. However, unlike people, each chatbot must be customised for each company, resulting in a higher initial setup cost. This is a risky investment, given the chance of last-minute changes, as replacing the programme will incur additional fees.
7 Implementation Different modules work together to provide the required functionality. As shown in Fig. 2, the end user could be a student or a working professional who wishes to take career counselling. The chat window is the chat box through which the user will communicate with the chatbot. The REST API helps the chat window of the client communicate with the backend present on the server. The backend is the component which analyses the user input and produces output in the form of text or an image. The backend further consists of the intent classifier, which extracts intents from the user message and classifies it into one of the intents; the NLU core, which contains the AI and ML model that understands the user message and generates the response; and the story manager, which keeps track of the conversation so far and decides which course the conversation should take based on the defined stories. The psychometric engine is the existing module which uses various formulas to calculate interests and generates reports based on user interests. The college or course recommender module will give appropriate recommendations based on the user's interests. The question bank is the data
Fig. 2 High-level design
store of all the questions from which questions are selected and are presented to the user by the bot.
7.1 Chatbot Frontend The frontend is the chat widget used to communicate with the bot. This user will interact with the chat window which will have a text field for the user to write their answer and a list of past messages above it. The bot’s message can take the shape of either text buttons or graphics. The rest API manager will be used to communicate between the chat window and the backend. The frontend is implemented completely using HTML, CSS and Javascript. HTML and CSS are used to design and style the chat window. The Javascript function is invoked as soon as the page is loaded which initialises the connection between the frontend and backend by providing a unique user id to the user to keep track of the user. It is also used to display the typing graphics while the bot is processing the input.
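As an illustration of this REST exchange — a minimal sketch assuming Rasa's stock REST channel and a locally running server, not the project's actual frontend code:

# Minimal sketch of the chat-window <-> backend round trip.
# Assumes the default Rasa REST channel on localhost:5005.
import requests

RASA_URL = "http://localhost:5005/webhooks/rest/webhook"

def send_message(user_id: str, text: str) -> list:
    """Post one user message and return the bot's replies."""
    response = requests.post(
        RASA_URL,
        json={"sender": user_id, "message": text},
        timeout=10,
    )
    response.raise_for_status()
    # Each reply is a dict with "text", "image" or "buttons" keys,
    # matching the message shapes described above.
    return response.json()

for reply in send_message("user-42", "Hi"):
    print(reply.get("text") or reply.get("image"))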
7.2 Chatbot Backend This is the module which communicates with the frontend and psychometric engine which does all the processing and generates the next message to be displayed to the user. It consists of three main classes, namely actions, email and Rasa. Each of these classes has several data members and functions which are described in detail in the upcoming sections. Actions server, email module and Rasa module are the main parts of the backend. Rasa makes use of the functions in this class to perform custom actions at different points of time in the conversation with the user. When the user clicks on the chat widget, the action server is responsible for welcoming the user. It is also in charge of checking and updating the assessment’s completion status in the database. It also manages the updating of interest scores in real time. The email module handles sending the assessment report to the user at the end of the assessment using a predefined template. Rasa’s objective is to extract structured data from user messages. This usually comprises the user’s intent as well as any entities contained in their message. These specifics are then used to determine the next course of action. Entities are logically organised chunks of data within a user message. To improve intent classification and entity extraction, regular expressions can be used. The chatbot backend makes use of bidirectional encoder representations from transformers (BERT) model to extract intents and entities. The AI model is trained by using a dataset consisting of various possible types of entities. The general flow of the conversation is defined in the form of stories. Checkpoints are used within the story which enables the conversation to loop from a particular point in the story. It allows a part of a story to be invoked from many different parts without rewriting, thereby reducing duplication. Further rules are defined which helps in executing a
particular conversation in case a predefined scenario occurs. A default response is displayed if the bot is not able to predict a response with sufficient confidence.
7.3 Actions Module • GreetUser() function. Purpose: To greet the user with a “Hello message” as soon as the chat widget is clicked. By default, Rasa does not allow the chatbot to initiate the conversation. However, to meet our requirements, we have defined this function which will override Rasa’s default option and initiate the conversation. • DBCheck() function. Purpose: To check if the user is registered with nSmiles and view assessment completion status. This function connects to the database and fetches the details of assessments completed by the user. SQLite is the database used for development. However, it can connect to any database and will be connected to MongoDB in the production environment. • UpdateStrengthInterest() and UpdateEmployabilityInterest() function. Purpose: To increment the interest value. This function is called each time the user provides an answer for the assessment questions. It updates the interest scores and also displays the interest scores once all questions related to an interest are completed.
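By way of illustration — a minimal rasa_sdk sketch in the style of the UpdateStrengthInterest() function, not the production action server; the action name, slot names, entity name and threshold are assumptions:

# Sketch of a custom action that updates an interest score.
from typing import Any, Dict, List, Text

from rasa_sdk import Action, Tracker
from rasa_sdk.events import SlotSet
from rasa_sdk.executor import CollectingDispatcher


class ActionUpdateStrengthInterest(Action):
    def name(self) -> Text:
        return "action_update_strength_interest"

    def run(self, dispatcher: CollectingDispatcher, tracker: Tracker,
            domain: Dict[Text, Any]) -> List[Dict[Text, Any]]:
        scores = dict(tracker.get_slot("strength_scores") or {})
        interest = tracker.get_slot("current_interest") or "general"
        # Take the user's latest answer value, defaulting to 0.
        answer = next(tracker.get_latest_entity_values("answer_value"), "0")
        scores[interest] = scores.get(interest, 0) + int(answer)
        if scores[interest] >= 10:  # assumed threshold for a status report
            dispatcher.utter_message(
                text=f"You show a strong interest in {interest}!")
        return [SlotSet("strength_scores", scores)]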
7.4 Email Module • SendStrengthMail() and SendEmployablityMail() function. Purpose: To send the curated report to the user at the end of the assessment. This module is called when an assessment is completed. A predefined email template has been designed. The interest scores are plugged into the template, and the email is sent to the user. The email id, port, smtp server, etc., have to be specified before invoking this function. This function is implemented using the “Email” Python library. The template is written as a string in HTML with variables embedded in between which will contain the interest scores. The action server sends the interest scores to this module using a function call once all the interests are calculated. Along with the interests, the email id and name are also passed. The function then sends an email using library functions to the email id specified. The sender’s email id, port and smtp server are already defined before and are used to send the email.
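For instance — a minimal sketch with Python's standard email and smtplib libraries; the template string, server address and credentials are placeholders, not the project's actual values:

# Sketch of SendStrengthMail(): plug interest scores into an HTML
# template and send the report. Server, port and addresses are assumed.
import smtplib
from email.mime.text import MIMEText

TEMPLATE = """<html><body>
<h2>Career Counselling Report for {name}</h2>
<p>Your interest scores: {scores}</p>
</body></html>"""

def send_strength_mail(name, scores, to_addr,
                       smtp_server="smtp.example.com", port=587,
                       from_addr="reports@example.com", password="..."):
    msg = MIMEText(TEMPLATE.format(name=name, scores=scores), "html")
    msg["Subject"] = "Your assessment report"
    msg["From"] = from_addr
    msg["To"] = to_addr
    with smtplib.SMTP(smtp_server, port) as server:
        server.starttls()                 # encrypt the connection
        server.login(from_addr, password)
        server.send_message(msg)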
7.5 Rasa Module This is the module which makes use of various artificial intelligence and machine learning techniques. NLP techniques, regular expressions and machine learning are
used to extract the name and email from the user response. It uses a whitespace tokenizer, Regex featurizer, count vector featurizer and DIET classifier, which is trained using the dataset to extract intents and entities. On receiving the input, alongside classification of the intent, the chatbot checks whether the input satisfies the regular-expression constraints. DIET is a multi-task transformer architecture that simultaneously performs intent categorisation and entity recognition. Once the intents and entities are extracted, the machine learning model predicts the bot's response. This model, although trained beforehand, keeps continuously learning from the user inputs, thereby increasing its accuracy.
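As a sketch of the extraction step — the email pattern is a common illustrative regex, and the name heuristic is an assumption, not necessarily the project's exact logic:

# Sketch: pulling the name and email id out of a free-text response.
import re

EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
NAME_PATTERN = re.compile(r"(?:my name is|i am)\s+([A-Z][a-z]+(?:\s[A-Z][a-z]+)*)",
                          re.IGNORECASE)

def extract_contact(user_response: str):
    """Return (name, email) found in the message, if any."""
    email = EMAIL_PATTERN.search(user_response)
    name = NAME_PATTERN.search(user_response)
    return (name.group(1) if name else None,
            email.group(0) if email else None)

print(extract_contact("I am Jane Doe, send it to jane.doe@example.com"))
# -> ('Jane Doe', 'jane.doe@example.com')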
8 Conclusion and Future Work The chatbot system is in place to satisfy the users' mental health needs. A chatbot's simulation or generation of a response is a knowledge-based task. When a user uses the chatbot's graphical user interface (GUI) to ask questions, the database is searched for the query. If the response is discovered in the database, it is displayed to the user; otherwise, the system tells the administrator that a response is missing from the database and provides the user with a predetermined response. When a user selects a category, the chatbot captures the user's email address. If the user's question is not answered by the alternatives, the chatbot system provides an additional dialogue box in which the user can write his or her question about the course. Users can ask the chatbot system any number of questions about the courses. The chatbot system responds to all user enquiries immediately. When the existing nSmiles psychometric engine finds an interest, the user will receive a brief intermittent status report, as well as the ability to see the recommended list of courses and colleges, before the chatbot moves on to other interests. Once the assessment is complete, a bar graph displays the real-time calculation of interest scores, with the interest type on the x-axis and the corresponding score on the y-axis. It gives an overall distribution of interests to the user for a better evaluation of their mental health. Finally, the user will receive a full report that includes a list of courses and universities that have been recommended to him or her based on all of his or her interests and preferences. Further, the chatbot can be voice-enabled. This would involve receiving the user responses by recording his/her audio and providing responses by playing audio using text-to-speech technology. This would enhance the user experience and let humans and computers communicate more effectively.
References
1. J.M.K. Deepika, V. Tilekya, T. Subetha, Jollity chatbot—a contextual AI assistant, in 2020 3rd International Conference on Smart Systems and Inventive Technology (ICSSIT) (2020), pp. 1196–1200
2. Rasa: Open source conversational AI. https://rasa.com/
3. S.J.S. Mohapatra, N. Shukla, S. Chachra, Nsmav-bot: intelligent dual language tutor system, in 2018 4th International Conference on Computing Communication Control and Automation (ICCUBEA) (2018), pp. 1–5
4. Microsoft Bot Framework. https://www.botframework.com/
5. E. Handoyo, M. Arfan, Y.A.A. Soetrisno, M. Somantri, A. Sofwan, E.W. Sinuraya, Ticketing chatbot service using serverless NLP technology, in 2018 5th International Conference on Information Technology, Computer, and Electrical Engineering (ICITACEE) (2018), pp. 325–330
6. Wit.ai. https://wit.ai/
7. M. de Gennaro, E.G. Krumhuber, G. Lucas, Effectiveness of an empathic chatbot in combating adverse effects of social exclusion on mood (2018)
8. Botkit: Building Blocks for Building Bots. https://botkit.ai/
9. C.H. Tsai, Y. You, X. Gui, Y. Kou, J.M. Carroll, Exploring and promoting diagnostic transparency and explainability in online symptom checkers
10. BotMan—The PHP messaging and chatbot library. https://botman.io/
11. E. Adamopoulou, L. Moussiades, An overview of chatbot technology, in IFIP International Conference on Artificial Intelligence Applications and Innovations (Springer, 2020), pp. 373–383
12. A. Bashar, Survey on evolving deep learning neural network architectures. J. Artif. Intell. 1(02), 73–82 (2019)
13. T. Vijayakumar, Comparative study of capsule neural network in various applications. J. Artif. Intell. 1(01), 19–27 (2019)
14. J.S. Manoharan, Study of variants of extreme learning machine (ELM) brands and its performance measure on classification algorithm. J. Soft Comput. Paradigm (JSCP) 3(02), 83–95 (2021)
15. A.M. Rahman, A.A. Mamun, A. Islam, Programming challenges of chatbot: current and future prospective, in 2017 IEEE Region 10 Humanitarian Technology Conference (R10-HTC) (2017), pp. 75–78. https://doi.org/10.1109/R10-HTC.2017.8288910
A High-Gain Improved Linearity Folded Cascode LNA for Wireless Applications S. Bhuvaneshwari and S. Kanthamani
Abstract This paper presents a Folded Cascode (FC) Low-Noise Amplifier (LNA) designed for 5.2 GHz. Several techniques, such as Complementary Current Reuse, diode-connected Forward Body Biasing, and linearity improvement, have so far been used separately. The proposed LNA combines all three techniques for low power consumption, gain enhancement, and good linearity. Under the Cadence Virtuoso environment, using 180 nm Complementary Metal–Oxide–Semiconductor (CMOS) technology, the designed LNA exhibits a high gain (S21) of 27.4 dB with a Noise Figure (NF) of 2.4 dB at a reduced supply voltage of 0.6 V. The other important parameters, including a reflection coefficient (S11) of −12.5 dB, power dissipation (Pdc) of 3.23 mW, and Third-order Input Intercept Point (IIP3) of −4.43 dBm, are also achieved. This demonstrates the applicability of the proposed LNA for wireless applications functioning at 5.2 GHz. Keywords Complementary current reuse · Folded cascode · Diode-connected FBB · PMOS IMD sinker
1 Introduction Nowadays, everyone is demanding faster devices and gadgets than ever before, and this is becoming possible due to CMOS and wireless technology, which contribute to the tremendous growth in high data rates, downloading multimedia videos, online gaming, and video conferencing. Along with this, wireless communication draws considerable attention to fully integrated CMOS receiver frontends due to scaling, a higher level of integrability, lower cost, etc. [1, 2]. Importantly, the sensitivity of the wireless receiver is limited by the noise factor, which is decided by the LNA. Therefore, the design of an LNA involves a compromise among parameters such as impedance matching, gain, noise figure, and linearity. Thus, to accommodate the compromise S. Bhuvaneshwari (B) · S. Kanthamani Department of ECE, Thiagarajar College of Engineering, Madurai 625015, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_47
among the parameters, CMOS is a promising technology for designing LNAs to achieve a low noise figure and high gain with low power consumption [3–5]. The proliferation of portable and movable devices operating in Wireless Local Area Networks (WLANs), and of sensors forming huge Wireless Sensor Networks (WSNs), is confronted with performance requirements including low power consumption, to preserve battery life at very low supply voltage. In view of this, the folded cascode topology [6, 7] is adapted with the complementary current reuse (CCR) technique [8, 9] for the low-voltage, low-power operation needed in wireless and sensor network applications. However, achieving linearity without degrading the other performance parameters is quite challenging. In order to obtain good linearity, various linearity enhancement techniques [10] are probed. The feed-forward technique [11] uses an auxiliary path for designing the LNA. In this technique, a signal with the same amplitude and opposite phase as the third-order intermodulation distortion (IM3) is generated and fed to the output of the main path to cancel the third-order harmonic, at the cost of power dissipation and gain reduction. An optimum out-of-band tuning technique [12] uses a low-impedance inductor and capacitor trap network to resonate with the nonlinear signals at the input, as a result of which they are cancelled. Moreover, this technique is more suitable for Metal–Oxide–Semiconductor Field-Effect Transistors (MOSFETs) not functioning in the strong inversion region. An optimum gate biasing technique [13] is useful to produce a current that is proportional to the third-order nonlinearity (gm3) of the FET transconductance and to have a DC feedback to set it to zero. Nevertheless, it is sensitive to bias variations. The IM2 injection technique [14] generates a low-frequency second-order product that is mixed with the fundamental input signals to produce the signal required for cancellation with the intrinsic IM3 signal in the main path without degradation of noise, gain, and power consumption. However, it may leak to the differential output of the main path and degrade IIP2. Derivative Superposition (DS) techniques using multiple gated transistors [15–18] use two transistors operating in the strong and weak inversion regions. In this configuration, the negative peak of the second-order nonlinear term of the main transistor, which contributes third-order intermodulation distortion, is cancelled by the positive peak of the auxiliary transistor. However, the noise figure is degraded due to the cancellation path, and it still suffers from second-order distortion combined with harmonic feedback. To eliminate the remaining nonlinear components, the double DS method [19], with an auxiliary transistor and inductor employed in the CG stage, contributes to cancelling these nonlinear components. The modified DS method [20–23] uses the two-transistor double DS method. However, the source inductors connected to the transistors may tune out the third-order intermodulation distortion directly from the input, which may result in a reduction of gain and noise figure. In the post-linearization technique [24, 25], the use of an NMOS diode connected with a parallel RC circuit contributes to partially cancelling the IMD3 at the input without a penalty in gain or noise figure. Thus, after reviewing the pioneering state of the art: various techniques such as complementary current reuse, diode-connected Forward Body Biasing (FBB), and linearity improvement have been used separately.
This paper presents an FC LNA with a combination of CCR, diode-connected FBB, and post-linearization techniques.
The CCR technique boosts the transconductance (gm), which increases the gain. The gain is increased further by the diode-connected FBB technique, which also reduces the power dissipation of the proposed LNA. Additionally, post linearization using the PMOS IMD sinker technique sinks the intermodulation distortion at the input to achieve better linearity. The simulated results show that the proposed LNA design exhibits good performance, with a high gain of 27.4 dB and good linearity of −4.43 dBm, as compared to the existing state of the art.
2 Proposed LNA Figure 1 shows the schematic of the proposed LNA. It is a two-stage LNA with a folded cascode topology as the first stage and a buffer as the second stage. Normally, the cascode topology is widely used for its advantages such as high gain, better isolation, and good stability with low power consumption. But due to the stacking of transistors, it requires large voltage headroom. To overcome this problem, the folded cascode topology is chosen, mainly for low power design. Low voltage operation is possible with both NMOS (M1) and PMOS (M2) connected to a common supply voltage using the drain inductor (Ld). Input matching is done using Cin, Lg, and Cex, whereas Lm and Cout are used for output matching. Lm is the interstage matching inductor between M2 and M3. M3 is used to provide inverting gain for M2 through Cm, which increases the transconductance of M2 and also lowers the power consumption by reusing the current from M3. The input signal is received by M1, and the nonlinearity component present at the output of M1 is reduced by the PMOS IMD sinker (Mp) to improve the linearity. Diode-connected MOS devices M4, M5, and M6 are linked at the bulk of M1, M2, and M3 to reduce the power consumption and to improve gain. The buffer is designed at the output stage to increase the gain further.
Fig. 1 Schematic of proposed LNA
2.1 Impedance Calculation of Proposed LNA In this complementary current reuse folded cascode LNA, input matching is done by parallel inductive (PI) [26] matching using gate inductor (L g ). The input impedance of folded cascode LNA using PI matching is achieved as follows:
Z_{in} = \frac{(\omega L_{p1})^2}{R_p} + j\left( \omega L_{p1} - \frac{1}{\omega C_s} \right)   (1)
where

R_p = R_s \left[ 1 + \left( \frac{1}{R_s\, \omega C_g} \right)^2 \right]   (2)

and \omega = 2\pi f_0 (f_0 is the operating frequency), R_s is the parasitic input source resistance, C_g = C_{gs} + C_{gd} + C_{ex} is the associated gate capacitance, and L_{p1} is the inductance seen by C_{in} towards the LNA.
When the input is matched,

\frac{(\omega L_{p1})^2}{R_p} = 50 \quad \text{and} \quad \omega L_{p1} = \frac{1}{\omega C_s}   (3)
which gives

C_{in} = C_g\, \frac{R_s}{50}   (4)

L_p = \frac{1}{\omega^2 C_g \left( 1 + \frac{R_s}{50} \right)}   (5)
where C in and L p are the series capacitance and parallel inductance, respectively, connected to M 1. Impedance seen at the output side of proposed LNA is expressed as follows:
Z_L = sL_m \parallel \frac{1}{sC_{out}}   (6)
where L m is the middle inductor between M 2 and M 3 . And C out is the output capacitance.
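As a quick numerical illustration of Eqs. (4) and (5) — with assumed example values (f_0 = 5.2 GHz, R_s = 50 Ω, C_g ≈ 0.5 pF), not figures reported by the authors:

\omega = 2\pi f_0 \approx 3.27 \times 10^{10}\ \text{rad/s}, \qquad
C_{in} = C_g\, \frac{R_s}{50} = 0.5\ \text{pF} \times \frac{50}{50} = 0.5\ \text{pF},

L_p = \frac{1}{\omega^2 C_g \left( 1 + \frac{R_s}{50} \right)}
    = \frac{1}{(3.27 \times 10^{10})^2 \times 0.5 \times 10^{-12} \times 2}
    \approx 0.94\ \text{nH}.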
2.2 Transconductance Improvement Using Diode-Connected Forward Body Biasing Technique In general, the body of a MOS device is zero biased or reverse biased. At a low supply voltage, the MOS device is in the weak inversion region, as the voltage headroom relative to the threshold of the MOS device is reduced. In order to improve the performance, the FBB technique is used to push the MOS device into the strong inversion region. In the conventional FBB technique [27, 28], shown in Fig. 2a, a DC current flows across the junction between source and body that depends on the body voltage, causing extra power consumption, and this leads to a latchup problem. To avoid latchup failure, the diode-connected FBB technique [29] is used, as shown in Fig. 2b, where the terminals of the MOS devices are connected in such a way that two diodes are connected back to back. Forward biasing the MOS will reverse bias the back-to-back connected diode, which will reduce the leakage current, and thus the power dissipation is reduced. Moreover, forward biasing the MOS diode will reduce the threshold voltage of the device and hence lower the power consumption. This decrease in threshold will significantly reduce the effective voltage of the device, which also improves the transconductance, as shown in Eq. (7):

g_m = \frac{2 I_D}{V_{eff}}   (7)

where g_m is the transconductance of the MOS device, I_D is the drain current of the MOS device, and V_{eff} is the effective voltage for the MOS device to switch on.
Fig. 2 a Conventional FBB b diode-connected FBB
Fig. 3 Small signal equivalent of the proposed LNA
Figure 3 shows the small signal equivalent circuit of the proposed LNA. The effective transconductance of M1 with respect to diode-connected FBB is given by

G_{m1} = \frac{g_{m1} + g_{mb1}\, \dfrac{sC_{db1}}{sC_{db1} + s(C_{sb1} + C_{b4})}}{s\, \dfrac{C_{db1}(C_{sb1} + C_{b4})}{C_{db1} + (C_{sb1} + C_{b4})}\, Z_1 + 1}   (8)

where

Z_1 = sL_d \parallel \left( \frac{1}{sC_{gp}} + \frac{1}{sC_p} \right)   (9)

and Z_1 is the impedance seen at the drain of M1, g_{mb1} is the body transconductance of M1, C_{db1} is the parasitic drain-to-bulk capacitance of M1, C_{sb1} is the parasitic source-to-bulk capacitance of M1, and C_{b4} is the capacitance due to the diode-connected MOS M4.
And the effective transconductance of M2 is given by

G_{m2} = \frac{g_{m2} + g_{mb2}\, \dfrac{s(C_{db2} + C_{b5})}{s(C_{db2} + C_{b5}) + sC_{sb2}}}{s\, \dfrac{(C_{db2} + C_{b5})\, C_{sb2}}{(C_{db2} + C_{b5}) + C_{sb2}}\, Y_2 + 1}   (10)
where

Y_2 = sC_{out} \parallel \frac{1}{sL_m}   (11)

and Y_2 is the admittance seen at node 4, g_{mb2} is the body transconductance of M2, C_{db2} is the parasitic drain-to-bulk capacitance of M2, C_{sb2} is the parasitic source-to-bulk capacitance of M2, and C_{b5} is the diode-connected MOS capacitance of M5.
It can be perceived from Eqs. (8) and (10) that the effective transconductance of M 1 and M 2 is improved by diode-connected MOS capacitance C b4 , C b5 . Thus diode-connected MOS device size is chosen appropriately to have increased gain and thus receiver sensitivity [30] is increased.
2.3 Gain Improvement Using Complementary Current Reuse Technique In conventional current reuse technique, MOS transistors are operated in saturation region with large voltage headroom to achieve high gain at the cost of power dissipation. In order to achieve high gain and low power with low supply voltage, CCR technique is used. An inverted gain is introduced by complementary transistor, M 3 to the transistor M 2 and is given by Av3 = −G m3 Z out .
(12)
where Gm3 is the effective transconductance of M 3 and given by, G m3 =
gm3 gmb3 sCdb3 sCdb3 +(s(Csb3 +Cb6 ))
+
sCdb3 (s(Csb3 +Cb6 )) sCdb3 +(s(Csb3 +Cb6 ))
Z3 + 1
(13)
where Z 3 = s L m || gmb3 C db3 C sb3 C b6
1 sCm
(14)
is the body transconductance of M 3 is parasitic drain to bulk capacitance of M 3 is parasitic source to bulk capacitance of M 3 is the diode-connected MOS capacitance of M 6
And Z_{out} is the impedance seen at the output side of M3, given by

Z_{out} = s(L_m + L_s) \parallel \frac{1}{sC_3}   (15)
where

C_3 = (C_{db3} + [C_{b8} + C_{sb8}]) \parallel (C_m + C_{gs2})   (16)
The overall voltage gain (A_{Vovall}) introduced by M3 to M2 is given by

A_{Vovall} = G_{m2} Z_L (1 + A_{v3})   (17)
where G_{m2} is the effective transconductance of M2 and Z_L is the load impedance seen at the output of M2. From Eq. (17), it is observed that the overall gain of the proposed LNA is approximately doubled by the inverted gain introduced by M3 together with the diode-connected FBB at M2. The addition of a buffer in the second stage enhances the gain further, as required for high-gain WLAN applications.
2.4 Linearity Improvement Using PMOS Intermodulation Distortion Sinker Technique The main source of nonlinearity is the transconductance (g_m), which converts the linear input voltage (V_{gs}) to a nonlinear output drain current (I_d). The drain current flowing through M1 is expressed as

I_{dM1} = I_1 = g_{m1M1} v_{gsM1} + g_{m2M1} v_{gsM1}^2 + g_{m3M1} v_{gsM1}^3   (18)

where g_{m1,2,\dots,nM1} is the nth-order transconductance nonlinearity of M1 and v_{gsM1} is the gate-to-source voltage of M1.
(19)
2 3 vsg M2 = f 1 vgs M1 + f 2 vgs M1 + f 3 vgs M1
(20)
where
gm1, 2, ..n Mp vgsM2 f 1, 2, …n
is the nth order transconductance nonlinearity of M p is the gate to source voltage of M 2 is the frequency dependent coefficients
A High-Gain Improved Linearity Folded …
621
The resulting current I_3 is expressed as

I_3 = I_1 - I_2   (21)

\approx (g_{m1M1} - f_1 g_{m1Mp})\, v_{gsM1} + (g_{m2M1} - f_1^2 g_{m2Mp} - f_2 g_{m1Mp})\, v_{gsM1}^2 + (g_{m3M1} - f_1^3 g_{m3Mp} - f_3 g_{m1Mp} - 2 g_{m2Mp} f_1 f_2)\, v_{gsM1}^3   (22)

\approx (g_{m1M1} - f_1 g_{m1Mp})\, v_{gsM1}   (23)
From Eq. (23), it is evident that by adjusting the frequency-dependent coefficients f_{1,2,\dots,n}, the g_{m3M1} of M1 is cancelled by the g_{m3Mp} of the PMOS IMD sinker Mp. This means that the IMD3 generated by the third-order nonlinearity in M1 is sunk by Mp. It can also be seen that g_{m1M1} is partially reduced by the g_{m1Mp} of the IMD sinker. This causes a lowering of the gain by 1 to 2 dB and a degradation of the NF by 0.1 to 0.2 dB. But the degradation of gain and NF is not severe because of the bias current and transconductance of the PMOS, which has low mobility compared to NMOS.
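For clarity, setting the third-order coefficient of Eq. (22) to zero makes the cancellation condition explicit — this simply restates the source's own expansion, with the same symbols:

g_{m3M1} - f_1^3\, g_{m3Mp} - f_3\, g_{m1Mp} - 2 f_1 f_2\, g_{m2Mp} = 0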
3 Simulation Results The proposed LNA is designed and simulated in the Cadence Virtuoso environment in 180 nm CMOS technology at 5.2 GHz with different process corners: Typical–Typical (TT, 27 °C), Fast–Fast (FF, 0 °C), and Slow–Slow (SS, 80 °C). Figures 4, 5, 6, 7, 8, 9, and 10 show that the proposed LNA achieves an S21 of 27.4 dB, an S11 of −12.5 dB, and an NF of 2.4 dB. The power dissipation is 3.23 mW and the IIP3 is −4.43 dBm with the reduced supply voltage of 0.6 V. The component values are listed in Table 1.
3.1 Effect of Current Reuse Capacitor (Cm) on Gain The transconductance improves with the middle capacitor Cm, which reuses the current between the complementary devices M2 and M3. Thus, Cm is varied from 500 fF to 2 pF, and the gain is observed to increase proportionally from 25.7 to 29.2 dB, as shown in Fig. 4. Cm is tuned to 1.75 pF to achieve a gain of 27.4 dB with optimum input matching at the frequency of 5.2 GHz.
Fig. 4 Effect of Cm on S21 (dB): S21 versus frequency (4.0–6.0 GHz) for Cm = 500 fF, 1.25 pF, 1.375 pF, 1.75 pF, and 2 pF
Fig. 5 Effect of diode-connected FBB on S21 (dB): S21 versus frequency (4.0–6.0 GHz) with conventional FBB and with diode-connected FBB
Fig. 6 S21 (dB) at different process corners (TT, FF, SS) versus frequency (3.0–7.0 GHz)
Fig. 7 NF (dB) at different process corners (TT, FF, SS) versus frequency (3.0–7.0 GHz)
Fig. 8 S11 (dB) at different corners (TT, FF, SS) versus frequency (3.0–7.0 GHz)
Fig. 9 S22 (dB) at different corners (TT, FF, SS) versus frequency (3.0–7.0 GHz)
Fig. 10 IIP3 (dBm) of the LNA
Table 1 Component values

Component | Value
M1 | 150/0.18 µm
M2 | 60/0.18 µm
M3 | 240/0.18 µm
Mp | 12/0.18 µm
Lg | 1.5 nH
Ls | 1 nH
Ld | 1.75 nH
Lm | 7 nH
Cin | 1.625 pF
Cex | 0.07 pF
Cm | 1.75 pF
Cp | 0.5 pF
Cout | 0.1 pF
3.2 Effect of Diode-Connected Forward Body Biasing on Gain

From Fig. 5, it is observed that the gain increases with diode-connected FBB. An S21 of 27.4 dB is achieved by the proposed LNA, whereas with conventional FBB the S21 is 25.8 dB. The gain improvement of 1.6 dB is attained through the transconductance
Table 2 Performance comparison at normal, best, and worst cases

Parameter | TT (27 °C) | FF (0 °C) | SS (80 °C)
S11 (dB) | −12.5 | −13.7 | −11.6
S22 (dB) | −10.3 | −11.1 | −10
S21 (dB) | 27.4 | 28.2 | 26.8
NF (dB) | 2.4 | 1.5 | 2.5
Pdc (mW) | 3.23 | 2.5 | 4.8
improvement discussed in Eq. (17), provided by the diode-connected FBB technique, which also reduces the induced leakage current under the same bias condition.
The pre-layout simulation of the proposed LNA for S21 at the normal (TT), best (FF), and worst (SS) corners is shown in Fig. 6. It is observed that the gain remains above 25 dB at the FF corner between 4.9 and 5.3 GHz, and at the TT and SS corners between 5 and 5.2 GHz; moreover, S21 is nearly the same at the TT and SS corners. The NF of the proposed LNA is 1.5 dB at the FF corner between 4.25 and 6.25 GHz, as shown in Fig. 7. An NF of 2.4 ± 0.1 dB is attained at the TT and SS corners between 4.75 and 5.75 GHz, and the NF is almost the same at the TT and SS corners at the frequency of interest. The input return loss (S11) and output return loss (S22) must be less than −10 dB. Figure 8 depicts that S11 is less than −10 dB between 5 and 5.2 GHz at the FF and TT corners, and across the desired frequency at the SS corner. Similarly, S22 is less than −10 dB at the FF corner and equal to −10 dB at the TT and SS corners at 5.2 GHz, as shown in Fig. 9. The performance parameters such as gain, NF, input and output return loss, and power dissipation at the various process corners are analyzed and tabulated in Table 2. The performance is best at the FF corner and remains acceptable at the TT and SS corners, which means that the proposed LNA works well even in the worst case.
Nonlinearity occurs due to the mixing of adjacent harmonics, which produces intermodulation distortion with the desired operating frequency. The linearity of the device depends on the third-order IMD, which is represented by the IIP3: the higher the IIP3, the better the linearity and the lower the IMD. It also indicates how large a signal the amplifier can process before IMD occurs. The IIP3 of the proposed LNA is simulated with a third-order frequency of 5.1 GHz alongside the first-order frequency of 5.2 GHz. As illustrated in Fig. 10, −17.3 dBm is attained for the design without the PMOS IMD sinker technique and −4.43 dBm for the design with it; thus, an improvement of nearly 13 dB is attained using the proposed PMOS IMD sinker technique. The design is simulated with an input power of −30 dBm. The power dissipation (Pdc) is 3.23 mW with the reduced supply voltage of 0.6 V. The figure of merit (FoM) of the proposed LNA is calculated by Eq. (24).
$$\text{FoM} = \frac{\text{Gain}_{abs} \times \text{IIP}_3\,(\text{mW})}{\text{NF}_{abs} \times P_{dc}\,(\text{mW})} \tag{24}$$
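As a quick check of Eq. (24), the following Python sketch computes the FoM, interpreting the absolute gain and NF as power ratios ($10^{x/10}$) and converting the IIP3 from dBm to mW; with the TT-corner values of the proposed LNA it reproduces the FoM of 35.3 reported in Table 3:

```python
def fom(gain_db, iip3_dbm, nf_db, pdc_mw):
    """Eq. (24): FoM = Gain(abs) * IIP3(mW) / (NF(abs) * Pdc(mW))."""
    gain_abs = 10 ** (gain_db / 10)   # gain as a power ratio
    iip3_mw = 10 ** (iip3_dbm / 10)   # dBm -> mW
    nf_abs = 10 ** (nf_db / 10)       # NF as a power ratio
    return gain_abs * iip3_mw / (nf_abs * pdc_mw)

# TT-corner values of the proposed LNA (Table 2)
print(round(fom(27.4, -4.43, 2.4, 3.23), 1))  # 35.3, as reported in Table 3
```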
Table 3 Performance comparison with state-of-the-art LNAs

References | [7]a | [8]c | [9]a | [11]a | [19]b | [20]a | This work
Freq. (GHz) | 5.5 | 2.4 | 5 | 5 | 5.25 | 2.44 | 5.2
Tech. (µm) | 0.18 | 0.18 | 0.18 | 0.18 | 0.18 | 0.18 | 0.18
Gain (dB) | 16.5 | 18.6 | 15 | 12.25 | 10.5 | 14.6 | 27.4
NF (dB) | 1.53 | 1.52 | 3.2 | 3.5 | 2.4 | 2.9 | 2.4
Supply (V) | 0.5 | 0.5 | 0.6 | 0.6 | 1.5 | 0.6 | 0.6
S11 (dB) | −15.2 | −20 | −8 | −11 | −23.6 | −15 | −12.5
S22 (dB) | −15 | −22 | −12 | −9 | −10.8 | – | −10.3
Pdc (mW) | 0.89 | 2.1 | 1.3 | 1.28 | 9 | 3.8 | 3.23
IIP3 (dBm) | −17.2 | −4.9 | −15.9 | −1 | 15.3 | 4.19 | −4.43
FoM | 0.672 | 7.86 | 0.32 | 4.65 | 22.68 | 10.2 | 35.3

a Folded cascode; b Cascode; c Body bias
4 Performance Comparison with the Existing State of the Art

The performance of the proposed LNA is compared in Table 3 with previously reported designs for WLAN applications. The proposed LNA achieves 27.4 dB gain, 2.4 dB NF, and an IIP3 of −4.43 dBm with a reduced supply voltage of 0.6 V. Compared with the literature [7–9, 11, 19, 20], the proposed LNA has an FoM of 35.3, which is higher than all the other reported designs. The improvements attained by the proposed diode-connected FBB-based folded cascode architecture with complementary current reuse and the PMOS IMD sinker are clearly evident, particularly as a high-gain, good-linearity, and low-power solution.
5 Conclusion

A folded cascode LNA with complementary current reuse (CCR), diode-connected forward body biasing, and the PMOS IMD sinker technique is presented and studied in this paper. The CCR technique improves the transconductance (gm) of the cascode transistor, which increases the gain at low power and a low supply voltage. The diode-connected FBB technique helps to increase the gain further and minimizes the leakage current of the MOS diode, resulting in low power dissipation. Moreover, the PMOS IMD sinker technique sinks the nonlinearity component at the input, which improves linearity. The proposed LNA design is therefore a promising candidate for wireless applications where high gain and good linearity are essential together with low power.
References

1. B. Razavi, CMOS technology characterization for analog and RF design. IEEE J. Solid-State Circ. 34, 268–276 (1999)
2. T.H. Lee, 5-GHz CMOS wireless LANs. IEEE Trans. Microwave Theory Tech. 50, 268–280 (2002)
3. T. Nguyen, C. Kim, et al., CMOS low-noise amplifier design optimization techniques. IEEE Trans. Microwave Theory Tech. 52, 1433–1442 (2004)
4. X.J. Li, Y.P. Zhang, CMOS low noise amplifier design for microwave and mmWave applications. Progress Electromagnet. Res. 161, 57–85 (2018)
5. M. Parvizi, K. Allidina, M.N. El-Gamal, An ultra-low-power wideband inductorless CMOS LNA with tunable active shunt-feedback. IEEE Trans. Microwave Theory Tech. 64, 1843–1853 (2016)
6. H. Hsieh, J. Wang, L. Lu, Gain-enhancement techniques for CMOS folded cascode LNAs at low-voltage operations. IEEE Trans. Microwave Theory Tech. 56, 1807–1816 (2008)
7. W.C. Wang, Capacitor cross-coupled fully-differential CMOS folded cascode LNAs with ultra low power consumption. Wirel. Pers. Commun. 78, 45–55 (2014)
8. R. Dai, Y. Zheng et al., A 0.5-V novel complementary current-reused CMOS LNA for 2.4 GHz medical application. J. Microelectron. 55, 64–69 (2016)
9. E. Kargaran et al., A new gm boosting current reuse CMOS folded cascode LNA. IEICE Electron. Express 10 (2013)
10. H. Zhang, E. Sánchez-Sinencio, Linearization techniques for CMOS low noise amplifiers: a tutorial. IEEE Trans. Circ. Syst. I Regul. Pap. 58(1), 22–36 (2011)
11. E. Kargaran et al., Highly linear folded cascode LNA. IEICE Electron. Express 10 (2013)
12. V. Aparin, L.E. Larson, Linearization of monolithic LNAs using low-frequency low-impedance input termination, in Proceedings of the European Solid-State Circuits Conference (2003), pp. 137–140
13. V. Aparin, G. Brown, L.E. Larson, Linearization of CMOS LNAs via optimum gate biasing, in Proceedings of the IEEE International Symposium on Circuits and Systems (2004), pp. 748–751
14. S. Lou, H.C. Luong, A linearization technique for RF receiver front-end using second-order intermodulation injection. IEEE J. Solid-State Circ. 43, 2404–2412 (2008)
15. T.H. Jin, T.W. Kim, A 5.5-mW +9.4-dBm IIP3 1.8-dB NF CMOS LNA employing multiple gated transistors with capacitance desensitization. IEEE Trans. Microwave Theory Tech. 58, 2529–2537 (2010)
16. Y.M. Kim, H. Han, T.W. Kim, A 0.6-V +4 dBm IIP3 LC folded cascode CMOS LNA with gm linearization. IEEE Trans. Circ. Syst. Express Briefs 60, 122–126 (2013)
17. Z.S.M. Salim, M. Muhamad, H. Hussin, N. Ahmad, CMOS LNA linearization employing multiple gated transistors, in IEEE International Conference on Telecommunication Systems, Services, and Applications (TSSA) (2019)
18. L. Ma, Z.-G. Wang, J. Xu, N.M. Amin, A high linearity wideband common-gate LNA with differential active inductor. IEEE Trans. Circ. Syst. II Expr. Briefs 64, 402–406 (2017)
19. C.W. Park, Y. Ahn et al., Linearity improvement cascode low noise amplifier using double DS method with a tuned inductor. Int. J. Electron. 97, 847–855 (2014)
20. S. Kumaravel et al., A high linearity and high gain folded cascode LNA for narrowband receiver applications. Microelectron. J. 54, 101–108 (2016)
21. A.P. Tarighat, M. Yargholi, A CMOS low noise amplifier with employing noise cancellation and modified derivative superposition technique. Microelectronics 54, 116–125 (2016)
22. R. Raja, B. Venkataramani, K. Hari Kishore, A 1-V 2.4 GHz low-power CMOS LNA using gain-boosting and derivative superposition techniques for WSN. Wirel. Pers. Commun. 96, 383–402 (2017)
23. M. Rafati, S.R. Qasemi, P. Amiri, A 0.65 V, linearized cascade UWB LNA by application of modified derivative superposition technique in 130 nm CMOS technology. Analog Integr. Circ. Sig. Process. 99, 693–706 (2019)
24. T. Kim, B. Kim, Post-linearization of cascode CMOS low noise amplifier using folded PMOS IMD sinker. IEEE Microwave Wirel. Compon. Lett. 16, 182–184 (2006)
25. C.P. Chang, W.C. Chien et al., Linearity improvement of cascode CMOS LNA using a diode connected NMOS transistor with a parallel RC circuit. Prog. Electromagnet. Res. C 17, 29–38 (2010)
26. S. Asgaran, M.J. Deen, C.-H. Chen, Design of the input matching network of RF CMOS LNAs for low-power operation. IEEE Trans. Circ. Syst. I Regul. Pap. 54, 544–554 (2007)
27. C.S. Chang, J.C. Guo, Ultra-low voltage and low power UWB CMOS LNA design using forward body biases, in IEEE Radio Frequency Integrated Circuits Symposium (RFIC) (2013), pp. 173–176
28. V. Singh et al., A 0.7 V, ultra-wideband common gate LNA with feedback body bias topology for wireless applications. J. Low Power Electron. Appl. 42 (2018)
29. T.P. Wang, Minimized device junction leakage current at forward-bias body and applications for low-voltage quadruple-stacked common-gate amplifier. IEEE Trans. Electron. Dev. 61, 1231–1236 (2014)
30. M. Bansal, H. Singh, G. Sharma, A taxonomical review of multiplexer designs for electronic circuits and devices. J. Electron. 3(02), 77–88 (2021)
Design and Analysis of Low Power FinFET-Based Hybrid Full Adders at 16 nm Technology Node Shikha Singh and Yagnesh B. Shukla
Abstract In current technology trends, where portable electronic devices are part of day-to-day life, the development of electronic circuits that consume low power with optimum performance is critically required. The adder is a circuit used in most electronic systems for various arithmetic operations as well as for address calculation, and a major part of the total power consumption is contributed by these adder cells. Optimizing these circuits can therefore help in power reduction. In this paper, four different full adder circuits are designed with a novel full-swing XOR-XNOR cell. Simulations are done using HSPICE and the PTM 16 nm library, and the average power consumption is calculated. Results show that, in comparison to the MOSFET-based HFA, the FinFET-based HFA has a much lower power consumption.

Keywords Hybrid full adder · FinFET · Power consumption · Power delay product · Full swing output · Critical path · XOR-XNOR cell
1 Introduction

A major part of electronic systems consists of digital circuitry such as microprocessors [1]. The efficiency of any digital circuit or digital application depends upon the behavior of fundamental circuits such as adders and multipliers, which are part of it and, in turn, part of the electronic system. Adders play a pivotal role, as all arithmetic operations involve them [2]. In the present era, where portable battery-operated electronic systems are widely used, there is an utmost requirement to optimize the power and speed of these arithmetic circuits [1]. Reducing power consumption without much compromise on delay can help these systems work hassle-free for longer hours, as any battery-operated device has limited power availability [3].
S. Singh (B) · Y. B. Shukla Gujarat Technological University, Ahmedabad, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_48
With the scaling of technology, conventional MOSFET devices suffer from drawbacks such as drain-induced barrier lowering (DIBL), velocity saturation, and the hot electron effect [4–6]. It is also observed that MOSFET devices consume more power compared to the latest available technologies [7, 8]. In order to overcome these drawbacks, FinFET technology is introduced; it offers low power consumption with better speed. Therefore, full adders designed using this technology also provide reduced power consumption and delay.
In this paper, hybrid full adder circuits are designed using a novel full-swing XOR-XNOR cell [3, 5]. The proposed XOR-XNOR circuit has reduced power consumption, as it does not have a NOT gate on the critical path [9]; however, because of the absence of the NOT gate, there is a slight reduction in output driving capability. With the proposed cell, a hybrid full adder with 19 transistors (HFA-19T), a hybrid full adder with 22 transistors (HFA-22T), a hybrid full adder with 26 transistors and a buffer (HFA-B-26T), and a hybrid full adder with 26 transistors and a new buffer (HFA-NB-26T) [5, 9] are designed and simulated using HSPICE with the PTM-MG 16 nm library. Simulations at the 16 nm technology node are done in order to inspect the behavior of the full adders at this particular node.
The organization of the paper is as follows. The second section provides a sketch of the basics of the full adder, the novel full-swing XOR-XNOR cell, and the hybrid full adders, viz. HFA-19T, HFA-22T, HFA-B-26T, and HFA-NB-26T [5, 9]. The third section describes the results and discussion at the 16 nm technology node. Finally, conclusions are given in the fourth section.
2 The Technique

The basic full adder circuit has input signals A, B, and Cin, and outputs sum (S) and carry (Cout) (Fig. 1). The truth table for this circuit is given in Table 1. Considering the truth table, the expressions for the outputs can be calculated as [11]:

Carry (Cout) = (A AND B) OR (B AND Cin) OR (Cin AND A)   (1)

Fig. 1 Basic full adder circuit [5, 9]
Table 1 Truth table of the full adder [10]

A | B | Input carry (Cin) | Output carry (Cout) | Sum (S)
L | L | L | L | L
L | L | H | L | H
L | H | L | L | H
L | H | H | H | L
H | L | L | L | H
H | L | H | H | L
H | H | L | H | L
H | H | H | H | H

where L indicates the low/0/off state and H indicates the high/1/on state
Sum (S) = A XOR B XOR Cin   (2)
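A minimal behavioural sketch in Python (separate from the HSPICE flow used in this work) that implements Eqs. (1) and (2) and reproduces Table 1 with L = 0 and H = 1:

```python
from itertools import product

def full_adder(a, b, cin):
    """One-bit full adder following Eqs. (1) and (2)."""
    cout = (a & b) | (b & cin) | (cin & a)  # Eq. (1): majority of the inputs
    s = a ^ b ^ cin                         # Eq. (2): three-input XOR
    return cout, s

# Reproduce Table 1 with L = 0 and H = 1
for a, b, cin in product((0, 1), repeat=3):
    cout, s = full_adder(a, b, cin)
    print(f"A={a} B={b} Cin={cin} -> Cout={cout} S={s}")
```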
The design of a hybrid full adder circuit includes an XOR-XNOR cell and a multiplexer. Considering the power consumption of the HFA, a major part is contributed by the XOR-XNOR cell [12, 13]; therefore, certain techniques have to be employed to reduce the power consumption. The hybrid full adders designed in the current work include the novel full-swing XOR-XNOR cell [5], which consumes less power and has less delay, as shown in Fig. 2. With this proposed cell, the four different HFAs are designed and simulated. HFA-19T consists of 19 transistors and is shown in Fig. 3. The schematic design of HFA-22T, which consists of 22 transistors, is shown in Fig. 4. HFA-B-26T consists of 26 transistors with a buffer and is shown in Fig. 5. The fourth hybrid full adder is HFA-NB-26T, with 26 transistors and a new buffer; its schematic is shown in Fig. 6.
3 Results

Using FinFET 16 nm technology, all four hybrid full adder circuits are simulated. The common specifications for the four HFAs are listed in Table 2. Transient analysis is done with a stop time tstop = 100 ns. The average power consumption, delay, and power delay product are calculated using these specifications.
Fig. 2 Full swing XOR-XNOR circuit [9]
Fig. 3 HFA-19T circuit [9]
3.1 HFA-19T

The simulation results for HFA-19T [9] are shown in Fig. 7. The various parameters calculated for HFA-19T are listed in Table 3.
Fig. 4 HFA-22T circuit [9]

Fig. 5 HFA-B-26T circuit [9]
Fig. 6 HFA-NB-26T circuit [9]
Table 2 Common specifications of the hybrid full adder cells

Technology used | FinFET 16 nm
Supply voltage | 0.6 V
Vpulse | 0.55 V
pfin height and width | 17 nm
nfin height and width | 10 nm
3.2 HFA-22T

For the different input combinations of the truth table, the output results for sum (S) and carry (Cout) are obtained accordingly for the hybrid full adder circuit with 22 transistors [5, 9]. The transient response is analyzed and shown in Fig. 8. The various parameters calculated for HFA-22T are listed in Table 4.
Fig. 7 Input and output waveforms of the designed HFA-19T with a supply voltage of 0.6 V
Table 3 Results obtained for HFA-19T

Average power consumption | 1.70 e−08 W
Average propagation delay | 2.99 e−08 s
PDP | 5.08 e−16 J
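As a sanity check, the PDP reported in Table 3 is simply the product of the average power consumption and the average propagation delay; a one-line Python sketch:

```python
# PDP = average power consumption x average propagation delay (HFA-19T, Table 3)
avg_power_w = 1.70e-08   # W
avg_delay_s = 2.99e-08   # s
print(f"PDP = {avg_power_w * avg_delay_s:.2e} J")  # ~5.08e-16 J
```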
3.3 HFA-26T with Buffer

The hybrid full adder with 26 transistors and a buffer [5, 9] is simulated with a supply voltage of 0.6 V and a pulse voltage of 0.55 V at the FinFET 16 nm technology node, and different parameters are calculated for the circuit. The transient response is analyzed and obtained with a stop time of 100 ns (Fig. 9). The various parameters calculated for HFA-B-26T are listed in Table 5.
3.4 HFA-26T with New Buffer

The hybrid full adder cell with a new buffer and 26 transistors [5, 9] is designed and simulated (Fig. 10). The various parameters calculated for HFA-NB-26T are listed in Table 6.
Fig. 8 Simulation results of the designed HFA-22T with a supply voltage of 0.6 V
Table 4 Results obtained for HFA-22T

Average power consumption | 1.67 e−08 W
Average propagation delay | 1.99 e−08 s
PDP | 3.32 e−16 J
A comparative analysis of the average power consumption of the simulated hybrid full adder circuits is shown in Table 7. A comparative analysis of the average power consumption of the present results against previous results is also done (Table 8). It is observed that hybrid full adder cells designed using FinFET technology show a tremendous reduction in power consumption compared to hybrid full adder cells designed using conventional MOSFET technology [14, 15]. However, a compromise on delay reduction is seen at the 16 nm FinFET node with the specifications used for the proposed work.
4 Conclusions

In order to reduce average power consumption, FinFET-based hybrid full adder circuits are designed and simulated at the 16 nm technology node using HSPICE. The reduction in power consumption of FinFET technology compared to MOSFET
Fig. 9 Input and output waveforms of the designed HFA-B-26T with a supply voltage of 0.6 V
Table 5 Results obtained for HFA-B-26T

Average power consumption | 2.78 e−08 W
Average propagation delay | 2.99 e−08 s
PDP | 8.31 e−16 J
technology provides great scope for battery-operated portable electronic systems [3, 12]. It is observed that power consumption is effectively reduced by more than 95% with FinFET technology compared to the MOSFET-based hybrid full adder circuits [16]. The four different hybrid full adder cells, namely HFA-19T, HFA-22T, HFA-B-26T, and HFA-NB-26T [5, 9], were designed and simulated with a supply voltage of 0.6 V and a pulse voltage of 0.55 V. From the results obtained, it is concluded that FinFET-based hybrid full adder circuits are better suited for use in the various electronic systems of day-to-day life.
Fig. 10 Input and output waveforms of the designed HFA-NB-26T with a supply voltage of 0.6 V

Table 6 Results obtained for HFA-NB-26T

Average power consumption | 2.42 e−08 W
Average propagation delay | 2.24 e−08 s
PDP | 5.42 e−16 J

Table 7 Comparison of the different HFAs on the basis of average power consumption

HFA circuit | Average power consumption
HFA-19T | 1.70 e−08 W
HFA-22T | 1.67 e−08 W
HFA-B-26T | 2.78 e−08 W
HFA-NB-26T | 2.42 e−08 W
Table 8 Comparison of average power consumption for the proposed HFAs with previous work

HFA circuit | Average power consumption [9] | Average power consumption (proposed design) | % Reduction in average power consumption
HFA-19T | 75.86 e−08 W | 1.70 e−08 W | 97.7
HFA-22T | 77.45 e−08 W | 1.67 e−08 W | 97.8
HFA-B-26T | 85.08 e−08 W | 2.78 e−08 W | 96.7
HFA-NB-26T | 85.82 e−08 W | 2.42 e−08 W | 97.1
References

1. P. Bhattacharyya, B. Kundu, S. Ghosh, V. Kumar, A. Dandapat, Performance analysis of a low-power high-speed hybrid 1-bit full adder circuit. IEEE Trans. VLSI 23(10) (2015)
2. M. Amini-Valashani, M. Ayat, S. Mirzakuchaki, Design and analysis of a novel low-power and energy-efficient 18T hybrid full adder. Microelectron. J. (2018)
3. A.M. Shams, T.K. Darwish, M.A. Bayoumi, Performance analysis of low-power 1-bit CMOS full adder cells. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 10(1), 20–29 (2002)
4. A.B.A. Tahrim, H.C. Chin, C.S. Lim, M.L.P. Tan, Design and performance analysis of 1-bit FinFET full adder cells for subthreshold region at 16 nm process technology. J. Nanomater. (2015)
5. A. Raghunandan, D.R. Shilpa, Design of high-speed hybrid full adders using FinFET 18 nm technology, in 4th International Conference on Recent Trends on Electronics, Information, Communication & Technology (RTEICT) (IEEE, 2019)
6. S. Singh, Y.B. Shukla, Design methodologies for low power and high speed full adder. J. Crit. Rev. (2020)
7. F. Moradi, D.T. Wisland, H. Mahmoodi, S. Aunet, T.V. Cao, A. Peiravi, Ultra low power full adder topologies. IEEE Trans. VLSI (2009)
8. V. Dokania et al., Design of 10T full adder cell for ultra low-power applications. Ain Shams Eng. J. (2017)
9. H. Naseri, S. Timarehi, Low-power and fast full adder by exploring new XOR and XNOR gates. IEEE Trans. VLSI (2018)
10. A. Pal, Low-Power VLSI Circuits and Systems (Springer Publications, 2015)
11. S. Sharma, G. Soni, Comparison analysis of FinFET based 1-bit full adder cell implemented using different logic styles at 10, 22 and 32 nm, in 2016 International Conference on Energy Efficient Technologies for Sustainability (ICEETS) (2016)
12. H.T. Bui, Y. Wang, Y. Jiang, Design and analysis of low-power 10-transistor full adders using XOR-XNOR gates. IEEE Trans. Circ. Syst. II Anal. Dig. Signal Process 49(1), 25–30 (2002)
13. D. Radhakrishnan, Low-voltage low-power CMOS full adder. IEEE Proc. Circ. Dev. Syst. 148(1), 19–24 (2001)
14. A.P. Chandrakasan, R.W. Brodersen, Low Power Digital CMOS Design (Kluwer Academic Publishers, 1995)
15. A.K. Yadav, B.P. Shrivatava, A.K. Dadoriya, Low power high speed 1-bit full adder circuit design at 45 nm CMOS technology, in Proceedings of the International Conference on Recent Innovations in Signal Processing and Embedded Systems (RISE-2017), 27–29 Oct 2017
16. M. Hasan, A.H. Siddique, A.H. Mondol et al., Comprehensive study of 1-bit full adder cells: review, performance comparison and scalability analysis. SN Appl. Sci. 3, 644 (2021)
A Review on Fish Species Classification and Determination Using Machine Learning Algorithms Sowmya Natarajan and Vijayakumar Ponnusamy
Abstract About 50% of the world population depends on seafood for protein. Because it is a natural resource, illegal fishing together with unregulated practices poses a threat to marine life. A few standard analytical methods are applied to determine fish freshness, quality, and species discrimination with respect to physicochemical properties; colour, meat elasticity, odour, taste, texture, and outer appearance are the attributes acquired for this determination. These methods require highly skilled operators and are expensive, destructive, and time consuming. In the last decade, advances in techniques have made fish species discrimination and freshness and quality evaluation non-invasive, non-destructive, and rapid. Spectroscopic methods, biosensors, image processing, and E-sensors are reliable techniques that provide better instrumental evaluation and are suitable for online/real-time analysis. This review discusses these novel techniques and the results obtained for fish quality and species examination.

Keywords Fish species · Deep learning · Underwater images · Image processing · Biosensors · Spectroscopic · Hyperspectral imaging

1 Introduction

Food safety and consumer acceptance are the principal concerns for retailers and wholesalers of fresh fish and seafood. Fish and fish products play a significant role in a balanced human diet, supplying protein, omega-3 fatty acids, and vitamins [1]. Ecologists in aquatic departments regularly count animal presence to provide information for management and conservation; the immense volume of data collected by underwater devices makes manual counting impractical, consuming significant labour, time, and cost. However, fresh fish and its products are extremely perishable and subject to undesirable flavours, odours, and decay. Reliable techniques
1 Introduction Food safety and consumer acceptance are the principal key concerns for the retailers and wholesale of fresh fish and seafood materials. Fish and its products play significant role in human balanced nutrition which supplies protein, omega-3 fatty acids and vitamins in their regular diet [1]. Ecologists in aquatic department regularly count the animal presence to give the information for the management and conservation. Having immense volume of data collected by the underwater devices makes the counting to be critical for manual processing. It consumes significant labour, time and cost. However, fresh fish and its products are extremely perishable which are subjected to undesirable flavour, odours and decay process. Reliable techniques S. Natarajan · V. Ponnusamy (B) Department of ECE, SRM IST, Chennai 6032031, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_49
and methods are needed to examine fish spoilage and quality through sensory attributes. Colour changes in the gills, elasticity, brightness of the meat, and odour can help classify the freshness of a fish sample. One of the most extensively utilized methods is the Quality Index Method (QIM), which assesses the sensory parameters that significantly affect shelf life and also supports fish species determination. Some traditional and recent research works on fish quality analysis, classification, and species discrimination using imaging techniques are reported and summarized in the following section. The present work focuses on the latest techniques utilized for fish species discrimination, quality examination, and determination of fish varieties underwater. Traditional methods find it difficult to segregate species with QIM and sensory parameters, whereas deep neural networks achieve better accuracy in discrimination, classification, and quality evaluation.
2 Fish Species Classification Using Machine Learning and Deep Learning Models

Fish and seafood products are among the most commonly traded food commodities and are thus prone to fraud and mislabelling. Imaging techniques often provide better results in quantifying spoilage and fraudulent compounds. Underwater images and videos are obtained, pre-processed, segmented, and classified using various machine learning and deep learning techniques. These models work in two phases: (i) a training phase and (ii) a testing phase. Machine learning and deep learning models are mathematical models that learn the hidden patterns in the data in terms of weight values during training; the learned knowledge is then utilized to classify new, unknown data presented in the testing phase. Marine scientists face difficulty in determining the size of flora and fauna and in quantifying and measuring the cover of background habitats, and a great challenge still exists for video/image processing to precisely distinguish and classify fish varieties. Because of the many water bodies involved [2], capturing images of fish underwater is challenging due to variations in light intensity, turbidity of the water, movement of the fish, and the presence of flora and fauna. The proposed work obtained 27,142 images for fish species classification; a convolutional neural network (CNN) and image processing methods are applied and achieve 96.29% accuracy. The review work [3] focuses on the utilization of deep neural networks for prediction and classification in various applications; deep neural networks learn features automatically, and the network model thereby produces better accuracy. Early determination of diabetes from retinal images is another challenging assignment. Hard exudates (HE) [4] are helpful to determine the severity of diabetes, but traditional methods fail to detect this symptom in the
retinal images. The proposed research employs a convolutional neural network to identify microaneurysms from retinal fundus images; the CNN model is able to spot HE in the retinal fundus images with an accuracy of 96%. This also proves that early determination of diabetes is possible with dense deep feature extraction from retinal fundus images. The approach was deployed for real-time applications and obtained a computational time of 0.00183 s per frame; an image enhancement method might be implemented to further enhance the accuracy of the proposed system model. Classification of fish species from images and videos obtained in the natural environment is challenging due to the presence of noise and the surrounding habitat [5]. A two-step deep learning approach presents the classification and detection of temperate fishes without pre-filtering: the first step uses You Only Look Once (YOLO) for the detection of each single species and the sex of the fish, and the second step employs a CNN with a Squeeze-and-Excitation (SE) architecture for classification. Without image augmentation, this work achieves 83.68%; the CNN-SE Net architecture obtains 99.27% accuracy on the Fish4Knowledge dataset without image pre-processing or data augmentation. Classification of fish is needed for quality analysis, disease identification, and biomass estimation. Fish body segments such as the head, scales, and body are split from aquaculture fish images; in aquaculture industries, classifying fish species once the fish is out of the water is a challenging task due to variations in body orientation and deformation. Early-stage plant disease detection by the human visual system is likewise challenging due to the high density of plants. The research work [6] proposes hybrid stages to predict the spread of Fusarium oxysporum disease in tomato plant leaves; the classification model is constructed with two factors to determine the disease with better accuracy. A dataset of 87 k tomato plant leaf images was utilized for disease prediction, consisting of 40% healthy and 60% affected leaf images. With the hybrid method of prediction, 96% prediction accuracy is found to be possible; future work is extended to predict the location of the disease in the plant. The research work [7] developed a multi-stage optimization technique to minimize the undesirable components present in the image. The BYU and Fish-pak datasets were employed for fish classification, and an automatic head detection and orientation correction technique was developed for multi-segmented fish classification. A transfer learning AlexNet approach is utilized during the training phase of the deep neural networks, and Naïve Bayes fusion is established to combine the images and enhance the classification accuracy: 98.64% accuracy was achieved with six fish varieties for one dataset, and 98.94% with four species varieties. This optimized methodology can be examined on various real-time acquired data to classify fish species. Manual classification of fish species is time consuming and prone to misjudgment, and many deep learning algorithms achieve very low accuracy due to low image quality and small dataset size; with this challenging task in mind, Inception-V3 is employed. Experimentation [8] shows that fish species classification is more effective in the presence of a complex background image.
Low image quality and a small dataset may lead to poor performance. Data augmentation, inversion, scaling, and panning of the original images are applied in pre-processing to improve the
accuracy. For fish species classification based on Inception-V3, further data augmentation and a transfer learning algorithm are utilized, improving the accuracy to 89%; a larger dataset and better image quality are needed to achieve higher accuracy. An automated model is developed for the determination and classification of fish species and their habitats, since understanding the characteristics of underwater living beings is helpful for marine biologists. A deep neural network model [9] is deployed with the four convolutional and two fully connected layers of the AlexNet model, and a contrast analysis is performed between the AlexNet and VGG Net models using the QUT online fish dataset to discriminate fish species. In the proposed AlexNet model, optimizing four parameters, namely the number of iterations, the number of fully connected layers, the number of convolutional layers, and the batch size, achieves 100% accuracy during the training phase; 90.48% accuracy is achieved by applying a dropout procedure in AlexNet during the testing phase, while the original AlexNet without dropout achieves 88.01%. Further improvement has to be made for the classification of real-time monitored underwater fish species. An automatic machine learning approach [10] is presented to classify fish species using a benchmarked online dataset: a convolutional neural network is employed to extract species-dependent features of fishes in order to discriminate the species variety. The dataset is trained for cross-set and same-set fish species classification; cross-dataset validation achieves 97.81% accuracy with the combined convolutional neural network and support vector machine (CNN-SVM) algorithm, and the same-dataset performance reaches 96.75%. The work can be improved by comparing various architectures and various video and image datasets from unconstrained underwater environments. Figure 1 shows the acquisition techniques for raw and underwater images and some of the machine learning models for fish quality, freshness, storage time, and species discrimination.

Fig. 1 Review on techniques of fish quality analysis

Currently, fish species classification and recognition [11] is
carried out through knowledge assumptions, visual observations, and comparison with existing characteristics. Image processing combined with a neural network is required to classify fish species. Three species of the family Scombridae are classified: skipjack tuna, tongkol, and tuna. A combination of geometric invariant moments, GLCM, and HSV feature extraction is applied to extract the features, which are processed with a probabilistic neural network model for 112 images; a testing accuracy of 89.65% is achieved on 29 test images. Only a limited amount of testing data was examined with the proposed system, and the camera placement angle and measurement distance are limitations to overcome. Hyperspectral imaging provides a qualitative evaluation of fish freshness based on the correlation between a selected epidermis area of the fish and its spectral reflectance [12]; it also identifies fish stored for more than three days as "not fresh but edible". Aquaculture and fishery fishes are taken for analysis, and a hyperspectral video-image morphometric-based analysis is made for quality evaluation. Hyperspectral data encoded from any object can be used to discriminate differences among objects [13] and is widely used to capture and study landscape ecology and floral diversity. Hyperspectral imaging of live fish specimens is obtained from piranha and pacu species; live fish specimens can reveal clade-level variation and support fish species determination, and the presented method is also applicable to interfamily species discrimination. The images are processed with linear discriminant analysis (LDA), a neural network (NN), and K-nearest neighbours (K-NN): piranha and pacu images are classified with LDA at 81.3%, the K-NN supervised learning algorithm identifies species types with an accuracy of 67%, and the overall NN classification accuracy decreased to 68.8%. The research used intact fish samples for experimentation; its drawbacks are the costly equipment and the relatively low accuracy. The extreme learning machine (ELM) algorithm [14] evolved due to the low performance of the feed-forward neural network (FFNN) model. The ELM is redesigned with new variants such as hidden nodes, biases, and weights, which are optimized using the FFNN to enhance accuracy. The research compared the execution time and accuracy of the implemented ELM variants and concludes that the ELM can be implemented in various applications for huge dataset operations, achieving better generalization and optimization. Raw video footage of a sea-grass environment is taken to determine the luderick fish species and count them [15]: the model is presented with 50 images for determination by the machine and by human beings, where experts achieve 88% determination sensitivity, scientists lag at 82%, and the deep learning model achieves 95%. Another research work [16] addresses classification between Mullus barbatus and Mullus surmuletus as a binary classification problem, and multiclass classification is carried out to discriminate nine gurnard fish family species varieties among Mediterranean demersal species; the evaluation considers only the head segment of the fish contour. A set of features is used to classify the contour segments, and very similar family species varieties are classified with a good accuracy rate. The Leave-One-Out Cross-Validation
method achieves 89.4% accuracy for red mullet species classification; similarly, 80% accuracy is obtained for multiclass classification with Faster R-CNN. The species are classified with respect to the structure of the head, scales, and eyes; the approach could be extended to discriminate other kinds of species, and other optimization methods need to be implemented to increase the classification accuracy. Automation using image processing and deep learning provides substantial benefits in the field of aquatic ecology. The work [17] compared the accuracy and efficacy of a deep learning algorithm with human equivalents for quantifying the abundance of fish in underwater video and image footage, deploying a CNN model for the determination. The research findings conclude that the proposed method outperforms human experts by more than 7.1% in quantifying fish abundance, and scientists by 13.4%; the deep learning model predicts more precisely than the human experts and at a faster rate. A novel technique [18] was developed to automate the detection and classification of fish species such as sharks and dolphins, in order to protect and help endangered species. Images from a boat camera, which include hindrances such as fluctuations in luminous intensity, brightness, and opacity, are obtained for experimentation. The system applies its methodology in three phases: the first phase produces augmented real-time boat camera images, the second detects regions with a high likelihood of containing fish, and the third classifies the detected fishes by species. A CNN is used to classify and detect the fish species, and the proposed model achieves 92% accuracy in determining and classifying eight categories of fish species; boat camera images involve a lot of noise, so extensive data correction is required before processing. The latest research works discussed above deploy image processing techniques combined with machine learning and deep learning models for fish species discrimination: a classification accuracy of 99% was achieved for fish class classification utilizing an online dataset, while a minimum accuracy of 68% was achieved with other kinds of imaging techniques.
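To make the shared training-phase/testing-phase CNN pattern of these works concrete, the following PyTorch sketch defines a small image classifier for fixed-size fish crops; the architecture, input size, and class count are illustrative assumptions, not the network of any particular surveyed paper:

```python
import torch
import torch.nn as nn

class SmallFishCNN(nn.Module):
    """Minimal CNN classifier for fixed-size RGB fish image crops."""
    def __init__(self, num_species):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),   # global average pooling
        )
        self.classifier = nn.Linear(64, num_species)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

model = SmallFishCNN(num_species=8)        # e.g., the 8 categories in [18]
logits = model(torch.randn(4, 3, 64, 64))  # a batch of four 64x64 RGB crops
print(logits.shape)                        # torch.Size([4, 8])
```

During the training phase the weights are fitted to labelled images; in the testing phase the frozen model maps a new, unknown image to per-species scores.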
2.1 Sensor-Based Quality Evaluation

A smart quality sensor enables quality to be measured and its progress predicted over time. The sensor combines information from biochemical and microbial spoilage indexes with dynamic models to predict quality in terms of the QIM and EU grading criteria; sensor performance is validated on fresh and refrigerator-stored Gadus morhua samples. Several physicochemical techniques have been used to evaluate quality attributes and monitor fish freshness [19], based on electrical effects, fish colour, image analysis, texture, and electronic noses for odour detection. Combining the outputs of instrumental analysis with QIM sensory scores for characteristics such as texture, smell, and appearance produces a precise Artificial Quality Index (AQI)
score. Integrated metal oxide microsensors placed inside a sampling vessel with confined air measure the total volatile compounds (TVC) of fish species. The sensor signal is processed and analysed using a support vector machine (SVM)/partial least squares regression (PLSR), which achieves a classification accuracy of 91–100%. A metal oxide sensor is used to measure the total volatile basic nitrogen (TVBN) for the hairtail fish species, and principal component analysis (PCA) is used to analyse the quality of the fish sample, resulting in 97% accuracy. Food quality and safety are major problems in the food industry with regard to nutrition and human health; in coastal regions, inferior-quality fish supplies affect international raw trade. Smart sensors [20] were deployed to predict the progress and quality of fish under various temperature storage conditions: spoilage indicators, namely total volatile base nitrogen and psychrotrophic counts, are needed to evaluate the sensory conditions, and the sensor accounts for biological variability during testing and validation. A colorimetric sensor array was made to examine fish freshness based on selected printing dyes on a reverse-phase silica gel plate, and chub fish was tested every 24 h for up to seven days [21]. Colour-change profile images are obtained to differentiate the fish sample before and after odour exposure; principal component analysis is deployed to analyse the digital data representations, and a radial basis neural network is applied to classify freshness on the chub samples, obtaining an accuracy of 87.5%. The researchers suggest that this colorimetric sensor approach is capable of determining the quality of high-protein food materials. An intelligent fish freshness detection system is proposed [22] for real-time freshness detection: MQ gas sensors (MQ-4, MQ-2, MQ-8, MQ-7, MQ-5, and MQ-135) are used with an Arduino board to collect the data, and an ANN is utilized for freshness classification. The following fish varieties are taken for experimentation: (i) tilapia, (ii) carpio, and (iii) tengra. The ANN was trained with many samples, while testing was done using fresh fish data (Day 1), semi-spoiled fish (Day 2), and spoiled fish data (Day 3) with nine samples of the three species; the entire test set was classified with 99% accuracy, and the system successfully identifies the number of days since the fish was caught with an accuracy of up to 99%. For rapid and accurate detection of the volatile gases generated by raw fish, an Au-patched capacitive sensor is deployed to identify the freshness status [23]. Calibration of the sensor was carried out with known compositions of the volatile gases ammonia (NH3), trimethylamine (TMA), dimethylamine (DMA), and hydrogen sulphide (H2S) in the parts-per-billion (ppb) and parts-per-million (ppm) regimes. The sensor response time for determining fish freshness status is recorded as 2 min 20 s. The sensor output for the volatile gases, obtained from three fish varieties, Rohu (Labeo rohita), Illish (Tenualosa ilisha), and Tilapia (Oreochromis niloticus), at 30, 25, and 20 °C, shows good correlation with total volatile basic nitrogen (TVBN) and total viable counts (TVC). The acceptable consumption limit of Tilapia, Rohu, and Illish at 30 °C was detected to be 10, 11, and 12.5 h of storage control time.
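A minimal sketch, assuming synthetic data, of the gas-sensor-to-ANN pipeline described in [22]: six MQ-sensor readings per sample are classified into three freshness classes (days since catch) with a small multilayer perceptron; the sensor values and network size here are illustrative assumptions, not measured data:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Synthetic training data standing in for the six MQ gas-sensor readings of
# [22]; labels are days since catch (1 = fresh, 2 = semi-spoiled, 3 = spoiled).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc=day, scale=0.2, size=(30, 6)) for day in (1, 2, 3)])
y = np.repeat([1, 2, 3], 30)

clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=1)
clf.fit(X, y)
print(clf.predict(rng.normal(loc=2, scale=0.2, size=(1, 6))))  # likely [2]
```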
The review work [1] assesses fish freshness based on chemical and physical characteristics using various acquisition and statistical algorithmic methods. Data acquisition techniques such as nuclear magnetic resonance, electrochemical biosensor techniques, dielectric properties of fish spectra, E-sensors, and various spectroscopic methods are discussed with quantitative and qualitative statistical measures. Gas-type electronic sensors such as the MQ4, MQ3, MQ5, MQ8, MQ135, and MQ9 are utilized to acquire odour and thereby assess fish freshness; the process involved is illustrated in Fig. 2.

Fig. 2 Illustration of an E-nose gas sensor array system: a closed container with an odour evacuator feeds an E-nose gas sensor array connected to an Arduino board for odour-based freshness-level analysis

Electrochemical biosensors are widely utilized to determine the compounds involved in the deterioration process: ammonia, total volatile basic nitrogen, TMA, and DMA are monitored with non-invasive biosensing electrodes and gas sensors, while the pH sensor is not considered a good predictor of fish quality. E-nose, E-tongue, colorimetric sensor array, and E-eye systems have been developed to monitor TVC and lactic acid bacteria, with a correct classification rate of 95% for aged and fresh fish samples. A colorimetric sensor array is utilized to detect modifications of various chemical agents and the creation of chemical bonds. The dielectric impedance parameter is useful to determine changes in the dielectric property for differently iced fish samples, storage time, and freshness; this technique is applied with electrodes in contact with the sample. NMR works with the spin of odd numbers of neutrons or protons, for example hydrogen atoms: amino acids, organic acids, maximum storage time, fatty acid metabolite content, and water mobility are parameters that can be measured from the NMR resonance frequencies of fish samples. Optical spectroscopic techniques are also deployed for the analysis of the freshness and quality of food products; for fish, they are used for the determination of microbial spoilage, growth, and physicochemical and texture features in combination with chemometric tools. Near-infrared/mid-infrared (NIR/MIR) spectroscopy (800–14,000 nm) extracts the spectral sample from fresh/thawed red mullet, plaice, Atlantic mullet, and flounder species and analyses it through linear discriminant analysis and soft independent modelling of class analogy (SIMCA) with 97.5% accuracy [1].
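As an illustration of the PCA step that several of these E-nose works apply to gas-sensor arrays, the following Python sketch projects hypothetical six-channel MQ-sensor readings (synthetic values, not measured data) onto two principal components:

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic six-channel readings standing in for the MQ3/MQ4/MQ5/MQ8/MQ9/MQ135
# array; rows are fish samples, columns are sensor outputs (illustrative only).
rng = np.random.default_rng(0)
fresh = rng.normal(loc=0.8, scale=0.05, size=(20, 6))
spoiled = rng.normal(loc=1.6, scale=0.15, size=(20, 6))
readings = np.vstack([fresh, spoiled])

# Project onto the first two principal components, the usual first step
# before clustering or classifying freshness levels.
scores = PCA(n_components=2).fit_transform(readings)
print(scores.shape)  # (40, 2)
```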
A potentiometric gas sensor electrode system is composed of hydrogen sulphide, ammonia, and oxidation-reduction gas sensors [24]. This array of sensors is employed to measure the volatile gases emitted from decomposing fish; the fish samples are stored for 3 days during the experimentation, and the volatile gas released from the decomposed fish is used for freshness determination. The deterioration of fish quality over time is also analysed using a sensory test and the potential changes of the sensors. A characteristic response was obtained for trimethylamine (TMA) and dimethylamine (DMA) during the fish degradation process, and it is concluded that PCA delivers better sensitivity for detecting the gases found in degraded fish. Another study presents the odour changes of three fish varieties, trout, sea bream, and sea bass, during a week of storage [25]. MQ3, MQ135, MQ5, MQ4, MQ9, and MQ8 low-cost sensors are used to measure the odour intensity; 10 g of fish meat is kept in a closed container to assess the quality by the total viable counts present in the meat. With respect to the sensory and microbiological variation, data interpretation is made to find the critical threshold; the viable counts are found to be less than 3 colony-forming units per gram (CFU/g). Table 1 summarizes the various methodologies employed for the determination of fish species varieties through images and sensor data: sensor data provide information about the total volatile compounds in the fish, which in turn indicates quality, freshness, and spoilage, while imaging information helps to predict and classify species. Since both species identification and freshness analysis are basically solved as classification problems, both lines of work use accuracy as the validation metric. Table 1 provides the performance of machine learning algorithms for fish species, freshness, and quality classification and a comparative performance analysis of fish species classification methods: a modified CNN architecture called CNN-Squeeze-and-Excitation achieved the highest accuracy [5] of 99.27%, while the K-NN architecture shows poor performance in discriminating species, with an accuracy of 67%. Linear discriminant analysis and SIMCA methods using spectroscopic data result in 97.5% accuracy for finding fish freshness [1], but the method involves bulky and costly equipment that can be used only in the laboratory; on the other hand, sensor array and radial basis NN mechanisms are portable and low cost but achieve only 87.5% accuracy. Table 2 describes the datasets utilized in various application environments for fish species discrimination and classification using ML and DL algorithms: the datasets are obtained from online sources and raw fish images, and the images are feature-extracted and processed with ML and DL algorithms for species determination. From Table 2, the impact of the dataset can be analysed: the image dataset has a huge impact on classification performance, with raw images and real-time camera images providing lower accuracies of 67–89% for fish species discrimination, while pre-processed online images achieve moderate to high accuracies of 90.48–99.27%.
Table 1 Determination of fish species using machine learning algorithms

Literature work | Data/deep learning algorithms | Fish classification/determination | Accuracy achieved
Rathi et al. [2] | Images/CNN | Species discrimination and classification | Acc-96.29%
Knausgård et al. [5] | Images/YOLO, CNN-SE | Species classification | CNN-SE-99.27%
Abinaya et al. [7] | Images/AlexNet | Multi-segmented fish classification | Acc-98.94%
Iqbal et al. [9] | Images/AlexNet and VGG Net | Species discrimination | Acc-90.48%
Salman [10] | Images/CNN and SVM | Species determination | Acc-96.75%
Andayani et al. [11] | Images/neural network model | Species classification | Acc-89.65%
Kolmann et al. [13] | HSI/LDA, K-NN, NN | Species discrimination | K-NN-67%, LDA-81.7%, NN-68.8%
Marti-Puig et al. [16] | Images/R-CNN | Binary and multiclass classification | Acc (binary)-89.4%
Olafsdottir et al. [19] | Sensor array/SVM + PLSR | Spoilage, freshness, total volatile compounds | Acc-97%
Huang et al. [21] | Sensor array/radial basis NN | Freshness | Acc-87.5%
Franceschelli et al. [1] | 1H NMR, NIR/MIR/SIMCA | Freshness and quality | Acc (NIR/MIR)-97.5%

* Acc indicates accuracy of classifier
2.2 Other Comparison Methodologies Applied

Acoustic trawl tracking of fish species plays a significant role in marine stock management and marine environmental monitoring [26]. The trawl track provides ground truth about the presence of species from high-resolution image data. The work presents a deep vision trawl camera system to automate species classification through a deep neural network model from the acquired camera images; the developed model achieves 94% classification accuracy on synthetic data. Fish determination by an Autonomous Underwater Vehicle (AUV) is another challenging task. The proposed work [27] implements data augmentation, network simplification, and a streamlined training process to speed up the determination time, applied with a CNN for underwater fish determination: the data augmentation transformations deliver an increased number of learning samples, and the simulated and optimized model produces better accuracy and reduces the processing time for AUV underwater object detection. Another comparative study [28] was made between CNNs and capsule neural networks. Applications that use localization, segregation, quantification, detection,
Table 2 Performance comparison between dataset, methodology, and application

Literature work | Dataset type | Objective function | ML/DL utilized | Accuracy achieved
Knausgård et al. [5] | Fish4Knowledge | Temperate family | YOLO, CNN-SE Net | CNN-SE-99.27%
Abinaya et al. [7] | Fish-Pak and BYU | Fish classification | Naïve Bayesian | Acc-98.94%
Lan et al. [8] | Online dataset and data augmented | Fish species classification | Inception-V3 | Acc-89%
Iqbal et al. [9] | QUT fish dataset | Fish species classification/identification | AlexNet and VGG Net | Acc-90.48%
Salman [10] | LifeCLEF15 | Fish species classification | CNN | Acc-96.75%
Andayani et al. [11] | Real-time camera images | Scombridae family classification | Probabilistic NN | Acc-89.65%
Kolmann et al. [13] | Raw fish spectral data | Pacus and piranhas species classification | Principal component analysis | K-NN-67%, LDA-81.7%, NN-68.8%
Marti-Puig et al. [16] | Raw trawl fish images | Mediterranean demersal species | Extreme learning machine, R-CNN | Acc (binary)-89.4%, (multiclass)-80%
Rekha et al. [18] | Boat camera images | Feature-based classification of 8 species | CNN | Acc-92%
and analysis can utilize the capsule neural network model for its classification performance. It learns well and improves on the performance of the CNN, while also preserving spatial location information and encoding the relationships between objects; high power consumption is one of the issues faced when utilizing capsule neural networks. A pre-trained VGG16 deep learning model is used with transfer learning for fish species classification [29]. Fifty different species are obtained, with 15 images per species, and the model is trained with four kinds of image dataset: blended images, Canny-filtered images, RGB colour space images, and RGB mixed with blended images. The Genuine Acceptance Rate (GAR) is found to be 96.4% for the combined image dataset. Non-invasive morphological feature monitoring plays an essential role in fish culture to measure parameters such as mass and wellness indicators like the colour of the eyes or gills. This work [30] presents the combination of a deep learning model and image processing applied to determine the length, height, and area occupied by the fish in a particular image. Four specific fish species are selected, with 25 photographs per species. Fish part localization achieves a 91% success rate with a processing time of 2.7 s, and the estimated fish height, area, and length are obtained with an average length error of 4.93%.
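The following PyTorch sketch outlines a VGG16 transfer-learning setup in the spirit of [29]: an ImageNet-pretrained backbone is frozen and only a new 50-class head is trained. Dataset handling and the four image variants are omitted, and the fine-tuning details are assumptions rather than the exact procedure of [29]:

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_SPECIES = 50  # 50 species with 15 images each, as reported for [29]

# Load an ImageNet-pretrained VGG16 and freeze its convolutional backbone.
model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
for param in model.features.parameters():
    param.requires_grad = False

# Replace the final fully connected layer with a new 50-way species head;
# only the classifier layers are then trained on the fish images.
model.classifier[6] = nn.Linear(4096, NUM_SPECIES)

x = torch.randn(2, 3, 224, 224)  # two 224x224 RGB images
print(model(x).shape)            # torch.Size([2, 50])
```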
3 Conclusion

This review discusses many different data acquisition methods and their analysis. Instrumental measurement analysis is more accurate and precise than a sensory panel in the following ways:
• Sensory panel skills are easily transferred to a physical multi-sensing system.
• Quality measurements are fast and less costly than sensory determination.
• For the industrial case, it is non-destructive, rapid, easy to operate, widely accepted, and better than current examination methods.
Some invasive sensor systems using spectroscopy, imaging, and biosensor mechanisms reported in the literature are more suitable for on-field fish species classification and quality assessment. Spectroscopic techniques play a better role for non-invasive, rapid determination of samples. Underwater images and videos make it quite challenging to distinguish fish species with a small dataset. Hyperspectral imaging delivers better results; still, there is a need to optimize the wavelengths used in order to speed up the measurement. Biosensors are already in almost daily field use to monitor the freshness of packaged food products, so in the near future the combination of various spectroscopic methods with chemometric tools will play a better role in fish quality analysis. Integration of a multi-stage optimization system with the estimation of the biomass of one particular fish, as well as disease determination, also paves the way for later work. Moreover, rapid, non-destructive, and non-variable biological sensors need to be enhanced towards quality analysis of fish during storage conditions.
References

1. L. Franceschelli, A. Berardinelli, S. Dabbou, L. Ragni, M. Tartagni, Sensing technology for fish freshness and safety: a review. Sensors 21(4), 1373 (2021)
2. D. Rathi, S. Jain, S. Indu, Underwater fish species classification using convolutional neural network and deep learning, in 2017 Ninth International Conference on Advances in Pattern Recognition (ICAPR) (IEEE, 2017), pp. 1–6
3. A. Bashar, Survey on evolving deep learning neural network architectures. J. Artif. Intell. 1(02), 73–82 (2019)
4. A. Sungheetha, R. Sharma, Design an early detection and classification for diabetic retinopathy by deep feature extraction based convolution neural network. J. Trends Comput. Sci. Smart Technol. (TCSST) 3(02), 81–94 (2021)
5. K.M. Knausgård, A. Wiklund, T.K. Sørdalen, K.T. Halvorsen, A.R. Kleiven, L. Jiao, M. Goodwin, Temperate fish detection and classification: a deep learning based approach. Appl. Intell. 1–14 (2021)
6. R. Dhaya, Flawless identification of fusarium oxysporum in tomato plant leaves by machine learning algorithm. J. Innov. Image Process. (JIIP) 2(04), 194–201 (2020)
7. N.S. Abinaya, D. Susan, R. Kumar, Naive Bayesian fusion based deep learning networks for multi segmented classification of fishes in aquaculture industries. Eco. Inform. 61, 101248 (2021)
8. X. Lan, J. Bai, M. Li, J. Li, Fish image classification using deep convolutional neural network, in Proceedings of the 2020 International Conference on Computers, Information Processing and Advanced Education (2020), pp. 18–22
9. M.A. Iqbal, Z. Wang, Z.A. Ali, S. Riaz, Automatic fish species classification using deep convolutional neural networks. Wireless Pers. Commun. 116(2), 1043–1053 (2021)
10. A. Salman, A. Jalal, F. Shafait, A. Mian, M. Shortis, J. Seager, E. Harvey, Fish species classification in unconstrained underwater environments based on deep learning. Limnol. Oceanogr. Methods 14(9), 570–585 (2016)
11. U. Andayani, A. Wijaya, R.F. Rahmat, B. Siregar, M.F. Syahputra, Fish species classification using probabilistic neural network. J. Phys. Conf. Ser. 1235(1), 012094 (2019)
12. P. Menesatti, C. Costa, J. Aguzzi, Quality evaluation of fish by hyperspectral imaging, in Hyperspectral Imaging for Food Quality Analysis and Control (Academic Press, 2010), pp. 273–294
13. M.A. Kolmann, M. Kalacska, O. Lucanus, L. Sousa, D. Wainwright, J.P. Arroyo-Mora, M.C. Andrade, Hyperspectral data as a biodiversity screening tool can differentiate among diverse Neotropical fishes. Sci. Rep. 11(1), 1–15 (2021)
14. J.S. Manoharan, Study of variants of extreme learning machine (ELM) brands and its performance measure on classification algorithm. J. Soft Comput. Paradigm (JSCP) 3(02), 83–95 (2021)
15. https://globalwetlandsproject.org/automated-analysis-of-aquatic-videos-accessible-machinelearning-tools-for-ecologists/
16. P. Marti-Puig, A. Manjabacas, A. Lombarte, Automatic classification of morphologically similar fish species using their head contours. Appl. Sci. 10(10), 3408 (2020)
17. E.M. Ditria, S. Lopez-Marcano, M. Sievers, E.L. Jinks, C.J. Brown, R.M. Connolly, Automating the analysis of fish abundance using object detection: optimizing animal ecology with deep learning. Front. Mar. Sci. 7, 429 (2020)
18. B.S. Rekha, G.N. Srinivasan, S.K. Reddy, D. Kakwani, N. Bhattad, Fish detection and classification using convolutional neural networks, in International Conference on Computational Vision and Bio Inspired Computing (Springer, Cham, 2019), pp. 1221–1231
19. G. Olafsdottir, P. Nesvadba, C. Di Natale, M. Careche, J. Oehlenschläger, S.V. Tryggvadottir, R. Schubring, M. Kroeger, K. Heia, M. Esaiassen, A. Macagnano, Multisensor for fish quality determination. Trends Food Sci. Technol. 15(2), 86–93 (2004)
20. M.R. García, M.L. Cabo, J.R. Herrera, G. Ramilo-Fernández, A.A. Alonso, E. Balsa-Canto, Smart sensor to predict retail fresh fish quality under ice storage. J. Food Eng. 197, 87–97 (2017)
21. X. Huang, J. Xin, J. Zhao, A novel technique for rapid evaluation of fish freshness using colorimetric sensor array. J. Food Eng. 105(4), 632–637 (2011)
22. K. Dharmendra, S. Kumar, S.S. Rajput, An intelligent system for fish freshness quality assessment using artificial neural network
23. M. Senapati, P.P. Sahu, Onsite fish quality monitoring using ultra-sensitive patch electrode capacitive sensor at room temperature. Biosens. Bioelectron. 168, 112570 (2020)
24. N. Kaneki, H. Tanaka, T. Kurosaka, K. Shimada, Y. Asano, Measurement of fish freshness using potentiometric gas sensor. Sens. Mater. 15(8), 413–422 (2003)
25. E. Yavuzer, Determination of fish quality parameters with low cost electronic nose. Food Biosci. 41, 100948 (2021)
26. V. Allken, N.O. Handegard, S. Rosen, T. Schreyeck, T. Mahiout, K. Malde, Fish species identification using a convolutional neural network trained on synthetic data. ICES J. Mar. Sci. 76(1), 342–349 (2019)
27. S. Cui, Y. Zhou, Y. Wang, L. Zhai, Fish detection using deep learning. Appl. Comput. Intell. Soft Comput. (2020)
28. T. Vijayakumar, Comparative study of capsule neural network in various applications. J. Artif. Intell. 1(01), 19–27 (2019)
29. P. Hridayami, I.K.G.D. Putra, K.S. Wibawa, Fish species recognition using VGG16 deep convolutional neural network. J. Comput. Sci. Eng. 13(3), 124–130 (2019)
30. N. Petrellis, Measurement of fish morphological features through image processing and deep learning techniques. Appl. Sci. 11(10), 4416 (2021)
Malicious URL Detection Using Machine Learning Techniques

Shridevi Angadi and Samiksha Shukla
Abstract Cyber security is a very important requirement for users. With the rise in Internet usage in recent years, cyber security has become a serious concern for computer systems. When a user accesses a malicious Web site, it initiates pre-programmed malicious behavior. As a result, numerous methods exist for locating potentially hazardous URLs on the Internet. Traditionally, detection relied heavily on the use of blacklists. Blacklists, however, are not exhaustive and cannot detect newly created harmful URLs. Recently, machine learning methods have received a lot of attention as a way to improve malicious URL detectors. The main goal of this research is to compile a list of significant features that can be utilized to detect and classify the majority of malicious URLs. To increase the effectiveness of classifiers for detecting malicious URLs, this study recommends utilizing host-based and lexical aspects of the URLs. Malicious and benign URLs were classified using machine learning classifiers such as the AdaBoost and Random Forest algorithms. The experiment shows that Random Forest performs very well when checked using a voting classifier over the AdaBoost and Random Forest algorithms, achieving about 99% accuracy.

Keywords Cyber security · Malicious URL detection · Lexical features · Count features · Binary features · Random forest · AdaBoost · Machine learning · Voting classifier
1 Introduction

The goal of cyber security is to avoid harm to software and hardware components, networks, and network components, as well as to defend against attackers who might steal users' confidential and personal data. At the same time as the Internet is used to make our jobs simpler, a variety of adversaries attempt to steal data from our computers. There are many ways to attack a system, and one of
them is using a malicious URL. Antivirus applications, intrusion prevention/detection systems, and spam filters all use the URL blacklisting method. The blacklist approach is quite simple and may provide improved accuracy if the lists are updated regularly, but it is generally ineffective at finding newly formed malicious URLs. Machine learning plays a critical role in recognizing malicious URLs in recent detection algorithms. Web resources are referred to using the URL, which stands for 'Uniform Resource Locator.' A URL has two parts: (a) the resource name, which corresponds to the domain name or IP address of the resource's location, and (b) the protocol identifier, which reveals which protocol is in use. Using a set of URLs as training data together with suitable statistical features, machine learning techniques create a prediction function to categorize a URL as dangerous or benign; unlike blacklisting approaches, this offers the ability to generalize to new URLs. The primary requirement for training a machine learning model is the presence of training data, and machine learning is defined as supervised or unsupervised learning depending on whether the training data has labels. After the training data is collected, the subsequent step is to extract informative features such that they sufficiently describe the URL and, at the same time, can be expressed quantitatively by the machine learning model [1]. Here, the length [2], count, and binary features are first extracted from the URLs present in our datasets, added as columns, and then analyzed. Data preprocessing and visualizations were used to better comprehend and learn about the data. To categorize benign and malicious URLs, machine learning classifiers such as AdaBoost [3] and Random Forest were utilized.

Motivation: Malware from fraudulent Web sites and drive-by downloads can be difficult to detect using Internet protection tools [4]; however, it may be possible to avoid acquiring it in the first place. Malicious URL detection is still a problem that costs billions of dollars each year. According to the 2019 Webroot Threat Report [5], 40% of malicious URLs were found on good domains, so good Web sites can become a harmful threat for users; browsers and antiviruses are expected to prevent a user from accessing such Web sites. In recent years [6], the majority of attacks have been carried out by propagating hacked URLs and phishing, with malicious Uniform Resource Locator (URL) addresses being the most common means used by hackers to carry out harmful actions. These attacks have wreaked havoc. Even though today's security components attempt to detect malicious sites and IP addresses, attackers are able to evade detection by employing various approaches. To avoid harm, it is critical to research and obtain appropriate malicious URL detection solutions. The rising level of cyber crime has necessitated a reliable classification and identification framework [7].

Contribution: The key contributions of this work include the detection of malicious Web sites or URLs using count, binary, and lexical aspects of the URLs to improve the performance of the classifiers used in this experiment. In the literature review, the state of the art of machine learning approaches for identifying
malicious URLs has been assessed. The tokenization technique was utilized to extract significant characteristics during feature extraction, and URL parsing is done to split the URL and obtain particular fields of the URL for data analysis. The dataset was highly imbalanced, with 76.80% benign URLs and 23.20% malicious URLs, so an oversampling technique was applied to balance it. The AdaBoost and Random Forest classifiers were used, compared, and combined using the voting classifier. Fivefold cross-validation is used to guarantee that the model is not overfit, and train and test accuracies were compared. A function was created that accepts URLs as user input and detects whether the given URL is a malicious or a benign one. If the given URL is detected as malicious, an alert message box is displayed with the message "Avoid clicking on such URLs"; otherwise, a "Safe URL" message is printed.

Organization: The organization of the paper is summarized here. Section 2 provides a brief overview of related research work. The problem definition, research challenges, and dataset description are explained in Sect. 3. Section 4 presents the methodology and module description; the module description covers the preprocessing, the model, and the architecture of the model. The performance analysis and results are discussed in Sect. 5. Section 6 concludes the study with a consideration of future research.
2 Related Work

In recent strategies for detecting malicious URLs, machine learning plays a critical role in identifying dangerous URLs. Many potential attackers may attempt to take data from our systems, and one of the several ways to attack is using a malicious URL. Suspicious and malicious URLs may allow an adversary to gain unauthorized access to user data. Because the distribution of harmful URLs has increased in recent years, it is essential that approaches to detect and prevent these dangerous URLs be studied and implemented [6]. Classifying URLs as benign or malicious using appropriate classification methods is one of the essential tasks for any humanitarian organization and in the cyber security domain [8].
Xuan et al. [9] proposed a method for malicious URL detection using machine learning based on URL behaviors and attributes as well as big data technology. Supervised learning techniques such as RF and SVM were employed; the RF algorithm achieves a high accuracy of about 96.28%, and a balance is maintained between the processing time and the accuracy of the system. The approach combines concatenation methods with a custom decision policy and a logistic regression decision policy.
Vanitha et al. [10] suggested a method for categorizing URLs that uses a machine learning algorithm, logistic regression, for binary classification. The dangerous URLs were detected using feature extraction and feature vectorization. It is observed that logistic regression obtains maximum learning accuracy compared
to other algorithms such as Naïve Bayes and Random Forest, achieving an accuracy of about 98%.
Kumar et al. [11] used black-and-white-list technology and machine learning algorithms to form a multilayer filtering model for the detection of malicious URLs. The model was developed around machine learning approaches such as naïve Bayesian classification and a decision tree classifier threshold, which is used to guide two URL-filtering classifiers. To increase the accuracy of the malicious URL detection system, the Naive Bayesian, decision tree, and SVM classifiers were merged in a multilayer model. The multilayer filter model performs better than all three individual classifier models, achieving an accuracy of about 79.55%.
A machine learning framework for intelligently detecting dangerous URLs was introduced by Chen et al. [12]. A total of 26,054 URLs were collected for the experiment. XGBoost and ANOVA are used to reduce the dimensionality of the features and determine the best number of training features to optimize the model. The XGBoost classification model's complexity is lowered by 78%, enhancing training speed while maintaining 99.98% accuracy.
Patgiri et al. [13] present a comprehensive investigation of SVM and other machine learning algorithms for the detection of malicious URLs. Lexical and host-based aspects of the URL were used for detecting harmful URLs. Across iterations, the accuracy of SVMs swings more than that of Random Forests, and the Random Forest classifier outperforms the SVM classifier. When data is divided in an 80:20 ratio, Random Forest obtains an accuracy of around 93%.
Kumar et al. [14] discussed how to classify phishing URLs from a given set of URLs containing benign and phishing URLs. They also discussed randomization of the dataset, feature engineering, feature extraction using lexical analysis, host-based features, and statistical analysis. Dataset randomization yielded a great optimization, and the accuracy of the classifier improved significantly. A simple technique based on regular expressions was utilized to extract attributes from the URLs. The classifiers used were LR, RF, Gaussian Naive Bayes, DT, and KNN; the most accurate method was Random Forest, with an accuracy of 98.03%.
Vanhoenshoven et al. [15] developed a machine learning-based solution for detecting fraudulent URLs using pattern categorization. They also looked at the performance of algorithms like NB, SVM, multilayer perceptron, DT, RF, and k-Nearest Neighbors on a database of nearly two million URLs. Random Forest, followed by MLP, appears to be the most appropriate classification technique for this problem.
Kumar et al. [16] provided a solution to the problem of identifying malicious URLs on the Web using ensemble-based classifiers. The model was trained for five different algorithms, namely Boosted Trees (AdaBoost), Bagged Trees (decision trees), Subspace Discriminant, Subspace KNN, and Random Under Sampling (RUS) Boosted Trees. Bagged Trees showed the highest performance on the HTTPS protocol,
at about 97.5%, compared with RUS Boosted Trees, which showed greater accuracy of about 92.1% on the HTTP protocol.
3 Problem Definition, Research Challenges, and Dataset Description

3.1 Problem Definition

Malicious URLs are detected and classified as malicious or benign using several machine learning classifiers. To increase the effectiveness of the classifiers, several feature extraction approaches based on lexical and host-based features are applied.
3.2 Challenges

Managing a large amount of data is a challenging endeavor, and selecting characteristics that yield high-quality performance from machine learning algorithms is likewise complex. Due to the vast expanse of the Internet, tracing botnets created by malware is extremely challenging. Further challenges are reducing the model's processing time and handling the classification issues caused by an unbalanced dataset.
3.3 Dataset Description

The dataset named urldata.csv was taken from Kaggle for experimental purposes. The goal of this collection was to address the problem of harmful and dangerous URLs on the Internet; malicious URLs can be detected using lexical features along with tokenization of the URL strings. It contains 450,176 rows and 4 columns, named (i) Unnamed, (ii) URL, (iii) Label, and (iv) Result. The Unnamed column holds index values, the URL column consists of both malicious and benign URLs, the Label column has the labels malicious and benign, and the Result column contains the binary values 0 and 1 (1 — benign, 0 — malicious). The dataset does not contain any null values, and the Unnamed column is dropped in the data preprocessing stage, as it is not required for any kind of analysis. The size of the dataset is 34,595 KB, and each URL can be identified by a unique ID. In the dataset, a URL belongs to either the good or the harmful class.
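As a minimal illustration (not the authors' code), loading and inspecting this dataset with Pandas might look as follows; the exact name Pandas assigns to the unnamed index column ("Unnamed: 0") is an assumption:

```python
import pandas as pd

df = pd.read_csv("urldata.csv")
# The index column is not needed for analysis; "Unnamed: 0" is the usual
# name pandas gives it, but the exact name may differ
df = df.drop(columns=["Unnamed: 0"], errors="ignore")

print(df.shape)  # expected (450176, 3) after the drop
# Roughly 76.80% benign vs. 23.20% malicious, per the methodology (Step 7)
print(df["Label"].value_counts(normalize=True))
```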
4 Methodology

The proposed methodology, as shown in Fig. 1, uses the urldata.csv dataset collected from Kaggle, the Random Forest algorithm, the AdaBoost algorithm, and a voting classifier. For this research, a Jupyter notebook was used to write code in the Python programming language. The following steps are followed in the methodology:

Step 1 Data Encoding: As a first step in building the model, feature extraction is done in such a way that the values are represented as 0s and 1s.
Step 2 Feature Extraction: Length, count, and binary features are extracted and added to the dataset using lambda functions (a code sketch of Steps 2–4 follows the step list below).
Step 3 URL Parsing: Splitting the URL into distinct components is possible with a URL parser or the free online URL Query String Splitter, which breaks the query string into a human-readable and intelligible form. URL parsing is done to split the URL and obtain particular fields of the URL for data analysis.
Step 4 Tokenizing URL: Tokenization is the process of breaking down a large body of text into smaller parts or words. The NLTK module includes additional tokenization functions that may be used in programs. Here, URLs are split into smaller fields in order to do data analysis.
Fig. 1 Methodology: Requirements (malicious URL dataset; Google Colab/Jupyter Notebook; Python programming language) → Training phase (extraction of lexical, count, and binary features; oversampling technique; splitting of train and test data) → Detection phase (classification of malicious and benign URLs; confusion matrix; cross-validation; display alert message box)
Step 5 Data Scaling: No data scaling was needed, as no columns requiring it were present.
Step 6 Training and Test Set: The train_test_split function from the sklearn package is used to split the data into train and test sets. The split ratio is 80:20, meaning 80% of the data is used for training and 20% for testing fresh observations and classifying them. The model learns faster with a larger amount of training data.
Step 7 Balancing Data: The dataset is highly imbalanced, with 76.80% benign URLs and 23.20% malicious URLs. Before training, the dataset must be balanced to reduce bias, so an oversampling strategy is used to balance the training data; otherwise, the model would be skewed toward the majority class.
Step 8 Feature Selection: The main features of the model are picked in this phase. The Pearson correlation coefficient method was used for feature selection: a heat map was drawn and visualized, and features were selected based on a threshold value of 0.65. Two separate correlations are examined, namely the correlation of each feature with the target and the correlation between features. Features highly correlated with the target were selected, and features highly correlated with other features were eliminated.
Step 9 Model Building: The next step is to construct a binary classification model that can tell the difference between good and bad URLs. The AdaBoost and Random Forest classifiers are utilized, compared, and combined using the voting classifier, with the outcome indicating that the Random Forest classifier has the best accuracy.
Step 10 Cross-Validation: Fivefold cross-validation was applied to validate the model and check that it shows valid results.
Step 11 Detection: A function was created, and different URLs were provided as user input to detect and classify whether the given URL belongs to the benign or the malicious class. If a URL is detected as malicious, an alert message box is displayed; otherwise, a "Safe URL" message is printed.
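The sketch below is a plausible reconstruction of Steps 2–4, deriving length, count, and binary features from each URL with urllib.parse and regular expressions; the exact feature definitions, column names, and shortening-service list are assumptions rather than the authors' code:

```python
import re
import pandas as pd
from urllib.parse import urlparse

def extract_features(url: str) -> dict:
    parsed = urlparse(url)  # note: URLs without a scheme yield an empty netloc
    hostname, path = parsed.netloc, parsed.path
    dirs = [d for d in path.split("/") if d]
    return {
        "url_length": len(url),
        "hostname_length": len(hostname),
        "path_length": len(path),
        "fd_length": len(dirs[0]) if dirs else 0,  # first directory
        "count_dash": url.count("-"), "count_at": url.count("@"),
        "count_qmark": url.count("?"), "count_percent": url.count("%"),
        "count_dot": url.count("."), "count_equal": url.count("="),
        "count_http": url.count("http"), "count_www": url.count("www"),
        "count_digits": sum(c.isdigit() for c in url),
        "count_letters": sum(c.isalpha() for c in url),
        "count_dirs": path.count("/"),
        # Binary features: raw IP in the URL, and known shortening services
        "uses_ip": int(bool(re.search(r"(\d{1,3}\.){3}\d{1,3}", url))),
        "is_shortened": int(bool(re.search(r"bit\.ly|goo\.gl|tinyurl|t\.co", url))),
    }

df = pd.read_csv("urldata.csv")
feature_frame = df["URL"].apply(lambda u: pd.Series(extract_features(u)))
df = pd.concat([df, feature_frame], axis=1)  # features become new columns
```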
4.1 Module Description

4.1.1 Preprocessing

First, the data is checked for missing values; if any are found, they are dropped using functions available in the Pandas library. Binary values like yes or no are replaced with zeros and ones. Features are selected by extracting various features from the URLs in the dataset, such as length, count, and binary features.
Fig. 2 Steps followed in the experiment: Dataset → Preprocessing → Extraction of Features → Training and Testing → Result
4.1.2 Proposed Model
The malicious URL dataset is split in an 80:20 ratio, which means the model is trained on 80% of the data and tested on the remaining 20%. Binary classification is done using the Random Forest and AdaBoost algorithms, after which the voting classifier is applied to select the model with the highest accuracy. The Random Forest algorithm gives about 99.8% accuracy, and the AdaBoost classifier gives about 99.5% accuracy; the results demonstrate that the Random Forest classifier performs well compared to the AdaBoost classifier. Train and test accuracy were also checked, with the model producing 99% accuracy on both train and test data. Fivefold cross-validation was applied to validate the model, with cross-validation accuracy reaching 99%. As the oversampling technique is used to balance the dataset and cross-validation is applied to validate the model, the overfitting problem does not occur (Fig. 2).
The dataset collected from Kaggle consists of 450,176 rows and 4 columns. In the data preparation part, the behavior and features of the URLs are studied and noted. Length, count, and binary features were used for feature extraction [17]. In all, 19 URL characteristics were retrieved, as follows. Lexical (length) features: 'Length of URL', 'Length of Hostname', 'Length of Path', 'Length of First Directory', and 'Length of Top Level Domain'. Count features: counts of '-', '@', '?', '%', '.', '=', 'http', 'www', digits, letters, and number of directories. Binary features: 'Use of IP or not' and 'Use of Shortening URL or not'. The machine learning models were developed in Python to detect dangerous Web sites accessed through the Web browser [18]. The path of malicious URL detection and categorization is depicted in the block diagram [19] shown in Fig. 3. There are two stages:
1. Training phase
2. Detection phase
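The training pipeline described above (80:20 split, oversampling, Random Forest and AdaBoost combined by a soft voting classifier, fivefold cross-validation) can be sketched with scikit-learn and imbalanced-learn as below; the hyperparameters are assumptions, and df is the feature-augmented frame from the earlier sketch:

```python
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier, VotingClassifier
from imblearn.over_sampling import RandomOverSampler

X = df.drop(columns=["URL", "Label", "Result", "Unnamed: 0"], errors="ignore")
y = df["Result"]

# 80:20 split, as in the paper
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Oversample the minority class to correct the 76.80/23.20 imbalance
X_train, y_train = RandomOverSampler(random_state=42).fit_resample(X_train, y_train)

rf = RandomForestClassifier(n_estimators=100, random_state=42)
ada = AdaBoostClassifier(random_state=42)
# Soft voting combines the classifiers' predicted class probabilities
voter = VotingClassifier([("rf", rf), ("ada", ada)], voting="soft")
voter.fit(X_train, y_train)

print("test accuracy:", voter.score(X_test, y_test))
print("5-fold CV accuracy:", cross_val_score(voter, X_train, y_train, cv=5).mean())
```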
Fig. 3 Workflow of the training and detection phases: Training phase — URL → feature extraction and labeling → machine learning algorithms → training; Detection phase — URL → feature extraction → classification → malicious (alert message box) or benign (Safe URL)
In the training phase, URLs were collected from the source (Kaggle), and 80% of the data was sent to the feature extraction and labeling steps; the model was then trained using machine learning methods. Once trained, the model is tested on unseen data and assessed using performance criteria such as accuracy, precision, recall, and F1-score. In the detection phase, the remaining 20% of unseen data was sent to the feature extraction step; after feature extraction, these URLs are tested on the trained model and classified as "Malicious URL" or "Safe URL." In this experiment, after the detection phase, a function was created to accept user input and classify whether the given input is benign or malicious. If a URL is found malicious, an alert message box is displayed stating "Avoid clicking on such URLs"; otherwise, a "Safe URL" message is printed. The experiments showed that efficient classification of malicious and benign URLs can be done using machine learning classifiers like the Random Forest and AdaBoost algorithms. A voting classifier was used to select the model that gives the best accuracy; a soft voting classifier is used, in which the data is classified based on the probabilities and the weights associated with each classifier.
5 Experimental Results

The malicious URL dataset is split in an 80:20 ratio, with 80% of the data utilized for training and 20% for testing. Binary classification is done using the Random Forest and AdaBoost classifiers, after which the voting classifier is applied to select the model with the highest accuracy. The comparative study between the two classifiers shows that the Random Forest algorithm gives about 99.8% accuracy and the AdaBoost classifier about 99.5%; as a result, the Random Forest method outperforms the AdaBoost classifier. Train and test accuracy were also checked, with the model producing 99% accuracy on both train and test data. Fivefold cross-validation was applied to validate the model, with cross-validation accuracy reaching 99%. As the oversampling technique is used to balance the dataset and cross-validation is applied to validate the model, the overfitting problem does not occur. To check the 'TP', 'TN', 'FP', and 'FN' values, a confusion matrix is plotted, as shown in Fig. 5. Performance metrics are used to compute the 'TPR', 'FPR', 'Precision', 'Recall', and 'F-measure' [20]. The model scoring is good, with an accuracy of about 99.9% using the RF algorithm. Figure 6 shows the result of our experiment.

Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
F-measure = 2 · Precision · Recall / (Precision + Recall)

At last, a function was created to accept URLs given by users to check whether a given URL is a malicious or a benign one. The model detects, classifies, and gives results based on user inputs. If the result equals 1, the URL belongs to the malicious class, and if it is 0, it belongs to the benign class. As shown in Fig. 4, if a malicious URL is detected, an alert message box displays "Avoid clicking on such URLs," and if the URL is safe, a "Safe URL" message is printed. The Win32api library was used to display the alert message box.
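A hedged sketch of the evaluation and of the user-facing detection function described above follows; it reuses extract_features and voter from the earlier sketches, and the 1 = malicious convention stated in this section is an assumption (the dataset description earlier labels it the other way round). win32api comes from the third-party pywin32 package and is Windows-only:

```python
import pandas as pd
from sklearn.metrics import confusion_matrix, classification_report

y_pred = voter.predict(X_test)
print(confusion_matrix(y_test, y_pred))       # TP, TN, FP, FN counts
print(classification_report(y_test, y_pred))  # precision, recall, F-measure

def check_url(url: str) -> None:
    # Reindex so the column order matches the training features
    feats = pd.DataFrame([extract_features(url)])[list(X_test.columns)]
    if voter.predict(feats)[0] == 1:          # 1 = malicious (assumed here)
        try:
            import win32api                   # pywin32, Windows only
            win32api.MessageBox(0, "Avoid clicking on such URLs", "Alert")
        except ImportError:
            print("Avoid clicking on such URLs")
    else:
        print("Safe URL")
```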
6 Conclusion and Future Work

The malicious URL detection model uses machine learning classifiers such as Random Forest and AdaBoost to perform binary classification. The voting classifier is used to select the model giving the highest accuracy among all the models. The results show that the Random Forest algorithm performs well compared to the AdaBoost classifier: Random Forest gives about 99.8% accuracy, whereas AdaBoost gives about 99.5%, and hence the voting classifier predicts that Random Forest performs best. A function has been
Fig. 4 Result of experiment
Fig. 5 Confusion matrix
Fig. 6 Model scoring
created which detects a URL as malicious or benign based on the characteristics of the URL. URLs are accepted as user input and are identified and classified as benign or malicious. If a URL is found malicious, an alert message box is displayed showing "Avoid clicking on such URLs"; otherwise, a "Safe URL" message is printed. For URLs found malicious, precautionary measures such as blacklisting and red-listing can be carried out so that they do not appear anymore in the future. In future work, deployable and reliable models for malicious URL detection could be built, and various deep learning algorithms can be applied and experimented with on the dataset.
References

1. T. Li, G. Kou, Y. Peng, Improving malicious URLs detection via feature engineering: linear and nonlinear space transformation methods. Inf. Syst. 91, 101494 (2020)
2. M. Darling, G. Heileman, G. Gressel, A. Ashok, P. Poornachandran, A lexical approach for classifying malicious URLs, in 2015 International Conference on High Performance Computing & Simulation (HPCS) (IEEE, 2015), pp. 195–202
3. F. Khan, et al., Detecting malicious URLs using binary classification through AdaBoost algorithm. Int. J. Electr. Comput. Eng. 10(1), 2088–8708 (2020)
4. P. Burnap, et al., Real-time classification of malicious URLs on Twitter using machine activity data, in 2015 Proceedings of the IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (2015)
5. Webroot Inc., Webroot Threat Report (2019). Available from https://www-cdn.webroot.com/9315/5113/6179/2019_Webroot_Threat_Report_US_Online.pdf
6. N. Khan, J. Abdullah, A.S. Khan, Defending malicious script attacks using machine learning classifiers. Wireless Commun. Mobile Comput. (2017)
7. A.B. Sayamber, A.M. Dixit, Malicious URL detection and identification. Int. J. Comput. Appl. 17–23 (2014)
8. S. Kumi, C. Lim, S.-G. Lee, Malicious URL detection based on associative classification. Entropy 23(2), 182 (2021)
9. C. Do Xuan, H. Nguyen, T. Nikolaevich, Malicious URL detection based on machine learning (2020)
10. N. Vanitha, V. Vinodhini, Malicious-URL detection using logistic regression technique. Int. J. Eng. Manage. Res. (IJEMR) 108–113 (2019)
11. R. Kumar, et al., Malicious URL detection using multi-layer filtering model, in 14th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP) (IEEE, 2017)
12. Y.-C. Chen, Y.-W. Ma, J.-L. Chen, Intelligent malicious URL detection with feature analysis, in IEEE Symposium on Computers and Communications (ISCC) (IEEE, 2020)
13. R. Patgiri, et al., Empirical study on malicious URL detection using machine learning, in International Conference on Distributed Computing and Internet Technology (Springer, Cham, 2019)
14. J. Kumar, et al., Phishing website classification and detection using machine learning, in 2020 International Conference on Computer Communication and Informatics (ICCCI) (IEEE, 2020)
15. F. Vanhoenshoven, et al., Detecting malicious URLs using machine learning techniques, in 2016 IEEE Symposium Series on Computational Intelligence (SSCI) (IEEE, 2016)
16. H. Kumar, P. Gupta, R.P. Mahapatra, Protocol based ensemble classifier for malicious URL detection, in 2018 3rd International Conference on Contemporary Computing and Informatics (IC3I) (IEEE, 2018)
17. M.S.I. Mamun, et al., Detecting malicious URLs using lexical analysis, in International Conference on Network and System Security (Springer, Cham, 2016)
18. S. Jino, S.V. Niranjan, R. Madhan Kumar, A. Harinisree, Machine learning based malicious website detection. J. Comput. Theor. Nanosci. 17(8), 3468–3472 (2020)
19. D. Kapil, A. Bansal, N.M.A.J. Anupriya, Machine learning based malicious URL detection. 8(4S), 22–26 (2020)
20. H.M.J. Khan, et al., Identifying generic features for malicious URL detection system, in 2019 IEEE 10th Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON) (IEEE, 2019)
Comparative Study of Blockchain-Based Voting Solutions

Khushi Patel, Dipak Ramoliya, Kashish Sorathia, and Foram Bhut
Abstract The election is one of the factors due to which democracy can fail as a system. Today, countries like India, the US, and even Japan still face accusations of vote rigging, lack of transparency, and hacking of EVMs. The latest election held in India, even in a situation like this pandemic, leads us to question whether our government parties are unconcerned about people's well-being in their fight for power or are simply turning a deaf ear to the issue. Voting is one of the core pillars of democracy, and there have been continual attempts to improve the methodology and procedures for achieving a valid, open voting system. E-voting was one solution for some of these issues, but due to security and privacy concerns it was not widely adopted. Blockchain technology is an emerging, decentralized technology that promises to improve various aspects of many industries, and extending E-voting onto blockchain technology might alleviate its current problems. An Ethereum blockchain-based decentralized voting system named MYVOTE, which protects individual voter privacy and improves accessibility while ensuring fairness, protection, and cost-efficiency, is proposed in this paper. The system also preserves the confidentiality and ease of access of the individual voter
while maintaining transparency, security, and efficiency in the voting process. A strong authentication mechanism is incorporated to ensure the authenticity of the system.

Keywords Blockchain · E-voting · Decentralized architecture · Ethereum · 2-factor authentication · P2P network
1 Introduction

Today's democracies are based on popular consensus, which is achieved by voting. Many countries still use the conventional ballot method, which necessitates centralized coordination by a trusted third party for the execution of the voting process as well as for the monitoring and counting of ballots, raising the prospect of vote tampering and corruption. E-voting, or electronic voting, offers various capabilities such as low cost and reduced human interference, for which it has been suggested and adopted in restricted circumstances as an enhanced alternative to the ballot system. However, E-voting solutions have not been widely adopted, due to concerns about security, transparency, distributed permissions, data integrity, confidentiality, and compliance [1]. India is one of the countries that have used electronic voting machines (EVMs) in elections since 1999. EVMs in India are unquestionably less expensive than other forms of electronic voting, and, to a large extent, votes cast via an EVM are presumed to be tamperproof. The obvious problems with these systems, however, are the reliance on an authority to oversee the voting process and the allegations of political-party influence. Other issues related to the voting process in India's electoral system include a lack of transparency, false voter identification, a propensity for political manipulation in remote locations, and delays in declaring the results. All of these identified problems can be resolved by replacing the current voting options with a blockchain-based electronic system in which the voting system works at the user level and records votes through a distributed web or mobile interface [2]. Blockchain is an evolving technology with solid cryptographic foundations that offers various capabilities for achieving a robust and secure environment. Many authors [3–16] have proposed blockchain-based voting systems, which are analyzed here to develop the novel model. This paper discusses a decentralized voting system utilizing blockchain: a stable and robust system that guarantees voter privacy, transparency, and robust operation. We implement our application using the Ethereum blockchain, an open-source distributed computing platform with a Turing-complete scripting language in which software developers can implement decentralized applications (DApps) and benefit from the inherited distributed ownership of blockchain technology [17, 18]. The characteristics of a blockchain-based E-voting system are as follows [19]:
1. It is open, distributed, and decentralized, and can collect votes from citizens using a variety of electronic devices and computers.
2. The scheme enables individual voters to audit and validate their ballots at a low cost.
3. The voting database is handled autonomously and employs a distributed timestamp server on a peer-to-peer network.
4. Voting on the blockchain is a process in which voters' concerns about data protection are negligible, removing the problem of indefinite reproducibility from E-voting.

The main contributions of the article are as follows:
• Existing systems suitable for voting events are analyzed and compared.
• An Ethereum-based voting system is proposed which offers strong authentication through face recognition.
• The proposed architecture is implemented and tested on a local private blockchain.
Section 2 gives an overview of blockchain technology, its characteristics, and its use cases, and Sect. 3 surveys existing voting mechanisms. The novel decentralized Ethereum-based E-voting application (MYVOTE) and its implementation details, along with the architecture and flow of the system, are covered in Sect. 4. At the end of the article, the conclusion, discussion, and future directions are covered.
2 Background Theory

2.1 Blockchain Technology as a Distributed Ledger Technology, Its Working, and Use Cases

A distributed ledger technology (DLT) is a system that allows all network users to share an ever-growing, chronologically ordered list of cryptographically signed, immutable transactional records. Any participant with the appropriate access permissions may trace a transaction back to any entity in the network at any point in its history. Transactions are stored by the technology in a decentralized manner; value-exchange transactions are carried out directly between connected peers and are confirmed consensually across the network using algorithms [20]. A blockchain is a peer-to-peer network of nodes that share all of the network's data and code: a node connected to the blockchain network can communicate with all the other nodes of the peer-to-peer network to exchange data and decisions. Blockchain establishes trust among unknown peers by using various consensus mechanisms. The blockchain revolution began in 2009 with Bitcoin, the first implementation of blockchain technology [21]. Blockchain has the following two main properties:
Fig. 1 Structure of the blockchain [23]
1. Immutability: Any new block added to the ledger refers to the preceding block through hashing, which establishes trust by preventing the manipulation of previous records.
2. Verifiability: The distributed, decentralized, and replicated nature of the ledger, together with the consensus-agreed version held by all nodes, ensures a high degree of verifiability, including verification by intermediaries [22] (Fig. 1).
Blockchain uses a hashing technique to map each block's data to a unique identifier and to connect the current block to the preceding one. The hash of the preceding block is stored in the succeeding block, generating the hash-based chain known as the blockchain. As an immutable database, blockchain maintains the same copy of any stored data across a network of nodes, where a node is merely a connected device or server. When a node joins, the data copy of the blockchain is implicitly imported. Moreover, since the system is decentralized, there is no centralized vote monitoring or administration authority. Blockchain has grown in popularity over the last decade as a result of Bitcoin; however, blockchain applications have now moved far beyond cryptocurrency. Blockchain technology is quickly becoming a turning point in a variety of industries — finance, supply chain, Internet of Things, cyber security, logistics, real estate, education, etc. — as shown in Fig. 2 [24–28].
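The hash-linking idea is easy to see in a few lines of Python; the following toy chain (an illustration, not part of MYVOTE) shows how tampering with any block invalidates every later prev_hash link:

```python
import hashlib
import json
import time

def block_hash(block: dict) -> str:
    # Deterministic hash of the block's contents (keys sorted)
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

def new_block(data: str, prev_hash: str) -> dict:
    return {"timestamp": time.time(), "data": data, "prev_hash": prev_hash}

chain = [new_block("genesis", "0" * 64)]
for vote in ["vote:A", "vote:B", "vote:A"]:
    chain.append(new_block(vote, block_hash(chain[-1])))

chain[1]["data"] = "vote:C"  # tamper with an early block
valid = all(chain[i]["prev_hash"] == block_hash(chain[i - 1])
            for i in range(1, len(chain)))
print("chain valid:", valid)  # False: the tampering is detected
```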
3 Survey on Existing Mechanisms of Voting

3.1 Paper Ballots

The voting system has a long history, beginning with paper ballots, moving on to E-voting, and eventually to I-voting. The paper ballot, also known as the Australian ballot or secret ballot, was the first such voting method, adopted by South Australia and Victoria in 1856.
Fig. 2 Use cases of blockchain technology
The idea behind paper ballots was to enable voters to mark their vote using writing tools such as a pen or pencil, with counting done by hand or by optical scanner. Even after all these years, paper ballots are still regarded as one of the most "trustworthy" voting processes. By marking the paper, the elector records his or her vote.
3.2 Electronic Voting Machine

Current studies report that more than 30 countries have used electronic voting systems, with 20 using electronic voting machines. Dials, touchscreens, and buttons are the three different forms of EVM. All of these options store the vote in computer memory, which allows voting and counting to proceed in parallel by recording votes at the time they are cast. On the EVM, the elector votes by pressing the button of a candidate. Electronic voting machines are more expensive than paper ballots, which involve only printing, paper, and similar costs; they are, however, easier for disabled persons to use.
Table 1 Conventional voting system vs. online E-voting using blockchain

| Conventional voting system | Online E-voting using blockchain |
|---|---|
| Slower process and long queues during elections | Faster and saves time |
| Hackable system | More efficient and immutable |
| Involves a lot of paperwork, hence less eco-friendly and time-consuming | A transparent system |
| High cost of expenditure on elections | Low cost |
| Does not require a network, so it works at any time | Does not work when there are network issues |
3.3 I-voting

The United States was the first nation to use I-voting, starting in 2000. Following that, 14 countries used the I-voting scheme, with ten of them agreeing to use it in the future. I-voting may take place in a variety of ways, including online platforms, email, fax, and other forms of remote electronic communication, which is where the term "remote E-voting" comes from; I-voting is an extension of E-voting. The voting process includes the following [29]:
• Voters use electronic devices like desktops and laptops that can connect to the Internet to access the voting system.
• The user interface (UI) is used by the elector to cast his or her vote.
• The voter must complete an identity verification process.
Additionally, I-voting has been implemented in Estonia, Norway, New South Wales, and Washington D.C. The Estonian I-voting system's code is closed-source, which raises issues related to transparency [23]. Table 1 compares the conventional voting system with online voting using blockchain.
3.4 Existing Decentralized E-voting Systems

Blockchain technology is an indispensable means to address I-voting constraints, as its nature resembles an indisputable, immutable, and distributed ledger. Key technical characteristics of blockchain are as follows:
• Built-in decentralization: a P2P network in which every node holds an identical copy of the blockchain (data), so there is no single point of failure.
• When additional data (a so-called block) is created, the new block is linked to the preceding block, establishing an immutable chain that secures the data from manipulation.
• An attacker would need control over more than half of the network's nodes (51%) to subvert the system, which makes it exceptionally secure (the majority wins). It is also difficult to initiate a DDoS attack on many network nodes at the same time [29].
Table 2 compares many existing blockchain-based voting systems [3–7, 30] along with their limitations.
4 Proposed System/Scheme

In this section, a private Ethereum-based blockchain architecture is proposed to address the various security needs of electronic voting [31]: an online voting tool based on the Ethereum blockchain that provides poll results instantly after the voting session and is safe against data manipulation. Because online voting may be done from anywhere in the world, it is critical to authenticate and authorize the user. We employed two-factor authentication for this, the first factor being face recognition and the second OTP verification. Finally, the project addressed the flaws in the present EVM system and used blockchain technology to fix them. The fundamental objectives of this application are to prevent vote rigging and to implement a time-limit mechanism. Anyone who uses this system will find it simple to vote.
4.1 System Architecture

The registration process is the first step in our system; verifying a voter is essential to establishing security within the system. Each user registers with a unique ID along with a picture, and only after users have been verified using face detection is it cross-checked whether they are eligible to vote. After that, the voter is provided a one-time password (OTP) that he or she can use to vote. Voting then proceeds using the voter's crypto wallet keys (Ethereum); all transactions are recorded by the smart contracts. On voting day, the voter visits the online MYVOTE polling booth, undergoes the verification process, and then casts a vote using his or her blockchain address [32]. Every node generates a private and public key pair before the voting process begins, and every node's public key is made available to all peers participating in the procedure. During the election, each node shares the voting results with its peers. When the process completes, each node waits for its turn to build a block; when its turn arrives, following the turn rules of blockchain formation, it generates the block, signs it with a digital signature, and sends it to every peer to avoid crashes and ensure that all nodes stay connected to the blockchain [32]. The system architecture is given in Fig. 3.
Table 2 Summarization of various blockchain-based existing voting systems

| Refs. | Highlights and strategy of proposed system | Limitations of the system | Year |
|---|---|---|---|
| [5] | A robust decentralized system architecture is proposed which contains an administrator node and local nodes to manage large-scale voting events. Public-key cryptography is used to offer security. One blockchain is utilized for information on the voting token before the vote, and another blockchain is utilized for voting | Using Hyperledger Sawtooth could further boost the strong, scalable, and secure system proposed | 2017 |
| [6] | A two-factor authentication system based on Hyperledger Fabric is proposed to offer high-level authentication in a private blockchain | Time-based tokens are communicated in plain text mode and can be captured by initiating a man-in-the-middle attack | 2018 |
| [6] | Blockchain-based 2FA for devices in an IoT network: the linked device is checked in the neighboring node, then a verification code is sent to enable the device. The blockchain stores device associations and only allows access to the linked device, helping prevent malicious attacks even when tokens are stolen by an attacker | High potential of first-factor authentication failure; also, the tokens used are still insecure, as attackers can access them even after hostile devices are used to prevent assaults | 2018 |
| [30] | To circumvent various types of attacks, such as MITM and third-party attacks, a 2FA-based approach is proposed which shares an encrypted OTP over SMS using a smart contract | Because tokens are communicated over SMS, attackers can easily initiate various attacks if the handset is compromised or lost | 2019 |
| [4] | Elliptic curve cryptography is used to authenticate the votes using digital signatures | Despite a complex mechanism for verifying vote blocks, the public key cryptosystem used is vulnerable, which may lead to security compromise | 2019 |
| [3] | A more secure server authentication mechanism built as 2FA over a decentralized blockchain platform; no trusted third party is needed for authentication between a requesting party (user) and a verifier (server). A private blockchain eliminates the need for high expenses and memory | The system tokens used do not match the notion of tokens that can be used only once; also, because nodes rely on each other to fetch the latest smart-contract changes, network downtime might obstruct SSH access | 2019 |
| [7] | Defines safe, useful, and scalable storage of votes in the form of assets. A multi-chain blockchain network is employed, limiting each voter to a single transaction. A trusted third party (TTP) verifies the voter's authenticity using a secret message given by the voter | The approach requires more time, since the TTP must check each voter's secret message with the election commission before issuing the reference number used to view the candidates and cast a vote | 2019 |

Fig. 3 System architecture of the proposed work
4.2 Methodology

The technologies used for implementing the client-side systems are the Django framework (Python), HTML, CSS, JavaScript, and Node.js. The smart contracts, written in the Solidity programming language, are tested using the Truffle web framework; Ganache is used to deploy and test the smart contracts, and MetaMask is used to handle the voters' accounts. The entire system is divided into two subsystems: a registration system and a voting system.
1. Registration System: A voter registration system is developed using the HTML/CSS Bootstrap framework. Voter information is entered before the voting procedure starts, with an authenticate-yourself feature based on face recognition for voters.
2. Voting System/Cast Vote: Before casting the vote, the voter's information is validated. A voter logs in to a crypto wallet and uses Ethereum cryptocurrency in MetaMask to cast a vote; the voter validates via an OTP that is sent to the user. The candidates' names are recorded in the smart contract, which implements the actual logic of the entire voting system.
A transaction is the name given to each and every change performed on a blockchain: the external world interacts with the Ethereum network through transactions, and whenever we want to change or update the state of the Ethereum network, we use a transaction. A transaction fee or service charge is required for each transaction. A native currency, "ether", circulates within the Ethereum network and is most commonly used to pay this transaction fee, also called the gas fee. Ganache-CLI is used in this project; Ganache is a part of the Truffle environment which provides a local private blockchain by offering 10 accounts preloaded with test ether. MetaMask is a browser-based Ethereum wallet (essentially a Chrome extension) which allows you to execute Ethereum dApps without having to run a full Ethereum node in your browser; it stores private keys using the browser's data store. In MetaMask, a basic transaction requires at least 21,000 gas.

After login, the voter must provide an Ethereum blockchain address and an email ID. An OTP is sent to that email address for verification, along with a link for continuing to vote. Clicking the link transfers the voter directly to the voting page, which shows the list of candidate names and vote counts. The voter selects one candidate and enters the OTP that was sent by mail. Clicking the vote button verifies the OTP; if it is correct, a MetaMask transaction pop-up box appears, which must be confirmed to complete the voting process. After the process completes, the result of the voting is shown. The main security factor is that each voter has a single personal Ethereum address linked with their verification data, which can be used only once in voting. Authentication is a way of safeguarding an account by confirming the user's identity through the use of an email address and a password.
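To make the transaction flow concrete, the following is a minimal web3.py sketch of casting a vote against a contract deployed on a local Ganache node. The contract address, the ABI fragment, and the `vote(uint256)` function are assumptions for illustration; a real deployment would substitute its own values.

```python
from web3 import Web3

# Connect to the local Ganache RPC endpoint (default port 8545).
w3 = Web3(Web3.HTTPProvider("http://127.0.0.1:8545"))
assert w3.is_connected()

# Hypothetical ABI fragment for a vote(uint256) function on the contract.
VOTE_ABI = [{"name": "vote", "type": "function", "stateMutability": "nonpayable",
             "inputs": [{"name": "candidateId", "type": "uint256"}], "outputs": []}]
CONTRACT_ADDRESS = "0x0000000000000000000000000000000000000000"  # replace with deployed address

contract = w3.eth.contract(address=CONTRACT_ADDRESS, abi=VOTE_ABI)
voter = w3.eth.accounts[0]   # one of Ganache's ten preloaded accounts

# Every state change is a transaction; Ganache deducts the gas fee in test ether.
tx_hash = contract.functions.vote(0).transact({"from": voter})
receipt = w3.eth.wait_for_transaction_receipt(tx_hash)
print("vote mined in block", receipt.blockNumber)
```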
Two-factor authentication combines a first and a second factor of authentication. General authentication entails entering an email address or a username and a password; two-factor authentication, on the other hand, requires the user to enter extra details. This additional information can be provided by tokens or one-time passwords (OTPs). Two-factor authentication schemes that rely on third-party services to produce tokens or OTPs are still insecure, since tokens may be stolen via MITM attacks and the generated tokens have the same value. As a result, we propose a two-factor authentication architecture based on the Ethereum blockchain, with a decentralized application as the token creation method. In our application, face recognition is used as the primary authentication; at the secondary stage, the application sends a token or OTP to the user's registered email, which the user then sends to the smart contract using their Ethereum account. The voting system uses state-of-the-art technologies for face recognition, such as OpenCV and Django. It uses AI algorithms to search for faces inside an image and match them. If no face is matched, the system indicates that the user is not registered and must register first; if the face is matched, the page for casting the vote is displayed. If the OTP then matches, the user is granted access to the system to vote. The technology ensures a high degree of authentication.
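The paper names OpenCV and Django for the face recognition step; the sketch below shows only the detection-and-OTP skeleton of such a pipeline, assuming the stock Haar-cascade detector shipped with opencv-python. Matching a detected face against enrolled voters would require an additional recognizer on top of this.

```python
import secrets
import cv2

def face_present(image_path):
    """Return True if at least one face is detected in the captured photo."""
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    gray = cv2.cvtColor(cv2.imread(image_path), cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    return len(faces) > 0

def issue_otp():
    """Second factor: a 6-digit one-time password emailed to the voter."""
    return f"{secrets.randbelow(10**6):06d}"

if face_present("captured_voter.jpg"):   # illustrative path
    otp = issue_otp()   # would be emailed, then checked before the vote is accepted
else:
    print("User is not registered; please register first.")
```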
4.3 Results and Outcomes

Interface Design. Figure 5 shows the first step, the verification page, where the voter captures a photo and enters a name; the system verifies this and validates the user's data. Figure 6 shows the interface for authorized voters, where the user provides a registered email address and a unique Ethereum blockchain address; the system sends the OTP to the email address entered here. Table 3 summarizes the characteristics of the proposed blockchain-based e-voting system with respect to time factor, efficiency, transparency, immutability, authentication, etc.
5 Conclusion and Discussion

A blockchain-based electronic voting system is proposed which uses the Ethereum blockchain and smart contracts to offer a secure and cost-effective election system while keeping the voter's identity secret. Any organization may use this system to organize
Fig. 4 System architecture diagram
Fig. 5 Interface design—verifying face and username
Fig. 6 Interface design—request OTP and voting link interface
a free and secure electronic vote using this Ethereum-based blockchain. The transparency of the blockchain allows for more election audits and better understanding; these characteristics are among the criteria for a voting system. For authentication, the face recognition approach is mainly beneficial in recognizing fraudulent voters, with the purpose of reducing the misuse of democratic counts. Voters can vote from anywhere after registering into the system on a local blockchain network through the web, and cast their ballot using a crypto wallet.
Table 3 Comparison of conventional voting system, E-voting, and proposed system

| Factors | Conventional voting system (paper based) | E-voting machine | Proposed model |
|---|---|---|---|
| Time factor | ✖ | ✓ | ✓ |
| Efficiency | ✖ | ✓ | ✓ |
| Trustworthy application | ✖ | ✖ | ✓ |
| Transparency | ✖ | ✖ | ✓ |
| Immutability and verifiability | ✖ | ✖ | ✓ |
| Authentication and privacy | ✖ | ✖ | ✓ |
| Tampering of votes | ✓ | ✓ | ✖ |
Although the whole system works on the web, it is immutable and tamperproof. Creating statistics and reports based on demographic properties such as gender, location, and age is planned as future work. Other future work includes an application more relevant to administration activities that uses the Aadhaar framework in conjunction with Aadhaar APIs. We anticipate that voters will make their decision via a secure device; even though our framework is secure, attackers could cast or change a vote using malicious software already installed on the voter's device. Local languages might be included, which would be extremely beneficial to people living in rural areas. A feedback mechanism that allows individuals to file complaints as well as reviews should also be added. The system has been developed and implemented, but it is still in the testing phase with respect to risks related to security and scalability.
References

1. O. Daramola, D. Thebus, Architecture-centric evaluation of blockchain-based smart contract e-voting for national elections. Informatics 7(2), 16 (2020)
2. U. Çabuk, E. Adiguzel, E. Karaarslan, A survey on feasibility and suitability of blockchain techniques for the E-voting systems. Int. J. Adv. Res. Comput. Commun. Eng. (IJARCCE) 7, 124–134 (2018). https://doi.org/10.17148/IJARCCE.2018.7324
3. V. Amrutiya, S. Jhamb, P. Priyadarshi, A. Bhatia, Trustless two-factor authentication using smart contracts in blockchains, in International Conference on Information Networking (ICOIN), Kuala Lumpur, Malaysia, 2019
4. H. Yi, Securing e-voting based on blockchain in P2P network. EURASIP J. Wirel. Commun. Netw. 2019(1), 137 (2019)
5. A. Barnes, C. Brake, T. Perry, Digital Voting with the use of Blockchain Technology (Plymouth University, 2017). Accessed 15 Dec 2016
6. L. Wu, X. Du, W. Wang, B. Lin, An out-of-band authentication scheme for internet of things using blockchain technology, in 2018 International Conference on Computing, Networking and Communications (ICNC), Maui, HI, 2018, pp. 769–773
7. R. Ganji, B.N. Yatish, Electronic Voting System Using Blockchain (2018)
8. S. Shukla, A.N. Thasmiya, D.O. Shashank, H.R. Mamatha, Online voting application using ethereum blockchain, in 2018 International Conference on Advances in Computing, Communications and Informatics (ICACCI) (IEEE, 2018), pp. 873–880
9. A. Benny, Blockchain based e-voting system. Available at SSRN 3648870 (2020)
10. M. Putri, P. Sukarno, A. Wardana, Two factor authentication frameworks based on ethereum blockchain with dApp as token generation system instead of third-party on web application. J. Ilmiah Teknologi Sistem Informasi (2020)
11. K. Garg, P. Saraswat, S. Bisht, S.K. Aggarwal, S.K. Kothuri, S. Gupta, A comparitive analysis on e-voting system using blockchain, in 2019 4th International Conference on Internet of Things: Smart Innovation and Usages (IoT-SIU) (IEEE, 2019), pp. 1–4
12. V. Chandra, K. Geetha Poornima, M. Rajeshwari, K. Krishna Prasad, A Conceptual Framework for the Integrated, Smart and Secure Remote Public Voting System (SSRPVS) (2020)
13. S. Anjan, J.P. Sequeira, Blockchain based e-voting system for india using UIDAI's Aadhaar. J. Comput. Sci. Eng. Softw. Test. 5(3), 26–32 (2019). https://doi.org/10.5281/zenodo.3428327
14. N. Kshetri, J. Voas, Blockchain-enabled e-voting. IEEE Softw. 35(4), 95–99 (2018). https://doi.org/10.1109/MS.2018.2801546
15. S. Ghule, M. Bhondave, P. Mishra, V. Survase, A. Kulkarni, Smart E-voting system with face recognition by blockchain technology. Int. J. Adv. Res. Comput. Commun. Eng. 9(8) (2020)
16. M. Razu Ahmed, F.M. Javed Mehedi Shamrat, M. Asraf Ali, M. Rajib Mia, M. Arifa Khatun, The future of electronic voting system using blockchain. Int. J. Sci. Technol. Res. 9(02) (2020)
17. K.M. Khan, J. Arshad, M.M. Khan, Secure digital voting system based on blockchain technology. Int. J. Electron. Govern. Res. (IJEGR) 14(1), 53–62 (2018)
18. M. Bartoletti, S. Carta, T. Cimoli, R. Saia, Dissecting Ponzi schemes on Ethereum: identification, analysis, and impact. Futur. Gener. Comput. Syst. 102, 259–277 (2020)
19. S.K. Vivek, R.S. Yashank, Y. Prashanth, N. Yashas, M. Namratha, E-voting systems using blockchain: an exploratory literature survey, in 2020 Second International Conference on Inventive Research in Computing Applications (ICIRCA). https://doi.org/10.1109/icirca48905.2020.9183
20. F.Þ. Hjálmarsson, G.K. Hreiðarsson, M. Hamdaqa, G. Hjálmtýsson, Blockchain-based e-voting system, in 2018 IEEE 11th International Conference on Cloud Computing (CLOUD) (IEEE, 2018), pp. 983–986
21. B. Marr, A Very Brief History of Blockchain Technology Everyone Should Read. Retrieved 26 Aug 2018, from https://www.forbes.com/sites/bernardmarr/2018/02/16/a-very-brief-history-of-blockchain-technology-everyone-should-read/#5a98f88f7bc4
22. S. Chaithra, J.K. Hima, R. Amaresh, Electronic voting system using blockchain. Int. Res. J. Eng. Technol. (IRJET) 07(07)
23. A.B. Ayed, A conceptual secure blockchain-based electronic voting system. Int. J. Netw. Secur. Appl. 9(3), 01–09 (2017)
24. S. Smys, H. Wang, Security enhancement in smart vehicle using blockchain-based architectural framework. J. Artif. Intell. 3(02), 90–100 (2021)
25. C.V. Joe, J.S. Raj, Deniable authentication encryption for privacy protection using blockchain. J. Artif. Intell. Capsule Netw. 3(3), 259–271 (2021)
26. D. Sivaganesan, Performance estimation of sustainable smart farming with blockchain technology. IRO J. Sustain. Wirel. Syst. 3(2), 97–106 (2021)
27. M.H. Bohara, K. Patel, A. Saiyed, A. Ganatra, Adversarial artificial intelligence assistance for secure 5G-enabled IoT, in Blockchain for 5G-Enabled IoT (Springer, Cham, 2021), pp. 323–350
28. R. Patel, A. Ganatra, K. Patel, Security and privacy control in 5G-enabled healthcare using blockchain, in Blockchain for 5G Healthcare Applications: Security and Privacy Solutions (Healthcare Technologies, 2021), Chap. 17, pp. 481–506. https://doi.org/10.1049/PBHE035E_ch17. IET Digital Library, https://digital-library.theiet.org/content/books/10.1049/pbhe035e_ch17
29. O.K. Yi, D. Das, Block chain technology for electronic voting. J. Crit. Rev. 7(3), 2019 (2020)
30. E. Alharbi, D. Alghazzawi, Two factor authentication framework using OTP-SMS based on blockchain. Trans. Mach. Learn. Artif. Intell. 7(3), 17–27 (2019)
31. C.-H. Roh, I.-Y. Lee, A study on electronic voting system using private blockchain. J. Inf. Process. Syst. 16(2), 421–434 (2020)
32. W.-S. Park, D.-Y. Hwang, K.-H. Kim, A TOTP-based two factor authentication scheme for hyperledger fabric blockchain, in Tenth International Conference on Ubiquitous and Future Networks (ICUFN), Prague, Czech Republic, 2018
Electrical Simulation of Typical Organic Solar Cell by GPVDM Software Rohma Usmani, Malik Nasibullah, and Mohammed Asim
Abstract The use of solar energy in today's world is gaining momentum, and the various advantages of the organic solar cell have shifted attention somewhat from crystalline to organic solar cells. In this work, the popular solar simulation software GPVDM is used to study the effect of changing the active layer thickness of an organic solar cell. Parameters such as the open-circuit voltage, maximum power, efficiency, and fill factor have been obtained for different active layer thicknesses of the organic material P3HT/PCBM used in the study.

Keywords Organic photovoltaic · Efficiency · GPVDM software
1 Introduction

The urge for green energy has led to tremendous exposure in the area of solar energy. Solar energy provides fuel diversification, hence enhancing energy security; it also reduces the global risk of climate change. Photovoltaic research is gaining momentum due to global attraction towards alternative energy resources [1–3]. Whilst silicon-based SPV systems still play a major role, organic solar photovoltaic cells have newly emerged as a main research focus area due to their low weight, flexibility, environmental stability, and ease of manufacture [4, 5]. Optimizing both inorganic and organic solar cells is the need of the hour. The efficiency of OPVs is currently low, but research has raised the hope that OPVs have the inherent potential of reaching a PCE as high as 25%. During the last few years, great interest has arisen in the field of organic solar cells because of the above advantages.

The OPV device consists of a photovoltaic electron acceptor and electron donor hetero-junction sandwiched between transparent electrodes. The hetero-junction

R. Usmani · M. Nasibullah · M. Asim (B) Integral University, Lucknow, India e-mail: [email protected]; [email protected] M. Nasibullah e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_52
provides a large interfacial area for the splitting of excitons with low diffusion lengths (of the order of a few nm). Electron donors and acceptors can be chemically designed at the molecular level to maximize the absorbed solar irradiation. The energy offset between the HOMO–LUMO of the donor/acceptor provides the necessary force to overcome the binding energy of the exciton and assists in splitting the charges. The charges separated by the energetic offset are pulled towards their respective electrodes, thereby contributing to the flow of current in the circuit [6, 7].
2 Simulation Technique

2.1 GPVDM Software

The GPVDM simulation software has been utilized to study and analyse how to increase the efficiency and other performance parameters of the OPV cell. GPVDM solves Poisson's equation (3), the bipolar drift–diffusion equations (4, 5), and the carrier continuity equations (6, 7) [8, 9]. GPVDM's electrical model is a 1-D/2-D drift–diffusion model. It solves the Shockley–Read–Hall equations as a function of energy and position space for modelling effects such as mobility/recombination rates changing as a function of carrier population [2, 6, 10].
2.2 Modelling of Organic Solar Cell

The OPV comprises various layers which react with incident light to different degrees in electrical terms [11–13]. Charge transport, recombination, and trapping are the main effects to be studied; the refractive index is also studied over a wide spectrum, in the wavelength range 300–1000 nm. The following equations are used. The LUMO–HOMO energy levels are defined as

$E_{\text{LUMO}} = -\chi - q\varphi$  (1)

$E_{\text{HOMO}} = -\chi - E_G - q\varphi$  (2)

The internal potential distribution within the device is obtained by solving Poisson's equation, given by

$\frac{d}{dx}\left(\varepsilon_o \varepsilon_r \frac{\partial \varphi}{\partial x}\right) = q(n - p)$  (3)
where $\varphi$ is the electrostatic potential, $\varepsilon_o$ and $\varepsilon_r$ are the dielectric constants of free space and the material, $q$ is the electronic charge, and $n$ and $p$ denote the carrier densities of free electrons and holes. This equation provides information regarding the resultant charge distribution in the different layers of the OPV cell; in turn, it provides an estimate of the open-circuit voltage ($V_{oc}$) in organic solar cells. To account for carrier trapping and recombination via trap states, two Shockley–Read–Hall approaches are used. The first assumes that the trapped carrier distribution has reached equilibrium; it also assumes there are relatively few trapped charge carriers compared with the number of free carriers, so that the trapped charges do not significantly change the electrostatic potential. These assumptions are valid when the material is very ordered (e.g. GaAs) or, at a push, in steady state for some moderately disordered material systems. The second method assumes that trapped carriers have not reached equilibrium with the free carriers; GPVDM uses this non-equilibrium Shockley–Read–Hall model for solving potentials and current densities in organic solar cells. The bipolar drift–diffusion equations are solved in position space to describe the charge carrier transport, i.e. for electrons

$J_n = q\mu_c n \frac{\partial(-\chi - q\varphi)}{\partial x} + q D_n \frac{\partial n}{\partial x}$  (4)

and for holes

$J_p = q\mu_n p \frac{\partial(-\chi - E_g - q\varphi)}{\partial x} + q D_p \frac{\partial p}{\partial x}$  (5)
Here, $J_n$ and $J_p$ are the current densities of electrons and holes respectively, $\mu_c$ and $\mu_n$ are the electron and hole mobilities, and $D_n$ and $D_p$ are the diffusion coefficients of free electrons and holes. These equations are applicable when incident light is absent; in that case, generation and recombination of new charge carriers do not occur. Even then, some charge carriers are intrinsically generated due to the existing electric field in the device and due to the differential distribution of charge carrier density in the bulk hetero-junction comprising donor and acceptor atoms. This current is called the dark current; it relates the current density obtained from the solar cell to the conduction gap and the free charge carrier density in the absence of incident light. Conservation of charge carriers is enforced by solving the charge carrier continuity equations for both electrons and holes, given by Eqs. (6) and (7):

$\frac{\partial J_n}{\partial x} = q\left(R_n - G + \frac{\partial n}{\partial t}\right)$  (6)

$\frac{\partial J_p}{\partial x} = -q\left(R_p - G + \frac{\partial p}{\partial t}\right)$  (7)
where G and R are the net generation and recombination rates per unit volume, respectively. This equation describes the balance between the current due to newly generated charge carriers and the already existing charge carrier densities of electrons and holes. It is applicable when incident light is present; in such cases, generation G and recombination R of free charges occur. The conservation of charge carriers is achieved by solving the charge carrier continuity equation for both electrons and holes. The nomenclature is summarized in Table 1.

Table 1 Nomenclature

| Symbol | Detail |
|---|---|
| $\varepsilon_o$ | Permittivity of free space |
| $\varepsilon_r$ | Relative permittivity |
| $\varphi$ | Voltage profile |
| $q$ | Elementary charge on an electron |
| $n$ | Free electron concentration |
| $p$ | Free hole concentration |
| $E_c$ | Free electron mobility edge |
| $E_v$ | Free hole mobility edge |
| $D_n$ | Electron diffusion coefficient |
| $D_p$ | Hole diffusion coefficient |
| $R_n$ | Net recombination rate for electrons |
| $R_p$ | Net recombination rate for holes |
| $G$ | Generation rate for holes and electrons |
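As a concrete illustration of how an equation like (3) is discretized in such a solver, the following is a minimal 1-D finite-difference sketch (uniform mesh, constant permittivity, zero-potential contacts); it is not GPVDM's own implementation, and the carrier densities used in the example are illustrative.

```python
import numpy as np

Q = 1.602e-19      # elementary charge (C)
EPS0 = 8.854e-12   # permittivity of free space (F/m)

def solve_poisson(n, p, eps_r, dx):
    """Solve d/dx(eps0*eps_r*dphi/dx) = q(n - p) with phi = 0 at both contacts."""
    N = len(n)
    rhs = Q * (n - p) * dx**2 / (EPS0 * eps_r)
    # Standard tridiagonal Laplacian; Dirichlet boundaries are implicit zeros.
    lap = (np.diag(-2.0 * np.ones(N)) + np.diag(np.ones(N - 1), 1)
           + np.diag(np.ones(N - 1), -1))
    return np.linalg.solve(lap, rhs)

# Example: the 2.1e-7 m active layer on a 100-point mesh, with an
# (illustrative) excess electron density near one contact.
N = 100
dx = 2.1e-7 / (N - 1)
n = np.full(N, 1.0e21); n[:10] += 5.0e20   # electron density (m^-3)
p = np.full(N, 1.0e21)                     # hole density (m^-3)
phi = solve_poisson(n, p, eps_r=3.8, dx=dx)
```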
2.3 Electrical Simulation of the Organic Solar Cell on Varying Thickness of Active Layer

The variation of $V_{oc}$, $J_{sc}$, $V_{mpp}$, $J_{mpp}$, fill factor, maximum power, and power conversion efficiency with a change in active layer thickness was measured and is shown in Table 2. The maximum power of 45.738 W/m² and the maximum efficiency of 4.57% are obtained at an active layer thickness of 2.1 × 10⁻⁷ m.
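The Table 2 quantities follow from a simulated J-V curve through the standard definitions FF = P_max/(V_oc · J_sc) and PCE = P_max/P_in. A small helper, assuming the 1000 W/m² AM1.5 input power, is sketched below.

```python
import numpy as np

def cell_metrics(v, j, p_in=1000.0):
    """v in volts (increasing), j in A/m^2 (negative under illumination)."""
    power = -v * j                    # power density delivered by the cell, W/m^2
    k = np.argmax(power)
    v_mpp, j_mpp, p_max = v[k], j[k], power[k]
    j_sc = -np.interp(0.0, v, j)      # short-circuit current density at V = 0
    v_oc = np.interp(0.0, j, v)       # open-circuit voltage where J crosses zero
    ff = p_max / (v_oc * j_sc)
    pce = 100.0 * p_max / p_in        # percent, for input power p_in (AM1.5)
    return dict(Voc=v_oc, Jsc=j_sc, Vmpp=v_mpp, Jmpp=j_mpp, FF=ff, PCE=pce)
```

Applied to the 2.1 × 10⁻⁷ m row of Table 2, these definitions reproduce P_max ≈ 45.7 W/m², FF ≈ 0.67–0.68, and PCE ≈ 4.57%.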
3 Simulation Results

As per the above observations, the optimum thickness is 2.1 × 10⁻⁷ m. The GPVDM software calculates the different parameters of the organic solar cell at this thickness; they are listed below in Table 3. Apart from this, for the optimized thickness, various other curves are plotted for analysis of the solar cell.
Table 2 Variation of parameters of organic solar cells for different active layer thickness

| Thickness of active layer (m) | V_oc (V) | J_sc (A/m²) | V_mpp (V) | J_mpp (A/m²) | F.F. (a.u.) | Max. power (W/m²) | Power conversion efficiency (%) |
|---|---|---|---|---|---|---|---|
| 1.0 × 10⁻⁷ | 0.6097 | −7.2239 × 10² | 5.1234 × 10⁻¹ | −6.5385 × 10¹ | 0.76 | 33.500 | 3.35 |
| 2.0 × 10⁻⁷ | 0.6047 | −1.0793 × 10² | 4.8927 × 10⁻¹ | −9.1641 × 10¹ | 0.68 | 44.823 | 4.48 |
| 2.1 × 10⁻⁷ | 0.6045 | −1.1141 × 10² | 4.5857 × 10⁻¹ | −9.7611 × 10¹ | 0.67 | 45.738 | 4.57 |
| 2.2 × 10⁻⁷ | 0.6036 | −1.1227 × 10² | 4.6862 × 10⁻¹ | −9.7253 × 10¹ | 0.67 | 45.575 | 4.55 |
| 2.3 × 10⁻⁷ | 0.6029 | −1.1349 × 10² | 4.6863 × 10⁻¹ | −9.7129 × 10¹ | 0.66 | 45.518 | 4.55 |
| 2.4 × 10⁻⁷ | 0.6021 | −1.1423 × 10² | 4.6869 × 10⁻¹ | −9.6611 × 10¹ | 0.65 | 45.281 | 4.52 |
| 2.5 × 10⁻⁷ | 0.6008 | −1.1304 × 10² | 4.6897 × 10⁻¹ | −9.4234 × 10¹ | 0.65 | 44.193 | 4.41 |
| 3.0 × 10⁻⁷ | 0.5960 | −1.1404 × 10² | 4.4909 × 10⁻¹ | −9.3190 × 10¹ | 0.61 | 41.851 | 4.18 |
| 4.0 × 10⁻⁷ | 0.5934 | −1.2368 × 10² | 4.4898 × 10⁻¹ | −9.4109 × 10¹ | 0.57 | 42.254 | 4.22 |
| 5.0 × 10⁻⁷ | 0.5905 | −1.2481 × 10² | 4.2889 × 10⁻¹ | −9.4926 × 10¹ | 0.55 | 40.713 | 4.07 |
| 6.0 × 10⁻⁷ | 0.5894 | −1.2741 × 10² | 4.0849 × 10⁻¹ | −9.8349 × 10¹ | 0.53 | 40.175 | 4.01 |
| 7.0 × 10⁻⁷ | 0.5871 | −1.2365 × 10² | 4.0920 × 10⁻¹ | −9.2274 × 10¹ | 0.51 | 37.748 | 3.77 |
Table 3 Calculated parameters of organic solar cell at optimum active layer thickness

| Parameter | Value |
|---|---|
| Trap density of electron | 3.8000 × 10²⁶ m⁻³ eV⁻¹ |
| Trap density of hole | 1.4500 × 10²⁵ m⁻³ eV⁻¹ |
| Tail slope (electron) | 0.04 eV |
| Tail slope (hole) | 0.06 eV |
| Mobility of electron | 2.48 × 10⁻⁷ m² V⁻¹ s⁻¹ |
| Mobility of hole | 2.48 × 10⁻⁷ m² V⁻¹ s⁻¹ |
| Relative permittivity | 3.8 |
| No. of traps | 20 |
The I-V characteristic of the organic solar cell is shown in Fig. 1; it is seen that the current increases once the voltage rises beyond 0.6 V. The J-V characteristic is shown in Fig. 2, where the current density shows the same behaviour beyond 0.6 V. Figures 3, 4 and 5 show, respectively, the electron–hole pair recombination prefactor with the voltage applied across the device, the variation of charge density with voltage across the device, and the electron–hole contribution with voltage across the device.
Fig. 1 I-V characteristics of the organic solar cell
Fig. 2 J-V characteristics of the organic solar cell
Fig. 3 Electron–hole pair recombination prefactor with voltage across the device
Fig. 4 Variation in charge density with voltage across the device
Fig. 5 Electron–hole contribution with voltage across the device
4 Conclusion

Organic solar photovoltaic cells have emerged as a main research focus area due to their low weight, low cost, flexibility, environmental stability, and ease of manufacture. At the moment, the power conversion efficiency is low, but immense scope for research exists due to the OPV's inherent potential, which has raised the hope of achieving higher performance characteristics. In the present work, study and experimentation have been carried out to improve the efficiency and other performance-related parameters, including identification of the major performance parameters of the organic solar cell. The relationship of chemical properties of the OSC, such as the HOMO/LUMO energy levels, active layer thickness, and charge density, with the performance parameters has also been studied.
Acknowledgements Manuscript communication number IU/R&D/2022-MCN0001386 was provided by Integral University. The authors would like to thank the university for it.
References

1. J.D. Servaites, M.A. Ratner, T.J. Marks, Organic solar cells: a new look at traditional models. Energy Environ. Sci. 4(11), 4410–4422 (2011)
2. S. Sun, Z. Fan, Y. Wang, J. Haliburton, Organic solar cell optimizations. J. Mater. Sci. 40(6), 1429–1443 (2005)
3. M. Shahabuddin, M. Asim, A. Sarwar, Parameter extraction of a solar PV cell using projectile search algorithm, in 2020 International Conference on Advances in Computing, Communication & Materials (ICACCM) (IEEE, 2020), pp. 357–361
4. Y. Li, T. Pullerits, M. Zhao, M. Sun, Theoretical characterization of the PC60BM: PDDTT model for an organic solar cell. J. Phys. Chem. C 115(44), 21865–21873 (2011)
5. M. Asim, Modelling and simulation of 5 parameter model of solar cell. Int. J. Electr. Electron. Comput. Syst. 4, 2–7 (2015)
6. F. Deschler, D. Riedel, B. Ecker, E. von Hauff, E. Da Como, R.C. MacKenzie, Increasing organic solar cell efficiency with polymer interlayers. Phys. Chem. Chem. Phys. 15(3), 764–769 (2013)
7. H.H.P. Gommans, D. Cheyns, T. Aernouts, C. Girotto, J. Poortmans, P. Heremans, Electro-optical study of subphthalocyanine in a bilayer organic solar cell. Adv. Func. Mater. 17(15), 2653–2658 (2007)
8. A.K. Mishra, R.K. Shukla, Electrical and optical simulation of typical perovskite solar cell by GPVDM software. Mater. Today: Proc.
9. H. Jin, C. Tao, M. Velusamy, M. Aljada, Y. Zhang, M. Hambsch, P. Meredith, Efficient, large area ITO-and-PEDOT-free organic solar cell sub-modules. Adv. Mater. 24(19), 2572–2577 (2012)
10. A. Wagenpfahl, C. Deibel, V. Dyakonov, Organic solar cell efficiencies under the aspect of reduced surface recombination velocities. IEEE J. Sel. Top. Quantum Electron. 16(6), 1759–1763 (2010)
11. D.H. Apaydın, D.E. Yıldız, A. Cirpan, L. Toppare, Optimizing the organic solar cell efficiency: role of the active layer thickness. Sol. Energy Mater. Sol. Cells 113, 100–105 (2013)
12. B. Ratier, J.M. Nunzi, M. Aldissi, T.M. Kraft, E. Buncel, Organic solar cell materials and active layer designs—improvements with carbon nanotubes: a review. Polym. Int. 61(3), 342–354 (2012)
13. K. Takahashi, N. Kuraya, T. Yamaguchi, T. Komura, K. Murata, Three-layer organic solar cell with high-power conversion efficiency of 3.5%. Solar Energy Mater. Solar Cells 61(4), 403–416 (2000)
Statistical Analysis of Blockchain Models from a Cloud Deployment Standpoint Himanshu V. Taiwade and Premchand B. Ambhore
Abstract Blockchain is quickly becoming one of the most useful data security standards for cloud computing. This is due to the fact that blockchain systems possess immutability, transparency, traceability, and distributed computing capabilities, which make them highly usable in cloud environments. Cloud deployments essentially consist of distributed virtual machines (VMs), thereby assisting blockchain implementation. Blockchains can be coupled with data privacy models like t-closeness, m-privacy, l-diversity, etc. to enhance their security performance. But due to the wide variety of algorithmic models available for both blockchains and privacy preservation, it is difficult for researchers and security experts to select the most optimum models and their combinations for efficient system security. Thus, this text initially reviews a wide variety of blockchain implementations and discusses their advantages, limitations, and future prospects. This is accompanied by a detailed discussion of data privacy models and their characteristics. Following these discussions, this text compares the models in terms of performance metrics including computational delay, security level, application, and scalability. This comparison will assist researchers and cloud security experts in identifying the best models and their combinations for their deployments. It is observed that blockchain and privacy preservation models, when combined with machine learning techniques like genetic optimization, neural networks, fuzzy rules, etc., outperform their counterparts. Furthermore, this text also recommends various proven fusion combinations which can be used to improve cloud security without compromising on quality of service (QoS) parameters.

Keywords Blockchain · Cloud computing · Privacy preservation · Deep learning · Blockchain models · Fuzzy · Security · Attacks · QoS · Models · Cloud deployment H. V. Taiwade (B) Department of Computer Science & Engineering, Priyadarshini College of Engineering, Nagpur, India e-mail: [email protected] P. B. Ambhore Department of Information Technology, Government College of Engineering, Amravati, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_53
1 Introduction

Due to the widespread use of cloud in consumer and industrial domains, its security analysis is one of the most researched areas in computer security. To secure clouds, a wide variety of models have been proposed by researchers, including cryptographic methods, hashing techniques, key exchange mechanisms, privacy models, traceability methods, etc. Integrating these models into the cloud requires additional computational steps, which reduces its QoS performance. In order to achieve high security along with good QoS performance, blockchain-based models are integrated into the cloud. One such model is depicted in Fig. 1, wherein the blockchain is integrated between the cloud automation layer and the data storage layer. Each blockchain layer design consists of ledgers, consensus algorithms, block design (a smart contract in this case), and cryptography mechanisms. A survey of these models, along with their performance analysis, advantages, limitations, and future scope(s), is presented in detail in the next section of this text. Each blockchain model has a custom-designed block structure, which consists of the following entities (a minimal sketch of such a block closes this section):

• Source of data
• Destination of data
• Timestamp at which the block is created
• Data and its relevant meta information
• Nonce value (a value which makes this block's hash unique)
• Hash of the current block

Fig. 1 Integration of blockchain with cloud

All this information is secured using cryptographic functions, which include both public- and private-key cryptosystems. Selection of a cryptosystem depends upon the type of blockchain being designed. It is observed that blockchain models do not have inherent privacy preservation methods to anonymize internal data fields. Hence, the next section also reviews various privacy preservation models, which assist in further improving the security performance of existing blockchain implementations. Thus, Sect. 2 covers a wide variety of security algorithms, discusses their characteristics, and recommends fusion models for strengthening system privacy and security. This is followed by a comparative analysis of the reviewed models and a discussion of their application use cases, which will assist researchers and security designers in identifying the best algorithmic combinations for their deployments. Finally, this text concludes with some interesting observations about the reviewed models and recommends methods to improve their performance.
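A minimal Python sketch of such a block is given below; the field names mirror the list above, and the proof-of-work style nonce search is only one illustrative way in which the nonce makes the block's hash unique.

```python
import hashlib, json, time

def block_hash(block):
    """SHA-256 over every field except the hash itself, in canonical order."""
    payload = json.dumps({k: block[k] for k in sorted(block) if k != "hash"})
    return hashlib.sha256(payload.encode()).hexdigest()

def mine(block, difficulty=4):
    """Vary the nonce until the hash starts with `difficulty` zero hex digits."""
    while True:
        block["hash"] = block_hash(block)
        if block["hash"].startswith("0" * difficulty):
            return block
        block["nonce"] += 1

block = mine({"source": "node-A", "destination": "node-B",
              "timestamp": time.time(), "data": "payload", "meta": {},
              "nonce": 0, "hash": ""})
print(block["nonce"], block["hash"])
```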
2 Literature Review

Cloud security is one of the most researched areas due to its large-scale applicability and real-time usability. A wide variety of security models have therefore been proposed for this task, and each finds its application at a different layer of the cloud. For instance, the work in [1] proposes the use of a consortium-based blockchain for a cloud-based electronic healthcare record (EHR) application. The proposed model uses proof of authorization (PoA) for blockchain mining and verification, which assists in strengthening authentication and access control capabilities. The overall model diagram for the proposed security architecture can be observed from Fig. 2, wherein the different system components, including the data owner, data provider, data requester, and the underlying blockchain database, are described. The model is capable of enforcing data confidentiality, access control, data integrity, authentication, secure encrypted search, privacy preservation, and collusion resistance. All this is possible due to the utilization of an immutable blockchain, which protects the system from a wide variety of attacks. The blockchain model utilizes three protocol layers, covering data generation, data storage, and data sharing, each of which assists in secure communication between the data owner and data access entities. The proposed scheme has better security than ciphertext-policy attribute-based signcryption (CPABS) and the secure-aware and privacy-preserving model (SAPP), better authentication than blockchain-based secure sharing (BSS) and blockchain-based privacy-preserving data sharing (BPDS), and better collusion resistance than the CPABS, SAPP, and BSS models. This improvement makes the PoA-based EHR model applicable to a wide variety of system deployments. But the PoA protocol requires complex calculations, which reduces the speed of the cloud service.
Fig. 2 Blockchain with privacy preservation for improved EHR security [1]
This affects large-scale deployment capabilities for the system, and thus must be considered during cloud design. This limitation can be removed via the use of light-weight privacy preservation models like ciphertext-policy attribute-based encryption (CP-ABE), as proposed in [2], wherein researchers have used discrete logarithms and the decisional bilinear Diffie–Hellman model for a reduced-complexity design. The model is currently designed for Internet of Vehicles (IoV)-based cloud systems but can be extended to other cloud deployments. Due to the use of the light-weight CP-ABE, there are finer security gaps in the model, which can be reduced via the use of Physical Unclonable Functions (PUFs) as described in [3]. The proposed PUFs also have low complexity and support multiple servers with privacy-aware authentication protocols. The model for the proposed architecture can be observed from Fig. 3, wherein challenge-response pairs (CRPs), the registration center (RC), and blockchain (BC) ledgers are depicted. Due to the use of PUFs and blockchain, the model is shown to have low delay and high efficiency with superior security when compared with non-blockchain-based methods. But the model has limited access control capabilities; this performance can be further tuned via integration of node-identity-based authorization as suggested in [4, 5]. Here, researchers have used smart contracts for request tracking and control. The proposed model is capable of low density, high throughput, access control, and security reinforcement, thereby making it useful for large-scale cloud deployments.
Fig. 3 Use of physically unclonable functions (PuFs) for light-weight multi-server privacy and security [3]
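The CRP-based flow of Fig. 3 can be illustrated with a toy stand-in for a PUF. Real PUFs derive a device-unique response from silicon manufacturing variability; the keyed hash below merely imitates that behaviour for demonstration.

```python
import hashlib, os, secrets

class SimulatedPUF:
    """Toy PUF: a per-device secret stands in for physical variability."""
    def __init__(self):
        self._fingerprint = os.urandom(32)

    def response(self, challenge: bytes) -> bytes:
        return hashlib.sha256(self._fingerprint + challenge).digest()

# Enrolment: the registration center (RC) stores challenge-response pairs (CRPs).
device = SimulatedPUF()
challenge = secrets.token_bytes(16)
crp = (challenge, device.response(challenge))

# Authentication: a server replays the stored challenge; only the genuine
# device can reproduce the response, so no key ever has to leave the chip.
assert device.response(crp[0]) == crp[1]
```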
Another highly efficient blockchain-based model for access control and authentication is proposed in [6], wherein researchers have used a Selective Ring-based Access Control (SRAC) model for improving system security. The SRAC model uses a hybrid computing layer, which incorporates blockchain management, distributed data storage, edge computing, access control, and authentication processes. The model supports inter-department access control, wherein records generated by users of one department can be used by another department via proper access permissions. These departments, also referred to as groups, are connected to a data processing module as observed from Fig. 4, wherein different modules for decision making, alert management, and automation are described. It is observed that the model is highly secure, has good privacy performance, is immutable, and offers high availability, traceability, anonymity, scalability, and user-level control. Due to these properties, the model is capable of deployment for high-security, large-scale applications. A survey of these applications can be observed from [7], wherein researchers have concluded that data centers, large-scale institutions, and corporate deployments are some of the major deployment areas for blockchain-based cloud systems. Furthermore, the Blockchain-as-a-Service (BaaS) model and searchable encryption models are observed to have better applicability in these scenarios, and thus can be used with high efficiency. Another interesting review is done in [8], wherein researchers have indicated various applications and challenges of blockchain deployments and compared various architectures for the cloud of things (CoT). They suggest that public blockchains are better for small-scale organizations or for cryptocurrency applications, while private and permissioned blockchains are more suited to large-scale organizations that require higher privacy, due to their granular control. Thus, applications like smart cities, smart energy systems, smart industries, etc. must always use permissioned blockchains, which require effective design of consensus protocols, along with efficient shared ledgers, shared contracts
Fig. 4 Group-based access control using selective rings [6]
and cryptography models. An example of such a model is proposed in [9, 10], where a deep blockchain framework for collaborative intrusion detection and prevention (IDP) is proposed. The proposed IDP model uses Bidirectional Long Short-Term Memory (BiLSTM) for processing sequential network data. The model also uses Ethereum-based smart contracts for distributed intrusion detection and prevention with large-scale access control. The model has better performance when compared with support vector machines (SVM), random forest (RF), Naïve Bayes (NB), and Mixture Localisation-based Outliers (MLO), thereby making it more deployable for both small-scale and large-scale applications. The model's privacy performance can be further extended via the application of privacy preservation as discussed in [11, 12], wherein a unique 20-byte Ethereum Address (EA) is assigned to every authorized node with almost no collisions, due to which the system is capable of reducing the probability of denial of service (DoS), modification, and man-in-the-middle attacks with high efficiency. Due to the non-collision-based address assignment, the system is able to reduce the delays needed for user addition, policy enforcement, and account deployment when compared with general-purpose Ethereum deployments. Another highly efficient privacy preservation model is proposed in [13], wherein researchers have proposed a privacy-preserving charging scheme for
electric vehicles. The scheme is deployed on fog computing nodes (FCNs) for the provision of better localized services using a trusted authority. Each vehicle charging cycle is controlled using the blockchain network, and information about the device and charging station is stored in an immutable and secure format for higher efficiency. This model is further extended in [14], wherein a decentralized public auditing process is integrated with privacy preservation, which assists in improving data transparency. Due to this, an improvement in the overall trust level of the system is observed, which assists in easier adoption for large-scale systems. Furthermore, this system uses a zero-knowledge protocol, which assists in information security between the cloud server (CS) and the auditors. The working model for this system can be observed from Fig. 5, wherein the data owner, key generator devices, auditor, etc. are observed to work in tandem to store data onto the blockchain. The model is observed to have better public auditing, identity-based encryption, privacy preservation, and resistance against challenge message guessing when compared with secure certificateless public verification (SCPV), blockchain-based public integrity verification (BPIV), decentralized big data auditing (DBDA), and compact proofs of retrievability (CPR) models. This model does not incorporate searchable encryption, due to which its applicability is limited, but it can be improved using the work in [15], wherein a blockchain-based searchable public-key encryption scheme with forward and backward privacy (BSPEFB) is proposed. The BSPEFB model uses processes to set up input parameters, generate key pairs, update the database, build trapdoors for searchable encryption, search the encrypted data, and decrypt the final results. The model is able to reduce search time by 15% when compared with normal search
Fig. 5 Decentralized public auditing process for trusted cloud deployment [14]
models and to reduce the probability of spoofing and spying attacks, thereby improving the overall security and quality of service (QoS) of the deployment. Similar models are proposed in [16–18], wherein researchers have used mobile-activated EHRs with a decentralized interplanetary file system (IPFS), secure fine-grained searchable encryption, and differential privacy for improving the security performance of blockchain deployments. These methods are essentially application-specific extensions of the models already discussed and assist in performance evaluation for different applications. The privacy preservation models discussed so far are highly complex due to their internal calculations, which limits them to moderate-to-high performance systems. In order to reduce this complexity, the work in [19] proposes a multi-authority attribute-based signature (MABS), which is a light-weight implementation of ABS. The model is capable of reducing computational overheads via online and offline signing with server-aided verification mechanisms. Due to this multi-tier approach, the model is capable of integrating unforgeability and anonymity into the system, thereby assisting in the removal of collusion attacks. The model is deployed for a medical telehealth application but can be extended to any kind of cloud use case. The performance of this model is further improved via the use of tucker train (TT) decomposition as discussed in [20], wherein gradient descent is used to reduce computational loads by reducing the number of elements to be updated during privacy calculations. Due to TT, the model is able to reduce encryption, decryption, and privacy preservation delays, thereby assisting in high-speed and high-privacy system design. The efficiency of this system can be further improved via the use of crowdsourcing mechanisms as discussed in [21], wherein researchers have implemented blockchain-based task matching (BPTM). The task matching model uses a key management algorithm to reduce the computational complexity on the server via effective requestor-to-worker mapping. Due to this task matching model, the BPTM system is capable of reducing the delay needed for blockchain mining and data communication, thereby making it useful for high-speed and high-security applications. This system can be applied to the work in [22], wherein the CP-ABE model is used for improving the efficiency of smart contracts; the highly complex ABE model used there needs optimization for large-scale deployments. This optimization can also be done using the work in [23], wherein a stateless cloud auditing scheme is proposed for privacy preservation using non-manager dynamic groups. In the proposed scheme, users and third-party auditors (TPAs) utilize stateless models which do not require maintenance of data index information during cloud auditing. This process can be observed from Fig. 6, wherein data flow between the TPA, group users (GU), cloud service providers (CSPs), the content provider (CP), and the key generation center (KGC) is depicted. Due to stateless auditing, the model is able to reduce the overall verification delay by 10% and the storage cost by 25% when compared with a stateful model, thereby assisting its use in real-time cloud deployments. Furthermore, the model is able to resist non-repudiation attacks, collusion attacks, and data- and identity-aware privacy attacks with high efficiency. But the model is unable to scale for larger numbers of nodes, where data dependency-based operations are needed.
Fig. 6 Stateless auditing scheme [23]
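Auditing schemes of this kind ultimately rest on hash-based integrity proofs that let a verifier check a record without holding the whole dataset. The Merkle-tree sketch below illustrates that underlying primitive only; it is not the specific protocol of [23].

```python
import hashlib

def h(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

def merkle_root(leaves):
    """Fold the hashed leaves pairwise up to a single root hash."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:                 # duplicate the last node on odd levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def verify(leaf, proof, root):
    """proof: list of (sibling_hash, sibling_is_left) pairs from leaf to root."""
    node = h(leaf)
    for sibling, is_left in proof:
        node = h(sibling + node) if is_left else h(node + sibling)
    return node == root
```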
Thus, the work in [24] can be used to overcome this drawback, wherein privacy-preserved task offloading is performed using deep reinforcement learning for mobile-based cloud deployments. This model offloads excessive requests to a mobile-based cloud network, wherein both IoT data transfer and task transfer processes are performed. Due to this, the model is capable of high-speed and low-complexity authentication and access control operations, thereby assisting in its deployment for large-scale networks. Due to mobile task offloading, the privacy of this model is compromised, which can be improved via the use of privacy-aware blockchains. Such a blockchain is discussed in [25], wherein the use of group signatures and broadcast encryption is proposed. The model is capable of enhancing the privacy of permissioned blockchains but can be applied to permissionless blockchains as well, with minor modifications in encryption and key exchange mechanisms. A model for fine-grained access control using a privacy-enhanced federated identity provider system (PFIPS) is proposed in [26], where researchers have used a Decentralized Audit and Ordering (DAO)-based blockchain for evaluating data correctness. The model uses a combination of deep learning-based classification for data validation and a generation retrieval system for improving the security and efficiency of data processing. The model is capable of ensuring data privacy, preservation of trading orders, and verification of retrieved data, improving security, and reducing the delay needed for data retrieval. Due to these advantages, the DAO model is capable of deployment for multiple high-density cloud scenarios. Another highly secure and efficient privacy preservation model is proposed in [27], wherein researchers have used a Linear Elliptical Curve Digital Signature (LECDS) along with the Hyperledger blockchain for reducing privacy loss. The model uses a modified version of the
Spider Optimization Search Algorithm (MSOA) for integrity verification, which also assists in improving query processing capabilities between different cloud servers. It uses a policy verifier module for authenticating the Hyperledger blockchain, thereby assisting in large-scale deployments due to minimal data leakage. The MSOA model reduces execution delay when compared with the Auth-Privacy-Chain, Merkle Hashing Tree (MHT), and Stochastic Diffusion Search (SDS) models, which assists in improving the overall throughput of the system. This model's privacy enforcement is centralized, making it dependent on a group of cloud servers for policy enforcement, due to which it can be attacked by external adversaries. This privacy performance can be improved via the use of decentralized privacy enforced using federated learning (FLBlock) as proposed in [28]. Here researchers have developed an autonomous machine learning model which is capable of self-learning, thereby making the system resilient against dynamic poisoning attacks. It uses a proof of work (PoW)-based consensus model for enforcing data immutability and traceability, due to which its security performance is optimal. But the model is tested for a small network size; thus its scalability must be verified via evaluation of multiple deployment scenarios. Applications of privacy preservation and data security using blockchain can be observed from [12, 29–31], wherein researchers have addressed driver monitoring, federated learning for IoT devices, privacy enhancement in edge computing environments, and intrusion detection systems. These models utilize permissioned blockchains and their subtypes for enforcing privacy, request tracking, outlier detection, data validation, authentication, and access control for high-security, high-quality-of-service (QoS) network design. A review of various models that utilize blockchains for privacy preservation in the social Internet of vehicles is given in [32], wherein researchers conclude that machine learning models outperform linear models in terms of attack coverage, data analysis, tracking, and dynamic attack detection. But these models have high computational complexity, which must be reduced via light-weight learning mechanisms or deep federated learning, in order to improve their deployment capabilities for a large number of networks. An example of such a network is proposed in [33], wherein researchers have incorporated software-defined networking (SDN) into 5th-generation vehicular ad hoc networks (VANETs). Due to the incorporation of SDN, network scalability improves, and the underlying nodes become more flexible in terms of policy application and network reconfiguration. Another example of a highly flexible blockchain implementation can be observed from [34], wherein an Electric Vehicle Supply Chain (EVSC) is deployed with hybrid blockchains. Due to the use of hybrid blockchains, the model is capable of reducing spoofing attacks and has lower bandwidth requirements, thereby improving its scalability. Blockchains can also be used for multiple-keyword ranked search over encrypted cloud data [35], heterogeneous data source privacy preservation in cyber-physical systems [36], medical data sharing [37], and fair payment models for deduplication-based cloud storage [38], wherein smart contracts are deployed to facilitate conditional immutability and decentralized processing and to enforce integrity validation.
It is observed that these applications do not have efficient access
control mechanisms; thus the work in [39] proposes a highly efficient model that enforces hierarchical access control using distributed key management. The proposed model was applied to multiple blockchains, and cross-domain access control capabilities were achieved using the blockchain's decentralization, high scalability, fine-grained auditability, and extensibility. This model was used in [40] for improving privacy in carpooling applications via fog computing. It uses a communication key between driver and passenger that is stored secretly on the blockchain, due to which multiple attack types are removed from the system. Another example of a blockchain deployment which improves the fairness and reliability of cloud-based applications is proposed in [41], wherein smart parking is managed using a Strong Diffie–Hellman (SDH)-based cryptosystem. The model enforces fairness via the provision of transparent pricing, encrypted identity-based access control, and transparent traceability among different nodes. A similar model, used for medical healthcare, is discussed in [42], wherein multiple blockchains are maintained for user-level data and doctor-level access control. The proposed model is visualized in Fig. 7, wherein health data is transferred between chains via user-assisted privacy and access control. All healthcare-related data is stored in user chains, while access-control links to the user chain, along with access hashes, are stored on the doctor chain. Due to this decentralized nature, the model is able to reduce single-chain dependency, which assists in reducing chain length and improving access speed. This concept is further used in [43], wherein decentralized access to electronic medical records (EMRs) is facilitated using Elliptic Curve Cryptography (ECC) and the Edwards-Curve Digital Signature Algorithm (EdDSA). The model has higher security than [42] but is slower due to the application of multiple high-complexity cryptographic functions. Similar system models are deployed in [44–47], wherein vehicular fog services are anonymously authenticated via a light-weight blockchain, SDN is applied to reinforcement learning-based spatial crowdsourcing models, and bilinear pairings are used for privacy preservation in IoT applications. A major application of these models
Fig. 7 Multiple blockchains for access control in medical healthcare applications [42]
is discussed in [5, 10, 47, 48], wherein researchers have implemented a Privacy Preserving and Secure Framework (PPSF) and other blockchain-based models for smart cities and other scenarios via anomaly detection using Gradient Boosting (GB) and PCA models. Due to the use of these models, the underlying blockchain network is continuously evaluated and has lower attack probability, thereby improving its real-time quality of service (QoS) performance. Thus, it can be observed that combining blockchain with machine learning, and utilizing deep learning-based privacy preservation models, assists in improving the overall efficiency and security of multiple network deployments. The performance of these network models is compared in terms of computational delay, security level, application, and scalability in the next section of this text.
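As a concrete illustration of the EdDSA-style record signing used in [43], the following sketch signs and verifies an access record with Ed25519 via the pyca cryptography package; the package choice and the record layout are our own assumptions, not the cited paper's implementation.

```python
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

private_key = Ed25519PrivateKey.generate()   # held by the record owner
public_key = private_key.public_key()        # published, e.g. on the chain

record = b'{"patient": "p-01", "emr_hash": "...", "granted_to": "dr-07"}'
signature = private_key.sign(record)

# verify() raises InvalidSignature if the record was tampered with;
# silence means the signature checks out.
public_key.verify(signature, record)
```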
3 Fuzzy Empirical Analysis

From the extensive review of cloud security and privacy models, it can be observed that machine learning and its variants are highly useful when designing privacy preservation, access control, authentication, and data security modules. In order to select the best possible model(s) for cloud-based blockchain deployments, their performance is compared in terms of computational delay (D), security level (SL), application (A), and scalability (S). These values were quantized into the ranges low (L), medium (M), high (H), and very high (VH) depending upon each model's internal design, and tabulated in Table 1. This comparison will assist readers in selecting the best model w.r.t. performance and security requirements. These parameters were selected because, for any cloud deployment, QoS and security are the two major design concerns: QoS is covered by computational delay and scalability, while security is covered by the security level parameter. A comparison of these parameters, along with the application of use, will assist researchers and cloud designers in using these systems effectively for their application-specific deployments. From this analysis, a novel metric, the algorithmic rank score (MARS), was evaluated using Eq. (1):
$\text{MARS} = \dfrac{S + SL}{D}$  (1)
This rank was evaluated for each model, and its respective application was noted. Models with higher MARS values are better in terms of deployment capabilities and thus should be preferred when designing blockchain-based networks. From the MARS evaluation, it is observed that PuF [3], TT Decompose [20], NB [9], EA [11], FCN [13], LECDS MSOA [27], Data source privacy [36], and CPR [14] outperform the other models in terms of scalability, security level, and computational delay. This can also be observed from Fig. 8, wherein these models are compared w.r.t. their MARS values for better understanding.
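Equation (1) is easy to reproduce programmatically. The paper does not state the numeric quantization behind the L/M/H/VH levels; the mapping below is an inference, chosen because it reproduces the tabulated MARS values.

```python
# Inferred quantization: L=1, M=1.5, H=2, VH=2.5 (not stated in the paper,
# but it reproduces the MARS values in Table 1).
LEVEL = {"L": 1.0, "M": 1.5, "H": 2.0, "VH": 2.5}

def mars(delay, security_level, scalability):
    return (LEVEL[scalability] + LEVEL[security_level]) / LEVEL[delay]

print(round(mars("H", "H", "M"), 2))   # 1.75 -> PoA blockchain [1]
print(round(mars("L", "H", "H"), 2))   # 4.00 -> PuF [3]
```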
Table 1 Statistical analysis of different blockchain-based models

| Model | D | SL | S | MARS | Application |
|---|---|---|---|---|---|
| PoA blockchain [1] | H | H | M | 1.75 | Medical |
| CPABS [1] | M | M | L | 1.67 | Medical |
| SAPP [1] | H | M | M | 1.50 | Medical |
| BSS [1] | H | L | L | 1.00 | Medical |
| BPDS [1] | H | M | H | 1.75 | Medical |
| CP-ABE [2] | M | H | M | 2.33 | VANET |
| PuF [3] | L | H | H | 4.00 | General |
| Node Identity [4] | M | M | H | 2.33 | General |
| SRAC [6] | H | H | M | 1.75 | General |
| BaaS [7] | H | M | H | 1.75 | General |
| Permissioned blockchain [8] | VH | H | H | 1.60 | General |
| BiLSTM IDP [9] | VH | H | VH | 1.80 | General |
| SVM [9] | M | M | L | 1.67 | General |
| RF [9] | M | L | L | 1.33 | General |
| NB [9] | L | M | M | 3.00 | General |
| MLO [9] | M | M | H | 2.33 | General |
| EA [11] | M | VH | H | 3.00 | General |
| FCN [13] | M | H | H | 2.67 | VANET |
| Decentralized CS [14] | M | M | M | 2.00 | General |
| SCPV [14] | M | L | M | 1.67 | General |
| BPIV [14] | M | H | M | 2.33 | General |
| DBDA [14] | H | M | M | 1.50 | General |
| CPR [14] | L | L | M | 2.50 | General |
| BSPEFB [15] | H | M | M | 1.50 | General |
| Mobile activated EHR [16] | M | H | M | 2.33 | Medical |
| Fine grained search [17] | H | M | M | 1.50 | General |
| Differential Privacy [18] | VH | H | H | 1.60 | General |
| MABS [19] | M | M | H | 2.33 | Medical |
| TT Decompose [20] | L | H | H | 4.00 | General |
| BPTM [21] | VH | H | M | 1.40 | General |
| CP-ABE [22] | H | M | H | 1.75 | General |
| Stateless TPA [23] | M | M | H | 2.33 | General |
| Mobile task offloading [24] | H | M | H | 1.75 | General |
| Privacy aware blockchain [25] | H | M | M | 1.50 | General |
| PFIPS DAO [26] | H | H | H | 2.00 | General |
| LECDS MSOA [27] | M | H | H | 2.67 | General |
| MHT [27] | M | M | H | 2.33 | General |
| SDS [27] | M | H | M | 2.33 | General |
| FLBlock [28] | H | M | H | 1.75 | General |
| Decentralized blockchain [29] | H | M | H | 1.75 | VANET |
| Federated learning [30] | VH | H | H | 1.60 | General |
| Privacy using edge [31] | VH | H | H | 1.60 | General |
| SDN blockchain [33] | VH | H | VH | 1.80 | VANET |
| Hybrid flexible blockchain [34] | H | H | H | 2.00 | VANET |
| Encrypted cloud blockchain [35] | VH | VH | H | 1.80 | General |
| Data source privacy [36] | M | H | H | 2.67 | General |
| Medical data sharing chain [37] | M | H | M | 2.33 | Medical |
| Deduplication-based blockchain [38] | H | H | M | 1.75 | General |
| Hierarchical access control [39] | H | H | H | 2.00 | General |
| Secret blockchain [40] | H | M | M | 1.50 | General |
| SDH [41] | M | H | M | 2.33 | General |
| Multiple blockchains [42] | H | M | M | 1.50 | Medical |
| ECC EdDSA [43] | H | M | H | 1.75 | Medical |
| Light-weight blockchain [44] | M | H | M | 2.33 | General |
| SDN RL [45] | VH | H | H | 1.60 | General |
| Bilinear pairing [46] | H | M | M | 1.50 | General |
| PPSF [48] | H | H | H | 2.00 | General |
Models with higher MARS scores offer better deployment performance than lower-scoring models and can be applied to any kind of general-purpose cloud deployment. Thus, these models should be preferred when deploying any kind of blockchain-based network.
4 Conclusion and Future Scope

From the extensive evaluation, it is observed that a wide variety of models are available for blockchain-based cloud deployments. Out of these models, PuF [3], TT Decompose [20], NB [9], CPR [14], and EA [11] have the lowest computational delay for general-purpose use cases. This is because these models reduce redundancies during blockchain deployments, which limits the number of unwanted calculations, thereby reducing operational delay. EA [11], Encrypted cloud blockchain [35], PuF [3], TT Decompose [20], Data source privacy [36], FCN [13], and CP-ABE [2] have the highest scalability. This is due to the fact that these models are applicable
Fig. 8 MARS for different models
for a large variety of cloud deployments, which makes them highly scalable. Similarly, SDN blockchain [33], EA [11], Encrypted cloud blockchain [35], PuF [3], TT Decompose [20], and Data source privacy [36] have better security performance. Combining these observations, PuF [3], TT Decompose [20], EA [11], NB [9], and Data source privacy [36] outperform the other models in terms of all three parameters, and thus should be used for large-scale blockchain-based cloud systems. Similarly, for medical deployments, LECDS MSOA [27], BPIV [14], SDS [27], and BiLSTM IDP [9, 10] should be used, because researchers have showcased their use on medical applications, where their performance is superior. For VANET-based systems, PFIPS DAO [26], Differential Privacy [18], and Privacy using edge [31] outperform the other models, and thus should be used for real-time cloud deployments. This is because these system models have been deployed on VANETs by researchers and are observed to have better performance for VANET-based application deployments. Based on the findings of this survey, a research statement is formulated: cloud security models that utilize blockchain-based methods are highly scalable and secure, with good QoS for a wide variety of deployment scenarios. Models like EA, PuF, CP-ABE, etc. should be used for application-specific blockchain deployments. In future, researchers should merge these models in order to estimate their utility for different applications, while measuring their performance in terms of security, scalability, and computational delay metrics. Furthermore, fusion of these models
must also be performed using machine learning techniques like Q-learning, reinforcement learning, swarm intelligence, bio-inspired models, convolutional neural networks (CNNs), etc., which will further enhance their capabilities for large-scale cloud-based application deployments.
References

1. Y. Wang, A. Zhang, P. Zhang, H. Wang, Cloud-assisted EHR sharing with security and privacy preservation via consortium blockchain. IEEE Access 7, 136704–136719 (2019). https://doi.org/10.1109/ACCESS.2019.2943153
2. Y. Yao, X. Chang, J. Mišić, V.B. Mišić, Lightweight and privacy-preserving ID-as-a-service provisioning in vehicular cloud computing. IEEE Trans. Veh. Technol. 69(2), 2185–2194 (2020). https://doi.org/10.1109/TVT.2019.2960831
3. Y. Zhang, B. Li, B. Liu, Y. Hu, H. Zheng, A privacy-aware PUFs-based multiserver authentication protocol in cloud-edge IoT systems using blockchain. IEEE Internet of Things J. 8(18), 13958–13974 (2021). https://doi.org/10.1109/JIOT.2021.3068410
4. C. Yang, L. Tan, N. Shi, B. Xu, Y. Cao, K. Yu, AuthPrivacyChain: a blockchain-based access control framework with privacy protection in cloud. IEEE Access 8, 70604–70615 (2020). https://doi.org/10.1109/ACCESS.2020.2985762
5. C. Vijesh Joe, J.S. Raj, Deniable authentication encryption for privacy protection using blockchain. J. Artif. Intell. Capsule Netw. 3(3), 259–271 (2021)
6. B.S. Egala, A.K. Pradhan, V. Badarla, S.P. Mohanty, Fortified-chain: a blockchain-based framework for security and privacy-assured internet of medical things with effective access control. IEEE Internet of Things J. 8(14), 11717–11731 (2021). https://doi.org/10.1109/JIOT.2021.3058946
7. K. Gai, J. Guo, L. Zhu, S. Yu, Blockchain meets cloud computing: a survey. IEEE Commun. Surv. Tutor. 22(3), 2009–2030 (2020). https://doi.org/10.1109/COMST.2020.2989392
8. D.C. Nguyen, P.N. Pathirana, M. Ding, A. Seneviratne, Integration of blockchain and cloud of things: architecture, applications and challenges. IEEE Commun. Surv. Tutor. 22(4), 2521–2549 (2020). https://doi.org/10.1109/COMST.2020.3020092
9. O. Alkadi, N. Moustafa, B. Turnbull, K.-K.R. Choo, A deep blockchain framework-enabled collaborative intrusion detection for protecting IoT and cloud networks. IEEE Internet of Things J. 8(12), 9463–9472 (2021). https://doi.org/10.1109/JIOT.2020.2996590
10. D. Sivaganesan, Performance estimation of sustainable smart farming with blockchain technology. IRO J. Sustain. Wirel. Syst. 3(2), 97–106 (2021)
11. A. Qashlan, P. Nanda, X. He, M. Mohanty, Privacy-preserving mechanism in smart home using blockchain. IEEE Access 9, 103651–103669 (2021). https://doi.org/10.1109/ACCESS.2021.3098795
12. O. Alkadi, N. Moustafa, B. Turnbull, A review of intrusion detection and blockchain applications in the cloud: approaches, challenges and solutions. IEEE Access 8, 104893–104917 (2020). https://doi.org/10.1109/ACCESS.2020.2999715
13. H. Li, D. Han, M. Tang, A privacy-preserving charging scheme for electric vehicles using blockchain and fog computing. IEEE Syst. J. 15(3), 3189–3200 (2021). https://doi.org/10.1109/JSYST.2020.3009447
14. Y. Miao, Q. Huang, M. Xiao, H. Li, Decentralized and privacy-preserving public auditing for cloud storage based on blockchain. IEEE Access 8, 139813–139826 (2020). https://doi.org/10.1109/ACCESS.2020.3013153
15. B. Chen, L. Wu, H. Wang, L. Zhou, D. He, A blockchain-based searchable public-key encryption with forward and backward privacy for cloud-assisted vehicular social networks. IEEE Trans. Veh. Technol. 69(6), 5813–5825 (2020). https://doi.org/10.1109/TVT.2019.2959383
16. D.C. Nguyen, P.N. Pathirana, M. Ding, A. Seneviratne, Blockchain for secure EHRs sharing of mobile cloud based E-health systems. IEEE Access 7, 66792–66806 (2019). https://doi.org/10.1109/ACCESS.2019.2917555
17. Mamta, B.B. Gupta, K.-C. Li, V.C.M. Leung, K.E. Psannis, S. Yamaguchi, Blockchain-assisted secure fine-grained searchable encryption for a cloud-based healthcare cyber-physical system. IEEE/CAA J. Automatica Sinica 8(12), 1877–1890 (2021). https://doi.org/10.1109/JAS.2021.1004003
18. K. Gai, Y. Wu, L. Zhu, Z. Zhang, M. Qiu, Differential privacy-based blockchain for industrial Internet-of-Things. IEEE Trans. Industr. Inf. 16(6), 4156–4165 (2020). https://doi.org/10.1109/TII.2019.2948094
19. J. Liu, H. Tang, R. Sun, X. Du, M. Guizani, Lightweight and privacy-preserving medical services access for healthcare cloud. IEEE Access 7, 106951–106961 (2019). https://doi.org/10.1109/ACCESS.2019.2931917
20. J. Feng, L.T. Yang, R. Zhang, B.S. Gavuna, Privacy-preserving tucker train decomposition over blockchain-based encrypted industrial IoT data. IEEE Trans. Industr. Inf. 17(7), 4904–4913 (2021). https://doi.org/10.1109/TII.2020.2968923
21. Y. Wu, S. Tang, B. Zhao, Z. Peng, BPTM: blockchain-based privacy-preserving task matching in crowdsourcing. IEEE Access 7, 45605–45617 (2019). https://doi.org/10.1109/ACCESS.2019.2908265
22. S. Wang, X. Wang, Y. Zhang, A secure cloud storage framework with access control based on blockchain. IEEE Access 7, 112713–112725 (2019). https://doi.org/10.1109/ACCESS.2019.2929205
23. X. Yang, M. Wang, X. Wang, G. Chen, C. Wang, Stateless cloud auditing scheme for non-manager dynamic group data with privacy preservation. IEEE Access 8, 212888–212903 (2020). https://doi.org/10.1109/ACCESS.2020.3039981
24. D.C. Nguyen, P.N. Pathirana, M. Ding, A. Seneviratne, Privacy-preserved task offloading in mobile blockchain with deep reinforcement learning. IEEE Trans. Netw. Serv. Manage. 17(4), 2536–2549 (2020). https://doi.org/10.1109/TNSM.2020.3010967
25. C. Lin, D. He, X. Huang, X. Xie, K.-K.R. Choo, PPChain: a privacy-preserving permissioned blockchain architecture for cryptocurrency and other regulated applications. IEEE Syst. J. 15(3), 4367–4378 (2021). https://doi.org/10.1109/JSYST.2020.3019923
26. G. Ra, D. Kim, D. Seo, I. Lee, A federated framework for fine-grained cloud access control for intelligent big data analytic by service providers. IEEE Access 9, 47084–47095 (2021). https://doi.org/10.1109/ACCESS.2021.3067958
27. B. Sowmiya, E. Poovammal, K. Ramana, S. Singh, B. Yoon, Linear elliptical curve digital signature (LECDS) with blockchain approach for enhanced security on cloud server. IEEE Access 9, 138245–138253 (2021). https://doi.org/10.1109/ACCESS.2021.3115238
28. Y. Qu et al., Decentralized privacy using blockchain-enabled federated learning in fog computing. IEEE Internet Things J. 7(6), 5171–5183 (2020). https://doi.org/10.1109/JIOT.2020.2977383
29. Q. Kong, R. Lu, F. Yin, S. Cui, Blockchain-based privacy-preserving driver monitoring for MaaS in the vehicular IoT. IEEE Trans. Veh. Technol. 70(4), 3788–3799 (2021). https://doi.org/10.1109/TVT.2021.3064834
30. Y. Zhao et al., Privacy-preserving blockchain-based federated learning for IoT devices. IEEE Internet of Things J. 8(3), 1817–1829 (2021). https://doi.org/10.1109/JIOT.2020.3017377
31. B. Ernest, J. Shiguang, Privacy enhancement scheme (PES) in a blockchain-edge computing environment. IEEE Access 8, 25863–25876 (2020). https://doi.org/10.1109/ACCESS.2020.2968621
32. T.A. Butt, R. Iqbal, K. Salah, M. Aloqaily, Y. Jararweh, Privacy management in social internet of vehicles: review, challenges and blockchain based solutions. IEEE Access 7, 79694–79713 (2019). https://doi.org/10.1109/ACCESS.2019.2922236
33. L. Xie, Y. Ding, H. Yang, X. Wang, Blockchain-based secure and trustworthy Internet of Things in SDN-enabled 5G-VANETs. IEEE Access 7, 56656–56666 (2019). https://doi.org/10.1109/ACCESS.2019.2913682
34. G. Subramanian, A.S. Thampy, Implementation of hybrid blockchain in a pre-owned electric vehicle supply chain. IEEE Access 9, 82435–82454 (2021). https://doi.org/10.1109/ACCESS.2021.3084942
35. Y. Yang, H. Lin, X. Liu, W. Guo, X. Zheng, Z. Liu, Blockchain-based verifiable multi-keyword ranked search on encrypted cloud with fair payment. IEEE Access 7, 140818–140832 (2019). https://doi.org/10.1109/ACCESS.2019.2943356
36. M. Keshk, B. Turnbull, E. Sitnikova, D. Vatsalan, N. Moustafa, Privacy-preserving schemes for safeguarding heterogeneous data sources in cyber-physical systems. IEEE Access 9, 55077–55097 (2021). https://doi.org/10.1109/ACCESS.2021.3069737
37. H. Jin, Y. Luo, P. Li, J. Mathew, A review of secure and privacy-preserving medical data sharing. IEEE Access 7, 61656–61669 (2019). https://doi.org/10.1109/ACCESS.2019.2916503
38. S. Wang, Y. Wang, Y. Zhang, Blockchain-based fair payment protocol for deduplication cloud storage system. IEEE Access 7, 127652–127668 (2019). https://doi.org/10.1109/ACCESS.2019.2939492
39. M. Ma, G. Shi, F. Li, Privacy-oriented blockchain-based distributed key management architecture for hierarchical access control in the IoT scenario. IEEE Access 7, 34045–34059 (2019). https://doi.org/10.1109/ACCESS.2019.2904042
40. M. Li, L. Zhu, X. Lin, Efficient and privacy-preserving carpooling using blockchain-assisted vehicular fog computing. IEEE Internet Things J. 6(3), 4573–4584 (2019). https://doi.org/10.1109/JIOT.2018.2868076
41. C. Zhang et al., BSFP: blockchain-enabled smart parking with fairness, reliability and privacy protection. IEEE Trans. Veh. Technol. 69(6), 6578–6591 (2020). https://doi.org/10.1109/TVT.2020.2984621
42. J. Xu et al., Healthchain: a blockchain-based privacy preserving scheme for large-scale health data. IEEE Internet Things J. 6(5), 8770–8781 (2019). https://doi.org/10.1109/JIOT.2019.2923525
43. A. Saini, Q. Zhu, N. Singh, Y. Xiang, L. Gao, Y. Zhang, A smart-contract-based access control framework for cloud smart healthcare system. IEEE Internet of Things J. 8(7), 5914–5925 (2021). https://doi.org/10.1109/JIOT.2020.3032997
44. Y. Yao, X. Chang, J. Mišić, V.B. Mišić, L. Li, BLA: blockchain-assisted lightweight anonymous authentication for distributed vehicular fog services. IEEE Internet Things J. 6(2), 3775–3784 (2019). https://doi.org/10.1109/JIOT.2019.2892009
45. H. Lin, S. Garg, J. Hu, G. Kaddoum, M. Peng, M.S. Hossain, Blockchain and deep reinforcement learning empowered spatial crowdsourcing in software-defined internet of vehicles. IEEE Trans. Intell. Transp. Syst. 22(6), 3755–3764 (2021). https://doi.org/10.1109/TITS.2020.3025247
46. H. Zhang, L. Tong, J. Yu, J. Lin, Blockchain-aided privacy-preserving outsourcing algorithms of bilinear pairings for internet of things devices. IEEE Internet of Things J. 8(20), 15596–15607 (2021). https://doi.org/10.1109/JIOT.2021.3073500
47. S. Smys, H. Wang, Security enhancement in smart vehicle using blockchain-based architectural framework. J. Artif. Intell. 3(02), 90–100 (2021)
48. P. Kumar et al., PPSF: a privacy-preserving and secure framework using blockchain-based machine-learning for IoT-driven smart cities. IEEE Trans. Netw. Sci. Eng. 8(3), 2326–2341 (2021). https://doi.org/10.1109/TNSE.2021.3089435
Deferred Transmission Control Communication Protocol for Mobile Object-Based Wireless Sensor Networks Anand Vaidya and Shrihari M. Joshi
Abstract Residual energy is an essential metric measured in wireless and wireline networks. A sensor network is organized with the objective of accumulating information and passing it to the base station, but due to the large distance between the sensors and the sink, a lot of energy is lost on communication costs. To overcome this drawback, a new technique is proposed, where the master node transfers the information to a mobile sink at periodic time intervals. A mobile object (sink) scheme is introduced, in which the mobile object travels along a legitimate track in the sampling network area. The layered mobile object count algorithm decides the periodic duty cycle time interval of the nodes. The simulation results show energy savings and an enhancement of the life period of wireless sensor networks (WSN) by 30%. The proposed algorithm also helps in improving congestion control. Keywords Sensor networks · Layered mobile object · Communication cost · Duty cycle · Congestion control
1 Introduction

A WSN is made up of tiny sensor devices, which are equipped with very small battery power and are adept at wireless communication. When a WSN framework is deployed in a sensing field, these sensor nodes are accountable for sensing irregular events, for example, sensing the temperature and humidity of the atmosphere in an agriculture field, as shown in Fig. 1. In case of an anomalous condition, or when set to report intermittently, the sensed data is transferred to the sink in a hop-by-hop approach. The routing path may be static or dynamic, subject to the routing path algorithm.
A. Vaidya (B) · S. M. Joshi SDM College of Engineering and Technology, Dharwad, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_54
Fig. 1 Wireless sensor framework [1]
Recent years have witnessed a wide increase in the use of wireless sensor networks in a variety of applications, ranging from home automation to medical systems. Recent surveys indicate major use in the agriculture field to identify moisture content, soil fertility, and other sensing parameters. Sensing hubs are normally self-organized, conveying information to the central sink in a multi-hop path. The communication normally happens from cluster head to base station, and the large distance between cluster head and base station causes huge energy loss [2, 3], which in turn reduces the lifespan of the wireless sensor network. To overcome this lacuna, initially, the specimen area is divided into numerous layers [4, 5] by considering the communication capability of the master node (at any given time, every layer has only one master node). The master node is placed nearer to the path of the mobile object. In the proposed work, we developed a Deferred Transmission Control Communication Protocol for Mobile Object-Based wireless sensor networks (DTCCMOB), in which the mobile object captures only the nearby master node data and guides the master node to be in the idle, ready, or transmit state through a flag. This decision takes place depending on the layered count and the mobile object's current position. This facilitates the master node sending packets at particular time intervals rather than continuously, which in turn avoids traffic congestion.
2 Literature Survey

A classic WSN consists of stationary sensor nodes and a static sink placed inside the monitoring region. In such an emblematic scenario, the major restriction may be the energy of the communication module of each node. The energy
consumption depends on the communication distance. Different cluster head selection methods have been invented so that energy consumption is reduced. Another way to reduce the communication distance is to arrange multiple static sinks [6, 7]. The benefit of this arrangement is a decrease in the average path span from source to sink and hence a smaller average energy dissipation per node compared to the case of a single static sink. But a major challenge is the deployment of these mobile sink objects inside the monitoring area so that the data spreading load can be balanced among the nodes. This in turn results in facility location problems [8, 9].
2.1 Related Work

In most of the scenarios, we presume that nodes are homogeneous in nature. Hence, clustering of the nodes provides better communication cost and also extends the lifetime of a sensor network by reducing energy consumption. Connectivity is very essential for data transmission [10]. Clustering can also increase network scalability. The selection of cluster heads and the connectivity among the different cluster heads are among the major challenges. Some of the energy-effective cluster head selection methods and scheduling approaches are summarized below. Mahajan et al. [11] proposed the Cluster Chain Weight Metrics (CCWM) mechanism to identify the cluster head, which in turn uses service parameters to improve the performance. Singh and Lobiyal [12] proposed a method to select an energy-aware cluster head (CH) based on Particle Swarm Optimization (PSO) and analysis of packet transmissions in WSN. PSO delivers an energy-alert cluster with optimal selection of CHs. It reduces the cost of detecting the best location of CHs, and the activity is performed inside the clusters instead of in the base station. The objective function was framed based on the minimum distance between the CH and member nodes and a residual energy approach [13]. Tunca et al. [14] proposed a movable base station approach, where the mobile sink captures the sensed data of the sampling region. Route building is not required for this approach, but it suffers from periodic flooding of the movable station location during the network period. Pala et al. [1] proposed the Mobile Sink-based Routing Protocol (MSRP), where the cluster head is determined by the mobile sink from its quantified locality, and the sink collects the information from those cluster heads. The key drawback is that the accumulated data from the cluster head is gathered by the movable station only when it comes near the cluster head, so it is suitable only for delay-tolerant applications. Finally, the previous researchers' papers speak only of effective cluster head selection based on the locality of the mobile sink. We propose an intelligent method to transmit sensed data to the mobile object (sink), including an effective and efficient master node selection approach.
3 Proposed Work

We first divide the sampling region into multiple segments, which are referred to as layers. The perimeter of each layer should be less than the transmission/receiving capacity of the mobile object (mobile sink). The traveling path of the mobile object is fixed by considering factors like sink mobilization and the duty cycle of the sensor nodes. Let the radius of each layer be R, and let X1, X, and X2 be three possible paths along which the mobile object can travel, as shown in Fig. 2. The current position of the mobile object is at position P1. The distance from the center of the layer to the current position (P1) is y2 = αR. For values of α, β ∈ (0, 1), the optimal area coverage of the mobile object with less average energy consumption (Eavg) of the master node to transfer the sampling data is given by formula 1:

MO_max_covg = R²(ω/180) + R²α(√(1 − α²) + √(α² − β²))    (1)
We can compute the data load which has to pass, on average, through one master node along the arc c, as a function of the distance of the mobile object from the center of each layer, given by formula 2:

Master node load = [R²(ω/180) + R²α(√(1 − α²) + √(α² − β²))] / [Total area of the layer × (2/180) · arccos(β/α)]    (2)
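A small numeric sketch of Eqs. (1)–(2) as reconstructed above follows. The function names, the sample values of R, ω, α, β, and the layer area are all illustrative assumptions, and the degree-based arc term follows the (2/180)·arccos(β/α) reading of the garbled original.

```python
import math

# Sketch of the reconstructed coverage/load formulas; all numeric inputs
# below are illustrative values, not parameters taken from the paper.
def mo_max_coverage(R, omega, alpha, beta):
    # Eq. (1): R^2*(omega/180) + R^2*alpha*(sqrt(1-a^2) + sqrt(a^2-b^2))
    return R**2 * (omega / 180.0) + R**2 * alpha * (
        math.sqrt(1 - alpha**2) + math.sqrt(alpha**2 - beta**2))

def master_node_load(R, omega, alpha, beta, layer_area):
    # Eq. (2): covered area normalized by the layer area and the arc term,
    # with arccos converted to degrees as the (2/180) factor suggests
    arc_term = (2.0 / 180.0) * math.degrees(math.acos(beta / alpha))
    return mo_max_coverage(R, omega, alpha, beta) / (layer_area * arc_term)

print(master_node_load(R=10.0, omega=60.0, alpha=0.8, beta=0.4,
                       layer_area=math.pi * 10.0**2))
```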
Hence, we choose the path of the mobile object through the center of the layer to decrease the average energy consumption (Eavg) of the master node. Figure 3 shows the configuration of the entire setup. Let the sampling region be divided into n layers, with the mobile object knowing its traveling path in advance. The energy model for the above scenario is as follows.
Fig. 2 Mobile object path
Fig. 3 Layer configuration
3.1 Energy Required to Transmit the Packet

The energy required to broadcast a k-bit envelope over a distance d is given by (3), where d0 indicates the reference distance:

E_SP_Tx(D_range, K_bit_PK) = K_bit_PK · EN_elec + β · K_bit_PK · EN_amp · d²    (3)
EN_elec: the electronics energy, which depends on factors such as digital coding and modulation.
EN_amp: the amplifier energy, which depends on the transmission distance and the acceptable bit-error rate.
β: the percentage of master nodes.
3.2 Energy Required to Receive the Packet

E_SP_Rx(K_bit_PK) = K_bit_PK · EN_elec + EN_data_proc

EN_data_proc: the energy required for data aggregation.
3.3 Total Energy Consumed

E_Total = E_SP_Tx(D_range, K_bit_PK) + E_SP_Rx(K_bit_PK) + E_sen_st    (4)
where E_SP_Tx is the energy spent for communicating a k-bit envelope, E_SP_Rx is the energy paid for receiving a k-bit envelope, and E_sen_st is the energy paid during the sleep and sensing states. For the above scenario, the mobile object captures the data by triggering a particular master node in each layer. We assume that each layer has only one master node, which is responsible for the data relay of that layer. In the normal case, the master node continuously dissipates the sensed data to the mobile object, which causes unnecessary energy loss and decreases the life period of the wireless sensor network [15]. To overcome the drawbacks of data congestion and continuous data transmission, the new algorithm, Deferred Transmission Control Communication Protocol for Mobile Object-Based wireless sensor networks (DTCCMOB), is introduced. In DTCCMOB, the data of only two layers is captured at any given time interval; i.e., if the mobile object is at layer 2, it triggers the master nodes of layer 2 and layer 3. At the same time, it sends a broadcast message to the layer 1 master node to switch off its data transmission. The broadcast message contains trace table information and the mobile object counter value. The mobile object (sink) has three data structures to store information: the mobile object travel counter, the info status table of the master nodes, and the decision making unit, as displayed in Fig. 4.
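To make the energy budget concrete, a minimal sketch of Eqs. (3)–(4) follows; the constants, packet size, and distance are illustrative assumptions, not values from the paper's simulation.

```python
# Sketch of the per-packet energy model in Eqs. (3)-(4); all constants are
# illustrative assumptions, not the paper's simulation values.
EN_ELEC = 50e-9       # electronics energy per bit (J/bit), assumed
EN_AMP = 100e-12      # amplifier energy (J/bit/m^2), assumed
EN_DATA_PROC = 5e-9   # data-aggregation energy per packet (J), assumed

def e_tx(k_bits, d, beta=1.0):
    # Eq. (3): transmit cost grows with the square of the distance d
    return k_bits * EN_ELEC + beta * k_bits * EN_AMP * d**2

def e_rx(k_bits):
    # receive cost: electronics plus data-aggregation energy
    return k_bits * EN_ELEC + EN_DATA_PROC

def e_total(k_bits, d, e_sense_sleep=1e-9):
    # Eq. (4): transmit + receive + sensing/sleep-state energy
    return e_tx(k_bits, d) + e_rx(k_bits) + e_sense_sleep

print(e_total(k_bits=4000, d=80.0))
```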
Fig. 4 Mobile object processing unit
Deferred Transmission Control Communication Protocol …
719
Algorithmic approach of the deferred transmission control communication protocol for mobile object-based wireless sensor networks:

1: procedure M_object(MO_Trace_Table, TL, M_obj_Counter) // capture the master node data pertaining to each layer; input: mobile object trace table, total layers
   M_obj_Counter ← 1; Layercount ← 1 // initialize the layer count as a global variable
2: for i ← 1 to TL do
3:   Initialize_Que(MO_Trace_Table, Transmit_bit, 0)
4: for i ← 1 to TL do
5:   if (Layercount = MNid) then
6:     update_Que(MO_Trace_Table, Transmit_bit, 1) // update the mobile object trace table to set the corresponding master node transmit bit high
7:     Broadcast(MNid, Transmit_bit)
8:     Capture_senser_data(MNid, packet_info, ENGlevel); continue
9:   Layercount ← Layercount + 1
10:  if (Layercount > 2) then
       update_Que(MO_Trace_Table, MNid = Layercount − 2, Transmit_bit, 0) // update the mobile object trace table to set the transmit bit of the master node two layers behind to low
11:    Broadcast(MNid, Transmit_bit)

The status flag of each master node is initialized to zero, indicating that the mobile object is not capturing its data for processing. It is set to one when the node is in the permissible receiving range, i.e., when the layer count equals the master node id (MNid), as shown in Fig. 5.
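A minimal Python sketch of this trace-table logic follows, under the assumptions above (one master node per layer; the flag of the node two layers behind is cleared). The function and variable names mirror the pseudocode and are not an official implementation.

```python
# Minimal sketch of the DTCCMOB decision logic; names mirror the pseudocode
# above and are assumptions, not an official implementation.
def run_mobile_object(total_layers):
    # transmit bit per master node, indexed by layer id (1-based)
    trace_table = {mn_id: 0 for mn_id in range(1, total_layers + 1)}
    for layer_count in range(1, total_layers + 1):
        trace_table[layer_count] = 1          # trigger the current layer's master node
        print(f"broadcast: MN{layer_count} -> transmit ON")
        if layer_count > 2:
            behind = layer_count - 2          # master node two layers behind
            trace_table[behind] = 0
            print(f"broadcast: MN{behind} -> transmit OFF")
    return trace_table

# At any instant, at most two master nodes hold a raised transmit flag.
run_mobile_object(total_layers=5)
```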
4 Result and Discussion

4.1 Experiment Setup

Initially, all the nodes are deployed in a pre-deterministic approach, and every node in a particular layer knows the master node information. The master node is responsible for delivering the data pertaining to that layer. The configuration of each layer is shown in Fig. 6. The mobile object only receives the data relating to two layers, i.e., the previous and current layers of its movement. When it passes to a different layer, it sends a broadcast
Fig. 5 Flow chart of the decision making unit of the Mobile object
Fig. 6 Layer outline
message to the master node that is two layers behind, intimating it to stop transmitting sensed data. At the same time, it sends a message to the upcoming master node to get ready for data transmission, as shown in Fig. 7.
Fig. 7 Mobile object broadcasting
4.2 Simulation Setup

To evaluate the usefulness of our procedure, we set the network size to a 300 × 300 m circular area. The communication range of the master node is assigned to be around 80 feet. The simulation parameters are specified in Table 1. To streamline our discussion, we form 50 layers, and each layer contains 30 sensors. The speed of the mobile object is 40 miles/h. A MAC-layer protocol is the communication medium between the master node and the mobile object. Figure 8 demonstrates the layer creation of the sampling region. The energy consumed by each master node at a precise interval of 1 ms is shown in Fig. 9.
Table 1 Simulation parameters

Parameter | Value
Network scope | 300 m × 300 m
Node placement | Random
No. of layers | 50
Initial energy | 10 J
Total number of nodes | 840
Broadcast power | 100 W
Data frequency | 2 × 10^6
Simulation time | 5 ms
Fig. 8 Layer formation Specimen
The experimentation is performed for intervals of 1–5 ms, considering different existing communication protocols. The energy consumption of the existing protocols and the proposed one is compared in Table 2 and Fig. 10. The packet delivery ratio (PDR) is compared among directional broadcast, mobile object-based data transmission, and the proposed Deferred Transmission Control Communication Protocol for Mobile Object-Based wireless sensor networks (DTCCMOB). The simulation results indicate that the packet delivery ratio of the proposed protocol is better than that of the existing protocols. In the proposed method, the mobile object captures the sensed data by moving nearer to the cluster head (master node); hence, the transmission distance is less than in the static sink approach. Furthermore, at any given time, only two master nodes transmit data while the other master nodes in the network are in sleep mode, so data traffic is reduced, which in turn increases the throughput of the network, as shown in Table 3 and Fig. 11.
5 Conclusion

In this paper, an energy-effective Deferred Transmission Control Communication Protocol for Mobile Object-Based wireless sensor networks (DTCCMOB) is proposed. In this protocol, the mobile object (mobile sink) captures the layered data only while it is traveling in that region. It halts the master node's data transmission when it is far away (two layers from the current position) by sending an appropriate message. This makes the master node wake up and transmit the sensed data only at some
Fig. 9 Master node power feeding
Table 2 Energy comparison with different methods

Packet transmission technique | 1 ms | 2 ms | 4 ms | 5 ms
Directional broadcast (J) | 1.454 | 1.821 | 2.25 | 2.58
Mobile object-based (J) | 1.15 | 1.32 | 1.66 | 2.056
Proposed technique (J) | 0.91 | 1.135 | 1.3 | 1.572
particular time intervals, rather than all the time. The efficient and effective path of the mobile object is also taken into account so that the maximum packet delivery ratio is achieved. The simulation results show improvements in energy efficiency as well as packet delivery ratio compared to counter-based broadcast, directional broadcast, and
Fig. 10 Energy comparison study of different Broadcasting protocols
Table 3 Packet delivery ratio of different protocols

Packet transmission technique | 1 ms | 2 ms | 4 ms | 5 ms
Directional broadcast | 94 | 89 | 86 | 80
Mobile object-based | 92 | 90 | 87 | 85
Proposed technique | 98 | 96 | 95 | 94
Fig. 11 Packet delivery ratio of different protocols
mobile object-based mechanisms. The analysis indicates that the wireless sensor network lifetime is extended by 30% with ideal routes and limited flooding.
References

1. K. Johny Elma, S. Meenakshi, Energy efficient clustering for lifetime maximization and routing in WSN. Int. J. Appl. Eng. Res. 13, 337–343 (2018). ISSN 0973-4562
2. S.K. Pandey, S. Patel, Energy distance clustering algorithm (EDCA) for wireless sensor networks, in 9th International Conference on Cloud Computing, Data Science & Engineering (Confluence), IEEE (2019)
3. F. Engmann, F.A. Katsriku, Prolonging the lifetime of wireless sensor networks: a review of current techniques. Wirel. Commun. Mobile Comput. (2018), Article ID 8035065
4. A. Vaidya, S.M. Joshi, Energy efficient protocol for wireless sensor networks, using mobile object based master node approach, in 2020 IEEE International Conference for Innovation in Technology (INOCON), Bangaluru, India, Nov 6–8 (2020)
5. M.H. Shafiabadi, A.K. Ghafi, D.D. Manshady, New method to improve energy savings in wireless sensor networks by using SOM neural network. J. Serv. Sci. Res. 11, 1–16 (2019)
6. Y. Zhang, Q. Huang, A learning-based adaptive routing tree for wireless sensor networks. J. Commun. 1(2) (2006)
7. R. Nileshkumar, P. Shishir Kumar, Wireless sensor networks challenges and future prospects. IEEE Xplore (2019)
8. S. Zou, J. Liang, Energy equalization algorithm based on controllable transmission direction in mobile wireless sensor networks. Sensors Transducers 182(11), 69–76 (2014)
9. M. Shio Kumar Singh, D. Singh, K. Singh, Routing protocols in wireless sensor networks: a survey. Int. J. Comput. Sci. Eng. Surv. (IJCSES) 1(2) (2010)
10. S. Smys, C. Vijesh Joe, Metric routing protocol for detecting untrustworthy nodes for packet transmission. J. Inf. Technol. 3(02), 67–76
11. R. Rajasekar, P. Prakasam, A survey on mobile sampling and broadcast scheduling in WSN. Int. J. Appl. (2014)
12. M. Benaddy, B. El Habil, M. El Ouali, A multipath routing algorithm for wireless sensor networks under distance and energy consumption constraints for reliable data transmission, in ICEMIS 2017, Monastir, Tunisia, IEEE (2017)
13. S.K. Singh, M.P. Singh, D.K. Singh, Routing protocols in wireless sensor networks—a survey. Int. J. Comput. Sci. Eng. Surv. (IJCSES) 1(2) (2010)
14. M.I. Khan, N. Wilfried, Static vs. mobile sink: the influence of basic parameters on energy efficiency in wireless sensor networks. Comput. Commun. 36, 965–978 (2013)
15. A.V.S. Elavarasi, M. Munira Sulthana, R. Sangeetha, Sampling sensor fields using mobile object band based approach. Int. J. Innov. Res. Sci. Eng. Technol. 2(3) (2013)
Crime Data Analysis Using Machine Learning Techniques Ankit Yadav, Bhavna Saini, and Kavita
Abstract Individuals, authorities, and governments have all placed a high priority on reducing crime. This research work investigates a few data mining techniques and machine learning algorithms that are used to mine crime datasets, along with methods and techniques utilized in crime data analysis, forecasting, and prediction. Crime estimation is a way of trying to mine out and decrease upcoming crimes; it leads to predicting the future crimes that have a chance of happening. In addition, a formal introduction to crime in India is made. In today's world, where crime is on the rise, it is critical to be able to predict forthcoming crimes with greater precision. India's official criminal code is the Indian Penal Code (IPC); it is a comprehensive code which aims to cover all aspects of criminal law. The importance of data mining techniques and machine learning algorithms in resolving crime problems by uncovering hidden criminal trends cannot be overstated. As a result, the goal of this research project is to examine and discuss alternative strategies for predicting crime, and to provide a reasonable exploration of data mining techniques and machine learning algorithms for the estimation and prediction of upcoming crime. Keywords Crime · Exponential smoothing · ARIMA · Prediction · Data visualization · Machine learning
1 Introduction A crime or offense (often known as a criminal act) is an act which impacts not only an individual but also a community, society, or state. If the relevant and applicable legislation declares such activities to be crimes, they are prohibited and punishable by law. Crime can be defined as an illegal conduct for which a person can be prosecuted by the government, especially if it is a flagrant violation of the law. The process of uncovering illegal behavior (or validating reported crime) and obtaining evidence to identify and prosecute its perpetrators is known as crime detection. The process of A. Yadav · B. Saini (B) · Kavita Department of Information Technology, Manipal University, Jaipur, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_55
studying the nature of the incidence of crime is defined as crime analysis or pattern analysis [1]. For crime analysis, a range of methodologies can be implemented. The process of analyzing crime patterns begins when a crime is first recognized, and continues with the collection of crime data. The analysts then examine a large volume of crime data sets to find particular links between the data sets [2]. Different forms of crime analysis are characterized as:

1. Crime Intelligence Analysis
2. Tactical Crime Analysis
3. Strategic Crime Analysis
In this research, an attempt is made to develop a method for predicting/forecasting criminal activity and monitoring it. This provides people and government/authorities with advance warnings and notifications regarding the most densely populated locations. The insights derived from these types of predictions might be valuable in planning effective patrolling routes or assisting tourists or outsiders unfamiliar with the surroundings. Labeled data and supervised machine learning algorithms are used to make the predictions. There are several research domains in which modern crime prediction literature can be classified. Several studies have focused on environmental factors such as education, economic level, and unemployment, to mention a few, as well as spatial–temporal crime events [3]. On the other hand, this study is primarily concerned with the origins of crimes and their effects. Therefore, this study is divided into two sections:

1. Crime prediction: Crime prediction is a method of attempting to anticipate and reduce future crimes by forecasting future crimes or activities that are likely to occur.
2. Crime forecasting: By identifying historical crime patterns or identifying the most common types of crime in a given location, crime forecasting can help to prevent recurring crimes in a given area [4].

The different research approaches are:

1. Using Historical Data: Using data from previous crimes to anticipate and prevent future occurrences is one technique to aid in the fight against crime. The study's main goal is to examine the crime trends of various crime-specific categories. Using crime statistics for various places, several studies have been conducted to forecast crime kinds, crime rates, and crime hot spots [5].
2. Spatial–temporal Analysis of Crimes: Spatial analysis is used to evaluate crime data using machine learning algorithms and to detect potential high-crime areas in order to increase security through more patrolling and surveillance. The goal was to anticipate crime hotspots for the upcoming month that were not in the same area settings as the current month [6].
3. Using Dynamic Features: Dynamic features capture a large amount of data that can be used to better comprehend regional data (e.g., people movement) throughout an entire region. Nowadays, Location-Based Social Networks (LBSN) offer these possibilities for capturing significant problems [7].
Data mining techniques include the use of filtered data analysis tools to find similar patterns and relationships between different datapoints of the data sets. These tools can be machine learning techniques and mathematical algorithms, such as decision trees, classification, and clustering. Thus, data mining helps in analysis and prediction. Prediction uses a combination of different data mining techniques, such as clustering and classification; it examines past facts or incidents in the correct order to predict a future activity. In previous data mining projects/research work, various data mining techniques have been developed and used, including classification, clustering, prediction, and forecasting using time series analysis. In other words, clustering is a data mining technique to identify similar data; it helps to recognize the differences and similarities between the data. Clustering is very similar to classification, but it involves putting data together based on their relationships. On the other hand, time series analysis is used to identify the most affected and highly dense areas [8–11]. The rate of crime is increasing by the day, and it is important to determine which areas have the most crime reports/commits. Criminals have been known to commit crimes in any locale and in any manner. Major crimes in India:

• Cyber crimes
• Terrorism
• Crime against women
• Crime against children
• Crime against SC/ST
• Robbery, Dacoity, and Murder
• Cognizable crimes
These are some crime forms committed in different areas of India and other areas of the world. Also, there is a huge rise in factors such as poverty, migration, unemployment, frustration, illiteracy, over-expectations/needs, and corruption that lead sufferers to commit an illegal act (crime) [12]. Data on crime is gathered from several crime bureaus or police departments, including the National Crime Records Bureau (NCRB), the Central Bureau of Investigation (CBI), and government records. Every year, the National Crime Records Bureau (NCRB) publishes the annual Crime in India report, while state authorities provide individual reports [13]. The next sections of the paper are the Research Objective, which describes the main objective of the research work, and the Methodology, which describes the study scenario, the methods to be used, and the status of the progress of the work.
2 Research Objective

Historical data is used to anticipate crime. First, the data is evaluated, and on this data crime estimates are made to predict upcoming crime with respect to positions (location coordinates), time, day-month-year, season, and other factors.

• To study crime events and crime analysis in India.
• To go through different data sources to collect the datasets.
• To focus, at the initial stage, on ARIMA variants for forecasting.
• To implement time series analysis for forecasting hotspot areas.
• To obtain good accuracy in predictions of crime patterns from various ML algorithm models for comparative study.
3 Methodology

Methodology refers to the process of selecting, exploring, and modeling huge volumes of data. Within data mining, there are several approaches that are commonly used, with Sample, Explore, Modify, Model, and Assess being the most widely used. In this paper, an analysis of the state-wise crime data of India has been implemented. The data for this paper has been extracted from various sources; it was not complete and had to be preprocessed for visualization purposes. The visualization was done using Python and its various libraries. The mentioned steps are elaborated in the following points.
3.1 Data Collection

To analyze the crime rate or hotspots of an area, the first and foremost requirement is gathering the data. The data needs to be accurate for optimal analysis; incorrect or erroneous data may result in a flawed analysis. Therefore, the most recommended sources of data for this analysis are official government crime records, which contain yearly records of crime, area, and type. In this paper, we collect state-wise data from official government sources (the data.gov.in and data.world domains) and NCRB records, as published by the National Crime Records Bureau, India, and use it for preprocessing.
3.2 Implementation

The proposed framework consists of preprocessing and data cleaning (data manipulation). Using the crime data published by the Government of India for different cities and categories, like crime against children, crime against women, theft crimes, etc., a
predictive regression model of the number of crimes expected for a given date, time, commune, and type of crime has been defined and trained. The available data had information from 2001 to 2014, and a few datasets cover the 2016–2019 session (yet to be used). The first objective is to select the sample dataset for the study. Initially, there were on average 10,000 crime instances within every category dataset, with improper structuring. The different crime files had different attributes, and from there, category-wise common attributes were chosen for better analysis; a few of them were named "STATE", "YEAR", "REGION", "CRIME TYPE", and "CASUALTY". Initially, there are 15–16 attributes in every dataset, and data cleaning was ensured by removing all missing values. We then modified the dataset into a cleaned format containing 10–11 attributes (by creating conditional columns); the reason behind this is efficient data modeling. Challenges like merging data from different sources and within the dataset files were the most time-consuming. We merge the data from the different sources into a single sheet to maintain the consistency of the data. Once a dataset sample has been determined, the available information must be explored, with a view to simplifying and subsequently optimizing the model to be created. Different featuring techniques are used to determine the relationships between variables and then facilitate their choice, after which the data is manipulated to achieve the appropriate format for feeding the model to be built. The table below shows the different regions and the number of crime occurrences based on crime type in each state of India. The data used is from 2001 to 2014 (Table 1).
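As an illustration of the cleaning and merging steps just described, the sketch below uses pandas; the file names are hypothetical, and only the column labels named above are taken from the text.

```python
import pandas as pd

# Sketch of the preprocessing described above; the file names are
# illustrative assumptions, the column labels come from the text.
files = ["crime_against_women.csv", "theft_crimes.csv"]  # hypothetical inputs
common_cols = ["STATE", "YEAR", "REGION", "CRIME TYPE", "CASUALTY"]

frames = []
for path in files:
    df = pd.read_csv(path)
    df = df[common_cols]   # keep only the attributes shared across files
    df = df.dropna()       # remove rows with missing values
    frames.append(df)

# merge the per-category files into a single consistent sheet
crime = pd.concat(frames, ignore_index=True)
crime.to_csv("crime_merged.csv", index=False)
```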
3.3 Data Visualizations and Data Insights

Data visualization is a vital tool for visualizing and interpreting massive amounts of statistical data. It allows users to easily understand data trends and adapt and tweak the data to meet their needs. As previously said, we visualized the retrieved and processed data using Python and its many tools. Python comes with dynamic libraries that allow us to evaluate data and draw conclusions from it. NumPy and pandas are two data analysis libraries that we use. The Python library NumPy is used for statistical analysis; statistical processing, image processing, graphs and networks, mathematical analysis, multi-dimensional array analysis, and other functionalities are included. Pandas supports the import of a variety of data types, including CSVs, JSON, and Microsoft Excel; the Excel sheets containing the preprocessed data can be imported using pandas. Plotly is the visualization library we utilized, and it allows us to display large amounts of statistical data and information in a variety of graphical plots. The two plots below show the crime numbers that occurred in residential areas and on highways, respectively; the other region-type graphs can be plotted similarly. The graphs show that, over 10 years, the crime type "Theft and Burglary" occurs the most in residential areas, while robbery dominates in the highways region.
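A minimal Plotly sketch of the region-wise plots described above follows, assuming the merged sheet and column names from the preprocessing step; "crime_merged.csv" and the exact label values are illustrative assumptions.

```python
import pandas as pd
import plotly.express as px

# Minimal sketch of the region-wise bar chart; file and column names are
# assumptions carried over from the preprocessing sketch.
crime = pd.read_csv("crime_merged.csv")
residential = crime[crime["REGION"] == "Residential"]

# crime counts per year for the residential region
counts = residential.groupby("YEAR")["CASUALTY"].sum().reset_index()
fig = px.bar(counts, x="YEAR", y="CASUALTY",
             title="Crimes in residential areas, 2001-2014")
fig.show()
```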
3.4 Time Series Forecasting

The key to all predictive modeling problems lies in machine and deep learning methods. The research work includes an evaluation of simple exponential smoothing and ARIMA. The machine learning models were used to replicate and compare an earlier study [14–16], based on whose methodology the data was prepared. For the forecasting, ARIMA and a combination of exponential smoothing achieved the best performance so far. Overall, exponential smoothing showed the best efficiency with two seasonal periods. The study therefore recommends the use of these two methods before more elaborate methods are explored. Based on the results published by this study, we have applied an ARIMA model to our full crime dataset of India. ARIMA, which stands for Autoregressive Integrated Moving Average, is a class of models that explains and predicts a time series based on its pre-existing values, so that the model can be used to forecast future values. It is a statistical approach that measures events that happen over a period of time and predicts specific values based on previous values. ARIMA is one of the most used time series models in forecasting analysis, due to its easy applicability and accuracy in forecasting future results. In this paper, we have performed a basic implementation of statistical ARIMA and exponential smoothing to plot a time series forecast of the crime counts of each state of India. The state-wise crime data for 2001–2014 is shown in Figs. 1, 2 and 3, and the ARIMA model is applied to it (Figs. 4 and 5). The resulting graph shows that the range of crimes (theft) in the years 2009–2016 is between 8700 and 9500. Using the exponential smoothing technique, the model was first trained on the training data set (2001–2008) and then used to predict the future outcomes up to 2016. These outputs are for the crime type theft only; using similar steps, graphs and predictions for other crime types have been calculated (Fig. 5). In the comparison graph, the blue line is the sample data used to train the model, the orange line is the data sample used to test the model, and the green line shows the predicted data.
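The train/forecast split described above (fit on 2001–2008, predict to 2016) can be sketched with statsmodels as below; the yearly theft counts are made-up placeholder values, not the paper's data.

```python
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Sketch of the forecasting step; the yearly theft-count series here is an
# illustrative assumption, not the actual values used in the paper.
years = pd.date_range("2001", periods=14, freq="YS")
thefts = pd.Series([8700, 8750, 8800, 8900, 8950, 9000, 9100, 9150,
                    9200, 9250, 9300, 9350, 9400, 9500], index=years)

train = thefts[:"2008"]  # fit on 2001-2008, then forecast up to 2016

ses_fit = ExponentialSmoothing(train, trend="add").fit()
arima_fit = ARIMA(train, order=(1, 1, 1)).fit()

print(ses_fit.forecast(8))    # 2009-2016 forecasts
print(arima_fit.forecast(8))
```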
Fig. 1 Bar chart for the years 2001 and 2014 of Residential area (category type)
Fig. 2 Line chart for the years 2001 and 2014 of Total crime counts in Residential area
Fig. 3 Line chart for the years 2001 and 2014 of Highways
Fig. 4 Predicted values for overall crimes (Theft only) in India
Fig. 5 Comparison between train, test and predicted data of residence area (Theft crime)
4 Future Work and Conclusion

The main goal of the research work is to explore patterns in various crime data sets. The tentative research objective includes improving the accuracy score in crime or pattern detection. The prime target is to make the results more accurate by using the ARIMA model. Better model accuracy would mean that the model selection can be applied to all the states to forecast various quotients and predict different parameters. The objective of model selection is to establish a relationship between the explanatory variables and the variables under study, which will make it possible to infer their values with a given level of confidence. Further, we will work on different techniques and datasets. The datasets may contain different categories, like crime against women, children, SC/ST, etc., with location coordinate values for better modeling, and will also include currently updated data. The techniques include traditional statistical methods (such as regression analysis and clustering methods), as well as data-based techniques such as neural networks and decision trees. Machine learning algorithms such as ensemble techniques and supervised and unsupervised techniques will be applied to determine occurrences. The performance will be compared according to the data processing and transformation used; ensemble techniques might include Random Forest. The aim is to design a crime forecasting model rendered to the city area based on SEMMA modeling (SEMMA means Sample, Explore, Modify, Model, and Assess of data). After that, the accuracy of the models will be looked over and the best one chosen.
Table 1 Sample of crime dataset of year 2001–2014

State | Year | Type | Residential Premises | Highways | River and Sea | Railways | Banks | Commercial Est. | Other Places | Total
Andhra Pradesh | 2001 | Dacoity | 100 | 57 | 2 | 8 | 0 | 10 | 37 | 214
Arunachal Pradesh | 2001 | Dacoity | 9 | 0 | 0 | 0 | 0 | 5 | 8 | 22
Assam | 2001 | Dacoity | 381 | 46 | 1 | 4 | 1 | 22 | 77 | 532
Bihar | 2001 | Dacoity | 818 | 162 | 1 | 50 | 23 | 27 | 210 | 1291
Chhattisgarh | 2001 | Dacoity | 54 | 10 | 0 | 2 | 2 | 4 | 15 | 87
Goa | 2001 | Dacoity | 3 | 0 | 0 | 0 | 0 | 0 | 4 | 7
Gujarat | 2001 | Dacoity | 105 | 58 | 2 | 6 | 2 | 39 | 115 | 327
Haryana | 2001 | Dacoity | 25 | 11 | 0 | 0 | 0 | 5 | 36 | 77
Himachal Pradesh | 2001 | Dacoity | 0 | 0 | 0 | 0 | 0 | 1 | 3 | 4
Jammu & Kashmir | 2001 | Dacoity | 11 | 0 | 0 | 0 | 4 | 1 | 8 | 24
Jharkhand | 2001 | Dacoity | 339 | 209 | 0 | 13 | 4 | 16 | 55 | 636
Karnataka | 2001 | Dacoity | 70 | 16 | 0 | 3 | 0 | 14 | 75 | 178
Kerala | 2001 | Dacoity | 45 | 16 | 0 | 0 | 0 | 28 | 87 | 176
Madhya Pradesh | 2001 | Dacoity | 59 | 31 | 0 | 3 | 4 | 11 | 58 | 166
Maharashtra | 2001 | Dacoity | 162 | 76 | 0 | 8 | 5 | 39 | 239 | 529
Manipur | 2001 | Dacoity | 1 | 17 | 0 | 0 | 0 | 0 | 2 | 20
Meghalaya | 2001 | Dacoity | 29 | 25 | 0 | 0 | 0 | 5 | 38 | 97
References

1. S.K. Rumi, K. Deng, F.D. Salim, Crime event prediction with dynamic features. EPJ Data Sci. 7, 43 (2018)
2. D.E. Brown, The Regional Crime Analysis Program (ReCAP): a framework for mining data to catch criminals, in SMC'98 Conference Proceedings, 1998 IEEE International Conference on Systems, Man, and Cybernetics (1998)
3. R.F. Reier Forradellas, S.L. Náñez Alonso, J. Jorge-Vazquez, M.L. Rodriguez, Applied machine learning in social sciences: neural networks and crime prediction. Soc. Sci. (2021)
4. W. Safat, S. Asghar, S. Gillani, Empirical analysis for crime prediction and forecasting using machine learning and deep learning techniques. IEEE Access 9, 70080–70094 (2021)
5. L. Fonseca, F. Pinto, S. Sargento, An application for risk of crime prediction using machine learning. World Academy of Science, Engineering and Technology, Open Science Index 170. Int. J. Comput. Syst. Eng. (2021)
6. A. Almaw, K. Kadam, Survey paper on crime prediction using ensemble approach. Int. J. Pure Appl. Math. 118(8) (2018)
7. H. David, A. Suruliandi, Survey on crime analysis and prediction using data mining techniques. ICTACT J. Soft Comput. 7(3) (2017)
8. P. Kapoor, P.K. Singh, A.K. Cherukuri, Crime data set analysis using formal concept analysis (FCA): a survey. Adv. Data Sci. Secur. Appl. (2020)
9. S. Kim et al., Crime analysis through machine learning, in 2018 IEEE 9th Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON), IEEE (2018)
10. S.V. Nath, Crime pattern detection using data mining, in 2006 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology Workshops, IEEE (2006)
11. S. Prabakaran, S. Mitra, Survey of analysis of crime detection techniques using data mining and machine learning. J. Phys. Conf. Ser. 1000(1), IOP Publishing (2018)
12. P. Saravanan et al., Survey on crime analysis and prediction using data mining and machine learning techniques, in Advances in Smart Grid Technology (Springer, Singapore, 2021)
13. D.K. Tayal et al., Crime detection and criminal identification in India using data mining techniques. AI & Society 30(1) (2015)
14. A. Sathesh, Enhanced soft computing approaches for intrusion detection schemes in social media networks. J. Soft Comput. Paradigm (JSCP) 1(02) (2019)
15. R. Sharma, A. Sungheetha, An efficient dimension reduction based fusion of CNN and SVM model for detection of abnormal incident in video surveillance. J. Soft Comput. Paradigm (JSCP) 3(02) (2021)
16. S. Manoharan, Study on Hermitian graph wavelets in feature detection. J. Soft Comput. Paradigm (JSCP) 1(01) (2019)
Subsampling in Graph Signal Processing Based on Dominating Set E. Dhanya , Gigi Thomas , and Jill K. Mathew
Abstract Graph signal processing (GSP) can be understood as a branch in which signal processing, a branch of electrical engineering, collaborates with graph theory, a branch of mathematics. In GSP, digitalized signals are represented by graphs, giving a simple representation of sampled data with the vertices and edges of the graph. GSP employs a method called subsampling, in which a subset of the original data set is selected, which serves to reduce the size of the data [9]. This work selects the dominating set of the graph as the subsample, in deviation from previous works on GSP. We use the properties of dominating vertices to draw out the advantages of such a subsampling. A more efficient subsampling process to the 'efficient dominating set' is also presented, and an upper bound for the number of vertices in such a subsampling is found. The case when an efficient dominating set does not exist is also discussed, and we describe some special cases of subsampling. Keywords Graph signal processing · Dominating set · Efficient dominating set
1 Introduction Signal processing can be termed as a branch of electrical engineering whose prime focus is the analysis, modification, and synthesis of signals such as sound, images, and the like. There are many signals, like our voices, which we use in the real world and these are analogue signals. Only by converting these signals to the digital form, we can process and analyse them in computers. One of the main dissimilarities between analogue and digital signal is with regard to the continuity and discreteness in time and amplitude. Hence, in order to analyse the signals they should be converted to discrete time from continuous, and the involved process is termed sampling. The value of the signal at specific intervals in time is measured and each of these measurements is described as a sample (Fig. 1).
E. Dhanya · G. Thomas (B) · J. K. Mathew, PG and Research Department of Mathematics, Mar Ivanios College (Autonomous), Thiruvananthapuram, Kerala 695015, India. e-mail: [email protected]
Fig. 1 Sampling of a continuous signal
This digitalized signal can be represented by means of a graph, and here arises another branch of study named graph signal processing (GSP) [10]. Graph signal processing is a branch in which signal processing, a branch of electrical engineering, collaborates with graph theory, a branch of mathematics. There are a number of publications in GSP from the electrical and electronics areas, and it is a fast-developing area of research. It is an area of opportunity for mathematicians to contribute and collaborate, as it is highly correlated with graph theory. We try to correlate some results in graph theory with signal processing, and thereby provide a good visualization of complex signal processing tasks and develop easy tools for complicated processes.

In GSP, data is represented by a graph. A graph G is defined as a pair (V, E) of vertices V (represented by dots or small circles) and edges E (represented by lines joining the vertices). In GSP, data sensing points are represented by vertices, and edges represent the connections between the sensing points. Edges can be given weights to represent the amount of connection, and the data (signal) is represented on the corresponding vertex. For example, if we measure the temperature over a region, then the temperature measuring points or places are the vertices; connected places are joined by edges; the geographical distance between the places gives the edge weights; and the measured temperature is represented on the corresponding vertex. Thus, in GSP, the graphs are fully weighted.

In GSP, the subsampling method employs the selection of a subset of the original data set with the objective of reducing the size of the data [9]. By doing so, the complexity of the problem is reduced, the data volume and the processing overhead decrease, and the performance improves. The subsampling process therefore has enormous significance in the world of signal processing. For this, we have to select some vertices which are most relevant for representing the data. Here, we consider the dominating set of the graph as a subsample. Clearly, the dominating vertices have more importance in a graph and are a good representation, or perfect pick, for the entire vertex set. This relevance can be extended to GSP, and we try to get a simple and efficient representation for complex data. In this paper, we attempt to connect some well-developed concepts in graph theory with signal processing. To the best of our knowledge, there are only a few
such works, and hence a huge potential is open for research and development. In this paper, we work on the subsampling (reducing the data size by selecting a subset of it) process in signal processing. We select the dominating set, a subset of the entire set of vertices of the graph, for subsampling, and we discuss subsampling to some variants of the dominating set. It will be an efficient technique for many signal processing requirements. The organization of the paper is as follows: In Sect. 2, a subsampling to the 'dominating set' of a graph is described, and a simple algorithm for finding dominating sets is introduced. In Sect. 3, we describe a more efficient subsampling process to the 'efficient dominating set' and find an upper bound for the number of vertices in such a subsampling. In Sect. 4, we discuss the case when an efficient dominating set does not exist and define the next most efficient subsampling technique. In Sect. 5, we discuss some special cases of subsampling.
2 Subsampling to Dominating Set

We denote by G(V, E) a graph with vertex set V and edge set E. Two vertices are said to be adjacent in a graph if an edge connects them. The neighbours of a vertex v ∈ V are those vertices adjacent to v. The number of edges incident on a vertex v ∈ V is termed the degree of v. In GSP, the edges are given weights, and the degree of a vertex can also be understood as the sum of the weights of the edges incident on it. A path in a graph is a finite or infinite sequence of edges which join a sequence of vertices in which all the vertices are distinct. Suppose P = u_0 … u_{k−1} represents a path with k ≥ 3; then the graph C = P + u_{k−1}u_0 is a cycle, which may be written u_0 … u_{k−1}u_0. The number of edges (or vertices) of a cycle is termed its length. A forest is a graph without any cycles. A spanning subgraph of a graph G contains all the vertices of G. The number of edges in a shortest path between two vertices u, v of a graph G is called the distance between the vertices and is denoted d(u, v). Pendant vertices of a graph are those vertices with degree one; a pendant edge of G is one which is incident on a pendant vertex [1, 4]. If a set D of vertices of G is such that every vertex not in D has a neighbour in D, then D is called a dominating set of G. Equivalently, for every vertex v ∈ V, D contains either v or a neighbour of v. A minimal dominating set is one of which no proper subset is a dominating set. The domination number of G, denoted γ(G), is the minimum cardinality among the minimal dominating sets of G [6]. We can subsample to the dominating vertices of the graph; if we subsample to a minimum dominating set, then the number of vertices in the subsample is minimum and equals γ(G). Figure 2 illustrates an example of a minimum dominating set: {v1, v3, v5}, {v3, v6, v7, v8}, {v2, v4, v6, v7, v8} are minimal dominating sets with cardinality 3, 4, and 5, respectively. Hence {v1, v3, v5} is a minimum dominating set of G, giving γ(G) = 3.
Fig. 2 Graph G=(V,E)
Theorem 1 [7] For a graph G, γ(G) + ε_F(G) = n, where n is the total number of vertices in G and ε_F(G) is the maximum number of pendant edges in a spanning forest of G.

Lemma 1 Subsampling to a minimum dominating set gives a graph representation of the data with n − ε_F(G) vertices, where n and ε_F(G) are as above.

The minimum dominating set of a graph can be considered an effective subsample for GSP problems, and we provide an algorithm for finding a dominating set of a graph.
2.1 Algorithm for Dominating Set

Let D denote the dominating set of the graph.

1. Set D = ∅
2. Select a vertex v1 with maximum degree and colour it red
3. Set D = D ∪ {v1}
4. Colour all neighbours of v1 black
5. From the uncoloured vertices, select a vertex v2 with maximum degree and colour it red
6. Set D = D ∪ {v2}
7. Colour all the neighbours of v2 black
8. Continue the process until all the vertices are coloured
9. Return D = set of all red-coloured vertices
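A direct Python transcription of this greedy procedure might look as follows (our own illustrative sketch, not code from the paper; the graph is given as a dictionary mapping each vertex to its set of neighbours). Note that the greedy result is a dominating set but not necessarily a minimum one.

```python
def greedy_dominating_set(adj):
    """Greedy version of the algorithm above: repeatedly colour an
    uncoloured vertex of maximum degree red (add it to D), then colour
    it and all of its neighbours black (covered)."""
    uncoloured = set(adj)      # vertices that are not yet coloured
    D = set()                  # the red vertices: the dominating set
    while uncoloured:
        # steps 2/5: pick an uncoloured vertex of maximum degree (ties arbitrary)
        v = max(uncoloured, key=lambda u: len(adj[u]))
        D.add(v)                          # steps 3/6: colour v red
        uncoloured -= adj[v] | {v}        # steps 4/7: colour v and neighbours black
    return D                              # step 9

# Demo on a 6-cycle; the result dominates every vertex.
C6 = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
print(greedy_dominating_set(C6))  # e.g. {0, 2, 4}
```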
3 Subsampling to Efficient Dominating Set

If every vertex of G is dominated by exactly one vertex of a vertex set De, then De is called an efficient dominating set of G [2]. A subsampling of the data to an efficient dominating set is more specific and efficient. The following theorems provide an upper bound for the number of vertices in a subsampling to an efficient dominating set.
Theorem 2 [8] Let De be a nontrivial efficient dominating set of a graph G. Then,
1. for any two vertices u, v ∈ De, d(u, v) ≥ 3;
2. for any vertex u ∈ De, there exists a vertex v ∈ De such that d(u, v) = 3.

Theorem 3 If we subsample the data to an efficient dominating set of the corresponding graph, then the number of vertices needed to be considered is at most |V|/2.

Proof The theorem is proved by considering the different cases of the cardinality of the efficient dominating set. If the efficient dominating set contains only one element, clearly at least one vertex of the graph is not in the subsample De unless the graph is a single isolated vertex.

Case 1 Let |De| = 2 and De = {u1, u2}. By Theorem 2, it is clear that at least 2 vertices of the graph are not in De (Fig. 3), and hence we drop at least 2 vertices of the graph from the subsample De.

Case 2 Let |De| = 3 and De = {u1, u2, u3}. By Theorem 2, it is clear that at least 3 vertices of the graph are not in De (Fig. 3), causing the removal of at least 3 vertices of the graph from the subsample De.

Case 3 Let |De| = 4 and De = {u1, u2, u3, u4}. By Theorem 2, we know at least 4 vertices of the graph are not in De (Fig. 4), which permits the pulling out of at least 4 vertices of the graph from the subsample De.

Case 4 Let |De| = 5 and De = {u1, u2, u3, u4, u5}. By Theorem 2, at least 5 vertices of the graph are not in De (Fig. 4). Hence, we can delete at least 5 vertices of the graph from the subsample De.

Case 5 Let |De| = 6 and De = {u1, u2, u3, u4, u5, u6}. By Theorem 2, it is clear that at least 6 vertices of the graph are not in De (Fig. 5). As in the above cases, here we can drop at least 6 vertices of the graph from the subsample De.

Any other case |De| = 7, 8, 9, … can be dealt with in a similar fashion. So, in all cases, at least |De| vertices are removed. Considering the least case of removal, we have

Number of vertices removed + Number of vertices subsampled = Total number of vertices,
|De| + |De| = |V|,
|De| = |V|/2.

So the number of vertices needed to be considered is at most |V|/2.
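For concreteness, a minimal Python check of the defining property (our own sketch, in the same adjacency-dictionary style as the earlier snippet):

```python
def is_efficient_dominating_set(adj, De):
    """True iff every vertex v of G satisfies |N[v] ∩ De| == 1,
    where N[v] is the closed neighbourhood of v."""
    return all(len(({v} | adj[v]) & De) == 1 for v in adj)

# Path 0-1-2-3-4-5-6: {0, 3, 6} dominates every vertex exactly once,
# and |De| = 3 <= |V|/2 = 3.5, consistent with Theorem 3.
P7 = {i: {j for j in (i - 1, i + 1) if 0 <= j <= 6} for i in range(7)}
print(is_efficient_dominating_set(P7, {0, 3, 6}))  # True
print(is_efficient_dominating_set(P7, {0, 1, 4}))  # False: 0 and 1 overlap
```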
Fig. 3 Figures for Cases 1 and 2
4 Subsampling to R-set or CR-set

There are some graphs for which an efficient dominating set does not exist. If an efficient dominating set exists, it coincides with a minimum dominating set, and the cardinality of any efficient dominating set is the same as the domination number γ(G). Let F(G) denote the efficient domination number of G, that is, the maximum number of vertices that can be efficiently dominated by a set of vertices of G. It is clear that G contains an efficient dominating set if and only if F(G) = |V|. In the scenario where such an efficient dominating set does not exist, the entire vertex set can be subsampled to a set under the condition that every vertex is dominated at least once and the amount of excess domination is minimum. For this, we consider two measures [5]:

1. Redundance R(G), a measure of how many times the vertices are dominated: R(G) = min{ Σ_{v∈V(G)} |N[v] ∩ D| : D is a dominating set of G }, where N[v] denotes the set of neighbours of v including v itself.
2. Cardinality redundance CR(G), the minimum number of vertices dominated more than once by a dominating set.

The next theorem characterizes the existence of an efficient dominating set: G has an efficient dominating set if and only if F(G) = |V|.

Theorem 4 [5] For any graph G with n vertices,
1. F(G) ≤ n ≤ R(G), and
2. F(G) = n if and only if R(G) = n if and only if CR(G) = 0.

If G has no efficient dominating set, we can consider subsampling to an R-set or a CR-set according to the case considered. We now explain an example (Fig. 6) to illustrate these concepts. We have R-sets D1 = {v3, v4, v5, v8, v13} and D2 = {v1, v2, v4, v5, v8, v13}, and R(G) = 16. D1 is a CR-set but D2 is not a CR-set, and CR(G) = 1.
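As a small illustration of these two measures, the following Python sketch (our own, using the same adjacency-dictionary representation as above) evaluates the redundance and cardinality redundance of a given candidate dominating set D; computing R(G) and CR(G) themselves would additionally require minimizing these quantities over all dominating sets of G.

```python
def redundance(adj, D):
    """Sum over all vertices v of |N[v] ∩ D| for a candidate dominating
    set D; R(G) is the minimum of this quantity over all dominating sets."""
    return sum(len(({v} | adj[v]) & D) for v in adj)

def cardinality_redundance(adj, D):
    """Number of vertices dominated more than once by D; CR(G) is the
    minimum of this count over all dominating sets."""
    return sum(1 for v in adj if len(({v} | adj[v]) & D) > 1)
```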
Fig. 4 Figures for Cases 3 and 4
Fig. 5 Figure for Case 5
Fig. 6 Case of no efficient dominating set
5 Special Cases

Let D be a dominating set of G. D is called a total dominating set of G if every vertex of G has a neighbour in D. There are special cases, such as security-related matters, in which each vertex (person) necessarily needs attention from some other vertex (person) to ascertain knowledge of any possible attack at each vertex. In such cases, the total dominating set can be chosen for subsampling. In some cases, we can choose the dominating vertices to carry the most powerful or strongest signals, which is applicable for constructing a system that is more efficient and cost effective for representing given data. The k-neighbourhood of a vertex v is the set of all vertices in G which can be reached from v by a path of length at most k. Let Dk be a set such that for every v ∈ V(G), either v ∈ Dk or v has a k-neighbour in Dk. If we subsample the data to Dk, then the number of vertices considered is further reduced, which further reduces the cost of the problem; however, this case is applicable only when distance is not an issue.
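A k-neighbourhood can be computed with a depth-limited breadth-first search; a minimal sketch in the same style as the earlier snippets:

```python
from collections import deque

def k_neighbourhood(adj, v, k):
    """All vertices reachable from v by a path of length at most k
    (depth-limited breadth-first search); includes v itself."""
    seen, frontier = {v}, deque([(v, 0)])
    while frontier:
        u, d = frontier.popleft()
        if d == k:
            continue                     # do not expand beyond depth k
        for w in adj[u] - seen:
            seen.add(w)
            frontier.append((w, d + 1))
    return seen
```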
6 Conclusion

Subsampling is a technique aimed at reducing the size of the data by choosing a subset of the original data in GSP. In this paper, we have attempted to demonstrate a simple subsampling technique which is applicable for representing complex
data by a very efficient and well-developed concept. We have selected the dominating set, a subset of the entire vertex set of the graph, for subsampling, and we discuss subsampling to some variants of the dominating set. It is an efficient technique for many signal processing requirements. Using this technique, complicated signal processing tasks can be performed easily and cost effectively, and the time for the overall process can also be reduced considerably. This paper demonstrates the initial work of advanced research in this area. Future work will be aimed at strengthening of the signal; more advantages of subsampling to the dominating set, and whether other subsets of vertices fit well for subsampling, will be studied in future.
References

1. J.A. Bondy, U.S.R. Murty, Graph Theory with Applications (Macmillan, London, 1976)
2. A. Brandstadt, A. Leitert, D. Rautenbach, Efficient dominating and edge dominating sets for graphs and hypergraphs, conference paper (2012). https://doi.org/10.1007/978-3-642-35261-4_30
3. N. Deo, Graph Theory with Applications to Engineering and Computer Science (Prentice Hall India Learning Private Limited, 1979)
4. R. Diestel, Graph Theory (Springer, New York, 1997)
5. T.W. Haynes, S.T. Hedetniemi, P.J. Slater, Fundamentals of Domination in Graphs (CRC Press, Boca Raton, 1998)
6. S.T. Hedetniemi, R.C. Laskar, Topics on Domination (North Holland, 1991)
7. J. Nieminen, Two bounds for the domination number of a graph. J. Inst. Math. Appl. 14, 183–187 (1974)
8. I. Reiter, J. Zhou, Unique efficient dominating sets. Open J. Discr. Math. 10, 56–68 (2020)
9. W.J. Schroeder, K.M. Martin, Overview of visualization, in Visualization Handbook, ed. by C.D. Hansen, C.R. Johnson (Elsevier Inc., 2005)
10. L. Stankovic, M. Dakovic, E. Sejdic, Vertex-Frequency Analysis of Graph Signals (Springer Nature, London, 2019)
Optimal Sizing and Cost Analysis of Hybrid Electric Renewable Energy Systems Using HOMER Basanagouda F. Ronad
Abstract This paper presents optimal sizing and cost analysis of hybrid renewable energy systems. HOMER software is used for the optimization and cost analysis. Three different loads and their typical profiles are employed in the simulation. Simulation models are developed for a typical house, a hostel and a small-scale hospital in Bagalkot (16.1725° N, 75.6557° E), India. Load survey data of the said systems were retrieved from the energy meters installed by HESCOM, Govt. of Karnataka. For the optimization process, the HOMER software simulates all configurations and lists them based on cost of energy and total net present cost. Simulations are carried out with various source combinations of solar photovoltaic, wind, battery and grid, and the corresponding results are tabulated for all possible combinations. Results are analyzed systematically with the cost of per-unit energy as reference. The electricity generated by each source combination, unmet electric load, capacity shortage, excess electricity generated and renewable energy fractions are listed. For all three loads, solar photovoltaic (PV) in grid-connected mode proves to be cost effective with the least cost of energy: Rs. 4.8, Rs. 5.325 and Rs. 4.5 for the house, hostel and hospital loads, respectively. It is concluded that for the Bagalkot location, solar PV in grid-integrated mode is the most effective and economic option.

Keywords Hybrid energy systems · Optimization · Renewable energy · Solar photovoltaic · Wind turbine generator
B. F. Ronad (B) Department of Electrical and Electronics Engineering, Basaveshwar Engineering College (A), Bagalkot, India. e-mail: [email protected]

1 Introduction

Fossil fuel reserves are decreasing day by day, and renewable sources must be encouraged to address future energy issues. Hybrid renewable energy systems are gaining importance due to their large advantages over standalone systems. Hybrid energy systems typically consist of two or more energy sources to enhance the system efficiency and reliability. To effectively utilize renewable energy sources, optimum sizing of the system components is necessary.
2 Related Work in the Literature

The process of optimizing energy system components requires an accurate load profile and the available potential at the selected site. Kumbar et al. [1] presented optimization of hybrid renewable electric energy system components for Covid hospitals; optimal sizing of the hybrid system components is taken up in HOMER software. Tefera et al. [2] presented optimization of system components in solar photovoltaic systems in islanded mode and grid-connected mode. The optimization is conducted based on the electricity tariffs of a typical residential load. PVWatts, HOMER Pro and PVGIS tools are employed for the analysis: optimization and economic analysis was conducted in HOMER Pro, and verification of system performance is taken up in the PVGIS and PVWatts packages [2]. Palanichamy et al. [3] proposed a renewable energy-based system for the AIIMS healthcare center in Madurai. A survey and analysis of loads is conducted based on the daily energy consumption and on critical and non-critical requirements; non-conventional energy systems with different ratings were tested for various source combinations in the HOMER Grid package [3]. Vendoti et al. [4] presented electrification of villages in the Chamarajanagar district of Karnataka, India. This analysis is conducted for grid-isolated hybrid systems. Simulations are carried out with the key objective of achieving cost-effective operation, with reduced net present cost, energy cost and unmet load. The analysis is conducted in HOMER Pro software and using a genetic algorithm, with a comparative analysis of the results of both methods for different renewable energy source combinations. Further, sensitivity analysis is performed with reference to variation in wind potential and biomass fuel prices [4]. Anastasios et al. [5] presented the design of an independent microgrid for a remotely located community system. A source combination of solar PV, diesel generator and batteries is employed to feed the loads. Multi-objective, non-derivative optimization is considered, with the primary objective of cost minimization without load shedding. Further, an assessment of CO2 emissions is conducted to explore the environmental benefit of renewable sources. HOMER Pro software is employed for the study [5]. Abdullahi [6] presented the prospects and cost-effectiveness of a solar PV and wind energy hybrid system in Nigeria, using HOMER software for system optimization. For the selected load, 5.21 kW of PV and three 25 kW wind turbine generators with twelve 24 V lead-acid batteries proved to be the optimum combination, and it was shown that the payback period for the proposed system is 5 years [6]. Rilwan et al. [7] presented simulation models for a remote healthcare center using HOMER software; grid-connected and off-grid system simulation and analysis is conducted. The load profile is established from the usage hours and capacity of the equipment in the healthcare center, and the renewable energy potentials are obtained from NASA surface meteorology [7]. Rilwan and Marvin [8] presented an optimization model for a hybrid system with solar PV and a diesel generator unit to supply the load of a hospital building. In the said system, the operational hours of the diesel generator reduce with PV capacity enhancement. Critical observation of the results in the literature indicates an increased cost of energy with renewable energy sources. Further, hydro power plants are also tested for effectiveness in hybrid energy systems [8]. Deepak Kumar et al. [9] proposed a hybrid energy system model for remotely located loads using HOMER software. Simulation and analysis was conducted for a rural area in Sundargarh, Orissa, India. The model was developed to attain the optimal system configuration for energy demand and availability based on hourly time data. The authors included the hydro resource and concluded that renewable energy sources can effectively replace conventional sources [9]. Thus, in the present case study, HOMER software is used for optimization of the components of renewable energy systems for three different load profiles, i.e., house, hostel and hospital loads.
3 Resources for Modeling

The HOMER simulation study is taken up for different loads in Bagalkot city, Karnataka, India (16.1725° N, 75.6557° E). Renewable energy potential data for the Bagalkot location are taken from the weather station installed in the Energy Park, BEC Bagalkot.
3.1 Solar Energy Potential

Figure 1 presents the solar radiation data and the corresponding clearness index. HOMER uses these solar resource inputs to calculate the power generated by the panels [10].

Fig. 1 Daily solar radiation and respective clearness index in Bagalkot site
Fig. 2 Monthly average wind speed in the Bagalkot site
The output power of the solar photovoltaic panels is assessed based on local solar radiation. In the present simulation work, PV panels of ratings ranging from 1 to 25 kW are employed, with an investment cost of Rs. 55 per watt including supply and installation with mechanical support accessories [11].
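As a rough illustration of how the radiation data translates into PV output (a back-of-envelope sketch only; HOMER's internal PV model is more detailed, and the derating factor below is our assumed value, not a figure from the paper):

```python
def pv_daily_energy_kwh(rated_kw, peak_sun_hours, derating=0.8):
    """Back-of-envelope daily PV yield: rated DC power x equivalent
    peak sun hours x derating factor (temperature, wiring, soiling).
    The 0.8 derating is an assumed typical value, not from the paper."""
    return rated_kw * peak_sun_hours * derating

# e.g. a 3 kW array at ~5.5 kWh/m2/day of insolation -> ~13 kWh/day
print(pv_daily_energy_kwh(3, 5.5))  # 13.2
```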
3.2 Wind Energy Potential

Figure 2 shows the monthly average wind speed at the Bagalkot site. The total power available from wind is calculated based on local wind speed data. In the present simulation work, wind turbine generators of 1 and 3 kW capacities are employed for electricity generation [12].
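For intuition, the power available to a wind turbine scales with the cube of wind speed, P = 0.5 ρ A v³ Cp; a hedged sketch (the power coefficient, air density and rotor size below are assumed typical values, not parameters from the paper) shows why the modest wind speeds at this site limit WTG output:

```python
import math

def wind_power_w(wind_speed_ms, rotor_diameter_m, cp=0.35, rho=1.225):
    """Electrical power from wind: P = 0.5 * rho * A * v**3 * Cp, with
    swept area A = pi * (D/2)**2. cp and rho are assumed typical values."""
    area = math.pi * (rotor_diameter_m / 2.0) ** 2
    return 0.5 * rho * area * wind_speed_ms ** 3 * cp

# A hypothetical ~2.5 m rotor at 5 m/s delivers only ~130 W, illustrating
# why a low wind speed regime makes small WTGs uneconomic.
print(round(wind_power_w(5, 2.5)))  # ~132
```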
3.3 Electric Load

Load details of a typical house, hostel and a small-scale hospital in Bagalkot city are collected from the electricity distribution company (HESCOM) by downloading through a Meter Reading Instrument (MRI). The data is averaged to each hour of the day in terms of kW and is employed as input to the simulation. Figures 3, 4 and 5 present the daily load profiles of the house, hostel and small-scale hospital loads, respectively [13].

Fig. 3 Daily load profile of the loads for House load

Fig. 4 Daily load profile of the loads for Hostel load

Fig. 5 Daily load profile of the loads for Hospital load

For the house load, the scaled annual average requirement is 8 kWh/day with a 1.73 kW peak load; the average load in a day is 0.333 kW with a load factor of 0.193. For the hostel load, the scaled annual average requirement is 155 kWh/day with a 20.3 kW peak load; the average load in a day is 6.46 kW with a load factor of 0.319. Further, for the hospital load, the scaled annual average requirement is 79 kWh/day with a 13.8 kW peak load; the average load in a day is 3.29 kW with a load factor of 0.239. For all three load profiles, day-to-day random variability is taken as 15% and time-to-time variability as 20% for the entire year [14, 15].
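The quoted average loads and load factors follow directly from the daily energy and peak demand; a small sketch reproducing them (values from the text above; minor differences are rounding):

```python
def load_stats(avg_daily_kwh, peak_kw):
    """Average demand (kW) and load factor (average/peak) implied by the
    scaled daily energy and peak load quoted above."""
    avg_kw = avg_daily_kwh / 24.0
    return round(avg_kw, 3), round(avg_kw / peak_kw, 3)

print(load_stats(8, 1.73))    # house:    (0.333, 0.193)
print(load_stats(155, 20.3))  # hostel:   (6.458, 0.318), paper rounds to 0.319
print(load_stats(79, 13.8))   # hospital: (3.292, 0.239)
```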
4 HOMER Simulation Models and Results

Simulations of different source combination models are carried out in HOMER software. The proposed combinations are: Solar PV-Grid, Solar PV-Wind-Battery, Solar PV-Grid-Battery, Solar PV-Wind-Grid and Solar PV-Wind-Grid-Battery. For each of the load profiles, simulation results for all combinations are obtained, and a comparative analysis of the results is carried out.
4.1 House Load with Solar PV and Grid Hybrid

The simulation model for the house load with solar PV integrated with the grid is presented in Fig. 6, and the optimal combinations resulting from the HOMER simulation are presented in Table 1. The combinations are listed in increasing order of cost of energy. The capacity shortage constraint is given as 10%; however, due to the presence of the grid, no shortage arises irrespective of the solar PV system capacity. It is observed that the renewable energy fraction is 0.76 in the optimum combination, with a cost of energy of 0.064 $/kWh. The average production from the most feasible energy source options is presented in Fig. 7. Solar PV systems generated 76%, and 24% is purchased from the grid. The system architecture of 3 kW PV integrated with the grid generated 6076 kWh/year. An AC load of 2920 kWh (53%) is supplied and 2619 kWh (47%) is sold to the grid. Without any capacity shortage constraint, 83.2 kWh is generated in excess in the selected combination. The monthly energy purchase and sale details are presented in Table 2. It is observed that in total 1454 kWh of energy is purchased and 2619 kWh is sold to the grid, with net profitable units of 1165 kWh.

Fig. 6 Simulation model for house load with grid connected solar PV
Table 1 Optimization results for house load with grid connected solar PV

Sl   PV (kW)   Initial capital ($)   COE ($/kWh)   Ren. fraction   % Capacity shortage
1    3         2809                  0.064         0.76            0
2    2         2006                  0.067         0.66            0
3    3         3009                  0.069         0.76            0
4    4         3612                  0.073         0.82            0
5    5         4615                  0.074         0.85            0
6    9         8427                  0.091         0.92            0
Fig. 7 Monthly average electric production from 3 kW PV with grid
Table 2 Monthly energy purchase and sale details

Month    Energy purchased (kWh)   Energy sold (kWh)   Net purchases (kWh)
Jan      119                      307                 –188
Feb      109                      205                 –97
Mar      121                      261                 –140
Apr      112                      268                 –155
May      113                      247                 –135
Jun      120                      165                 –45
Jul      127                      134                 –7
Aug      125                      223                 –98
Sep      133                      166                 –33
Oct      121                      210                 –89
Nov      116                      261                 –145
Dec      137                      172                 –35
Annual   1454                     2619                –1165
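Using the net-metering tariffs quoted later in the conclusions (Rs. 6.75 per unit purchased, Rs. 3.75 per unit sold; treat these as the paper's assumed rates), the annual totals of Table 2 translate into a grid bill as in the following sketch:

```python
def annual_net_metering_cost(purchased_kwh, sold_kwh,
                             buy_rate=6.75, sell_rate=3.75):
    """Annual grid bill in Rs. under net metering; a negative value is a
    net credit. Tariffs default to the rates quoted in the conclusions."""
    return purchased_kwh * buy_rate - sold_kwh * sell_rate

# Table 2 annual totals for the 3 kW PV + grid house system:
print(annual_net_metering_cost(1454, 2619))  # -6.75 -> roughly breaks even
```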
4.2 House Load with Solar PV, Batteries and Grid

The hybrid simulation model for the house load with solar PV integrated with the grid and battery backup is presented in Fig. 8, and the respective optimization results from HOMER are presented in Table 3. Solar PV systems generated 85% (7704 kWh), and 15% (1314 kWh) is purchased from the grid. An AC load of 2920 kWh (35%) is supplied and 5328 kWh (65%) is sold to the grid. Without any capacity shortage constraint, no electricity is generated in excess in the selected optimum combination. It is observed that the inclusion of the battery has increased the renewable energy fraction to 0.82; however, the cost of energy has risen to 0.108 $/kWh.
Fig. 8 Simulation model for house load with grid connected solar PV with batteries
Table 3 Optimization results for house load with grid connected solar PV with batteries

Sl   PV (kW)   Battery (No.s)   Initial capital ($)   COE ($/kWh)   Ren. fraction
1    5         1                5805                  0.108         0.85
2    5         1                6005                  0.114         0.85
3    5         1                6205                  0.121         0.85
4    3         2                4589                  0.123         0.76
5    2         1                3796                  0.123         0.66
6    2         2                3786                  0.125         0.66
4.3 House Load with Solar PV, Wind Turbine Generator and Grid

The hybrid simulation model for the house load with grid-tied solar PV and a wind turbine generator (WTG) is presented in Fig. 9. A 1 kW DC wind turbine generator is employed in the generating unit. The optimal combinations resulting from the HOMER simulation are presented in Table 4. It is observed that the inclusion of the wind generator has not yielded optimum results; solar PV alone with the grid proves to be the optimum combination. However, to

Fig. 9 Simulation model for house load with grid connected solar PV-WTG system
Table 4 Optimization results for house load with grid connected solar PV-WTG system

Sl   PV (kW)   WTG (No.s)   Initial capital ($)   COE ($/kWh)   Ren. fraction
1    5         –            5415                  0.094         0.85
2    4         –            4612                  0.094         0.82
3    3         –            3809                  0.094         0.76
4    5         1            8915                  0.179         0.92
5    4         1            8112                  0.179         0.9
6    3         1            7309                  0.179         0.87
Fig. 10 Monthly average electric production from 5 kW PV and 1 kW WTG with grid
enhance the renewable energy fraction, wind energy is incorporated. With a 1 kW wind generator and 5 kW solar PV, the cost of energy is 0.179 $/kWh; on the other hand, with solar PV only, the cost of energy is 0.094 $/kWh. The monthly electricity production from the most feasible source combination with wind generators is shown in Fig. 10. Solar PV systems generated 75% (7704 kWh) and 17% (1762 kWh) is generated by the wind generator; only 8% (856 kWh) is purchased from the grid. An AC load of 2920 kWh (31%) is supplied and 6456 kWh (69%) is sold to the grid. With zero capacity shortage, no electricity is generated in excess. A detailed investigation of the results proved that wind turbine generators are not effective at the said location due to the low wind speed regime.
4.4 House Load with Off-Grid System Connected to Solar PV and Wind Turbine Generator

The off-grid simulation model for the house load with solar PV connected to a wind turbine generator is presented in Fig. 11. Batteries are employed for backup supply. This model presents a typical standalone renewable hybrid energy system. The optimal combinations resulting from the HOMER simulation are presented in Table 5.
Fig. 11 Simulation model for house load with off-grid solar PV-WTG hybrid system
Table 5 Optimization results for house load with off-grid solar PV-WTG hybrid system

Sl   PV (kW)   WTG (No.s)   Battery (No.s)   Initial capital ($)   COE ($/kWh)   Ren. fraction   % Capacity shortage
1    2         1            4                8066                  0.333         1               0.18
2    2         1            5                8456                  0.34          1               0.14
3    3         1            3                8479                  0.342         1               0.17
4    2         1            6                8846                  0.347         1               0.12
5    4         1            7                10,842                0.386         1               0.01
6    3         1            10               11,209                0.391         1               0.01
It is observed that the wind generators operate effectively only during the May-August months. The renewable energy fraction has increased to 100%, whereas the capacity shortage is 18% in the optimum combination. This clearly indicates energy supply reliability issues in standalone renewable energy systems. The cost of energy increases to 0.333 $/kWh, which is highly uneconomic; however, this can be reduced by relaxing the capacity shortage constraint. Monthly electricity generation from the most feasible combination is shown in Fig. 12. The solar PV system generated 64% (3082 kWh), and 36% (1762 kWh) is generated by the wind generator. An AC load of 2753 kWh is supplied, and 17.8% of the electricity demand is identified as shortage for the loads. Further, to meet the shortage power, the system capacity needs to be increased significantly, resulting in increased initial cost. The cost summary of the proposed model is shown in Fig. 13.

Fig. 12 Monthly average electric production from 2 kW solar PV and 1 kW WTG with 4 No.s of batteries

Fig. 13 Cost summary of off-grid solar PV-WTG hybrid system
4.5 House Load with Grid Connected Solar PV-Wind Turbine Generator

The grid-connected simulation model for the house load with solar PV and WTG with batteries integrated with the grid is presented in Fig. 14, and the optimal combinations resulting from the HOMER simulation are presented in Table 6. It is observed that the optimum combination, with a 92% renewable energy fraction, results in a cost of energy of 0.193 $/kWh. With grid integration, the shortage of electricity is reduced to zero. The monthly average electric production from the most feasible option is shown in Fig. 15. It is observed that 75% (7704 kWh) is generated by solar PV and 17% (1762 kWh) by the wind generator; a further 8% (856 kWh) is purchased from the grid.

Fig. 14 Simulation model for house load with grid connected solar PV-WTG system with batteries
Table 6 Optimization results for house load with grid connected solar PV-WTG system

Sl   PV (kW)   WTG (No.s)   Battery (No.s)   Initial capital ($)   COE ($/kWh)   Ren. fraction
1    5         1            1                9305                  0.193         0.92
2    4         1            1                8502                  0.193         0.9
3    3         1            1                7699                  0.193         0.87
4    2         1            1                6896                  0.194         0.83
5    5         1            1                9505                  0.199         0.92
6    4         1            1                8702                  0.199         0.9
Fig. 15 Monthly average electric production with grid connected 5 kW solar PV, 1 kW WTG system with battery
A total AC primary load of 2920 kWh (31%) is supplied and 6456 kWh (69%) is sold to the grid. In total, 856 kWh of energy is purchased and 6456 kWh is sold to the grid, with net profitable units of 5600 kWh.
4.6 Hostel Load and Hospital Load

Simulation models for the hostel and hospital load profiles with different energy sources are presented in Figs. 16 and 17, respectively. Detailed simulation results for the different source combinations are presented in Tables 7 and 8 for the hostel and hospital loads, respectively. The five selected optimum combinations are listed for each case, in increasing order of cost of energy. It is observed that the renewable energy fraction is in the range of 50-60% (hostel load) and 75-88% (hospital load) for all grid-connected operations. A comparison of the cost of energy in all cases reveals that the grid-connected SPV system results in the most economic operation, with 0.071 $/kWh (hostel load) and 0.06 $/kWh (hospital load).
Fig. 16 Simulation model for hostel load
Fig. 17 Simulation model for a hospital load
5 Conclusions

The use of HOMER software for optimizing the energy system components of different loads (house, hostel and hospital) is presented in this paper. The actual load survey data of the said systems are retrieved from the energy meters installed by HESCOM, Govt. of Karnataka. Simulation results are obtained in HOMER for all possible combinations of energy sources with different capacities. Detailed investigation of the results revealed the following observations:

• For all three loads, solar PV in grid-connected mode proved to be cost effective with the least cost of energy: Rs. 4.8, Rs. 5.325 and Rs. 4.5 for the house, hostel and hospital loads, respectively.
• In grid-connected mode, the employment of batteries has shown very little impact on cost and generation. However, the impact of wind turbine generators is significant. The selected location, Bagalkot, lies in a low wind speed regime, and hence the inclusion of wind turbine generators is not economic, although the renewable energy fraction increases in combinations involving wind generators.
• The net metering facility has enhanced the effective utilization of power by reducing the excess electricity component. For the house load, it is observed that 2619 units are sold to the grid during the entire year. On the other hand, in off-grid mode of operation, 32% of the electricity is generated in excess of the load requirement.
Table 7 Optimization results for hostel load with different source combinations

Source combination                 PV (kW)   WTG (No.s)   Battery (No.s)   Initial capital ($)   COE ($/kWh)   Ren. fraction   % Capacity shortage
Solar PV + Grid                    30        –            –                28,490                0.071         0.57            0.00
                                   30        –            –                29,290                0.072         0.57            0.00
                                   29        –            –                28,287                0.073         0.56            0.00
                                   27        –            –                25,681                0.074         0.54            0.00
                                   29        –            –                26,487                0.075         0.56            0.00
Solar PV + Grid + Battery          24        –            1                22,862                0.088         0.51            0.00
                                   25        –            1                23,865                0.088         0.52            0.00
                                   26        –            1                24,668                0.088         0.53            0.00
                                   23        –            1                21,859                0.088         0.5             0.00
                                   25        –            1                23,665                0.088         0.52            0.00
Solar PV + Wind + Grid             22        4            –                49,066                0.123         0.61            0.00
                                   21        4            –                48,063                0.123         0.6             0.00
                                   23        4            –                50,069                0.123         0.62            0.00
                                   23        4            –                49,869                0.123         0.62            0.00
                                   17        4            –                45,051                0.123         0.55            0.00
Solar PV + Wind + Battery          100       5            60               121,283               0.209         1               0.25
                                   100       5            60               121,483               0.21          1               0.25
                                   120       4            60               129,143               0.219         1               0.24
                                   100       6            60               127,483               0.22          1               0.24
                                   120       5            60               136,143               0.231         1               0.23
Solar PV + Wind + Battery + Grid   22        4            1                49,456                0.123         0.61            0.00
                                   21        4            1                48,453                0.123         0.6             0.00
                                   23        4            1                50,459                0.123         0.62            0.00
                                   21        4            1                48,653                0.124         0.6             0.00
                                   22        4            1                49,256                0.124         0.61            0.00
• Islanded wind-solar hybrid systems prove to be unreliable in power supply. About 25% capacity shortage of electricity is noted in all feasible source combinations, and the minimum cost of energy attained in islanded operation is Rs. 24/kWh.
• In the present study, the cost of energy under net metering is Rs. 6.75 for purchase and Rs. 3.75 for sale. Further, Rs. 55 per watt is employed for solar PV panels in the assessment. However, solar PV costs are reducing drastically, and the economic performance of the said systems is expected to improve significantly in future.
Table 8 Optimization results for hospital load with different source combinations

Source combination                 PV (kW)   WTG (No.s)   Battery (No.s)   Initial capital ($)   COE ($/kWh)   Ren. fraction   % Capacity shortage
Solar PV + Grid                    21        –            –                19,863                0.06          0.75            0.00
                                   22        –            –                20,666                0.06          0.76            0.00
                                   23        –            –                21,469                0.061         0.77            0.00
                                   24        –            –                22,472                0.061         0.78            0.00
                                   25        –            –                23,675                0.062         0.79            0.00
Solar PV + Grid + Battery          21        –            1                20,253                0.061         0.75            0.00
                                   21        –            1                20,053                0.062         0.75            0.00
                                   24        –            1                22,862                0.063         0.78            0.00
                                   25        –            1                23,465                0.064         0.79            0.00
                                   24        –            2                23,652                0.065         0.78            0.00
Solar PV + Wind + Grid             13        4            –                40,839                0.137         0.78            0.00
                                   14        4            –                41,642                0.137         0.79            0.00
                                   15        4            –                42,445                0.137         0.8             0.00
                                   15        4            –                42,645                0.137         0.8             0.00
                                   16        4            –                43,448                0.138         0.82            0.00
Solar PV + Wind + Battery          100       5            12               122,780               0.432         1               0.24
                                   100       5            11               123,390               0.436         1               0.26
                                   100       5            12               122,980               0.433         1               0.23
                                   100       5            11               123,590               0.437         1               0.26
                                   100       5            12               125,580               0.441         1               0.22
Solar PV + Wind + Battery + Grid   21        4            1                48,453                0.141         0.86            0
                                   22        4            1                49,456                0.142         0.87            0
                                   24        4            1                51,062                0.143         0.88            0
                                   23        4            1                51,259                0.144         0.87            0
                                   24        4            2                52,252                0.145         0.88            0
References

1. J. Kumbar, N. Goudar, S. Bhavi, S. Walikar, B. Ronad, Optimization of hybrid renewable electric energy system components for Covid hospitals, in IEEE Mysore Sub Section International Conference (MysuruCon-2021), pp. 799–804 (2021)
2. M. Tefera, B. Ramchandra, R. Venkata, Modeling, analysis and optimization of grid-integrated and islanded solar PV systems for the Ethiopian residential sector: considering an emerging utility tariff plan for 2021 and beyond. Energies 14, 3360 (2021)
3. C. Palanichamy, P. Naveen, Micro grid for All India Institute of Medical Sciences, Madurai. Clean Energy 2021, 254–272 (2021)
4. S. Vendoti, M. Muralidhar, Modelling and optimization of an off-grid hybrid renewable energy system for electrification in rural areas. Energy Rep. 6, 594–604 (2020)
5. R. Anastasios, T. Dimitrios, K. Ioannis, B. Campbell, Design of a hybrid AC/DC microgrid using HOMER Pro: case study on an islanded residential application. MDPI Inventions 3(55), 1–14 (2018)
6. A. Abdullahi, The application of HOMER optimization software to investigate the prospects of hybrid renewable energy system in rural communities of Sokoto in Nigeria. Int. J. Electr. Comput. Eng. 7(2), 596–603 (2017)
7. U. Rilwan, G. Adamu, Feasibility analysis of a grid connected PV/wind options for rural healthcare centre using HOMER. Euro. J. Eng. Technol. 5(3), 2056–5860 (2017)
8. U. Rilwan, B. Marvin, Analysis and simulation of electrical load in a hospital using hybrid (diesel/solar) system as back up. J. Electron. Commun. Eng. Res. 3(9), 01–14 (2017)
9. L. Deepak Kumar, A.K. Akella, Optimization of PV/wind/micro-hydro/diesel hybrid power system in HOMER for the study area. Int. J. Electr. Eng. Inform. 3(3), 307–325 (2011)
10. B.F. Ronad, S.B. Kumbalavati, Performance assessment of SPV powered DC irrigation pumps with solar tracking mechanism, in IEEE International Conference on Energy Power and Environment, National Institute of Technology Meghalaya, Shillong, India (2021)
11. B.F. Ronad, S. Jangamshetti, Optimal cost analysis of wind-solar hybrid system powered AC and DC irrigation pumps using HOMER, in 4th International Conference on Renewable Energy Research & Applications, Palermo, Italy, pp. 1038–1042 (2015)
12. L. Tom, G. Paul, L. Peter, Micropower system modeling with HOMER, in Integration of Alternative Sources of Energy, Chapter 15 (National Renewable Energy Laboratory, 2008)
13. A.R. Nurul, O. Muhammad, M. Ismail, Optimal sizing and operational strategy of hybrid renewable energy system using HOMER, in 4th International Power Engineering and Optimization Conference (PEOCO2010), Shah Alam, Selangor, Malaysia, 23–24 (2010)
14. B. Mohit, D.K. Khatod, R.P. Saini, Modeling and optimization of integrated renewable energy system for a rural site, in International Conference on Reliability, Optimization and Information Technology (ICROIT 2014), India (2014)
15. G. Ranganathan, Energy storage capacity expansion of microgrids for a long-term. J. Electr. Eng. Autom. 3(1), 55–64 (2021)
Different Nature-Inspired Optimization Models Using Heavy Rainfall Prediction: A Review Nishant N. Pachpor, B. Suresh Kumar, Prakash S. Parsad, and Salim G. Shaikh
Abstract Predicting rainfall has become a problematic and unpredictable activity that profoundly impacts civilization. Precise and appropriate predictions can aid in pre-emptively reducing financial and human harm. Rainfall prediction helps with flood warnings, water resources management, air transport strategic planning, mobility restrictions, building construction, and other significant human aspects. In this paper, various existing methodologies of rainfall prediction are reviewed and compared. The existing techniques of rainfall prediction use different attributes for rainfall estimation: humidity, temperature, wind flow, pressure, sunlight, and evaporation are some attributes of rainfall prediction. Some machine learning-based models of heavy rainfall prediction, such as artificial neural network (ANN), Naive Bayes, decision forest regression (DFR), and boosted decision tree regression (BDTR), and optimization techniques such as firefly, particle swarm optimization, and the genetic algorithm are compared in this paper. RMSE, MAE, and correlation evaluation parameters are used to evaluate the rainfall prediction models.

Keywords Heavy rainfall · Optimization models · Decision forest regression (DFR) · Naïve Bayes (NB) classification
N. N. Pachpor (B), Amity University, Jaipur, India. e-mail: [email protected]
B. Suresh Kumar, Sanjay Ghodawat University, Kolhapur, India. e-mail: [email protected]
P. S. Parsad, Priyadarshini College of Engineering, Nagpur, India. e-mail: [email protected]
S. G. Shaikh, Department of CE, SIT, Lonavala, India. e-mail: [email protected]
1 Introduction

Agriculture is the backbone of India's economy, and agriculture's success is determined by precipitation, which also aids in watershed management. Rainfall data from the past has helped farm owners efficiently manage agricultural products, resulting in increased economic development of the country [1]. Predicting rainfall is advantageous in preventing floodwaters, which protects people and assets. Predicting precipitation is difficult for meteorologists due to fluctuations in time and in the number of rainy days. Creating a realistic rainfall forecasting model is among the most challenging problems for investigators from several domains, including meteorological data gathering, ecological pattern recognition, operational hydrogeology, and quantitative prediction. A typical concern in such problems is interpreting prior forecasts and applying them to forecasts about the future. Rainfall forecasting helps manage water resources, flood warnings, air transport strategic planning, mobility restrictions, building construction, and other significant human aspects. Meteorological observatories, cable and wireless connectivity, and high-end machines capture precipitation data for prediction [2]. Weather forecasting has been an exciting and intriguing area since the beginning of human civilization, and it remains among the most intricate and attractive areas today. To forecast precipitation, researchers employ a variety of approaches and procedures, several of which are more accurate than others. In weather prediction, variables including moisture, warmth, elevation, precipitation, wind speed and direction, absorption, and others are gathered. Precipitation forecasting is currently an essential aspect of most rainwater collection facilities worldwide. Among the most challenging issues is the unreliability of precipitation statistics: the majority of present precipitation prediction methods cannot identify underlying structures or nonlinear patterns in precipitation data [3]. This study will aid in the discovery of previously unknown patterns and nonlinear tendencies, which are required for effective rainfall forecasting. Owing to the prevalence of complicated challenges in present methodologies, which often fail to uncover underlying knowledge and nonlinear tendencies effectively, prediction projections have been inaccurate, leading to substantial damage. As a result, this study aims to review precipitation forecasting models that can address these concerns, detect complexities and previously unknown patterns, and deliver accurate and trustworthy estimations, thereby aiding the region's agricultural and economic development. There are several existing cases of heavy rainfall in India. The summer monsoon in India lasts for four months (June-September) and brings the majority of the precipitation. Rajasthan, an Indian state, is located in a drought-prone area, and precipitation is the state's primary water source [4]. Rajasthan has varied geography and has experienced numerous droughts and floods in the past. With such unpredictability in climatology, making reliable forecasts becomes a challenging undertaking. Predicting rainfall regularity and intensity necessitates time and effort, starting with data gathering, cleaning, evaluation, analysis, and assessment, and forecasting using an accurate methodology. There is a need for an advanced method that can predict rainfall earlier.
The organization of this paper is as follows: the first section introduces rainfall and the issues that occur due to precipitation. The different existing techniques of rainfall prediction are reviewed in Sect. 2. Heavy rainfall areas are described in Sect. 3. Several machine learning-based models developed for earlier forecasting of rainfall, i.e., heavy rainfall prediction models, are discussed in Sect. 4.
2 Literature Review

Rainfall is among the most significant considerations influencing the life and behaviour of ecological components, which impact the region's economy and residents. Measures shall be taken to minimize catastrophes induced by precipitation events by forecasting impending precipitation instabilities. Two approaches are commonly used to predict rainfall. One option is to examine the large volumes of data accumulated over time to obtain insight into future rains. Another option is to construct solutions by specifying different parameters and replacing variables to accomplish that goal. Barrera-Animas et al. [5] compared precipitation forecasting methods premised on traditional machine learning methods and computational intelligence frameworks suitable for analytical applications. With the objective of predicting daily precipitation amounts from weather data, a bidirectional long short-term memory (Bi-LSTM)-based network, a long short-term memory (LSTM) network, an ensemble of gradient boosting (E-GB) regressors, linear support vector regression (SVR), an extra-trees regressor, a stacked-LSTM, and XGBoost were compared. Temperature records from five primary areas of the United Kingdom (UK), from 2000 to 2020, were used. The accuracy of the models was assessed using the parameters mean absolute error (MAE), root mean squared error (RMSE), loss, and root mean squared logarithmic error (RMSLE). Raval et al. [6] used machine learning techniques to estimate precipitation and examined the effectiveness of different designs. An optimal computational model and a classification algorithm based on a neural network were created in the proposed methodology. The older and newer prediction models employ Australian precipitation data. The forecasting in this research was developed using precipitation information recorded over 10 years, from 2007 to 2017, with contributions from 26 regionally different areas. For verification purposes, the dataset was separated into testing and training sets. According to the findings, both classical and machine learning-based neural network techniques can forecast future rainfall. Gowtham Sethupathi et al. [7] compared the effectiveness of two machine learning algorithms for predicting rainfall, random forest and logistic regression; both produce highly accurate forecasts. The research would assist investigators in systematically reviewing precipitation forecasting efforts, focusing on classification techniques, and serve as a basis for conclusions and recommendations. The Anaconda platform was used, with Python, which is extensible and flexible, as the programming language; the packages used throughout the development were Seaborn, NumPy, Pandas, and Matplotlib. Choi et al. [8] introduced the RADAR, AWS and
ASOS, and IMERG Network Fusion (RAIN-F) dataset for modeling and forecasting rainfall. There were 26,280 frames in all, comprising nine distinct atmospheric condition factors connected to rainfall. The RAIN-F dataset was the first to combine rainfall measurements with precise meteorological system variables. The frame rate was one hour, while the spatial resolution varies from around 0.5 km for the RADAR component to 0.1° for the IMERG component, depending mostly on the data. The authors also presented baseline outcomes based on an image-to-image retrieval paradigm, the U-Net framework. The reliability of results on the RAIN-F dataset was also contrasted with results of operation on RADAR statistics alone. In the end, the authors demonstrated that the RAIN-F collection surpasses the radar-only database in terms of forecasting, particularly in high rainfall locations. ASL [9] analyzed that, with current technology, data was erroneous and appropriate statistical procedures were not used, which added to the intricacy. When it comes to predicting precipitation patterns, humans can make mistakes that might result in damage, mortality, and loss of income. The author [9] used artificial intelligence techniques to predict rainfall dependent on precipitation variables such as maximum temperature, minimum temperature, weather conditions, humidity, dew point, thunderstorms, and sunlight. Furthermore, trustworthy precipitation forecast results were obtained using relevant database approaches and implementing appropriate conditions. Consequently, the decision tree combined with the Gini index was employed to reliably anticipate precipitation based on historical data. Sulaiman and Wahab [10] used an artificial neural network to forecast severe rainfall regularly. Rainfall data from regional climatic zones from 1965 until 2015 were gathered and used in the study. To measure the efficiency of the ANN approximations, various sets of prior rainfall data were created as predicting variables. The ANN model's result was evaluated against the autoregressive integrated moving average (ARIMA) statistical method. Mahanta et al. [11] and Sarkar et al. [12] also worked on rainfall prediction-based methodologies used to predict Indian rain; the proposed methods were based on natural fundamentals. Yaseen et al. [13] proposed a hybrid methodology based on firefly optimization and a fuzzy-based adaptive inference model, with the main objective of predicting heavy rainfall on the Malaysian Pahang River. For the training and testing of the proposed model, 15 years of data were collected, and the proposed methodology provided adequate results. The correlation coefficient (denoted R2) and root mean squared error (RMSE) are used to monitor the effectiveness of every strategy. The findings demonstrate that the ANN model accurately predicts extreme rainfall situations over the danger point. Table 1 shows various existing heavy rainfall prediction methodologies and methods, with research gaps and performance parameters. The different existing comparative techniques of rainfall prediction models, with datasets and attributes, are depicted in Table 2.
The primary characteristics used in the different rainfall prediction models are humidity, temperature, ground-level pressure, wind direction, etc.
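For reference, the RMSE, MAE, and correlation measures used by these evaluation studies can be computed as in the following sketch (a generic implementation, not code from any of the reviewed papers):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mae(y_true, y_pred):
    """Mean absolute error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))

def correlation(y_true, y_pred):
    """Pearson correlation coefficient between observed and predicted."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])

obs = [0.0, 2.5, 10.0, 1.0]   # observed daily rainfall (mm), toy values
pred = [0.4, 2.0, 8.5, 1.5]   # model predictions, toy values
print(rmse(obs, pred), mae(obs, pred), correlation(obs, pred))
```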
Table 1 Comparative analysis of various existing heavy rainfall prediction methodologies

Authors and year | Proposed methods | Research gaps/problems | Performance parameters
Barrera-Animas et al. (2022) [5] | Machine learning techniques for rainfall forecasting | Overfitting issues | Root mean squared logarithmic error (RMSLE), root mean square error (RMSE), mean absolute error (MAE), loss
Raval et al. (2021) [6] | Automated rainfall prediction analysis | Need to work on the feature selection method | Loss, precision, F1-score
Gowtham Sethupathi et al. (2021) [7] | Random forest and logistic regression-based methodology for rainfall prediction | Limited characteristics of the dataset | Accuracy
Choi et al. (2021) [8] | Convolutional neural network (CNN)-based model for rainfall forecasting | High computational time | –
ASL (2019) [9] | Gini index-based methodology for heavy rainfall forecasting | Low accuracy rate | Accuracy
Sulaiman and Wahab (2018) [10] | Artificial neural network (ANN)-based technique for heavy rainfall prediction | More inaccurate results | RMSE, correlation coefficient
Mahanta et al. (2013) [11] | Statistical approach | Need to collect data for multiple attributes | Convective available potential energy (CAPE), convective inhibition energy (CINE)
Sarkar et al. (2020) [12] | Rain bird dependent methodology for rainfall prediction | Issues in data filtering | Area under receiver operating characteristics (ROC) curve (AUC)
Yaseen et al. (2018) [13] | Nature-inspired adaptive neuro-fuzzy inference system (ANFIS) with firefly algorithm (FFA) | Need to collect data about more attributes of rainfall prediction | Nash-Sutcliffe coefficient (NSE), Willmott's index (WI), relative error (RE), correlation score, RMSE, MAE
Table 2 Existing comparative techniques of rainfall prediction models with dataset and attributes

Authors and year | Comparative techniques | Dataset | Attributes for rainfall prediction
Barrera-Animas et al. (2022) [5] | XGBoost, LSTM, AutoML | Open-weather dataset | Ground-level pressure, humidity, temperature
Raval et al. (2021) [6] | Naive Bayes, random forest, K-nearest neighbour (KNN), decision tree (DT), linear discriminant analysis (LDA), logistic regression (LR) | Online data from several Australian weather stations (142,194 sets) | Rain, temperature, sunshine
Gowtham Sethupathi et al. (2021) [7] | Random forest, logistic regression | Data collected from different regions of India (2015–2018) | Region, temperature, pressure, rain, wind direction
Choi et al. (2021) [8] | Convolutional neural network (CNN) | RAIN-F dataset (26,280 images) | Humidity, temperature, pressure, rain, wind direction
ASL (2019) [9] | Decision tree | Hong Kong Observatory-based dataset | Temperature, wind speed, moisture, rainfall
Sulaiman and Wahab (2018) [10] | Autoregressive integrated moving average (ARIMA), artificial neural network (ANN) | Rainfall data from meteorological areas (1965–2015) | Rain, temperature, sunshine
Mahanta et al. (2013) [11] | – | 31 years of Northeast Indian rainfall data | Rainfall
Sarkar et al. (2020) [12] | – | Data collected from North India | Rainfall
Yaseen et al. (2018) [13] | Adaptive neuro-fuzzy inference system (ANFIS), ANFIS-FFA (firefly algorithm) | 15 years of rainfall data (Malaysia) | Rain, temperature, sunshine
3 Several Case Studies in Heavy Rainfall Areas Heavy rainfall occurrences pose a pervasive problem for operational meteorology in India. This is mainly the case where the natural topography encourages the development of heavy rainfall activities [11]. The Northern Region of India is one of the continental rainfall zone's regions that receives a lot of heavy rainfall throughout the pre-monsoon and midsummer monsoon periods and the summer–autumn changeover period of October. Flooding, agricultural devastation, and a halt to daily life are consequences of such disasters. The seasonal variability of heavy rainfall frequencies is investigated utilizing hourly rainfall from 15 weather stations over 31 years (1971–2001). Ideal spots for observatories between 27.5° North and 28.1° North were discovered employing statistics from the Indian Meteorological Department. Several incidents are most likely to appear during the monsoon season (June 10–August 5). The summer months have the heaviest rainfall activity, particularly June and August. Over the years, high precipitation occurrences throughout the area have been dropping steadily. In April and May, before the rainy season arrives, there are also many rainstorm activities, which are likewise an origin of heavy rainfall occurrences. In India, flooding is a major natural disaster. The flooding areas of India that suffer from heavy rainfall are mentioned in this section. The heaviest rain in the southwest causes the Brahmaputra and other waterways to overflow their boundaries, drowning the nearby areas. Pahang is a significant state in Peninsular Malaysia and one of Malaysia's most flood-prone states, due to its proximity to Peninsular Malaysia's largest river, Sungai Pahang. The watercourse becomes the primary outlet for water flowing upstream in severe rains. As a result, providing prior precipitation forecasts and flood warnings has become one of the most effective strategies to deal with the flooding challenge. In August 2018, the southern regions of Karnataka and Kerala received heavy rains. Heavy downpours in Kerala from August 7 to 17 caused severe floods in several parts of the area, resulting in the deaths of 433 persons and the displacement of 5.4 million inhabitants [14]. Heavy rains in Maharashtra killed nearly 1094 persons altogether, including in significant portions of the metropolis Mumbai, which recorded 567 mm of rainfall on July 26, 2005. The areas mentioned above suffer from heavy rain and its risks every year. The chance of heavy rain can be predicted with the help of machine learning-based techniques such as random forest, artificial neural network (ANN), support vector machine (SVM), and decision tree (DT). The below-mentioned study explains the use of a machine learning-based approach for rainfall prediction. In Ref. [15], different techniques were used to predict rainfall: methods such as logistic regression, J48, PART, Naive Bayes, and random forest were used to predict rain. The rainfall prediction area was Srinagar (Jammu and Kashmir), India, and the data for training and testing the techniques were collected from 2015–2016. Srinagar is the most populous district in the Indian state of Jammu and Kashmir and
its summertime capital. It is situated on the banks of the River Jhelum, a branch of the River Indus, and the Dal and Anchar Lakes. The annual precipitation averages roughly 720 mm (28 in). The maximum temperature reliably reported is 38.3 °C (100.9 °F), while the minimum is −20.0 °C (−4.0 °F). Humidity, temperature, pressure, flow of wind, dew point, sea-level pressure, information about prior rainfall, and snowfall were all attributes considered for the construction of the rainfall model.
4 Different Nature-Inspired Optimization and Machine Learning Models Used for Heavy Rainfall Prediction Analysis The existing rainfall prediction models are based on machine learning methods such as artificial neural networks, recurrent neural networks, boosted decision tree regression (BDTR), decision forest regression (DFR), Naive Bayes, and fuzzy systems.
4.1 Recurrent Neural Network (RNN) A recurrent neural network is an artificial neural network in which connections between nodes form a directed graph along a temporal sequence. This allows a recurrent neural network to respond in a temporally dynamic way. RNNs evolved from feed-forward neural networks and can handle arbitrary-length input sequences by using hidden memory states. The term "recurrent neural network" is generally applied to two major kinds of systems that have a similar overall architecture, one with finite impulse response and the other with infinite impulse response [16].
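To make the hidden-memory-state idea concrete, the following minimal sketch implements an Elman-style recurrent cell in Python; the layer sizes, random weights, and toy hourly sequence are illustrative assumptions, not values from any work surveyed here.

```python
import numpy as np

# Minimal Elman-style RNN cell: h_t = tanh(W_x x_t + W_h h_{t-1} + b).
rng = np.random.default_rng(0)
n_in, n_hidden = 3, 8                      # e.g., humidity, temperature, pressure
W_x = rng.normal(0, 0.1, (n_hidden, n_in))
W_h = rng.normal(0, 0.1, (n_hidden, n_hidden))
W_o = rng.normal(0, 0.1, (1, n_hidden))
b_h = np.zeros(n_hidden)

def rnn_forward(sequence):
    """Carry the hidden state through an arbitrary-length input sequence."""
    h = np.zeros(n_hidden)                 # hidden memory state
    for x_t in sequence:                   # one time step per observation
        h = np.tanh(W_x @ x_t + W_h @ h + b_h)
    return (W_o @ h).item()                # scalar rainfall estimate

sequence = rng.random((24, n_in))          # 24 hourly weather readings
print("predicted rainfall (untrained):", rnn_forward(sequence))
```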
4.2 Artificial Neural Network (ANN) ANN is the abbreviation of artificial neural network, a computational network inspired by biological phenomena. ANN is a connectionist model that can solve complex pattern-oriented tasks such as classification and time-series trend analysis [17]. Due to the general nonparametric design of neural networks, models are generated without previous knowledge of the population's distribution or the probable degree of association across components, which is essential for most parametric statistical techniques. An artificial neural network (ANN) is a technique based on the working of the central nervous system, as shown in Fig. 1.
Fig. 1 Formation of the neural network [17]
Such frameworks are modeled on the biological nervous system, although they only employ a subset of the principles found in biological nervous systems. ANN models, in particular, mimic the electrical impulses of the central nervous system. Components, often referred to as neurodes or perceptrons, are linked to one another. The neurodes are generally structured in layers, with one layer's output functioning as the following layer's input, possibly through further layers. A neurode can be linked to all or a part of the neurodes in the next layer, replicating the synaptic connections between neurons.
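As a hedged illustration of such a layered network, the sketch below trains a small multilayer perceptron with scikit-learn; the feature names and synthetic data are invented for demonstration.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Layered perceptrons: each layer's output feeds the next layer's input.
rng = np.random.default_rng(1)
X = rng.random((500, 4))                        # humidity, temp, pressure, wind
y = 10 * X[:, 0] + 5 * X[:, 2] + rng.normal(0, 0.5, 500)  # toy rainfall target

model = MLPRegressor(hidden_layer_sizes=(16, 8), max_iter=2000, random_state=1)
model.fit(X[:400], y[:400])                     # train on the first 400 samples
print("R^2 on held-out data:", model.score(X[400:], y[400:]))
```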
4.3 Boosted Decision Tree Regression (BDTR)
A BDTR is a well-known approach for generating a combination of regression trees wherein every tree relies on the previous one. In a nutshell, it is an ensemble learning (EL) algorithm under which the second tree rectifies the mistakes of the first tree, and the third tree tries to correct the faults of the first and second trees. Forecasts are made using the entire collection of trees. When it comes to handling summary statistics, the BDTR excels. The benefits of BDTR are that it is resilient to incomplete information and assigns attribute importance ratings appropriately [18].
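The residual-correction idea can be illustrated with two hand-chained regression trees; this is a simplified sketch of boosting on synthetic data, not the exact BDTR configuration of [18].

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Two-stage boosting illustration: the second tree fits the residual
# errors left by the first, as described above.
rng = np.random.default_rng(2)
X = rng.random((300, 3))
y = 20 * X[:, 0] ** 2 + rng.normal(0, 0.3, 300)

tree1 = DecisionTreeRegressor(max_depth=2).fit(X, y)
residual = y - tree1.predict(X)                 # mistakes of the first tree
tree2 = DecisionTreeRegressor(max_depth=2).fit(X, residual)

pred = tree1.predict(X) + tree2.predict(X)      # ensemble of both stages
print("RMSE stage 1:", np.sqrt(np.mean((y - tree1.predict(X)) ** 2)))
print("RMSE stage 2:", np.sqrt(np.mean((y - pred) ** 2)))
```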
4.4 Decision Forest Regression (DFR)
A DFR is a collection of randomly constructed classification algorithms such as decision trees. It operates by learning a large number of decision trees and then outputting the mode of the categories (classification) or the average prediction (regression) as the outcome. Every tree is constructed with a randomly selected subset of characteristics and a random selection of training samples, which causes the trees to diverge by presenting them with multiple views of the database [18]. The trees
and the number of identifying factors considered at each vertex are the two essential parameters. It is often resistant to overfitting; DFR is useful for handling unequal, large datasets with incomplete components. It also exhibits fewer categorization mistakes and higher F-scores than single decision trees, but the outcomes are more difficult to comprehend.
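A brief sketch of such a randomized forest, again on synthetic data; n_estimators and max_features stand in for the two essential parameters mentioned above.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Forest of randomized trees whose predictions are averaged for regression.
rng = np.random.default_rng(3)
X = rng.random((300, 5))
y = 8 * X[:, 1] + 4 * X[:, 3] + rng.normal(0, 0.2, 300)

forest = RandomForestRegressor(n_estimators=100, max_features=2, random_state=3)
forest.fit(X, y)
print("attribute importance ratings:", np.round(forest.feature_importances_, 3))
```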
4.5 Naïve Bayes Naive Bayes is a supervised classification technique. These kinds of "probabilistic classifiers" are based on Bayes' theorem. Naive Bayes is used for both binary and multi-class classification. It is one of the most basic Bayesian network models and can reach high precision levels when combined with density estimation methods. The technique is rapid and can accurately predict the category of the testing dataset. If the premise of feature independence holds, the Naïve Bayes classifier outperforms alternative approaches with a smaller training set. It is used to predict multi-class categorization issues and deals with both discrete and continuous data. Naive Bayes is simple to construct because no complex parameter estimation is required, making it suitable for massive datasets [19].
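A minimal hedged example of Gaussian Naive Bayes for a rain/no-rain decision; the features and labels below are synthetic stand-ins.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Binary rain/no-rain classification with per-feature Gaussian density
# estimation, assuming conditional independence between features.
rng = np.random.default_rng(4)
X = rng.random((400, 3))                       # humidity, dew point, pressure
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)      # toy "rain tomorrow" label

clf = GaussianNB().fit(X[:300], y[:300])
print("test accuracy:", clf.score(X[300:], y[300:]))
print("P(rain | x):", clf.predict_proba([[0.9, 0.8, 0.4]])[0, 1])
```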
4.6 Fuzzy System A fuzzy system can be described as a set of IF–THEN rules containing fuzzy propositions, or as a mathematical or differential calculation containing random variables that represent the uncertainty of attribute values. The non-deterministic nature of biological neural computations inspired fuzzy logic. The combination of fuzzy logic [20] and neural networks into neuro-fuzzy structures is appropriate because both are derived from brain-like computations requiring training, adaptability, and disturbance tolerance. A fuzzy system is beneficial for rainfall prediction and analysis. It is an inference engine platform built on the Takagi–Sugeno fuzzy reasoning approach, introduced in the early nineties. It combines neural network models with probabilistic reasoning concepts in a unified model, allowing it to encapsulate the advantages of each. The inference system of a fuzzy system is based on a combination of fuzzy If–Then constraints with the capacity to simulate nonlinear variables through training.
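The IF–THEN machinery can be sketched with a zero-order Takagi–Sugeno system; the membership breakpoints and rule consequents below are invented purely for illustration.

```python
def tri(x, a, b, c):
    """Triangular membership with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Zero-order Takagi-Sugeno sketch with two IF-THEN rules over humidity.
def predict_rain(humidity):
    w_low = tri(humidity, 0, 30, 60)       # rule 1: humidity LOW  -> rain = 2 mm
    w_high = tri(humidity, 40, 80, 100)    # rule 2: humidity HIGH -> rain = 15 mm
    if w_low + w_high == 0:
        return 0.0
    return (w_low * 2 + w_high * 15) / (w_low + w_high)  # weighted average

print(predict_rain(70))   # blends both rules according to firing strengths
```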
4.7 Firefly Algorithm FA is one of the recent meta-heuristic approaches. The FA depends on the social behaviour of fireflies. To communicate, find prey, and attract mates, fireflies use bioluminescence with diverse light intensity (LI) patterns. A firefly (FF) with a stronger LI pattern will attract the other FFs to move toward it. The basic principle is that the intensity of light decreases approximately exponentially with the distance between two fireflies, which allows an attractiveness measure to be constructed for each pair of fireflies. Members of the population make purposeful or randomized movements to maximize this attractiveness [13]. As a result, all of the flies flock toward the brighter ones until the community converges around the strongest individual.
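A compact sketch of the firefly movement rule described above; the coefficients beta0, gamma, and alpha are common textbook defaults rather than values from [13].

```python
import numpy as np

# Firefly algorithm minimizing a toy objective; brightness is the negated
# cost, and attractiveness decays exponentially with squared distance.
def firefly(cost, dim=2, n=15, iters=50, beta0=1.0, gamma=1.0, alpha=0.2):
    rng = np.random.default_rng(5)
    pos = rng.uniform(-5, 5, (n, dim))
    for _ in range(iters):
        light = -np.array([cost(p) for p in pos])      # brighter = lower cost
        for i in range(n):
            for j in range(n):
                if light[j] > light[i]:                # move i toward brighter j
                    r2 = np.sum((pos[i] - pos[j]) ** 2)
                    beta = beta0 * np.exp(-gamma * r2) # attractiveness
                    pos[i] += beta * (pos[j] - pos[i]) + alpha * rng.normal(size=dim)
    best = min(pos, key=cost)
    return best, cost(best)

print(firefly(lambda p: np.sum(p ** 2)))               # converges near the origin
```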
4.8 Genetic Algorithm A genetic algorithm is a heuristic-based technique that follows Charles Darwin's natural selection hypothesis. The technique mimics natural selection, in which the best individuals from a population are chosen for the process of reproduction in an attempt to produce the subsequent generation's offspring [21]. The individuals are also known as phenotypes or creatures. A genetic algorithm is a technique for addressing constrained as well as unconstrained optimization issues, based on a natural evolution methodology similar to evolutionary biology.
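A minimal GA sketch with tournament selection, one-point crossover, and mutation; the population size and rates are illustrative defaults.

```python
import numpy as np

# Selection, crossover, and mutation evolve real-valued individuals
# toward lower cost, mimicking natural selection as described above.
def genetic(cost, dim=2, pop_size=20, gens=60, mut_rate=0.1):
    rng = np.random.default_rng(6)
    pop = rng.uniform(-5, 5, (pop_size, dim))
    for _ in range(gens):
        children = []
        for _ in range(pop_size):
            a, b = pop[rng.integers(pop_size, size=2)]   # tournament pick 1
            p1 = a if cost(a) < cost(b) else b
            a, b = pop[rng.integers(pop_size, size=2)]   # tournament pick 2
            p2 = a if cost(a) < cost(b) else b
            cut = rng.integers(1, dim + 1)               # one-point crossover
            child = np.concatenate([p1[:cut], p2[cut:]])
            mask = rng.random(dim) < mut_rate            # random mutation
            child[mask] += rng.normal(0, 0.5, mask.sum())
            children.append(child)
        pop = np.array(children)
    best = min(pop, key=cost)
    return best, cost(best)

print(genetic(lambda p: np.sum(p ** 2)))
```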
4.9 Particle Swarm Optimization (PSO)
The particle swarm optimization technique was developed by Eberhart and Kennedy. It is also called a meta-heuristic strategy. The PSO's operation is based on the flocking and swarming behavior of birds, which the developers of PSO analyzed. It is a very popular optimization technique due to its simplicity of application [21]. In PSO, each particle is a candidate solution of the problem, and the candidate solutions are encoded by a vector or an array. Just as each bird flies in search of food, each particle moves through the search space in search of optimal solutions. Each particle tracks its own best solution, and in the end the global optimal value is updated, which provides the best solution (see the sketch below). The performance of different heavy rainfall prediction models such as ANN, BDTR, DFR, Naïve Bayes, and fuzzy system is depicted in Table 3. The performance of the existing rainfall models is analyzed based on mean absolute error (MAE), root mean squared error (RMSE), and correlation coefficient. The ANN-based rainfall prediction model achieved efficient results with a minimum value of RMSE.
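A minimal PSO sketch showing the velocity and position updates described above; the inertia and acceleration coefficients are common defaults, not tuned values.

```python
import numpy as np

# Each particle encodes a candidate solution as a vector and is pulled
# toward its personal best and the global best of the swarm.
def pso(cost, dim=2, n=20, iters=60, w=0.7, c1=1.5, c2=1.5):
    rng = np.random.default_rng(7)
    x = rng.uniform(-5, 5, (n, dim))                   # particle positions
    v = np.zeros((n, dim))                             # particle velocities
    pbest = x.copy()
    pbest_val = np.array([cost(p) for p in x])
    gbest = pbest[np.argmin(pbest_val)].copy()         # global best solution
    for _ in range(iters):
        r1, r2 = rng.random((2, n, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = x + v
        vals = np.array([cost(p) for p in x])
        improved = vals < pbest_val                    # update personal bests
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        gbest = pbest[np.argmin(pbest_val)].copy()     # update global best
    return gbest, cost(gbest)

print(pso(lambda p: np.sum(p ** 2)))
```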
Table 3 Different heavy rainfall prediction models

| Heavy rainfall prediction models | RMSE | MAE | Correlation coefficient (R2) |
|---|---|---|---|
| ANN [10] | 0.06 | – | 0.46 |
| Boosted decision tree regression (BDTR) [18] | 0.16 | 0.10 | 0.67 |
| Decision forest regression (DFR) [18] | 0.17 | 0.11 | 0.66 |
| Fuzzy system [20] | 0.457 | 0.226 | 0.285 |
Table 4 Optimization techniques for heavy rainfall prediction models

| Optimization technique | RMSE |
|---|---|
| Firefly technique [13] | 0.53 |
| Genetic algorithm [21] | 0.67 |
| PSO [21] | 0.93 |
The performance of existing optimization techniques for heavy rainfall prediction models is depicted in Table 4. The RMSE performance metric is used for the evaluation; the firefly technique provided more efficient results than the other existing techniques. The comparative analysis of different existing rainfall prediction models based on various performance metrics is depicted below. The ANN model has the minimum root mean square error (RMSE). The maximum correlation coefficient is attained by boosted decision tree regression (BDTR) [18], and BDTR also gains the minimum value of MAE. Figure 2 represents the comparison of existing models of rainfall prediction based on the RMSE performance parameter. The techniques BDTR and DFR provided approximately similar values of RMSE. The ANN model provided the minimum value of RMSE among the comparable models.
Fig. 2 Comparison of different existing models of rainfall prediction: RMSE
Fig. 3 Comparison of different existing models of rainfall prediction: MAE
Fig. 4 Comparison of different existing models of rainfall prediction: correlation coefficient
The mean absolute error (MAE) results of the existing rainfall prediction models are depicted in Fig. 3. BDTR and DFR attained approximately the same value of MAE and provided efficient outcomes. Figure 4 represents the correlation coefficient values of different rainfall prediction models. The existing fuzzy-based model has the minimum value of the correlation coefficient, which indicates inefficient results. The correlation values of BDTR and DFR are similar and provide adequate performance for rainfall prediction. The comparison of different existing optimization techniques for rainfall prediction models is presented in Fig. 5. The particle swarm optimization technique attained the highest value of RMSE, which is not adequate. The firefly algorithm attained effective outcomes with the lowest value of RMSE among the techniques.
5 Conclusion and Future Work Precipitation is a primary source of revenue for national economic development, and it should be regarded as the most pressing concern for most
Fig. 5 Comparison of different nature-inspired optimization techniques for rainfall prediction: RMSE
of us. Conventional precipitation prediction models underperform in the most challenging scenarios since they cannot capture the underlying knowledge that must be recognized to make a correct estimate. Several methodologies are being examined to establish an accurate way of predicting precipitation. In this paper, different models of rainfall prediction and optimization techniques for rainfall prediction are compared and analyzed. The comparative analysis of rainfall prediction models is based on RMSE, MAE, correlation coefficient, etc. The ANN model achieved the minimum root mean square error (RMSE) among the comparable models. The maximum correlation coefficient is attained by boosted decision tree regression (BDTR), which also achieves the minimum value of the mean absolute error. The optimization techniques for rainfall prediction models are also compared using the RMSE parameter; the firefly technique provided more efficient results than the other existing techniques. In the future, more rainfall prediction-based models will be reviewed and compared. The existing studies' findings revealed that independent machine learning algorithms can forecast precipitation with an adequate level of certainty. However, more effective precipitation forecasting might be accomplished by developing hybrid machine learning techniques and various climate projections.
References 1. B.T. Pham, L.M. Le, T.T. Le, K.T.T. Bui, V.M. Le, H.B. Ly, I. Prakash, Development of advanced artificial intelligence models for daily rainfall prediction. Atmos. Res. 237, 104845 (2020) 2. A. Parmar, K. Mistree, M Sompura, Machine learning techniques for rainfall prediction: A review, in International Conference on Innovations in Information Embedded and Communication Systems, vol. 3 (2017) 3. P. Tarolli, M. Borga, K.T. Chang, S.H. Chiang, Modeling shallow landsliding susceptibility by incorporating heavy rainfall statistical properties. Geomorphology 133(3–4), 199–211 (2011) 4. K. Srivastava, D. Pradhan, Real-time extremely heavy rainfall forecast and warning over Rajasthan during the monsoon season (2016). Pure Appl. Geophys. 175(1), 421–448 (2018) 5. A.Y. Barrera-Animas, L.O. Oyedele, M. Bilal, T.D. Akinosho, J.M.D. Delgado, L.A. Akanbi, Rainfall prediction: A comparative analysis of modern machine learning algorithms for timeseries forecasting. Mach. Learn. Appl. 7, 100204 (2022) 6. M. Raval, P. Sivashanmugam, V. Pham, H. Gohel, A. Kaushik, Y. Wan, Automated predictive analytics tool for rainfall forecasting. Sci. Rep. 11(1), 1–13 (2021)
7. M. Gowtham Sethupathi, Y.S. Ganesh, M.M. Ali, Efficient rainfall prediction and analysis using machine learning techniques. Turkish J. Comput. Math. Educ. (TURCOMAT) 12(6), 3467–3474 (2021) 8. Y. Choi, K. Cha, M. Back, H. Choi, T. Jeon, RAIN-F: A fusion dataset for rainfall prediction using convolutional neural network, in 2021 IEEE International Geoscience and Remote Sensing Symposium IGARSS (IEEE, 2021), pp. 7145–7148 9. S.N. ASL, Heavy rainfall prediction using Gini index in decision tree. Int. J. Recent Technol. Eng. (IJRTE), 8(4), 4558–4562 (2019) 10. J. Sulaiman, S.H. Wahab, Heavy rainfall forecasting model using artificial neural network for flood-prone area, in IT convergence and security 2017 (Springer, Singapore, 2018), pp. 68–76 11. R. Mahanta, D. Sarma, A. Choudhury, Heavy rainfall occurrences in northeast India. Int. J. Climatol. 33(6), 1456–1469 (2013) 12. D. Sarkar, B. Tomar, R.S. Kumar, S. Saran, G. Talukdar, Tracking the Rain Bird: Modeling the monthly distribution of Pied cuckoo in India (2020). bioRxiv 13. Z.M. Yaseen, M.I. Ghareb, I. Ebtehaj, H. Bonakdari, R. Siddique, S. Heddam, … R. Deo, Rainfall pattern forecasting using novel hybrid intelligent model based ANFIS-FFA. Water Resour. Manag. 32(1), 105–122 (2018) 14. Floods in India—Wikipedia. En.wikipedia.org. (2022). Retrieved 5 Jan 2022, from https://en. wikipedia.org/wiki/Floods_in_India 15. R. Mohd, M.A. Butt, M.Z. Baba, Comparative study of rainfall prediction modeling techniques (A case study on Srinagar, J&K, India). Asian J. Comput. Sci. Technol. 7(3), 13–19 (2018) 16. M.P. Darji, V.K. Dabhi, H.B. Prajapati, Rainfall forecasting using neural network: A survey, in 2015 International Conference on Advances in Computer Engineering and Applications (IEEE, 2015), pp. 706–713 17. D.R. Nayak, A. Mahapatra, P. Mishra, A survey on rainfall prediction using artificial neural network. Int. J. Comput. Appl. 72(16) (2013) 18. W.M. Ridwan, M. Sapitang, A. Aziz, K.F. Kushiar, A.N. Ahmed, A. El-Shafie, Rainfall forecasting model using machine learning methods: Case study Terengganu Malaysia. Ain Shams Eng. J. 12(2), 1651–1663 (2021) 19. R. Mohd, M. Ahmed, M. Zaman, Modelıng rainfall prediction: A Naive Bayes approach. Int. J. Adv. Electron. Comput. Sci. 5(12) (2018) 20. N.Z.M. Safar, A.A. Ramli, H. Mahdin, D. Ndzi, K.M.N.K. Khalif, Rain prediction using fuzzy rule based system in North-West Malaysia. Indonesian J. Electric. Eng. Comput. Sci. 14(3), 1572–1581 (2019) 21. J. Wu, J. Long, M. Liu, Evolving RBF neural networks for rainfall prediction using hybrid particle swarm optimization and genetic algorithm. Neurocomputing 148, 136–142 (2015)
Hyper Chaos Random Bit-Flipping Diffusion-Based Colour Image Cryptosystem Sujarani Rajendran , Manivannan Doraipandian, Kannan Krithivasan, Ramya Sabapathi, and Palanivel Srinivasan
Abstract In today's digital environment, sensitive and personal images are transmitted over different social media applications over an insecure network. Considering the users' privacy and security, their sensitive images have to be transferred in a secure manner. This work proposes a new colour image crypt architecture by utilizing the hyper Lorenz chaotic map. A new random bit-flipping methodology is employed in the diffusion process to increase the security level of the cryptosystem. The developed architecture is the combination of R, G, and B channel confusion and random bit-level diffusion of each pixel. Experimental results and security analysis indicate that the developed model has the ability to resist statistical, differential, and exhaustive attacks and also achieves sufficient robustness. Keywords Bit-flipping · Hyper chaos · Lorenz map · Trajectories · Statistical analysis
1 Introduction With the advancement of the digital era, the transfer of digital content has increased, especially in media and social networks. Since sensitive and personal images are frequently transmitted, security in transmission becomes a demanding issue and gets the highest attention [1, 2]. Most of the images carry some authentication data, like biometric images, which easily reveal personal content. However, transmitting the images in readable form is not an advisable solution, and hence encryption of images provides sufficient security during transmission. Furthermore, conventional security algorithms like AES, ECC, IDEA, and RSA are available, but these algorithms work well for encrypting text data and are no longer applicable for encrypting S. Rajendran (B) Department of Computer Science and Engineering, Srinivasa Ramanujan Centre, SASTRA Deemed University, Kumbakonam 612001, India e-mail: [email protected] M. Doraipandian · K. Krithivasan · R. Sabapathi · P. Srinivasan School of Computing, SASTRA Deemed University, Thanjavur 613401, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_59
images, because of the in-built properties of images like large data volume and huge redundancy and correlation among pixels [3]. To overcome these complications, various encryption methodologies have been developed by different researchers, such as fuzzy logic [4], neural network [5], finite series [6], and chaotic system [7] approaches. Hereof, chaos-based image cryptosystems have been proved to be efficient in terms of security and performance [8–10] and have been developed for providing security for images in different applications like e-health care [11], e-military [12], IoT applications, and different social media. The base of a chaotic cryptosystem is a chaotic map. Different chaotic maps are available depending on their dimensions (one-dimensional (1D), two-dimensional (2D), and multidimensional). Earlier, most of the researchers utilized 1D chaotic maps like the logistic [13], Chebyshev [14], and tent map [15]. But later, 1D chaotic maps were found to provide insufficient security for large-size images because of their short-term randomness and small key size. In connection with this, image cryptosystems were developed by utilizing 2D chaotic maps like the Henon [16], Arnold cat map [17], cubic map [18], etc. Rehman et al. [19] proposed a novel double-layer confusion and diffusion, by taking the key series generated by a 2D logistic map and employing DNA computing technology for enhancing the security level of the developed cryptosystem. Liu et al. [20] proposed a novel 2D Chebyshev-sine map by combining the 1D sine and Chebyshev maps, and proved the unpredictability of the key series by exposing the trajectories, bifurcation, and Lyapunov exponent results. These chaotic series were taken as key values for encrypting each component of the colour images with efficient execution time. An efficient encryption algorithm for secure image communication was developed by Mondal et al. [21] by utilizing the pseudo-random series generated by a 2D Baker's map. For diffusion, an XOR operation was executed between the chaotic series of the Baker's map and the pixels of the confused image. Jin et al. [7] proposed a new non-RGB colour image encryption by using the YCbCr and HSV strategies, where the chroma channels were confused by using a 2D Arnold cat map and DNA encoding and computing were used for diffusing each channel of the image; the efficiency and security level were identified by applying several statistical and correlation attacks. Xiaolin et al. [22] proposed a new 2D rectangular transform (RT) based on an extension of the Arnold cat map, and developed an image cryptosystem by combining 2D-RT with a chaotic tent map. In encryption, the chaotic series generated by both the transform and the map were utilized for executing the confusion and diffusion process, which achieved a good encryption level. These fusional or 2D chaotic maps provide better randomness than the 1D chaotic maps; however, they lag in key size, and the same set of chaotic series has to be applied for all the channels when encrypting colour images. These limitations have been overcome by utilizing higher-dimensional hyper chaotic maps like three-dimensional (3D), four-dimensional (4D), etc. Many colour image cryptosystems have been developed with higher key size and different key series used for encrypting each channel of the image [23–25]. A novel hyper chaotic colour image encryption with random bit-shift diffusion, based on the inspiration of the previously discussed image cryptosystems and their limitations, has been proposed in this paper.
At first, the chaotic key series are
generated based on an iterative execution of the 3D Lorenz chaotic system. After the confusion part is executed on each channel of the image, the confused channels are converted into binary form, and diffusion is executed by applying a random bit-shift to each binary pixel based on the random chaotic series. Finally, the encrypted colour image is obtained by joining the encrypted channels. This article is organized as follows: Sect. 2 describes the 3D Lorenz chaotic system used for generating the chaotic series, Sect. 3 discusses the architecture of the proposed encryption model, Sect. 4 discusses the security analysis, and the article ends with the conclusion in Sect. 5.
2 Material and Methods In the developed crypt model, 3D Lorenz chaotic system [26] is utilized for generating three different chaotic series with different mathematical function. These chaotic series are utilized for encrypting each channel of the given input colour image. The following subsections elaborate the definition of Lorenz chaotic map.
2.1 Lorenz Hyper Chaotic System It is a set of three coupled dynamic functions with good chaotic behaviour, highly sensitive to the initial values of the chaotic series and the system parameters. The mathematical representation of the 3D Lorenz system is described in Eq. (1):

dp/dt = α(q − p)
dq/dt = p(β − r) − q
dr/dt = pq − γr,   with 0 < p, q, r; α = 10; β = 28; γ = 8/3.   (1)
This chaotic map contains three initial values and three system parameters: p0, q0, r0, α, β, and γ. The Lorenz map delivers random sets of chaotic series depending on the values initialized for the system parameters as mentioned in Eq. (1). The trajectories of the p, q, and r series show that the Lorenz map holds efficient chaotic behaviour (Fig. 1).
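A hedged sketch of how such series can be produced numerically: forward-Euler integration of Eq. (1) in Python. The step size is an assumption of this sketch; the seed values reuse those quoted later in Sect. 4.

```python
import numpy as np

# Forward-Euler integration of the Lorenz system in Eq. (1) to produce
# the chaotic key series P, Q, R (one value per pixel).
def lorenz_series(size, p=0.6758493827, q=0.238776234, r=0.2398764523,
                  alpha=10.0, beta=28.0, gamma=8.0 / 3.0, dt=0.001):
    P, Q, R = np.empty(size), np.empty(size), np.empty(size)
    for k in range(size):
        dp = alpha * (q - p)
        dq = p * (beta - r) - q
        dr = p * q - gamma * r
        p, q, r = p + dt * dp, q + dt * dq, r + dt * dr
        P[k], Q[k], R[k] = p, q, r
    return P, Q, R

P, Q, R = lorenz_series(256 * 256)   # key series for a 256 x 256 image
```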
3 Proposed Colour Image Cryptosystem The developed method is the combination of three processes: hyper chaotic series generation, pixel position confusion, and randomized bit-level diffusion of each pixel.
Fig. 1 Trajectories of the chaotic series of (a) p and q, (b) q and r, (c) p and r, and (d) p, q, and r
Here, the confusion process is used to permute the image to decrease the correlation among pixels. When confusion is executed, pixel positions are changed but the values are not changed; so, based on histogram analysis, attackers could easily identify the original image. Hence, a diffusion process is added to increase the security, which changes the pixel values of the image. The step-by-step procedure of the proposed cryptosystem is discussed in the following contents, and its block view is illustrated in Fig. 2.
Fig. 2 Proposed hyper chaos colour image crypt architecture
Step 1: Input colour images CIM of size H × W (Height × Width) are separated into three channels (CR, CG, and CB) of equal size.
Step 2: Generate the chaotic series P, Q, and R as
P = p0, p1, p2, ..., psize
Q = q0, q1, q2, ..., qsize
R = r0, r1, r2, ..., rsize
where size = (H × W).
Step 3: Sort all three chaotic series in ascending order and maintain the old index and the new array index of the sorted and unsorted key series as represented in Eq. (2):
[sortp, pindex] = sort(P)
[sortq, qindex] = sort(Q)
[sortr, rindex] = sort(R)   (2)
Here, sorting is required for interchanging the pixels in the confusion process. The old index value and the new index value are taken as the row and column positions of the pixels so that all the pixels are interchanged.
Step 4: Confuse all three channels by exploiting the sorted-array-index and unsorted-array-index chaotic series as represented in Algorithm 1.
Algorithm 1: Confusion Process
For i = 1:H
For j = 1:W
Temp1 = CR(i, j)
Temp2 = CG(i, j)
Temp3 = CB(i, j)
CR(i, j) = CR(pindex(i), pindex(j))
CG(i, j) = CG(qindex(i), qindex(j))
CB(i, j) = CB(rindex(i), rindex(j))
CR(pindex(i), pindex(j)) = Temp1
CG(qindex(i), qindex(j)) = Temp2
CB(rindex(i), rindex(j)) = Temp3
End
End
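For readers outside MATLAB, the confusion step can be interpreted in vectorized Python as a permutation derived from argsort; splitting the series into row and column parts is an assumption of this sketch, not necessarily the authors' exact indexing.

```python
import numpy as np

# Argsort of each chaotic series defines a row/column permutation
# applied to one channel: same pixel values, shuffled positions.
def confuse_channel(channel, series):
    H, W = channel.shape
    row_perm = np.argsort(series[:H])          # "pindex" from sorting
    col_perm = np.argsort(series[H:H + W])
    return channel[np.ix_(row_perm, col_perm)]

rng = np.random.default_rng(8)
CR = rng.integers(0, 256, (4, 4))
print(confuse_channel(CR, rng.random(8)))
```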
Step 5: Create key images Key1, Key2, and Key3 equal to the image size by utilizing the three chaotic series P, Q, and R, which are generated to be equal to the size of the image. At first, convert the fractional chaotic values to numeral form by multiplying each chaotic value by 10^4, and execute a modulo operation to bring the chaotic series within the range 0–255, as discussed in Algorithm 2.
Algorithm 2: Key Image Creation
k = 1
For i = 1:H
For j = 1:W
Key1(i, j) = mod(P(k) × 10^4, 256)
Key2(i, j) = mod(Q(k) × 10^4, 256)
Key3(i, j) = mod(R(k) × 10^4, 256)
k = k + 1
End
End
Step 6: Convert each channel's pixel values to 8-bit binary form and flip the bits of each pixel using random values obtained from the chaotic series stored in the key images Key1, Key2, and Key3; further, an XOR operation is executed between each channel's pixels and the key images, as represented in Algorithm 3 (flipbits denotes the random bit-flipping operation controlled by the Flip value).
Algorithm 3: Diffusion Process
For i = 1:H
For j = 1:W
Flip1 = 8 − mod(Key1(i, j), 7)
Flip2 = 8 − mod(Key2(i, j), 7)
Flip3 = 8 − mod(Key3(i, j), 7)
BCR(i, j) = flipbits(dec2bin(CR(i, j), 8), Flip1)
BCG(i, j) = flipbits(dec2bin(CG(i, j), 8), Flip2)
BCB(i, j) = flipbits(dec2bin(CB(i, j), 8), Flip3)
ER(i, j) = bin2dec(BCR(i, j)) ⊕ Key1(i, j)
EG(i, j) = bin2dec(BCG(i, j)) ⊕ Key2(i, j)
EB(i, j) = bin2dec(BCB(i, j)) ⊕ Key3(i, j)
End
End
Finally, the encrypted image EI is obtained by merging {ER, EG, EB}.
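A hedged single-channel reading of Steps 5 and 6 in Python; the exact bit-flipping rule is ambiguous in the printed pseudocode, so reversing the first Flip bits of each pixel's 8-bit form is an assumption of this sketch.

```python
import numpy as np

# Build a key image from the chaotic series, flip part of each pixel's
# 8-bit representation, then XOR with the key image.
def diffuse_channel(channel, series):
    H, W = channel.shape
    key = (np.abs(series[:H * W]) * 1e4).astype(np.int64) % 256
    key = key.reshape(H, W).astype(np.uint8)
    out = np.empty_like(channel)
    for i in range(H):
        for j in range(W):
            bits = format(int(channel[i, j]), "08b")
            n = 8 - (int(key[i, j]) % 7)            # Flip value
            flipped = bits[:n][::-1] + bits[n:]     # reverse the first n bits
            out[i, j] = int(flipped, 2) ^ int(key[i, j])
    return out, key

rng = np.random.default_rng(9)
CR = rng.integers(0, 256, (4, 4), dtype=np.uint8)
ER, key1 = diffuse_channel(CR, rng.random(16))
```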
4 Experiment Outcome and Performance Analysis The proposed cipher is implemented in MATLAB 2016a on a personal computer with 4.00 GB RAM, an Intel i5 processor, and Windows 8.1 OS. For simulation and comparison purposes, colour images of size 256 × 256 are taken from the SIPI database [27]. Figure 3 illustrates the result of confusion and diffusion. For illustration purposes, the system parameters and seed key values are taken as p0 = 0.6758493827, q0 = 0.238776234, r0 = 0.2398764523, α = 10, β = 28, and γ = 8/3.
4.1 Histogram Analysis A histogram maps out each pixel value of the image and exposes the property of the pixel distribution [28]. An effectual image cipher should achieve a uniform intensity distribution, which is entirely distinct from that of the original input image. For demonstration, the histograms of an original image and its corresponding encrypted image are depicted in Fig. 4. Figure 4a shows the histograms of the three channels of the original image, and Fig. 4b represents the histograms of the cipher image. From the images, it can be identified that all the pixels are equally distributed, which makes it difficult for the attacker to find the original image by applying histogram or statistical attacks.
Fig. 3 Cryptosystem results a plain image, b confusion, and c diffusion
Fig. 4 Histogram analysis: a Plain image b Encrypted image
Table 1 Comparison of correlation coefficient results

| Direction | Plain image | Proposed cipher | Ref. [9] | Ref. [30] | Ref. [31] |
|---|---|---|---|---|---|
| Horizontal | 0.9853 | −0.0022 | 0.0014 | 0.0046 | 0.0027 |
| Vertical | 0.9753 | 0.0034 | −0.0030 | −0.0028 | −0.0013 |
| Diagonal | 0.9734 | −0.0048 | 0.0025 | 0.0039 | −0.0001 |
4.2 Correlation Between Adjacent Pixels In a normal digital image, pixel values are nearly identical to those of their neighbouring pixels [29]. This provides an easy way for the attacker to recover the original image if the image is encrypted using the same pattern of function for all the pixels. In the proposed cipher, each pixel is encrypted using the chaotic values delivered by the Lorenz system. As a result, the random distribution of pixels in the cipher image is increased. Table 1 shows the correlation coefficient results of the test images and their assessment. The assessment result depicts that the developed image cipher greatly reduced the correlation and increased the security in terms of uniform distribution of pixels.
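The coefficient reported in Table 1 can be computed from randomly sampled adjacent pixel pairs, e.g., as in the generic sketch below (not the authors' script); vertical and diagonal directions follow by changing the slicing.

```python
import numpy as np

# Correlation of horizontally adjacent pixel pairs in a 2-D channel.
def adjacent_correlation(img, samples=2000):
    rng = np.random.default_rng(10)
    H, W = img.shape
    i = rng.integers(0, H, samples)
    j = rng.integers(0, W - 1, samples)
    x = img[i, j].astype(float)
    y = img[i, j + 1].astype(float)          # horizontal neighbours
    return np.corrcoef(x, y)[0, 1]

rng = np.random.default_rng(11)
plain = np.tile(np.arange(256, dtype=np.uint8), (256, 1))   # smooth image
cipher = rng.integers(0, 256, (256, 256), dtype=np.uint8)   # noise-like image
print(adjacent_correlation(plain), adjacent_correlation(cipher))
```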
4.3 Entropy Analysis Entropy analysis is a well-known measure to express the amount of uncertainty in a cipher image, and it is computed based on Eq. (3). A good image cipher should have an entropy result of eight or close to eight, since eight is the entropy value of an ideal 8-bit image [32].
Table 2 Entropy result comparison

| Image | Proposed scheme R | Proposed scheme G | Proposed scheme B | Ref. [9] R | Ref. [9] G | Ref. [9] B |
|---|---|---|---|---|---|---|
| Lena | 7.9994 | 7.9996 | 7.9992 | 7.9992 | 7.9993 | 7.9994 |
| Baboon | 7.9992 | 7.9994 | 7.9992 | 7.9991 | 7.9992 | 7.9992 |
| Fruits | 7.9992 | 7.9991 | 7.9992 | 7.9992 | 7.9992 | 7.9993 |
| Flowers | 7.9991 | 7.9990 | 7.9992 | 7.9990 | 7.9990 | 7.9991 |
| Girl | 7.9996 | 7.9992 | 7.9995 | 7.9995 | 7.9995 | 7.9996 |
E(EI) = Σ_{k=0}^{2^M − 1} P(EI_k) · log2(1 / P(EI_k))   (3)
EI represents the cipher image. The entropy values of the test images and their comparison are given in Table 2. From the comparison table, it can be concluded that the proposed model greatly increases the randomness of the cipher image.
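Equation (3) translates directly into a short computation; the example below is a generic sketch on a random test image.

```python
import numpy as np

# Shannon entropy of an 8-bit image per Eq. (3); an ideal cipher image
# approaches 8 bits because all 256 intensities are equally likely.
def entropy(img):
    counts = np.bincount(img.ravel(), minlength=256)
    p = counts / counts.sum()
    p = p[p > 0]                             # skip empty bins (log2 undefined)
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(12)
cipher = rng.integers(0, 256, (256, 256))
print(entropy(cipher))                       # close to 8 for uniform pixels
```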
4.4 Key Space and Key Sensitivity Analysis To overcome the brute-force attack, an image cipher should have at least a 2^100-bit key space [33]. In the proposed cipher architecture, six parameters act as seed keys (α, β, γ, p0, q0, r0), each having a maximum size of 2^14 bits. Hence, the total key size of the proposed scheme is >2^100, which depicts the strength of the cipher to withstand brute-force attack. Key sensitivity is one of the critical features used to check the strength of a cipher. Even a single-bit change in the key should produce an entirely different encrypted image [34]. For evaluating the key sensitivity, two key sets are taken with a slight change: key1 as (p0 = 0.6758493827, q0 = 0.238776234, and r0 = 0.2398764523) and key2 as (p0 = 4.6758493828, q0 = 4.238776234, and r0 = 0.2398764523). The Lena image is taken as input, and two cipher images are produced by encrypting it with key1 and key2; the difference between these two cipher images is visualized in Fig. 5. Based on visualizing the pixels, it is justified that the slight change in keys greatly affects the cipher image and makes the cipher strongly resist the brute-force attack.
Fig. 5 Key sensitivity analysis: a Plain image. b Encrypted image using key1. c Encrypted image using key2. d Difference between b and c
4.5 Robustness Analysis In real-time applications such as e-health care, images are transmitted over the network. During transmission, the cipher image may be affected by crop and noise attacks, so some unwanted values are introduced into the pixels of the cipher image. Moreover, salt-and-pepper noise and Gaussian noise mostly affect image data during transmission. An efficient cipher should recover the original image even though the cipher image is affected by either noise or a crop attack [35]. To assess the proposed cipher, some percentage of the image is cropped and some amount of noise is applied intentionally to the cipher image. The decrypted images are taken for evaluating the robustness against these attacks, as illustrated in Fig. 6. Based on the illustration, it is concluded that the proposed cipher architecture greatly maintains robustness.
5 Conclusion In this work, a colour image cryptosystem has been developed based on the hyper chaotic Lorenz map system. The developed cryptosystem employs random bit-level flipping that depends on the chaotic series generated by the Lorenz map, which provides more security than lower-dimensional chaotic maps. The developed architecture has an efficient confusion and diffusion process on each channel of the colour image, which increases the resistance of the developed cipher architecture against common cipher attacks and highly enhances the quality of the image cipher. Simulation and performance results indicate the strength of the proposed model against differential, exhaustive, and robustness attack analysis. The execution time of the proposed methodology is greatly reduced when compared to conventional cryptosystems. The comparison study indicates that the proposed model is more efficient than the state-of-the-art. The main limitation of the proposed work is that the chaotic series have to be generated equal in size to the images. In future work, this drawback shall be resolved by creating a new methodology.
Fig. 6 Robustness analysis: Crop attack analysis—decrypted image after cropping a 5%, b 15%, and c 25%. Noise attack analysis—Gaussian noise affected and decrypted image of d 0.00001, e 0.0001, and f 0.0005
Acknowledgements The Authors gratefully acknowledge the Department of Science and Technology, India for Fund for Improvement of S&T Infrastructure in Universities and Higher Educational Institutions (SR/FST/ETI-371/2014), (SR/FST/MSI-107/2015) and Tata Realty—IT City— SASTRA Srinivasa Ramanujan Research Cell of our University for the financial support extended to us in carrying out this research work.
References 1. M. Satheesh, M. Deepika, Implementation of multifactor authentication using optimistic fair exchange. J. Ubiquitous Comput. Commun. Technol. 2, 70–78 (2020) 2. R. Dhaya, Light weight CNN based robust image watermarking scheme for security. J. Inf. Technol. Digit. World 3, 118–132 (2021)
3. M. Asgari-Chenaghlu, M.A. Balafar, M.R. Feizi-Derakhshi, A novel image encryption algorithm based on polynomial combination of chaotic maps and dynamic function generation. Signal Process. 157, 1–13 (2019) 4. Y. Shen, C. Tang, M. Xu, Z. Lei, Optical selective encryption based on the FRFCM algorithm and face biometric for the medical image. Opt. Laser Technol. 138 (2021) 5. G. Maddodi, A. Awad, D. Awad, M. Awad, B. Lee, A New Image Encryption Algorithm Based on Heterogeneous Chaotic Neural Network Generator and DNA Encoding (2018), pp. 24701– 24725 6. A. Ullah, S.S. Jamal, T. Shah, A novel scheme for image encryption using substitution box and chaotic system. Nonlinear Dyn. 91, 359–370 (2018) 7. X. Jin, S. Yin, N. Liu, X. Li, G. Zhao, S. Ge, Color image encryption in non-RGB color spaces. Multimed. Tools Appl. 77, 15851–15873 (2018) 8. T.S. Ali, R. Ali, A new chaos based color image encryption algorithm using permutation substitution and Boolean operation. Multimed. Tools Appl. 79, 19853–19873 (2020) 9. S. Cai, L. Huang, X. Chen, X. Xiong, A symmetric plaintext-related color image encryption system based on bit permutation. Entropy 20, 1–20 (2018) 10. A. Girdhar, V. Kumar, A RGB image encryption technique using Lorenz and Rossler chaotic system on DNA sequences. Multimed. Tools Appl. 77, 27017–27039 (2018) 11. W. Cao, Y. Zhou, C.L.P. Chen, L. Xia, Medical image encryption using edge maps. Signal Process. 132, 96–109 (2017) 12. V. Sangavi, P. Thangavel, An efficient radical image encryption based on 3-D lorenz chaotic system. Int. J. Eng. Adv. Technol. 9, 1792–1801 (2019) 13. X. Wang, X. Qin, C. Liu, Color image encryption algorithm based on customized globally coupled map lattices. Multimed. Tools Appl. 78, 6191–6209 (2019) 14. S. Zhu, C. Zhu, A new image compression-encryption scheme based on compressive sensing and cyclic shift. Multimed. Tools Appl. 78, 20855–20875 (2019) 15. S. Som, A. Mitra, S. Palit, B.B. Chaudhuri, A selective bitplane image encryption scheme using chaotic maps. Multimed. Tools Appl. 78, 10373–10400 (2019) 16. R. Vidhya, M. Brindha, A novel dynamic chaotic image encryption using butterfly network topology based diffusion and decision based permutation. Multimed. Tools Appl. (2020) 17. D.M.S Bandara, Y. Lei, Y. Luo, Fingerprint ımage encryption using a 2D chaotic map and elliptic curve cryptography. Int. J. Comput. Inf. Eng. 12, 871–878 (2018) 18. M.A. Mokhtar, N.M. Sadek, A.G. Mohamed, Design of image encryption algorithm based on different chaotic mapping, in Proceedings of National Radio Science Conference NRSC (2017), pp. 197–204 19. A. Rehman, X. Liao, A novel robust dual diffusion/confusion encryption technique for color image based on Chaos, DNA and SHA-2. Multimed. Tools Appl. 78, 2105–2133 (2019) 20. H. Liu, F. Wen, A. Kadir, Construction of a new 2D Chebyshev-Sine map and its application to color image encryption. Multimed. Tools Appl. 78, 15997–16010 (2019) 21. B. Mondal, P. Kumar, S. Singh, A chaotic permutation and diffusion based image encryption algorithm for secure communications. Multimed. Tools Appl. 77, 31177–31198 (2018) 22. X. Wu, B. Zhu, Y. Hu, Y. Ran, A novel color image encryption scheme using rectangular transform-enhanced chaotic tent maps. IEEE Access. 5, 6429–6436 (2017) 23. S.A. Banu, R. Amirtharajan, A robust medical image encryption in dual domain: chaos-DNAIWT combined approach. Med. Biol. Eng. Comput. 58, 1445–1458 (2020) 24. H. Chen, C. Tanougast, Z. Liu, L. Sieler, H. 
Ramenah, Optical image asymmetric cryptosystem using fingerprint based on iterative fraction Fourier transform. Opt. Quantum Electron. 49, 1–13 (2017) 25. P. Rakheja, R. Vig, P. Singh, An asymmetric hybrid cryptosystem using hyperchaotic system and random decomposition in hybrid multi resolution wavelet domain. Multimed. Tools Appl. 78, 20809–20834 (2019) 26. K.A. Kumari, B. Akshaya, B. Umamaheswari, K. Thenmozhi, R. Amirtharajan, P. Praveenkumar, 3D lorenz map governs DNA rule in encrypting DICOM images. Biomed. Pharmacol. J. 11, 897–906 (2018)
27. A. Weber, The USC-SIPI Image Data Base: Version 4 (1993) 28. H. Zhang, X.Q. Wang, Y.J. Sun, X.Y. Wang, A novel method for lossless image compression and encryption based on LWT, SPIHT and cellular automata. Signal Process. Image Commun. 84 (2020) 29. S. Rajendran, M. Doraipandian, Construction of two dimensional cubic-tent-sine map for secure ımage transmission. Commun. Comput. Inf. Sci. 1116 CCIS, 51–61 (2019) 30. H. Huang, S. Yang, Colour image encryption based on logistic mapping and double randomphase encoding. IET Image Process. 11, 211–216 (2017) 31. M. Mollaeefar, A. Sharif, M. Nazari, A novel encryption scheme for colored image based on high level chaotic maps. Multimed. Tools Appl. 76, 607–629 (2017) 32. Y.G. Yang, B.W. Guan, Y.H. Zhou, W.M. Shi, Double image compression-encryption algorithm based on fractional order hyper chaotic system and DNA approach. Multimed. Tools Appl. (2020) 33. S. Yoosefian Dezfuli Nezhad, N. Safdarian, S.A. Hoseini Zadeh, New method for fingerprint images encryption using DNA sequence and chaotic tent map. Optik (Stuttg) 224, 165661 (2020) 34. S. Rajendran, K. Krithivasan, M. Doraipandian, A novel cross cosine map based medical image cryptosystem using dynamic bit-level diffusion. Multimed. Tools Appl. 80, 24221–24243 (2021) 35. M.Z. Talhaoui, X. Wang, M.A. Midoun, Fast image encryption algorithm with high security level using the Bülban chaotic map. J. Real-Time Image Process. (2020)
Implementation of Fuzzy Logic-Based Predictive Load Scheduling in Home Energy Management System Nirmala Jegadeesan and G. Balasubramanian
Abstract Load scheduling is one of the prominent areas of research in home energy management systems. The objective of load scheduling is to maintain the balance between supply power and demand. Though many research works concentrate on load scheduling, the majority of them fail to account for real-time scenarios. Hence, the proposed work addresses fuzzy logic-based predictive load scheduling using real-time loads. The energy consumption of all the domestic loads is considered in computing the demand. Further, the fuzzy logic-based load scheduling predictive system is implemented in MATLAB by considering supply power, demand, and the state of charge of the battery. The proposed scheme predicts the need for scheduling and its duration. From the simulation results, it is clear that the proposed predictive model predicts the possibility of scheduling accurately. Keywords Smart grid · Home energy management system · Load scheduling · Fuzzy logic · Demand side management
1 Introduction The traditional grid is facing numerous challenges like conventional infrastructure, lack of communication, centralized power generation, etc. [1–3]. To address these issues, the smart grid has emerged with the integration of information and communication technologies and bidirectional communication from generation to utilization [4–7]. On the utilization side, demand side management (DSM), a kind of energy management system (EMS), is a promising technique to meet the ever-increasing demand [8]. The main purpose of DSM is to organize the customers' load demand based on their requirements without compromising the consumers' convenience [9, 10]. Load scheduling, a technique in DSM, is used to mitigate the shortage of power induced N. Jegadeesan · G. Balasubramanian (B) School of Electrical and Electronics Engineering, SASTRA Deemed University, Thanjavur, India e-mail: [email protected] N. Jegadeesan e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_60
during peak hours, when HVAC (heating, ventilation, and air-conditioning) appliances like heaters and air conditioners are in operation, through intelligent load sharing. The main objective of load scheduling is to encourage users to consume less power during peak hours by shifting their loads to off-peak hours, thereby flattening the demand curve [11, 12]. Home energy management system (HEMS), a DSM approach, provides a platform to observe energy usage and production to facilitate automation of energy usage in household appliances through load scheduling. An energy management model is developed for the prosumer, and by using a matheuristic algorithm, the issues involved in the system related to an extended simulation time horizon are solved [13]. However, the proposal addresses a model for a single home, which may not be suitable for large scalability. Reduction of the customer's total energy cost and peak load control achieved by scheduling different types of loads is proposed [14]. A meta-heuristic optimization-based scheduling scheme for home appliances considering two prices, namely real-time pricing and critical peak pricing, is discussed [15]. Reduction of the electricity bill through maintaining a limited power demand is presented [16]. From the above-mentioned works, it is evident that they do not consider the time-of-use tariff and that scheduling is performed only a few times rather than across the whole day. Hence, the proposed work aims to predict load scheduling using fuzzy logic.
2 Proposed System Fuzzy logic-based home energy management system for intelligent scheduling of loads is proposed. The proposal considers base, deferrable, and non-deferrable loads based on their characteristics. The need for scheduling of loads is predicted based on the highly influencing input parameters. The proposed intelligent predictive scheduling is presented below.
2.1 Predictive Scheduling A home energy management system based on a predictive load scheduling scheme using fuzzy logic is implemented in MATLAB. The renewable energy source, namely solar, in addition to a battery, is considered as the input source. Initially, the total power supply PS (photovoltaic (PV) and battery) is computed and compared with the demand power PD (for all loads). The appliances are classified into three groups, namely base load appliances, deferrable appliances, and non-deferrable appliances, as presented in Table 1. Deferrable appliances in Table 1 represent the appliances that can be shifted to any time to achieve peak shaving, whereas non-deferrable appliances ought to be operated at fixed time intervals. If the supply power is greater than the demand power, then there is no need for scheduling the appliances, and vice versa. Fuzzy logic-based intelligent
Table 1 Power ratings of appliances

| Appliance type | Name of the appliance | Power rating (kW) |
|---|---|---|
| Base load appliances | Cooker hub | 3–4 |
| Base load appliances | Cooker oven | 4–5 |
| Base load appliances | Microwave | 1.7–2.5 |
| Base load appliances | Laptop | 0.1–0.2 |
| Base load appliances | Desktop | 0.3–0.5 |
| Deferrable appliances | Vacuum cleaner | 1.2–2 |
| Deferrable appliances | Electric car | 3.5–5 |
| Deferrable appliances | Dish washer | 1.5–2 |
| Deferrable appliances | Washing machine | 1.5–2 |
| Deferrable appliances | Spin dryer | 2.5–3.5 |
| Non-deferrable appliances | Interior lighting | 0.84–1 |
| Non-deferrable appliances | Refrigerator | 0.3–0.5 |
scheduling system uses three inputs, namely PV source, battery SoC, and demand, to predict the probability of scheduling. The proposal considers the total power requirement computed based on the power calculation of the appliances, that is, 28 kW, of which 26 kW is supplied by the PV panel and 2 kW by the battery.
2.2 Fuzzy Controller for Proposed Work Fuzzy logic is derived from fuzzy set theory and operates based on approximate reasoning, whereas classical set theory provides precise outputs. The variables used for computing the output are called fuzzy variables. Initially, the input data is fuzzified using statically defined membership functions. A membership function generates a truth value between 0 and 1. Fuzzy logic defines the relationship among the different inputs and the output in natural language through a set of fuzzy rules. The system continuously evaluates the given inputs and delivers the outputs of the system based on the rules. The block diagram of the proposed fuzzy logic controller for predictive scheduling is shown in Fig. 1. Figure 2 represents the fuzzy input and output variables. Here, three inputs and one output are used. The input and output parameters along with their term sets are presented in Tables 2 and 3, respectively. Each input has three term sets, and the output has five term sets. Triangular membership functions are used for both input and output parameters as they represent the parameters effectively. The fuzzy rules are framed using IF and THEN statements. For example, if the PV output and the battery state of charge are sufficient to meet the power demand, then the probability of scheduling is low or little low. In such a case, the PV output highly influences the
[Block diagram: PV output, battery SoC, and power demand feed the fuzzy logic controller, which outputs the probability of scheduling.]
Fig. 1 Schematic diagram of fuzzy logic controller for predictive scheduling
Fig. 2 Schematic representation of the fuzzy input and output variables

Table 2 Input parameters membership function

| Input parameters | Term sets | Membership functions | Limits (kW) |
|---|---|---|---|
| PV output | Minimum | Triangular | {0; 5; 10} |
| PV output | Satisfaction | Triangular | {8; 13; 18} |
| PV output | Excess | Triangular | {16; 21; 26} |
| Battery SoC | Charging | Triangular | {0; 0.4; 0.8} |
| Battery SoC | Saturation | Triangular | {0.6; 1.0; 1.4} |
| Battery SoC | Discharging | Triangular | {1.2; 1.6; 2} |
| Demand | Minimum | Triangular | {0; 6; 12} |
| Demand | Normal | Triangular | {8; 14; 20} |
| Demand | Peak | Triangular | {16; 22; 28} |

Table 3 Output parameters membership function

| Output parameter | Term sets | Membership functions | Limits |
|---|---|---|---|
| Probability of scheduling | Little low | Triangular | {0; 0.1; 0.2} |
| Probability of scheduling | Low | Triangular | {0.18; 0.27; 0.35} |
| Probability of scheduling | Little medium | Triangular | {0.3; 0.45; 0.55} |
| Probability of scheduling | Medium | Triangular | {0.5; 0.625; 0.73} |
| Probability of scheduling | High | Triangular | {0.7; 0.85; 1} |
probability of scheduling, as the majority of the power demand is supplied by the PV output. Based on the requirements, the remaining rules are framed; in total, 3 × 3 × 3 = 27 rules are formed and depicted in Table 4. Based on the inputs, i.e., PV output, battery SoC, and power demand, the probability of scheduling may be little low (LL), low (L), little medium (LM), medium (M), or high (H). Figures 3 and 4 portray the probability of scheduling with increase in PV output and battery SoC, respectively. Similarly, Fig. 5 exhibits the impact of increase in demand on the probability of scheduling. From Figs. 3 and 4, it is inferred that the probability of scheduling decreases with increase in PV panel output and battery SoC. This clearly elucidates the significance
Table 4 Fuzzy rule base

| Rule No. | PV output | Battery SoC | Demand | Probability of scheduling |
|---|---|---|---|---|
| 1 | Minimum | Charging | Minimum | Little medium |
| 2 | Minimum | Charging | Normal | Medium |
| 3 | Minimum | Charging | Peak | High |
| 4 | Minimum | Saturation | Minimum | Low |
| 5 | Minimum | Saturation | Normal | Little medium |
| 6 | Minimum | Saturation | Peak | Medium |
| 7 | Minimum | Discharging | Minimum | Little low |
| 8 | Minimum | Discharging | Normal | Low |
| 9 | Minimum | Discharging | Peak | Little medium |
| 10 | Satisfaction | Charging | Minimum | Low |
| 11 | Satisfaction | Charging | Normal | Little medium |
| 12 | Satisfaction | Charging | Peak | Medium |
| 13 | Satisfaction | Saturation | Minimum | Little low |
| 14 | Satisfaction | Saturation | Normal | Low |
| 15 | Satisfaction | Saturation | Peak | Little medium |
| 16 | Satisfaction | Discharging | Minimum | Little low |
| 17 | Satisfaction | Discharging | Normal | Low |
| 18 | Satisfaction | Discharging | Peak | Little medium |
| 19 | Excess | Charging | Minimum | Low |
| 20 | Excess | Charging | Normal | Little medium |
| 21 | Excess | Charging | Peak | Medium |
| 22 | Excess | Saturation | Minimum | Low |
| 23 | Excess | Saturation | Normal | Little medium |
| 24 | Excess | Saturation | Peak | Little medium |
| 25 | Excess | Discharging | Minimum | Little medium |
| 26 | Excess | Discharging | Normal | Low |
| 27 | Excess | Discharging | Peak | Little low |
Fig. 3 PV output versus probability of scheduling
Fig. 4 Battery SoC versus probability of scheduling
of input power to meet the demand. Further, from Fig. 5, it is obvious that the probability of scheduling increases with increase in demand. In addition, the impact of various values of PV output on the probability of scheduling with a fixed SoC of 1.6 kW is presented in Fig. 6. Similarly, the influence of various values of battery SoC on the probability of scheduling with a fixed PV output of 17 kW is depicted in Fig. 7. Figures 6 and 7 prove the influence of both input parameters on scheduling. It is obvious that the probability of scheduling decreases when both inputs are high, and vice versa. This clearly demonstrates the efficacy of the proposed fuzzy logic-based load scheduling prediction scheme.
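For illustration, the controller can be approximated outside MATLAB as follows; the memberships and rules are taken from Tables 2–4, while the weighted-average defuzzification over the output term peaks is a simplifying assumption in place of the toolbox's method.

```python
def tri(x, a, b, c):
    """Triangular membership with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

PV = {"Minimum": (0, 5, 10), "Satisfaction": (8, 13, 18), "Excess": (16, 21, 26)}
SOC = {"Charging": (0, 0.4, 0.8), "Saturation": (0.6, 1.0, 1.4),
       "Discharging": (1.2, 1.6, 2.0)}
DEMAND = {"Minimum": (0, 6, 12), "Normal": (8, 14, 20), "Peak": (16, 22, 28)}
OUT = {"Little low": 0.1, "Low": 0.27, "Little medium": 0.45,  # term peaks
       "Medium": 0.625, "High": 0.85}                          # from Table 3

# (PV term, SoC term) -> outputs for Demand = Minimum/Normal/Peak, per Table 4.
RULES = {
    ("Minimum", "Charging"): ["Little medium", "Medium", "High"],
    ("Minimum", "Saturation"): ["Low", "Little medium", "Medium"],
    ("Minimum", "Discharging"): ["Little low", "Low", "Little medium"],
    ("Satisfaction", "Charging"): ["Low", "Little medium", "Medium"],
    ("Satisfaction", "Saturation"): ["Little low", "Low", "Little medium"],
    ("Satisfaction", "Discharging"): ["Little low", "Low", "Little medium"],
    ("Excess", "Charging"): ["Low", "Little medium", "Medium"],
    ("Excess", "Saturation"): ["Low", "Little medium", "Little medium"],
    ("Excess", "Discharging"): ["Little medium", "Low", "Little low"],
}
DEMAND_ORDER = ["Minimum", "Normal", "Peak"]

def probability_of_scheduling(pv, soc, demand):
    num = den = 0.0
    for (p_t, s_t), outs in RULES.items():
        for d_t, o_t in zip(DEMAND_ORDER, outs):
            w = min(tri(pv, *PV[p_t]), tri(soc, *SOC[s_t]),
                    tri(demand, *DEMAND[d_t]))        # rule firing strength
            num += w * OUT[o_t]
            den += w
    return num / den if den else 0.0

print(probability_of_scheduling(pv=5, soc=0.4, demand=24))  # low PV, peak demand
```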
Fig. 5 Demand versus probability of scheduling
Fig. 6 Demand versus probability of scheduling (PV output constant)
3 Conclusion

Load scheduling is one of the efficient methods to achieve demand-side management in a home energy management system. Hence, the proposed model addresses predictive load scheduling using fuzzy logic by considering the inputs, namely photovoltaic output and battery state of charge. The fuzzy-based prediction scheme is implemented in MATLAB. The influence of the chosen input parameters is significant in predicting the probability of scheduling, which clearly proves the effectiveness of the proposed model. As the proposed model considers real-time loads as base loads, deferrable
Fig. 7 Demand versus probability of scheduling (SoC constant)
loads, and non-deferrable loads, it is suitable for real-time scenarios. Further, the proposed model can be validated through simulation of load scheduling in a home energy management system. Acknowledgements The authors wish to thank SASTRA Deemed University for financial support under the Prof. T. R. Rajagopalan research fund and for the electrical drives lab facility.
Firmware Attack Detection on Gadgets Using Least Angle Regression (LAR) E. Arul and A. Punidha
Abstract Threats at the firmware level of personal computers are among the biggest risks to industrial corporations. Firmware holds the utmost privileges and permits hackers to circumvent conventional restrictions. The firmware layer is also rapidly becoming one of the most prominent information security battlegrounds, with hackers progressively shifting their attention to this field, where exploits are numerous and protections often poor. The proposed Firmware Least Angle Regression (F-LARS) is a regression-based pattern filtering approach, suited to settings with many candidate features where an interpretable, sparse model is wanted. Like forward stagewise regression, it incorporates variables only incrementally, overcoming the greediness of plain forward selection when screening firmware updates on victim machines. The findings reveal a true positive rate of 98.35% and a false positive rate of 0.01% in detecting firmware threats. Keywords Gadgets · Backdoors · Firmware · API calls · Classification · Linear regression analysis · LAR · Adware
1 Introduction

Firmware is, precisely, the program that controls processor architectures. Almost every electronic gadget today carries at least one form of non-volatile memory holding such code, including the kernel, boot loader, and flash [1]. Chips, network devices, interface control systems, right up to your mouse and keyboard, all have firmware controls. Firmware exploits write adware into this non-volatile memory much like any non-volatile driver. Such threats might even be unintentional, as in the case of a device being flashed with compromised data [2]. Since firmware is fairly lightly secured, exploits are E. Arul (B) Department of Information Technology, Coimbatore Institute of Technology, Coimbatore, Tamilnadu, India e-mail: [email protected] A. Punidha Department of Computer Science and Engineering, Coimbatore Institute of Technology, Coimbatore, Tamilnadu, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_61
not yet used as widely as attacks aimed at higher layers. Compromising firmware is not straightforward, whereas conventional software is exploited more easily and efficiently to carry out the usual malevolent roles such as stealing personally identifiable information (PII). Firmware is also custom-made code deployed on a comparatively limited number of devices, so attacks on commodity software, with its far wider user base, have historically had a higher probability of success [3]. Firmware vulnerabilities may appear in many ways; common distribution methods include vulnerability payloads and executable files, and further vectors are compromised USB drives, hacked drivers, and weakly protected firmware products [4]. An intruder never has to physically touch a computer to deliver the payload; it can be achieved online over Wi-Fi or Ethernet. The potential for such attacks keeps growing as phones, computers, network TV sets, laptops, and similar devices connect to the Internet, and intruders automatically probe the widest possible set of protections to advance their attacks [5]. At the application level, this usually means that privileges are elevated from the client to the controller (ring 3), or device privileges are obtained at the device level (ring 0). Although these are the most critical rights for the BIOS, the firmware sits below the operating system; consequently, spyware in the firmware can subvert the kernel and thus hold even higher privileges, figuratively, than ring 0. These privilege levels are often referred to as rings −1 to −3 [6]. Classic protection bypass: most notably, the ability to operate beneath these levels enables a hacker to escape restrictions and safeguards in the protocol stack or even ring 0 [7]. This includes conventional operating system protection as well as virtualized environments. In particular, corrupted firmware can allow a hacker to manipulate the way a device boots, patch the kernel internally, and extract protected data from devices. For data centers, firmware compromises can also expose the hypervisor and workstation framework in an organization's cloud resources [8]. Persistence: the ability to block and escape the kernel gives hackers intense resilience on an affected computer [9]. Besides evading inspection, unauthorized firmware code is effectively tied to the hardware rather than the software, so the intruder's code will survive even a complete re-image of the device. Such a capability is especially vital to an intruder because, by controlling entry points and facilitating a continuing invasion, it supports the wider campaign [10]. Considering the range of devices packed with firmware, from webcams and hearing devices even to batteries, a firmware exploit becomes significantly worse [11]. Firmware threats take advantage of this extensive deployment. Since the malicious payload sits below the protection stack, penetration may go unidentified, which means the IT security team can take many cycles to figure out that something is wrong. Such threats can be intractable because they are so hard to detect: once installed, the payload can inflict ongoing harm, compromise authorized firmware upgrades, and even persist after an OS factory reset or a complete hard-disk wipe [12].
There is no question that a wide range of researchers are now finding ever more firmware security flaws. The flip side of these findings is that the competition is pushing vendors to build firmware protection measures that can survive cyber warfare. Several businesses, such as Intel, have issued fixes or patches for particular compromised firmware components [13].
2 Related Work

Yohan et al.: many bugs and threats have recently emerged in IoT environments across the globe. One of the most challenging attack surfaces is IoT system firmware, which can be used as an entry gateway to jeopardize the target IoT equipment [14]. For IoT equipment vendors, it is therefore essential to provide secure and reliable firmware patch features for the IoT products they sell or configure [15]. This article proposes a secure and verifiable blockchain-based firmware update process for IoT. Its objectives are to ensure that each new firmware image published by a service provider is protected by a peer-to-peer verification process and that the updated firmware is distributed to IoT systems securely and promptly. In addition, the use of blockchain technology guarantees the integrity of the firmware during delivery across the Web. The update system involves four separate mechanisms: the firmware update contract, third-party firmware update contracts, the push method for delivering updates, and the request-update mechanism; these four procedures are assisted by six accompanying protocols. A reliability and security assessment of the proposed upgrade process is carried out, and based on structured security analysis the system promotes shared protection and defends against significant cyber threats: firmware manipulation attacks, spoofing attacks, man-in-the-middle attacks, and replay attacks [16]. Xie et al.: more and more smart home devices are connected to the web with the advent of the Internet of Things (IoT) [17]. IoT applications have gradually gained academic and commercial attention over user confidentiality concerns. A key technology for shielding IoT systems from zero-day threats is exploit identification. Nevertheless, conventional threat identification approaches and techniques cannot be applied directly to IoT firmware analysis. That article explores defect identification in IoT firmware; the studies are divided into four groups, i.e., static analysis, symbolic execution, emulation fuzzing, and thoroughness checking. IoT firmware defect analysis must determine whether functional defects exist in embedded binaries across microprocessor architectures. The work proposes a framework for detecting authorization compromise vulnerabilities in embedded binary IoT services based on fuzzing-driven and taint-style analysis, shown to be effective by checking against and detecting known CVEs. Danese et al.: the frequency of security breaches recorded through uncontrolled firmware execution paths has increased in recent years [18]. This
research offers a new protection system, known as DOVE, to tackle unexpected execution flows of firmware, especially those that might expose a security vulnerability. The DOVE system is built on a symbolic model of the firmware execution, combined with an uncertainty estimate, which identifies unexpected activity flows and provides the user with precise diagnostic hypotheses. Teng et al.: huge quantities of IoT equipment and controllers connect and share vast quantities of data as the Internet of Things (IoT) grows, and several significant new concerns arise with modern IoT technologies, such as online security and confidentiality [19]. There have also been ongoing assaults on home networks and security surveillance by a range of distributed denial-of-service (DDoS) campaigns. Consumer IoT devices are maximally insecure owing to existing limitations, e.g., remote access, resource constraints, lack of updates, and platform diversity. The authors propose an over-the-air software upgrade program for domestic adapters that complies with the ISP's access-control program and OSS. The program covers roughly 1 billion Chunghwa Telecom adapters and has achieved a firmware upgrade rate of approximately 96.32%. It keeps modem firmware up to date, safeguards consumers' residential network protection, and decreases consumer annoyance when upgrading.
3 Theoretical Background

3.1 Delineation of Firmware Attack Detection on Gadgets Using Least Angle Regression (LAR)

The first step in LAR is to identify the variable most correlated with the response. Instead of fully incorporating this variable, LAR continuously moves its coefficient toward its least-squares value (which decreases its correlation with the evolving residual). The process pauses when another variable "catches up" in terms of its correlation with the residual [20]. That variable then joins the active set, and the coefficients are moved jointly, in such a way as to keep their correlations tied and decreasing. In this sense, least angle regression gives each predictor exactly as much coefficient as it merits. Stepwise regression, by contrast, is a technique for fitting regression models in which automated procedures select the explanatory variables: at each step, a variable is added to or removed from the explanatory set on the basis of some pre-specified criterion [21], typically a sequence of F-tests or t-tests, though other strategies are feasible. The practice of fitting the final model by this incremental procedure and then reporting estimates and probabilities without adjusting for the model-building process has led to calls to abandon stepwise model building, or at least to account for model uncertainty throughout [12]. That is approximately how the LARS procedure operates: start with all coefficients equal to zero, as in standard forward selection, and find the
predictor most correlated with the response, say x_j1. We take the largest possible step in the direction of this predictor until some other predictor, say x_j2, has as much correlation with the current residual. At this point LARS parts company with forward selection: instead of continuing along x_j1, LARS proceeds in a direction equiangular between the two predictors until a third variable x_j3 enters the "most correlated" set. LARS then continues along the "least angle" direction, equiangular between x_j1, x_j2, and x_j3. LARS builds its estimate as

μ = Xβ,

moving one covariate into the model at each step, so that after k steps just k of the β_j are non-zero, with the β_j chosen to reduce the squared error. In the m = 2 covariate case, X = (x_j1, x_j2), and the current correlations depend only on the projection of y into the linear space L(X) spanned by x_j1 and x_j2:

c(μ) = Xᵀ(y − μ).

Pseudo-code for LAR:
• Begin with all coefficients b_j equal to zero.
• Find the predictor x_j most correlated with y.
• Increase the coefficient b_j in the direction of its correlation with y, computing the residual r = y − ŷ along the way.
• Stop when some other predictor x_k has as much correlation with r as x_j has.
• Increase (b_j, b_k) in their joint least-squares direction, until some other predictor x_m is equally correlated with r.
• Continue until all the predictors are in the model.

Remarkably, this procedure yields the whole path of lasso solutions, with one modification, as the penalty bound s varies from zero to infinity: when a non-zero coefficient hits zero, the corresponding variable is deleted from the active set and the joint direction is recomputed (Fig. 1). The algorithm is computationally efficient even when p ≫ N, of the same order of magnitude as ordinary least squares when the number of variables is substantially larger than the number of observations. It produces a full piecewise-linear solution path, which is useful for cross-validation or similar model-tuning efforts. When two variables are almost equally correlated with the response, their coefficients grow at about the same rate, so the method is more stable than intuition might suggest, and it is easily modified to produce solutions for other estimators, such as the lasso.
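For reference, this exact procedure is available off the shelf. The sketch below (synthetic data standing in for the firmware API-call features; not the authors' dataset or code) uses scikit-learn's Lars estimator and lars_path to trace the piecewise-linear coefficient path:

import numpy as np
from sklearn.linear_model import Lars, lars_path

rng = np.random.default_rng(0)

# 200 hypothetical firmware samples x 10 API-call features; the response
# depends on only two of them, so the true model is sparse.
X = rng.standard_normal((200, 10))
y = 3.0 * X[:, 2] - 1.5 * X[:, 7] + 0.1 * rng.standard_normal(200)

model = Lars(n_nonzero_coefs=5).fit(X, y)
print("entered features:", model.active_)        # indices, in order of entry
print("coefficients:", model.coef_.round(2))

alphas, active, coefs = lars_path(X, y, method="lar")
print("coefficient path shape:", coefs.shape)    # (n_features, n_steps)

Features 2 and 7 enter the path first, exactly as the correlation-driven description above predicts.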
Fig. 1 Delineation of firmware attack detection on gadgets using Least Angle Regression (LAR)
4 Experimental Results and Comparison

Forward selection starts with no variables in the model and, at each stage, adds the most informative variable to the fit, halting when the explanatory gain drops below a chosen level. It is a fast and simple approach, but it can be quite greedy: it incorporates variables fully at each stage, so correlated predictors are unlikely to enter the model. Compare the modern model-selection approach known as least angle selection (LARS): pick the variable with the greatest absolute correlation with the response y, say x_j1, and perform a linear regression of y on x_j1. This defines the "initial fit". The residual is then a vector orthogonal to x_j1. We project the remaining predictors orthogonally to x_j1 and repeat the selection cycle. After k steps this yields a set of predictors x_j1, x_j2, ..., x_jk that is then used to construct a k-parameter linear model in the usual way [22].
• Start with r = y and β1 = β2 = · · · = βp = 0.
• Assume the predictors x_j are standardized.
• Find the predictor x_j most correlated with r.
Fig. 2 Forward stabilization of firmware LAR
• Increase βj in the direction of sign(corr(r, x_j)) until some other predictor x_k has as much correlation with r as x_j does.
• Move (βj, βk) in their joint least-squares direction for (x_j, x_k) until some other predictor x_m has as much correlation with the current residual; continue until all the predictors have been entered. Stop when corr(r, x_j) = 0 for all j, i.e., at the OLS solution.
The lasso and forward stagewise regression may both be viewed as restricted versions of LAR. Forward stagewise regression incorporates variables only partially, to overcome the greediness of forward selection [23]: having identified the most explanatory variable, it adjusts that variable's coefficient by only a small epsilon in the appropriate direction at each step (Figs. 2, 3, 4, 5 and Table 1). The fitted LAR model is Y = 2.233 × 10^5 − 2.2797 × 10^4 X, with Mean Absolute Error 0.7233, Mean Squared Error 0.7853, and Root Mean Squared Error 0.8862.
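These error figures follow their standard definitions; as a quick illustration (placeholder arrays, not the paper's data), scikit-learn computes them directly:

import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

y_true = np.array([1.0, 2.5, 3.0, 4.5, 6.0])    # placeholder ground truth
y_pred = np.array([1.2, 2.1, 3.4, 4.4, 5.1])    # placeholder LAR predictions

mae = mean_absolute_error(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)                             # RMSE is the square root of MSE
print(f"MAE={mae:.4f}  MSE={mse:.4f}  RMSE={rmse:.4f}")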
5 Conclusion and Future Work

Admittedly, firmware is an often neglected yet extremely sensitive and enticing entry point, and hackers hold a clear advantage when attacking it. Firmware attacks are harder to find than ordinary software attacks, because the firmware sometimes executes before
Fig. 3 The bar chart depicts the least angle regression analysis of gadget API calls
Fig. 4 Major attribute analysis of firmware attacks using LAR
virus protection applications, and anti-virus tools have a hard time detecting the malicious code. Firmware exploits are also ideal for "bricking" a computer, i.e., rendering it unusable, and they can be the chosen tool of cyber warfare when the aim is to lock down the whole information network of a nation or an organization. This document is neither exhaustive nor complete, but it attempts to present the key principles of firmware attacks. Firmware LAR begins with all correlations equal to zero.
Fig. 5 OLS regression results for firmware attack detection on gadgets using least angle regression (LAR)
Table 1 Comparison of the proposed Firmware—LAR with existing malware detection methods

Method                | Number of malware detected | TP ratio (%) | FP detected | FP ratio (%)
Yohan                 | 910                        | 93.52        | 63          | 0.06
Xie                   | 895                        | 91.98        | 78          | 0.07
Proposed Firmware—LAR | 957                        | 98.35        | 16          | 0.01

Test set of malware samples: 973; collection of harmless samples: 1021.
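The TP ratios follow directly from the detection counts over the 973-sample malware test set, e.g., 957/973 ≈ 98.36% for the proposed method (the table reports 98.35, i.e., truncated rather than rounded). A quick check in plain Python:

malware_total = 973
for name, detected in [("Yohan", 910), ("Xie", 895), ("Proposed Firmware-LAR", 957)]:
    # True positive ratio = detected malware / total malware samples
    print(f"{name}: TP ratio = {100 * detected / malware_total:.2f}%")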
It then advances the coefficient of the regressor most correlated with the response until the next regressor is equally correlated with the residual, after which the two coefficients move in their joint direction until a further regressor couples in. Firmware LAR proceeds to attach one regressor at a time until the residual is zero or all the regressors are in the model, flagging hacker attacks along the way. The outcome is a true positive rate of 98.35% and a false positive rate of 0.01% on the gadget firmware attack samples. Future work will extend this approach to other gadget APIs that permit the execution of malicious network activity.
References

1. H. Darabian, A. Dehghantanha, S. Hashemi, M. Taheri, A. Azmoodeh, S. Homayoun, K.-K.R. Choo, R. Parizi, A multiview learning method for malware threat hunting: Windows, IoT and Android as case studies. World Wide Web 23 (2020). https://doi.org/10.1007/s11280-019-00755-0
2. S. Tahsien, H. Karimipour, P. Spachos, Machine learning based solutions for security of Internet of Things (IoT): a survey. J. Netw. Comput. Appl. 161, 102630 (2020). https://doi.org/10.1016/j.jnca.2020.102630
3. H.-T. Nguyen, Q.-D. Ngo, D.-H. Nguyen, V.-H. Le, PSI-rooted subgraph: a novel feature for IoT botnet detection using classifier algorithms. ICT Express 6 (2020). https://doi.org/10.1016/j.icte.2019.12.001
4. S. Susanto, M.A. Arifin, D. Stiawan, Y. Idris, R. Budiarto, The trend malware source of IoT network. Indonesian J. Electric. Eng. Comput. Sci. 22, 450 (2021). https://doi.org/10.11591/ijeecs.v22.i1.pp450-459
5. S. Nakhodchi, A. Upadhyay, A. Dehghantanha, A comparison between different machine learning models for IoT malware detection (2020). https://doi.org/10.1007/978-3-030-45541-5_10
6. M.N. Islam, S. Kundu, IoT security, privacy and trust in home-sharing economy via blockchain (2020). https://doi.org/10.1007/978-3-030-38181-3_3
7. Z.A. Solangi, Y.A. Solangi, S. Chandio, M.B.S.A. Aziz, M.S. Bin Hamzah, A. Shah, The future of data privacy and security concerns in Internet of Things, in 2018 IEEE International Conference on Innovative Research and Development (ICIRD), Bangkok (2018), pp. 1–4
8. https://www.thesslstore.com/blog/firmware-attacks-what-they-are-how-i-can-protect-myself/
9. https://www.plugandplaytechcenter.com/resources/firmware-security-making-case/
10. A. Yohan, N. Lo, FOTB: a secure blockchain-based firmware update framework for IoT environment. Int. J. Inf. Secur. (2019). https://doi.org/10.1007/s10207-019-00467-6
11. https://towardsdatascience.com/a-beginners-guide-to-linear-regression-in-python-with-scikit-learn-83a8f7ae2b4f
12. https://www.opswat.com/blog/who-needs-worry-about-firmware-attacks
13. http://www.ai-junkie.com/ann/som/som3.html
14. M. Arunkumar, K. Ashok Kumar, Malicious attack detection approach in cloud computing using machine learning techniques. Soft Comput. (2022). https://doi.org/10.1007/s00500-021-06679-0
15. A.M. Saghiri, K.G. HamlAbadi, M. Vahdati, The Internet of Things, artificial intelligence, and blockchain: implementation perspectives, in Advanced Applications of Blockchain Technology. Studies in Big Data, vol. 60, ed. by S. Kim, G. Deka (Springer, Singapore, 2020)
16. E. Arul, P. Angusamy, Firmware attack detection on gadgets using ridge regression (FADRR) (2021). https://doi.org/10.1007/978-981-16-0708-0_19
17. K. Balasamy, S. Suganyadevi, A fuzzy based ROI selection for encryption and watermarking in medical image using DWT and SVD. Multimed. Tools Appl. 80, 7167–7186 (2021). https://doi.org/10.1007/s11042-020-09981-5
18. P. Jayasri, A. Atchaya, M. Sanfeeya Parveen, J. Ramprasath, Intrusion detection system in software defined networks using machine learning approach. Int. J. Adv. Eng. Res. Sci. 8(4), 135–142 (2021)
19. M. Humayun, N.Z. Jhanjhi, A. Alsayat, V. Ponnusamy, Internet of things and ransomware: evolution, mitigation and prevention. Egypt. Inform. J. 22(1), 105–117 (2021). https://doi.org/10.1016/j.eij.2020.05.003
20. J. Zhang, H. Chen, L. Gong, J. Cao, Z. Gu, The current research of IoT security, in 2019 IEEE Fourth International Conference on Data Science in Cyberspace (DSC), Hangzhou, China (2019), pp. 346–353
21. https://securityboulevard.com/2019/12/anatomy-of-a-firmware-attack/
22. https://medium.com/@venali/conventional-guide-to-supervised-learning-with-scikit-learn-least-angle-regression-generalized-11b4ce2dec89
23. https://machinelearningmastery.com/implement-simple-linear-regression-scratch-python/
STEMS—Smart Traffic and Emergency Management System A. Rajagopal, Chirag C. Choradia, S. Druva Kumar, Anagha Dasa, and Shweta Yadav
Abstract Traffic congestion and bottlenecks are a major problem in many countries, including India. Failure of signals, poor law enforcement, and bad traffic management have deteriorated the road transport system. STEMS—'Smart Traffic and Emergency Management System' deals with smart handling of vehicular traffic, pedestrian crossing, and emergency services. Traffic density in each lane of a junction is determined through computer vision, and this data is used to set the traffic lights according to dynamic needs. To ensure pedestrian safety, pedestrian crossing indicators providing an audio-visual reference can be installed at every traffic junction. Keywords Traffic congestion · Smart traffic and emergency management system · Computer vision · Pedestrian safety · Audio–visual–physical reference
1 Introduction

Traffic congestion is a significant problem in many cities across the world, resulting in a significant time burden. With an increase of about 34% in 10 years, India has a road network of 5.5 million kilometers compared to just 4.1 million kilometers in 2008. In 2016, the total number of vehicles registered in India was 57,500,000, and a survey from 2018 concludes that the number of cars per 1000 people is 22. A. Rajagopal (B) · C. C. Choradia · S. Druva Kumar · A. Dasa · S. Yadav Dept. of Electronics and Communication Engineering, Dayananda Sagar College of Engineering, Kumaraswamy Layout, Bangalore 560078, Karnataka, India e-mail: [email protected] C. C. Choradia e-mail: [email protected] S. Druva Kumar e-mail: [email protected] A. Dasa e-mail: [email protected] S. Yadav e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_62
According to statistics published by the International Energy Agency, there will be a 775% increase in passenger car ownership in India, leading to a total of 175 cars per 1000 people in 2040. This increase is concentrated in metropolitan cities, as more focus is laid on the development of areas that are already under development. Lack of space, obsolete road infrastructure, and poor traffic management policies have led to an increase in road traffic congestion and bottlenecks, causing loss of revenue, time, and human life (owing to delays in providing first aid after accidents and increased ambulance travel times). Thus, it is high time to introduce a smart traffic management system capable of handling an oversaturated network of vehicles while providing reduced transit times and congestion-free pathways.
1.1 Overview

STEMS deals with smart handling of vehicular traffic, emergency services, and pedestrian crossing. This is achieved with the help of modern technologies such as computer vision, deep learning and AI, and sensor networks. A system of advanced cameras continuously monitors the traffic in each lane of a junction, and, with the application of an ML model, the wait period for each lane is dynamically calculated depending on the average initial speed of the vehicles as they start from rest. For pedestrian safety, a wide variety of sensors, including ultrasonic, IR, piezoelectric, capacitive touch, and inductive loop sensors, were considered for the implementation. A trigger from the sensor induces the alert system to warn the oncoming pedestrian of the dangers that lie ahead (Fig. 1). The Smart Traffic and Emergency Management System aims at:
• Ensuring the safety of pedestrians while crossing the road, thereby reducing the percentage of pedestrian-induced accidents.
Fig. 1 Overview of STEMS implementation
• Creating a self-decision-making, automated system capable of handling vast amounts of traffic and providing a congestion-free path while maintaining a minimum wait period at the junctions; this will also help in reducing noise and air pollution, as there will be less waiting time at traffic junctions.
1.2 Literature Survey

Among many prominent reasons, a report by Lancet Planetary Health, which states that around 1.7 million Indians died of illness caused by air pollution in 2019, motivated us to take up this project and help create a congestion-free commute system that reduces the air pollution caused by vehicular traffic for future generations. Over the years, many attempts have been made to create a better traffic management system. Among various vehicular density estimation techniques, the most successful implementations were induction-based models [1] and VANETs. The concept of vehicular ad hoc networks has been discussed as a basis for an Intelligent Traffic System (ITS) [2]. Smart vehicles of the future will be equipped with a large number of sensors, such as ECUs, optical sensors, cameras, and lidar for ranging and object detection. These sensors generate huge volumes of data, which are collected and further processed; the collected data, known as mobile big data, is not fully utilized by current traditional systems. Amplified channel propagation characteristics are the defining feature of high-mobility networks, and vehicle densities changing with location and time are another source of dynamics. Traditional wireless systems cater to low-mobility settings; in an automotive environment with high mobility, these wireless networks fail to meet the stringent QoS requirements, so new methods must be determined to deal with high-mobility networks such as VANETs. The authors of Ref. [2] explore ways in which machine learning can be employed to make use of mobile big data, to make smart decisions regarding vehicular network traffic control, and to acquire and track the dynamics of vehicular environments. One of the problem statements discussed deals with traffic flow prediction: making use of the traffic information generated from real-time data collected by various on-board and roadway sensors, traffic congestion can be alleviated. Further ways to improve this model by correlating cellular connectivity with vehicular traffic flow are also discussed, and the machine learning algorithms are covered in great detail. Successful implementation of VANETs would require a few more years of research and experimentation. The paper by Medina discusses the management of traffic signals in an oversaturated network and the effectiveness of the max-plus algorithm along with Q-learning agents for the same [3]. A realistic traffic simulation package called TRANSYT-7F was used to train and verify the machine learning algorithm. Reinforcement learning agents with communication capabilities have been studied, and a coordinating mechanism using the max-plus algorithm has been implemented to improve throughput and decrease the number of stops per vehicle. Reinforcement learning agents have
better adaptability to new situations and environments, making them suitable for real-time applications. Solving the traffic signal control problem depends on various factors: the type of driver, lane-changing behavior, desired speeds, acceleration and deceleration rates of various vehicles, and the extent of congestion. Congestion management must be done in a manner that does not disturb acceptable operational levels; if these levels are not maintained, the entire system might ultimately collapse as queue spill-backs and traffic breakdowns spread to greater areas. The traffic model evolves with time and is a stochastic process subject to the above-mentioned variables. A Markov decision process is used to enable the traffic system to control the signal timings and arrive at optimal solutions that balance the vehicular densities. The paper also discusses the well-known Q-learning algorithm, which is suited to processes requiring sequential decision-making and is applicable in high-dimensional spaces. The decentralized RL agents can act independently and also take the status of neighboring RL agents into their decision-making process. The max-plus algorithm has been used to create a well-coordinated system of RL agents capable of controlling a traffic system that alleviates congestion considerably. The sum of local terms between two nodes at a time leads to a decomposition of relations in a coordination graph; on this basis, the max-plus algorithm follows a message-passing strategy in which nodes interchange messages so that the system arrives at a final decision based on their local and global payoffs. Existing issues associated with the vision-based approach to traffic density estimation, such as illumination changes, occlusions, and congestion, have been discussed along with ways to tackle them [4]. Traffic surveillance is of three major types: vehicle counting, vehicle tracking, and holistic methods. There are traffic management models that have employed deep neural networks with image processing [4] with high accuracy, but at the cost of high latency caused by the computational complexity of such systems, and thus low efficiency. The problems associated with vehicle tracking were discussed in the earlier section. Vehicle counting depends on moving object segmentation, of which there are four main categories: frame differencing, background subtraction, object-based methods, and motion-based methods. Frame differencing cannot deal with changes in the environment such as illumination changes, noise, and background changes. Background subtraction deals well with illumination changes but requires high computational power. Object-based methods try to identify individual objects with the help of 3D models, and motion-based methods use optical flow; these two methods are highly complex, which makes them unsuitable for real-time applications and low-cost platforms. Vehicle counting methods also have to deal with shadow detection and elimination. All these hurdles make vehicle counting an unreliable way to determine traffic density [5–8]. Tracking-based systems are also ineffective, since they are subject to inaccuracies in determining the velocity of moving vehicles and to occlusions. While
holistic methods are immune to environmental changes, they require specialized hardware for real-time implementation. The paper titled "Real-time road traffic density estimation using block variance" presents a block method to estimate lane-wise traffic density [9]: each lane is divided into several blocks, and depending on the number of blocks occupied by vehicles, the overall percentage of occupancy is determined, giving an estimate of the traffic density in that lane. The computational complexity required to analyze the data is decreased by a block-based background construction and update method, which uses only the intensity variance of blocks. To tackle illumination changes and to differentiate vehicles from shadows, a vehicle block detection technique is used. Current road traffic research tends to neglect pedestrian traffic as an important factor in building road infrastructure [10]. This negligence leads to poor installation of pedestrian crossing facilities, which causes pedestrian-related accidents and unnecessary loss of lives. Traffic research also tends to ignore the fact that pedestrian crossings rely heavily on individual pace, which determines the time needed to cross the road. These factors are responsible for pedestrian crossing violations, traffic disorder, and even traffic accidents. The aim of a safe pedestrian crossing facility should be to reduce conflicts between vehicles and pedestrians: the two must be separated in terms of timing and spacing. This separation can be achieved through regulations and laws, and/or infrastructure such as traffic islands, separation guardrails, and grade-separated crossings. Based on the literature survey, an efficient traffic system should consider the expected traffic flow, the types of lanes and junctions, the existing infrastructure, the ideal wait period, the pattern of signaling, and traffic analysis. A system needs to be designed that is easy to implement, low-cost, computationally light, pedestrian-friendly, and efficient.
1.3 Objectives This project’s main objective was to alleviate the existing traffic congestions in metropolitan cities by creating a system that can monitor and control the traffic flow effectively. Our project objectives can be divided into three sub-objectives as follows: • Ensuring the safety of pedestrians while crossing road, thereby reducing the percentage of pedestrian-induced accidents. • Creating a self-decision-making, automated system, capable of handling vast amounts of traffic, and providing a congestion-free path while also maintaining a minimum wait period at the junctions. • Reducing noise and air pollution, minimizing traffic congestions, and removing bottle necks through the fulfillment of the above stated objectives.
2 Methodology

As eyes are integral to the human vision system, so are the cameras in the traffic management system. It is vital to choose a camera with good resolution, low operating power, and low maintenance, as it will be deployed on the road network. While the eyes merely capture the scene, the actual formation of the image and its perception are carried out by the brain; analogous to the brain is the Micro-Processing Unit (MPU). It is the MPU that interprets the data from the camera and takes the vital decision of dynamically changing the signal wait periods. As the data from the camera is too bulky and complex in its raw form, a few preprocessing steps are carried out. The first of them is edge detection, which establishes the side boundaries of the road and classifies the road as either one-way or two-way. Method 1: The next step is identifying vehicles at rest, waiting for the traffic light to turn green, and classifying them into different types (such as scooters, cars, buses, light cargo vehicles, and heavy cargo vehicles). Depending on the type of vehicle and the average road area it occupies, the density of traffic in the observed lane is estimated. Method 2: As vehicles approach the stop line of the lane while the signal is red, they come to a halt. As each vehicle halts, the image processing unit groups them into a single block, and the traffic density is estimated from the ratio of the area of the grouped block to the lane boundaries. The estimated densities of all the lanes of a junction are compared, and depending on the average throughput of vehicles at the junction, the signal wait period for each lane is calculated. Once the junction is set to operate in either the clockwise or the anti-clockwise direction, the same order of change is followed unless it is changed manually.
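As a rough sketch of Method 2 (the video source, lane coordinates, and threshold are hypothetical, not the deployed configuration), OpenCV's background subtractor can approximate the occupied fraction of a lane region. Note that long-halted vehicles are gradually absorbed into the learned background, a weakness the SSIM approach of Sect. 4.1 avoids by comparing against a fixed empty-lane reference:

import cv2

cap = cv2.VideoCapture("junction.mp4")   # assumed recording or camera feed
subtractor = cv2.createBackgroundSubtractorMOG2(history=300, detectShadows=True)
LANE_X, LANE_Y, LANE_W, LANE_H = 100, 200, 300, 400   # assumed lane ROI in pixels

while True:
    ok, frame = cap.read()
    if not ok:
        break
    lane = frame[LANE_Y:LANE_Y + LANE_H, LANE_X:LANE_X + LANE_W]
    mask = subtractor.apply(lane)
    # MOG2 marks shadow pixels as 127; keep only confident foreground (255).
    _, mask = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)
    occupancy = cv2.countNonZero(mask) / float(LANE_W * LANE_H)
    print(f"lane occupancy ~ {occupancy:.0%}")

cap.release()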
3 Block Diagram

3.1 Block Diagram of Traffic Management System

Figure 2 illustrates the working principle behind the traffic management system. A CCTV camera overseeing the lane constantly generates the live camera feed, which is fed to the pre-processor that continuously reads frames one after another. To reduce the required computational power and the stress on the system, the frames are converted from BGR to grayscale for further processing. The grayscale images are then fed into the SSIM sub-program for estimation of the vehicular density in the given lane. To further reduce the computational stress on the system, the vehicle density in a lane is estimated only when its signal wait period falls to zero. The estimated vehicle density value is then used for the dynamic calculation of the signal
Fig. 2 Block diagram of traffic management system
go period for the lane in question. A double feedback loop, i.e., a positive as well as a negative feedback loop, is designed to calculate the most appropriate signal go period, intelligently increasing or decreasing the period depending on the degree of increase or decrease in the lane's vehicular density compared to the previously estimated value.
3.2 Block Diagram of Pedestrian Safety System

Figure 3 illustrates the working principle behind the pedestrian safety system. The alert system activates and starts sensing the unruly crossing of pedestrians once the lane in question has been given a GO/GREEN signal; in the case of a STOP/RED signal, the alert system stays idle. While the signal status forms one of the triggers, the other is the pedestrian himself/herself. If the alert system is active and a pedestrian is detected crossing the road, the buzzer is triggered to sound a high-pitched note that grabs the pedestrian's attention, and an LED system starts flashing as a visual reference.
Fig. 3 Block diagram of pedestrian safety system
4 Algorithm

4.1 Density Estimation

Conventional computer vision techniques in the spirit of background subtraction were used to find density; a module computing the structural similarity index (SSIM) was employed for this purpose. SSIM predicts the perceived quality of different types of pictures, as well as frames in videos. The SSIM index is a full-reference metric: an image without distortion is taken as the reference, and it is in turn used to measure the quality of incoming images. This proved to be a holistic method of density estimation that uses very little computational power. The entire program was written in Python using the PyCharm IDE on a Raspberry Pi 4B. The following steps are involved in density estimation:
• Start capturing frames in real time.
• Assign a reference frame.
• Perform gray scaling on the reference frame and incoming frames; this helps reduce the computational power required to run the program.
• Perform SSIM on the incoming frames and calculate the relative density in each lane.
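A minimal sketch of these steps (the camera index and the use of an empty-lane frame as reference are assumptions for illustration, not the deployed configuration) using scikit-image's SSIM implementation: the lower the similarity to the empty-lane reference, the higher the inferred vehicle density.

import cv2
from skimage.metrics import structural_similarity as ssim

cap = cv2.VideoCapture(0)                       # assumed camera index
ok, reference = cap.read()                      # reference frame: empty lane
ref_gray = cv2.cvtColor(reference, cv2.COLOR_BGR2GRAY)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    score = ssim(ref_gray, gray)                # 1.0 means identical to empty lane
    density = 1.0 - score                       # relative density proxy
    print(f"SSIM={score:.3f}  relative density={density:.3f}")

cap.release()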
4.2 Signal Wait/Go Period Calculation

In order to calculate the signal go period for each lane, a feedback loop must be created; a series of if-else statements was used to achieve this. Simple operations were tested and later converted into functions that could be called at any time during execution, which provided modularity and simplified the subsequent steps. The SSIM values obtained in the previous section determine the signal go period for a particular lane. Default wait periods are given initially; later, depending on the real-time SSIM values, the feedback loop updates the GO period for that lane. When a particular lane has a GO signal, SSIM is evaluated in the alternate lane, and based on this the alternate lane's GO period is calculated. A generic flowchart for determination of the signal wait period is given in Fig. 4.
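One way to write such a feedback rule is shown in this sketch (the bounds and gain below are illustrative assumptions, not the authors' tuned values):

MIN_GO, MAX_GO = 10, 60   # assumed safety bounds, in seconds
GAIN = 30                 # assumed seconds of GO time per unit change in density

def next_go_period(prev_go, density, prev_density):
    # Lengthen GO when density rose since the last cycle, shorten it when
    # density fell, then clamp the result to the safety bounds.
    adjusted = prev_go + GAIN * (density - prev_density)
    return max(MIN_GO, min(MAX_GO, adjusted))

# Example: density climbing from 0.30 to 0.55 stretches a 20 s GO period to 27.5 s.
print(next_go_period(20, 0.55, 0.30))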
4.3 Switching of Traffic Lights A specific order of switching of lights is maintained. This ensures that the switching of traffic lights goes in an orderly fashion. An interval of three seconds is given between two GO signals so that drivers get enough time to register the change in traffic signals and STOP/GO accordingly.
Fig. 4 Signal wait period calculation
4.4 Activation of Pedestrian Safety System

The following algorithm was used to develop the pedestrian safety system:
• A sensing element capable of detecting human presence is selected; we chose an IR sensor for convenience.
• A controller is required to determine the course of action when a green signal is active and a person tries to cross the road; we used a Raspberry Pi 4B for this purpose.
• An alerting system is needed to warn a pedestrian who tries to cross the road during a green signal for vehicular traffic; a buzzer can alert a pedestrian effectively.
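A minimal Raspberry Pi sketch of this logic (the BCM pin assignments, the active-high IR output, and the signal-state stub are assumptions for illustration):

import time
import RPi.GPIO as GPIO

IR_PIN, BUZZER_PIN, LED_PIN = 17, 27, 22   # assumed BCM pin numbers

GPIO.setmode(GPIO.BCM)
GPIO.setup(IR_PIN, GPIO.IN)
GPIO.setup(BUZZER_PIN, GPIO.OUT)
GPIO.setup(LED_PIN, GPIO.OUT)

def vehicles_have_green():
    # Stub: in STEMS this state comes from the traffic-light controller.
    return True

try:
    while True:
        crossing = GPIO.input(IR_PIN) == GPIO.HIGH   # pedestrian on the crossing
        alert = vehicles_have_green() and crossing
        GPIO.output(BUZZER_PIN, alert)               # audible warning
        GPIO.output(LED_PIN, alert)                  # flashing visual warning
        time.sleep(0.1)
finally:
    GPIO.cleanup()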
5 Results and Analysis

Deployed on a small-scale, two-lane traffic junction model, the system was observed to measure vehicle density with up to about 85% accuracy and to dynamically change the signal period for the lanes. With the successful implementation of STEMS, we estimate a 15–25% reduction in traffic congestion, leading to better traffic movement. As vehicular congestion reduces, the signal wait duration also reduces, by approximately 30%. As vehicles would wait for a shorter duration at the junction, fuel consumption would fall and engines would stay efficient for longer; this in turn would improve air quality as well as reduce noise pollution. With shorter transit times, the urge to drive faster to reach the destination on time would decrease, resulting in fewer road accidents. With the provisions made for audio-visual reference for pedestrian crossing, the rate of pedestrian-induced accidents will reduce considerably. As for future work, the system can be extended to clear traffic and provide an obstruction-free path for emergency services such as medical, police,
Fig. 5 STEMS for a two-lane model
Fig. 6 Pedestrian safety node for a lane
and fire [11]. The model can be trained and modified to monitor vehicles for traffic law violations and issue e-challans [12]. It can also be trained to predict the upcoming traffic pattern based on the present and past patterns from nearby traffic junctions and provide a congestion-free path (Figs. 5, 6, and 7).
6 Advantages and Limitations

6.1 Advantages

• Walking is an essential non-motorized transport mode. To achieve safe walkability goals, especially for elderly and physically challenged people, it is essential to deploy a model capable of alerting them to the impending moving
Fig. 7 Vehicle density estimation for a single lane
vehicular traffic, and herein lies the importance of our project, which fits this requirement well.
• This model greatly helps in reducing air and noise pollution: with the help of computer vision and machine learning algorithms, vehicle density is calculated and the wait period determined, which in turn reduces traffic congestion.
• The model works in real time, constantly monitoring and controlling traffic flow with the help of advanced cameras installed at the junctions.
6.2 Limitations

• Traffic congestion caused by crowding of vehicles at a particular spot or by blocked roads is difficult to mitigate, as its source is not traffic signal timing mismanagement.
• Vegetation cover along the roads tends to obstruct the cameras' field of view, which may lead to erroneous decisions.
• The constantly changing illumination in the open world casts varying shadows from time to time, which must be accounted for while calculating the vehicular density.
• The system currently developed is for a two-lane traffic junction. For a three-, four-, or five-lane junction, the algorithm for the signal go/wait period becomes more complex and involves a larger number of nested feedback loops.
• STEMS alerts pedestrians who are crossing the roads at the zebra crossings only.
• The system alerts pedestrians who try to cross the road in unsafe situations, but it does not physically stop them from crossing.
7 Conclusion

With the successful implementation of STEMS, we estimate a 15–25% reduction in traffic congestion, leading to better traffic movement. As vehicles would wait for a shorter duration at the junction, fuel consumption would fall and engines would stay efficient for longer; this in turn would improve air quality as well as reduce noise pollution. With shorter transit times, the urge to drive faster to reach the destination on time would decrease, resulting in fewer road accidents. With the provisions made for audio–visual–physical reference for pedestrian crossing, the rate of pedestrian-induced accidents reduces considerably. As for future work, the system can be developed for a multi-lane junction and extended to clear traffic and provide an obstruction-free path for emergency services. The model can be trained and modified to monitor vehicles for traffic law violations and issue e-challans. It can also be trained to predict upcoming traffic patterns based on the present and past patterns from nearby traffic junctions, provide a congestion-free path, and help analysts suggest radical changes to the road and transport structure to promote faster commutes. The pedestrian safety system can also be further developed to include physical barriers that prevent pedestrians from physically crossing the road to a greater extent.
References

1. L. Bhaskar, A. Sahai, D. Sinha, G. Varshney, T. Jain, Intelligent traffic light controller using inductive loops for vehicle detection, in 2015 1st International Conference on Next Generation Computing Technologies (NGCT), Dehradun (2015), pp. 518–522. https://doi.org/10.1109/NGCT.2015.7375173
2. L. Liang, H. Ye, G.Y. Li, Toward intelligent vehicular networks: a machine learning framework. IEEE Internet Things J. 6(1), 124–135 (2019). https://doi.org/10.1109/JIOT.2018.2872122
3. J. Medina, R. Benekohal, Traffic signal control using reinforcement learning and the max-plus algorithm as a coordinating strategy, in Conference Record—IEEE Conference on Intelligent Transportation Systems (2012), pp. 596–601. https://doi.org/10.1109/ITSC.2012.6338911
4. J. Chung, K. Sohn, Image-based learning to measure traffic density using a deep convolutional neural network, in IEEE Transactions on Intelligent Transportation Systems (2017), pp. 1–6. https://doi.org/10.1109/TITS.2017.2732029
5. M.-C. Huang, S.-H. Yen, A real-time and color-based computer vision for traffic monitoring system, in 2004 IEEE International Conference on Multimedia and Expo (ICME) (IEEE Cat. No.04TH8763), vol. 3, Taipei (2004), pp. 2119–2122. https://doi.org/10.1109/ICME.2004.1394685
6. T. Osman, S.S. Psyche, J.M. Shafi Ferdous, H.U. Zaman, Intelligent traffic management system for cross section of roads using computer vision, in 2017 IEEE 7th Annual Computing and Communication Workshop and Conference (CCWC), Las Vegas, NV (2017), pp. 1–7. https://doi.org/10.1109/CCWC.2017.7868350
7. M.H. Malhi, M.H. Aslam, F. Saeed, O. Javed, M. Fraz, Vision based intelligent traffic management system, in 2011 Frontiers of Information Technology, Islamabad (2011), pp. 137–141. https://doi.org/10.1109/FIT.2011.33
8. Krishna, M. Poddar, M.K. Giridhar, A.S. Prabhu, V. Umadevi, Automated traffic monitoring system using computer vision, in 2016 International Conference on ICT in Business Industry & Government (ICTBIG), Indore (2016), pp. 1–5. https://doi.org/10.1109/ICTBIG.2016.7892717
9. K. Garg, S. Lam, T. Srikanthan, V. Agarwal, Real-time road traffic density estimation using block variance, in 2016 IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Placid, NY (2016), pp. 1–9. https://doi.org/10.1109/WACV.2016.7477607
10. D. Qiu, Q. Xu, J. Zhang, Improvement of pedestrian crossing safety on urban roads, in 2010 International Conference on Intelligent Computation Technology and Automation, Changsha (2010), pp. 514–517. https://doi.org/10.1109/ICICTA.2010.148
11. P.A. Kamble, R.A. Vatti, Bus tracking and monitoring using RFID, in 2017 Fourth International Conference on Image Information Processing (ICIIP), Shimla (2017), pp. 1–6. https://doi.org/10.1109/ICIIP.2017.8313748
12. R.J. Franklin, Mohana, Traffic signal violation detection using artificial intelligence and deep learning, in 2020 5th International Conference on Communication and Electronics Systems (ICCES), Coimbatore, India (2020), pp. 839–844. https://doi.org/10.1109/ICCES48766.2020.9137873
Retraction Note to: High Accuracy for Hyperspectral Image Classification Using Hybrid Spectral 3D-2D CNN Mohini Shivhare and Sweta Tripathi
Retraction Note to: Chapter 13 in: J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_13 The Series Editor and the Publisher have retracted this chapter. An investigation by the Publisher found a number of chapters, including this one, with various concerns, including but not limited to compromised editorial handling, incoherent text or tortured phrases, inappropriate or non-relevant references, or a mismatch with the scope of the series and/or book volume. Based on the findings of the investigation, the Series Editor therefore no longer has confidence in the results and conclusions of this chapter. The authors have not responded to correspondence regarding this retraction.
The retracted version of this chapter can be found at https://doi.org/10.1007/978-981-19-2894-9_13
Author Index

A
Achuthan, Geetha, 179
Afshar Alam, M., 491
Ajitesh Nair, 599
Alaguraja, R. A., 301
Alekya, B., 271
Alvi, A. S., 541
Aman Sharma, 141
Anagha Dasa, 811
Anand Vaidya, 713
Anish Ghiya, 15
Ankit Mundra, 317
Ankit Yadav, 727
Anshita Malviya, 355
Apeksha Arun Wadhe, 207
Aravinth, P., 343
Archana, N. V., 457
Archana, R., 101
Arul, E., 801
Arun Kumar Yadav, 403
Ashish Choudhary, 59
Ashish Kumar, 317
Asirvatham Masilamani, 551
Atharva Naik, 317

B
Balasubramanian, G., 791
Bali Devi, 191
Banakar, R. M., 577
Basanagouda F. Ronad, 745
Behera, H. S., 427
Bharath Vinay Reddy, S., 569
Bharti Shukla, 403
Bhavna Saini, 727
Bhuvaneshwari, S., 613

C
Cheedella Akhil, 239
Chirag C. Choradia, 811
Chiyangwa, T. B., 151

D
Dakshata Argade, 531
Darshee Mehta, 43
Dhamodaran, S., 569
Dhanya, E., 737
Dhiyanesh, B., 1
Dhruv Arora, 191
Dipak Pralhad Mahurkar, 559
Dipak Ramoliya, 671
Divakar Yadav, 403
Druva Kumar, S., 811
Duy Hung, Phan, 91

E
Ezhil E. Nithila, 551

F
Fadhil, Hilal A., 179
Foram Bhut, 671

G
Geetika Narang, 369
Gigi Thomas, 737
Gopikakumari, R., 589

H
Hariprasad Reddy, P., 473
Harish Gowda, G. P., 473
Harold Robinson, Y., 415
Hasoon, Feras N., 179
Hemant Patidar, 559
Himanshu V. Taiwade, 695
Hitesh Chhikaniwala, 43

I
Ishan Padhy, 599

J
Jagadevi N. Kalshetty, 127
Jainab Zareena, 115
Janani, R. P., 415
Janmenjoy Nayak, 427
Jany Shabu, S. L., 569
Jill K. Mathew, 737

K
Kabilesh, S. K., 505
Kalaiarasi, G., 505
Kannan Krithivasan, 777
Kannan, P., 415
Kanthamani, S., 613
Karyy, Oleh, 75
Kashish Sorathia, 671
Kaushal Shah, 247
Kavita, 727
Kavya Shetty, 127
Kesavan, Suresh Manic, 179
Keshav Kumar, 141
Khalaf Aal Thani, Mustafa, 179
Khush Dassani, 473
Khushi Patel, 671
Kisiołek, Artur, 75
Korrapati Pravallika, 261
Kottam Akshay Reddy, 239
Krithika, M., 115
Kulyniak, Ihor, 75

L
Lakshmi Narayanan, K., 415
Lall, M., 151
Lokesh B. Bhajantri, 517
Lubna Ansari, 491

M
Mage Reena Varghese, 289
Magesh, D., 1
Maithri Suresh, 127
Malik Nasibullah, 687
Mamta Soni, 59
Manivannan Doraipandian, 777
Marimuthu Karuppiah, 15
Md. Tabrez Nafis, 491
Megha B. Kunder, 127
Mehul Vala, 327
M. Kameswara Rao, 261
Mohamed Mansoor Roomi, S., 301
Mohammed Asim, 687
Mohanapriya, D., 505
Mohd Abdul Ahad, 491
Monicashree, M., 141
Muthukumaran, G., 443
Muthuram, G., 1

N
Nandalal Vijayakumar, 551
Nandhini, J., 505
Naveenkumar, E., 1
Naveen, V., 343
Neev Shah, 247
Nida Shakeel, 381
Nikhil, J. K., 599
Nirmala Jegadeesan, 791
Nirmal Jothi, 551
Nisarg Dave, 247
Nishant N. Pachpor, 761

P
Palanivel Srinivasan, 777
Pallavi, K. N., 127
Pallavi Kudal, 59
Paritosh Joshi, 247
Pavithra, A. C., 457
Payal Mahipal, 59
Pelusi, Danilo, 15, 427
Plasin Francis Dias, 577
Prakash S. Parsad, 761
Premchand B. Ambhore, 695
Prince Dawar, 59
Priya, K., 301
Punidha, A., 801
Pusuluri Sidhartha Aravind, 15

R
Radha, R., 1
Radha SenthilKumar, 343
Rahul, C., 589
Rajagopal, A., 811
Rajathilagam, B., 101
Rajendra Kumar Dwivedi, 355, 381
Ramani, S., 15
Ramesh, S. R., 29
Ramya Sabapathi, 777
Raphiri, T. S., 151
Refonaa, J., 569
Relin Francis Raj, J., 551
Renuka, K., 415
Rohit Kumar Gupta, 317
Rohma Usmani, 687
Rupali A. Meshram, 541

S
Salim G. Shaikh, 369, 761
Samiksha Shukla, 657
Santhana Krishnan, R., 415
Saritha, B., 505
Sarojadevi, H., 473
Satyanarayana Murthy, 271
Satyankar Bhardwaj, 191
Satya Ranga Vara Prasad, R., 569
Saurabh Suthar, 317
Sayam Rahul, 239
Sehaj Jot Singh, 473
Selvakumar George, 551
Selvakumar, P., 443
Selvanathan, N., 1
Shanthi, D. L., 223
Shikha Singh, 631
Shivani Desai, 43
Shivhare, Mohini, 167
Shraddha Suratkar, 207
Shridevi Angadi, 657
Shrihari M. Joshi, 713
Shweta Yadav, 811
Sindhura, S., 599
Sonam Gupta, 403
Sowmya Natarajan, 643
Sreevatsan Radhakrishnan, 29
Sri Hari Balaji, M., 343
Stephen Sagayaraj, A., 505
Sudheesh, P., 239
Sujarani Rajendran, 777
Sumit Srivastava, 191
Sunny Dawar, 59
Suresh, B., 271
Suresh Kumar, B., 369, 761
Suresh Kumar Pemmada, 427
Syamala Tejaswini, 261
Syed Ishtiyaq Ahmed, 29

T
Tabassum N. Mujawar, 517
Thanh Van, Pham, 91
Tripathi, Sweta, 167

V
Vaishali Khairnar, 531
Vandana, M. L., 141, 599
Vasuki, P., 301
Vatsal Sinha, 141
Venkatesh Gauri Shankar, 191
Vijayakumar Ponnusamy, 643
Vijaya Shetty, S., 473
Vijay Dulera, 43
Vijay Krishna, B. S., 141, 599
Vishal Vora, 327

X
X. Anitha Mary, 289

Y
Yagnesh B. Shukla, 631