Lecture Notes in Networks and Systems Volume 458
Series Editor Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland Advisory Editors Fernando Gomide, Department of Computer Engineering and Automation—DCA, School of Electrical and Computer Engineering—FEEC, University of Campinas— UNICAMP, São Paulo, Brazil Okyay Kaynak, Department of Electrical and Electronic Engineering, Bogazici University, Istanbul, Turkey Derong Liu, Department of Electrical and Computer Engineering, University of Illinois at Chicago, Chicago, USA Institute of Automation, Chinese Academy of Sciences, Beijing, China Witold Pedrycz, Department of Electrical and Computer Engineering, University of Alberta, Alberta, Canada Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland Marios M. Polycarpou, Department of Electrical and Computer Engineering, KIOS Research Center for Intelligent Systems and Networks, University of Cyprus, Nicosia, Cyprus Imre J. Rudas, Óbuda University, Budapest, Hungary Jun Wang, Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
The series “Lecture Notes in Networks and Systems” publishes the latest developments in Networks and Systems—quickly, informally and with high quality. Original research reported in proceedings and post-proceedings represents the core of LNNS. Volumes published in LNNS embrace all aspects and subfields of, as well as new challenges in, Networks and Systems. The series contains proceedings and edited volumes in systems and networks, spanning the areas of Cyber-Physical Systems, Autonomous Systems, Sensor Networks, Control Systems, Energy Systems, Automotive Systems, Biological Systems, Vehicular Networking and Connected Vehicles, Aerospace Systems, Automation, Manufacturing, Smart Grids, Nonlinear Systems, Power Systems, Robotics, Social Systems, Economic Systems and other. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution and exposure which enable both a wide and rapid dissemination of research output. The series covers the theory, applications, and perspectives on the state of the art and future developments relevant to systems and networks, decision making, control, complex processes and related areas, as embedded in the fields of interdisciplinary and applied sciences, engineering, computer science, physics, economics, social, and life sciences, as well as the paradigms and methodologies behind them. Indexed by SCOPUS, INSPEC, WTI Frankfurt eG, zbMATH, SCImago. All books published in the series are submitted for consideration in Web of Science. For proposals from Asia please contact Aninda Bose (aninda.bose@springer. com).
Jennifer S. Raj · Yong Shi · Danilo Pelusi · Valentina Emilia Balas Editors
Intelligent Sustainable Systems Proceedings of ICISS 2022
Editors Jennifer S. Raj Gnanamani College of Engineering and Technology Namakkal, India
Yong Shi Department of Computer Science Kennesaw State University Kennesaw, GA, USA
Danilo Pelusi Faculty of Communication Sciences University of Teramo Teramo, Italy
Valentina Emilia Balas Automatics and Applied Software Aurel Vlaicu University of Arad Arad, Romania
ISSN 2367-3370 ISSN 2367-3389 (electronic) Lecture Notes in Networks and Systems ISBN 978-981-19-2893-2 ISBN 978-981-19-2894-9 (eBook) https://doi.org/10.1007/978-981-19-2894-9 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022, corrected publication 2024 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
We are honored to dedicate the proceedings of the 5th ICISS 2022 to all of its participants, organizers, and editors.
Preface
With deep gratification, we are delighted to welcome you to the proceedings of the 5th International Conference on Intelligent Sustainable Systems (ICISS 2022), organized at SCAD College of Engineering and Technology, Tirunelveli, India, on February 17–18, 2022. The major goal of this international conference is to gather academicians, industrialists, researchers, and scholars on a common platform to share their innovative research ideas and practical solutions toward the development of intelligent sustainable systems for a more sustainable future. The conference delegates had a wide range of technical sessions based on the different technical domains involved in the theme of the conference. The conference program included invited keynote sessions on developing a sustainable future, state-of-the-art research presentations, and informative discussions with the distinguished keynote speakers, covering a wide range of topics in information systems and sustainability research. This year, ICISS received 312 papers across the conference tracks, and based on 3–4 expert reviews from the technical program committee and the internal and external reviewers, 62 papers were finally selected for the conference. The conference proceedings include papers from tracks such as intelligent systems, sustainable systems, and applications. Each paper, regardless of track, received at least three reviews from reviewers with professional expertise in the particular research domain of the paper. We are pleased to thank the conference organization committee, conference program committee, and technical reviewers for working generously toward the success of the conference event. A special mention goes to the internal and external reviewers for working very hard in reviewing each and every paper submitted to the conference and for giving valuable suggestions to the authors to maintain the quality of the conference. We are truly obliged to the authors, who have contributed their innovative research results to the conference. Special thanks go to Springer Publications for their impeccable support and guidance throughout the publication process.
We hope the proceedings of ICISS 2022 will provide an enjoyable and technically rewarding experience for both attendees and readers. Namakkal, India Kennesaw, USA Teramo, Italy Arad, Romania
Dr. Jennifer S. Raj Dr. Yong Shi Dr. Danilo Pelusi Dr. Valentina Emilia Balas
Contents
Lung Ultrasound COVID-19 Detection Using Deep Feature Recursive Neural Network . . . 1
E. Naveenkumar, B. Dhiyanesh, D. Magesh, G. Muthuram, N. Selvanathan, and R. Radha
Predicting New York Taxi Trip Duration Based on Regression Analysis Using ML and Time Series Forecasting Using DL . . . 15
S. Ramani, Anish Ghiya, Pusuluri Sidhartha Aravind, Marimuthu Karuppiah, and Danilo Pelusi
Implementation of Classical Error Control Codes for Memory Storage Systems Using VERILOG . . . 29
Sreevatsan Radhakrishnan, Syed Ishtiyaq Ahmed, and S. R. Ramesh
Parkinson's Disease Detection Using Machine Learning . . . 43
Shivani Desai, Darshee Mehta, Vijay Dulera, and Hitesh Chhikaniwala
Sustainable Consumption: An Approach to Achieve the Sustainable Environment in India . . . 59
Sunny Dawar, Pallavi Kudal, Prince Dawar, Mamta Soni, Payal Mahipal, and Ashish Choudhary
The Concept of a Digital Marketing Communication Model for Higher Education Institutions . . . 75
Artur Kisiołek, Oleh Karyy, and Ihor Kulyniak
A Lightweight Image Colorization Model Based on U-Net Architecture . . . 91
Pham Van Thanh and Phan Duy Hung
Comparative Analysis of Obesity Level Estimation Based on Lifestyle Using Machine Learning . . . 101
R. Archana and B. Rajathilagam
An Empirical Study on Millennials’ Adoption of Mobile Wallets . . . . . . . 115 M. Krithika and Jainab Zareena An IoT-Based Smart Mirror . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127 K. N. Pallavi, Jagadevi N. Kalshetty, Maithri Suresh, Megha B. Kunder, and Kavya Shetty AI-Assisted College Recommendation System . . . . . . . . . . . . . . . . . . . . . . . . 141 Keshav Kumar, Vatsal Sinha, Aman Sharma, M. Monicashree, M. L. Vandana, and B. S. Vijay Krishna An Agent-Based Model to Predict Student Protest in Public Higher Education Institution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151 T. S. Raphiri, M. Lall, and T. B. Chiyangwa RETRACTED CHAPTER: High Accuracy for Hyperspectral Image Classification Using Hybrid Spectral 3D-2D CNN . . . . . . . . . . . . . . 167 Mohini Shivhare and Sweta Tripathi Design Smart Curtain Using Light-Dependent Resistor . . . . . . . . . . . . . . . 179 Feras N. Hasoon, Mustafa Khalaf Aal Thani, Hilal A. Fadhil, Geetha Achuthan, and Suresh Manic Kesavan Machine Learning Assisted Binary and Multiclass Parkinson’s Disease Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191 Satyankar Bhardwaj, Dhruv Arora, Bali Devi, Venkatesh Gauri Shankar, and Sumit Srivastava Category Based Location Aware Tourist Place Popularity Prediction and Recommendation System Using Machine Learning Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207 Apeksha Arun Wadhe and Shraddha Suratkar Maximization of Disjoint K-cover Using Computation Intelligence to Improve WSN Lifetime . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223 D. L. Shanthi Car-Like Robot Tracking Using Particle Filter . . . . . . . . . . . . . . . . . . . . . . . 239 Cheedella Akhil, Sayam Rahul, Kottam Akshay Reddy, and P. Sudheesh Secured E-voting System Through Blockchain Technology . . . . . . . . . . . . 247 Nisarg Dave, Neev Shah, Paritosh Joshi, and Kaushal Shah A Novel Framework for Malpractice Detection in Online Proctoring Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261 Korrapati Pravallika, M. Kameswara Rao, and Syamala Tejaswini Frequency Reconfigurable of Quad-Band MIMO Slot Antenna for Wireless Communication Applications in LTE, GSM, WLAN, and WiMAX Frequency Bands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271 B. Suresh, Satyanarayana Murthy, and B. Alekya
Intelligent Control Strategies Implemented in Trajectory Tracking of Underwater Vehicles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289 Mage Reena Varghese and X. Anitha Mary Fused Feature-Driven ANN Model for Estimating Code-Mixing Level in Audio Samples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 301 K. Priya, S. Mohamed Mansoor Roomi, R. A. Alaguraja, and P. Vasuki Pre-emptive Caching of Video Content Using Predictive Analysis . . . . . . 317 Rohit Kumar Gupta, Atharva Naik, Saurabh Suthar, Ashish Kumar, and Ankit Mundra Information Dissemination Strategies for Safety Applications in VANET: A Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327 Mehul Vala and Vishal Vora Tech Stack Prediction Using Hybrid ARIMA and LSTM Model . . . . . . . 343 Radha SenthilKumar, V. Naveen, M. Sri Hari Balaji, and P. Aravinth Deceptive News Prediction in Social Media Using Machine Learning Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355 Anshita Malviya and Rajendra Kumar Dwivedi Several Categories of the Classification and Recommendation Models for Dengue Disease: A Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 369 Salim G. Shaikh, B. Suresh Kumar, and Geetika Narang Performance Analysis of Supervised Machine Learning Algorithms for Detection of Cyberbullying in Twitter . . . . . . . . . . . . . . . . . . . . . . . . . . . . 381 Nida Shakeel and Rajendra Kumar Dwivedi Text Summarization of Legal Documents Using Reinforcement Learning: A Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 403 Bharti Shukla, Sonam Gupta, Arun Kumar Yadav, and Divakar Yadav Use of Near-field Communication (NFC) and Fingerprint Technology for Authentication of ATM Transactions . . . . . . . . . . . . . . . . . . 415 K. Renuka, R. P. Janani, K. Lakshmi Narayanan, P. Kannan, R. Santhana Krishnan, and Y. Harold Robinson Light Gradient Boosting Machine in Software Defect Prediction: Concurrent Feature Selection and Hyper Parameter Tuning . . . . . . . . . . . 427 Suresh Kumar Pemmada, Janmenjoy Nayak, H. S. Behera, and Danilo Pelusi An Three-Level Active NPC Inverter Open-Circuit Fault Diagnosis Using SVM and ANN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 443 P. Selvakumar and G. Muthukumaran
Hybrid Control Design Techniques for Aircraft Yaw and Roll Control System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 457 A. C. Pavithra and N. V. Archana A Review of the Techniques and Evaluation Parameters for Recommendation Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 473 S. Vijaya Shetty, Khush Dassani, G. P. Harish Gowda, H. Sarojadevi, P. Hariprasad Reddy, and Sehaj Jot Singh Fostering Smart Cities and Smart Governance Using Cloud Computing Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 491 Lubna Ansari, M. Afshar Alam, Mohd Abdul Ahad, and Md. Tabrez Nafis IoT-Enabled Smart Helmet for Site Workers . . . . . . . . . . . . . . . . . . . . . . . . . 505 D. Mohanapriya, S. K. Kabilesh, J. Nandhini, A. Stephen Sagayaraj, G. Kalaiarasi, and B. Saritha Efficient Direct and Immediate User Revocable Attribute-Based Encryption Scheme . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 517 Tabassum N. Mujawar and Lokesh B. Bhajantri Comparative Analysis of Deep Learning-Based Abstractive Text Summarization Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 531 Dakshata Argade and Vaishali Khairnar Crop Disease Prediction Using Computational Machine Learning Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 541 Rupali A. Meshram and A. S. Alvi A Survey on Design Issues, Challenges, and Applications of Terahertz based 6G Communication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 551 Selvakumar George, Nandalal Vijayakumar, Asirvatham Masilamani, Ezhil E. Nithila, Nirmal Jothi, and J. Relin Francis Raj A Study of Image Characteristics and Classifiers Utilized for Identify Leaves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 559 Dipak Pralhad Mahurkar and Hemant Patidar COVID-19 Detection Using X-Ray Images by Using Convolutional Neural Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 569 S. L. Jany Shabu, S. Bharath Vinay Reddy, R. Satya Ranga Vara Prasad, J. Refonaa, and S. Dhamodaran Polarimetric Technique for Forest Target Detection Using Scattering-Based Vector Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 577 Plasin Francis Dias and R. M. Banakar Multilingual Identification Using Deep Learning . . . . . . . . . . . . . . . . . . . . . 589 C. Rahul and R. Gopikakumari
AI-Based Career Counselling with Chatbots . . . . . . . . . . . . . . . . . . . . . . . . . 599 Ajitesh Nair, Ishan Padhy, J. K. Nikhil, S. Sindhura, M. L. Vandana, and B. S. Vijay Krishna A High-Gain Improved Linearity Folded Cascode LNA for Wireless Applicatıons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 613 S. Bhuvaneshwari and S. Kanthamani Design and Analysis of Low Power FinFET-Based Hybrid Full Adders at 16 nm Technology Node . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 631 Shikha Singh and Yagnesh B. Shukla A Review on Fish Species Classification and Determination Using Machine Learning Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 643 Sowmya Natarajan and Vijayakumar Ponnusamy Malicious URL Detection Using Machine Learning Techniques . . . . . . . . 657 Shridevi Angadi and Samiksha Shukla Comparative Study of Blockchain-Based Voting Solutions . . . . . . . . . . . . . 671 Khushi Patel, Dipak Ramoliya, Kashish Sorathia, and Foram Bhut Electrical Simulation of Typical Organic Solar Cell by GPVDM Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 687 Rohma Usmani, Malik Nasibullah, and Mohammed Asim Statistical Analysis of Blockchain Models from a Cloud Deployment Standpoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 695 Himanshu V. Taiwade and Premchand B. Ambhore Deferred Transmission Control Communication Protocol for Mobile Object-Based Wireless Sensor Networks . . . . . . . . . . . . . . . . . . . 713 Anand Vaidya and Shrihari M. Joshi Crıme Data Analysıs Usıng Machıne Learnıng Technıques . . . . . . . . . . . . 727 Ankit Yadav, Bhavna Saini, and Kavita Subsampling in Graph Signal Processing Based on Dominating Set . . . . 737 E. Dhanya, Gigi Thomas, and Jill K. Mathew Optimal Sizing and Cost Analysis of Hybrid Electric Renewable Energy Systems Using HOMER . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 745 Basanagouda F. Ronad Different Nature-Inspired Optimization Models Using Heavy Rainfall Prediction: A Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 761 Nishant N. Pachpor, B. Suresh Kumar, Prakash S. Parsad, and Salim G. Shaikh
Hyper Chaos Random Bit-Flipping Diffusion-Based Colour Image Cryptosystem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 777 Sujarani Rajendran, Manivannan Doraipandian, Kannan Krithivasan, Ramya Sabapathi, and Palanivel Srinivasan Implementation of Fuzzy Logic-Based Predictive Load Scheduling in Home Energy Management System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 791 Nirmala Jegadeesan and G. Balasubramanian Firmware Attack Detection on Gadgets Using Least Angle Regression (LAR) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 801 E. Arul and A. Punidha STEMS—Smart Traffic and Emergency Management System . . . . . . . . . 811 A. Rajagopal, Chirag C. Choradia, S. Druva Kumar, Anagha Dasa, and Shweta Yadav Retraction Note to: High Accuracy for Hyperspectral Image Classification Using Hybrid Spectral 3D-2D CNN . . . . . . . . . . . . . . . . . . . . Mohini Shivhare and Sweta Tripathi
C1
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 825
Editors and Contributors
About the Editors

Dr. Jennifer S. Raj received the Ph.D. degree from Anna University and a Master's degree in Communication Systems from SRM University, India. Currently, she is working in the Department of ECE, Gnanamani College of Technology, Namakkal, India. She is a Life Member of ISTE, India. She has been serving as Organizing Chair and Program Chair of several international conferences and on the program committees of several international conferences. She is a book reviewer for Tata McGraw-Hill Publications and has published more than 50 research articles in journals and IEEE conferences. Her interests are in wireless healthcare informatics and body area sensor networks.

Dr. Yong Shi is currently working as an Associate/Tenured Professor of Computer Science at Kennesaw State University and as Director/Coordinator of the Master of Computer Science program, where he is responsible for directing the program and reviewing its applications. He has published more than 50 articles in national and international journals. He has acted as an editor, reviewer, editorial board member, and program committee member for many reputed journals and conferences. His research interests include Cloud Computing, Big Data, and Cybersecurity.

Danilo Pelusi received the Ph.D. degree in Computational Astrophysics from the University of Teramo, Italy. He is an Associate Professor at the Department of Communication Sciences, University of Teramo. Co-editor of books for Springer and Elsevier, he is/was Associate Editor of IEEE Transactions on Emerging Topics in Computational Intelligence, IEEE Access, and the International Journal of Machine Learning and Cybernetics. Guest editor for Elsevier, Springer, and Inderscience journals and keynote speaker at several conferences, he serves on the editorial boards of many journals. Reviewer of reputed journals such as IEEE Transactions on Fuzzy Systems and IEEE Transactions on Neural Networks and Machine Learning,
his research interests include Fuzzy Logic, Neural Networks, Information Theory, Machine Learning, and Evolutionary Algorithms.

Dr. Valentina Emilia Balas is currently a Full Professor at "Aurel Vlaicu" University of Arad, Romania. She is the author of more than 300 research papers. Her research interests are in Intelligent Systems, Fuzzy Control, and Soft Computing. She is Editor-in-Chief of the International Journal of Advanced Intelligence Paradigms (IJAIP) and of IJCSE. She is a Member of EUSFLAT and ACM, a Senior Member of IEEE, a Member of TC–EC and TC–FS (IEEE CIS) and TC–SC (IEEE SMCS), and Joint Secretary of FIM.
Contributors Mohd Abdul Ahad Department of Computer Science and Engineering, Jamia Hamdard, Delhi, India Geetha Achuthan Electrical and Computer Engineering Department, National University of Science and Technology, Muscat, Sultanate of Oman M. Afshar Alam Department of Computer Science and Engineering, Jamia Hamdard, Delhi, India Cheedella Akhil Department of Electronics and Communication Engineering Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India R. A. Alaguraja ECE, Thiagarajar College of Engineering, Madurai, Tamil Nadu, India B. Alekya Department of ECE, V R Siddhartha Engineering College, Vijayawada, India A. S. Alvi PRMIT & R, Badnera, Amravati, India Premchand B. Ambhore Department of Information Technology, Government College of Engineering, Amravati, India Shridevi Angadi Christ University, Bangalore, India Lubna Ansari Department of Computer Science and Engineering, Jamia Hamdard, Delhi, India Pusuluri Sidhartha Aravind School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, Tamil Nadu, India P. Aravinth Anna University, Madras Institute of Technology, Chennai, Tamil Nadu, India N. V. Archana Electrical and Electronics Department, NIEIT, Mysore, Karnataka, India
R. Archana Department of Computer Science and Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India Dakshata Argade Terna Engineering College, Navi-Mumbai, India Dhruv Arora Computer Science and Engineering, Manipal University Jaipur, Jaipur, Rajasthan, India E. Arul Department of Information Technology, Coimbatore Institute of Technology, Coimbatore, Tamilnadu, India Apeksha Arun Wadhe Department of Computer Engineering and Information Technology, VJTI, Mumbai, India Mohammed Asim Integral University, Lucknow, India Author Department of ECE, V R Siddhartha Engineering College, Vijayawada, India G. Balasubramanian School of Electrical and Electronics Engineering, SASTRA Deemed University, Thanjavur, India R. M. Banakar Department of Electronics and Communication Engineering, BVBCET, Hubli, India H. S. Behera Department of Information Technology, Veer Surendra Sai University of Technology, Burla, India Lokesh B. Bhajantri Department of ISE, Basaveshwar Engineering College, Bagalkot, Karnataka, India S. Bharath Vinay Reddy Sathyabama Institute of Science and Technology, Chennai, India Satyankar Bhardwaj Computer Science and Engineering, Manipal University Jaipur, Jaipur, Rajasthan, India Foram Bhut Department of Information Technology, Devang Patel Institute of Advance Technology and Research (DEPSTAR), Faculty of Technology and Engineering (FTE), Charotar University of Science and Technology (CHARUSAT), CHARUSAT Campus, Changa, Gujarat, India S. Bhuvaneshwari Department of ECE, Thiagarajar College of Engineering, Madurai, India Hitesh Chhikaniwala Info Comm Technology, Adani Institute of Infrastructure Engineering, Ahmedabad, Gujarat, India T. B. Chiyangwa Computer Science Department, University of South Africa, Gauteng, South Africa Chirag C. Choradia Dept. of Electronics and Communication Engineering, Dayananda Sagar College of Engineering, Bangalore, Karnataka, India
Ashish Choudhary Manipal University Jaipur, Jaipur, Rajasthan, India Anagha Dasa Dept. of Electronics and Communication Engineering, Dayananda Sagar College of Engineering, Bangalore, Karnataka, India Khush Dassani Nitte Meenakshi Institute of Technology, Bengaluru, India Nisarg Dave Pandit Deendyal Energy University, Gandhinagar, Gujrat, India Prince Dawar Poornima Group of Colleges, Jaipur, Rajasthan, India Sunny Dawar Manipal University Jaipur, Jaipur, Rajasthan, India Shivani Desai Computer Science and Engineering Department, Nirma University, Ahmedabad, Gujarat, India Bali Devi Computer Science and Engineering, Manipal University Jaipur, Jaipur, Rajasthan, India S. Dhamodaran Sathyabama Institute of Science and Technology, Chennai, India E. Dhanya PG and Research Department of Mathematics, Mar Ivanios College (Autonomous), Thiruvananthapuram, Kerala, India B. Dhiyanesh Hindusthan College of Engineering and Technology, Coimbatore, India Manivannan Doraipandian School of Computing, SASTRA Deemed University, Thanjavur, India S. Druva Kumar Dept. of Electronics and Communication Engineering, Dayananda Sagar College of Engineering, Bangalore, Karnataka, India Vijay Dulera Computer Science and Engineering Department, Nirma University, Ahmedabad, Gujarat, India Phan Duy Hung FPT University, Hanoi, Vietnam Rajendra Kumar Dwivedi Department of Information Technology and Computer Application, MMMUT Gorakhpur, Gorakhpur, India Hilal A. Fadhil Department of Electrical and Computer Engineering, Sohar University, Sohar, Sultanate of Oman Plasin Francis Dias Department of Electronics and Communication Engineering, KLS VDIT, Haliyal, India Selvakumar George Department of ECE, SCAD College of Engineering and Technology, Tirunelveli, Tamilnadu, India Anish Ghiya School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, Tamil Nadu, India R. Gopikakumari Division of Electronics Engineering, School of Engineering, CUSAT, Cochi, India
Rohit Kumar Gupta Manipal University Jaipur, Jaipur, India Sonam Gupta Ajay Kumar Garg Engineering College, Ghaziabad, India P. Hariprasad Reddy Nitte Meenakshi Institute of Technology, Bengaluru, India G. P. Harish Gowda Nitte Meenakshi Institute of Technology, Bengaluru, India Y. Harold Robinson School of Information Technology and Engineering, Vellore Institute of Technology, Vellore, India Feras N. Hasoon Electrical and Computer Engineering Department, National University of Science and Technology, Muscat, Sultanate of Oman Syed Ishtiyaq Ahmed Department of Electronics and Communication Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India R. P. Janani Francis Xavier Engineering College, Tirunelveli, Tamil Nadu, India S. L. Jany Shabu Sathyabama Institute of Science and Technology, Chennai, India Nirmala Jegadeesan School of Electrical and Electronics Engineering, SASTRA Deemed University, Thanjavur, India Paritosh Joshi Pandit Deendyal Energy University, Gandhinagar, Gujrat, India Shrihari M. Joshi SDM College of Engineering and Technology, Dharwad, India Nirmal Jothi Department of ECE, SCAD College of Engineering and Technology, Tirunelveli, Tamilnadu, India S. K. Kabilesh Department of Electronics and Communication Engineering, Jai Shriram Engineering College, Avinashipalayam, Tiruppur, India G. Kalaiarasi Department of Electronics and Communication Engineering, Jai Shriram Engineering College, Avinashipalayam, Tiruppur, India Jagadevi N. Kalshetty Nitte Meenakshi Institute of Technology, Bangalore, India P. Kannan ECE Department, Francis Xavier Engineering College, Tirunelveli, Tamil Nadu, India S. Kanthamani Department of ECE, Thiagarajar College of Engineering, Madurai, India Marimuthu Karuppiah Department of Computer Science and Engineering, SRM Institute of Science and Technology, Ghaziabad, Uttar Pradesh, India Oleh Karyy Lviv Polytechnic National University, Lviv, Ukraine Kavita Department of Information Technology, Manipal University, Jaipur, India Suresh Manic Kesavan Electrical and Computer Engineering Department, National University of Science and Technology, Muscat, Sultanate of Oman
Vaishali Khairnar Terna Engineering College, Navi-Mumbai, India Mustafa Khalaf Aal Thani Electrical and Computer Engineering Department, National University of Science and Technology, Muscat, Sultanate of Oman Artur Kisiołek Greate Poland University of Social Studies and Economics in Środa Wlkp., Środa Wielkopolska, Poland M. Krithika Department of Management Studies, Saveetha School of Engineering, SIMATS, Chennai, India Kannan Krithivasan School of Computing, SASTRA Deemed University, Thanjavur, India Pallavi Kudal Dr DY Patil Institute of Management Studies, Pune, Maharashtra, India Ihor Kulyniak Lviv Polytechnic National University, Lviv, Ukraine Ashish Kumar Manipal University Jaipur, Jaipur, India Keshav Kumar Computer Science, PES University, Bengaluru, India Megha B. Kunder Atos Syntel, Bangalore, India K. Lakshmi Narayanan ECE Department, Francis Xavier Engineering College, Tirunelveli, Tamil Nadu, India M. Lall Department of Computer Science, Tshwane University of Technology, Gauteng, South Africa D. Magesh Hindusthan College of Engineering and Technology, Coimbatore, India Payal Mahipal Manipal University Jaipur, Jaipur, Rajasthan, India Anshita Malviya Department of Information Technology and Computer Application, MMMUT Gorakhpur, Gorakhpur, India X. Anitha Mary Karunya Institute of Technology and Sciences, Coimbatore, India Asirvatham Masilamani Department of ECE, SCAD College of Engineering and Technology, Tirunelveli, Tamilnadu, India Jill K. Mathew PG and Research Department of Mathematics, Mar Ivanios College (Autonomous), Thiruvananthapuram, Kerala, India Darshee Mehta Computer Science and Engineering Department, Nirma University, Ahmedabad, Gujarat, India Rupali A. Meshram PRMIT & R, Badnera, Amravati, India S. Mohamed Mansoor Roomi ECE, Thiagarajar College of Engineering, Madurai, Tamil Nadu, India
D. Mohanapriya Department of Electronics and Communication Engineering, Jai Shriram Engineering College, Avinashipalayam, Tiruppur, India M. Monicashree Computer Science, PES University, Bengaluru, India Tabassum N. Mujawar Research Scholar, Department of CSE, Basaveshwar Engineering College, Bagalkot, Karnataka, India; Department of CE, Ramrao Adik Institute of Technology, D Y Patil Deemed to be University, Navi Mumbai, Maharashtra, India Ankit Mundra Manipal University Jaipur, Jaipur, India Satyanarayana Murthy Department of ECE, V R Siddhartha Engineering College, Vijayawada, India G. Muthukumaran Department of Electrical and Electronics Engineering, School of Electrical Sciences, Hindustan Institute of Technology and Science, Chennai, India G. Muthuram Hindusthan College of Engineering and Technology, Coimbatore, India Atharva Naik Manipal University Jaipur, Jaipur, India Ajitesh Nair Computer Science, PES University, Bengaluru, India J. Nandhini Department of Electronics and Communication Engineering, Jai Shriram Engineering College, Avinashipalayam, Tiruppur, India Geetika Narang Department of CSE, TCOER, Pune, India Malik Nasibullah Integral University, Lucknow, India Sowmya Natarajan Department of ECE, SRM IST, Chennai, India V. Naveen Anna University, Madras Institute of Technology, Chennai, Tamil Nadu, India E. Naveenkumar Hindusthan College of Engineering and Technology, Coimbatore, India Janmenjoy Nayak Department of Computer Science, Maharaja Sriram Chandra Bhanja Deo University, Baripada, Odisha, India J. K. Nikhil Computer Science, PES University, Bengaluru, India Ezhil E. Nithila Department of ECE, SCAD College of Engineering and Technology, Tirunelveli, Tamilnadu, India Nishant N. Pachpor Amity University, Jaipur, India Ishan Padhy Computer Science, PES University, Bengaluru, India K. N. Pallavi NMAM Institute of Technology, Nitte, India Prakash S. Parsad Priyadarshini College of Engineering, Nagpur, India
Khushi Patel Department of Computer Engineering, Devang Patel Institute of Advance Technology and Research (DEPSTAR), Faculty of Technology and Engineering (FTE), Charotar University of Science and Technology (CHARUSAT), CHARUSAT Campus, Changa, Gujarat, India Hemant Patidar Electronics and Communication Engineering, Oriental University, Indore, India A. C. Pavithra Electronics and Communication Department, ATMECE, Mysore, Karnataka, India Danilo Pelusi Faculty of Communications Sciences, University of Teramo, Teramo, Italy Suresh Kumar Pemmada Department of Computer Science and Engineering, Aditya Institute of Technology and Management (AITAM), Tekkali, India; Department of Information Technology, Veer Surendra Sai University of Technology, Burla, India Vijayakumar Ponnusamy Department of ECE, SRM IST, Chennai, India Dipak Pralhad Mahurkar Electronics and Communication Engineering, Oriental University, Indore, India Korrapati Pravallika Department of Electronics and Computer Engineering, KLEF, Vaddeswaram, India K. Priya ECE, Thiagarajar College of Engineering, Madurai, Tamil Nadu, India A. Punidha Department of Computer Science and Engineering, Coimbatore Institute of Technology, Coimbatore, Tamilnadu, India R. Radha Karpagam Institute of Technology, Coimbatore, India Sreevatsan Radhakrishnan Department of Electronics and Communication Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India C. Rahul Division of Electronics Engineering, School of Engineering, CUSAT, Cochi, India Sayam Rahul Department of Electronics and Communication Engineering Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India A. Rajagopal Dept. of Electronics and Communication Engineering, Dayananda Sagar College of Engineering, Bangalore, Karnataka, India B. Rajathilagam Department of Computer Science and Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India Sujarani Rajendran Department of Computer Science and Engineering, Srinivasa Ramanujan Centre, SASTRA Deemed University, Kumbakonam, India
S. Ramani School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, Tamil Nadu, India S. R. Ramesh Department of Electronics and Communication Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India Dipak Ramoliya Department of Computer Science and Engineering, Devang Patel Institute of Advance Technology and Research (DEPSTAR), Faculty of Technology and Engineering (FTE), Charotar University of Science and Technology (CHARUSAT), CHARUSAT Campus, Changa, Gujarat, India M. Kameswara Rao Department of Electronics and Computer Engineering, KLEF, Vaddeswaram, India T. S. Raphiri Department of Computer Science, Tshwane University of Technology, Gauteng, South Africa Kottam Akshay Reddy Department of Electronics and Communication Engineering Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India J. Refonaa Sathyabama Institute of Science and Technology, Chennai, India J. Relin Francis Raj Department of ECE, SCAD College of Engineering and Technology, Tirunelveli, Tamilnadu, India K. Renuka Francis Xavier Engineering College, Tirunelveli, Tamil Nadu, India Basanagouda F. Ronad Department of Electrical and Electronics Engineering, Basaveshwar Engineering College (A), Bagalkot, India Ramya Sabapathi School of Computing, SASTRA Deemed University, Thanjavur, India Bhavna Saini Department of Information Technology, Manipal University, Jaipur, India R. Santhana Krishnan ECE Department, SCAD College of Engineering and Technology, Tirunelveli, Tamil Nadu, India B. Saritha Department of Electronics and Communication Engineering, Jai Shriram Engineering College, Avinashipalayam, Tiruppur, India H. Sarojadevi Nitte Meenakshi Institute of Technology, Bengaluru, India R. Satya Ranga Vara Prasad Sathyabama Institute of Science and Technology, Chennai, India P. Selvakumar Department of Electrical and Electronics Engineering, School of Electrical Sciences, Hindustan Institute of Technology and Science, Chennai, India N. Selvanathan Sona College of Technology, Salem, India
Radha SenthilKumar Anna University, Madras Institute of Technology, Chennai, Tamil Nadu, India Kaushal Shah Pandit Deendyal Energy University, Gandhinagar, Gujrat, India Neev Shah Pandit Deendyal Energy University, Gandhinagar, Gujrat, India Salim G. Shaikh Department of CE, SIT, Lonavala, India; Department of CSE, Amity University, Jaipur, Jaipur, India Nida Shakeel Department of Information Technology and Computer Application, MMMUT Gorakhpur, Gorakhpur, India Venkatesh Gauri Shankar Information Technology, Manipal University Jaipur, Jaipur, Rajasthan, India D. L. Shanthi BMS Institute of Technology and Management, Bengaluru, India Aman Sharma Computer Science, PES University, Bengaluru, India Kavya Shetty Netanalytiks Technologies Pvt Ltd, Bangalore, India Mohini Shivhare Department of Elecronics and Communication, Kanpur Institute of Technology, Kanpur, India Bharti Shukla Ajay Kumar Garg Engineering College, Ghaziabad, India Samiksha Shukla Christ University, Bangalore, India Yagnesh B. Shukla Gujarat Technological University, Ahmedabad, India S. Sindhura Computer Science, PES University, Bengaluru, India Sehaj Jot Singh Nitte Meenakshi Institute of Technology, Bengaluru, India Shikha Singh Gujarat Technological University, Ahmedabad, India Vatsal Sinha Computer Science, PES University, Bengaluru, India Mamta Soni Manipal University Jaipur, Jaipur, Rajasthan, India Kashish Sorathia Department of Information Technology, Devang Patel Institute of Advance Technology and Research (DEPSTAR), Faculty of Technology and Engineering (FTE), Charotar University of Science and Technology (CHARUSAT), CHARUSAT Campus, Changa, Gujarat, India M. Sri Hari Balaji Anna University, Madras Institute of Technology, Chennai, Tamil Nadu, India Palanivel Srinivasan School of Computing, SASTRA Deemed University, Thanjavur, India Sumit Srivastava Information Technology, Manipal University Jaipur, Jaipur, Rajasthan, India
A. Stephen Sagayaraj Bannari Amman Institute of Technology, Sathyamangalam, India P. Sudheesh Department of Electronics and Communication Engineering Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India Shraddha Suratkar Department of Computer Engineering and Information Technology, VJTI, Mumbai, India B. Suresh Kumar Department of CSE, Sanjay Ghodawat University, Kolhapur, India B. Suresh Department of ECE, V R Siddhartha Engineering College, Vijayawada, India Maithri Suresh Oracle, Bangalore, India Saurabh Suthar Manipal University Jaipur, Jaipur, India Md. Tabrez Nafis Department of Computer Science and Engineering, Jamia Hamdard, Delhi, India Himanshu V. Taiwade Department of Computer Science & Engineering, Priyadarshini College of Engineering, Nagpur, India Syamala Tejaswini Department of Electronics and Computer Engineering, KLEF, Vaddeswaram, India Gigi Thomas PG and Research Department of Mathematics, Mar Ivanios College (Autonomous), Thiruvananthapuram, Kerala, India Sweta Tripathi Department of Elecronics and Communication, Kanpur Institute of Technology, Kanpur, India Rohma Usmani Integral University, Lucknow, India Anand Vaidya SDM College of Engineering and Technology, Dharwad, India Mehul Vala Atmiya University, Rajkot, Gujarat, India Pham Van Thanh FPT University, Hanoi, Vietnam M. L. Vandana Computer Science, PES University, Bengaluru, India Mage Reena Varghese Department of Robotics Engineering, Karunya Institute of Technology and Sciences, Coimbatore, India P. Vasuki ECE, Sethu Institute of Technology, Madurai, Tamil Nadu, India B. S. Vijay Krishna CTO, nSmiles, Bengaluru, India S. Vijaya Shetty Nitte Meenakshi Institute of Technology, Bengaluru, India Nandalal Vijayakumar Department of ECE, Sri Krishna College of Engineering and Technology, Coimbatore, Tamilnadu, India
Vishal Vora Atmiya University, Rajkot, Gujarat, India Ankit Yadav Department of Information Technology, Manipal University, Jaipur, India Arun Kumar Yadav National Institute of Technology, Hamirpur, H.P., India Divakar Yadav National Institute of Technology, Hamirpur, H.P., India Shweta Yadav Dept. of Electronics and Communication Engineering, Dayananda Sagar College of Engineering, Bangalore, Karnataka, India Jainab Zareena Department of Management Studies, SCAD College of Engineering and Technology, Tirunelveli, India
Lung Ultrasound COVID-19 Detection Using Deep Feature Recursive Neural Network E. Naveenkumar, B. Dhiyanesh, D. Magesh, G. Muthuram, N. Selvanathan, and R. Radha
Abstract Coronavirus disease (COVID-19), caused by a member of a large family of viruses, is a universal illness that has been prevalent since December 2019 and can progress to illnesses far more serious than the flu. COVID-19 has been declared a global pandemic and has greatly affected the world economy and society. Recent studies show great promise for lung ultrasound (LU) imaging of subjects infected by COVID-19; nevertheless, the development of an unbiased, fast, and accurate automated method for evaluating LU images is still in its infancy. Existing algorithms for detecting COVID-19 in LU images are very time consuming and yield a high false rate, which hampers early detection and treatment of affected patients; today, accurate detection of COVID-19 usually takes a long time and is prone to human error. To resolve this problem, an Information Gain Feature Selection (IGFS) method based on a Deep Feature Recursive Neural Network (DFRNN) algorithm is proposed to detect COVID-19 automatically at an early stage. The LU images are preprocessed using a Gaussian filter, their quality is enhanced by the Watershed Segmentation (WS) algorithm, and the result is passed to the IGFS algorithm to select the finest COVID-19 features and improve classification performance. The proposed algorithm thus determines efficiently, from an LU image, whether or not a person is affected by COVID-19. Experimental results show improved precision, recall, F-measure, and classification performance, with lower time complexity and false rate, compared to previous algorithms.
E. Naveenkumar (B) · B. Dhiyanesh · D. Magesh · G. Muthuram Hindusthan College of Engineering and Technology, Coimbatore, India e-mail: [email protected] G. Muthuram e-mail: [email protected] N. Selvanathan Sona College of Technology, Salem, India e-mail: [email protected] R. Radha Karpagam Institute of Technology, Coimbatore, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_1
Keywords COVID-19 · Lung ultrasound (LU) · Information gain feature selection (IGFS) · Deep feature recursive neural network (DFRNN) · Preprocessing · Segmentation · Classification
1 Introduction

In December 2019, the world began facing a global health crisis: a new coronavirus disease pandemic, typically known as COVID-19. The outbreak of COVID-19 and its associated complications pose global health hazards and affect all aspects of personal life. Spread through human-to-human transmission by direct contact or droplets is a known property of the virus, with a basic reproduction number of 2.25–3.59 and an incubation period of 2–14 days. According to a study, 99% of COVID-19 patients experience runny nose, dry cough, body pain, dyspnea, confusion, headache, sore throat, high fever, and vomiting. There are various diagnostic methods for COVID-19 detection, among which Computed Tomography (CT) and Magnetic Resonance Imaging (MRI) have wide applicability, but these tests have limited sensitivity and accuracy. Quick detection of the highly contagious COVID-19 infection is a requisite for isolating patients, controlling the epidemic, and saving many lives. Lung ultrasound (LU) imaging captures lesion characteristics such as shape, number, distribution, density, and associated symptoms. Deep learning (DL) has played a key role in medical imaging tasks, including CT imaging, and DL-based approaches are widely used to achieve a high level of accuracy in disease detection and prediction. This paper focuses on detecting COVID-19 infection using an LU image dataset: the proposed algorithm identifies the presence of lung infection in patients in a classification mode that is most helpful for diagnosis. Initially, normal tissue is compared with infected tissue using image preprocessing steps for segmentation and finest-feature selection. To evaluate the performance of the proposed model, the Deep Feature Recursive Neural Network (DFRNN) algorithm is used, which automatically detects the new coronavirus disease from lung ultrasound images. The proposed approach comprises four parts: preprocessing, image segmentation, feature selection, and classification.
2 Related Work

COVID-19 is currently spreading worldwide and can be detected by RT-PCR tests and CT scans. The paper [1] proposed an Indefiniteness Elimination Network (IE-Net) to remove the impact of differing input dimensions and obtain results about COVID-19; however, the algorithm did not provide an accurate verdict on the presence or absence of COVID-19.
The authors of [2] investigated lung ultrasound imagery for COVID-19 prediction using DL techniques consisting of VGG19, InceptionV3, and ResNet50. An Infection Segmentation Deep Network (ISD-Net) was used to find the infected lung area in Computed Tomography (CT) images. Articles [3, 4] explored the convolutional neural network (CNN) and transfer learning (TL) to predict COVID-19 by identifying different abnormalities in X-ray images. Likewise, the authors of [5] used Generative Adversarial Networks (GANs) to improve CNN performance in predicting COVID-19. Papers [6, 7] suggested combining the CNN with Gravitational Search Optimization (GSO) to detect COVID-19, with GSO identifying the finest parameter values and the CNN performing the prediction. DL methods have efficiently identified COVID-19 from CT scans and X-ray images [8]. Researchers in [9] employed a DL-based TL algorithm to identify the coronavirus using CT images; however, the method produced low classification accuracy. The authors of [10] introduced the Saliency-Based Region Detection and Image Segmentation (SBRDIS) approach to minimize noise and identify the infected class. A pre-trained CNN method for COVID-19 identification using the RYDLS-20 image dataset was presented in [11]. The authors of [12] evaluated a Mini-COVIDNet-based deep DNN to efficiently identify COVID-19 from a lung ultrasound image dataset. Similarly, paper [13] analyzed a deep CNN for automatically classifying X-ray images as COVID-19 positive or negative. Quick and trustworthy recognition of patients infected with COVID-19 is crucial to inhibit and limit its spread [14]. Researchers in [15, 16] utilized deep features with BAT optimization and a fuzzy K-nearest neighbor (FKNN) algorithm to automatically diagnose COVID-19. Diagnosing lung disease by analyzing chest CT images has become a significant tool for identifying COVID-19 patients [17, 18]. A Markov Model (MM) and the Viterbi Algorithm (VA) were used to find the affected regions, with a Support Vector Machine (SVM) classifying cases as COVID-19 or non-COVID, in [19]. A powerful deep CNN model was proposed to identify COVID-19 using openly available datasets [20]. The purpose of [21] was to survey various deep learning-based medical imaging methods, such as CT and X-ray, and provide an overview of recently developed systems. An Adaptive Thresholding Technique (ATT) and a Semantic Segmentation (SS) algorithm were designed to find infected lungs using the LIDC-IDRI image dataset [22, 23]. The authors of [24] propose an uncertainty-aware transfer learning framework for COVID-19 detection. In [25], features extracted to identify COVID-19 status are handled by various machine learning and statistical modeling techniques: an MM and VA pipeline finds the affected regions, and an SVM classifies cases as COVID-19 or non-COVID. Given a chest X-ray of a patient, the model of [26] must determine whether it indicates COVID-19.
3 Proposed Methodology

This paper presents the Deep Feature Recursive Neural Network (DFRNN) algorithm for accurately detecting the presence or absence of COVID-19 with low time complexity and a low false rate. The method detects COVID-19 infections quickly, at an early stage, by analyzing lung ultrasound (LU) images. Figure 1 shows the overall architecture for COVID-19 identification using the LU image dataset. First, each collected LU image is fed into the Gaussian filter algorithm to remove noise and resize the image. The preprocessed image is then passed to the Watershed segmentation algorithm to enhance image quality, and the finest features are selected using Information Gain Feature Selection (IGFS). Finally, the proposed algorithm classifies the image and detects the existence of COVID-19.
3.1 Gaussian Filter

Since the features of lung ultrasound (LU) images have different intensities and gray levels, preprocessing must be applied before using such images as classifier inputs. Image normalization in the preprocessing step aims to reduce noise and resize the images. A Gaussian filter is a linear smoothing filter whose weights follow the shape of the Gaussian function. Gaussian smoothing is performed to eliminate noise and adjust the size of the LU images: each image pixel is replaced by the weighted average of its adjacent pixels, with the weight given to a neighbor diminishing monotonically with its distance from the center pixel.
Fig. 1 Proposed framework for COVID-19 detection using lung ultrasound image dataset
The following equation is used to calculate the image smoothing filter $G_s(x, y)$:

$$G_s(x, y) = e^{-\frac{x^2 + y^2}{2\sigma^2}} \quad (1)$$
where $x$ and $y$ are the image pixel coordinates and $\sigma$ is the standard deviation. The noise-removal step is then

$$N_r(x, y) = \frac{1}{n(x, y)} \sum_{i \in \varphi(x, y)} a_w(x, y)\, o(i) \quad (2)$$
Equation (2) expresses the noise-removal process $N_r(x, y)$, where $a_w(x, y)$ is the average weight assigned to suppress noise, $n(x, y)$ is the normalizing constant, $o$ is the original noisy image, $\varphi(x, y)$ is the neighborhood window around pixel $(x, y)$, and $N_r$ is the resulting normalized, noise-free image.
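As a concrete illustration of this preprocessing stage, the following is a minimal sketch rather than the authors' implementation; the sigma value and target size are assumptions, and SciPy's gaussian_filter plays the role of the weighted-average smoothing in Eqs. (1)–(2).

```python
# Hedged sketch of Sect. 3.1: Gaussian denoising plus resizing of an LU frame.
# Assumes the frame is a 2-D grayscale NumPy array; sigma and target size are
# illustrative choices, not values taken from the paper.
import numpy as np
from scipy.ndimage import gaussian_filter

def preprocess(lu_image: np.ndarray, sigma: float = 1.5,
               size: tuple = (224, 224)) -> np.ndarray:
    # Weighted-average smoothing: each pixel becomes a Gaussian-weighted mean
    # of its neighbours, the operation described by Eqs. (1)-(2).
    smoothed = gaussian_filter(lu_image.astype(np.float32), sigma=sigma)
    # Simple nearest-neighbour resize via index sampling, to keep the sketch
    # dependency-light.
    rows = np.linspace(0, smoothed.shape[0] - 1, size[0]).astype(int)
    cols = np.linspace(0, smoothed.shape[1] - 1, size[1]).astype(int)
    return smoothed[np.ix_(rows, cols)]
```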
3.2 Watershed Segmentation (WS)

This phase proposes a Watershed Segmentation (WS) approach to overcome the problem of uneven intensity in the image. The proposed WS algorithm uses a new energy function that can effectively extract objects from complex backgrounds despite severe non-uniformity in the preprocessed LU images. The proposed WS stems from the observation that, when detecting a lung infection, the area of infection is first identified roughly and its contours are then extracted accurately based on local appearance. Therefore, WS first predicts the coarse area and then implicitly models the boundary through reverse attention and edge constraints, thereby clearly enhancing boundary recognition. Equation (3) expresses the LU image edge detection $e_d$:
$$e_d = \frac{1}{1 + |\nabla g_\sigma * N_r(x, y)|} \quad (3)$$
where $\nabla$ and $g_\sigma$ are the gradient and Gaussian smoothing operators, $N_r(x, y)$ is the preprocessed image, $\sigma$ is the standard deviation, and the convolution $*$ suppresses the intense background of the preprocessed image. The coarse region is obtained as

$$R_a = \overline{e_d} - e_d(x, y) \quad (4)$$
The above equation identifies the COVID-affected coarse area $R_a$, where $\overline{e_d}$ is the image mean of $e_d(x, y)$. The segmentation energy used to predict the affected region is

$$W_S = R_{et}(\phi) + R_{it} \quad (5)$$
Equation (5) gives the segmentation energy $W_S$ for predicting the COVID-affected region, where $R_{et}(\phi)$ is the external energy term carrying gradient and color information and $R_{it}$ is the internal level-set evolution term.
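A hedged sketch of how such a marker-based watershed step can be realized with scikit-image is shown below; the percentile thresholds used to seed the background and candidate-lesion markers are illustrative assumptions, not the authors' settings.

```python
# Sketch of Sect. 3.2: gradient-based watershed segmentation of a preprocessed
# LU frame. Marker seeding via intensity percentiles is an assumption.
import numpy as np
from skimage.filters import sobel
from skimage.segmentation import watershed

def segment(preprocessed: np.ndarray) -> np.ndarray:
    edges = sobel(preprocessed)                  # gradient magnitude, cf. Eq. (3)
    markers = np.zeros(preprocessed.shape, dtype=int)
    markers[preprocessed < np.percentile(preprocessed, 20)] = 1  # background seed
    markers[preprocessed > np.percentile(preprocessed, 80)] = 2  # bright-region seed
    labels = watershed(edges, markers)           # flood the gradient surface
    return labels == 2                           # binary mask of the candidate region
```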
3.3 Information Gain Feature Selection

In this phase, the segmented image is fed into Information Gain Feature Selection (IGFS) to extract the most informative features from the segmented LU image of COVID-19. The IGFS algorithm is not used as a black box: an interrelationship coefficient is added as the feature coefficient that determines the image position, so features carrying additional information are more likely to be selected. The performance of a feature subgroup, i.e., the feature combination with the highest score, is validated by the classifier.

Algorithm steps
Input: Segmented LU images $W_s$
Output: Optimal original features (f)
Begin
Step 1: Initialize each feature image
Step 2: Calculate the feature weights
  Set all feature weights W(R) = 0.0
  For n = 1 to m do
    Randomly select the feature weights (R)
    Find the information of the features ($I_f$)
    For each R = Class($W_s$) do
      Find the coefficient feature weights ($m_f$)
    End for
    For R = 1 to feature weights do

$$W(R) = W(R) - \sum_{a=1}^{n} \frac{\operatorname{diff}(R, W_s, D)}{m \cdot I} + \sum_{R = \operatorname{class}(F)} \left[ \frac{m(R)}{1 - m(\operatorname{class}(F))} \sum_{a=1}^{n} \frac{\operatorname{diff}(R, W_s, D)}{m \cdot I} \right]$$
    End for
  End for
Step 3: Calculate each image's maximum feature weights ($m_f$)
Step 4: Update the best feature set
Step 5: Update each combined feature value
Step 6: Obtain the finest feature result (f) of COVID-19
End

where W(R) is the weight of a randomly chosen feature R representing the class feature set, n is the number of images, N is the size of the image, and D is the dimension of the features. Based on IGFS, the maximum number of features is analyzed and the feature set of the images is updated.
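The weight update above matches the classic ReliefF scheme. The following is a compact sketch under that reading, with X an images-by-features matrix and y the class labels; the sample count m and the single-nearest-neighbour choice are illustrative assumptions, not the authors' configuration.

```python
# ReliefF-style feature weighting in the spirit of the IGFS step above.
import numpy as np

def feature_weights(X: np.ndarray, y: np.ndarray, m: int = 100) -> np.ndarray:
    scale = X.max(axis=0) - X.min(axis=0) + 1e-12   # per-feature diff() normaliser
    W = np.zeros(X.shape[1])
    rng = np.random.default_rng(0)
    for _ in range(m):
        i = rng.integers(len(X))                     # random reference image
        same = np.where(y == y[i])[0]
        same = same[same != i]
        other = np.where(y != y[i])[0]
        if len(same) == 0 or len(other) == 0:
            continue                                 # degenerate class, skip sample
        hit = same[np.argmin(np.linalg.norm(X[same] - X[i], axis=1))]
        miss = other[np.argmin(np.linalg.norm(X[other] - X[i], axis=1))]
        W -= np.abs(X[i] - X[hit]) / scale / m       # penalise within-class spread
        W += np.abs(X[i] - X[miss]) / scale / m      # reward between-class separation
    return W  # the top-weighted features form the "finest" set f
```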
3.4 Deep Feature Recursive Neural Network

In this phase, the feature-selected image is used to train the proposed DFRNN algorithm to detect COVID-19. The DFRNN algorithm using LU imaging can identify a COVID-19 attack by detecting lung consolidation and tissue. The DFRNN algorithm is gaining popularity due to its improved prediction accuracy. In addition, it is used to eliminate unwanted information that negatively impacts accuracy.

Algorithm steps

Input: Finest features (f) LU image
Output: Return optimized result
Begin
    Import LU finest-features image (f)
    Set the initial layers based on the feature weights
    Set the DFRNN limitations \epsilon, \mu, \beta
    Convert the feature images for training
    Train the feature weights of the images
    For \epsilon = 1 to \in do
        Randomly select the image features from T_1
        Compute the loss ranges
        Update the classification weight rates \omega^*
    Return optimized result
End

The above algorithm steps are performed to provide a covid-positive or covid-negative result in an efficient manner. The infection is often bilateral and predominant in the lower area, and the size of the infected area depends on the patient's condition. For example, in mild cases, lesions may appear small, whereas in severe cases, lesions may be widespread. Therefore, the DFRNN algorithm deals with changes in lesion size and location.

\epsilon \Downarrow 0: \quad (1 + a)^n = 1 + \frac{na}{1!} + \frac{n(n - 1)a^2}{2!},
where \mu refers to the performance rate, \epsilon is the iteration stage, \in is the maximum number of iterations, and \beta is the number of images covered per iteration.
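The paper does not publish the DFRNN layer configuration, so the Keras sketch below only mirrors the algorithm steps above (feature input, layers set from feature weights, iterative training with a loss computation and weight update); every layer size and hyperparameter here is an assumption.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

def build_dfrnn(n_features: int) -> keras.Model:
    model = keras.Sequential([
        layers.Input(shape=(1, n_features)),     # finest features f as one timestep
        layers.SimpleRNN(64, return_sequences=True),
        layers.SimpleRNN(32),
        layers.Dropout(0.2),
        layers.Dense(1, activation="sigmoid"),   # covid-positive probability
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])          # loss computed at every iteration
    return model

X = np.random.rand(600, 1, 16)                   # 600 training images, cf. Table 1
y = np.random.randint(0, 2, 600)
model = build_dfrnn(16)
model.fit(X, y, epochs=10, batch_size=32, verbose=0)  # "for eps = 1 to max iterations"
```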
4 Simulation and Results

The proposed DFRNN algorithm is implemented in Python in an Anaconda environment using the LU image dataset and is compared with other algorithms such as the convolutional neural network (CNN) and generative adversarial networks (GANs). Table 1 describes the simulation parameters for the proposed implementation compared with the previous algorithms. Table 2 reports the classification accuracy for coronavirus detection; it is evident that the proposed algorithm achieves higher performance than the existing algorithms. Figure 2 portrays the exploration of classification accuracy performance for coronavirus detection. Table 3 depicts the analysis of precision performance for accurate detection of coronavirus; precision measures how many of the predicted positives in the test dataset are correct. Figure 3 represents the precision performance for coronavirus detection. Table 4 shows the analysis of recall performance; recall refers to how many of the actual positives the classifier predicts correctly. Figure 4 shows the exploration of recall performance for coronavirus detection. Table 5 shows the false rate performance for coronavirus detection; the proposed algorithm's false rate is low compared to the other existing methods.

Table 1 Details of simulation parameters
Parameters                     Values
Name of the dataset            Lung ultrasound image dataset
Language                       Python
Tool used                      Anaconda
Number of images in dataset    800
Training dataset               600
Testing dataset                200

Table 2 Classification of accuracy performance

No. of images    CNN (%)    GANs (%)    DFRNN (%)
1                63         67          71
2                69         73          77
3                74         80          83
4                77         83          92
Fig. 2 Exploration of classification accuracy performance

Table 3 Analysis of precision performance
No. of images    CNN (%)    GANs (%)    DFRNN (%)
1                61         67          71
2                68         71          78
3                74         73          81
4                77         85          90
Fig. 3 Exploration of precision performance

Table 4 Analysis of recall performance
No. of images    CNN (%)    GANs (%)    DFRNN (%)
1                64         68          72
2                67         71          79
3                73         78          81
4                79         84          91
Fig. 4 Exploration of Recall performance
Table 5 Analysis of false rate performance
No. of images    CNN (%)    GANs (%)    DFRNN (%)
1                37         33          29
2                31         27          23
3                26         20          17
4                23         17          8
Figure 5 depicts the analysis of false rate performance for coronavirus detection. The analysis of time complexity performance is shown in Fig. 6.
Fig. 5 Analysis of false rate performance
Fig. 6 Analysis of time complexity performance
5 Conclusion

In the current COVID-19 pandemic, medical services are often saturated; automatic diagnostic imaging tools can therefore significantly decrease the burden on a medical system that has only a limited number of specialists. In this paper, the proposed Deep Feature Recursive Neural Network (DFRNN) algorithm classifies a case as COVID-19 affected or non-covid using the lung ultrasound image dataset. The first stage of the algorithm is a Gaussian filter to reduce noise and resize the image; the preprocessed image is then supplied to the Watershed Segmentation (WS) algorithm to enhance image quality, and the quality image is fed into the Information Gain Feature Selection (IGFS) approach to increase classification accuracy. Finally, the proposed DFRNN algorithm effectively classifies whether an individual is COVID-19 affected or not from the lung ultrasound image at an early stage. Another advantage of the DFRNN algorithm is that it is very versatile to deploy and achieves a high COVID-19 detection accuracy of 92%, precision of 90%, recall of 91%, false rate of 8%, and classification time complexity of 22 s. Hence, the DFRNN algorithm proves superior to CNN and GANs in detecting lung infection.
Predicting New York Taxi Trip Duration Based on Regression Analysis Using ML and Time Series Forecasting Using DL S. Ramani, Anish Ghiya, Pusuluri Sidhartha Aravind, Marimuthu Karuppiah, and Danilo Pelusi
Abstract The taxi fare and the duration of a trip depend heavily on many factors, such as traffic along the route or late-night drives, which might be a little slower due to restricted night vision, among others. In this research work, we visualize the various factors that might affect trip durations, such as day of the week, pickup location, drop-off location and time of pickup. The work mainly analyses the dataset obtained from the NYC Taxi and Limousine Commission (TLC), which contains data on taxi trips from January 2016 to June 2016 with GPS coordinates. The data is analysed, and the taxi trip duration is predicted using multiple machine learning and deep learning models. These models are compared on the mean squared error and the R² score, computed with and without scaling of the data. The maximum R² score of 0.99 was attained with the recurrent neural network (RNN) using time series analysis, and 0.97 with the XGBRegressor; an increment of 0.6% was observed after normalizing the target with a log transform in the regression setting.

Keywords New York City Taxi and Limousine Commission · Regression · Scaling · Logarithmic transformation · Machine learning · Deep learning · Mean squared error (MSE) · R² values
S. Ramani · A. Ghiya · P. S. Aravind
School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, Tamil Nadu 632014, India
e-mail: [email protected]
P. S. Aravind
e-mail: [email protected]
M. Karuppiah
Department of Computer Science and Engineering, SRM Institute of Science and Technology, Delhi-NCR Campus, Ghaziabad, Uttar Pradesh 201204, India
D. Pelusi (B)
Faculty of Communications Sciences, University of Teramo, Teramo, Italy
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_2
1 Introduction

New York, also known as the 'concrete jungle', is riddled with a large number of one-way streets, small side streets and an almost incalculable number of pedestrians at any given point in time, not to mention the number of cars/motorcycles/bicycles clogging up the roads. Combined with the urgency of getting from point A to point B, this can make one late for whatever one needs to be on time for. An average taxi organization faces the typical issue of efficiently assigning taxis to travellers so that the service is smooth and hassle-free. One of the fundamental problems is determining the duration of the current trip, so that the organization can anticipate when the taxi will be free for the following trip.

The solution for getting from A to B when living in a city like New York (without losing your mind) is to take a taxi/Uber/Lyft, etc. One does not need to stress about the traffic or pedestrians and can have a moment to do something else, like catch up on emails. Although this sounds simple enough, it does not mean you will get to your destination in time. The driver needs to take the shortest trip possible for one to make it on time to their destination. If route A is X kilometres longer but gets you there Y minutes faster than route B would, one would take route A over B.

The New York City Taxi and Limousine Commission (TLC) handles the licencing of taxicabs operated by private companies in New York, along with overseeing about 40,000 other for-hire vehicles. Taxicab vehicles must each have a medallion to operate and are driven an average of 180 miles per shift. As of March 14, 2014, there were 51,398 individuals licenced to drive medallion taxicabs and 13,605 taxicab medallion licences in existence; by July 2016, the number had dropped slightly to 13,587 medallions, 18 lower than the 2014 aggregate. Taxis collectively make between 300,000 and 400,000 trips a day.

The approach presented in this paper uses a dataset with a considerable number of rows (i.e. 1,458,644 trip records) and presents predictive analysis on it. This paper employs ML models for predictive analysis on the datasets mentioned in Sect. 3.1; we also use the time series data from the target variable (trip_duration) as input for LSTMs and RNNs and present the comparison between these two modes of analysis.

With a taxicab network this big, along with the huge New York population of nearly 8.4 million people, a huge network of traffic routes emerges, creating a large amount of traffic in different places. Thus, it is necessary to know how the traffic behaves at different times and on different days for the organization to efficiently distribute its taxi cabs so as to obtain maximum profit, which requires the cabs to be efficient in dropping off passengers. To know which route is the best one to take, we need to be able to predict how long the trip will last when taking a specific route. Knowing the duration that the trip would take also requires taking into consideration factors like traffic, weather and many more. Performing data analysis can help to get a clear picture
of the data and determine how the various factors relate to the target variable, trip duration. This study can help us get a good visualization of how taxi traffic moves and how people move within the city as well, giving a better understanding of the popular places and helping increase the business of the targeted taxi companies. The study is significant in that it informs organizations about the present situation; it likewise permits individuals to observe the current state of the place they are heading to, and the prediction facility allows people to prepare themselves with predictive cases for the future. With the help of the data obtained from the NY Taxi and Limousine Commission (TLC), this becomes easier, as we have access to a huge amount of data collected over a period of 7½ years, which helps bring about a much more accurate prediction with the help of machine learning (ML) and artificial intelligence (AI). Since this paper focuses on the predictive analysis aspect of NYC taxi trip durations based on various input features like weather, location, etc., it differs from the travelling salesman problem (TSP), which emphasizes the optimum path to be followed in order to reach the destination in the best possible manner with the least use of resources.
2 Literature Review

This paper builds on the prevailing focus on the dataset of NYC taxi trips and fares. Big data has been under the limelight for analysing such massive datasets since the early 2000s; there were around 180 million taxi rides in the city of New York in the year 2014 alone. The intent of analysing this data serves several purposes, like avoiding traffic, lowering rates where services are not functioning, providing more frequency than a single cab at prime locations, and many more. The work in [1] focuses on analyses of individual trips in terms of fare, distance, time and efficiency; analyses by region using pickup and drop-off locations; and analyses based on fare. The vital section of that paper involves a visual query model that allows users to quickly select data slices and use them. The mentioned analysis is performed to help the vendor provide more taxis where necessary per region and to make the system work more efficiently.

Another line of work deals with the analysis of sales of iced products as affected by temperature variation, utilizing data collected from previous years and a regression analysis model built on cleansed data. Python 3.6, being a fully object-oriented scripting language, combines the essence and design rules of various languages, making it easier to use for building large-scale software [2]. The paper uses Python 3.6 to set up a linear regression analysis model targeting the effect of temperature variation on company sales with the help of the Pandas analysis package. It mainly analyses a simple linear regression model, which studies the relation between an independent variable and a dependent variable. While the paper presents an essential use of Python as a programming
language, we can also see from this that the linear regression model used could not be of great significance in the case of big data analysis.

The work in [3] primarily aims to improve the ceramic sector, with its performance indicators analysed through multiple regression analysis, which belongs to the multivariate methods. This type of analysis is used for modelling and analysing several variables; it extends regression analysis by describing the relationship between dependent and independent variables, which can be used for prediction and forecasting [3]. This model can be much more realistic than the uni-factorial regression model, as it is rarely the case that only a single variable affects the data. The results showed that three of the analysed variables are very significant predictors of the magnitude of profit, and significant correlations were found between the analysed indicators. While that paper performs a multivariate correlation analysis limited to only 3 variables, we expand this to deal with nearly 30 variables, bringing the best possible result for the analysis performed and the model created from it.

Throughout past years, the forecasting domain has been impacted by the fact that specialists have dismissed neural networks (NNs) as being non-competitive, whereas NN enthusiasts have presented few new and improved NN models, generally without solid empirical assessments against streamlined univariate statistical techniques [4]. For feature-based forecasting, the work in [5] used XGBoost as a meta-model to assign weights to forecasting variables, and their model outperforms all other individual models in the ensemble. XGBoost is the main model we train in this work, alongside other boosting models. The idea of using ML models for forecasting was obtained from [6], wherein the authors use multiple basic ML models to make predictions on the M3 dataset and evaluate relative performance across multiple horizons. While the paper provides insight into the advantages that ML models provide, further enhancing them with the help of booster models furthers the cause of working on larger datasets with ease.

The deep learning approach has taken a more positive turn in the forecasting and analysis of network traffic, as in [7-9], where the authors analyse and predict future network traffic using LSTM and RNN. Since the major comparison is between these general models, this paper also uses the same two network architectures, i.e. LSTM and RNN, for analysing the duration of future taxi trips. Traffic analysis has been a hot topic of research, especially in the network domain, and can easily be carried over to the real-life scenario of taxi trips, since this can be visualized as a network.

For the industrial datasets in [10], it was observed that normalization plays a vital role when using ML models for predictions; it is termed crucial for a pre-processing pipeline, especially for larger datasets, as it can bring down computation drastically while improving the performance of the models to a great extent. For skewed data, log transforms prove to be beneficial, especially for right-skewed data. Although, as put by the authors of [11], this might not always be the case; analysis needs to be done on the data after the log transform to confirm that the outputs are in the required form.
If the output of the log transform is not normalized and presents another skew, then it would be better not to use the log transform; but if it does provide the right output, then it makes complete sense to use the log transform for a skewed dataset, as it might improve performance and also give normalized outputs while removing the skew.

As put forward by the authors of [12], accurate time series forecasting plays a critical role from the standpoint of business operations, as it covers predicting customer growth, understanding trends and anomaly detection. LSTM was considered for capturing nonlinear traffic dynamics, demonstrating the capability of time series prediction with long temporal dependency. The authors of [13] utilize local information to train an LSTM neural network to predict the traffic status, as shown in Fig. 1. The real-world traffic network being highly dynamic, a deep stacked bidirectional and unidirectional LSTM is used to predict the traffic state. The model takes spatial time series data and outputs the predicted values. The LSTM cell includes the input layer x_t and the output layer h_t, with each cell providing an output state. While the paper provides an avenue for expanding the concept to a large network with the help of a hierarchical method, the use of a CatBoostRegressor or an XGBRegressor could have dealt with the sophisticated generated network much more efficiently, as it would be able to handle the increased complexity and size of the data.

Fig. 1 Hidden layer of LSTM blocks

This gated structure enables long-term dependencies to be learned by the LSTM, allowing the useful data to pass along the network.

h_t = o_t \times \tanh(C_t) \qquad (1)

C_t = f_t \times C_{t-1} + i_t \times \tilde{C}_t \qquad (2)

These are used to calculate the cell state and the layer output. Here, \tilde{C}_t is the candidate output generated from the calculation at each iteration with the help of the sequential data. Figure 1 depicts the LSTM block, where each timestamp indicates the input; its base design is similar to that mentioned in [13].
3 Proposed Methodology

3.1 Data Collection

The datasets were collected from Kaggle. For our research work, three datasets were obtained, namely:
1. taxi trip duration [14],
2. distance dataset [15],
3. weather dataset [16].
3.2 Data Pre-processing

The data obtained from these sources was not processed (i.e. the datatypes of some attributes were mismatched; for example, dates were in object format whereas they should have been in date-time format). All the datasets are then merged into one to get a complete data frame on which all further analysis is performed. Datasets 1 and 2 from the above section are merged on the column ID, whereas the weather dataset is merged on the basis of date. After analysing the data, it was observed that the target variable had a right skew, so logarithmic transformations were applied to remove the effects of the skew. Also, the attributes with NA values were detected and filled with 0.
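A hedged pandas sketch of this merge-and-clean step follows; the file and column names track the description above and the Kaggle datasets of Sect. 3.1, but are assumptions to verify against the actual files.

```python
import numpy as np
import pandas as pd

trips = pd.read_csv("train.csv")             # taxi trip duration dataset [14]
dist = pd.read_csv("osrm_distances.csv")     # distance dataset [15]
weather = pd.read_csv("weather.csv")         # weather dataset [16]

# Fix mismatched dtypes: dates arrive as plain object strings
trips["pickup_datetime"] = pd.to_datetime(trips["pickup_datetime"])
weather["date"] = pd.to_datetime(weather["date"])

# Merge datasets 1 and 2 on the trip ID, then the weather dataset on the date
df = trips.merge(dist, on="id", how="left")
df["date"] = df["pickup_datetime"].dt.normalize()
df = df.merge(weather, on="date", how="left")

# Fill NA values with 0; the skewed target is handled by a log transform
df = df.fillna(0)
df["log_trip_duration"] = np.log1p(df["trip_duration"])
```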
3.3 Feature Engineering Date–time attributes can be extracted from this like day of the week, day of the month, month, etc. K -means clustering is performed on the dataset to get a cluster analysis of the pickup and drop-off latitude and longitude. This approach is adopted so as to apprehend the concept of cluster then predict in the models to attain higher accuracies. Figure 2 represents the first 50,000 data points present in the dataset and how they are clustered into 50 clusters. The objective function of the K -means algorithm is as follows: J=
50 50000
weightik ||x i − u k ||2
(3)
i=0 k=1
where x_i is a data point belonging to the kth cluster, u_k is the centroid of x_i's cluster, and weight_{ik} is the weight trained for the kth cluster in the ith training example. From Fig. 3a, it is clear that the data are right-skewed (positively skewed distribution). Using a log transform on the target variable, the skew is normalized, as visualized in Fig. 3b, c for the log and log(1 + x) transforms, respectively.
Fig. 2 Representation of clusters using K -means
x_i = \log(1 + x_i) \qquad (4)

x_i = \log(x_i) \qquad (5)
Here, x_i is the data point, and the log transforms are applied, respectively, using Eqs. (4) and (5).
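Continuing the sketch above, the cluster feature of Eq. (3) and the transforms of Eqs. (4)-(5) map onto scikit-learn and NumPy as follows; the 50-cluster count and the 50,000-point fit follow the text, while the column names remain assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

coords = np.vstack([
    df[["pickup_latitude", "pickup_longitude"]].to_numpy(),
    df[["dropoff_latitude", "dropoff_longitude"]].to_numpy(),
])

# Eq. (3): minimize J over 50 centroids, fitted on the first 50,000 points
kmeans = KMeans(n_clusters=50, n_init=10, random_state=0).fit(coords[:50000])
df["pickup_cluster"] = kmeans.predict(df[["pickup_latitude", "pickup_longitude"]].to_numpy())
df["dropoff_cluster"] = kmeans.predict(df[["dropoff_latitude", "dropoff_longitude"]].to_numpy())

# Eqs. (4) and (5): the two candidate transforms for the skewed target
y_log1p = np.log1p(df["trip_duration"])   # log(1 + x)
y_log = np.log(df["trip_duration"])       # log(x), valid for positive durations
```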
3.4 Machine Learning

To achieve a clean dataset, the data is processed using K-means clustering and then normalized using log transformations. Machine learning models, namely XGBRegressor, LGBRegressor and CatBoostRegressor, are used to predict the taxi trip durations as a regression problem. To check neural network performance, the problem is also analysed from a time series perspective. For the deep learning models in this paper, MinMaxScaler was used to bring all the attributes into the range of 0-1.

Fig. 3 Distribution of trip durations, a without log transform, b with log(1 + x) and c with log(x) transforms

LSTM Network: The LSTM network used for the predictions in this paper follows a three-block system where each block consists of 2 LSTM layers followed by 1 dropout layer used for regularization. The three blocks are placed one after the other, and in each layer the number of neurons is reduced, starting with 64 neurons and with the final set containing 16 neurons each. A final dense layer is then used for the predictions. The input to this model is of the shape (1, 42), i.e. 1 column with 42 days of previous records.

RNN Network: A simple RNN network with the same three-block architecture is used in this paper. Each block consists of two simple RNN modules and one dropout module used for regularization, wherein 20% of the neurons are effectively dropped from the network at random to avoid overfitting. The input remains the same as for the LSTM network.
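A hedged Keras reading of the two architectures described above follows: three blocks of two recurrent layers plus one dropout layer, with unit counts shrinking from 64 to 16 (the middle block's size is inferred, as the text only states the endpoints).

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_forecaster(cell) -> keras.Model:
    """cell = layers.LSTM for the LSTM network, layers.SimpleRNN for the RNN."""
    model = keras.Sequential([layers.Input(shape=(1, 42))])  # 42 days of history
    for units in (64, 32, 16):          # three blocks, 64 -> 16 units
        model.add(cell(units, return_sequences=True))
        model.add(cell(units, return_sequences=True))
        model.add(layers.Dropout(0.2))  # 20% of neurons dropped for regularization
    model.add(layers.Flatten())
    model.add(layers.Dense(1))          # final dense layer for the prediction
    model.compile(optimizer="adam", loss="mse")
    return model

lstm_model = build_forecaster(layers.LSTM)
rnn_model = build_forecaster(layers.SimpleRNN)
```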
3.5 Feature Analysis

We analyse whether the engineered features are correlated with each other, which might cause issues when fitting the machine learning models. A value of 1 depicts a perfect linear relationship, and 0 shows no linear relationship. Figure 4a shows a heatmap of correlations between all features from the original dataset and after feature engineering; Fig. 4b shows the correlation with the target variable.
Fig. 4 Pearson coefficient
As expected, it is observed that total_distance has the maximum positive correlation with trip_duration. Also, the pickup latitude and longitude have different correlations to the target variable, as do the pickup and drop-off longitudes. Speed and cluster are also significantly correlated with the target variable.
3.6 Evaluation To evaluate the models, two metrics were chosen with respect to the two metrics R 2 score and mean squared error (MSE). Each model is trained on 90% of the dataset and evaluated on the remaining 10% of the dataset. R score = 1 − 2
[ytest[i] − pred[i]]2 ytest[i] − µ
(6)
This score can be explained as the variance explained by the model's predictions versus the total variance. A low value indicates that the model's predictions are poorly correlated with the targets; hence, the models used in this paper aim to attain a high R² score.

MSE = \frac{1}{n} \sum_{i=1}^{n} (pred_i - val_i)^2 \qquad (7)
The MSE score evaluates the sum of the variance of the predictor variables and the squared bias of the variables at play, where pred_i is the prediction made by the model and val_i is the test-set value of the target variable.
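Both metrics correspond directly to scikit-learn helpers; a sketch of the 90/10 protocol with one of the boosting models follows (X, y are placeholders for the prepared features and target, and the hyperparameters are illustrative).

```python
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from xgboost import XGBRegressor

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.10, random_state=42)       # 90% train, 10% test

model = XGBRegressor(n_estimators=300).fit(X_train, y_train)
pred = model.predict(X_test)

print("MSE:", mean_squared_error(y_test, pred))  # Eq. (7)
print("R2 :", r2_score(y_test, pred))            # Eq. (6)
```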
4 Results

From Table 1, it is clearly noticeable that the XGBRegressor was the best-performing model, with an R² score of 0.97, which is 12% more than the LGBRegressor and 5% more than the CatBoostRegressor (as shown in Fig. 5). From Fig. 5, we can see that the model's residuals are non-noisy and show no heteroscedasticity across the range from the lowest to the highest values of y_predicted. The prediction error plot compares the test-set targets from the dataset against the predictions made by the model. The plot depicts the residuals on the vertical axis and the dependent variable on the horizontal axis, allowing the regions susceptible to error to be detected. The error plot in Fig. 5 shows the variance of the error of the regressor model, and from the figure, we can see a fairly random distribution of the data targets in the two dimensions.
Fig. 5 Prediction errors presented from XGBoost
Fig. 6 Comparison of R² score
From the implementation, it was observed that the log transforms perform the best; the results are shown in Table 2. In general, it was observed that with and without scaling, the R² score was the same for the three models used for the regression task, possibly because boosting models are used for the predictions (as shown in Fig. 6). MSE values were the least in the case of the XGBoost model, with a value of only 396, while both the other models are in the vicinity of 1100. From Fig. 6, it is clearly seen that the XGBoost model scores better than all other models; a major point to note is that with the log transform (i.e. the normalized variables), results are better both for the R² and for the MSE score, except for the CatBoost model, which performs below expectations.
4.1 Deep Learning Model Performance

From Fig. 7, it is clearly notable that the RNN model has performed better, especially because its MSE value is far lower than that of the LSTM model.
Fig. 7 R² score and MSE values for the deep learning models

Table 1 Results for R² score and MSE without log transform

Model                MSE         R² score
LGBRegressor         1158.154    0.855
XGBRegressor         459.85      0.9787
CatBoostRegressor    836.87      0.9295

Table 2 Results for R² score and MSE with log transform

Model                MSE         R² score
LGBRegressor         1105.9      0.877
XGBRegressor         396.58      0.984
CatBoostRegressor    1089.6      0.881

Table 3 Results for R² score and MSE

Model    MSE        R² score
LSTM     0.74508    0.97986
RNN      0.48158    0.991586
Both deep learning models performed better than the regression models, with the LSTM model also performing better than the ML models and the RNN outperforming the LSTM (values shown in Table 3).
XGBoost is the best of the ML models that were trained, and RNN is the best of the deep learning models. When comparing the two models indicated above, we find that the RNN performs somewhat better, i.e. by 2% (as shown in Tables 2 and 3).
5 Conclusions

An approach was proposed to predict taxi trip durations using regression analysis with machine learning and time series forecasting with deep learning models. We first adopted the idea of using a logarithmic transform to normalize the target value. Along with that, new features were engineered, including date–time features like day, time, etc., and clusters obtained using K-means. Predictive analysis is done using two methods: regression analysis using ML models and time series analysis using deep learning models. The deep learning models outperform the ML models by 12% in terms of R² score, and the RNN outperforms the LSTM. Normalizing the target variable using the log transform increases the overall R² score by 0.6% for the XGBoost model. The output from the ML models is decent given the nature of the dataset, and that of the deep learning models is better in comparison.
References

1. U. Patel, A. Chandan, NYC taxi trip and fare data analytics using BigData, in Analyzing Taxi Data Using Bigdata (2015)
2. S. Rong, Z. Bao-wen, The research of regression model in machine learning field, in MATEC Web of Conferences, vol. 176, p. 01033. EDP Sciences (2018)
3. Z. Turóczy, L. Marian, Multiple regression analysis of performance indicators in the ceramic industry. Procedia Econ. Finan. 3, 509–514 (2012)
4. J.G. De Gooijer, R.J. Hyndman, 25 years of time series forecasting. Int. J. Forecast. 22(3), 443–473 (2006)
5. P. Montero-Manso, G. Athanasopoulos, R.J. Hyndman, T.S. Talagala, FFORMA: feature-based forecast model averaging. Int. J. Forecast. 36(1), 86–92 (2020)
6. S. Makridakis, E. Spiliotis, V. Assimakopoulos, Statistical and machine learning forecasting methods: concerns and ways forward. PloS One 13(3), e0194889 (2018)
7. R. Madan, P.S. Mangipudi, Predicting computer network traffic: a time series forecasting approach using DWT, ARIMA and RNN, in 2018 Eleventh International Conference on Contemporary Computing (IC3), pp. 1–5. IEEE (2018)
8. S. Nihale, S. Sharma, L. Parashar, U. Singh, Network traffic prediction using long short-term memory, in 2020 International Conference on Electronics and Sustainable Communication Systems (ICESC), pp. 338–343. IEEE (2020)
9. T. Shelatkar, S. Tondale, S. Yadav, S. Ahir, Web traffic time series forecasting using ARIMA and LSTM RNN, in ITM Web of Conferences, vol. 32, p. 03017. EDP Sciences (2020)
10. J. Sola, J. Sevilla, Importance of input data normalization for the application of neural networks to complex industrial problems. IEEE Trans. Nucl. Sci. 44(3), 1464–1468 (1997)
11. C. Feng, H. Wang, N. Lu, T. Chen, H. He, Y. Lu, Log-transformation and its implications for data analysis. Shanghai Archiv. Psychiat. 26(2), 105 (2014)
12. S. Du, M. Pandey, C. Xing, Modeling Approaches for Time Series Forecasting and Anomaly Detection (ArXiv, Stanford, 2017)
13. M. Abdoos, A.L. Bazzan, Hierarchical traffic signal optimization using reinforcement learning and traffic prediction with long-short term memory. Expert Syst. Appl. 171, 114580 (2021)
14. https://www.kaggle.com/c/nyc-taxi-trip-duration/data. Last Accessed 4 Oct 2021
15. https://www.kaggle.com/oscarleo/new-york-city-taxi-with-osrm. Last Accessed 4 Oct 2021
16. https://www.kaggle.com/mathijs/weather-data-in-new-york-city-2016. Last Accessed 4 Oct 2021
Implementation of Classical Error Control Codes for Memory Storage Systems Using VERILOG Sreevatsan Radhakrishnan, Syed Ishtiyaq Ahmed, and S. R. Ramesh
Abstract Error coding is a method of detecting and correcting errors that ensures the detection of information bits and error recovery in case of damage. Encoding is done using mathematical techniques that pad extra bits onto the data, which aid the recovery of the original message. Several error coding techniques, offering different error rates and recovery capabilities, are employed in modern-day communication systems, facilitating error-free transmission of information bits. Hardware-based implementations of these error coding techniques for robust memory systems and processors have become imperative due to their error resistance compared to software counterparts. In this work, the authors demonstrate the VERILOG implementation, targeted for the Artix-7 board, of various error coding and correction methodologies in the view of hardware storage using a field programmable gate array (FPGA), thereby providing the readers an insight into the performance and advantages offered by these techniques. Their performance in terms of power consumption and utilization is evaluated and analyzed.

Keywords Error control codes · Hamming encoding · Cyclic redundancy check · Field programmable gate array · Verilog
S. Radhakrishnan · S. Ishtiyaq Ahmed · S. R. Ramesh (B)
Department of Electronics and Communication Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_3

1 Introduction

Error coding is a technique for improved and reliable data storage when the medium has a high bit error rate (BER) due to physical damage and soft errors. Such errors are usually termed single event upsets (SEUs) and are of paramount importance in memory systems, as with aggressive scaling and higher packaging density of devices, the probability of an erroneous flip has increased. Therefore, it has become imperative to include error control modules as a part of the hardware architecture. Instead of storing the message bits directly, the message is encoded with additional bits before being sent to storage. This longer “code word” is then stored, and the decoder can retrieve the desired data
by using the maximum likelihood decoding rule. The additional bits transform the data into a valid code word for the given coding scheme. The space of valid code words is a subset of the space of all possible bit words of that length, so the destination can recognize invalid code words. Errors introduced while storing are detected in the decoding process at the destination, as the retrieved word would be an invalid one. The code rate is the ratio of data bits to total bits in the code words. A high code rate results in more information content for a given length and fewer overhead bits. However, the fewer the bits appended as redundancy, the more error-prone the scheme is. The error control capabilities of a coding scheme are correlated with its complexity and code rate. A trade-off is made between the available bandwidth and the error protection offered for the storage.

Consider a processor that computes the implemented logic and has a data path to store the result in memory. Once the data is sent via the data bus, the processor has no way to ensure the correctness of the data stored in that memory location, and in case of an error, the processor must recompute after retrieval. Further, such a mismatch allows accidental bit-flips or induced malicious attacks to introduce error patterns. So herein, the authors propose various models of error control codes that would prevent such flips. This is illustrated in Fig. 1, where the error control block containing the encoder and decoder sits between the storage elements and the processor units connected via the data bus. The encoder block adds redundant bits to the actual message and stores it in a reserved memory location, while the decoder removes the appended bits and corrects errors, if any. Also, this module raises an internal flag indicating the correctness of, and recovery from, the introduced error. The authors also show that these modules are
Fig. 1 Block diagram depicting Error Control Codes for memory storage system
utility-efficient and consume a very small amount of hardware to provide the additional protection.

The authors implement various error detection and error correction schemes for such a memory storage system, developing VERILOG modules and schematics, synthesizing the schemes for FPGA implementation, and mapping them onto the Artix-7 board to arrive at the various reports. The choice of VERILOG as the hardware description language is due to its more flexible constructs and C-like syntax. The following are designed in this work: a checksum module, a CRC encoder, and a hamming encoder and decoder. In this work, the authors opt for a hybrid of behavioral and structural modeling styles, as the structural model allows a bigger system to be modeled from simpler sub-modules, while the behavioral model allows for top-level abstraction of the system. This modeling style allows for independent processing of the sub-modules before integrating them into a complex system.

This work is organized as follows: The next section highlights major works done in the field. This is followed by the methodology section, which explains the key concepts used in this implementation of error control codes and shows the steps taken to arrive at the results. The results and discussion section highlights the waveforms for selected inputs and reports on the power and utilization. This section is followed by the conclusion, highlighting the further scope of this work.
2 Literature Review

Shannon's noiseless coding theorem states that there exists an efficient lossless coding method at a rate approaching the channel capacity [1]. The development of modern error control techniques is credited to the works of R. Hamming [2]. The works of Gilbert [3] and Varshamov et al. [4] formalize a bound and introduce error correction efficiency. The series of works by Wozencraft [5] highlights the computational aspects of such error correction schemes. (15,5,3), (15,7,2), and (15,11,1) Bose–Chaudhuri–Hocquenghem (BCH) codes are designed and implemented on FPGA in [6]. The work in [7] proposes a design of a belief propagation decoder using FPGAs for polar codes [8], which shows higher throughput and improved architectural complexity in comparison with a convolutional turbo code decoder. Advances in serial-in serial-out (SISO) decoder algorithms have enhanced turbo product codes, making them prominent in practice [9, 10]. A convolutional encoder and adaptive Viterbi decoder are implemented on an FPGA platform using VHDL in [11]. The work in [12] presents a bit rate and power consumption comparison of various ECCs on different hardware. Data handling capability with low-power design for such systems has been addressed in [13, 14]. A multi-data LDPC decoder architecture is implemented on a Xilinx FPGA device [15]. From the literature survey, the authors observe that no significant work has been done on integrating error control codes with FPGA memory systems, even though a lot of error control systems have been published. The authors in this work address the error control system in view of FPGA-based memory system design.
3 Methodology

3.1 Error Detection Schemes

Checksum. The checksum technique for memory systems ensures speed-up in direct memory access (DMA) applications for accelerated hardware (as in C2H compilers) without a trade-off on reliability. A DMA engine using checksums incurs very little overhead when switching between buffer locations, as this requires no software interruption. The computed checksum is 8 bits long and is obtained by performing modulo-255 addition over the message bits in the random-access memory (RAM) contents for each retrieved 64-byte buffer. The obtained checksum is then flipped and appended to the message sent to the processor. At the processor side, the checksum is recomputed over the message together with the appended checksum bits. If this recomputed checksum evaluates to 0, the transmission is correct; otherwise, retransmission from the buffers is required.

CRC encoder. There is a possibility of soft error propagation in configuration random-access memory (CRAM) cells due to external radiation, resulting in SEUs. To protect such information, these bits are encoded using CRCs. The cyclic redundancy check (CRC) is formulated by treating the encoded binary number as the coefficients of a polynomial over a finite field, with special operations governed by finite field evaluation. The data bits at the memory side are appended with the remainder bits obtained by the finite polynomial division. The retrieved bit string is divided by the same generator polynomial, and the obtained remainder is compared with 0. The generator polynomial is latched with pull-up data lines, as it will be utilized in the decoding process.

A CRC can be implemented as a linear feedback shift register (LFSR) with serial data input. But such a design is suboptimal, as this implementation requires an unusually high clock speed and processes only one data bit every clock cycle. To achieve higher throughput, using the linearity property of CRC, the serial LFSR is redesigned as a parallel N-bit-wide circuit, so that N bits are processed in every clock cycle. A CRC can detect burst errors of up to r bits, where r is the order of the generator polynomial [16].
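Since the Verilog sources are not reproduced in the paper, the following Python sketch models only the arithmetic the two hardware blocks perform: the modulo-255 checksum with its sum-to-zero verification, and bitwise CRC polynomial division. The exact flip convention and the 16-bit generator polynomial 0x1021 are assumptions; the paper does not state its generator.

```python
def checksum8(buf: bytes) -> int:
    """8-bit checksum: modulo-255 sum over the 64-byte buffer, then
    complemented so that re-summing data plus checksum gives 0."""
    return (255 - sum(buf) % 255) % 255

def checksum_ok(buf: bytes, chk: int) -> bool:
    return (sum(buf) + chk) % 255 == 0

def crc16(data: bytes, poly: int = 0x1021, width: int = 16) -> int:
    """Bitwise polynomial division over GF(2); the remainder is what
    the hardware LFSR appends to the message."""
    reg, top, mask = 0, 1 << (width - 1), (1 << width) - 1
    for byte in data:
        reg ^= byte << (width - 8)
        for _ in range(8):
            reg = ((reg << 1) ^ poly) if reg & top else (reg << 1)
            reg &= mask
    return reg

msg = bytes(range(64))                   # one 64-byte buffer
assert checksum_ok(msg, checksum8(msg))  # correct retrieval sums to 0
remainder = crc16(msg).to_bytes(2, "big")
assert crc16(msg + remainder) == 0       # zero remainder on re-division
```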
3.2 Error Correction Schemes

Hamming module: The Hamming code encodes 4 bits of message into 7-bit code words by adding parity bits to each message string and is thus called the (7,4) Hamming code. The hamming bits are inserted at bit positions that are integral powers of 2 (i.e., bit positions 1, 2, 4). The parity bits computed herein are non-systematic, and by the cyclic property, these are valid code words. This scheme is suitable for low-noise communication channels without burst errors.

In FPGAs, the DDR memory controllers use matrix-based hamming encoders. These encoders are implemented using LUTs; even though faster, they are suboptimal for memory utilization, as the matrix multiplication is costly. The work in [17] suggests implementing them using cyclic codes and polynomials as a serial LFSR, with a trade-off on processing cycles. The authors herein implement the encoder simply as a series of XOR functions of the corresponding bit positions, thus saving memory while holding the clock cycle.

The hamming decoder is constructed by computing the syndrome bits “s” from the retrieved word. If “s” is computed to be 0, no error is present. For all non-zero values of “s”, the interleaving allows the single erroneous bit position to be read out directly. This bit is flipped to arrive at the mapped code word, from which the data bits are recovered by omitting the parity bits interleaved at bit positions that are integral powers of 2.
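A hedged Python model of the (7,4) behaviour described above follows: XOR parity at positions 1, 2 and 4, syndrome computation, and single-bit correction. The parity assignments are those of the standard (7,4) code, which the text's description matches.

```python
def hamming74_encode(d):
    """Encode 4 data bits; parity sits at 1-indexed positions 1, 2, 4."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4          # covers positions 3, 5, 7
    p2 = d1 ^ d3 ^ d4          # covers positions 3, 6, 7
    p4 = d2 ^ d3 ^ d4          # covers positions 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]

def hamming74_decode(c):
    """Syndrome s = 0 means no error; otherwise s is the 1-indexed
    position of the single flipped bit, which is corrected."""
    c = list(c)
    s = ((c[0] ^ c[2] ^ c[4] ^ c[6])
         + 2 * (c[1] ^ c[2] ^ c[5] ^ c[6])
         + 4 * (c[3] ^ c[4] ^ c[5] ^ c[6]))
    if s:
        c[s - 1] ^= 1                   # flip the erroneous bit
    return [c[2], c[4], c[5], c[6]]     # drop parity at positions 1, 2, 4

code = hamming74_encode([1, 0, 1, 1])
code[5] ^= 1                            # inject a single-bit error
assert hamming74_decode(code) == [1, 0, 1, 1]
```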
4 Results and Discussion

Note that the high values of static power in all the implemented schemes are due to the power consumed by the I/O pins; the actual design consumes much less power, as it is a sub-module within the memory system and would involve no I/O pins, acting only on the computed data bits. So, for all practical applications, only the switching power is considered as dynamic power.
4.1 Error Detection Schemes

Checksum. Figure 2 corresponds to the waveform obtained, where “data” (64 bits, the processed message from the processor) is the input to the checksum module, while “out” (72 bits long) is the checksum-appended message (checksum in the first 8 bits, marked in yellow) that ensures the correctness of the written signal during retrieval from memory. Figure 3 shows the circuit schematic.
Fig. 2 Waveform for 8-bit checksum on 64-bit data (yellow box depicts the checksum bits)
Fig. 3 Checksum circuit schematic (65 cells and 168 nets)
Fig. 4 Top-level module sender depicting the functional module and corresponding I/O pins
Fig. 5 Power report of checksum
represents the power value, and Fig. 6 correspond to the utilization report. From utilization table, we note that the LUT utilization for the module is less (66/63400 LUT slices) and allows room for further actual logic to be implemented.
Implementation of Classical Error Control Codes …
35
Fig. 6 Utilization report of checksum
Fig. 7 Waveform of CRC encoding 1 s compliment design
Fig. 8 Circuit layout for CRC encoder depicting gates, LUTs, and buffers
CRC encoder: Fig. 7 shows waveform obtained where “data_in” is the input message and “crc_out” (16 bits long highlighted in yellow) is the CRC signal that is stored in the memory. Waveform also shows the “crc_en” to be active high and “rst” to be active low signals. The “crc_out” is computed by appending the negated message with negated remainder obtained by field division. Herein, the mathematical division is retrieved to be made in XOR counting and this is further simplified to be made parallel, resulting in lesser footprint as in circuit shown. Figure 8 shows the circuit schematic showing the simplicity of parallel LFSR implementation as this saves on clock cycle by making it linearly independent computations. The schematic uses 75 cells and 84 nets for the circuit implementation. Figure 9 is the result of RTL block view that maps all the input signal from the data bus and output signal to memory. Figures 10 and 11 show the utilization and power report of the implementation. Unlike checksum, we require to store the computed results of initial bits as they are reused in serial LFSR implementation and thus uses slice registers. But in comparison with serial counterparts, this method allows for design flexibility and division polynomial can be modified on the fly as well only one cycle, thus a better throughput.
36
S. Radhakrishnan et al.
Fig. 9 Top-level module of CRC encoder
Fig. 10 Power report for CRC encoder
Fig. 11 Utilization report of CRC encoder
Fig. 12 Hamming encoder waveform: Non-systematic “code word” generation
Implementation of Classical Error Control Codes …
37
4.2 Error Correction Schemes Hamming module: Waveform presented in Fig. 12 shows “data” is the input signal to the module and “code word” is the encoded signal that is sent to memory. This signal is arrived by doing a non-systematic encoding of message signal appending with parity bits computed by XOR operations. Figure 12 shows the circuit schematic. This module utilizes active high clock enable, and it involves only three additional gates represented by LUTs in the circuit above. The circuit footprint is represented in Fig. 13. The top-level footprint is shown in Fig. 14. One of the major advantages of this scheme is that it is highly power efficient as well as leaves only very less footprint. This is evident from utilization as in Figs. 15 and 16. Utilization table clearly shows very little consumption to add redundancy. Similarly, decoder system is designed with “code word” as the input for this module, while this module sends the actual “data” back to the processor. Figure 17 shows the circuit schematic. This schematic shows the buffers and MUX used in decoding by computing syndrome bits and mapping them to retrieve to actual corrected data bits. The top-level module depicting the RTL layout with input and output lines is presented in Figs. 18, 19, and 20. Again, the major highlight of this hamming decoder design is the scalability and power efficiency. The power report suggests active power in decoding logic
Fig. 13 Circuit schematic of hamming encoder
Fig. 14 Top-level module of hamming encoder
Fig. 15 Power report of hamming encoder
Fig. 16 Utilization report of hamming encoder
Fig. 17 Waveform for hamming decoder
The power report suggests that the active power of the decoding logic is 0.131 W, which in comparison with the others is optimal for single-bit correction. Scaling to higher hamming codes, depending on the expected errors, is feasible, as the utilization report in Fig. 21 suggests that less than 1% of the total available LUTs is used in such a design (Table 1).
Fig. 18 Schematic for hamming decoder

Fig. 19 Top-level module of hamming decoder
Fig. 20 Power report for hamming decoder
Fig. 21 Utilization report for hamming decoder
Table 1 Power comparison table for various schemes in this work

Error control scheme    Dynamic power (in W)        Static power (in W)
                        Signal       Logic
Checksum                1.567        0.551           0.223
CRC encoder             0.689        0.334           0.141
Hamming encoder         0.060        0.020           0.087
Hamming decoder         0.075        0.056           0.085
5 Conclusion

With the rapid growth and development of big data and the Internet of Things, as well as the increasing emphasis on hardware security in compact miniaturized systems, the development of efficient error control systems that aid error-free storage and retrieval of data has become imperative. The design integration of such error control codes using FPGAs in various applications is flexible, as the utilization and power depend on the nature of the errors and the available bandwidth, and are thus customizable. In this work, the Verilog-based implementation of a plethora of error control codes is depicted, specifically targeted for the Artix-7 board technology, and their performance is analyzed. The scalability of the system is independent of the targeted board, although power and utilization may differ. The systems demonstrate improved reliability of data storage in hardware-based memory systems, and the simulations presented in this work validate the use of FPGA-based implementations of error control codes for building systems with improved pipelining efficiency and low power.
References

1. C.E. Shannon, A mathematical theory of communication. Bell Syst. Tech. J. 27(3), 379–423 (1948)
2. R.W. Hamming, Error detecting and error correcting codes. Bell Syst. Tech. J. 29(2), 147–160 (1950)
3. E.N. Gilbert, A comparison of signalling alphabets. Bell Syst. Tech. J. 31(3), 504–522 (1952)
4. R.R. Varshamov, Estimate of the number of signals in error correcting codes. Doklady Akad. Nauk SSSR 117, 739–741 (1957)
5. J.M. Wozencraft, List decoding. Q. Progress Rep. 48, 90–95 (1958)
6. A.K. Panda, S. Sarik, A. Awasthi, FPGA implementation of encoder for (15, k) binary BCH code using VHDL and performance comparison for multiple error correction control, in 2012 International Conference on Communication Systems and Network Technologies (CSNT) (IEEE, 2012), pp. 780–784
7. A. Pamuk, An FPGA implementation architecture for decoding of polar codes, in 2011 8th International Symposium on Wireless Communication Systems (ISWCS) (IEEE, 2011), pp. 437–441
8. E. Arikan, A performance comparison of polar codes and Reed-Muller codes. IEEE Commun. Lett. 12(6), 447–449 (2008)
9. S. Khavya, B. Karthi, B. Yamuna, D. Mishra, Design and analysis of a secure coded communication system using chaotic encryption and turbo product code decoder, in Advances in Computing and Network Communications (Springer, Singapore, 2021), pp. 657–666
10. G. Shivanna, B. Yamuna, K. Balasubramanian, D. Mishra, Design of high-speed turbo product code decoder, in Advances in Computing and Network Communications (Springer, Singapore, 2021), pp. 175–186
11. Y.S. Wong et al., Implementation of convolutional encoder and Viterbi decoder using VHDL, in Proceedings of 2009 IEEE Student Conference on Research and Development (IEEE, Serdang, Malaysia, 2009), pp. 22–25
12. G. Balakrishnan et al., Performance analysis of error control codes for wireless sensor networks, in 4th International Conference on Information Technology, 2007 (ITNG'07) (IEEE, 2007), pp. 876–879
13. A.S.K. Vamsi, S.R. Ramesh, An efficient design of 16 bit MAC unit using Vedic mathematics, in 2019 International Conference on Communication and Signal Processing (ICCSP) (IEEE, 2019), pp. 319–322
14. C. Mahitha, S.C.S. Ayyar, S. Dutta, A. Othayoth, S.R. Ramesh, A low power signed redundant binary Vedic multiplier, in 2021 5th International Conference on Trends in Electronics and Informatics (ICOEI) (IEEE, 2021), pp. 76–81
15. L. Yang, H. Liu, C.-J.R. Shi, Code construction and FPGA implementation of a low-error-floor multi-rate low-density parity-check code decoder. IEEE Trans. Circ. Syst. I Regul. Pap. 53(4), 892–904 (2006)
16. E. Stavinov, A practical parallel CRC generation method. Circ. Cellar-Mag. Comput. Appl. 31(234), 38 (2010)
17. J.M. Gilbert, C. Robbins, W. Sheikh, FPGA implementation of error control codes in VHDL: an undergraduate research project. Comput. Appl. Eng. Educ. 27(5), 1073–1086 (2019)
Parkinson’s Disease Detection Using Machine Learning Shivani Desai , Darshee Mehta, Vijay Dulera, and Hitesh Chhikaniwala
Abstract Parkinson’s disease (PD) is one of the illnesses which influences the development and its moderate and non-treatable sensory system problem. Side effects of Parkinson’s infection might incorporate quakes, inflexible muscles, stance and equilibrium weakness, discourse changes, composing changes, decline squinting, grinning, arms development, and so on. The manifestations of Parkinson’s illness deteriorate as time elapses by. The early location of Parkinson’s sickness is one of the critical applications in the present time because of this explanation. According to the execution, part is concerned it is partitioned into two unique parts. The first incorporates pre-handling of the MRI picture dataset utilizing different methods like resizing, standardization, histogram coordinating, thresholding, separating, eliminating predisposition and so forth to zero in on the part which is significant and gets more precise outcomes. In the subsequent section, a dataset with different various elements of human discourse which helps in identifying Parkinson’s illness has been utilized. Here additionally, the dataset will be handled first, imagined, adjusted, and afterward, at last, be split into preparing and testing. Utilizing machine learning calculations, we will prepare the model like decision tree classifier, logistic regression, support vector machine, XGBoost and K neighbors classification, and after testing, we will get results utilizing execution boundaries like exactness score, accuracy review, disarray grid and so on. Keywords Parkinson’s disease · Machine learning · Magnetic resonance imaging · Parkinson’s progression markers initiative S. Desai (B) · D. Mehta · V. Dulera Computer Science and Engineering Department, Nirma University, Ahmedabad, Gujarat, India e-mail: [email protected] D. Mehta e-mail: [email protected] V. Dulera e-mail: [email protected] H. Chhikaniwala Info Comm Technology, Adani Institute of Infrastructure Engineering, Ahmedabad, Gujarat, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_4
1 Introduction Parkinson’s disease (PD) is a non-reparable moderate tangible framework issue that effects working of the mind and various turns of events. This disease is achieved by the lack of dopamine-conveying neurons and is significantly found in people of 50–70 years of age and comes later Alzheimer’s. The symptoms of PD fall apart as your condition progresses as time goes on. Notwithstanding the way that there is no indisputable therapy for Parkinson’s sickness (PD), early area and appropriate organization may reduce the signs and especially work on the condition of patients’ lives. Thusly, Parkinson’s affliction revelation is one of the fundamental consistently. Side effects of PD might incorporate quakes, unbending muscles, stance and equilibrium weakness, discourse changes, composing changes, decline flickering, grinning, arm development and so forth. Determination of PD at a beginning phase is a significant testing task as early non-engine manifestations of PD might be gentle and can be brought about by numerous different conditions, and subsequently, these indications are regularly disregarded [1, 2]. As a result, to overcome these difficulties, various AI techniques for the arrangement of PD and sound controls or patients with comparative clinical introductions have been developed (e.g., development issues or other Parkinsonian disorders). Using AI approaches, we can then identify relevant elements that are not commonly used in the clinical analysis of Parkinson’s disease and rely on these alternative measures to identify PD in the early stages [3]. For analysis of PD, procedures like magnetic resonance imaging (MRI), single photon emission computed tomography (SPECT), positron emission tomography (PET), functional magnetic resonance imaging (fMRI) are utilized [4, 5]. The explanation we have utilized MRI examines here is on the grounds that they are considered to give results better compared to other utilizing procedures of ML and DL-SVM, ANN, naïve Bayes, 2D-CNN, 3D-CNN and so on For building the model, the means followed will be in the request-information assortment (dataset), information pre-handling utilizing different pre-handling strategies, improvement of the model, preparing the model and testing something similar and assessing the outcomes utilizing different execution boundaries like accuracy review, confusion matrix, loss score, etc. The center would be that of picture prehandling and in this manner include determination and extraction. In the subsequent section, a dataset with different various highlights of human discourse which helps in identifying Parkinson’s sickness has been utilized and thought about. Here likewise, the dataset will be handled first and in this manner managing expulsion of repetitive information, insignificant information, checking for missing information and so on. Utilizing different plots information can be imagined. Likewise, the dataset is checked for unevenness, and utilizing SMOTE dataset has been adjusted alongside setting up the train test highlights from the dataset for model preparation and testing. We have taken a stab at assessing the presentation by different ML methods like XGBoost classifier, support vector machine, naïve Bayes, K neighbors classification and so on. The model will initially be prepared and afterward tried, and we will get the outcomes utilizing execution boundaries like exactness score, accuracy review, disarray grid and so on.
Parkinson’s Disease Detection Using Machine Learning
45
1.1 Motivation Early non-motor symptoms of PD can be subtle and may be caused by a variety of other diseases, making early diagnosis of PD difficult; as a result, these symptoms are frequently missed. Therefore, several machine learning and deep learning approaches for the classification of PD and healthy controls, or patients with similar clinical presentations (e.g., movement disorders or other Parkinsonian syndromes), have been created to address these difficulties. Using ML and DL techniques, we can identify useful features that could not be noticed otherwise. Also, while some machine learning models have been created, very few work on MRI images; they mostly work on directly available tabular data. We try to develop a model that works on MRI images, obtaining data from those images and doing the further processing.
1.2 Research Contribution The research contribution of this paper includes the following: for the MRI scans, pre-processing is done using the various techniques available, as per the need. This helps in focusing on and working with only the part that is useful for the detection of PD, selecting and extracting the features, and enhancing them. Results of various ML and DL techniques have also been reported.
1.3 Organization The remaining portion is organized as follows: Sect. 2 consists of a table of related work with details of the papers, their titles, techniques, datasets, pros and cons, etc. Section 3 shows the workflow steps and the design of the proposed system model. Section 4 presents a comparative study of various machine learning and deep learning techniques based on their accuracy and other such parameters. Following that, in Sect. 5, specifics about the dataset, such as the source, type of scans, and quantity of data, are provided. Section 6 details the model, and the results obtained are discussed. Toward the end, the conclusion as well as future work is presented.
2 Related Work In this part, a survey of some existing ML and DL strategies is presented. For instance, in [5], the authors proposed an artificial neural network (ANN) model.
Around 200 SPECT images were used, and pre-processing and segmentation of this image data were done to improve efficiency. An ANN with sigmoid as the activation function was used for the model structure. Another approach, by Solana-Lavalle et al. [6], proposed a machine learning model using techniques like k-nearest neighbors (KNN) and support vector machine (SVM); additionally, voxel-based morphometry is used to focus only on the significant regions. In contrast, [7] developed a model for the early detection of Parkinson's disease using the AlexNet model, trained on DICOM images; the images are pre-processed using techniques such as image rotation and mirroring. In [8, 9], the authors proposed deep learning to predict Parkinson's and classify it into stages. They used a CNN model with five convolutional layers and MRI images as the database; normalization is used for pre-processing, and K-fold cross-validation has also been performed for the model. Table 1 below shows a comparison of the various related studies.
3 Proposed Work 3.1 Proposed System Model The system model proposed in this article has several stages associated with it before the model is trained, which include analysis and pre-processing of the images. The whole flow can be seen in Fig. 1. All the MRI scans used were collected from the PPMI database and had been filtered by certain parameters, for example, considering only the axial view and the type of MRI sequence. After that, the images were separated into two main folders according to their class, PD or HC. Each image is pre-processed before being given as an input to the model, starting with bias field correction, histogram matching, normalization of the pixel values, thresholding, filtering, and resizing all the images to a common size. The resultant images are given as inputs to the model, on which it is trained and tested, and the performance of the model is measured using various evaluation parameters [10]. Next, in Fig. 2, the workflow for the text-based speech dataset is proposed. The dataset was collected from the UCI ML repository, which has data of HC and PD patients and their speech characteristics. Then comes the part of processing the data, like checking for null values, duplicate values, unnecessary values, etc. Next, we visualize the processed data using various plots like the count plot, pair plot, etc., and make observations. Before splitting the data for training and testing, one important step remains: balancing the data. This is very useful when the class sizes are not nearly equal; we can do this using SMOTE. Next, the models are trained on the training data and then tested using various ML techniques like KNN, SVM, naïve Bayes, XGBoost, etc. Once that is done, we can measure the performance of the models
Table 1 Comparison of related work (literature survey)

Author | Year | Title | Data pre-processing | Technique | Pros | Cons

1. M. Rumman, A. N. Tasneem, S. Farzana | 2018 | Early detection of Parkinson's disease using image processing and artificial neural network | Segmentation, feature extraction | Segmentation and ANN | Model authenticity prior to training | Inefficient image processing
2. S. Esmaeilzadeh, Y. Yang | 2018 | End-to-end Parkinson disease diagnosis using brain MR-images | Skull stripping, data augmentation | CNN | Used K-fold cross-validation | Use of only one slice of image
3. P. R. Magesh, R. DelwinMyloth, R. Jackson Tom | 2020 | An explainable machine learning model for early detection of Parkinson's disease using LIME on DaTscan imagery | Normalization | CNN | Accurate results, one of the reasons being speech-based data | Insufficient image processing
4. S. Sivaranjini, C. M. Sujatha | 2019 | Deep learning based diagnosis of Parkinson's disease using convolutional neural network | Normalization, Gaussian filter | CNN | K-fold cross-validation | Use of only one angle of image
5. G. Solana-Lavalle, R. Rosas-Romero | 2021 | MRI scans classification using machine learning and voxel-based morphometry | Normalization, resizing | KNN, SVM, multilayer perceptron, random forest, Bayesian networks | Good image quality | Feature extraction done manually
6. A. Mozhdehfarahbakhsh | 2021 | An MRI-based deep learning model to predict Parkinson disease stages | Image alignment and segmentation, extraction of ROI, PCA | CNN | Transfer learning | Feature extraction done manually
7. N. A. Bhalchandra | 2018 | Early detection of Parkinson's disease through shape-based features | Hough transformation, sequential grass fire algorithm, moments of image, XGBoost | LDA, SVM classifier | Good enough image processing, adaptive learning | Highly sparse data in MRI
Parkinson’s Disease Detection Using Machine Learning 47
Fig. 1 Proposed system model for MRI scan database
Fig. 2 Proposed system model for text-based dataset

Table 2 Comparative study of ML/DL algorithms

Technique used | Type of dataset | Accuracy score
SVM | MRI scans | 92.35
Naïve Bayes classifier | MRI scans | 93
Decision tree classifier | 3-T MR imaging | 92
SVM | MRI scans | 86
LDA and SVM classifier | SPECT scans | 98
ANN | SPECT scans | 94
CNN | MRI scans | 96
Transfer learning, AlexNet pre-trained model | MRI scans | 88.9
Various ML techniques | SPECT scans | 97.27
using various evaluation parameters like accuracy score, confusion matrix, precision, recall, etc. [11].
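A minimal scikit-learn sketch of this train/test/evaluate loop is given below; the file name and the "status" label column are placeholders for illustration, not the actual dataset layout, and XGBoost would slot in analogously through its own package.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, confusion_matrix)

df = pd.read_csv("speech_features.csv")          # placeholder path
X, y = df.drop(columns=["status"]), df["status"] # status assumed 0 = HC, 1 = PD
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

models = {
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "SVM": SVC(kernel="rbf"),
    "NaiveBayes": GaussianNB(),
}
for name, model in models.items():
    model.fit(X_train, y_train)                  # train on the training split
    pred = model.predict(X_test)                 # predict on held-out data
    print(name,
          "acc=%.4f" % accuracy_score(y_test, pred),
          "prec=%.4f" % precision_score(y_test, pred),
          "rec=%.4f" % recall_score(y_test, pred))
    print(confusion_matrix(y_test, pred))        # rows: true, cols: predicted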
4 Comparative Study of ML/DL Algorithms Taking the technique used, dataset type, and accuracy score as parameters, a comparative study of various ML/DL algorithms has been done, as shown in Table 2 [1, 12].
5 Dataset The dataset used here, consisting of MRI scan images to detect Parkinson's disease, is taken from the Parkinson's Progression Markers Initiative (PPMI). The dataset consists of MRI scan images in the .nii (NIfTI—Neuroimaging Informatics Technology Initiative) file format, and mostly two kinds of images are available: FLAIR (fluid attenuated inversion recovery) and non-FLAIR (T2-weighted) images. Along with that, we also have various kinds of image format records for FLAIR and non-FLAIR, available as converted, normalized, augmented, and so on. However, we have chosen to use and work primarily with the FLAIR and T2-weighted images, which are the .nii files. Also, for the other part, we have made use of a speech dataset of normal healthy people and those suffering from PD. This dataset is taken from the machine learning repository of UCI. The dataset has a total of 240 records and 46 attributes, of which 120 records are of healthy subjects and the other 120 are of people with Parkinson's disease. The attributes/features included here are jitter (pitch local perturbation measures), shimmer (amplitude local perturbation measures), status (HC/PD), gender, harmonic-to-noise ratio (HNR) measures, recurrence period density entropy (RPDE), pitch period entropy (PPE), glottal-to-noise excitation ratio (GNE), and detrended fluctuation analysis (DFA) [13, 14].
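As an illustration, the initial inspection of such a speech dataset could be done with a few lines of pandas; the file name is hypothetical.

import pandas as pd

df = pd.read_csv("parkinson_speech.csv")   # hypothetical file name

print(df.shape)                     # expect (240, 46) per the description
print(df.isnull().sum().sum())      # total count of missing values
print(df.duplicated().sum())        # count of duplicate rows
print(df["status"].value_counts())  # 120 HC vs 120 PD if balanced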
5.1 Data Pre-processing for MRI Scans Data pre-processing is one of the important steps before using images from the dataset in our machine learning/deep learning techniques, as it helps in better performance and better diagnosis in disease prediction, thereby improving the accuracy of the model by taking into account only the parts/features that are important and enhancing them. Pre-processing of the dataset may include resizing, normalization, segmentation, cropping of the images, data augmentation, histogram matching, bias removal, etc. Some specific image processing methods used in the case of neuroimages are:
– Normalization
– Histogram matching
– Image resizing
– Smoothening of images
– CLAHE
– Otsu thresholding
– Skull stripping
– Binary mask, unsharp masking, etc.
Bias Field and Its Removal: The position of the patient's head in the scanner, imperfections in the scanner, head movement, temperature, and a few other issues can contribute to a heterogeneity of varying intensity throughout the MRI scan. In other words, the intensity value may vary within the same tissue in a way that is not anatomically meaningful [15]. This is known as the bias field. It can create problems in tasks like segmentation, and we might not get accurate results; accordingly, a specific kind of pre-processing is required to remove or correct the bias field. Through this, the low-frequency non-uniformity found in MRI images can be corrected.
Histogram Matching: Here, there is a source image that is transformed according to a given reference image [15]. In basic terms, histogram matching is when we make certain changes to, or modify, the original data, which is the MR image, so that the histogram of the source matches that of the reference. Histogram matching enhances the contrast of the image and can recover the losses that occur because of contrast gains or clipping due to shadows. Figures 7 and 8 show the matching effect: the resulting image attempts to mimic the pixel intensities of the reference image while maintaining the semantic and spatial features (Figs. 4, 5, 6, 7 and 8).
Normalization: Normalization of the image is a transformation of the input image intensity features to produce a different contrast between the output image tissues. The intensities of the input data are transformed during normalization to make the changes and the enhancement understandable, respectively. The two different contrast images are the T2-weighted and FLAIR images. It is done by subtracting the mean pixel intensity value from the original image and
Fig. 3 Flair axial image
Fig. 4 Speech dataset
Parkinson’s Disease Detection Using Machine Learning
51
Fig. 5 Before bias removal
Fig. 6 After bias removal
Fig. 7 Before histogram matching
dividing by the standard deviation to bring the intensities into a fixed range (Figs. 9, 10, and 11):

z = (x − μ) / σ

CLAHE: Contrast limited adaptive histogram equalization (CLAHE) is a variant of adaptive histogram equalization. Here, the work is not done on the entire image; it is done on very small regions called tiles. It also handles the over-amplification of contrast and can therefore be used to keep contrast enhancement under control.
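A small sketch of the two intensity operations just described, assuming scikit-image for histogram matching and a z-score restricted to nonzero voxels (an assumption of ours, not the paper's exact recipe):

import numpy as np
from skimage.exposure import match_histograms

def zscore_normalize(img):
    # z = (x - mu) / sigma, with mu and sigma computed over nonzero
    # (brain) voxels only; this restriction is an assumption.
    voxels = img[img > 0]
    return (img - voxels.mean()) / voxels.std()

def match_to_reference(img, reference):
    # Remap source intensities so their histogram follows the reference.
    return match_histograms(img, reference)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    src = rng.normal(100, 20, (64, 64))   # stand-ins for MRI slices
    ref = rng.normal(120, 10, (64, 64))
    print(zscore_normalize(src).std())    # ~1 after normalization
    print(match_to_reference(src, ref).mean())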
Fig. 8 After histogram matching
Fig. 9 Before normalization
Fig. 10 After normalization
Otsu Threshold: This technique is widely used for the purpose of image thresholding: after processing the input image and its histogram, the threshold value is calculated [5]. Then the pixels of the image are split into two regions, i.e., replaced by white if the intensity is greater than the threshold and replaced by black otherwise.
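The following OpenCV sketch chains CLAHE, Otsu thresholding, and resizing on a single 8-bit slice; the random array is only a stand-in for an actual slice extracted from a .nii volume.

import cv2
import numpy as np

img = np.uint8(np.random.default_rng(1).integers(0, 256, (256, 256)))
# In practice, img would be one axial slice loaded from the .nii volume.

# CLAHE works on small tiles rather than the whole image, which limits
# the over-amplification of contrast.
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
enhanced = clahe.apply(img)

# Otsu picks the threshold from the histogram automatically; pixels
# above it become white (255) and the rest black (0).
thresh_val, binary = cv2.threshold(
    enhanced, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

resized = cv2.resize(binary, (128, 128))   # final resizing step
print("Otsu threshold:", thresh_val)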
Parkinson’s Disease Detection Using Machine Learning
53
Fig. 11 Output after applying CLAHE, Otsu threshold and resizing
Fig. 12 Healthy control (HC) count and Parkinson disease (PD) patients count (0 denotes HC, 1 denotes PD)
6 Performance Evaluation and Results Figure 12 shows a count plot of the data in our dataset. Here, we can see that there are 120 records of healthy people and 120 records of people suffering from Parkinson's disease, which is shown correctly below (Figs. 12 and 13). Figure 13 shows a pair plot for different features like ShimLoc, ShimdB, ShimAPQ3, ShimAPQ5, and ShimAPQ11; one thing we can observe from it is that these features are correlated to a degree. Next, SMOTE is used for balancing the dataset. The number of instances for both classes is equal in our case, but this is not always so, and then balancing of the dataset is needed. Once done, we divide our dataset into training and validation data, and then we apply the ML algorithms (Figs. 14, 15, 16, 17, 18, 19 and 20).
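A compact sketch of this visualization-and-balancing step, assuming seaborn and imbalanced-learn, with hypothetical column names:

import pandas as pd
import seaborn as sns
from imblearn.over_sampling import SMOTE
from sklearn.model_selection import train_test_split

df = pd.read_csv("parkinson_speech.csv")       # placeholder path

sns.countplot(x="status", data=df)             # HC vs PD counts, as in Fig. 12
sns.pairplot(df[["ShimLoc", "ShimdB", "ShimAPQ3", "status"]],
             hue="status")                     # column names assumed

X, y = df.drop(columns=["status"]), df["status"]
# SMOTE synthesizes minority-class samples; it is a no-op when the
# classes are already equal, as here.
X_bal, y_bal = SMOTE(random_state=42).fit_resample(X, y)
X_train, X_test, y_train, y_test = train_test_split(
    X_bal, y_bal, test_size=0.2, random_state=42)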
Fig. 13 Pair plot where we can see that all these fundamental frequencies and variations in amplitude are highly correlated with each other
Fig. 14 Confusion matrix for KNN. Accuracy achieved—85.41
Parkinson’s Disease Detection Using Machine Learning
Fig. 15 Confusion matrix for XGBoost. Accuracy achieved—83.33
Fig. 16 Confusion matrix for naïve Bayes. Accuracy achieved—83.33
Fig. 17 ROC AUC curve for SVM
Fig. 18 Confusion matrix for SVM. Accuracy achieved—86.45
Fig. 19 Performance measures for various different ML-based techniques (results of our model implementation)
Fig. 20 Performance measures for various different ML-based techniques as comparison [9] (results of already implemented models)
Parkinson’s Disease Detection Using Machine Learning
57
We have implemented ML techniques like decision tree classification, support vector machine, naïve Bayes classification, KNN classification, and the XGBoost classifier. We can conclude the following from the results: naïve Bayes and XGBoost achieved the same accuracy (83.33), KNN reached 85.41, and SVM gave us the highest accuracy (86.45). Also, the accuracy achieved by us is higher compared with the work we compare against. Thus, in this way, we have implemented and performed Parkinson's disease classification using various machine learning techniques.
7 Conclusion and Future Work For the detection of PD, the dataset used was obtained from the PPMI Web site, consisting of MRI scans with FLAIR and non-FLAIR images in the .nii file format. After that, pre-processing techniques like histogram matching, z-score normalization, image resizing, etc., have been applied for the selection, extraction, and enhancement of the features of the images. From various machine learning techniques, we then study and implement the technique that gives us the best results. Similarly, for the text-based speech dataset, after processing and visualizing the data, we split the data into train and test sets and provide the training data to various ML techniques like naïve Bayes, SVM, XGBoost, KNN, etc. The results and output have been visualized using accuracy, precision, recall, confusion matrix, etc., which are parameters for the evaluation of performance. Also, we were able to increase the performance efficiency. The entire implementation work has currently been done on Jupyter Notebook and Google Colab. In the future, deep learning techniques like CNN can also be implemented, which may give us more accurate results for the detection of Parkinson's disease.
References
1. P.M. Shah et al., Detection of Parkinson disease in brain MRI using convolutional neural network, in 2018 24th International Conference on Automation and Computing (ICAC) (IEEE, 2018)
2. S.R. Nair et al., A decision tree for differentiating multiple system atrophy from Parkinson's disease using 3-T MR imaging. Eur. Radiol. 23(6), 1459–1466 (2013)
3. P.R. Magesh, R. DelwinMyloth, R. Jackson Tom, An explainable machine learning model for early detection of Parkinson's disease using LIME on DaTSCAN imagery. Comput. Biol. Med. 126, 104041 (2020)
4. S. Sivaranjini, C.M. Sujatha, Deep learning based diagnosis of Parkinson's disease using convolutional neural network. Multimed. Tools Appl. 79(21), 15467–15479 (2020)
5. M. Rumman et al., Early detection of Parkinson's disease using image processing and artificial neural network, in 2018 Joint 7th International Conference on Informatics, Electronics & Vision (ICIEV) and 2018 2nd International Conference on Imaging, Vision & Pattern Recognition (icIVPR) (IEEE, 2018)
6. G. Solana-Lavalle, R. Rosas-Romero, Classification of PPMI MRI scans with voxel-based morphometry and machine learning to assist in the diagnosis of Parkinson's disease. Comput. Methods Programs Biomed. 198, 105793 (2021)
7. E. Huseyn, Deep Learning Based Early Diagnostics of Parkinsons Disease (2020). arXiv preprint arXiv:2008.01792
8. A. Mozhdehfarahbakhsh et al., An MRI-Based Deep Learning Model to Predict Parkinson Disease Stages. medRxiv (2021)
9. C.O. Sakar et al., A comparative analysis of speech signal processing algorithms for Parkinson's disease classification and the use of the tunable Q-factor wavelet transform. Appl. Soft Comput. 74, 255–263 (2019)
10. M.B.T. Noor et al., Application of deep learning in detecting neurological disorders from magnetic resonance images: a survey on the detection of Alzheimer's disease, Parkinson's disease and schizophrenia. Brain Inform. 7(1), 1–21 (2020)
11. S. Haller et al., Differentiation between Parkinson disease and other forms of Parkinsonism using support vector machine analysis of susceptibility-weighted imaging (SWI): initial results. Eur. Radiol. 23(1), 12–19 (2013)
12. N.A. Bhalchandra et al., Early detection of Parkinson's disease through shape based features from 123I-Ioflupane SPECT imaging, in 2015 IEEE 12th International Symposium on Biomedical Imaging (ISBI) (IEEE, 2015)
13. L. Naranjo et al., Addressing voice recording replications for Parkinson's disease detection. Exp. Syst. Appl. 46, 286–292 (2016)
14. B.E. Sakar et al., Collection and analysis of a Parkinson speech dataset with multiple types of sound recordings. IEEE J. Biomed. Health Inform. 17(4), 828–834 (2013)
15. V. Tarjni et al., Deep learning-based scheme to diagnose Parkinson's disease. Exp. Syst. e12739 (2021)
Sustainable Consumption: An Approach to Achieve the Sustainable Environment in India Sunny Dawar, Pallavi Kudal, Prince Dawar, Mamta Soni, Payal Mahipal, and Ashish Choudhary
Abstract In the last few years, research on the sustainable environment has been motivated to unfold these problems through different marketing and consumption patterns. This offers an alternative path to conceptualizing the dynamic nature of society when speaking about sustainability. Most conceptual and practical research focuses on the routine problems of people, neglecting the need to protect the environment for future generations; core issues, such as the role of consumers in sustainable development, have remained unaddressed by behavioural researchers. This research article aims to examine the determinants of consumer behaviour linked with sustainable consumption. The focus remains on sustainable consumption and how the goal of protecting a sustainable environment can be achieved through sustainable consumption. The research attempts to find out the determinants and the effects of demographic variables on sustainable consumption. Keywords Sustainable environment · Sustainable consumption · Human behaviour · Sustainable development
1 Introduction Sustainability has been defined by several authors keeping in notice various variables which directly impact the sustainable living, sustainability inhibitors, and sustainability strategies. These all things have direct influence on perception and attitudes of S. Dawar · M. Soni · P. Mahipal (B) · A. Choudhary Manipal University Jaipur, Jaipur, Rajasthan, India S. Dawar e-mail: [email protected] P. Kudal Dr DY Patil Institute of Management Studies, Pune, Maharashtra, India e-mail: [email protected] P. Dawar Poornima Group of Colleges, Jaipur, Rajasthan, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_5
people to pursue sustainable development policies and practices. Sustainable development refers to social, economic, and environmental foundation which must be considered, harmonized, and communicated to sustain long-term feasibility of the society and the planet earth. It deals with the fulfilment of present generation needs without compromising needs of future generation. From last decade, a debate is going on, how to conserve the environment by sustainable consumption. Sustainable consumption has got vast importance at different levels of consumption. It has different suggestions for welfare of economy. It directly effects the consumption behaviour of people to encourage the sustainable consistency in an economy. The phenomenon of sustainable consumption is influenced by the allocation and utilization of different resources so that it has an ability to affect the long-term growth of an economy. Sustainable consumption is a consumption of goods which have very low impact on environment considering the social and economic parameters viable for meeting the common needs of human beings. It targets every individual and countries from an individual to every government and multinational corporations. The idea behind adoption of practice of sustainable consumption is to provide the consumers various products by reflecting their new environmental values. There are different methods by which sustainable consumption can be promoted like providing tax rebates to eco-friendly products, increasing taxes on more water and energy consumption, promoting 3R’s framework (Reduce, Recycle, and Reuse) through communication and educational campaigns. In current scenario, people show a lot of concern to environmental protection practices and want to purchase and use ecofriendly products. They make their purchase decisions concerning the eco-friendly practices. If we focus on demand side, it is only sustainable consumption which fosters the sustainable production practices. There is a requirement of multinational and multidisciplinary approaches to achieve the goal of sustainable environment. Less exploitation of resources would help the people to think more about conservation of environment and reduction of personal consumption. There are different issues which must be addressed and supported by people like providing training to consumers, increasing awareness, and bringing government and industry closer to work upon the sustainable issues. There is an emergent need to find out why and how consumers behave for sustainability issues. Various researchers have identified different contributing factors including demographic and psychographic that influence consumers’ behaviour towards sustainability. Consumption patterns in the world needs to be administrated in such a way which can increase the opportunities for the sustainable development in the world economy. There are many products which are having environmental advantages that are produced by many companies to improve the value of production. More utilization of these eco-friendly products would help to minimize the stress on environment and increase the customer satisfaction and would also promote the sustainable consumption. Sustainable consumption, in return, will help the human being to ensure their long-term survival on earth.
2 Literature Review Consumers have gained sustainability knowledge through the research policies developed by various researchers, yet they are involved in mass consumption at the price of damaging consequences for the environment [1]. The fundamentals of sustainability depend on the conscious efforts of consumers to attain the goals of sustainable consumption by deciding the future gain of society, the environment, and economic systems [2]. Sustainable consumption depends on the utilization of environmentally labelled products; organizations are focusing on producing eco-friendly products and organic food items to lead the protection of the sustainable environment [3]. The existing research literature takes an individualistic approach to sustainable consumption and environmental labels. Consumer choices are a reflection not only of quality and prices but are also related to the values and social phenomena of consumers, which have seen enormous growth in the global market [4]. A major goal of sustainable development remains the recycling and reuse of water through modern sources [5]. Conscious acceptance of preventive design can be the starting step for long-term changes in consumer behaviour: when consumers become more and more aware, the demand for sustainable consumption increases [6]. There is a significant impact of different variables, like social demographics and explicit and implicit consumer attitudes, on sustainable consumption. The majority of studies are concerned with the production of new markets and have very little concern with the consumer-exploration end. Motivation and values support actual sustainable behaviour in the consumption of fashion products [7]. Consumers are responsible for unsustainable consumption behaviour [8]; the choices they make have a substantial environmental effect, which is separate from those facets that are influenced by consumers. Consumer choices are influenced by sustainable consumption. Problems related to sustainability can be solved by sustainable consumption, as it empowers people towards an active lifestyle and can change the direction of sustainable orientation. It is not good to use products that disturb the ecological balance while ignoring products with a positive influence on the environment [9]. Ethical and sustainable consumption can be encouraged through perceived consumer effectiveness, social norms, values, and positive consumer attitudes [10]. Social media like Twitter can also be used for increasing awareness about sustainable consumption; to extract crucial features, tweets go through six stages of pre-processing, after which they are categorized as positive, neutral, or negative [11]. The multi-criteria decision problem [12] and sustainable consumption are gaining a lot of attention from international communities. Various effective programmes have acknowledged the financial and social dimensions of consumer decision making and have drawn more attention to the role of households as a stimulating factor for production [13]. There are some limitations related to the cultural, social, and historical context of sustainable consumption [14]. For more safety and protection against any glitches discovered in residences, corporate structures, and various manufacturing sites, a comprehensive integration of the sanitizer machine with the door lock method is required [15].
3 Research Methodology The current research was conducted using an empirical research design. The information was gathered using a standardized questionnaire. A pilot study with 75 respondents was done first, and several wordings of the final questionnaire were revised as a result of the pilot study. The final questionnaire was divided into two portions, each with structured questions: the first component included demographic information, while the second included questions about the structure of the proposed model. The data was collected using snowball sampling, which is handy when a population list is not accessible. The questionnaire was shared on Facebook, one of the most prominent social media networks [16]. The sample was drawn using convenience and judgemental sampling techniques, and the frame of questions was developed using previous studies. The study utilized a five-point Likert rating scale. The study initially adopted exploratory research to get new ideas and understandings; thereafter, descriptive research was done. The survey method was utilized for collecting the research data using the questionnaire. The respondents were also interviewed personally to gain more insights, and they were given the questionnaire to fill in. A sample of 350 respondents was taken, with respondents from different demographic backgrounds. The data was analysed with SPSS 21.0 on the responses given by the different respondents. The correlation technique has been used, and structural equation modelling (SEM) was used for data analysis. Cronbach's alpha was calculated to investigate the internal consistency of the items used in the structured questionnaire. The content validity of the questionnaire was tested through discussion with experts from industry and academia.
4 Data Analysis and Interpretation Data analysis and interpretation are based on descriptive and inferential analysis.
4.1 Analysis of Reliability For testing reliability, the structured questionnaire was given to 75 respondents and Cronbach's alpha was calculated; it must exceed 0.70 for the questionnaire to be said to show good reliability [17]. The Cronbach's alpha value was 0.830, which is quite good and shows that the statements made to measure the constructs are reliable and not confusing. We moved ahead with some minor changes and sent the questionnaire to all respondents using snowball sampling (Table 1).

Table 1 Analysis of reliability

Reliability statistics
Cronbach's alpha: 0.830
No. of items: 20
4.2 Respondent Demographic Profile Table 2 presents the demographic report of the respondents. This data was collected using the questionnaire: it was sent to 850 customers, 475 responses were received, and of those, only the 350 responses filled in appropriately were taken for the final research study. Table 2 shows that 62% of the respondents were male and only 37% were female. The largest group of respondents was in the age group of 18–30 years (youngsters); 54% of the respondents had graduate or higher degrees, and most of the respondents were from the service class. The demographics are, in the main, normally distributed. The following constructs were identified after a detailed review of the literature; the constructs are borrowed from the published works of a few researchers, and the details are given in the next section (Table 3). Data collection was performed through structured questionnaires, which were filled in through personal interaction and self-administered electronic media.

Table 2 Demographic report of respondents

Variables | Category | Frequency | Percentage (%)
Gender | Male | 220 | 62.86
Gender | Female | 130 | 37.14
Age | 18–30 years | 150 | 42.86
Age | 31–40 years | 90 | 25.71
Age | 41–50 years | 45 | 12.86
Age | Above 50 years | 65 | 18.57
Education | High school | 45 | 12.86
Education | Graduate | 190 | 54.29
Education | Postgraduate | 115 | 32.85
Occupation | Student | 140 | 40.00
Occupation | Business | 55 | 15.71
Occupation | Service | 155 | 44.29
Table 3 Constructs of structural equation modelling (SEM)

Demographic variables
DEM_1 Gender
DEM_2 Age
DEM_3 Education level

Sustainable consumption behaviour
SCB_1 Try to preserve the environment through my daily activities
SCB_2 Promote social justice and human rights through my concrete activities
SCB_3 Consume local products for supporting economy
SCB_4 Encouraged for making changes in lifestyle for sustainable consumption

Influence of social environment
SEN_1 My family members or friends motivate me to follow them for protecting environment
SEN_2 Participate in environmental protection and social work
SEN_3 Buy organic and ecological friendly products from the supermarket
SEN_4 Follow tradition of family to protect environment
SEN_5 In our society separation of wastes for recycling is normal phenomenon

Awareness and information
AI_1 Have participated in workshop related with environmental issue
AI_2 From my peer group I have been taught to be responsible for various resources like electricity, energy fuel, and water, etc.
AI_3 Have been informed for sustainability issues
AI_4 Know about the negative effects of the products which are harmful for environment

Market determinants
MD_1 Organic products help to protect the environment
MD_2 Know about the advertising tools promoting organic products
MD_3 Know about available distribution channels to buy environment saving products
MD_4 Even organic products are expensive, I still buy them
The data set comprised 20 items. These items were taken to judge the external variables: demographic factors, awareness and information, influence of social environment, and market determinants.
4.3 Hypothesized Proposed Model This model was developed using the available literature on the different variables treated as possible determinants. The inclusion of constructs and the relationships among the different items in the model are based on earlier knowledge and research on the different encouraging factors of sustainable consumption behaviour.
Demographic variables: Adopted from [18], who identified the impact of demographics like gender, age, and education level on green purchase and consumption behaviour. More demographics than in earlier studies were shown to be linked to a number of specific environmentally friendly activities. The authors came to the conclusion that individual behaviours, rather than broad statements or attitudes, may be more responsive to demographic influences.
H1: There is a positive and significant relationship between demographic variables and sustainable consumption behaviour.
Awareness and information: The study [19] found that understanding of the term is a prerequisite for changes in consumer behaviour and consumption models. We adopted the construct from the findings in their research paper, Consumers' Awareness of the Term Sustainable Consumption, published in 2018. They had found that consumers are unfamiliar with the concept of sustainable consumption: the majority of respondents came across concepts connected to sustainable consumption but were unable to make links between them, and only around half of those polled were confident in their ability to interpret sustainable consumption on their own. It is worth mentioning, though, that the respondents' wide range of behaviours suggests that they understand the notion.
H2: There is a positive and significant relationship between awareness and information and sustainable consumption behaviour.
Influence of social environment: The study [20] contributed to a better understanding of what quality of life means from the standpoint of sustainable consumption. Different consumer motivations are discussed, as well as the contributions of the rich and the poor to unsustainable consumption patterns. Using relevant literature as a starting point, the authors began a discussion about the complicated relationship among consumption, beliefs, uniqueness, and mechanisms for making purchase decisions in a globalized perspective. We adopted the constructs developed in [20] to understand the impact of the influence of the social environment on sustainable consumption behaviour. Social environmental influence can be exerted by family, friends, and other peer-group members, who shape attitudes towards the environment. Groups that operate in the power mode can encourage the behaviour when people turn to them; the search for identity in a group and the expectation of support also increase the influence of social environmental forces.
H3: There is a positive and significant relationship between the influence of social environmental forces and sustainable consumption behaviour.
Market determinants: Several authors have identified factors like price, advertising, and distribution channels that impact sustainable consumption behaviour [21]. The construct is based on this work, and a few of the relevant items developed have been borrowed for this research paper. Market determinants of sustainable products and services also modify consumption behaviour: the behaviour is distorted when the price of a sustainable product increases. It is very important to know how consumers perceive the efficiency of sustainable products in the market.
H4: There is a positive and significant relationship between market determinants and sustainable consumption behaviour.
4.4 Correlation Analysis The correlations among all the variables used in the hypotheses were computed. Table 4 shows positive and high correlation values between the demographic variables and sustainable consumption behaviour. Table 5 shows positive and high correlation values between awareness and information and sustainable consumption behaviour. Table 6 shows positive and high correlation values between the influence of the social environment and sustainable consumption behaviour.

Table 4 Correlation between demographic variables and sustainable consumption behaviour
Pearson correlation between demographic variable and sustainable consumption behaviour: r = 0.725*, Sig. (2-tailed) = 0.002, N = 350
(*) The correlation value is significant at the 1% level of significance
Table 5 Correlation between awareness and information and sustainable consumption behaviour
Pearson correlation between awareness and information and sustainable consumption behaviour: r = 0.745*, Sig. (2-tailed) = 0.003, N = 350
(*) The correlation value is significant at the 1% level of significance
Table 6 Correlation between influence of social environment and sustainable consumption behaviour
Pearson correlation between influence of social environment and sustainable consumption behaviour: r = 0.785*, Sig. (2-tailed) = 0.000, N = 350
(*) The correlation value is significant at the 1% level of significance
Table 7 Correlation between market determinants and sustainable consumption behaviour
Pearson correlation between market determinants and sustainable consumption behaviour: r = 0.765*, Sig. (2-tailed) = 0.002, N = 350
(*) The correlation value is significant at the 1% level of significance
Table 7 shows positive and high correlation values between market determinants and sustainable consumption behaviour. All the constructs showed a relatively high positive correlation with sustainable consumption behaviour. Hence, this is a good premise to move ahead and check the structural equation modelling results.
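The Pearson correlations reported in Tables 4, 5, 6 and 7 can be reproduced with SciPy; the synthetic composite scores below only illustrate the call and are not the study's data.

import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
scb = rng.normal(size=350)                       # composite SCB scores, N = 350
dem = 0.7 * scb + rng.normal(scale=0.7, size=350)  # a correlated construct

r, p = pearsonr(dem, scb)                        # r and two-tailed p-value
print(f"Pearson r = {r:.3f}, two-tailed p = {p:.4f}")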
4.5 Measurement Model Estimation To evaluate the measurement model, the statistical instrument structural equation modelling (SEM) in AMOS 21.0 was operated. It follows a three-step process.

4.5.1 Assessment of Convergent Validity This assessment is based on the examination of the average variance extracted (AVE) of every construct. AVE values should be at least 0.50 to demonstrate that most of the variance is accounted for by the construct. The results of the analysis can be seen in Table 8: it is quite visible that the AVEs exceed the threshold value of 0.50, showing that convergent validity is satisfactory.

Table 8 AVEs and CRs of constructs

Construct | No. of items | AVE | CR
Demographic variables | 3 | 0.71 | 0.82
Awareness and information | 4 | 0.77 | 0.86
Influence of social environment | 5 | 0.73 | 0.84
Market determinants | 4 | 0.78 | 0.90
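The AVE and CR figures in Table 8 follow the standard formulas over standardized factor loadings, which a short sketch can compute; the loadings used here are illustrative, not the study's estimates.

import numpy as np

def ave(loadings):
    # Average variance extracted: mean of the squared standardized loadings.
    lam = np.asarray(loadings)
    return np.mean(lam ** 2)

def composite_reliability(loadings):
    # CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances),
    # with error variance of each item taken as 1 - loading^2.
    lam = np.asarray(loadings)
    num = lam.sum() ** 2
    return num / (num + np.sum(1 - lam ** 2))

loadings = [0.84, 0.88, 0.81, 0.86]   # illustrative values only
print(round(ave(loadings), 2), round(composite_reliability(loadings), 2))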
4.5.2 Examination of Internal Consistency In this second step, the reliability of each construct is determined; the composite reliability (CR) statistic shows the internal consistency of each construct. It is analogous to Cronbach's alpha and uses 0.70 as the threshold for acceptable construct reliability. Table 8 shows that the CRs of all theoretical constructs exceed 0.70, so there is internal consistency in the items used for the research.
4.5.3 Evaluation of Discriminant Validity The establishment of suitable discriminant validity is the final step in evaluating a measurement model. This guarantees that each construct is distinct from the others and that the survey instrument variables load on the appropriate construct. It can be investigated by comparing the square root of each construct's AVE to its correlations with the other constructs; discriminant validity is considered satisfied when the square roots of the AVEs are higher than the correlation values. Table 9 shows the discriminant validity of the instruments, examined following Fornell and Larcker (1981). The square root of the AVE, shown on the diagonal, is greater than the corresponding row and column values, which indicates the discriminant validity of the constructs.

Table 9 Correlations of constructs and AVEs

Construct | 1 | 2 | 3 | 4
1 Demographic variables | 0.81 | | |
2 Information and education | 0.18 | 0.75 | |
3 Influence of social environment | 0.15 | 0.15 | 0.85 |
4 Market determinants | 0.20 | 0.22 | 0.20 | 0.87

(Diagonal values are the square roots of the AVEs)
4.6 Overall Measurement Model Fit and Testing of Hypotheses To assess this, SPSS AMOS software version 21.0 was applied, as it can judge complex path models, examine overall model fit, and investigate confirmatory factor analysis. There are two steps: in the first step, the overall fit of the model is assessed, and in the second step, significance testing is done to examine the hypothesized relationships. Bivariate analysis was performed, i.e., examining the correlative two-tailed p-value at 0.05. To determine the fit of the model, the chi-square, CMIN, GFI, CFI, and RMSEA were determined, as shown in Table 10. The overall fit indices used for the structural model show that it fits with acceptable values, supporting the hypothesized model shown in Fig. 1. The model fit indices GFI, CFI, IFI, and TLI were measured at 0.918, 0.950, 0.950, and 0.937, respectively. The RMSEA was 0.040, which is less than 0.05 and indicates a good model fit [22]. All fit indices needed to be better than 0.9 to indicate a good or acceptable model fit [23]. The final measurement model met all fit criteria. As a result, the final measurement model may be assessed to fit the data well and to have an acceptable level of reliability and validity (Table 10).
Fig. 1 Hypothesized proposed model
Table 10 Summary of overall measurement model

Fit indices | Initial | Final
CMIN/DF | 3.816 | 1.540
GFI | 0.665 | 0.918
TLI | 0.443 | 0.937
CFI | 0.515 | 0.950
RMSEA | 0.091 | 0.040
IFI | 0.521 | 0.950
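As a sanity check on such fit statistics, RMSEA can be derived from the chi-square and degrees of freedom with the standard formula; the inputs below are illustrative only, since the paper reports the resulting indices directly.

import math

def rmsea(chi2, df, n):
    # RMSEA = sqrt(max(chi2 - df, 0) / (df * (n - 1))); values below
    # 0.05 are conventionally read as good fit.
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

# Hypothetical chi-square and df for a sample of N = 350 respondents.
print(round(rmsea(chi2=250.0, df=160, n=350), 3))   # ~0.040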
Table 11 shows that all the null hypotheses are rejected. The first hypothesis concluded that there is a positive and significant relationship between demographic variables and sustainable consumption behaviour, which means that the demographic variables related to the consumer play an important role in shaping sustainable consumption behaviour. The acceptance of the alternate option of the second hypothesis emphasized that consumer awareness plays an important role in regulating consumers' sustainable consumption behaviour, so updated information has a significant role in steering consumption behaviour towards sustainability. The acceptance of the third hypothesis highlighted that there is a positive and significant relationship between the influence of social environmental forces and sustainable consumption behaviour: the social environmental factors considerably affect sustainable consumption behaviour in achieving a sustainable environment. The acceptance of the alternate option of the last, fourth hypothesis emphasized that market determinants have a significant role in shaping sustainable consumption behaviour, so companies need to use better marketing factors and practices to influence sustainable consumption behaviour.
Table 11 Significance test of individual hypotheses

Hypothesis | Path | Proposed model (β, S.E., C.R.(t), P) | Modified model (β, S.E., C.R.(t), P) | Result
H1 | DEM → SCB | 0.174, 0.064, 1.127, 0.003 | 0.144, 0.039, 2.133, 0.001 | Supported
H2 | SEN → SCB | 0.168, 0.61, 1.137, 0.002 | 0.156, 0.45, 3.562, 0.002 | Supported
H3 | IE → SCB | 0.68, 0.035, 4.653, 0.080 | 0.095, 0.028, 1.146, 0.002 | Supported
H4 | MD → SCB | 0.072, 0.038, 4.730, 0.086 | 0.102, 0.030, 1.152, 0.003 | Supported

5 Conclusion The study was done keeping in view the conceptual framework developed from previous researchers' studies. The aim of the study was to examine the effect of different demographic variables and other determinants on sustainable consumption behaviour. The results of the research have shown that demographic variables and other determinants, like social environmental forces, awareness and information, and market determinants, all have a positive and significant impact on sustainable consumption behaviour. A sustainable environment can only be achieved when people get more and more information on the positive effects of sustainability. Social environmental forces also play a significant role in changing consumption behaviour and moulding it towards sustainable consumption. It is not only the task of consumers to look at the factors responsible for sustainable consumption; it is also the duty of marketers to provide sustainable products and adopt sustainability practices. Consumers require assistance in making sustainable choices, and as their awareness grows, they will begin to comprehend the consequences of their purchasing decisions, leading to long-term sustainable practices [24]. Consumer awareness has shifted as a result of globalization and technological improvements, and customers are now more aware of many social and environmental issues. Strategic problem framing by itself is unlikely to be helpful in changing entrenched attitudes and behaviours in the absence of a larger behavioural-change programme [25]. Different methods of raising awareness, such as social media networking, correct advertising, product labelling, educational programmes, and so on, can be used to achieve the goal of sustainable development. Furthermore, consumers' real purchasing decisions are influenced by the price and quality of
items and services. To help people cope with the situation, infotainment activities and programmes might be conducted. On the other hand, a legislative framework can be devised to exert control over corporations' unsustainable behaviours, allowing for the timely correction of flaws. The current work has produced a fresh perspective that will be effective in raising consumer knowledge about sustainable consumption in order to attain long-term goals. Emotional intelligence increases the impact of involvement on pro-environmental and pro-social consumption behaviour, as well as having a direct impact on pro-environmental conduct [26]. This research has important managerial ramifications: it educates policymakers and marketing executives on the key predictors of consumers' sustainable purchasing behaviour. Marketers would benefit from understanding the drivers of sustainable purchasing behaviour, as this knowledge will allow them to tailor their product offerings and develop marketing strategies that encourage it. The current study also has important implications for public policy. According to the findings, environmental awareness, demographics, market factors, and the social environment are all important factors leading consumers to investigate green products. Environmental education should be used by policymakers to further nurture and develop this tendency. Consumers are generally sceptical of manufacturers' environmental claims and find it difficult to identify green products; as a result, environmental education should provide information on how a consumer can help the environment. Acknowledgements The authors sincerely thank the management and administration of Manipal University Jaipur, Dr DY Patil Institute of Management Studies, and Poornima Group of Colleges for providing the necessary support for this research.
References
1. J. Shadymanova, S. Wahlen, H. van der Horst, 'Nobody cares about the environment': Kyrgyz' perspectives on enhancing environmental sustainable consumption practices when facing limited sustainability awareness. Int. J. Consum. Stud. 38(6), 678–683 (2014)
2. H.M. Farley, Interpreting sustainability: an analysis of sustainable development narratives among developed nations. Ph.D. diss., Northern Arizona University, 2013
3. S. Koos, Varieties of environmental labelling, market structures, and sustainable consumption across Europe: a comparative analysis of organizational and market supply determinants of environmental-labelled goods. J. Consum. Policy 34(1), 127–151 (2011)
4. N. Mazar, C.B. Zhong, Do green products make us better people? Psychol. Sci. 21(4), 494–498 (2010)
5. S. Zanni, S.S. Cipolla, E. di Fusco, A. Lenci, M. Altobelli, A. Currado, M. Maglionico, A. Bonoli, Modeling for sustainability: life cycle assessment application to evaluate environmental performance of water recycling solutions at the dwelling level. Sustain. Prod. Consump. 17, 47–61 (2019)
6. L.A. Hale, At home with sustainability: from green default rules to sustainable consumption. Sustainability 10(1), 249 (2018)
7. L. Lundblad, I.A. Davies, The values and motivations behind sustainable fashion consumption. J. Consum. Behav. 15(2), 149–162 (2016)
8. D.B. Holt, Constructing sustainable consumption: from ethical values to the cultural transformation of unsustainable markets. Ann. Am. Acad. Pol. Soc. Sci. 644(1), 236–255 (2012)
9. M. Bilharz, K. Schmitt, Going big with big matters. The key points approach to sustainable consumption. GAIA-Ecol. Perspect. Sci. Soc. 20(4), 232–235 (2011)
10. I. Vermeir, W. Verbeke, Sustainable food consumption: exploring the consumer attitude–behavioral intention gap. J. Agric. Environ. Ethics 19(2), 169–194 (2006)
11. V. Sharma, S. Srivastava, B. Valarmathi, N.S. Gupta, A comparative study on the performance of deep learning algorithms for detecting the sentiments expressed in modern slangs, in International Conference on Communication, Computing and Electronics Systems: Proceedings of ICCCES 2020, vol. 733 (Springer, 2021), p. 437
12. M.S. Ishi, J.B. Patil, A study on machine learning methods used for team formation and winner prediction in cricket, in Inventive Computation and Information Technologies (Springer, Singapore, 2021), pp. 143–156
13. M.J. Cohen, Consumer credit, household financial management, and sustainable consumption. Int. J. Consum. Stud. 31(1), 57–65 (2007)
14. P. Dolan, The sustainability of "sustainable consumption". J. Macromark. 22(2), 170–181 (2002)
15. M. Shanthini, G. Vidya, IoT-based smart door lock with sanitizing system, in Inventive Computation and Information Technologies (Springer, Singapore, 2021), pp. 63–79
16. A.C. Nielsen, Global faces and networked places. Retrieved January 29, 2010 (2009)
17. P.J. Lavrakas, Encyclopedia of Survey Research Methods (Sage Publications, 2008)
18. C. Fisher, S. Bashyal, B. Bachman, Demographic impacts on environmentally friendly purchase behaviors. J. Target. Meas. Anal. Mark. 20(3), 172–184 (2012)
19. E. Goryńska-Goldmann, M. Gazdecki, Consumers' awareness of the term sustainable consumption, in Conference Proceedings, International Scientific Days 2018: Towards Productive, Sustainable and Resilient Global Agriculture and Food Systems (Wolters Kluwer, Nitra, 2018), pp. 316–329
20. N.M. Ayala, Sustainable consumption, the social dimension. Revista Ecuatoriana de Medicina y Ciencias Biológicas 39(1) (2018)
21. Y. Joshi, Z. Rahman, Factors affecting green purchase behaviour and future research directions. Int. Strateg. Manage. Rev. 3(1–2), 128–143 (2015)
22. J.H. Steiger, Point estimation, hypothesis testing, and interval estimation using the RMSEA: some comments and a reply to Hayduk and Glaser. Struct. Equ. Model. 7(2), 149–162 (2000)
23. J.F. Hair, R.E. Anderson, B.J. Babin, W.C. Black, Multivariate Data Analysis: A Global Perspective (Pearson, Upper Saddle River, 2010)
24. M. Soni, S. Dawar, A. Soni, Probing consumer awareness and barriers towards consumer social responsibility: a novel sustainable development approach. Int. J. Sustain. Dev. Plan. 16(1), 89–96 (2021)
25. L.P. Fesenfeld, Y. Sun, M. Wicki, T. Bernauer, The role and limits of strategic framing for promoting sustainable consumption and policy. Glob. Environ. Chang. 68, 102266 (2021)
26. S. Kadic-Maglajlic, M. Arslanagic-Kalajdzic, M. Micevski, J. Dlacic, V. Zabkar, Being engaged is a good thing: understanding sustainable consumption behavior among young adults. J. Bus. Res. 104, 644–654 (2019)
The Concept of a Digital Marketing Communication Model for Higher Education Institutions Artur Kisiołek, Oleh Karyy, and Ihor Kulyniak
Abstract Digital marketing has become an essential element of the activities of higher education institutions. Accordingly, higher education institutions need to adapt their marketing communications to modern realities. The authors' intention was to highlight strategic and tactical aspects relevant to digital marketing communication processes, based on a literature review, research on the online marketing activity of higher education institutions from Poland and Ukraine, and their own experience. The authors propose a conceptual model for the digital marketing communication of a higher education institution that is universal and applicable to any type of higher education institution regardless of its profile, form of ownership and country. Keywords Digital marketing · Customer data platform · Higher education institutions · Marketing communication model · Omnichannel marketing · Web 2.0
1 Introduction Internet technologies have become an integral part of the communication processes within the marketing activities of universities. These modern changes in marketing activities require scientific reflection. Internet marketing can be described as a series of activities, procedures and practices that aim to achieve set goals, attract new visitors, retain them and turn visits into a specific customer response. To understand the different goals of digital marketing for each entity, one must first understand the core tools that are widely used today. Higher education institutions are A. Kisiołek Great Poland University of Social Studies and Economics in Środa Wielkopolska, Środa Wielkopolska, Poland e-mail: [email protected] O. Karyy · I. Kulyniak (B) Lviv Polytechnic National University, Lviv, Ukraine e-mail: [email protected] O. Karyy e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_6
no exception in this matter. The multiplicity of online marketing instruments, tools and channels indicates both the remarkable potential of digital communication and the need to integrate its various forms across many possible channels. The aim of the article is to present a conceptual model of the digital marketing communication of a higher education institution, based on literature studies, the authors' own research on the digital marketing of higher education institutions in Poland and Ukraine, and their professional experience in this matter. The model is an attempt to show the directions of integration in a multichannel and then an omnichannel approach, with the use of a Customer Data Platform (CDP) database, cloud computing, artificial intelligence (AI) and machine learning.
2 Theoretical Outline Modern researchers distinguish three main directions of technological change affecting the operation of organisations; these include the transformational transitions:
• from e-communication to m-communication;
• from simple algorithmisation to artificial intelligence;
• from simple analytics, via the Internet of Things (IoT) and Big Data, to predictive analytics [1–3].
The Internet, in the first, stationary phase of its existence (Web 1.0), became a database which was, relative to its current size, quite limited, and its further development in the interactive Web 2.0 phase brought an exponential growth of data. In such an overcrowded Web, in the chaos of publishing, finding valuable information is difficult, time-consuming and complex. Therefore, the next phase in the development of the Internet is the Web 3.0 concept, based on the semantic Web and artificial intelligence, the aim of which is to enable intelligent information search in a selective, targeted and user-driven manner. According to Rudman and Bruwer [4], in Web 3.0, a machine will be able to understand and catalogue data in a human-like way, making the Internet a base where any data format can be shared and understood by any device in any network. This evolution will enable the autonomous integration of data and services, as well as the creation of new functionalities. It also brings risks related to the protection of personal data, including unauthorised access or manipulation of data on an unprecedented scale. According to Baker [5, p. 14]: "The exploding world of data, as we will see, is a giant laboratory of human behaviour. It is a testing ground for social science, for economic behaviour and psychology". A further development of the Web in the direction indicated above is already a fact, and its consequence will be, as Porter and Heppelmann [6] specified, the exit of data "trapped on two-dimensional document pages and computer screens" into three-dimensional Augmented Reality (AR), understood as "a set of technologies that put digital data and images on the physical world", which gives hope for bridging
the gap between digital data collected on the Web and the possibility of using them in the physical world. The Internet, according to Kiełtyka and Zygoń [7, p. 24], has evolved from a simple communication tool into "the basis for the functioning of social life in many areas of the economy", and technologies and adapted IT tools, both real and virtual, are still developing. The evolution of the Internet is not losing momentum, and the changes it generates are reflected, among others, in the marketing activities of organisations worldwide. The digitalization of marketing has been progressing very dynamically for several years, driven by the speed of technological change, which is directly proportional to the changes occurring in models of customer behaviour. It all started with the World Wide Web, which has now transcended computer screens and followed the user to the mobile phone, the tablet, wearable devices and the space around us, through objects hitherto offline, as exemplified by home appliances (the Internet of Things). Artificial intelligence (AI) techniques are applied to customer data, which can be analysed to anticipate customer behaviour. AI, big data and advanced analytics techniques can handle both structured and unstructured data with greater speed and precision than conventional computing, which propels digital marketing (DM) forward [8]. According to Zeeshan and Saxena [9]: "The function of artificial intelligence and digital marketing have been started to be used in various fields, especially in marketing because it is the trendiest work in the market creating major changes". The research conducted by the authors provides a snapshot of the use of the Internet in the marketing activities of higher education institutions in Poland and Ukraine; such a picture cannot remain fully "current" even at the time of the survey interviews with respondents or the literature search. The findings point to the significant and growing role of the Internet in the marketing activities of higher education institutions in the analysed countries, and to areas gaining strategic importance for the future, such as social media and mobile marketing. The research results indicated the great potential of these areas, as well as opportunities for improvement, the use of additional tools and integration towards multichannel and omnichannel communication. The results obtained during the research, together with the speed and nature of changes in the area of digital marketing, led the authors to develop a model of digital marketing communication for a higher education institution, intended as an aid in the digital transformation of a higher education institution's marketing and as material for discussion on the future of marketing communication in an omnichannel environment, using marketing automation technology in the cloud. Selected aspects of the marketing activity of a higher education institution are the subject of many research works by scientists representing different educational systems and nationalities [10, 11]. In the following, selected models are presented which provide the theoretical background to the concept prepared by the authors. According to Williams [12], the last 20 years have seen profound changes as a result of the development of information and communication technologies, which are unique in their comprehensive reach and the stakeholders they affect.
Higher education institutions are under increasing pressure to respond to national
and international economic, social and political changes. According to Chaurasia et al. [13], the answer to this evolving, complex picture of contemporary relationships between a higher education institution and its external stakeholders is "data". Structured data streams from, for example, transaction systems, web sites, social networks and radio frequency identification (RFID) scanners, as well as "physiological" data, "user footprint" data and other data from various types of sensors (such as the popular beacons), are used to build collections of Big Data about users, which, after processing, become important for the organisation that has them at its disposal. The data-stimulated market for higher education services is referred to in the literature as "academic and educational analytics" [14]. According to Chaurasia et al. [13, p. 1100], Big Data can help an academic community better understand its students' needs. Based on the concept of capability maturity models (CMM) developed at the Carnegie Mellon Software Engineering Institute, Chaffey and Ellis-Chadwick [15] proposed a CMM model for digital marketing consisting of processes considered in relation to five maturity levels, i.e. initial, managed, defined, quantitatively managed and optimised. In the earlier model, the researchers included six processes: digital channel strategy development; online customer acquisition; online customer conversion and customer experience; customer development and growth; cross-channel integration and brand development; and overall digital channel management, including change management. In their subsequent work, Chaffey and Ellis-Chadwick [15] revised the model, detailing the following seven processes: strategic; performance improvement; management commitment; resources and structure; data and infrastructure; integrated customer communication; and integrated customer experience. A summary of selected CMM models in relation to Web 2.0 technologies in digital marketing is included in Table 1. These models take a variety of perspectives; the broader the problem is framed, the more general the concept becomes and, consequently, the more significant the changes required to adapt it to the specific requirements of each organisation. Consequently, as reported by Al-Thagafi et al. [16, p. 1149], many sectors and organisations are developing their own, narrower and more detailed CMM models. The CMM model proposed by the aforesaid researchers relates to the use of social media (Web 2.0 technologies) in the recruitment of foreign students at higher education institutions in Saudi Arabia. To develop it, the authors used the four business processes of the AIDA (Attention, Interest, Desire, Action) marketing communication model. The aim of the research was to assess the extent to which Web 2.0 technologies were implemented to support these processes. The analysis covered the period from when a prospective student first learned about an educational service until he/she decided to apply to a particular higher education institution.
Table 1 Overview of capability maturity models (CMM) in relation to social media marketing activities

Model: Dynamic capabilities model of organisations (Bolat et al. [17])
Description: Four abilities: (1) sensing the market; (2) managing relationships; (3) branding; (4) developing content
Advantages: It distinguishes the skills and knowledge needed to implement mobile social media capabilities at every level
Restrictions: It focuses on the advertising industry in business-to-business (B2B) environments. It is based on a single operational level rather than maturity levels

Model: Excellence in social media (Chaffey [18])
Description: Five abilities: (1) initial (unintentional use of social media); (2) managed (specific objectives); (3) defined (SMART objectives); (4) quantitatively managed (statistical measurement of social media activity); (5) optimised (return on investment after verification)
Advantages: The model is constantly updated by the Smart Insights research group to provide information on the latest and best practices in social media
Restrictions: It focuses on plans and activities without paying attention to the capacity of staff to implement them

Model: Strategic capabilities of the organisation (Nguyen et al. [19])
Description: Three abilities: (1) gaining knowledge from social media; (2) integrating knowledge; (3) applying knowledge in line with the strategic directions and choices of the organisation
Advantages: The focus is on how organisational social behaviour can lead to effective integration of social media when implementing an organisation's strategy
Restrictions: Switches between organisational change and marketing strategy; embedded in the cultural and economic context of China and its widely used Web 2.0 technologies, e.g. WeChat, Weibo

Model: Social media opportunities in B2B communication [20]
Description: Five abilities: (1) broadcast speed: the speed at which a social media message reaches its target audience; (2) parallelism: the extent of mutual understanding between a sender and a receiver; (3) symbol sets: flexibility factor in message encoding; (4) possibility to practice: whether the message can be edited before sending; (5) re-processability: whether the message can be edited/deleted after sharing
Advantages: A model developed to improve marketing communication skills, limited to B2B practices of small- and medium-sized enterprises (SMEs)
Restrictions: The focus is on B2B practices in SMEs (limited to Chinese SMEs and Web 2.0)

Model: Social media opportunities in B2B marketing [21]
Description: Four abilities: (1) technological: understanding and categorising different social networks according to the organisation's strategic objective(s); (2) operational: building online communities to increase user benefits; (3) managed: mechanisms to assess, control and measure social media performance and results; (4) strategic: ensuring that the organisation has the necessary cultural and individual capabilities to use social media in its business over the long term
Advantages: A model suggesting that the capability to operate in a dynamic environment, using a social-cognitive approach to the differences in individuals' skills, is essential for the successful implementation of Web 2.0 in B2B marketing
Restrictions: A descriptive study based on a literature analysis of 112 articles; no primary data were collected

Source: Based on [16, pp. 1150–1151]
3 Omnichannel Versus Multichannel Approach According to Chaffey and Ellis-Chadwick [15], a strategic approach to digital marketing requires defining the capabilities and initiatives that support the marketing and business objectives an organisation should implement to leverage digital media, data and marketing technologies, increasing multichannel audience engagement across digital devices and platforms. This definition will serve as a basis for further discussion, as the number of tools and channels and the new technological possibilities mean that modern marketing, and more specifically marketing communications, is moving towards a multichannel and omnichannel environment. In the multichannel approach, multiple tools are used independently of each other, and their interaction is non-existent or limited. The omnichannel model is based on the multichannel template, but the essence of its functioning is to integrate all tools and channels into one coherent message for the recipient. According to Nowak [22], omnichannel marketing is "a complex ecosystem based on a central data system with many interrelated components". It is not just about giving the customer different ways to interact; it is about "providing them with a consistent experience and maintaining a complete sequence of actions at every level of the customer's relationship with the brand". Omnichannel marketing, according to Berman and Thelen [23], is generally contrasted with multichannel marketing. The researchers outline the main differences between these approaches along two dimensions: organisational strategy
and consumer behaviour. In terms of strategy, the differences concern the objectives of the organisation, the uniformity of messages across devices and channels, the distinction between offline and online services, the use of single or multiple touchpoints, the format of the organisation and the extent to which the customer and databases are unified across all channels. Differences based on consumer behaviour include the design of the consumer's purchase path (uniform or different, and linear or non-linear), the place of purchase versus the place of collection and return, and the degree of effort the consumer has to make when moving across channels and devices. Ultimately, the differences between multichannel and omnichannel marketing result from the strategy, customer profile and process maturity of the organisation. In the case of higher education institutions, it is a matter of evolution: first the multichannel approach, and then greater personalization through the implementation of the omnichannel approach. Gotwald-Feja [24, p. 263] believes that the changing marketing environment, as well as the tools and scope of marketing communication, force researchers to modify the known and previously applied models of marketing communication, as exemplified by Szymoniuk's critical analysis of classical and contemporary models of communication [25, 26] and the spherical model of marketing communication proposed by this researcher. In light of these considerations, models of marketing communication on the Internet should be extended to the reality of the omnichannel environment. A proposal for an omnichannel marketing communication (OMC) model was presented by Gotwald (Fig. 1). The author introduced a mix of tools into the process of encoding and decoding a message, and turned the singularity of the process into multiplicity and parallelism. Understood in this way, the sender's (company's) message is the sum of "messages delivered across multiple channels, using a variety of marketing communication tools" [27, p. 46]. The researcher points out the need to encode messages that complement each other synergistically, as they are decoded by the recipient (customer) applying different tools at different times.
Fig. 1 Omnichannel marketing communication model. Source Gotwald [27], p. 46
Synergy may occur at nodal points, where the recipient applies numerous tools at the same time; then, according to the author, "the message conveyed has a chance to resound in an exceptionally clear and unambiguous manner" [24, p. 268]. The discussed part of the model is shown in the upper part of Fig. 1, while its lower part illustrates the feedback process, in which the sender (customer) selects one communication channel and transmits a message within it, which is then decoded by the receiver (company). Effective communication in an omnichannel environment requires, according to the author of the OMC model, planning communication in such a way as to "achieve synergy between messages, tools and objectives" [27, p. 47]. The aforesaid model illustrates only a fragment of the complexity of the digital marketing communications environment; modern marketing is undergoing a profound transformation, part of which is the integration of multiple tools, channels, technologies and modes of operation with the requirements and needs of the customer operating in an offline and online reality. It is therefore appropriate to conceptualise new models that take into account the impact of digitalisation on marketing activity, thus setting the scene for further research and practical applications.
4 Development of the Conceptual Model of Digital Marketing Communication of Higher Education Institutions Digital technology has made the so-called customer shopping path increasingly complex. Pasqua and Elkin [28, pp. 382–388] distinguish its four components: portability, preference, proximity and presence. Marketing has evolved from a monologue into a real-time dialogue with a single customer (with specific and unique needs), in which market players participate thanks to the development of social platforms and mobile technologies. Dependencies of this type will also exist in the market for higher education services. With a smartphone in hand, consumers everywhere are online 24/7, which Kall [29, p. 30] calls "the most important consequence of the massive spread of mobile phones"; moreover, thanks to the development of location-based services (LBS), companies can tailor their communication strictly to specific needs at a given moment. Consequently, customers' behaviour, habits and expectations are changing. This raises new challenges for today's managers, who must keep adapting their marketing communications strategy to the new multichannel and, subsequently, omnichannel reality. Based on the research results obtained from higher education institutions in Poland and Ukraine, the literature research and the authors' experience in online marketing, a conceptual model of the digital marketing communication of higher education institutions has been constructed. The concept is universal, applicable to any type of higher education institution regardless of its profile, form of ownership and country. The digital marketing communication of a higher education institution, according to the model shown in Fig. 2, can be multichannel in the first stage or omnichannel in the second stage.
Fig. 2 A conceptual model for digital marketing communication of higher education institution. Source Own work
Changes in consumer behaviour, as well as new technologies, according to Berman and Thelen, foster a shift from multichannel marketing to omnichannel marketing. Changes related to consumer behaviour include the increasing use of mobile devices, the widespread use of social media and the popularity of related software (e.g. applications). According to the cited authors, the large number of differences between multichannel marketing and omnichannel marketing indicates the complexity and multidimensional nature of omnichannel marketing. It also suggests that an organisation may be in the initial, intermediate or final stage of adopting an omnichannel marketing strategy [23, p. 598]. The presented model does not divide the multichannel and omnichannel stages into sub-periods related to the implementation phase, as this may be the subject of separate research. The sender transmits the communication (as total messages) via a cloud-based Student Data Platform (SDP). The SDP system is based on the Customer Data Platform (CDP), which, according to Kihn and O'Hara [30, p. 41], provides a place to store customer data and perform analysis, as well as a layer that takes abstract customer data and connects it to real-time systems to perform tasks such as managing interactions (in real time), making decisions and connecting content. The communication from the Student Data Platform is directed through multiple channels to the
recipient, who is a prospective student, a current student or a graduate of a higher education institution (for simplicity, the following description uses a single term: recipient, or student). The message in each channel is either identical for a given recipient (the omnichannel approach) or differs from channel to channel (the multichannel approach). The reception of a communication is itself multichannel, as a student may receive emails, text messages, social media notifications, etc., from a higher education institution; such opportunities are created by a multi- or omnichannel environment. The mix of tools used allows for additional synergies and a streamlined communication process, thanks to the data management processes offered by the CDP and the technical ability to coordinate them (linking of channels). Once the communication (total messages) has been received and decoded, interaction from the recipient follows. This process, as Gotwald [27, p. 47], inter alia, points out, "has a linear character and depends on the effectiveness of the first tool (and channel) that the consumer chooses to convey the message" (here, a student is treated as a customer). A student chooses a communication channel convenient for him or her at a given moment and sends a message to the university. His or her activity may include a response to a digital ad, a social media post, an email, a web site or a mobile app notification. In the next step, the feedback message is decoded by the higher education institution. In a multi- or omnichannel environment, this decoding is simplified, as it "primarily involves filters and amplifications, applied not at the level of need, but at the level of the perception of the relevance of the problem and the total value of the consumer" (that is, of the student to the higher education institution). At this point, one cycle is closed and another university–student interaction is possible. It should be emphasised that all the communication described takes place in the cloud, and the platform itself can also be described as software for managing and sharing data for other systems. The components of the Student Data Platform are the data, the Customer Data Platform database and the key communication channels. The data feeding the CDP come from the CRM (Customer Relationship Management) system or other databases at a higher education institution's disposal, as well as from digital communication channels. The model includes the basic ones, such as digital advertising, social media, email, mobile tools, web sites and video. The data to be acquired can be categorised as follows:
• first-party data: data derived from user interaction with content that the higher education institution publishes online (e.g. through forms, subscriptions, likes and activity on its own web sites);
• second-party data: data obtained through the cooperation of the higher education institution with its partners in the network (e.g. publishers of digital content aimed at the same target group);
• third-party data: external data collected by third parties and made commercially available.
The core element of the SDP is the Customer Data Platform. According to Kihn and O'Hara [30, p. 41], cited above, the customer data platform is not only a customer
information database, but it is also a real-time engagement system that makes split-second decisions with the support of five main processes:
1. Data processing: the aim of which is to build a unified user profile from the data collected in all systems over the years, integrated and tailored to further applications.
2. Segmentation: creating intelligent segments based on artificial intelligence and machine learning.
3. Personalisation: a process in which the planning and response phases are intertwined, enabling campaigns to be managed and decisions to be taken in real time. As part of this process, the marketing specialist can create the student experience as part of the academic customer journey, and develop rules (including machine learning-based decisions) to handle related events, such as an unfamiliar user appearing on a web site or mobile application.
4. Engagement: the ability to contact the student directly (e.g. by sending an email or a text message) or interact with systems that reach the recipient through the channel of their choice.
5. Optimisation: a process in which user conversion data and any other signals about user activity are captured, and then actions are taken based on the reported results to improve them. The main goal of optimisation is Real-time Interaction Management (RTIM). According to M. Kihn and Ch. O'Hara, "in a world where consumers move from channel to channel in near real time, RTIM is the way to connect and relate fast-moving experiences, driving relevance and delivering lift" [30, p. 113].
Customer data platforms are offered in the SaaS (software as a service) model, in the cloud, with full integration with the IT environment functioning in a given organisation. These platforms mark the next stage of the digital transformation of marketing, where the offline intersects with the online [31]. They are an evolution of CRM and Data Management Platforms (DMP), using Big Data resources and the latest AI and machine learning solutions to automate marketing activities and integrate them at an omnichannel level. Marketing automation based on AI and machine learning offers a new approach to the problem of information overload and the need to constantly analyse it in order to improve communication and streamline decision-making processes. The main property of machine learning technology is the ability to continuously learn and improve results based on experience, and to continuously collect and analyse data [32]. CDP solutions offered by IT companies [33–36] of different sizes and scales, as well as in the banking sector [37], can be successfully implemented in higher education institutions [38] (e.g. in marketing communication, recruitment and enrolment processes, and day-to-day student services).
5 Implementation of the Conceptual Model The implementation of the Student Data Platform as proposed requires the full support of the authorities of the higher education institution and strategic decisions at this level. The relationships between the key entities involved in the implementation and management of the SDP are shown in the lower part of Fig. 2. The Marketing Department (or the higher education institution's organisational unit responsible for marketing activities) is the initiator and main manager of the Student Data Platform. Modern marketing operates in real time, and instant feedback means that higher education marketing professionals should regularly use technology platforms to measure, analyse and tailor their activities to the needs and experiences of students operating across multiple digital media channels [39]. User expectations are moving towards personalisation and highly individualised communication; therefore, increasing demands will be placed on the technology applied and the skills of marketers. The challenge for a higher education institution using a digital marketing automation platform will be to define new roles for Marketing Department managers and specialists. The marketing manager will be responsible for leading the team and supervising processes and the implemented strategy (including ongoing projects and campaigns); coordinating cross-channel communication; defining the strategic direction; approving content; and analysing campaign metrics. The role of the specialist marketer will be to develop, plan and send marketing messages; analyse data for decision-making; audit projects [40]; and monitor and optimise ongoing campaigns. At this point, it is important to emphasise the need for a clear legal framework guaranteeing privacy rights for all users. Marketing departments of higher education institutions are responsible for operational contacts with the IT provider, while at the strategic level, including the scope and conditions of implementation of the CDP, the management role is performed by the authorities of the higher education institution in cooperation with the IT provider. The authorities of the higher education institution will be responsible, among others, for approving the content of major marketing communication campaigns in line with the adopted marketing strategy. Success in digital marketing depends on the ability of the marketing division to cooperate with the IT Department, which is responsible for assisting in the implementation of the platform, managing the infrastructure and carrying out day-to-day supervision of the functioning of the higher education institution's IT resources. Formally, IT Department employees report to the authorities of the higher education institution.
6 Conclusions The discussed outline of the dependencies and roles of various organisational units of a higher education institution is of a general and postulative nature. The authors’ intention was to highlight strategic and tactical aspects relevant to digital marketing
communication processes. The intertwining of marketing and IT competences, the role of the authorities of the higher education institution and the extent of outsourcing in the digital transformation of marketing indicate areas for further research. The ability to coordinate different channels in a marketing message, to personalise interactions and to deliver the right content at the right time requires organisations to increase their technological competence almost exponentially. The model of digital marketing communication of the higher education institution outlined above is a proposal for higher education marketing professionals and managers, as well as for those who supervise these areas. The impact of modern technologies on contemporary marketing broadens the scope of the discussion on the marketisation of higher education, as well as on the essence and role of the marketing activities of the higher education institution. In this context, it is highly plausible to argue that today's students expect to be active across all channels in which they are present, at the level of communication they have with their favourite brands.
References
1. K. Kania, Gamifikacja w procesie wprowadzania nowych technologii informatycznych do organizacji jako zadanie specjalistów HR. Zarządzanie Zasobami Ludzkimi 5, 27–44 (2018)
2. B. Filipczyk, Zarządzanie wiedzą i komunikacją cyfrową w procesie onboardingu studentów, in Cyfrowa komunikacja organizacji, ed. by B. Filipczyk, J. Gołuchowski (Wydawnictwo Uniwersytetu Ekonomicznego w Katowicach, Katowice, 2020)
3. P. Karthigaikumar, Industrial quality prediction system through data mining algorithm. J. Electr. Infor. 3(2), 126–137 (2021)
4. R. Rudman, R. Bruwer, Defining Web 3.0: opportunities and challenges. Electron. Libr. 34(1), 132–154 (2016)
5. S. Baker, The Numerati (Mariner Books Houghton Mifflin Harcourt, Boston–New York, 2009)
6. M.E. Porter, J. Heppelmann, Strategiczne podejście do rzeczywistości rozszerzonej. HBR Polska 4, 42–56 (2018)
7. L. Kiełtyka, O. Zygoń, Współczesne formy komunikacji – jak zarządzać z wykorzystaniem Internetu Rzeczy i Wszechrzeczy. Przegląd Organizacji 2, 24–33 (2018)
8. B.R. Arun Kumar, AI-based digital marketing strategies—a review, in Inventive Computation and Information Technologies, ed. by S. Smys, V.E. Balas, K.A. Kamel, P. Lafata. Lecture Notes in Networks and Systems, vol. 173 (Springer, Singapore, 2021), pp. 957–969
9. M. Zeeshan, K. Saxena, Explorative study of artificial intelligence in digital marketing, in Proceedings of the International Conference on Computer Networks, Big Data and IoT (ICCBI 2019), ed. by A. Pandian, R. Palanisamy, K. Ntalianis. Lecture Notes on Data Engineering and Communications Technologies, vol. 49 (Springer, Cham, 2019), pp. 968–978
10. Z. Malara, Y. Ziaeian, Marketing model in global companies: designing and management. Organ. Rev. 6(953), 23–30 (2019)
11. I. Oplatka, J. Hemsley-Brown, A systematic and updated review of literature on higher education marketing: 2005–2019, in Handbook of Operations Research and Management Science in Higher Education, International Series in Operations Research and Management Science, ed. by Z. Sinuany-Stern (Springer International, 2021), pp. 1–71. Preprint: https://www.researchgate.net/publication/351845192_A_systematic_and_updated_review_of_the_literature_on_higher_education_marketing_2005--2019
12. P. Williams, Assessing collaborative learning: big data, analytics and university futures. Assess. Eval. High. Educ. 42(6), 978–989 (2016)
13. S.S. Chaurasia, D. Kodwani, H. Lachhwani, M. Avadhut Ketkar, Big data academic and learning analytics: connecting the dots for academic excellence in higher education. Int. J. Educ. Manage. 32(6), 1099–1117 (2018)
14. P.D. Long, G. Siemens, Penetrating the fog: analytics in learning and education. Ital. J. Educ. Technol. 22(3), 132–137 (2014)
15. D. Chaffey, F. Ellis-Chadwick, Digital Marketing (Pearson Education Limited, Harlow, 2019)
16. A. Al-Thagafi, M. Mannion, N. Siddiqui, Digital marketing for Saudi Arabian University student recruitment. J. Appl. Res. High. Educ. 12(5), 1147–1159 (2020)
17. E. Bolat, K. Kooli, L.T. Wright, Businesses and mobile social media capability. J. Bus. Indus. Mark. 31(8), 971–981 (2016)
18. D. Chaffey, Digital marketing benchmarking templates. https://www.smartinsights.com/guides/digital-marketing-benchmarking-templates
19. B. Nguyen, X. Yu, T.C. Melewar, J. Chen, Brand innovation and social media: knowledge acquisition from social media, market orientation, and the moderating role of social media strategic capability. Ind. Mark. Manage. 51, 11–25 (2015)
20. W.Y.C. Wang, D.J. Pauleen, T. Zhang, How social media applications affect B2B communication and improve business performance in SMEs. Ind. Mark. Manage. 54, 4–14 (2016)
21. W.Y.C. Wang, M. Rod, S. Ji, Q. Deng, Social media capability in B2B marketing: toward a definition and a research model. J. Bus. Indus. Mark. 32(8), 1125–1135 (2017)
22. G. Nowak, Przyszłość marketingu to wszechkanałowość. Część 1. Geneza. https://marketerplus.pl/przyszlosc-marketingu-to-wszechkanalowosc-czesc-1-geneza
23. B. Berman, S. Thelen, Planning and implementing an effective omnichannel marketing program. Int. J. Retail Distrib. Manage. 46(7), 598–614 (2018)
24. B. Gotwald-Feja, Komunikacja marketingowa w realiach omnichannel – ujęcie modelowe. Marketing i Zarządzanie 1(47), 261–271 (2017)
25. B. Szymoniuk, Komunikacja marketingowa w klastrach i uwarunkowania jej skuteczności (Wydawnictwo Politechniki Lubelskiej, Lublin, 2019)
26. B. Szymoniuk, Sferyczny model zintegrowanej komunikacji marketingowej. Marketing i Zarządzanie 3(49), 193–208 (2017)
27. B. Gotwald, Komunikacja marketingowa w środowisku omnikanałowym. Potrzeby i zachowania konsumentów na rynku centrów nauki (Wydawnictwo Uniwersytetu Łódzkiego, Łódź, 2020)
28. R. Pasqua, N. Elkin, Godzina dziennie z mobile marketingiem (Helion, Gliwice, 2014)
29. J. Kall, Branding na smartfonie. Komunikacja mobilna marki (Wolters Kluwer Business, Warszawa, 2015)
30. M. Kihn, C.B. O'Hara, Customer Data Platforms: Use People Data to Transform the Future of Marketing Engagement (Wiley, 2020)
31. O. Prokopenko, O. Kudrina, V. Omelyanenko, ICT support of higher education institutions participation in innovation networks, in 15th International Conference on ICT in Education, Research and Industrial Applications, vol. 2387 (Kherson, 2019), pp. 466–471
32. I.P. Rutkowski, Inteligentne technologie w marketingu i sprzedaży – zastosowania, obszary i kierunki badań (Intelligent technologies in marketing and sales – applications, research areas and directions). Marketing i Rynek/Journal of Marketing and Market Studies 6, 3–12 (2020)
33. Microsoft. https://dynamics.microsoft.com/pl-pl/ai/customer-insights/what-is-a-customer-data-platform-cdp
34. IBM. https://www.ibm.com/pl-pl/analytics/data-ai-platform
35. Salesforce. https://www.salesforce.com/products/marketing-cloud/overview
36. SAP. https://apollogic.com/pl/2020/12/platforma-danych-klientow
37. S. Subarna, S. Smys, Big data analytics for improved risk management and customer segregation in banking applications. J. ISMAC 3(03), 235–249 (2021)
38. Higher Education Marketing Platform. https://www.salesforce.org/highered/marketing
39. O. Prokopenko, V. Omelyanenko, Marketing aspect of the innovation communications development. Innov. Mark. 14(2), 41–49 (2018)
40. A.K. Yadav, The substance of auditing in project system. J. Inf. Technol. Digital World 3(1), 1–11 (2021)
A Lightweight Image Colorization Model Based on U-Net Architecture Pham Van Thanh and Phan Duy Hung
Abstract For the problem of grayscale image colorization, many authors have proposed methods to produce the most plausible, vivid images from a gray input. Almost all of them introduce quite large neural network models with hundreds of megabytes of parameters. This paper proposes a relatively lightweight model with performance equivalent to recent methods. Our model is based on the well-known U-net architecture, which is frequently used for semantic segmentation problems. The model is trained to predict the chromatic ab channels given the lightness L channel in Lab color space to finally produce a colorful image. Our method applies self-supervised representation learning, where input and labeled output are different channels of the same image. Experiments on the commonly used PASCAL VOC 2012 and Places205 datasets show that our method has performance equivalent to other state-of-the-art algorithms, while the model size is relatively smaller and consumes less computing resources. Keywords Image colorization · U-net architecture · Self-supervised learning
1 Introduction Image colorization is the task of assigning a color to each pixel of a grayscale image. This is a difficult problem because two of the three channels of the ground truth image are lost. However, colorization is an interesting area with many applications. One of the most popular applications of image colorization is to colorize legacy images or videos. This helps people gain more knowledge about the colors of old images, and a historical movie is more vivid when it is colorized, giving viewers a better impression of what they see.
P. Van Thanh · P. Duy Hung (B) FPT University, Hanoi, Vietnam e-mail: [email protected] P. Van Thanh e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_7
Before the era of deep learning, most colorization methods required human intervention. Previous works [1] require color scribbles, while [2–4] require reference images. With the emergence of deep learning, many authors [5–11] introduced deep convolutional neural networks (CNNs) to solve this problem in a fully automatic approach; among them, [9] proposes a deep learning method for user-guided image colorization. Their objective is to colorize images with the most plausible colors instead of recovering the lost original colors. In deep learning approaches, authors propose deep CNNs to predict the chromatic ab channels in Lab color space given the lightness L channel. The CNNs usually have a contracting path that contains downsampling steps and an expanding path containing upsampling steps to grow the image back to its original resolution. These models perform well and generate images that can even fool humans in a "colorization Turing test," where people are asked to choose which is a generated image and which is the ground truth image [5]. Most authors highlight the importance of semantic colorization: this is the key concept for colorizing images plausibly. Furthermore, some authors [8, 10, 12–14] focus on the diversity of the generated colors. A car can be red, yellow, or white, and a bird can be green, blue, or black; their methods are supposed to predict more vivid images with diverse colors instead of assigning a single color to a class of objects. To improve performance, other authors like [15] present a memory-augmented model that can produce high-quality colorization with limited data, and [16] use a pre-trained Inception-ResNetv2 to extract features of images in parallel with a normal encoder. Su et al. [17] propose a method of instance-aware colorization: they leverage an off-the-shelf object detector to obtain object images and use an instance colorization network to extract object-level features. The above studies show that the problem of image colorization can be solved by different approaches, and many state-of-the-art algorithms colorize images with care for semantics and diversity. However, the use of deep CNN models prevents people from deploying a colorization application on computationally limited devices. A common property of the CNNs proposed by the mentioned authors is that they are quite large: the model of [5] has 129 MB of parameters, while [7] introduces a model consisting of 588 MB of parameters. Heavy models with too many parameters take a lot of time to colorize an image, and much more time for a video or a long movie. This motivates us to find a relatively lightweight model that performs as well as most recent models and is light enough to be deployed and run smoothly on regular devices. The main contribution of this study is to propose a lightweight model for semantic colorization with performance equivalent to state-of-the-art colorization methods. It predicts the ab channels given the lightness L channel and can colorize images of any resolution in a fully automatic manner. The model is inspired by a successful semantic segmentation architecture called U-net, proposed by Ronneberger et al. [18]. This work applies self-supervised representation learning, where raw images are used as the source of supervision.
2 Methodology This section of the paper presents the proposed architecture and the loss function. The model is supposed to predict the ab channels from a given L channel in the Lab color space. A pre-processing step converts each RGB image in the training set into a pair of input (L channel) and labeled output (ab channels). After prediction, the calculated ab channels are combined with the original lightness channel to produce the predicted Lab image. Images of any resolution are resized to 256 × 256 before processing and restored to their original resolutions after prediction.
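As an illustration of this pre-processing and post-processing, the sketch below uses scikit-image for the colour-space conversion; the function names are ours, not from the authors' code, and the value ranges follow scikit-image's Lab convention.

```python
import numpy as np
from skimage.color import rgb2lab, lab2rgb
from skimage.transform import resize

def make_training_pair(rgb):
    """Convert one RGB image into (input L channel, target ab channels)."""
    rgb = resize(rgb, (256, 256))   # images of any resolution are resized first
    lab = rgb2lab(rgb)              # L in [0, 100]; a, b roughly in [-128, 127]
    L = lab[..., :1]                # 256 x 256 x 1 network input
    ab = lab[..., 1:]               # 256 x 256 x 2 training target
    return L, ab

def assemble_prediction(L, ab_pred):
    """Recombine the original lightness with the predicted ab channels."""
    lab = np.concatenate([L, ab_pred], axis=-1)
    return lab2rgb(lab)             # back to a displayable RGB image
```

Restoring the output to the original resolution would then be another resize call on the predicted ab channels before recombination.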
2.1 The Architecture The U-net architecture was proposed by Ronneberger et al. [18], building on the work of [19]; it was originally used for biomedical image segmentation. The CNN proposed by this study is a simplified version of the U-net (Fig. 1). Like some other colorization methods, this network has a contracting path and an expanding path; however, skip connections are added to retain information that might otherwise be lost during downsampling. In Fig. 1, the number of channels is denoted on top of each box, while the resolution is on the left side. Inputs are 256 × 256 × 1 tensors, where 1 represents the lightness channel L. Outputs are 256 × 256 × 2 tensors representing the 2 chromatic channels a and b. In the contracting path, 3 × 3 kernels and ReLU activations are used for all convolutional layers, and 2 × 2 max pooling is used for downsampling.
Fig. 1 The simplified U-net architecture
Fig. 2 The proposed model
Table 1 Model size comparison of the proposed method and the others

Method               Parameters (MB)
Zhang et al. [5]     129
Larsson et al. [7]   588
Proposed model       55
Unlike [18], in this method, "same" padding is applied in all convolutional layers. In the expanding path, each upsampling step is done by an up-convolution that halves the number of channels, followed by concatenation with the corresponding cropped feature map from the contracting path and then a 3 × 3 convolution with ReLU activation and "same" padding. Finally, a 1 × 1 convolutional layer with 2 filters predicts the ab channels before they are combined with the original lightness L channel to produce a colorful Lab image. Figure 2 shows the proposed model, where each rectangular box represents a convolutional block. This model consists of 9 convolutional blocks and a 1 × 1 convolutional layer. It has 4.7 million parameters, equivalent to 55 MB of memory storage. Compared to other models, this is relatively small and can be called "lightweight." Table 1 shows the comparison, where the model size of [5] is approximately 2 times, and that of [7] 10 times, larger than the proposed model.
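The architecture described above can be sketched in Keras as follows. The filter widths and the use of two convolutions per block are illustrative guesses: they follow the nine-block layout and the 1 × 1 output head from the text, but are not tuned to reproduce the reported 4.7 million parameters exactly.

```python
import tensorflow as tf
from tensorflow.keras import layers

def conv_block(x, filters):
    # Two 3 x 3 convolutions with ReLU and "same" padding, as in the text
    for _ in range(2):
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return x

def simplified_unet(input_shape=(256, 256, 1)):
    inputs = tf.keras.Input(shape=input_shape)
    skips, x = [], inputs
    # Contracting path: conv blocks followed by 2 x 2 max pooling
    for f in (32, 64, 128, 256):
        x = conv_block(x, f)
        skips.append(x)
        x = layers.MaxPooling2D(2)(x)
    x = conv_block(x, 512)  # bottleneck block (9 conv blocks in total)
    # Expanding path: up-convolutions that halve the channels, concatenated
    # with the corresponding feature maps from the contracting path
    for f, skip in zip((256, 128, 64, 32), reversed(skips)):
        x = layers.Conv2DTranspose(f, 2, strides=2, padding="same")(x)
        x = layers.Concatenate()([x, skip])
        x = conv_block(x, f)
    # Final 1 x 1 convolution predicts the two chromatic ab channels
    outputs = layers.Conv2D(2, 1)(x)
    return tf.keras.Model(inputs, outputs)

model = simplified_unet()
model.compile(optimizer="adam", loss="mse")  # MSE loss of Sect. 2.2, Adam [20]
```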
2.2 Loss Function Unlike [5] and some other authors, who use custom loss functions, this work uses the mean-squared-error (MSE) loss to evaluate the predicted mask. The loss is calculated over the ab channels as follows:

$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2$$
where N represents the number of entries in the predicted mask, which is equal to 256 × 256 × 2.
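In TensorFlow this is simply the built-in mean-squared-error applied to the two-channel mask; the tensors below are random stand-ins for a batch of ground-truth and predicted ab masks.

```python
import tensorflow as tf

# Stand-in tensors with shape (batch, 256, 256, 2)
ab_true = tf.random.uniform((4, 256, 256, 2))
ab_pred = tf.random.uniform((4, 256, 256, 2))

# Averages the squared error over all N = 256 * 256 * 2 entries of each mask
loss = tf.keras.losses.MeanSquaredError()(ab_true, ab_pred)
```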
3 Implementation and Evaluation 3.1 Experiment Preparation The proposed model is trained and evaluated on two different datasets. The first is a common semantic segmentation dataset named PASCAL VOC 2012. This work merges the train/validation set and the test set into a bigger one and then splits it into 2 new subsets for training (31,792 images) and validation (1440 images). Evaluation on PASCAL VOC 2012 is used to compare the performance of the model to other methods whose metrics are also reported on this commonly used dataset. The second dataset is Places205, where 227,880 images are used for training and 12,101 images for evaluation. Training the model on Places205 shows how well the model works on large datasets. The model is built using the TensorFlow framework and trained with an Nvidia GeForce GTX 1650 GPU. Adam optimization [20] is adopted.
3.2 Training This work uses mini-batch training with a mini-batch size of 16. When the model is trained on Places205, a single epoch contains 14,243 iterations and takes 7500 s on average to complete. The validation loss reaches its lower limit after 140,000 iterations; approximately, the training process takes 20 h to reach the best performance. This fast learning enables researchers to make various changes to the model and quickly see their effect during the tuning process. On PASCAL VOC 2012, the model learns even faster: it reaches the lowest validation loss after only 18,000 iterations. However, after that point, the validation loss increases while the model still fits the training data better, indicating overfitting on such a small dataset as PASCAL VOC 2012. Validation-based early stopping [21] is adopted to terminate the training process when overfitting has begun.
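A sketch of this training setup with Keras callbacks is shown below; train_ds and val_ds are placeholder tf.data pipelines yielding (L, ab) pairs, model is the network from the earlier sketch, and the patience value is an assumption rather than a figure from the paper.

```python
import tensorflow as tf

# Validation-based early stopping [21]: halt once val_loss stops improving
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=3,                  # assumed patience, not stated in the paper
    restore_best_weights=True,
)

model.fit(
    train_ds.batch(16),          # mini-batch size 16, as reported
    validation_data=val_ds.batch(16),
    epochs=50,                   # upper bound; early stopping ends training sooner
    callbacks=[early_stop],
)
```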
3.3 Quantitative Comparison to Other Methods Model performance is compared to some state-of-the-art colorization models proposed by [5–7, 9, 10]. Two metrics used for evaluation are RMSE (Root Mean Squared Error) and PSNR (Peak Signal-to-Noise Ratio) measured on the predicted
Table 2 Quantitative evaluation

Method              RMSE     PSNR   Runtime (ms)
Zhang et al. [5]    12.37    22.81  349.65^b
Iizuka et al. [6]   10.66    24.20  –
Larsson et al. [7]  10.26^a  24.56  –
Zhang et al. [9]    10.52    25.52  1,182.15
Zhao et al. [10]    11.43    23.15  –
Proposed model      11.33    24.88  112.98

^a Bold values indicate the best performance in comparisons
^b Italic values indicate the metrics reported by this work
ab channels compared to the ground truth. The reports of [5–7, 10] are provided by the work of [10] while the metrics of [9] are reported by this study. Like [10], this work selects 1440 images from Pascal VOC 2012 validation set to evaluate. This study also reports the average prediction time of [5, 9], and the proposed model for speed comparison. This experiment is done on an Intel Core i7-10750H CPU to show how fast each method runs on a regular device without powerful GPUs. Table 2 shows the experiment results. The results of Table 2 show that [7] outperforms the others on RMSE. Zhang et al. [9] achieves the best performance regarding PSNR. The proposed model performs better than most of the others according to the PSNR value and in the middle rank regarding RMSE. In terms of prediction time, this model is 3 times faster than [5] and 10 times faster than [9].
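For reference, both metrics are straightforward to compute on the predicted ab channels. The sketch below is generic, not the paper's evaluation code; it assumes 8-bit arrays (peak value 255), a common convention when metrics are reported on quantized channels.

```python
import numpy as np

# RMSE and PSNR between ground-truth and predicted ab channels.
def rmse(gt, pred):
    diff = gt.astype(np.float64) - pred.astype(np.float64)
    return np.sqrt(np.mean(diff ** 2))

def psnr(gt, pred, peak=255.0):
    # PSNR in dB; higher is better, lower RMSE implies higher PSNR.
    return 20.0 * np.log10(peak / rmse(gt, pred))
```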
3.4 Qualitative Comparison To show how good the generated images are, the proposed model is trained on Places205 and the predictions are compared to the ground truth images. Figure 3 shows how closely they resemble each other. Experiments on Places205 indicate that the model predicts better on blue areas like the sky or the ocean, while it does not perform well on green objects like grass or trees. Therefore, in images of nature, there is less green in the predicted masks than in the ground truth. This work also compares the predicted images to the results of Zhang's methods [5, 9], two state-of-the-art models of semantic colorization. Figure 4 shows the comparison. Most of the images generated by the methods of [5, 9] have more green color than those of the proposed method, and even than the ground truth. This makes their images more vivid and able to fool people in a "colorization Turing test" [5]. However, this "green effect" sometimes makes the images look unreal. For example, in the images on the first row of Fig. 5, the railway in the predictions of [5, 9] is green while it is not in the ground truth. And in fact, a railway is rarely green!
Fig. 3 Predicted images of this work compared to the ground truths
In general, the images generated by this work are as close to the ground truth as those of Zhang's methods. However, because of the "green effect," Zhang's methods do better on images of nature where there is much green, while the proposed model does better on images of "less-green" things like cars, horses, lakes, and railways.
4 Conclusion and Future Works This paper proposed a neural network architecture for fully automatic image colorization. The model is a simplified version of the U-net architecture and is relatively lightweight compared to those of other authors. The predicted images are not as vivid as those of some state-of-the-art methods like [5, 9]; however, they come as close to the ground truth as those methods' results do. The model is several times smaller, so it runs fast even on a device with no GPU available. Therefore, it can be used to build instant video colorization applications where many image frames need to be colorized in an acceptable time. Predicted images from this work are quite grayish and sometimes desaturated. In the future, a custom loss function will be applied to the model to train for and predict more
Fig. 4 The predictions of the proposed method compared to Zhang's and the ground truth. From left to right: grayscale image; predictions of [5, 9]; proposed model; and the ground truth
colorful and saturated images. Furthermore, the proposed architecture can be used as the convolutional part of complicated colorization models where various additional techniques are adopted to achieve the best performance on particular aspects. The paper can be a good reference for many machine learning problems [22–24].
Fig. 5 The “green effect” in images generated by Zhang’s methods
References
1. A. Levin, D. Lischinski, Y. Weiss, Colorization using optimization. ACM Trans. Graph. 23(3), 689–694 (2004)
2. G. Charpiat, M. Hofmann, B. Schölkopf, Automatic image colorization via multimodal predictions, in Computer Vision-ECCV (Springer, 2008), pp. 126–139
3. X. Liu, L. Wan, Y. Qu, T.T. Wong, S. Lin, C.S. Leung et al., Intrinsic colorization. ACM Trans. Graph. 27(5), 152 (2008)
4. R.K. Gupta, A.Y.S. Chia, D. Rajan, E.S. Ng, H. Zhiyong, Image colorization using similar images, in Multimedia (2012)
5. R. Zhang, P. Isola, A.A. Efros, Colorful image colorization, in ECCV (2016)
6. S. Iizuka, E. Simo-Serra, H. Ishikawa, Let there be color!: Joint end-to-end learning of global and local image priors for automatic image colorization with simultaneous classification. ACM Trans. Graph. 35(4), 110 (2016)
7. G. Larsson, M. Maire, G. Shakhnarovich, Learning representations for automatic colorization, in ECCV (2016)
8. A. Royer, A. Kolesnikov, C.H. Lampert, Probabilistic image colorization, in BMVC (2017)
9. R. Zhang, J.Y. Zhu, P. Isola, X. Geng, A.S. Lin, T. Yu et al., Real-time user-guided image colorization with learned deep priors, in SIGGRAPH (2017)
10. J. Zhao, J. Han, L. Shao, C.G.M. Snoek, Pixelated semantic colorization. Int. J. Comput. Vision 128, 818–834 (2020)
11. P. Vitoria, L. Raad, C. Ballester, ChromaGAN: adversarial picture colorization with semantic class distribution, in WACV (2020)
12. A. Deshpande, J. Lu, M.C. Yeh, M.J. Chong, D. Forsyth, Learning diverse image colorization, in CVPR (2017)
13. S. Messaoud, D. Forsyth, A.G. Schwing, Structural consistency and controllability for diverse colorization, in ECCV (2018)
14. Y. Wu, X. Wang, Y. Li, H. Zhang, X. Zhao, Y. Shan, Towards vivid and diverse image colorization with generative color prior, in ICCV (2021)
15. S. Yoo, H. Bahng, S. Chung, J. Lee, J. Chang, J. Choo, Coloring with limited data: few-shot colorization via memory augmented networks, in CVPR (2019)
16. P.A. Kalyan, R. Puviarasi, M. Ramalingam, Image colorization using convolutional neural networks, in ICRTCCNT (2019)
17. J.W. Su, H.K. Chu, J.B. Huang, Instance-aware image colorization, in CVPR (2020)
18. O. Ronneberger, P. Fischer, T. Brox, U-net: convolutional networks for biomedical image segmentation, in MICCAI (2015)
19. J. Long, E. Shelhamer, T. Darrell, Fully convolutional networks for semantic segmentation, in CVPR (2015)
20. D.P. Kingma, J. Ba, Adam: a method for stochastic optimization, in ICLR (2015)
21. L. Prechelt, Early stopping—but when? in Neural Networks: Tricks of the Trade. Lecture Notes in Computer Science, vol. 7700, ed. by G. Montavon, G.B. Orr, K.R. Müller (Springer, Berlin, Heidelberg, 2012)
22. N.T. Su, P.D. Hung, B.T. Vinh, V.T. Diep, Rice leaf disease classification using deep learning and target for mobile devices, in Proceedings of International Conference on Emerging Technologies and Intelligent Systems, ICETIS (2021)
23. P.D. Hung, N.T. Su, Unsafe construction behavior classification using deep convolutional neural network. Pattern Recognit. Image Anal. 31, 271–284 (2021)
24. N.Q. Minh, P.D. Hung, The system for detecting Vietnamese mispronunciation, in Future Data and Security Engineering. Big Data, Security and Privacy, Smart City and Industry 4.0 Applications. FDSE 2021. Communications in Computer and Information Science, vol. 1500, ed. by T.K. Dang, J. Küng, T.M. Chung, M. Takizawa (Springer, Singapore, 2021)
Comparative Analysis of Obesity Level Estimation Based on Lifestyle Using Machine Learning R. Archana and B. Rajathilagam
Abstract Obesity is a global epidemic in which excessive body fat increases the risk of health problems. It is a major contributor to the burden of chronic disease and disability, affecting people of all ages and genders. The study focuses on assessing the obesity level of individuals between 14 and 60 years based on their dietary behavior, lifestyle pattern, and physical condition. Supervised machine learning algorithms were used to develop a predictive model, and the factors that have a major impact on developing obesity were analyzed. Compared to other traditional methods, the models using boosting algorithms achieved better performance in predicting various levels of obesity. Keywords Food security · Obesity · Machine learning · Random forest · Ensemble model
1 Introduction Food security is the ability of people from all sections of society to access and consume an adequate amount of nutritious and safe food at all times, to meet their dietary needs and food preferences for an active life [1]. Food insecurity and obesity are strongly correlated: when one rises, so does the other. Obesity is a medical disorder in which the accumulation of a high volume of body fat raises the risk of multiple disease progression, severe impairment, and even premature death [2]. The deficiency of nutritious intake and the prevalence of urban food deserts are major factors in obesity. People on low incomes suffer from a lack of nutritious food, a budget-constrained market food basket, and an increase in the release of stress R. Archana (B) · B. Rajathilagam Department of Computer Science and Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India e-mail: [email protected] B. Rajathilagam e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_8
hormones, which make them easily prone to obesity. On the contrary, people living in high socioeconomic regions have more access to low-nutrition junk foods, which also leads to obesity. Obesity also decreases the quality of life: a person with obesity is likely to have an inferiority complex that affects their mental health and decreases their cognitive skills. Thereby, being food insecure is a direct threat to the very development of society. Obesity has reached epidemic proportions globally, with at least 4 million people dying each year as a result of being overweight or obese. The Food and Agriculture Organization (FAO) report shows that between 2000 and 2016 there was a steadier increase in the prevalence of obesity than of overweight, across all ages and especially among school children and adults [3]. When an individual's Body Mass Index (BMI) is greater than 25, he/she is said to be overweight, and above 30 is considered obese. BMI is determined by dividing the weight in kilograms by the square of the height in meters. Obesity is generally caused by increased intake of high-caloric or processed food, poor diet, lack of physical activity, and genetic or other medical disorders [4]. The lifestyle, mode of travel, and eating habits of individuals with sedentary work have a greater impact on the development of excessive body fat [5]. De-La-Hoz-Correa et al. [5] and Cervantes et al. [6] mention other factors that lead to obesity, such as "being an only child, familiar conflicts such as divorce of parents, depression, or anxiety." As the Centres for Disease Control and Prevention (CDC) has stated, obesity is considered an immediate health challenge and also a winnable battle. As the obesity prevalence in individuals increases, the development of a computational approach becomes the need of the hour. Based on prior study, estimating the obesity level from BMI alone, omitting other factors like family history, eating habits, and physical condition, does not necessarily uncover the possibility of becoming obese. The COVID pandemic has increased the sedentary nature of many people around the globe. It has changed the overall lifestyle of people, leaving them with limited choices for physical activity: there are restrictions on their habits of walking, exercise, and outdoor activities, and most have shifted to a work-from-home pattern for their livelihood. These changes have a huge impact on the increase of obesity, not just among adults but also among children. Therefore, a study to estimate the obesity level in individuals of all ages is crucial. This study uses several machine learning algorithms to predict obesity level in individuals based on their eating habits and physical condition. The early prediction can help to avoid certain diseases by modifying lifestyle and eating patterns. The structure of the paper is as follows: studies with a similar background are discussed in Sect. 2, the data samples and methodology used for the experimentation process are described in Sect. 3, the experimental analysis in Sect. 4, the results obtained from the techniques in Sect. 5, and the conclusion of the study is discussed in Sect. 6.
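As a worked example of the BMI formula (the values here are chosen purely for illustration and are not study data): a person weighing 70 kg at a height of 1.75 m has BMI = 70 / (1.75)^2 ≈ 22.9 kg/m^2, which falls in the normal-weight band; at 95 kg the same person would have BMI ≈ 31.0 and would be classed as obese.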
2 Related Work De-La-Hoz-Correa [5] presented a study to estimate obesity levels using decision trees. 712 data samples from students between 18 and 25 years from Colombia, Mexico, and Peru were used, and six levels of obesity were taken into consideration. The authors used the SEMMA data mining methodology, and three methods were selected: decision trees, Bayesian networks, and logistic regression. The decision tree was the best model, obtaining a precision of 97%. Software built with NetBeans was deployed to classify the patients with obesity. Cervantes [6], in his study to estimate obesity levels, used data samples of 178 students of Colombia, Mexico, and Peru between 18 and 25 years old. Using the WEKA tool, algorithms like decision tree and Support Vector Machine (SVM) were used to train the model, and simple K-Means was selected as the clustering method for validation. The study obtained a precision of 98.5%. The authors of [7] used Naïve Bayes and a genetic algorithm for predicting obesity in children. 19 parameters were used for the study, with the genetic algorithm used to optimize these parameters. This hybrid approach achieved an accuracy of 75%. In order to reduce childhood obesity in Malaysia, the authors of [8] used Naïve Bayes to identify children who are prone to be obese. They developed a knowledge-based system to suggest suitable menus for improving health among school children; this system has a precision of 73.3%. Garg et al. [9] built a framework using Python Flask. They leveraged various machine learning algorithms to predict obesity level, body weight, and fat percentage level. Several hyperparameter optimization algorithms such as genetic algorithms, Random Search, Grid Search, and Optuna were used to improve the performance of the model. They included many features like customizable diet plans, workout plans, and a dashboard to track a person's progress. Based on the nutritional facts of food intake and health status, Manoharan [10] developed a patient recommendation system that automatically suggests the food diet to be followed based on the patient's health condition. The study introduced a k-clique embedded deep learning classifier recommendation system. Data with thirteen features from patients having various disorders were collected over the Internet and through hospitals, and machine learning techniques were used to compare the proficiency of the k-clique deep learning classifier.
3 Methodology To predict the obesity level, a machine learning model has been developed, as depicted in Fig. 1. The data that satisfies the requirements of the study was identified, and the output variable was defined. The data was preprocessed, and various
Fig. 1 Flowchart of the methodology
machine learning algorithms were applied. The hyperparameters were fine-tuned, and the model was validated. The study reaches its purpose when the model can correctly classify and predict the samples that are prone to a high risk of obesity, for instance Obesity class III. Body Mass Index (BMI) cannot be considered a sole indicator for predicting the obesity level of an individual; the BMI of an athlete or a bodybuilder will be much higher than normal. Therefore, other factors that indicate the physical condition and the daily eating habits should be considered while estimating the obesity level.
Table 1 WHO classification of obesity

BMI (kg/m^2)  Nutritional status    Risk
Below 18.5    Insufficient weight   Low (but risk of other clinical problems increased)
18.5–24.9     Normal weight         Average
25.0–26.9     Overweight level I    Increased
27.0–29.9     Overweight level II   Increased
30.0–34.9     Obesity class I       Moderate
35.0–39.9     Obesity class II      Severe
Above 40.0    Obesity class III     Very severe
3.1 Data The study uses data from individuals aged between 14 and 61 years from Colombia, Peru, and Mexico, collected through a survey [11]. The dataset contains 17 variables and 2111 observations, collected through an online survey using a Web platform. The dataset was balanced using the Synthetic Minority Oversampling Technique (SMOTE) filter, a tool that Weka uses to generate synthetic data. 48% of the samples were students aged between 14 and 22, there was an equal distribution of male and female respondents, and 46% of the samples were collected from people suffering from some level of obesity. The drivers used for the estimation of obesity level are eating habits and physical condition. The attributes based on eating habits are frequent consumption of high caloric food, vegetable intake, number of main meals per day, consumption of water, food between meals, and drinking or smoking habits. Attributes based on physical condition are calorie consumption monitoring, physical activity frequency, time using technology devices, and transportation used. The attributes based on an individual's characteristics are gender, age, height, weight, and family history of overweight. The obesity levels are categorized based on the WHO classification of different degrees of BMI [12], as given in Table 1. The class distribution of the target variable is given in Fig. 2. Data cleaning and pre-processing were done on the selected dataset: the ordinal data it contained was converted using the label encoding method, atypical and missing data were handled, and the correlation levels between the attributes were checked.
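A minimal sketch of this pre-processing step is shown below. The file name and target column are assumptions for illustration; the published dataset was already balanced with SMOTE in Weka, so only the encoding step is sketched here.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Label-encode the ordinal/categorical attributes before modeling.
df = pd.read_csv("obesity_lifestyle.csv")   # hypothetical file name
for col in df.select_dtypes(include="object").columns:
    df[col] = LabelEncoder().fit_transform(df[col])

# Split features from the 7-class target (column name assumed).
X = df.drop(columns=["NObeyesdad"])
y = df["NObeyesdad"]
```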
3.2 Proposed Model The research hypothesized that supervised machine learning algorithms can be used to predict the obesity level of an individual provided the details about their
Fig. 2 Target variable class distribution (Insufficient weight, Normal weight, Overweight level I, Overweight level II, Obesity class I, Obesity class II, Obesity class III)
lifestyle. The study also focuses on improving the accuracy of prediction; the term "prediction" implies estimating the output class of unseen input data. Ensemble modeling is the process of using multiple independent models to predict an output class. The result of such a model is more accurate than individual models, as it achieves "the wisdom of the crowd." In our case, ensemble methods are leveraged to increase the predictive accuracy and to correctly identify a person with a greater tendency towards obesity. A tree-based ensemble method can also identify the most important features that have a larger impact on the obesity level. Advanced ensemble modeling techniques including bagging and boosting are used for the study. Bagging: Bagging is an ensemble meta-estimator designed to improve the performance of machine learning algorithms used in both classification and regression. It creates multiple models and combines them to generate a generalized result. Bagging, or bootstrapping, is a sampling method used to subdivide the original dataset into bags or subsets with replacement. Weak base models c1, c2, c3, …, cn are built on these bootstrap samples, and the final estimator is derived from the combination of all these base models by majority vote or by averaging the predicted classes. A bagging classifier can be applied to any classification algorithm, especially decision trees and neural networks, to improve accuracy [13]. Random Forest: Random Forest is a tree-based ensemble learning method that uses the bagging principle with a randomization technique called random feature selection. It consists of multiple independent decision trees, each trained on a bootstrap sample of the original dataset. With the decision tree induction method, it randomly splits the dataset and selects the best split for building the model [14]. Although decision trees have many advantages, they are prone to over-fitting; Random Forest limits over-fitting without increasing the error. In classification, the prediction function f(x) is the most frequently predicted class [15]
Table 2 Hyperparameter tuning values

S. No  Approach     Hyperparameters                                              Accuracy
1      Grid search  n estimators: 200, max features: 2, max depth: 80,           93
                    min samples split: 9, min samples leaf: 2, bootstrap: True
given in Eq. (1). To make a prediction for a new input point x, the following equation is used, where h_j(x) is the prediction of the output variable by the jth tree:

f(x) = \arg\max_{y \in Y} \sum_{j=1}^{J} I\left( y = h_j(x) \right)   (1)

3.2.1 Hyperparameter Tuning
Hyperparameter tuning is a technique used to optimize the parameter values of learning algorithms so as to reduce the overall cost function and improve the model's behavior. It helps to generalize the model, with the best performance, to unseen data. In our study, the holdout validation approach is used: part of the dataset is split off and held aside as a holdout set that is used later to test the performance of the model. The whole dataset was split into a training set and a holdout set, with the holdout set left untouched during the training phase. The model is trained on the training data, hyperparameters are tuned on the same data for better performance, and the model is then validated using the holdout set. Grid Search is one of the approaches used to tune the hyperparameters: it takes all possible combinations of parameter values in the form of a grid, evaluates them, and returns the best among them. A variety of key parameters are tuned to define an optimal model with a precision of 93%; the key parameters and their best values found after hyperparameter tuning are given in Table 2. The model is at its best performance when the gap between the training score and the validation score is small. As seen in Fig. 3, the validation score is much lower than the training score, which makes the model moderate in its performance. The classification report of the Random Forest model is given in Table 3. The model performs well when predicting the target class Obesity class III, which is the most severe state of obesity; the overall performance of the model can be stated as good but not the best. Boosting: Boosting is an iterative process of creating a team of weak, less accurate classifiers that together form a strong, accurate prediction rule. Each subsequent model tends to rectify the errors of the previous model. Gradient boosting gradually trains different models in an additive and sequential manner. It identifies the shortcomings of a weak classifier by using gradients of the loss function, which is a measure of the fit of the model's coefficients over the data.
Fig. 3 Validation curve for Random Forest
Table 3 Classification report of Random Forest

Output class             Precision  Recall  F1 score  Support
0 = Insufficient weight  0.97       0.97    0.97      61
1 = Normal weight        0.76       0.87    0.81      45
2 = Obesity class I      0.97       0.92    0.95      79
3 = Obesity class II     0.95       0.98    0.96      54
4 = Obesity class III    0.99       0.99    0.99      63
5 = Overweight level I   0.93       0.89    0.91      61
6 = Overweight level II  0.95       0.93    0.94      60
In our study, the loss function is measured by the correct classification of severe obesity (Obesity class III). Histogram-based gradient boosting: As gradient boosting is a sequential process of training and adding models, training can become slow. To increase the efficiency of gradient boosting, the data or attributes are reduced by binning the values of continuous attributes in histogram-supported gradient boosting. "Instead of finding the split points on the sorted feature values, histogram-based algorithm buckets continuous feature values into discrete bins and uses these bins to construct feature histograms during training" [16]. Histogram-based gradient boosting therefore fits faster on the training data. The model is evaluated using repeated k-fold cross validation; a single model is then used to fit the data, and the mean accuracy is reported. The classification report is given in Table 4. As shown, the prediction accuracy is much higher compared to the other models, and the model performs well in estimating all obesity classes. We hypothesized that model performance is best if it can correctly classify or predict the worst state of obesity (Obesity class III), which has the highest risk.
Table 4 Classification report of histogram-based gradient boosting

Output class             Precision  Recall  F1 score  Support
0 = Insufficient weight  0.99       0.97    0.98      61
1 = Normal weight        0.96       0.99    0.98      45
2 = Obesity class I      0.99       0.97    0.98      79
3 = Obesity class II     0.96       0.99    0.98      54
4 = Obesity class III    0.99       0.99    0.99      63
5 = Overweight level I   0.98       0.98    0.98      61
6 = Overweight level II  0.98       0.97    0.97      60
The proposed model performed well not only in predicting the extreme case of obesity but also in estimating all the levels of obesity with almost equal predictive rates.
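The histogram-based gradient boosting evaluation described above can be sketched with scikit-learn as follows. The fold and repeat counts are assumptions, not the paper's exact settings, and `X`, `y` are the pre-processed feature matrix and target from Sect. 3.1.

```python
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

# Histogram-based gradient boosting with repeated k-fold cross validation;
# the mean accuracy across folds/repeats is reported.
clf = HistGradientBoostingClassifier(random_state=42)
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=42)
scores = cross_val_score(clf, X, y, scoring="accuracy", cv=cv)
print(f"mean accuracy: {scores.mean():.3f} (+/- {scores.std():.3f})")
```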
4 Experimental Analysis 4.1 Model Comparison In previous studies, the researchers De-La-Hoz-Correa et al. [5] and Cervantes et al. [6] limited their models to estimating the obesity level among students between 18 and 25 years, and the levels of obesity addressed were also limited. It is always crucial to know the exact stage of the body's nutritional status, as the effort needed to maintain a healthy level, and the dietary or lifestyle pattern to be followed, varies at each stage. Thereby, our study focused on addressing all the possible levels of nutritional status, which leads to seven classes of the target variable. The purpose of our study was to improve the prediction rate of previous research works with a similar background. The hypothesis was that advanced machine learning algorithms like ensemble methods can predict all stages of nutritional status for people of all ages with better performance. We used various other machine learning algorithms, like Support Vector Machine (SVM), Naïve Bayes, and Decision Trees, that have been used in previous works to compare the performance of our model. The widely used hyperparameters are: in SVM, the kernel, regularization parameter (C), and kernel coefficient (gamma); in the decision tree classifier, the criterion, max depth, and number of components; and in Random Forest classifiers, the number of estimators, max features, max depth, min samples split, min samples leaf, and bootstrap. We fine-tuned these parameters for better predictive analysis; the parameter values after tuning, along with the precision rates, are given in Table 5. The Receiver Operating Characteristic (ROC) curve is used to visualize the performance of each classifier; Fig. 4 depicts the ROC curves of the various supervised machine learning algorithms.
Table 5 Hyperparameter tuning

S. No  Classifier     Hyperparameters                                              Precision
1      SVM            C: 1000, gamma: 1, kernel: linear                            95
2      Decision tree  criterion: entropy, max depth: 12, number of components: 12  96
3      Random Forest  n estimators: 200, max features: 3, max depth: 80,           93
                      min samples split: 8, min samples leaf: 3, bootstrap: True
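The grid search behind Table 5 can be sketched as below. The candidate values form a reduced, illustrative grid around the reported optimum (not the study's full grid), and `X_train`, `y_train` denote the training split held apart from the holdout set described earlier.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Exhaustive grid search over Random Forest hyperparameters.
param_grid = {
    "n_estimators": [100, 200, 300],
    "max_features": [2, 3, 4],
    "max_depth": [60, 80, 100],
    "min_samples_split": [8, 9, 10],
    "min_samples_leaf": [2, 3],
    "bootstrap": [True],
}
search = GridSearchCV(RandomForestClassifier(random_state=42),
                      param_grid, scoring="precision_weighted", cv=5)
search.fit(X_train, y_train)
print(search.best_params_)   # best combination found on the grid
```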
Fig. 4 ROC curve of supervised algorithms
From Fig. 4, it is evident that the precision of Random Forest and histogram-based gradient boosting are almost equal. However, the latter is considered the best model, as the variation in predictive rate between the holdout set and the training set is smaller than that of Random Forest. The multiclass ROC curve is given in Fig. 5. The One vs Rest strategy is used in multilabel classification to fit one classifier per class; this improves interpretability and gives knowledge about each label independently against the rest. As shown in Fig. 5, class 4 has the leading rate, which helped the model predict class 4, i.e., Obesity class III, with the highest predictive rate.
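A One vs Rest ROC computation of this kind can be sketched as follows; `clf` is assumed to be a fitted classifier exposing predict_proba, and `X_test`, `y_test` the holdout split.

```python
import numpy as np
from sklearn.metrics import roc_curve, auc
from sklearn.preprocessing import label_binarize

# One vs Rest ROC: binarize the 7-class target, then compute one curve
# per class from the predicted class probabilities.
classes = np.unique(y)
y_test_bin = label_binarize(y_test, classes=classes)
y_score = clf.predict_proba(X_test)
for i, cls in enumerate(classes):
    fpr, tpr, _ = roc_curve(y_test_bin[:, i], y_score[:, i])
    print(f"class {cls}: AUC = {auc(fpr, tpr):.3f}")
```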
Fig. 5 Multiclass ROC curve
4.2 Evaluation Measures Precision is used to compare the proposed model with the accuracy reported in previous works. Other evaluation metrics like recall and F1 score are also taken into consideration. Precision gives the proportion of positive identifications that were actually correct; thus, it is the proportion of true positives compared to all predicted positives [17]. Recall gives the proportion of actual positives that were identified correctly; thus, it is the proportion of true positives compared to the samples that should have been positive. To measure the correctness of a model, both precision and recall should be considered. The F1 score is the harmonic mean of precision and recall [17]. To evaluate the obesity level estimation model, the following counts are used (a small computation sketch follows this list):
• True Positive (TP) = number of obesity data predicted as obese
• True Negative (TN) = number of non-obese data predicted as non-obese
• False Positive (FP) = number of non-obese data predicted as obese
• False Negative (FN) = number of obesity data predicted as non-obese
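Following the forward reference above, the three metrics reduce to a few lines given the counts; the numbers in the usage line are illustrative, not study results.

```python
# Precision, recall, and F1 from the per-class counts defined above.
def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):
    return tp / (tp + fn)

def f1(tp, fp, fn):
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)   # harmonic mean of precision and recall

print(round(f1(tp=62, fp=1, fn=1), 3))  # example counts only
```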
5 Results and Discussion The supervised algorithms SVM, Decision Tree, Random Forest, and gradient boosting were used to develop the proposed work. The results of each model are
compared based on the evaluation metrics, and it is found that the best result is obtained by histogram-based gradient boosting. The results of the classifiers are given in Fig. 6. The result of Obesity class III classification, which is marked severe, has accuracy greater than 90% in all classifiers. The Area under the ROC Curve (AUC), which measures the ability of classifiers to identify various classes, has a score of 1.00. Using the Random Forest classifier, the six most important features that influence the rate of obesity or the nutritional status, other than height and weight, are found to be frequent consumption of high caloric food, age, consumption of food between meals, mode of transport, smoking, and physical activity frequency. The features and their relative scores are given in Fig. 7. Table 6 depicts the comparison summary between traditional models and the proposed work.

Fig. 6 Model evaluation (accuracy, precision, recall, and F1 score for Decision tree, Random forest, SVM, and histogram-based GB)

Fig. 7 Feature importance
Comparative Analysis of Obesity Level Estimation …
113
Table 6 Comparative summary Traditional approach
Proposed work
1. Study on prevalence of obesity among children and students aged between 18 and 25 years 2. Machine learning methods like • Decision trees, Bayesian networks, and Logistic Regression [5] • Decision Tree, Support Vector Machine (SVM), and Simple K-Means [6] • Random Forest, Decision Tree, XGBoost, Extra Trees, and KNN [9] were used to estimate the level of obesity 3. The two stages of pre-obese is fused into one label, limiting the nutritional status to six classes [5] • Four gender-specific clusters were formed that segregate people who are prone/ not prone to overweight and unsupervised technique is implemented [6] 4. Models with best performance: Random Forest (84%) [9], Decision Tree (97%) [5], Naïve Bayes (75%) [7], and Naïve Bayes (73%) [8]
1. Study on prevalence of obesity is extended to a wider range of age-groups 2. The model leveraged from supervised machine learning algorithms like logistic regression, decision tree, support vector machine, KNN, Random Forest, and gradient boosting 3. The obesity levels are categorized based on WHO classification on different degrees of BMI, resulting in seven classes of nutritional status. Addressing exact level of obesity is always crucial to take proportionate precautions especially at the pre-obese level 4. All classifiers used in the traditional methods are built with wider range of age-group. Hyperparameter optimization was used to improve the results of previous works. For instance, the precision of Random Forest was 86% in the traditional methods [9] and is improved to 93% in proposed work 5. The model with best predictive accuracy was histogram-based gradient boosting (98%) 6. There was an extension in the obesity classification labels and the range of population addressed. The model was able to obtain a good performance using boosting algorithms without stepping into deep learning models
6 Conclusion The COVID pandemic has brought a severe change to most of our lives. The movement of people around the world was restricted within four walls; the decrease in physical activity triggered negative bodily reactions, and the prevalence of obesity increased within no time. Thereby, a study on nutritional status is all the more crucial, as it can trigger early actions that save lives and reduce suffering. Machine Learning is a method of recognizing trends in information and utilizing them to make automatic predictions or decisions. The early prediction of obesity is crucial, since 39% of adults globally are overweight and 13% are obese. In our research, histogram-based gradient boosting achieved a precision of 98% in classifying obesity levels, and the result of Obesity class III classification, which is marked severe, has accuracy greater than 99% with this model. This study has surpassed the results obtained in [7] (75% precision), in [8] (73% precision), and in [5] (97% precision). The proposed work can be used to identify individuals with a tendency to suffer from obesity in advance.
Acknowledgements We thank Innovation in Science Pursuit for Inspired Research (INSPIRE) managed by Department of Science and Technology for supporting the research.
References
1. Food security, International Food Policy Research Institute Report (2021)
2. F. Ofei, Obesity-a preventable disease. Ghana Med. J. 39(3), 98 (2005)
3. WHO Report, The state of food security and nutrition in the world (2019)
4. N. H. Service, "Obesity," National Health Service (2019)
5. E. De-La-Hoz-Correa, F. Mendoza Palechor, A. De-La-Hoz-Manotas, R. Morales Ortega, A.B. Sánchez Hernández, Obesity level estimation software based on decision trees (2019)
6. R. Cañas Cervantes, U. Martinez Palacio, Estimation of obesity levels based on computational intelligence (2020)
7. M.H.B. Muhamad Adnan, W. Husain, N. Abdul Rashid, A hybrid approach using naïve Bayes and genetic algorithm for childhood obesity prediction, in 2012 International Conference on Computer Information Science (ICCIS), vol. 1 (2012), pp. 281–285
8. W. Husain, M.H.M. Adnan, L.K. Ping, J. Poh, L.K. Meng, MyHealthyKids: intelligent obesity intervention system for primary school children, in The 3rd International Conference on Digital Information Processing and Communications (ICDIPC2013) (2013)
9. S. Garg, P. Pundir, MOFit: a framework to reduce obesity using machine learning and IoT, in 2021 44th International Convention on Information, Communication and Electronic Technology (MIPRO) (2021), pp. 1733–1740. https://doi.org/10.23919/MIPRO52101.2021.9596673
10. S. Manoharan, Patient diet recommendation system using K clique and deep learning classifiers. J. Artif. Intell. 2(02), 121–130 (2020)
11. E. De-La-Hoz-Correa, F. Mendoza Palechor, A. De-La-Hoz-Manotas, R. Morales Ortega, A.B. Sánchez Hernández, Obesity level estimation software based on decision trees. J. Comput. Sci. 15(1), 67–77 (2019)
12. P.T. James, R. Leach, E. Kalamara, M. Shayeghi, The worldwide obesity epidemic. Obes. Res. 9(S11), 228S–233S (2001)
13. D. Lavanya, K.U. Rani, Ensemble decision tree classifier for breast cancer data. Int. J. Inf. Technol. Convergence Serv. 2(1), 17 (2012)
14. S. Bernard, L. Heutte, S. Adam, Influence of hyperparameters on random forest accuracy, in International Workshop on Multiple Classifier Systems (Springer, 2009), pp. 171–180
15. A. Cutler, D.R. Cutler, J.R. Stevens, Random forests, in Ensemble Machine Learning (Springer, 2012), pp. 157–175
16. G. Ke, Q. Meng, T. Finley, T. Wang, W. Chen, W. Ma, Q. Ye, T.-Y. Liu, LightGBM: a highly efficient gradient boosting decision tree. Adv. Neural Inf. Process. Syst. 30, 3146–3154 (2017)
17. Google, Classification: Precision and Recall [WWW Document] (2019). https://developers.google.com/machine-learning/crash-course/classification/precision-and-recall
An Empirical Study on Millennials’ Adoption of Mobile Wallets M. Krithika and Jainab Zareena
Abstract Nowadays, Millennials are good opinion leaders on technology because they are comfortable with new technology and social media. As Millennials gain prominence in Chennai, it is important to explore their adoption behaviour towards mobile wallets; the conceptual model developed for the study is based on the TAM and was tested in the study. The important variables taken into account are perceived ease of use, perceived usefulness, attitude, and the adoption intention of Chennai Millennials towards mobile wallets. The quantitative data was obtained by circulating online questionnaires (240) through social media such as Facebook and WhatsApp, and SPSS was used to analyse the results. This research indicates that there is a positive influence of perceived ease of use on the intention to adopt, and that PEOU and PU are significant determinants of the intent to use mobile wallets. Implications and limitations of the study are addressed. Keywords Stepwise regression · Technology Acceptance Model · Mobile wallet · Adoption · Chennai Millennials · Technology · Innovation
1 Introduction India is paving its way gradually towards a cashless society. From hefty physical wallets to virtual wallets, we are evolving at a significant rate. Recall those days when we would carry bulky wallets full of cash and cards? Mobile wallets have relieved that burden while facilitating payments and transactions. Now, from the convenience of our home, we can pay for almost any product or service, transfer funds, make bill payments, book tickets, etc. Gone are the days when people M. Krithika (B) Department of Management Studies, Saveetha School of Engineering, SIMATS, Chennai, India e-mail: [email protected] J. Zareena Department of Management Studies, SCAD College of Engineering and Technology, Tirunelveli, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_9
had to wait hours just to get their hands on their favourite movie's "first day, first show" ticket. With their one-tap functionality and fast processing all in one go, mobile wallets have simplified our lives. Mobile wallets were developed to allow a smooth and seamless flow of trouble-free transactions. A mobile wallet uses bank account and credit or debit card information for seamless processing of payments while fully securing all user details. Compared to physical wallets, they help lower the payment processing time, reduce fraud, and are economical. The Indian government's demonetization drive boosted these wallets, and since that time, the user base has been continuously growing. According to CFO India, nearly 95% of transactions were cash transactions, 85% of payments were made in cash, and nearly 70% of consumers named "cash on delivery" as the preferred method of payment. However, the mobile wallet industry in India is expected to grow by 150% next year, with $4.4 billion in transactions [1, 2]. With the increasing adoption and promotion of information technology and Internet marketing, the corporate environment has transformed at a fast rate [2]. Cell phones are popular in most emerging and industrialized economies due to the exponential development of the mobile industry. Tiny programs that run on a mobile device and carry out tasks such as banking, gaming, and web browsing are known as mobile applications. A broad variety of applications have been created for mobile users, including sports apps, social networking apps, Internet shopping apps, and travel planning apps. Some statistical figures indicate that young consumers are exposed to multiple mobile wallets [3, 4]. In developing countries like India, mobile wallets have become popular; widely used mobile apps and wallets include WhatsApp, Snapchat, Uber, OLX, Hotstar, Paytm, Google Pay, Zomato, and Amazon, to name a few. Mobile wallets are a major boon for many young consumers in these developing markets. Government agencies have also released valuable public domain applications such as MyGov and Meri Sadak, to name a few. The youngest and major market group from 2017 to 2030 is Gen Z. This generation has a highly qualified, technically knowledgeable, inventive, and creative membership. They are actively engaged in the use of technology and digital devices through social media and the use of mobile wallets (apps) [5, 6].
2 Theoretical Background 2.1 Background of Theoretical Framework The Technology Acceptance Model (TAM), developed and extensively used by researchers to study technology adoption behaviour, has been widely studied. Extensive research shows that the TAM reliably describes technology adoption behaviour and is well regarded as an effective model for predicting usage intention [7].
The Technology Acceptance Model (TAM) is an evergreen term in the area of consumer technology studies. Much of the TAM literature studies the driving forces behind the adoption of new technologies. Davis [4] found that perceived usefulness and perceived ease of use are important factors in determining how technology is used. The factors that influence technology adoption behaviour are multifaceted, self-motivated, and diverse [8]. For instance, incorporating consumer innovativeness and perceived risk into the Technology Acceptance Model is a key to understanding Internet banking behaviour, and the TAM often combines social influence and personal innovativeness to analyse adoption behaviour towards online services. The study carried out by [9] found that there is an indirect impact of social and individual innovativeness on perceived usefulness and ease of use. However, few studies have addressed preferences for third-party mobile payments [10]. As a result of the above findings, this study explores how young Indian Millennials adopt mobile wallets, giving more insight into TAM with regard to new technology adoption behaviour. The four main TAM variables set up for the study are attitude, adoption intention, perceived usefulness, and perceived ease of use. Those who perceive new technology positively will be prepared to adopt it [11, 12]. According to a study carried out by [13], the term "attitude" describes people's expectations and opinions of modern technology. Research [14] has shown that two determinants of attitude, perceived product interest and perceived ease of use, are crucial elements in the decision-making cycle. The relationship between attitude and acceptance intention is clear and positive [6]. Therefore, the study uses the basic TAM to test the factors, namely perceived usefulness (PU) and perceived ease of use (PEOU).
2.2 Perceived Ease of Use Perceived ease of use is considered a highly critical factor in technology adoption and usage behaviour. PEOU has been described as the degree to which a person believes that the use of a technology is easily understood or effortless [4]. It considers "to what degree people accept the rapid advancement of technology" and "to what degree people feel the technology is useful." Previous studies indicate that variables such as ease of use, perceived usefulness, attitude, and adoption intention have significant consequences for the implementation of new technology [15]. Studies on application adoption have examined the ease-of-use factors on their own, and the results have had major implications for user intent [16]. Researchers in the literature have found that for mobile wallets, usability, perceived risk, attitudinal behaviour, and customer preferences have a significant effect on both acceptance and customer loyalty [9]. The following hypothesis has accordingly been formulated.
H1: PEOU significantly influences the adoption intention of young consumers towards using mobile wallets.
2.3 Perceived Usefulness PU is one of the main components of the original technology acceptance paradigm [17]. If people feel that using a mobile app will boost their job efficiency, they may use it more frequently [18]. Numerous previous works have identified perceived usefulness as a critical factor in the utilization of smartphone devices [19]. When considering the utility of technology [6], mobile wallet users anticipate that using the program will boost the efficiency of a task in an organization; in the online technology framework, a technology is greatly useful to the consumer for performing a particular task [20]. Perceived usefulness, along with perceived protection and scalability, affects a merchant's conduct towards a mobile wallet programme [19], and trust and perceived security drive merchants' intention to use the wallet programme [21]. Further studies have shown that perceived benefit has a plain effect on the decision to use innovations [22, 12]. Research performed by [20, 23] on Internet and mobile payment usage found that it is influenced by merchants' willingness to adopt newer technologies. To identify this effect, the variable perceived usefulness was taken; past studies conducted by researchers have also used this variable [24]. Hence, the researchers proposed the below-mentioned hypothesis. H2: Perceived usefulness has a significant effect on the behavioural intention of young consumers to use mobile wallets.
2.4 Attitude and Intention Prior research in the area of emerging technologies has demonstrated a close correlation between attitude towards a technology or innovation and its adoption or use. Many longitudinal studies in the area of technology also confirm that there is a connection between attitude and intention. Customer attitude towards mobile shopping has been found to influence consumers' intent to participate in mobile shopping [25]. Further work has shown that users' attitudes have a major effect on their choice to use e-readers [26], and customers' attitudes heavily influence their choice to use smart mobile wallet platforms [27]. Consumers' intent to purchase via online shopping has also been studied [28]. Involving university students in the USA, the authors studied the effect of congruity
An Empirical Study on Millennials’ Adoption of Mobile Wallets.
119
on confidence and attitude, and their influence on the intention to buy. They clarified that attitude significantly affected purchasing intentions and that congruity with self-image significantly impacted attitude. H3: Attitude significantly affects the adoption intention of young consumers towards using mobile wallets.
2.5 Sampling and Methodology People between 18 and 40 years of age are considered appropriate respondents for the study because this is the widely accepted defining range for the Millennial generation. Sampling was conducted by circulating questionnaires in university clusters, and snowball sampling was adopted during data collection, with web questionnaires circulated through the WhatsApp social networking channel. Networking strategies were adopted by the researchers to share the questionnaires with many respondents. Data was collected from young consumers aged 18 to 40 in Chennai and surrounding areas who have been using smartphones and mobile wallets for at least the last two years. Of the 250 questionnaires distributed, 240 (96%) were completed and returned. Nearly equal gender representation was achieved: 53% male and 47% female.
3 Data Analysis and Results The quantitative data collected for this study was entered into SPSS 23, a specialized statistical platform for research analysis. Beforehand, all the data was carefully reviewed and coded. Respondents who gave the same answer to all of the questionnaire's statements were eliminated [29]. Invalid responses in the questionnaires were reviewed, and negatively worded items were recoded into positive ones [29].
3.1 Reliability of the Constructs The reliability of the constructs was tested using Cronbach's alpha. As shown in Table 1, the Cronbach alpha value of every construct is over 0.7, which means that the constructs are strongly internally consistent and ready for further analysis [30]. In the descriptive assessment, about 250 responses were collected from the respondents, and the number of valid responses was 240. Of the 10 rejected responses, four were from respondents younger than 18 years and six from respondents older than 40 years. They were not
Table 1 Reliability of the variables

Constructs             Total items in the construct  Alpha value
Perceived ease of use  4                             0.921
Perceived usefulness   4                             0.946
Attitude               4                             0.916
Adoption intention     3                             0.913
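For readers replicating the reliability analysis outside SPSS, Cronbach's alpha can be computed directly. The following Python function is a generic sketch of the standard formula and is not part of the study's SPSS workflow.

```python
import numpy as np

# Cronbach's alpha for one construct. `items` is an
# (n_respondents, n_items) array of Likert-scale scores.
def cronbach_alpha(items):
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                                # number of items
    item_vars = items.var(axis=0, ddof=1).sum()       # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)         # variance of total score
    return (k / (k - 1)) * (1 - item_vars / total_var)
```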
relevant to this study because the study population is Millennials (aged between 18 and 40). Hence, the final sample size for the study is 240.
Demographic Profile
The study respondents' demographic characteristics are listed in Table 2. The number of male respondents is 128, contributing 53% of the survey, and the number of women is 112, representing 47%; the sample is representative since the study population is Millennials (aged between 18 and 40). Respondents aged between 18 and 24 accounted for 59.5% of the survey, followed by those aged between 31 and 40, who accounted for 22.5%, and respondents between the ages of 25 and 30, representing 18.0% of the survey.
Table 2 Demographic details of the respondents
Demographic characteristics   Items                  Percentage (%)
Gender of respondents         Male                   53
                              Female                 47
Age                           18–24                  59.5
                              25–30                  18.0
                              31–40                  22.5
Living region                 Urban                  46.7
                              Semi-urban             28.3
                              Rural                  25.0
Hours spent on mobile         Less than 1 h          5.0
                              1–2 h                  27.5
                              2–5 h                  39.2
                              5–8 h                  17.1
                              More than 8 h          11.3
Experience on mobile payment  Less than three years  52.1
                              3–6 years              29.6
                              Over six years         18.3
Table 3 Mean and SD of the study variables

Constructs             N    Mean   Std. deviation
Perceived ease of use  240  18.90  3.749
Perceived usefulness   240  19.13  3.803
Attitude               240  13.90  3.232
Adoption intention     240  21.47  4.596
Valid N (listwise)     240
Mean and standard deviation values are displayed in Table 3 for the four constructs. Adoption intention has the highest mean value (21.47) and also the largest standard deviation (4.596); perceived usefulness ranks second in variability, with a standard deviation of 3.803.
3.1.1 Stepwise Regression
Stepwise regression is a variation of forward selection: at every stage after a variable has been introduced, it checks whether any variable in the model has become less significant than the specified tolerance level; if a non-significant variable is detected, it is removed from the model. A description of the model is given in Table 4, in which multiple regressions are run step by step to obtain the standardized coefficients, with the independent variables entered one at a time. There we see that the independent variables, perceived ease of use (Model 1), perceived usefulness (Model 2), and attitude (Model 3), were introduced one by one with the dependent variable adoption intention, increasing the R and R square values. Table 4 shows the R square and R square change values for each step. The first step shows an R-value of 0.635; as further steps were applied, the R-value increased, reaching 0.787 in the final step, an increase of 0.152 (0.787 − 0.635). The model was statistically significant, including 3 independent variables and accounting for about 62% of the variation in mobile wallet adoption behaviour.

Table 4 Model summary of independent and dependent variables

Model        R        R square  R square change  F change  Sig. F change
1. PEOU      0.635^a  0.403     0.403            160.464   0.000
2. PU        0.735^b  0.541     0.138            71.263    0.000
3. ATTITUDE  0.787^c  0.619     0.079            48.731    0.000

a. Independent variable: Perceived Ease of Use
b. Independent variable: Perceived Usefulness
c. Independent variable: Attitude
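The stepwise entry of predictors reported in Table 4 can be reproduced with ordinary least squares. The sketch below is illustrative only (the study ran the analysis in SPSS); the DataFrame `data` with columns PEOU, PU, ATT, and INTENTION is an assumed layout.

```python
import statsmodels.api as sm

# Enter the predictors one at a time, mirroring Models 1-3 in Table 4,
# and report R and R squared at each step.
steps = [["PEOU"], ["PEOU", "PU"], ["PEOU", "PU", "ATT"]]
for i, cols in enumerate(steps, start=1):
    X = sm.add_constant(data[cols])            # add intercept term
    model = sm.OLS(data["INTENTION"], X).fit()
    print(f"Model {i}: R = {model.rsquared ** 0.5:.3f}, "
          f"R^2 = {model.rsquared:.3f}")
```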
Table 5 Standardized and unstandardized coefficients

Model  Predictor  B      Std. error  Beta   t
1      Constant   6.769  1.183       –      5.722
       PEOU       0.778  0.061       0.635  12.667
2      Constant   3.751  1.099       –      3.412
       PEOU       0.214  0.086       0.175  2.495
       PU         0.715  0.085       0.591  8.442
3      Constant   2.777  1.013       –      2.742
       PEOU       0.086  0.080       0.070  1.066
       PU         0.469  0.085       0.388  5.522
       ATT        0.467  0.067       0.405  6.981
The results are described in Table 5, which gives the coefficient values. Each model changes as we move down the table, but the main interest lies in the final model (Model 3). The standardized coefficients identify the predictors that greatly influence the dependent variable. Mobile wallet adoption is significantly predicted at p < 0.01 (BETA = 0.635, p < 0.01): if Chennai Millennials believe it is easy to use mobile wallets, they will find them helpful. At p < 0.05 (BETA = 0.405, p = 0.00), mobile wallet adoption also correlates positively with attitude, meaning Chennai Millennials will be more likely to adopt mobile wallets if they consider them easy to use. Among the three independent variables, perceived ease of use (BETA = 0.635, p = 0.001) is the most influential factor in consumers' mobile wallet adoption behaviour.
4 Conclusion The theoretical model is stable, and the findings of the analysis are confirmed. In sum, the findings align with the hypotheses proposed, and none of them are contradicted. The main objective of the study was to identify the key variables that impact third-party mobile payments made by Millennials. Attitude was closely associated with adoption intention, which is consistent with previous technology-acceptance research [6, 31]. The results of the study show that Chennai Millennials have a clear intention to adopt third-party mobile payments. The objectives of the research are achieved by evaluating the TAM for mobile payment across different user groups; the evaluation shows that perceived ease of use has a major impact on mobile payment usage. As a result, it is concluded that the Chennai Millennials
believe that third-party mobile payment is beneficial only when they have a positive attitude towards it. The literature has established an important connection between customers' attitudes and the decision to use technology [23], and it has been confirmed that ease of use, usefulness, and social influence are strongly linked to consumers' behavioural intentions [20]. Given that perceived ease of use proved significant, it is suggested that Chennai Millennials accept third-party mobile payments because they find them easy to use; such findings are consistent with previous research [31, 32] on technology adoption. The regression analyses also show that perceived usefulness has a major impact on adoption intention, and these findings are similar to previous work on technology acceptance [31]. This study reveals that third-party mobile payments are perceived as useful only if Chennai Millennials think that they are really user-friendly.
Theoretical Implications
The purpose of this study is to determine the applicability of TAM for different users, as perceived usefulness and perceived ease of use have been described as two key factors in the technology acceptance behaviour of Chennai Millennials in mobile wallet applications. This result suggests that the technology acceptance model has served as a helpful reference for understanding product adoption, and this work can also be used in future studies. In the meantime, a concrete model (R square = 67.9%) has been developed, which consists of three main factors that define Chennai Millennials' acceptance of third-party mobile payments (perceived usefulness, perceived ease of use, and attitude). The study also investigates the behaviour of Millennials in the area of global online payments from an Indian point of view. Because most research on product adoption behaviour focuses on online shopping, these models should be important for future research if mobile payment adoption is to be investigated further [12, 31].
Managerial Implications
Mobile payments have shaped the economic structure at both the local and global levels because they are booming rapidly. Since young people living in India are digital natives, a paradigm shift towards third-party mobile payments has been demonstrated [31, 32], and because of this rapid change in payment mode, third-party mobile payments are growing faster in India. From the results, it is identified that the study variables affect Millennials' intention. This work is, therefore, useful for businesses and companies engaging in third-party mobile payments to frame strategies based on an understanding of consumer requirements.
Limitations and Directions for Further Research
Due to time and cost constraints, the sample size taken for the study was 240, and snowball sampling was used; this may introduce bias into the study. In the meantime, the research discusses only the decision of Chennai Millennials to accept electronic payments from third parties, and Millennials are a particular category of users with a radically different degree of technology exposure than traditional customers. A national review of the
trend for the implementation of mobile third-party payments should be conducted across all regions of India in the future. Besides, this work does not analyse perceived risk from a security or privacy perspective; failure to consider other types of risk can result in divergent outcomes, so perceived risk can be investigated in future studies. Gender-based stereotypes and the moderating impact of age could also be explored in order to extend the inquiry into third-party mobile payment acceptance.
References
1. CFO India, Post demonetisation, Indians prefer mobile wallets to plastic money (2020)
2. T.S. Kumar, Construction of hybrid deep learning model for predicting children behavior based on their emotional reaction. J. Inf. Technol. 3(01), 29–43 (2021)
3. C. Liu, Y. Au, H. Choi, Effects of freemium strategy in the mobile app market: an empirical study of Google Play. J. Manag. Inf. Syst. 31(3), 326–354 (2014)
4. F. Davis, Perceived usefulness, perceived ease of use, and user acceptance of information technology. MIS Q. 13(3), 319 (1989)
5. Localytics, 21% of users abandon an app after one use (2020)
6. G. Tan, K. Ooi, S. Chong, T. Hew, NFC mobile credit card: the next frontier of mobile payment? Telemat. Inform. 31(2), 292–307 (2014)
7. A. Angus, Top 10 global consumer trends for 2018: emerging forces shaping consumer behavior. Euromonitor International (2018)
8. S. Manoharan, Study on Hermitian graph wavelets in feature detection. J. Soft Comput. Paradigm (JSCP) 1(01), 24–32 (2019)
9. F. Davis, R. Bagozzi, P. Warshaw, User acceptance of computer technology: a comparison of two theoretical models. Manage. Sci. 35(8), 982–1003 (1989)
10. E. Slade, M. Williams, Y. Dwivedi, N. Piercy, Exploring consumer adoption of proximity mobile payments. J. Strateg. Mark. 23(3), 209–223 (2014)
11. C. Tam, T. Oliveira, Understanding the impact of m-banking on individual performance: DeLone & McLean and TTF perspective. Comput. Hum. Behav. 61, 233–244 (2016)
12. K. Madan, R. Yadav, Behavioural intention to adopt mobile wallet: a developing country perspective. J. Indian Bus. Res. 8(3), 227–244 (2016)
13. Discovering Statistics Using SPSS, 4th edn., with Using IBM SPSS Statistics for Research Methods and Social Science Statistics, 4th edn. (Sage Publications, 2012)
14. X. Lu, H. Lu, Understanding Chinese millennials' adoption intention towards third-party mobile payment. Inf. Resour. Manage. J. 33(2), 40–63 (2020)
15. K. Kim, D. Shin, An acceptance model for smart watches. Internet Res. 25(4), 527–541 (2015)
16. E.E.B. Adam, Survey on medical imaging of electrical impedance tomography (EIT) by variable current pattern methods. J. ISMAC 3(02), 82–95 (2021)
17. F. Liébana-Cabanillas, I. Ramos de Luna, F. Montoro-Ríos, User behaviour in QR mobile payment system: the QR payment acceptance model. Technol. Anal. Strateg. Manage. 27(9), 1031–1049 (2015). https://doi.org/10.1080/09537325.2015.1047757
18. Y. Dwivedi, N. Rana, M. Janssen, B. Lal, M. Williams, M. Clement, An empirical validation of a unified model of electronic government adoption (UMEGA). Gov. Inf. Q. 34(2), 211–230 (2017)
19. T. Apanasevic, J. Markendahl, N. Arvidsson, Stakeholders' expectations of mobile payment in retail: lessons from Sweden. Int. J. Bank Mark. 34(1), 37–61 (2016). https://doi.org/10.1108/ijbm-06-2014-0064
20. A. Erdem, U. Pala, M. Özkan, U. Sevim, Factors affecting usage intention of mobile banking: empirical evidence from Turkey. J. Bus. Res.-Turk 11(4), 2384–2395 (2019)
21. E. Pantano, C. Priporas, The effect of mobile retailing on consumers' purchasing experiences: a dynamic perspective. Comput. Hum. Behav. 61, 548–555 (2016)
22. Y. Lee, J. Park, N. Chung, A. Blakeney, A unified perspective on the factors influencing usage intention toward mobile financial services. J. Bus. Res. 65(11), 1590–1599 (2012)
23. C. Kim, M. Mirusmonov, I. Lee, An empirical examination of factors influencing the intention to use mobile payment. Comput. Hum. Behav. 26(3), 310–322 (2010). https://doi.org/10.1016/j.chb.2009.10.013
24. P. Schierz, O. Schilke, B. Wirtz, Understanding consumer acceptance of mobile payment services: an empirical analysis. Electron. Commer. Res. Appl. 9(3), 209–216 (2010)
25. S.R. Mugunthan, T. Vijayakumar, Design of improved version of sigmoidal function with biases for classification task in ELM domain. J. Soft Comput. Paradigm (JSCP) 3(02), 70–82 (2021)
26. I. Ramos-de-Luna, F. Montoro-Ríos, F. Liébana-Cabanillas, J. Luna, NFC technology acceptance for mobile payments: a Brazilian perspective. Rev. Bus. Manage. 19(63), 82–103 (2017)
27. I. de Luna, F. Liébana-Cabanillas, J. Sánchez-Fernández, F. Muñoz-Leiva, Mobile payment is not all the same: the adoption of mobile payment systems depending on the technology applied. Technol. Forecast. Soc. Chang. 146, 931–944 (2019)
28. N. Singh, S. Srivastava, N. Sinha, Consumer preference and satisfaction of M-wallets: a study on North Indian consumers. Int. J. Bank Mark. 35(6), 944–965 (2017)
29. S. Yang, Y. Lu, S. Gupta, Y. Cao, R. Zhang, Mobile payment services adoption across time: an empirical study of the effects of behavioral beliefs, social influences, and personal traits. Comput. Hum. Behav. 28(1), 129–142 (2012)
30. C. Antón, C. Camarero, J. Rodríguez, Usefulness, enjoyment, and self-image congruence: the adoption of e-book readers. Psychol. Mark. 30(4), 372–384 (2013)
31. V. Badrinarayanan, E. Becerra, S. Madhavaram, Influence of congruity in store-attribute dimensions and self-image on purchase intentions in online stores of multichannel retailers. J. Retail. Consum. Serv. 21(6), 1013–1020 (2014). https://doi.org/10.1016/j.jretconser.2014.01.002
32. T. Perry, J. Thiels, Moving as a family affair: applying the SOC model to older adults and their kinship networks. J. Fam. Soc. Work 19(2), 74–99 (2016)
33. W. Kunz et al., Customer engagement in a Big Data world. J. Serv. Mark. 31(2), 161–171 (2017)
34. J. Rowley, Designing and using research questionnaires. Manage. Res. Rev. 37(3), 308–330 (2014)
35. A. Bashar, Survey on evolving deep learning neural network architectures. J. Artif. Intell. 1(02), 73–82 (2019)
36. N. Koenig-Lewis, M. Marquet, A. Palmer, A. Zhao, Enjoyment and social influence: predicting mobile payment adoption. Serv. Ind. J. 35(10), 537–554 (2015)
37. R. Hill, M. Fishbein, I. Ajzen, Belief, attitude, intention and behavior: an introduction to theory and research. Contemp. Sociol. 6(2), 244 (1977)
38. R. Bagozzi, The legacy of the technology acceptance model and a proposal for a paradigm shift. J. Assoc. Inf. Syst. 8(4), 244–254 (2007)
39. Y. Lu, S. Yang, P. Chau, Y. Cao, Dynamics between the trust transfer process and intention to use mobile payment services: a cross-environment perspective. Inf. Manage. 48(8), 393–403 (2011)
An IoT-Based Smart Mirror K. N. Pallavi, Jagadevi N. Kalshetty, Maithri Suresh, Megha B. Kunder, and Kavya Shetty
Abstract In today's world, many technologies have emerged to make human life more comfortable. Through the Internet, people can connect to the whole world and access any information easily, and they can learn about current events through television or the Internet. People want everything to be smart. Smart houses are now built to connect the home to the Internet, linking digital devices that communicate with each other. The world is changing, and life is changing too, with much improvement happening in the world of smart technologies. As part of this trend, a smart device called a smart mirror is created. The smart mirror concept is inspired by the basic habit of people looking into the mirror every day, which raised the question: why not make this mirror smart? The result of this thought process is the "smart mirror". Keywords Alexa · Raspberry Pi · AIoT (Artificial Internet of Things) · Face recognition · Flask
K. N. Pallavi (B) NMAM Institute of Technology, Nitte, India e-mail: [email protected] J. N. Kalshetty Nitte Meenakshi Institute of Technology, Bangalore, India e-mail: [email protected] M. Suresh Oracle, Bangalore, India M. B. Kunder Atos Syntel, Bangalore, India K. Shetty Netanalytiks Technologies Pvt Ltd, Bangalore, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_10
1 Introduction A smart mirror is a customised mirror that shows the date, time, local weather forecasts, real-time news, upcoming calendar events, social media feeds [1] and so on. The big problem with any existing mirror is that it only shows the object in front of it, such as the human face; there is no interaction. Voice instructions through the Amazon Alexa Voice Service or Google Home Assistant are used to communicate with the user. With voice commands, one can interact with the screen by asking questions, setting reminders, playing songs and many other things. This project was developed with inspiration drawn from the people [2] who spend quality time in front of the mirror. A smart mirror acts as a single-window system through which all information can be obtained. To get the latest news updates, weather forecast [3] or any other information, people otherwise have to turn to time-consuming television or mobile apps. To eliminate these problems and to make the search easy, the smart mirror [4] concept is introduced: all the necessary information, such as the weather forecast and latest news, can be found in one place. The smart mirror works with the help of Alexa and Jitsi Meet. Jitsi Meet is a set of open-source projects which empower users to deploy video conferencing platforms with state-of-the-art video quality and features. Alexa, also known as Amazon Alexa, is a virtual assistant technology by Amazon. This smart mirror is programmed using the Python language. It can be connected to a camera near the door: whenever the door is locked, the smart mirror becomes active, and if the camera captures a person, the smart mirror checks whether that person [5] is authorised or not. If the smart mirror concludes that the identified person is not authorised, it sends an email alert to the user. Also, whenever the user wants to start a meeting [6] or conference call, the user just has to say "Alexa, ask Mirror to start meeting". With this voice command, the smart mirror joins a Jitsi Meet video conferencing call with friends and family [7, 8].
2 Proposed Methodology 2.1 Functional Requirements 2.1.1 Software Requirements
OpenCV is an open-source computer vision library built on machine learning features. It is often used for applications such as video analysis and image processing. With the assistance of this library, the computer processes and understands images and videos. Here, OpenCV is used for face recognition of the user.
Raspbian OS: Raspberry Pi hardware is optimised for a free operating system called Raspbian OS. Raspbian contains over 35,000 packages with many pre-built functions and is very easy to install on a Raspberry Pi computer. Python: Python is a powerful programming language whose main advantage is that it is easy to learn. Python contains high-quality data structures that work well and offers a simple and effective approach to object-oriented programming. Python's elegant syntax makes it an excellent language for scripting and for developing applications quickly on many platforms.
2.1.2 Hardware Requirements
Raspberry Pi: It is a credit-card-sized computer [9]. The original intention behind the Raspberry Pi was education: its creator, Eben Upton, drew inspiration from the 1981 BBC Micro and wanted to create a computer that would help a person understand hardware and improve programming [10] skills cost-effectively. It is small in size, has an affordable price, and was quickly adopted by electronics enthusiasts for projects. Speaker: The speaker is required to provide voice output. Webcam: The webcam in this project is used to detect the user's face. Any type of webcam is compatible with the Raspberry Pi. Mirror: A special two-way mirror is used in this project instead of a normal mirror, because a two-way mirror is not painted in an opaque colour on the back. Microphone: The microphone is required to provide voice input. Mouse: The mouse is used to navigate. Keyboard: The keyboard is used to provide input.
2.2 Software Approach 2.2.1 IoT
The Internet of Things, or IoT, is a network of physical objects. These objects are embedded with software, sensors and other technologies for the purpose of connecting and exchanging data with other applications or devices through the Internet. They can be common household items or complex industrial tools. More than 7 billion IoT devices are connected today, and experts expect this number to grow to 10 billion by 2020 and 22 billion by 2025.
2.2.2 Artificial Intelligence
Artificial intelligence (AI) is intelligence manifested by machines [11]. It differs from natural intelligence, which is displayed by humans and animals through knowledge, emotions, empathy, etc. AI and IoT cannot be separated [12]: AI plays a vital role in the world of technology, and combining AI and IoT improves both technologies. IoT deals with the connection of two or more objects, networks or sensors to enable data transfer for applications, while AI enables the analysis of the most sensitive data, helping to provide important information and make more informed decisions.
2.2.3 Amazon Alexa
Using a script, the Alexa Voice Service can be set up on the Raspberry Pi. The Alexa Voice Service gives the ability to do many different things, such as playing favourite songs or checking cricket scores by voice command. The steps involved in setting up Amazon Alexa are as below:
1. Go to the Amazon Developer portal and register an AVS device.
2. Install and configure the AVS Device SDK dependencies on the Raspberry Pi.
3. Create an AVS sample app and use it on the Raspberry Pi.
2.2.4 Jitsi Meet Video Conferencing
The smart mirror can be used for video conferencing with friends and family. The Jitsi Meet tool belongs to Jitsi, a collection of open-source voice, video conferencing and instant messaging projects. Jitsi Meet allows group video calls without creating an account, which is a most useful advantage for the smart mirror: even people who have no experience with smart phones can use it. Users can conduct or join a meeting from the web, with no compulsion to download the Jitsi Meet application. This makes it easy to run multiple meetings per day and manage data at all times.
2.3 System Design 2.4 System Description The set-up of the AI-based smart mirror is shown in Fig. 1 and described as follows:
Fig. 1 High-level architecture
• Raspberry Pi is connected to the power supply. A micro SD card with Raspbian installed is placed in its slot.
• A monitor (with HDMI in) is the screen of the smart mirror. Any display monitor with an available HDMI input can be used; it is connected to the Raspberry Pi using an HDMI cable.
• A two-way mirror or one-way reflective film is placed on the monitor so that it acts as a mirror.
• The USB speaker, mouse, keyboard and microphone are connected to their respective slots.
• A Raspberry Pi ribbon camera is placed for facial recognition.
3 Implementation 3.1 Displaying Information A Python program was written to display details such as the date, time, greeting message, news, weather and calendar events on the mirror. Weather information is obtained from Google Weather, news from Google News and events from the Google Calendar API.
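As a rough sketch of how such a display loop can be structured, the snippet below prints a greeting, the date and time, and a weather string once a minute; fetch_weather() is a placeholder standing in for whichever weather API the mirror actually queries, not a documented interface.

```python
# Minimal sketch of the information display loop (illustrative, not the
# authors' exact program). fetch_weather() is a hypothetical helper.
import time
from datetime import datetime

def fetch_weather():
    # Placeholder: replace with a real API call (e.g. via the requests library).
    return "24 C, partly cloudy"

def greeting(now):
    # Pick a greeting based on the hour of day.
    if now.hour < 12:
        return "Good morning"
    if now.hour < 18:
        return "Good afternoon"
    return "Good evening"

while True:
    now = datetime.now()
    print(f"{greeting(now)}!  {now:%A, %d %B %Y  %H:%M:%S}")
    print(f"Weather: {fetch_weather()}")
    time.sleep(60)  # refresh once a minute
```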
3.2 Amazon Alexa Using a script, the Alexa Voice Service is set up on the Raspberry Pi. The Alexa Voice Service gives the ability to do various things, such as playing favourite songs or checking cricket scores by voice command. The steps involved in setting up Amazon Alexa are as follows:
1. Go to the Amazon Developer portal and register an AVS device.
2. Install and configure the AVS Device SDK dependencies on the Raspberry Pi.
3. Create an AVS sample app and use it on the Raspberry Pi.
3.3 Face Recognition OpenCV is an open-source computer vision and machine learning library. It is frequently used for applications like video analysis and image processing; with its assistance, the computer processes images and videos and helps in recognising them. Here, OpenCV is used for recognising the face of the user. Initially, training is performed with the LBPH Face Recogniser model: once labelled images of the user are ready, training is performed, and after successful training the recogniser can be used for face recognition. The name of the detected face is also displayed on the web page for future use. LBPH is an algorithm used for face detection [14]. It can detect side and front faces and is known for giving high performance. It compares the input face with the registered faces and is able to identify images. Images are stored as a matrix of pixels, and LBPH makes use of this matrix for its facial detection capability.
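A minimal sketch of this LBPH training-and-prediction flow is given below. It assumes the opencv-contrib-python package (which provides the cv2.face module); the image paths, label numbering and confidence threshold are illustrative assumptions rather than the authors' actual values.

```python
# Sketch of LBPH training and prediction with OpenCV.
import cv2
import numpy as np

recognizer = cv2.face.LBPHFaceRecognizer_create()

# Labelled grayscale face crops of the user, e.g. collected at enrolment time.
faces = [cv2.imread(f"user/{i}.png", cv2.IMREAD_GRAYSCALE) for i in range(20)]
labels = np.array([0] * 20)  # label 0 = the authorised user
recognizer.train(faces, labels)

# At run time: detect a face in the camera frame and try to recognise it.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
frame = cv2.imread("frame.png")
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
for (x, y, w, h) in cascade.detectMultiScale(gray, 1.3, 5):
    label, confidence = recognizer.predict(gray[y:y + h, x:x + w])
    # Lower confidence values mean a closer LBPH match.
    print("known user" if label == 0 and confidence < 70 else "unknown")
```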
3.4 Door Lock and Thief Detection Whenever the door is locked, the smart mirror is active. If the camera captures a person, the smart mirror checks whether that person is authorised or not; if it concludes that the identified person is not authorised, it sends an email alert to the user. A page is created using Flask on the Raspberry Pi's localhost at http://192.168.43.173:5000/door where a user can open or close the door. In the real world, a switch and a camera can be placed on the door, with the camera connected to the smart mirror. When the door is locked, the camera is turned on, and if any unknown person is detected [15] with the OpenCV library, an email alert is sent.
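The sketch below illustrates one way the Flask door page and the e-mail alert could fit together; the SMTP host, credentials and addresses are placeholders and not the authors' configuration.

```python
# Hedged sketch of the door-control endpoint and e-mail alert.
import smtplib
from email.message import EmailMessage
from flask import Flask, request

app = Flask(__name__)
door_locked = False

@app.route("/door", methods=["GET", "POST"])
def door():
    # Toggle or report the door state from the web page.
    global door_locked
    if request.method == "POST":
        door_locked = request.form.get("state") == "locked"
    return {"locked": door_locked}

def send_alert(image_path):
    # Called when an unknown face is detected while the door is locked.
    msg = EmailMessage()
    msg["Subject"] = "Smart mirror: unknown person detected"
    msg["From"] = "mirror@example.com"      # placeholder address
    msg["To"] = "owner@example.com"         # placeholder address
    msg.set_content("An unrecognised person was captured at the door.")
    with open(image_path, "rb") as f:
        msg.add_attachment(f.read(), maintype="image", subtype="png")
    with smtplib.SMTP_SSL("smtp.example.com") as server:  # placeholder host
        server.login("mirror@example.com", "app-password")
        server.send_message(msg)

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```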
3.5 Jitsi Meet Video Conferencing With the smart mirror, video conferencing with friends and family is possible. As described in Sect. 2.2.4, Jitsi Meet allows group video calls without creating an account; users can conduct or join a meeting from the web, with no compulsion to download the Jitsi Meet application, which makes it easy to run multiple meetings per day and manage data at all times. The user can just say "Alexa, ask Mirror to start meeting" to join a Jitsi Meet video conferencing call with friends and family. The Ngrok library is used for forwarding the video conferencing address to the Raspberry Pi.
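Because Jitsi Meet rooms are plain URLs, the "start meeting" action can reduce to opening a room in a browser, as in this small sketch (the room name is an illustrative assumption):

```python
# Open a Jitsi Meet room on the public meet.jit.si instance.
import webbrowser

def start_meeting(room="family-mirror-room"):  # hypothetical room name
    webbrowser.open(f"https://meet.jit.si/{room}")

start_meeting()
```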
4 Developed System An interactive, futuristic smart mirror is designed using AI and IoT on the Raspberry Pi. Artificial intelligence is used in the face recognition and voice command services. Many devices have been introduced based on AIoT technology, supplying comfortable, stable, reliable and futuristic personal services everywhere. The face detection task is performed using OpenCV: the mirror recognises the user's face and carries out further processing with the Raspberry Pi. The Raspberry Pi is the most critical part of the smart mirror and acts as its processing unit. The programming of the Pi is carried out using the Python language, although the Pi offers many in-built IDEs and can be programmed in different languages like C, Java, C++, Python, etc. Installation of the OS on the Raspberry Pi is a simple process. The mirror displays the following information:
• Weather report: displays climate forecasts.
• Local news: displays information, announcements and headlines based on favourite subjects.
• Alexa: adds voice instructions and provides assistance to the smart mirror.
• Facial recognition: recognises the user.
• Calendar: displays upcoming occasions using the calendar.
• Sends mail on detecting an unknown person.
• Jitsi Meet video conferencing calls with friends and family.
5 Results Overall set-up of the smart mirror: the monitor, speaker, microphone, keyboard and mouse are all connected to the Raspberry Pi (Fig. 2). The mirror displays the date, time, weather, news, events, etc. obtained using Google-provided APIs (Fig. 3). Amazon Alexa voice service: as shown in the picture, Alexa can interact with users; the output shows the listening, thinking and speaking states of Alexa (Fig. 4).
Fig. 2 Overall set-up of the smart mirror
Fig. 3 Display information
Fig. 4 Alexa voice services
Face detection using OpenCV: the "hey beautiful" message is replaced with "Hi username" here (Fig. 5). Door close and open using the website: the website created to open and close the door is displayed here; if the door is closed, security is ON (Fig. 6). Thief detection: when the door is closed, if any other human face is captured, a thief is detected (Fig. 7).
Fig. 5 Detect face
Fig. 6 Door open and close
Fig. 7 Detect thief
Mail sent: the image of the thief detected above is sent to the user's mail (Fig. 8). Jitsi Meet video conferencing: Jitsi can be used for interacting with friends and family through the mirror (Fig. 9).
Fig. 8 Mail received
Fig. 9 Jitsi Meet
6 Conclusion We have designed a futuristic smart mirror which is interactive with its user and additionally provides exciting home services. The mirror display is provided through an LCD monitor which shows news, weather, messages, search engines, conference calls, etc. on one screen. The smart mirror is a unique application of a smart interacting device; it is reliable and easy to use in the interactive world. Throughout the project, the main intention was to design an interactive device for the home, and the smart mirror will be most useful in smart home design. The service-oriented architecture has been tailored for the improvement and deployment of numerous applications, such as shopping malls, hospitals and offices, in which the mirror may be used for news feeds and web service communication mechanisms. The facial recognition technology may be used to provide further security. Future houses will be brilliantly designed using smart technology, making human life comfortable, easier and enjoyable.
References
1. B. Cvetkoska, N. Marina, D.C. Bogatinoska, Z. Mitreski, Smart mirror e-health assistant – posture analyze algorithm proposed model for upright posture, in IEEE EUROCON 2017 – 17th International Conference on Smart Technologies, Ohrid, pp. 507–512 (2017)
2. M.M. Yusri et al., Smart mirror for smart life, in 2017 6th ICT International Student Project Conference (ICT-ISPC), Skudai, pp. 1–5 (2017)
3. D. Gold, D. Sollinger, Indratmo, SmartReflect: a modular smart mirror application platform, in 2016 IEEE 7th Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON), Vancouver, BC, pp. 1–7 (2016)
4. O. Gomez-Carmona, D. Casado-Mansilla, SmiWork: an interactive smart mirror platform for workplace health promotion, in 2017 2nd International Multidisciplinary Conference on Computer and Energy Science (SpliTech), Split, pp. 1–6 (2017)
5. S. Athira, F. Francis, R. Raphel, N.S. Sachin, S. Porinchu, S. Francis, Smart mirror: a novel framework for interactive display, in 2016 International Conference on Circuit, Power and Computing Technologies (ICCPCT), Nagercoil, pp. 1–6 (2016)
6. M. Rodriguez-Martinez et al., Smart mirrors: peer-to-peer web services for publishing electronic documents, in 14th International Workshop on Research Issues on Data Engineering: Web Services for e-Commerce and e-Government Applications, Proceedings, pp. 121–128 (2004)
7. Y.-C. Yu, S.-C.D. You, D.-R. Tsai, Magic mirror table with social-emotion awareness for the smart home, in 2012 IEEE International Conference on Consumer Electronics (ICCE), Las Vegas, NV, pp. 185–186 (2012)
8. M.A. Hossain, P.K. Atrey, A. El Saddik, Smart mirror for ambient home environment, in 2007 3rd IET International Conference on Intelligent Environments, Ulm, pp. 589–596 (2007)
9. J. Markendahl, S. Lundberg, O. Kordas, S. Movin, On the role and potential of IoT in different industries: analysis of actor cooperation and challenges for introduction of new technology, in 2017 Internet of Things Business Models, Users, and Networks, Copenhagen, pp. 1–8 (2017)
10. S.S.I. Samuel, A review of connectivity challenges in IoT-smart home, in 2016 3rd MEC International Conference on Big Data and Smart City (ICBDSC), Muscat, pp. 1–4 (2016)
11. P. Maheshwari, M.J. Kaur, S. Anand, Smart mirror: a reflective interface to maximize productivity. Int. J. Comput. Appl. 166(9) (2017)
12. A. Sungheetha, R. Sharma, Real time monitoring and fire detection using Internet of Things and cloud-based drones. J. Soft Comput. Paradigm (JSCP) 2(03), 168–174 (2020)
13. J.I.Z. Chen, L.-T. Yeh, Graphene based web framework for energy efficient IoT applications. J. Inf. Technol. 3(01), 18–28 (2021)
14. I.J. Jacob, P.E. Darney, Design of deep learning algorithm for IoT application by image based recognition. J. ISMAC 3(03), 276–290 (2021)
15. E. Kovatcheva, R. Nikolov, M. Madjarova, A. Chikalanov, Internet of Things for wellbeing – pilot case of a smart health cardio belt, in IFMBE Proceedings, pp. 1221–1224 (2014)
AI-Assisted College Recommendation System Keshav Kumar, Vatsal Sinha, Aman Sharma, M. Monicashree, M. L. Vandana, and B. S. Vijay Krishna
Abstract For an aspiring undergraduate student, choosing which college and courses to apply to is a conundrum. Often, students wonder which colleges are best suited for them. This issue has been addressed by building an artificial intelligence (AI)-driven recommendation engine developed using the latest techniques in machine intelligence. The system builds user profiles by explicitly asking questions of users, maps them against college profiles gathered by web scraping, and then generates recommendations based on novel hybrid recommendation techniques. The main aim of this system is to utilize artificial intelligence-based techniques to build an efficient recommendation engine, particularly for college students, to help them select the college which best suits their interests. Keywords Recommender · User profiling · Similarity metric · Collaborative filtering · Content based · Hybrid recommender system
1 Introduction Nowadays, students are provided services like career counseling or career guidance by many companies and websites, which also provide tips for aspiring undergraduate students through psychometric tests. But none of them provide college recommendations. The needs and requirements of every student are unique; therefore, the recommender system should be able to provide recommendations that consider opinions from the student community as a whole as well as the user's individual preferences. Demographic factors also play a crucial role in this.
K. Kumar (B) · V. Sinha · A. Sharma · M. Monicashree · M. L. Vandana Computer Science, PES University, Bengaluru, India e-mail: [email protected] M. L. Vandana e-mail: [email protected] B. S. Vijay Krishna CTO, nSmiles, Bengaluru, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_11
A knowledge-based recommender system fulfills this purpose by attempting to suggest objects based on inferences about a user's needs [1–16]. The proposed work gets the user profile by asking questions of users explicitly and then uses that profile to generate scores for each college with respect to the user. The college data has been scraped from the Internet. Finally, the students are able to get top college recommendations, and this system can be extended to students in other fields, streams or grades. Recommendation engines are systems that suggest items in which users might be interested. These recommendation systems play vital roles in almost every sector which involves selling or presenting items to customers [7–10], helping users filter the heaps of information in the product space. The various types of recommendation systems include:
• Collaborative filtering
• Content-based
• Knowledge-based
• Hybrid-based
A. Collaborative Filtering: Collaborative filtering uses techniques that involve profiles of different users and generates recommendations based on similar user profiles. In simple words, similar users will like similar things in the future. For example, if a teenager likes Iron Man, then it is highly likely that other teenagers will also like Iron Man. This is done by user–user similarity. Advantages of collaborative filtering include:
a. These systems work on ratings provided by users and hence don't require any additional information from users.
b. The recommendations generated by this method are completely new, as it considers users with similar behavior.
Disadvantages are:
a. It suffers from a cold start problem, as it cannot generate recommendations if there are no ratings available.
b. If there is very little data, then the accuracy is very poor [11, 12].
B. Content Based: In this system, the recommendations are generated with the help of items; to be precise, the similarity between the features of items is used [5, 6], hence item–item similarity. For example, if a person likes books on science fiction, then the system can recommend other books in the science fiction category. Advantages of content-based systems are:
a. It is independent of other users, as it only uses ratings provided by the current user, unlike collaborative filtering.
b. It is highly capable of recommending new items which are yet to be rated.
c. It is transparent, as it shows on what features the recommendations were generated.
Disadvantages are:
a. The system needs lots of features for generating distinguished recommendations.
b. There can be a situation where it generates only similar items and no new items.
C. Hybrid Based: To counter the disadvantages of collaborative filtering and content-based systems, hybrid approaches are used, mostly involving user–item similarity. In this paper, a weighted hybrid model is used, which applies a user–item similarity metric to generate accurate recommendations [13–15].
The system developed in this paper helps students get recommendations for college with the help of psychometric tests.
2 Methodology The proposed system adapts machine learning techniques for the specific task of assisting students in choosing a college. The architecture of the system is shown in Fig. 1. The novel approach here is that the scores for each college are generated with respect to the user, by calculating the cosine similarity between the user preferences vector and the college profile vector. The scores are not calculated directly, as a few features have higher weights than others; for example, placement ratings have much more importance than campus infrastructure ratings. Hence, each feature used to build the vector has a different weight assigned to it. For getting the weights of the features, a survey was performed on more than 50 students who were searching for colleges, and from the results all the features were divided into three weight classes. For example, the placement, safety and food ratings were given the most preference and hence were assigned to the weight class with the highest value. For getting the user profile, the questions are based on the very features which are used to calculate the score with respect to a college [4]; the answers to these questions help build the user profile. This also addresses the cold start problem which collaborative filtering recommender systems suffer from, as the data is gathered from the user explicitly. The user profile and college profile have identical structures, as scores are calculated using a similarity metric between user and college. The equation to get the score is:
Score = w1 ∗ sim(U1, C1) + w2 ∗ sim(U2, C2) + w3 ∗ sim(U3, C3)
where w1, w2, w3 are the weights and sim stands for the similarity score between the user features and the college features.
Fig. 1 Architecture diagram
U1, U2, and U3 represent subsets of the user feature array; C1, C2, and C3 represent subsets of the college feature array. How the equation works: suppose there is one user profile U and two college profiles C1 and C2. Also, let there be two types of weights, w1 = 2 and w2 = 1, where w1 has more weight than w2. The values for the user profile and college profiles are:
U = [1, 2, 4, 2, 1, 3, 3, 4, 1, 2]
C1 = [1, 2, 3, 3, 1, 1, 2, 4, 1, 3]
C2 = [4, 4, 1, 2, 4, 3, 3, 1, 2, 3]
Let the first five features of the vector have weight w1 and the other five features weight w2. Now, the score of college C1 according to the equation will be:
Score(C1) = w1 ∗ sim(U[1:5], C1[1:5]) + w2 ∗ sim(U[6:10], C1[6:10]) = 2 ∗ sim([1,2,4,2,1], [1,2,3,3,1]) + 1 ∗ sim([3,3,4,1,2], [1,2,4,1,3]) = 2.842
Similarly, the score of college C2 comes to 2.142. From these scores, it can easily be determined that college C1 is a better match than college C2. The similarity metric used here is cosine similarity, and the scores can easily be converted to percentages, as the maximum value is 3.
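The worked example can be reproduced with a short script; the sketch below uses SciPy's cosine distance (similarity = 1 − distance) and takes its weights and profile values from the example above, not from the deployed system.

```python
# Reproduce the weighted cosine-similarity scores from the worked example.
from scipy.spatial.distance import cosine

def score(user, college, weights, split=5):
    # Split each profile into two weighted feature groups, as in the example.
    parts = [(user[:split], college[:split]), (user[split:], college[split:])]
    return sum(w * (1 - cosine(u, c)) for w, (u, c) in zip(weights, parts))

U  = [1, 2, 4, 2, 1, 3, 3, 4, 1, 2]
C1 = [1, 2, 3, 3, 1, 1, 2, 4, 1, 3]
C2 = [4, 4, 1, 2, 4, 3, 3, 1, 2, 3]

print(round(score(U, C1, (2, 1)), 3))  # 2.842
print(round(score(U, C2, (2, 1)), 3))  # 2.142
```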
c. d. e. f. g. h. i.
First, the user interacts with the interface and is asked to make an account by providing their credentials. After authentication, the user is provided with a list of questions with multiple options. There are no right or wrong answers as these questions are psychometric in nature. The user now interacts with the interface to answer all the questions and then submits their answer. The server receives the user inputs and builds a profile for that user. The recommender logic is applied to the user profile and all the college profiles present in the database. The score for each college w.r.t user is generated. After generating all the recommendations, they are sorted in descending order based on scores. The server now sends the top 50 recommendations to the front end of the interface. The recommendations are displayed to the user.
3 Implementation The system developed in this paper uses a combination of various technologies from the web intelligence domain. The front end is developed using AngularJS [17], chosen to meet the requirements of 'nSmiles', in collaboration with whom this system has been developed. The database layer is implemented using MongoDB [18], a NoSQL database. The server which receives requests and sends responses is developed using Python libraries: Flask was used to develop the API, and the pandas and SciPy libraries were used to calculate recommendations. Dataset The database of colleges was developed using web scraping [2]. Various websites which provide college information for free were scraped using free software like ParseHub and Python libraries. After gathering a dataset of almost 5400 colleges, the dataset was cleaned and pre-processed for use. The college features were converted to values in a specific range, with the highest value implying that the feature having that value is among the best.
Fig. 2 College data in JSON format
The dataset was then loaded into MongoDB and hosted on the cloud with the help of MongoDB Atlas, which is a free service (Fig. 2). Then, a webhook was developed on MongoDB Realm, which sends the data of all the colleges to the server. Server The experimental server was developed with the help of Flask and other data science libraries in Python. The server first receives the request from the front end, which consists of an array storing the user profile gathered from the questions on the front end. After receiving the request, the server connects to the college database through the webhook hosted on MongoDB Realm. The college data is received by the server in JSON format and converted to a pandas data frame. Then, the user profile array and the college data are used to generate scores for each college with respect to the user. The recommendations are sorted based on score, and the top 50 are sent as a JSON response to the front end, which interprets and displays them to the user. Features used A total of 15 features are used for calculating the scores for each college; a few examples are placements, food quality and hostel. The features don't
have the same priority: each feature has a weight associated with it. The weights were calculated by conducting surveys on more than 50 students in classes 11–12 who are looking for colleges, and the most important features, like placements and hostel, were assigned the most weight. The questions are based on the features of the colleges; for example, for the feature 'research', the question can be 'How important are the research facilities provided by the college to you?'. Since these questions are psychometric in nature, there are no right or wrong answers (Fig. 3). The options for each question are:
• Most important
• More important
• Less important
• Least important
Fig. 3 Questions
4 Experiments and Observations After building the system, the server was tested with the help of API testers like Insomnia and Postman. The server was sent dummy arrays representing user profiles and returned the response in JSON format in 4 s on average. The cloud-hosted database was also checked to confirm it was sending data in the correct format. The front end was implemented in AngularJS, with an interface consisting of questions and answers. The system is able to flawlessly generate recommendations and also shows the score for each recommendation (Fig. 4). The recommender logic was also checked by providing a biased input, and the system works as per expectations.
5 Conclusions Students get a lot of career counseling services, but there is no service that provides personalized college recommendations. This paper describes a system which provides one solution, using the latest techniques from the field of machine learning, and addresses the described issue, including problems a normal recommendation system has, like the cold start problem. The user only needs to provide answers to questions, and the recommendations are generated with ease. This system properly uses the concept of knowledge-based recommenders and was built using a microservices architecture. More refinements for better UI and performance can be made in the future.
Fig. 4 Recommendations
Acknowledgements This paper is based on ideas provided by 'nSmiles', which provides various services for mental health in workplaces as well as career counseling services for students. The project was guided by the CTO of 'nSmiles', Mr. B.S. Vijay Krishna, and Prof. Vandana M.L., PES University.
References
1. S. Bouraga, I. Jureta, S. Faulkner, C. Herssens, Knowledge-based recommendation systems: a survey. Int. J. Intell. Inf. Technol. (IJIIT) 10(2), 1–19 (2014)
2. A.V. Saurkar, K.G. Pathare, S.A. Gode, An overview on web scraping techniques and tools. Int. J. Future Revolution Comput. Sci. Commun. Eng. 4(4), 363–367 (2018)
3. R. Burke, Knowledge-based recommender systems. Encycl. Libr. Inf. Syst. 69
4. S. Girase, V. Powar, D. Mukhopadhyay, A user-friendly college recommending system using user-profiling and matrix factorization techniques, in 2017 International Conference on Computing, Communication and Automation (ICCCA) (IEEE, 2017), pp. 1–5
5. Z. Cui, X. Xu, F. Xue, X. Cai, Y. Cao, W. Zhang, J. Chen, Personalized recommendation system based on collaborative filtering for IoT scenarios. IEEE Trans. Serv. Comput. 13(4), 685–695 (2020)
6. F. Xue, X. He, X. Wang, J. Xu, K. Liu, R. Hong, Deep item-based collaborative filtering for top-N recommendation. ACM Trans. Inf. Syst. (TOIS) 37(3), 1–25 (2019)
7. Y. Afoudi, M. Lazaar, M. Al Achhab, Hybrid recommendation system combined content-based filtering and collaborative prediction using artificial neural network. Simul. Model. Pract. Theory 113, 102375 (2021)
8. W. Haoxiang, S. Smys, Big data analysis and perturbation using data mining algorithm. J. Soft Comput. Paradigm (JSCP) 3(01), 19–28 (2021)
9. J. Samuel Manoharan, Patient diet recommendation system using K-clique and deep learning classifiers. J. Artif. Intell. 2(2), 121–130 (2020)
10. M.C.V. Joe, J.S. Raj, Location-based orientation context dependent recommender system for users. J. Trends Comput. Sci. Smart Technol. (TCSST) 3(01), 14–23 (2021)
11. S. Milano, M. Taddeo, L. Floridi, Recommender systems and their ethical challenges. AI Soc. 35(4), 957–967 (2020)
12. S. Wang, L. Hu, Y. Wang, L. Cao, Q.Z. Sheng, M. Orgun, Sequential recommender systems: challenges, progress and prospects (2019). arXiv preprint arXiv:2001.04830
13. M. Karimi, D. Jannach, M. Jugovac, News recommender systems – survey and roads ahead. Inf. Process. Manage. 54(6), 1203–1227 (2018)
14. P. Kouki, J. Schaffer, J. Pujara, J. O'Donovan, L. Getoor, Personalized explanations for hybrid recommender systems, in Proceedings of the 24th International Conference on Intelligent User Interfaces (2019), pp. 379–390
15. A.C. Harkare, N. Pali, N. Khivasara, I. Jain, R. Murumkar, Personalized college recommender: a system for graduate students based on different input parameters using hybrid model
16. J. Wu, Knowledge-based recommender systems: an overview, available at medium.com
17. Angular, available at https://angular.io/
18. MongoDB, available at https://docs.mongodb.com
An Agent-Based Model to Predict Student Protest in Public Higher Education Institution T. S. Raphiri, M. Lall, and T. B. Chiyangwa
Abstract The purpose of this paper is to design and implement an agent-based model of student protests to predict the emergence of protest actions. Student protest actions have often resulted in damage to property and cancelation of academic programs. In this study, an agent-based model that integrates grievance as relative deprivation, perceived risk, and various network effects was used to simulate student protest. The results from a series of simulated experiments show that social influence, political influence, sympathy, net risk, and grievance were statistically significant factors that contribute to the probability of students engaging in a protest. Based on logistic regression, the model accounts for 93% of the variance (Nagelkerke R Square) in the dependent variable from the variables included in the equation. For university management to effectively manage unruly student protest actions, policies on risk management strategies should place more emphasis on understanding the network structures that integrate students' interactions, in order to monitor the propagation of opinions. Keywords Agent-based model · Student protest · Social conflicts · Collective behavior · Network influence
1 Introduction Public universities in South Africa have been negatively affected by recent student protests, which continue to be prevalent even after more than two decades of democracy. For instance, October 2015 marked the beginning of the #FeesMustFall movement, a wave of student protest unlike any previously experienced by South African universities post-apartheid [1].
T. S. Raphiri (B) · M. Lall Department of Computer Science, Tshwane University of Technology, Gauteng, South Africa e-mail: [email protected] M. Lall e-mail: [email protected] T. B. Chiyangwa Computer Science Department, University of South Africa, Gauteng, South Africa e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_12
This #FeesMustFall protest was triggered by a fee increment at Wits University and rapidly spread to other universities across the country. Students continue to be frustrated with several issues, including a lack of transformation and high inequality in South African universities; this lack of transformation and unequal distribution of educational resources increases the level of frustration and triggers protest behavior among students [2]. Dominguez-Whitehead [3] argues that students' grievances result from, among others, financial and academic exclusion, lack of financial aid, inadequate student residences, and crime occurring on campuses. Students have identified protest actions as an effective strategy to express their frustration and to challenge perceived injustices. Furthermore, the evolution and widespread accessibility of Internet technologies have made protest mobilization simpler than before and have changed the dynamics of social conflicts. To an extent, social media has been used to recruit people to join protests and to share opinions and news about ongoing protest events [4]. These Internet platforms have become an easy target for political campaigning and protest mobilization, as witnessed during the #FeesMustFall movement. The emergence of social conflict events (such as protests, civil violence, and revolutions) is classified as a property of complex adaptive systems and can be modeled through agent-based models (ABMs) [5–7]. Extensive research has been done to examine student protests in several disciplines, including social and political studies [3, 8]; however, very little if any ABM has been developed to predict student protests at institutions of higher education. In this study, an ABM that integrates the effects of relative deprivation (RD), net risk, social influence, political influence, and sympathy influence is developed to predict student protests at a public institution of higher education. The constructed model assists in identifying students' micro-level behavioral patterns and the combinations of factors which result in a protest action. Understanding this emergent behavior assists university management in identifying behavioral patterns that lead to a protest and subsequently helps prevent damage to property, intimidation of staff and non-protesting students, and possible injuries [9]. This paper is organized as follows: Section 2 presents related work. Section 3 provides the methodology, which consists of model design, implementation, calibration, and simulation experiments. Section 4 provides results and discussion, followed by the conclusion in Sect. 5.
2 Related Work Studies based on crowd simulation have shown how incorporating social conflict theories into ABMs can help develop useful techniques to examine protests [10–13]. Epstein [13] developed a widely adopted classical agent-based computational model of civil violence, and since then crowd simulation has evolved. In Epstein [13], civilians rebel if the difference between their grievance (G) and perceived net risk (N) exceeds a constant non-negative threshold (T), whereas law enforcement agents seek
to suppress rebelling civilians in their neighborhood. Grievance is denoted by the product of heterogeneous perceived hardship (H) and fixed government legitimacy (L). Net risk is represented as a function of the agent's risk perception (R) as well as the estimated arrest probability (P). The simplified behavioral rule of the agent in the model proposed by Epstein [13] is: "If G − N > T then 'rebel', else be 'quiet'." Simulation results qualitatively show that certain sets of parameters were able to generate typical characteristics and dynamics of civil violence processes, such as endogenous and sporadic outbursts of violence. However, Epstein [13]'s model is simple, without any defined social interaction, and serves as a baseline for future developments. The study of Kim and Hanneman [11] extended Epstein [13]'s ABM to simulate crowd dynamics in worker protests, theoretically examining the interplay between race and class and further investigating patterns of protest waves. In [11], an agent's decision to protest is based on grievance, represented by relative deprivation resulting from wage inequalities; perceived risk of being arrested; and group identity, denoted by ethnic and cultural tags. The simulation experiment results in [11] indicate that the frequency of protest is heavily influenced by wage inequalities (or grievances). However, Kim and Hanneman [11] only include neighborhood social interactions of agents, without any network structure or influence exerted by activists or community leaders. Furthermore, the study in [14] extended Epstein's model by incorporating three types of agents (protester, police, and media). Their proposed model qualitatively explores the interaction of agents in a street protest scenario, in order to understand the emerging crowd patterns, the influence of non-protesters, and the effects of news coverage. In [14], protesting citizens seek to invade the attraction point, while law enforcement agents with sufficient backup arrest violent protesters to defend the attraction points; the media agents seek to be closer to the attraction points to capture the violent actions. The results in [14] show that the model simulated real-life features of a protest, such as clustering of violent and active protesters, formation of a confrontation line, occasional fights and arrests, and media agents leading local clustering and seeking hot spots to capture the action as closely as possible. Although this model captures some realistic dynamics of the crowd patterns of a protest event, the protesting civilians' social interaction is minimal and does not incorporate the effects of social, political, and sympathy influence in the formation of a protest. Other studies that extended Epstein [13]'s model include Ormazábal, Borotto and Astudillo [5], who proposed an ABM to explore civil violence dynamics when hardship is represented as a function of money distribution, aiming to evaluate the effect of inequalities in the distribution of money on social mobilizations. Again, the study of Fonoberova, Mezić, Mezić, Hogg and Gravel [15] presented an ABM of civil violence to explore the effect of non-neighborhood links on protest dynamics when varying cop agent density and network degree. Neither [5] nor [15] integrated social, political, and sympathy influence in their proposed models. This study was aimed at developing an ABM of student protests which extends Epstein's work.
The model includes an integration of grievance which is
defined by relative deprivation accumulated from discrepancies in resource distribution (the inequality level). The model explores the effect of integrating social influence, defined by undirected friendship ties; political influence, denoted by directed activist links; and sympathy influence, resulting from a Moore neighborhood network graph.
3 Methodology 3.1 Model Description The student protest agent-based model (STUDPRO) consists of two types of turtle objects: student (S) and law enforcement officer (LEO) agents. STUDPRO includes two sorts of student agents: activists and regular students. Students and LEOs interact in an abstract artificial environment defined by a forty-by-forty two-dimensional grid space. Additional entities in the model represent linkages that operate as network graphs for the agents' interactions. Figure 1 depicts a high-level class diagram
Fig. 1 High-level class diagram of the model (own source)
Fig. 2 Model’s conceptual framework (own source)
of the model's entities. Each object has its own set of internal characteristics as well as processes. Students are heterogeneous agents that execute submodels which allow them to interact with one another through several network architectures, assess the next action state, and migrate to vacant patches. The ACTIVE? variable indicates each student's behavioral state, which is determined by using the threshold rule to assess whether the student is protesting or quiescent. Grievances, risks, and network effects such as social, political, and sympathy influence were all factors in students' decisions to engage in protest activity, as depicted in Fig. 2 (the green factors contribute positively toward participation, while red factors have a negative effect). The threshold rule to evaluate the behavioral state is: if T ≤ RD + SInfl + PInfl + SymInfl − NR, then be active; else, be quiet. That is, if the constant threshold T ≡ 0.1 (taken from Epstein [13]) is less than or equal to the accumulated grievance (quantified as relative deprivation, RD) plus the network influences, namely the social effect (SInfl), political effect (PInfl), and sympathy effect (SymInfl), minus the net risk (NR), then the student becomes active. A student's perceived relative deprivation (RD) represents the source of grievance. The possible range of RD felt by each student with regard to accessibility to resource x, in relation to a certain reference group (social connections, political ties, and neighbors), is [0, x*], where 0 represents the very minimum and x* denotes the maximum resources a student may have. For student i, [0, x_i] denotes the range of resources i can access, and [x_i, x*] represents the resources i is deprived of, which is used to quantify each unit of RD. The deprivation felt over a small interval (x, x + dx) can be expressed as 1 − F(x), where F(x) = ∫_0^x f(y) dy is the cumulative resource distribution and 1 − F(x) is the frequency of students in the reference group having resource accessibility above x. The relative deprivation experienced was calculated using (1):
RD = ∫_{x_i}^{x*} [1 − F(y)] dy        (1)
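Numerically, Eq. (1) can be approximated from an empirical resource distribution, as in this illustrative sketch (the resource values are made up for demonstration):

```python
# Approximate relative deprivation as the area under 1 - F(y)
# between a student's own resource level x_i and the group maximum x*.
import numpy as np

resources = np.array([1.0, 2.0, 2.0, 3.0, 5.0, 8.0])  # hypothetical reference group

def relative_deprivation(x_i, samples, steps=1000):
    grid = np.linspace(x_i, samples.max(), steps)
    # Empirical CDF F(y): fraction of the group with resources <= y.
    F = np.array([(samples <= y).mean() for y in grid])
    return np.trapz(1.0 - F, grid)  # integrate 1 - F(y) over [x_i, x*]

print(relative_deprivation(2.0, resources))  # larger for poorer students
```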
Similar to the ABM proposed by Epstein [13], if grievances outweigh the consequences of participation, students decide to protest. The perceived net risk (NR) of being suspended from academic activities was used to calculate each student's participation cost. Students calculate the participation cost, or perceived net risk, using (2):
NR = RA × [1 − e^(−k × V_law / S_active)] × J        (2)
As in Epstein [13], RA is a randomly assigned value ranging from 0 to 1 representing the fixed, heterogeneous risk aversion of each student. The constant k = 2.3, V_law represents the number of visible LEO agents, and S_active denotes the number of students in active states in the vicinity. The visibility radius is determined by the number of sideways patches each student and officer agent can observe, also referred to as the Moore (or indirect) neighborhood network [16], and is set by a slider when initializing the ABM. The maximum suspension term (J) is set as a homogeneous value for student agents and assigned through a slider during model setup. The social effect is determined by the interaction of students resulting from the symmetrical friendship network graphs integrated in the model. The structure of the students' friendship network is denoted by an undirected graph G{S, L}, where S represents the set of linked students and L the set of social ties or links between them. The constructed model expresses only fundamental attributes of network structure, which include vertex degree (deg(S)), distance between nodes, and node clustering. deg(S) is defined by a fixed heterogeneous number of friendship ties ∈ {L_1, …, L_Random(1, Maxfriends)}, in which Maxfriends is obtained as an input parameter through a slider in the model. In this study, students construct friendship ties by selecting students in the vision radius (V) and by randomly choosing other students with equal STUDY-LEVEL (which contains a value in the range {0, 4}) while deg(S) < L. This selection technique aids in generating a random network graph for a realistic representation of friendship ties. Opinions, represented as the difference between relative deprivation (RD) and net risk (NR), spread from aggrieved students to others via the integrated social friendship network, as summarized by (3):
RD − NR.ω1
(3)
∈ASNt
ASN_t denotes the set of students protesting over time in the friendship network graph of a student, and ω1 represents the global social effect weight, which was constant. The directed network graph G{A ∈ [A_1, …, A_num_activists], E ∈ [E_1, …, E_political_influence_size]} was integrated into the model to represent political influence
by activists. A and E define the set of activist nodes, in the range [1, num_activists], and the set of directed edges, in the range [1, political_influence_size], respectively. Edges were defined by an ordered pair of nodes (a, n) in the order a → n; as a result, influence is initiated by an activist (a) and directed to a student node (n). Activists have a positive out-degree, which is determined by the POLITICAL-INFLUENCE-SIZE input parameter. Activists are sources with directed connections aimed toward a proportion of students picked at random from the population with the attribute POLITICAL-PARTICIPATION? equal to TRUE. A student can be associated with more than one activist. Equation (4) was used to calculate the political effect (PInfl) in this study:
PInfl = Σ_{a ∈ NAP_i} (RD_a − NR_a) × ω2        (4)
NAP_i denotes the set of opinions from activists toward regular students, computed as RD minus NR, and ω2 represents a constant global political effect weight. Student agents were sympathetic toward active students situated in patches within the vision radius. The Moore neighborhood graph G = {[x, y] : |x − x_0| ≤ r, |y − y_0| ≤ r} was used to incorporate the sympathy effect in this study, whereby [x_0, y_0] denotes the patch position occupied by a student, and [x, y] represents the patches adjacent to [x_0, y_0], inclusive of [x_0, y_0]. The maximum vision range denoted by r was defined by an input parameter initialized through a slider. For each student in the model, the sympathy effect (SymInfl) was calculated based on (5):

SymInfl = Σ_{s ∈ AV_{i,t}} (RD_s − NR_s) × ω3        (5)
The differences between RD and NR were used to define the propagating opinions that construct the sympathy effect. Meanwhile, AV_{i,t} represents the set of active students in the neighborhood network structure of a student, and ω3 represents the global constant sympathy effect weight. Officers randomly catch one of the active students in their vision radius. Officers suspend active students by moving into the position of their patch. For each student, SUSPEND-TERM runs from 0 to J, where J was defined using the MAXIMUM-SUSPEND-TERM slider. Officers delay their next suspension by a number of time ticks in the range (0, SUSPEND-DELAY). Students and officers follow a standard movement rule in which they evaluate adjacent patches within their vision radius and move at random to any patch that is not occupied by other students or officers.
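The decision rule above can be summarized compactly. The following is a minimal Python sketch of the activation rule under the stated parameters (T = 0.1, k = 2.3); the function and variable names are illustrative and not taken from the actual NetLogo implementation:

```python
import math
import random

T = 0.1   # activation threshold, as in Epstein [13]
K = 2.3   # risk constant k

def net_risk(risk_aversion, visible_officers, visible_active, max_term):
    """NR = RA * (1 - exp(-k * V_law / S_active)) * J, per Eq. (2)."""
    ratio = visible_officers / max(visible_active, 1)  # avoid division by zero
    return risk_aversion * (1.0 - math.exp(-K * ratio)) * max_term

def is_active(rd, s_infl, p_infl, sym_infl, nr):
    """Threshold rule: active if T <= RD + SInfl + PInfl + SymInfl - NR."""
    return T <= rd + s_infl + p_infl + sym_infl - nr

# Example: an aggrieved student facing two officers and five active peers
ra = random.random()  # heterogeneous risk aversion in [0, 1]
nr = net_risk(ra, visible_officers=2, visible_active=5, max_term=30)
print(is_active(rd=0.6, s_infl=0.05, p_infl=0.05, sym_infl=0.02, nr=nr))
```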
3.2 Model Implementation

In this study, NetLogo version 6.1 was selected because it is widely utilized by researchers and is user friendly [17]. The NetLogo 6.1 integrated development environment (IDE) was used to code the ABM of students' protests, and multiple
Fig. 3 NetLogo-integrated development environment (own source)
simulation experiments were conducted through BehaviorSpace [18]. The NetLogo IDE is a key tool that aids in the construction, simulation, and examination of models. In addition, the NetLogo IDE contains useful and easy-to-follow tutorials and documentation. BehaviorSpace aided in the off-screen execution of the model, as well as parameter sweeping and recording simulated data in a comma-separated values (CSV) file. The user interface of the model implemented in this study is depicted in Fig. 3.
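Since BehaviorSpace writes each sweep to a CSV file, the simulated runs can be pulled into a statistics environment for the analysis reported below. A minimal Python sketch, assuming a hypothetical output file name and pandas installed (BehaviorSpace "table" output places several metadata rows above the column header, hence the skiprows argument):

```python
import pandas as pd

# Hypothetical BehaviorSpace table output; the first six rows hold run metadata.
runs = pd.read_csv("student_protest_experiment-table.csv", skiprows=6)

# One row per (run, tick); keep the final tick of each run for analysis.
final_ticks = runs.sort_values("[step]").groupby("[run number]").tail(1)
print(final_ticks.describe())
```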
3.3 Experimental Design

A simulation experiment was carried out in this study to investigate the effect of grievance, net risk, and the network influences, namely social, political, and sympathy, on the dynamics of students' protests. Table 1 shows the configuration of parameters that were shared across all simulation experiments that were conducted. Note that the values flagged with ** are similar to those in [5, 11, 13], while the values for the other parameters were chosen based on the model's sensitivity analysis. Table 2 depicts the experimental variables, which include the decision variables utilized in the parameter sweeping procedure. With the shared parameters held constant, the emphasis was on the impact of inequality and suspension delay, friendship ties, the number of activists and their influential size, and sympathetic influence on the dynamics of student protests under the first and second simulation conditions.
Table 1 Shared parameter configurations (own source)

| Parameter | Description | Value |
|---|---|---|
| S | Student population density | **70% |
| LEO | Law enforcement officers density | **4% |
| Jmax | Maximum suspend (layoff) term | **30 ticks |
| Q1 | Fixed arrest probability value | **2.3 |
| V | Moore neighborhood vision radius | **7 |
| AGENT_MOBILITY? | Activate students and officers movement | **True |
| ACTIVATE_SOCIAL_GROUP? | Activate student friendship links | True |
| ACTIVATE_POLITICAL_GROUP? | Activate activists mobilization links | True |
| FRIENDSHIP_WEIGHT | Weight of social friendship links | 0.05 |
| POLITICAL_WEIGHT | Weight of activist links | 0.05 |
| SYMPATHY_WEIGHT | Weight of sympathy to neighbors | 0.02 |
| Lattice dimensions | Grid size (number of patches) | **40 × 40 |
| Time step | Time ticks for each simulation experiment | **250 |
Table 2 Experimental variables (own source)

| Condition | Inequality level (IL) | Suspend delay (SD) | Maximum friendship ties (NF) | Number of activists % (NA) | Activists influential size (PI) | Sympathy |
|---|---|---|---|---|---|---|
| First condition | Low (0.02), Median (0.03), High (0.04) | Low (5 ticks), Median (7 ticks), High (9 ticks) | Low (deg(S) = 5), Median (deg(S) = 10), High (deg(S) = 15) | Low (0.05), Median (0.1), High (0.15) | Low (0.05), Median (0.1), High (0.15) | False |
| Second condition | Low (0.02), Median (0.03), High (0.04) | Low (5 ticks), Median (7 ticks), High (9 ticks) | Low (deg(S) = 5), Median (deg(S) = 10), High (deg(S) = 15) | Low (0.05), Median (0.1), High (0.15) | Low (0.05), Median (0.1), High (0.15) | True |
4 Results and Discussion

4.1 Data Screening

Data screening was conducted to ensure that the data captured from the simulated experiments and utilized in the statistical analysis met the standards required to produce accurate and acceptable results. In this study, the simulated data was normally distributed for all cases, and no observation contained missing values. Preliminary analysis was conducted to ensure that multicollinearity issues were addressed before running the predictive model. The regression coefficient results were used to calculate the collinearity tolerance threshold and variance inflation factors (VIF) to evaluate multicollinearity between decision variables. The results ascertained that there was no multicollinearity between predictor variables: tolerance was greater than 0.2, while VIF was less than 5. Further confirmation of the absence of multicollinearity was observed through Pearson's correlation matrix, whereby all associations between independent variables had correlation coefficients (R-values) under 0.9.
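The tolerance and VIF screening described above can be reproduced with standard tools. The following is a small Python sketch using statsmodels; the synthetic DataFrame and its column names are stand-ins for the simulated decision variables, purely for illustration:

```python
import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor
from statsmodels.tools.tools import add_constant

# Illustrative stand-in for the simulated decision variables
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(500, 5)),
                  columns=["grievance", "net_risk", "social_infl",
                           "political_infl", "sympathy_infl"])

X = add_constant(df)
for i, name in enumerate(X.columns):
    if name == "const":
        continue
    vif = variance_inflation_factor(X.values, i)
    print(f"{name}: VIF = {vif:.2f}, tolerance = {1.0 / vif:.2f}")
```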
4.2 Results

The dynamics of students' action states, which are either "protesting," "quiescent," or "suspended" over time, are illustrated in Figs. 4 and 5. When students show empathy toward protesting neighbors, the strength of the protest, represented by the number of students participating in collective action, tends to be higher. As shown in Fig. 4, when all decision variables (that is, inequalities, suspend delays, activists, political influential size, and number of friends) were high (H), the initial number of students protesting surpassed the number of students who were quiescent at time steps 0–5 and thereafter dropped to an average of 38% until time step 250. The average number of bystanders over time was 53%, whereas the number of suspended students was 9%. Simulating the decision factors with median (M) values resulted in a slightly decreased number of active students (36.5%), bystanders (55%), and suspended students (55%).

Fig. 4 Time and student population, first condition (own source)
Fig. 5 Time and student population second condition (own source)
When the decision variables were low (L), the number of students protesting was 29%, the number of students suspended was 4%, and the average number of students who were bystanders was 67%. Figure 5 shows the number of active, suspended, and bystander students over time when students are not sympathetic toward one another. When all decision factors were high (H), an average of 27.5% of the student population was engaged in protest action, 9.5% of students were suspended, and 63% of students were bystanders. Meanwhile, when the decision variables were medium (M), an average of 24% of students were active, 69% were quiescent, and 7% were suspended. When the decision factors were low (L), the number of active students was the lowest at 22.5%, 75% of the population was quiescent, while 2.5% of students were suspended. Furthermore, logistic regression analysis was conducted to evaluate the influence of the independent variables on the probability of a student protest occurring. The dependent variable STRIKE, which is dichotomous with value 0 (no strike) and value 1 (strike), is used to represent the likelihood of a strike occurring. Based on the Cox and Snell and Nagelkerke pseudo-R² values, the variation in the decision variables explained by the model in this study ranges from 18.1 to 93.0%, respectively. The goodness-of-fit test to evaluate the overall performance of the model with predictors was statistically significant, χ²(5) = 3103.400, p < 0.05. Table 3 depicts the classification table produced by the logistic regression model with the intercept and predictor variables. The classification accuracy with the added independent variables was 99.7%, as shown by the overall percentage. Sensitivity illustrates that 92.0% of participants who led to the occurrence of the strike were also predicted by the model. Specificity shows that 99.9% of participants who did not lead to the occurrence of the strike were correctly predicted by the model. The positive predictive value was 94.19%, while the negative predictive value was 99.82%. Furthermore, the Wald test was used to evaluate the statistical significance of each of the latent variables. The results in Table 4 clearly illustrate that social influence
Table 3 Classification table^a (own source)

| Observed | Predicted: No (0) | Predicted: Yes (1) | Percentage correct |
|---|---|---|---|
| Step 1 Strike No (0) | 15,180 | 20 | 99.9 |
| Step 1 Strike Yes (1) | 28 | 324 | 92.0 |
| Overall percentage | | | 99.7 |

^a The cut value is 0.500
(Wald = 358.222; p value < 0.05), political influence (Wald = 32.537; p value < 0.05), sympathy influence (Wald = 27.554; p value < 0.05), grievance (Wald = 261.2; p value = 0.045), and net risk (Wald = 325.411; p value < 0.05) are positively significant in the model's prediction. In conclusion, statistical analysis showed that the proposed model of this study differs from the traditional model, as given in Table 5.
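The logistic regression, Wald statistics, and pseudo-R² values reported above can be obtained from standard libraries. Below is a hedged Python sketch using statsmodels; the synthetic data is an assumption for illustration, not the authors' actual analysis script. Note that statsmodels reports McFadden's pseudo-R²; the Cox–Snell and Nagelkerke variants used in the text can be derived from the model and null log-likelihoods.

```python
import numpy as np
import statsmodels.api as sm

# Synthetic stand-in for the simulation output (illustration only)
rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 5))  # grievance, NR, social, political, sympathy
logits = X @ np.array([0.5, -1.0, 2.0, 1.0, 0.8])
y = (rng.random(1000) < 1 / (1 + np.exp(-logits))).astype(int)  # STRIKE (0/1)

model = sm.Logit(y, sm.add_constant(X)).fit(disp=False)
print(model.summary())                                # coefficients, z, p values
print("Wald statistics:", (model.params / model.bse) ** 2)  # Wald = (beta/SE)^2
print("McFadden pseudo-R2:", model.prsquared)
```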
4.3 Discussion of Results

The results in this research ascertain that grievance is a statistically significant influence on the probability of a student participating in protest action (β = 120.361; p value = 0.045). This finding is in line with [19], who argued that a certain level of grievance tends to increase the probability of an individual participating in political activities such as protest action. Furthermore, the perceived net risk is a statistically significant contributor to the probability of a student participating in protest behavior (β = −293.46; p < 0.05). This is consistent with the study of [19], who argued that when risk reaches a particular level, it (1) reduces the probability of any form of participation in political activities in general, (2) reduces interest in participating in protest, and (3) reduces participation in conflict activities. In addition, the simulation results posit that social influence is a statistically significant factor in the probability of a student participating in protest behavioral action (β = 16,159.44; p < 0.05). This outcome is further supported by the study of [20], which suggested that individuals with friendship links or acquaintances that are actively involved in a protest action are more likely to participate in social movement actions than others. In addition, the results reveal that political influence is a statistically significant determinant of the probability of a student participating in protest behavioral action (β = 2325.886; p < 0.05). These findings are in line with previous literature. According to [21], political discourses such as protest are primarily described through different levels of issues, including the propagation of political influence as well as political mobilization. The study of [22] suggests that the widespread use of social media platforms has enabled activists to target large numbers of people for protest mobilization, which subsequently increases protest participation. Lastly, the result of this study suggests that sympathy influence
Table 4 Variables in the equation (own source)

| Variable | Estimate | Standard error | Odds ratio | Z | Wald statistic | Df | p | 95% CI lower bound | 95% CI upper bound |
|---|---|---|---|---|---|---|---|---|---|
| (Intercept) | 139.636 | 8.661 | 4.398e+60 | 16.122 | 259.910 | 1 | < 0.001 | 122.660 | 156.612 |
| Grievance | 12.361 | 27.640 | 233,617.056 | 0.447 | 0.200 | 1 | 0.655 | −41.811 | 66.534 |
| Net risk | −293.460 | 16.268 | 3.565e−128 | −18.039 | 325.411 | 1 | < 0.001 | −325.344 | −261.575 |
| Social influence | 16,159.440 | 853.788 | ∞ | 18.927 | 358.222 | 1 | < 0.001 | 14,486.047 | 17,832.834 |
| Political influence | 2325.886 | 407.756 | ∞ | 5.704 | 32.537 | 1 | < 0.001 | 1526.700 | 3125.072 |
| Sympathy influence | 628.304 | 119.695 | 7.394e+272 | 5.249 | 27.554 | 1 | < 0.001 | 393.705 | 862.903 |

Note: Strike level "1" coded as class 1
Table 5 Comparison of traditional and proposed model (own source)

| Proposed model | Traditional model [11, 13] | Decision |
|---|---|---|
| Threshold | Threshold | Similar |
| Grievance | Grievance | Similar |
| Net risk | Net risk | Similar |
| Political effect | – | Different |
| Social effect | – | Different |
| Sympathy | – | Different |
is a statistically significant contributor to the probability of a student participating in protest behavioral action (β = 628.304; p < 0.05). This is in agreement with the study of [20], which argues that an individual's first step in protest participation is guided by consensus mobilization, whereby the general society is divided into people who sympathize with the cause and others who do not. The more effective consensus mobilization has been, the bigger the number of sympathizers a protest can attract.
5 Conclusion

The purpose of the study was to design and implement an agent-based prediction model of student protests in the context of South African public universities to predict the emergence of protest actions. Several fields, such as social and political studies, have conducted substantial research on student protest, but no agent-based model has been used to predict student demonstrations at higher education institutions. An ABM was employed to mimic student protest in this study. This study shows that social influence contributes the highest beta value (16,159.44) toward the emergence of a strike. Social influence, political influence, sympathy, net risk, and grievance contribute 93% of the total variance explained in predicting the probability of protest occurrence, based on the Nagelkerke pseudo-R² from the logistic regression. Future studies can utilize student demographic and empirical data to initialize the internal properties of student agents for a more realistic representation of model entities. Furthermore, based on the results obtained, it can be recommended that the application of ABMs of student protest be broadened to explore community-based protests initiated by activists.
6 Limitations

In terms of context and scope, this study only focused on student protests without categorizing them into peaceful or violent protests. This study acknowledges that incorporating protest containment strategies for law enforcement officers can significantly change the dynamics of the system; however, that was
out of scope in this research due to time constraints and the lack of available data on protest policing strategies in South African institutions of higher learning. The interactions of the model entities were limited, since there was no readily available spatial data to integrate a geographical information system to guide agents' movement rules in their hypothetical environment.
References

1. T. Luescher, L. Loader, T. Mugume, #FeesMustFall: an Internet-age student movement in South Africa and the case of the University of the Free State. Politikon 44(2), 231–245 (2017)
2. T.S. Kumar, T. Senthil, Construction of hybrid deep learning model for predicting children behavior based on their emotional reaction. J. Inf. Technol. 3(01), 29–43 (2021)
3. Y. Dominguez-Whitehead, Executive university managers' experiences of strike and protest activity: a qualitative case study of a South African university. South Afr. J. High. Educ. 25(7), 1310–1328 (2011)
4. T. Ramluckan, S.E.S. Ally, B. van Niekerk, Twitter use in student protests: the case of South Africa's #FeesMustFall campaign, in Threat Mitigation and Detection of Cyber Warfare and Terrorism Activities (IGI Global, 2017), pp. 220–253
5. I. Ormazábal, F. Borotto, H. Astudillo, Influence of money distribution on civil violence model. Complexity 2017, 1–15 (2017)
6. A. Bashar, Survey on evolving deep learning neural network architectures. J. Artif. Intell. 1(02), 73–82 (2019)
7. S. Mugunthan, T. Vijayakumar, Design of improved version of sigmoidal function with biases for classification task in ELM domain. J. Soft Comput. Paradigm (JSCP) 3(02), 70–82 (2021)
8. B. Oxlund, Responding to university reform in South Africa: student activism at the University of Limpopo. Soc. Anthropol. 18(1), 30–42 (2010)
9. S. Peté, Socrates and student protest in post-apartheid South Africa, Part Two. J. Juridical Sci. 40(1–2), 1–23 (2015)
10. S.S. Bhat, A.A. Maciejewski, An Agent-Based Simulation of the LA 1992 Riots (Colorado State University, Libraries, 2006)
11. J.-W. Kim, R. Hanneman, A computational model of worker protest. J. Artif. Soc. Soc. Simul. 14(3), 1 (2011)
12. P. Lacko, M. Ort, M. Kyžnanský, A. Kollar, F. Pakan, M. Ošvát, J. Branišová, Riot simulation in urban areas (IEEE, 2013), pp. 489–492
13. J.M. Epstein, Modeling civil violence: an agent-based computational approach. Proc. Natl. Acad. Sci. 99(suppl 3), 7243–7250 (2002)
14. C. Lemos, H. Coelho, R.J. Lopes, Agent-based modeling of protests and violent confrontation: a micro-situational, multi-player, contextual rule-based approach (2014), pp. 136–160
15. M. Fonoberova, I. Mezić, J. Mezić, J. Hogg, J. Gravel, Small-world networks and synchronisation in an agent-based model of civil violence. Global Crime 20(3–4), 161–195 (2019)
16. S. Klancnik, M. Ficko, J. Balic, I. Pahole, Computer vision-based approach to end mill tool monitoring. Int. J. Simul. Model. 14(4), 571–583 (2015)
17. M. Lall, An agent-based simulation of an alternative parking bay choice strategy. S. Afr. J. Ind. Eng. 31(2), 107–115 (2020)
18. U. Wilensky, NetLogo (Center for Connected Learning and Computer-Based Modeling, Northwestern University, Evanston, IL, USA, 1999). http://ccl.northwestern.edu/netlogo/
19. A.F. Lemieux, E.M. Kearns, V. Asal, J.I. Walsh, Support for political mobilization and protest in Egypt and Morocco: an online experimental study. Dyn. Asymmetric Conflict 10(2–3), 124–142 (2017)
20. J. van Stekelenburg, B. Klandermans, The social psychology of protest. Curr. Sociol. 61(5–6), 886–905 (2013)
21. G. Singh, S. Kahlon, V.B.S. Chandel, Political discourse and the planned city: Nehru's projection and appropriation of Chandigarh, the capital of Punjab. Ann. Am. Assoc. Geogr. 109(4), 1226–1239 (2019)
22. T. Poell, J. van Dijck, Social media and activist communication, in The Routledge Companion to Alternative and Community Media (2015), pp. 527–537
RETRACTED CHAPTER: High Accuracy for Hyperspectral Image Classification Using Hybrid Spectral 3D-2D CNN
Mohini Shivhare and Sweta Tripathi
Abstract Hyperspectral image (HSI) classification is widely used for the analysis of remotely sensed images. HSI comprises varying bands of images. CNN is one of the most frequently used deep learning-based techniques for visual data processing. The use of CNN for HSI classification is also evident in recent works. These approaches are mostly based on 2D CNN. However, HSI classification performance is highly dependent on both spatial and spectral information. Few methods have utilized the 3D CNN because of the increased computational complexity. This paper introduces a hybrid spectral CNN (HybridSN) for HSI classification. In general, the HybridSN is a spectral–spatial 3D CNN followed by a spatial 2D CNN. The 3D CNN facilitates the joint spatial–spectral feature representation from a stack of spectral bands. The 2D CNN on top of the 3D CNN further learns more abstract-level spatial representation. In addition, the use of hybrid CNNs reduces the complexity of the model compared with the use of 3D CNN alone. A very satisfactory performance is obtained using the proposed HybridSN for HSI classification.
Keywords Hyperspectral image (HSI) · 2D · 3D · Convolutional neural network (CNN) · HybridSN
The original version of this chapter was retracted. The retraction note to this chapter can be found at https://doi.org/10.1007/978-981-19-2894-9_63 M. Shivhare (B) · S. Tripathi Department of Electronics and Communication, Kanpur Institute of Technology, Kanpur, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022, corrected publication 2024 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_13
1 Introduction
The overall architecture of HSI processing is presented in Fig. 1. It contains four stages, namely image acquisition, pre-processing, dimensionality reduction, and classification. In the first stage, the HSI is acquired with hyperspectral sensors. A hyperspectral remote sensing device captures a scene using multiple imaging spectrometer sensors over a wide range of wavelengths from the visible to the near-infrared range, which offers detailed spectral information about the ground objects in several contiguous spectral bands (from tens to several hundreds) [1]. The hyperspectral data captured using an imaging spectrometer contain various errors which are generated due to viewing geometry, atmospheric conditions, platform movements, and other sources. These errors are categorized as atmospheric, radiometric, and geometric errors. They are reduced in the pre-processing stage using various correction approaches. In atmospheric correction, the surface reflectance from remotely sensed imagery is retrieved through removal of atmospheric effects [2, 3]. In radiometric corrections, gray values of pixels are converted into radiance values corresponding to radiation reflected or emitted from the surface. Geometric correction involves transformation of a remotely sensed image, enabling it to have the scale and projection properties of a given map projection. These corrections are already conducted on standard benchmark hyperspectral datasets. However, while working
Fig. 1 General architecture of hyperspectral image processing
with real-time hyperspectral datasets, this correction must be performed in the pre-processing stage. Also, the collected spectra are distorted by sensor noise as well as varying illumination or atmospheric conditions. Hence, the HSI usually has a few noisy and water-absorption bands [4]. These bands need to be removed before further processing. In this work, we used standard pre-processed benchmark datasets. Given a set of training samples, hyperspectral classification aims to assign a unique label to each test pixel in the image. A vector represents each pixel of an HSI [5]. Each pixel corresponds to the reflectance of the object, and the length of the vector is equal to the number of discrete spectral bands. Classification of HSI is a challenging task due to the presence of a large number of bands and a limited number of labeled samples [6]. The performance of the classifier depends on the relationship between the number of training samples and the number of features. The huge number of spectral bands in hyperspectral data results in a huge volume of data. With an insufficient number of labeled training samples, classification performance can suffer significant degradation due to the "curse of dimensionality" or "Hughes phenomenon". Since the number of features in a hyperspectral image is large, the number of training samples must be large for accurate classification of the hyperspectral image, which is impractical. Moreover, the neighboring bands in hyperspectral data are mostly strongly correlated [7].
2 HSI
HSI acquires many very narrow, contiguous spectral bands throughout the visible, near-infrared, mid-infrared, and thermal infrared portions of the electromagnetic spectrum. HSI systems typically collect 200 or more bands, enabling the construction of an almost continuous reflectance spectrum for every pixel in the scene. The contiguous narrow bandwidths characteristic of hyperspectral data allow for in-depth examination of earth surface features which would otherwise be "lost" within the relatively coarse bandwidths acquired with multispectral scanners. Over the previous decade, extensive research and development has been carried out in the field of hyperspectral remote sensing. With commercial airborne HSI and the launch of satellite-based sensors such as Hyperion, HSI is moving fast into the mainstream of remote sensing and applied remote sensing research studies. HSI has found many applications in water resource management, agriculture, and environmental monitoring. It is important to remember that there is hardly a difference in spatial resolution between hyperspectral and multispectral data but rather in their spectral resolutions. HSI is commonly referred to as spectral imaging or spectral analysis. The distinction between hyper- and multi-spectral is sometimes based on an arbitrary "number of bands" or on the type of measurement, depending on what is appropriate for the purpose. Multispectral imaging deals with a few images at discrete and somewhat narrow bands. Being "discrete and
Fig. 2 Concept of HSI
fairly narrow" is what distinguishes multispectral in the visible from color photography. A multispectral sensor may have many bands covering the range from the visible to the long-wave infrared. Multispectral images do not produce the "spectrum" of an object. Landsat is an excellent example of multispectral imaging (Fig. 2).
3 CNN
CNN is one of the most advanced artificial intelligence approaches for handling computer vision challenges [8]. The deep CNN was established as a class of deep learning in the form of a feed-forward artificial neural network and has been applied in several agricultural image classification works [9]. The convolutional layer (CL) is fundamental in a CNN, and it extracts the features from input images using filters. A large volume of training data is necessary to improve the performance of a CNN [10]. One of the key benefits of using deep CNNs in image classification is reducing the need for the feature engineering process. Various convolutions are performed in several layers of the CNN [11]. They generate different representations of the training data, starting from more general ones in the first layers and becoming more detailed in the deeper layers. Initially, the CLs operate as feature extractors on the training data, whose dimensionality is then reduced using the pooling layers. The CLs refine various lower-level features into more discriminative features [12]. Furthermore, the CLs are the essential building blocks of deep CNNs. Feature engineering is a very distinctive part of deep learning and a significant step ahead of traditional machine learning [13].
Fig. 3 Convolutional neural network
The pooling layer conducts the down-sampling operation along the spatial dimensions. It helps reduce the number of parameters. The max pooling technique was used in the pooling layer of the proposed model [14]. Max pooling achieves better performance than average pooling in the proposed deep CNN model. Another important layer is dropout, which refers to removing units from the network. It is a regularization technique for reducing overfitting. The proposed model was trained and compared using different dropout values ranging from 0.2 to 0.8. Finally, the dense layer performs the classification using the output of the convolutional and pooling layers. Deep CNN training is a highly iterative process, and multiple models must be trained to find the best one [15]. Gradient descent is a basic optimization technique that conducts the gradient steps using all training data at each step, and it is also known as batch gradient descent. The execution of gradient descent with a large training set is difficult. Figure 3 shows the sample layered architecture of the deep CNN method. The filters encode the basic features that are searched for in the input image in the CL. It takes a w × h × d (w-width, h-height, d-depth of an image) array of pixels as input, over which a k × k window, known as a filter or kernel, is slid across the width and height such that it covers the whole area of the input image. While sliding through the image, each pixel value in the input image is multiplied with the values in the filter element by element and summed up to give one pixel value of the output image [10]. Each layer outputs a set of activation maps or feature maps, one for each filter, and this is fed as input to the next convolution layer. The convolution operation is clearly explained in Fig. 4. For a given 5 × 5 image, a 3 × 3 filter is slid across the image one pixel at a time until it reaches the last column, which results in three convolution operations for the first row. Then, the filter is slid one pixel down from the leftmost corner and again slid across the image till the end. This horizontal and vertical sliding continues till it reaches the bottom-right-most 3 × 3 square. This produces a 3 × 3 activation map. Likewise, for n filters, n activation maps will be generated.
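To make the sliding-window arithmetic above concrete, here is a small, self-contained Python/NumPy sketch of a "valid" 2D convolution (strictly, cross-correlation, as is conventional in CNNs); the image and filter values are illustrative:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide a k x k kernel over the image; each step is an element-wise
    multiply-and-sum producing one output pixel (no padding, stride 1)."""
    k = kernel.shape[0]
    out_h = image.shape[0] - k + 1
    out_w = image.shape[1] - k + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + k, j:j + k] * kernel)
    return out

image = np.arange(25).reshape(5, 5)          # 5 x 5 input
kernel = np.ones((3, 3)) / 9.0               # 3 x 3 averaging filter
print(conv2d_valid(image, kernel).shape)     # -> (3, 3) activation map
```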
Fig. 4 Example of transfer learning technique
Transfer learning is a method to transfer knowledge from one machine learning model to another [16]. It shortens the initial development cycle of the new model by taking the weights and bias values from existing models [17]. For example, a machine learning model developed for task A is reused as the foundation for a machine learning model for task B. Transfer learning techniques use pre-trained state-of-the-art models for gathering knowledge and transfer it to new models [18]. The most well-known pre-trained deep learning models are AlexNet, visual geometry group network (VGGNet), residual network (ResNet), and Inception network [14]. AlexNet is a widely used pre-trained deep convolutional neural network model. AlexNet consists of five convolutional layers with a ReLU activation function and three fully connected layers. There are 62 million trainable parameters in AlexNet. Figure 4 illustrates the process of transfer learning techniques. VGGNet increases performance and reduces the training time compared with AlexNet. A major difference between VGGNet and AlexNet is that VGGNet uses smaller kernel sizes in the convolutional and pooling layers than AlexNet. The size of the kernel is fixed throughout the entire training process. The variants of VGGNet
are VGG16 and VGG19. The number indicates the number of layers in the network. VGG16 has a total of 138 million trainable parameters. ResNet deals with the vanishing gradient problem in the training process of deep convolutional neural networks. ResNet uses a shortcut connection to improve the performance of the network. It uses only two pooling layers in the entire network. The most frequently used ResNet models are ResNet18, ResNet50, and ResNet101. ResNet18 has eleven million trainable parameters. Furthermore, InceptionNet introduced the parallel kernel technique to handle variable kernel sizes [15]. The most basic version of InceptionNet is GoogleNet. There are 6.4 million trainable parameters in GoogleNet.
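As a hedged illustration of how such a pre-trained backbone is reused, the sketch below loads ResNet18 with ImageNet weights via PyTorch/torchvision, freezes the convolutional feature extractor, and swaps in a new classification head; the ten-class head is an assumption chosen purely for illustration:

```python
import torch.nn as nn
from torchvision import models

# Load ResNet18 pre-trained on ImageNet (torchvision >= 0.13 weight API)
backbone = models.resnet18(weights="IMAGENET1K_V1")

# Freeze the learned feature extractor
for param in backbone.parameters():
    param.requires_grad = False

# Replace the final fully connected layer for a hypothetical 10-class task;
# only backbone.fc now receives gradient updates during fine-tuning.
backbone.fc = nn.Linear(backbone.fc.in_features, 10)
```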
4 Proposed Methodology
Let the spectral–spatial HSI data cube be denoted by A ∈ R^(W×H×D), where A is the original input, W is the width, H is the height, and D is the number of spectral bands. Each HSI pixel in A contains D spectral measures and forms a one-hot label vector Y = (y1, y2, …, yC) ∈ R^(1×1×C), where C denotes the land-cover categories. However, the HSI pixels exhibit mixed land-cover classes, introducing high intra-class variability and inter-class similarity into A. It is a great challenge for any model to handle this issue. To remove the spectral redundancy, first the conventional principal component analysis (PCA) is applied over the original HSI data (A) along the spectral bands. The PCA reduces the number of spectral bands from D to B while maintaining the same spatial dimensions (i.e., width W and height H). We have reduced only the spectral bands such that the spatial information, which is vital for recognizing any object, is preserved. We denote the PCA-reduced data cube by X ∈ R^(W×H×B), where X is the modified input after PCA. The parameters of the CNN, such as the bias b and the kernel weights W, are typically trained using supervised approaches with the help of a gradient descent optimization technique. In conventional 2D CNNs, the convolutions are applied over the spatial dimensions only, covering all the feature maps of the previous layer, to compute the 2D discriminative feature maps. On the other hand, for the HSI classification problem, it is desirable to capture the spectral information encoded in multiple bands along with the spatial information. The 2D CNN cannot handle the spectral information. Conversely, the 3D CNN kernel can extract the spectral and spatial feature representation simultaneously from HSI data, but at the cost of increased computational complexity. To take advantage of the automatic feature learning capability of both 2D and 3D CNNs, we propose a hybrid feature learning framework called HybridSN for HSI classification. The flow diagram of the proposed HybridSN network is shown in Fig. 5. It contains three 3D convolutions, one 2D convolution, and three fully connected layers.
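The spectral PCA step described above can be sketched in Python with scikit-learn; the cube dimensions and the choice of 30 retained components are assumptions for illustration:

```python
import numpy as np
from sklearn.decomposition import PCA

def reduce_bands(cube, n_components=30):
    """Apply PCA along the spectral axis of a W x H x D hyperspectral cube,
    keeping the spatial dimensions intact (D bands -> n_components bands)."""
    w, h, d = cube.shape
    flat = cube.reshape(-1, d)                   # one spectrum per pixel
    reduced = PCA(n_components=n_components).fit_transform(flat)
    return reduced.reshape(w, h, n_components)

cube = np.random.rand(145, 145, 200)             # e.g., an Indian Pines-sized cube
print(reduce_bands(cube).shape)                  # -> (145, 145, 30)
```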
Fig. 5 Proposed methodology
To increase the number of spectral–spatial feature maps simultaneously, 3D convolutions are applied three times, which preserves the spectral information of the input HSI data in the output volume. The 2D convolution is applied once before the flatten layer, keeping in mind that it strongly discriminates the spatial information within different spectral bands without substantial loss of spectral information, which is very important for HSI data.
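A compact PyTorch sketch of this hybrid 3D-then-2D layout is given below; the layer sizes (8/16/32 3D kernels of (7,3,3), (5,3,3), (3,3,3), a 64-kernel 3 × 3 2D convolution, and 256/128-unit dense layers) follow the HybridSN design described in [7], while the 30-band, 25 × 25-window input and 16-class output are assumptions chosen to match the window-size experiment below:

```python
import torch
import torch.nn as nn

class HybridSNSketch(nn.Module):
    """Input: patches of shape (batch, 1, B, S, S), with B spectral bands
    after PCA and an S x S spatial window."""
    def __init__(self, bands=30, window=25, num_classes=16):
        super().__init__()
        self.conv3d = nn.Sequential(
            nn.Conv3d(1, 8, kernel_size=(7, 3, 3)), nn.ReLU(),
            nn.Conv3d(8, 16, kernel_size=(5, 3, 3)), nn.ReLU(),
            nn.Conv3d(16, 32, kernel_size=(3, 3, 3)), nn.ReLU(),
        )
        depth = bands - 6 - 4 - 2   # spectral size left after the three 3D convs
        side = window - 6           # spatial size left after three 3 x 3 convs
        self.conv2d = nn.Sequential(
            nn.Conv2d(32 * depth, 64, kernel_size=3), nn.ReLU(),
        )
        self.fc = nn.Sequential(
            nn.Linear(64 * (side - 2) ** 2, 256), nn.ReLU(), nn.Dropout(0.4),
            nn.Linear(256, 128), nn.ReLU(), nn.Dropout(0.4),
            nn.Linear(128, num_classes),
        )

    def forward(self, x):
        x = self.conv3d(x)                 # (N, 32, depth, side, side)
        n, c, d, h, w = x.shape
        x = x.reshape(n, c * d, h, w)      # fold the spectral axis into channels
        x = self.conv2d(x)
        return self.fc(x.flatten(1))

x = torch.randn(2, 1, 30, 25, 25)
print(HybridSNSketch()(x).shape)           # -> torch.Size([2, 16])
```

Folding the remaining spectral axis into the channel dimension between the 3D and 2D stages is what lets the single 2D convolution operate on the joint spectral–spatial features.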
5 Simulation Result

The computational efficiency of the HybridSN model is reported in terms of training and testing times in Table 1. The proposed model is more efficient than the 3D CNN (Figs. 6 and 7).
Table 1 Result of test and training

| Data | 2D CNN Train (m) | 2D CNN Test (s) | 3D CNN Train (m) | 3D CNN Test (s) | HybridSN Train (m) | HybridSN Test (s) |
|---|---|---|---|---|---|---|
| IP | 1.871 | 1.056 | 14.226 | 4.128 | 13.901 | 4.681 |
| UP | 1.793 | 1.214 | 57.912 | 10.291 | 20.038 | 6.492 |
| SA | 2.103 | 2.078 | 73.291 | 15.101 | 25.128 | 8.902 |
Fig. 6 HSI image-I
The effect of the spatial window size on the performance of the HybridSN model is reported in Table 2. It was found that the 25 × 25 spatial window is most suitable for the proposed method. We have also computed the results with even less training data, i.e., only 10% of total samples, and have summarized the results in Table 3. It is observed from this experiment that the performance of each model decreases slightly, whereas the proposed method is still able to beat the other methods in almost all cases (Table 3).
Fig. 7 HSI image-II
Table 2 Spatial window size

| Window | IP (%) | UP (%) | SA (%) |
|---|---|---|---|
| 19 × 19 | 99.74 | 99.98 | 99.99 |
| 21 × 21 | 99.73 | 99.90 | 99.69 |
| 23 × 23 | 99.31 | 99.96 | 99.71 |
| 25 × 25 | 99.75 | 99.98 | 100 |
Table 3 Classification accuracies

| Method | Indian Pines OA | Indian Pines Kappa | Indian Pines AA | Univ. of Pavia OA | Univ. of Pavia Kappa | Univ. of Pavia AA | Salinas Scene OA | Salinas Scene Kappa | Salinas Scene AA |
|---|---|---|---|---|---|---|---|---|---|
| 2D CNN | 79.25 | 77.98 | 67.77 | 95.22 | 94.67 | 93.59 | 95.22 | 94.87 | 93.33 |
| 3D CNN | 81.03 | 78.82 | 75.32 | 95.68 | 94.37 | 96.22 | 84.58 | 82.78 | 88.36 |
| M3D CNN | 80.24 | 80.13 | 74.11 | 94.67 | 92.90 | 96.49 | 93.16 | 92.89 | 95.21 |
| SSRN | 97.04 | 97.46 | 85.04 | 98.38 | 98.93 | 98.33 | 98.22 | 98.48 | 98.38 |
| HybridSN | 97.79 | 97.99 | 97.75 | 98.39 | 98.98 | 98.16 | 98.34 | 98.78 | 98.07 |
6 Conclusion
The weight assigned to each band was determined based on the criteria of minimizing the distance within each group and maximizing the distance among the various clusters, which highlights the significance of the specific band in the fusion process. The significance of this approach lies in its highly discriminative capability, which leads to a superior classification performance. Experimental results and comparison with the existing feature extraction approaches demonstrated the efficiency of the proposed approach for HSI classification. Compared with the other competing approaches on three standard datasets, the proposed approach has shown higher classification accuracy and better visual results. The proposed HybridSN model fundamentally combines the complementary spatial–spectral and spectral information in the form of 3D and 2D convolutions, respectively. The experiments over three benchmark datasets, compared with the recent state-of-the-art methods, confirm the superiority of the proposed method. The proposed model is computationally more efficient than the 3D-CNN model.
References
1. D. Kumar, D. Kumar, Hyperspectral image classification using deep learning models: a review (ICMAI, IEEE, 2021)
2. U. Kulkarni, S.M. Meena, S.V. Gurlahosur, U. Mudengudi, Classification of cultural heritage sites using transfer learning, in 2019 IEEE Fifth International Conference on Multimedia Big Data (BigMM) (2019), pp. 391–397
3. L. Zhang, L. Zhang, B. Du, Deep learning for remote sensing data: a technical tutorial on the state of the art. IEEE Geosci. Remote Sens. Mag. 4(2), 22–40 (2016)
4. B. Rasti, D. Hong, R. Hang, P. Ghamisi, X. Kang, J. Chanussot, J.A. Benediktsson, Feature extraction for hyperspectral imagery: the evolution from shallow to deep: overview and toolbox. IEEE Geosci. Remote Sens. Mag. 8(4), 60–88 (2020)
5. J.M. Haut, M.E. Paoletti, J. Plaza, A. Plaza, J. Li, Hyperspectral image classification using random occlusion data augmentation. IEEE Geosci. Remote Sens. Lett. 16(11), 1751–1755 (2019)
6. Y. Li, H. Zhang, Q. Shen, Spectral–spatial classification of hyperspectral imagery with 3D convolutional neural network. Remote Sens. 9(1), 67 (2017)
7. S.K. Roy, G. Krishna, S.R. Dubey, B.B. Chaudhuri, HybridSN: exploring 3-D–2-D CNN feature hierarchy for hyperspectral image classification. IEEE Geosci. Remote Sens. Lett. 17(2), 277–281 (2019)
8. X. Zhang, Y. Sun, K. Jiang, C. Li, L. Jiao, H. Zhou, Spatial sequential recurrent neural network for hyperspectral image classification. IEEE J. Select. Top. Appl. Earth Observ. Remote Sens. 11(11), 4141–4155 (2018)
9. Q. Liu, F. Zhou, R. Hang, X. Yuan, Bidirectional-convolutional LSTM based spectral-spatial feature learning for hyperspectral image classification. Remote Sens. 9(12), 1330 (2017)
10. C. Shi, C.M. Pun, Multi-scale hierarchical recurrent neural networks for hyperspectral image classification. Neurocomputing 294, 82–93 (2018)
11. Y. Liu, L. Gao, C. Xiao, Q. Ying, K. Zheng, A. Marinoni, Hyperspectral image classification based on a shuffled group convolutional neural network with transfer learning. Remote Sens. 12, 1–18 (2020)
12. H. Yu, L. Gao, W. Liao, B. Zhang, L. Zhuang, M. Song, J. Chanussot, Global spatial and local spectral similarity-based manifold learning group sparse representation for hyperspectral imagery classification. IEEE Trans. Geosci. Remote Sens. 58, 3043–3056 (2020)
13. X. Zhao, Y. Liang, A.J. Guo, F. Zhu, Classification of small-scale hyperspectral images with multi-source deep transfer learning. Remote Sens. Lett. 11, 303–312 (2020)
14. X. He, Y. Chen, P. Ghamisi, Heterogeneous transfer learning for hyperspectral image classification based on convolutional neural network. IEEE Trans. Geosci. Remote Sens. 58, 3246–3263 (2019)
15. H. Zhang, Y. Li, Y. Jiang, P. Wang, Q. Shen, C. Shen, Hyperspectral classification based on lightweight 3-D-CNN with transfer learning. IEEE Trans. Geosci. Remote Sens. 57, 5813–5828 (2019)
16. X. Zhang, X. Zhou, M. Lin, J. Sun, ShuffleNet: an extremely efficient convolutional neural network for mobile devices, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018, pp. 6848–6856
17. N. Ma, X. Zhang, H.T. Zheng, J. Sun, ShuffleNet v2: practical guidelines for efficient CNN architecture design, in Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018, pp. 116–131
18. M.E. Paoletti, J.M. Haut, J. Plaza, A. Plaza, A new deep convolutional neural network for fast hyperspectral image classification. ISPRS J. Photogramm. Remote Sens. 145, 120–147 (2018)
Design Smart Curtain Using Light-Dependent Resistor Feras N. Hasoon, Mustafa Khalaf Aal Thani, Hilal A. Fadhil, Geetha Achuthan, and Suresh Manic Kesavan
Abstract With the growth and development of the intelligent home industry, the smart curtain has become part of the advanced technology of this era. A smart curtain is an advanced technology that closes and opens automatically without user interference. The system was built based on an external light sensor. By analyzing and detecting the external light, a light-dependent resistor (LDR) automatically closes and opens the curtain according to the light intensity. This paper presents the tools used to build the smart curtain, with experiments conducted before the entire hardware project was implemented. Keywords Smart curtain · Intelligent curtain · Automatic curtain · Sensible curtain
1 Introduction

The curtain is a window cover and has an irreplaceable role in daily life. Curtains have both practical and aesthetic advantages that include regulation of sunlight, insulation from heat and cold, privacy at night, and protection from outside dust. A curtain is a mandatory accessory for homes and offices, as it plays an essential role in decoration [1–4]. Privacy is one of the top reasons to utilize curtains. It is necessary to protect the privacy of the home. Curtains prevent people from peering into the privacy of the house, and additionally, they give a feeling of comfort and reassurance in the home, especially at night [5]. Installing curtains gives safety and the feeling of being uninterrupted by the additional glare of lighting and outside vehicle movement. This is a great method to improve privacy in the home. With the development of technology, the home curtain F. N. Hasoon · M. Khalaf Aal Thani · G. Achuthan · S. M. Kesavan (B) Electrical and Computer Engineering Department, National University of Science and Technology, Muscat, Sultanate of Oman e-mail: [email protected] F. N. Hasoon e-mail: [email protected] H. A. Fadhil Department of Electrical and Computer Engineering, Sohar University, Sohar, Sultanate of Oman © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_14
can be created as a smart curtain. Therefore, the curtain can be made to close and open automatically at times specified by the user. For example, the curtain closes during the night to provide privacy, and it opens throughout the day to illuminate the house and save the electrical energy consumed by the lamps [6, 7]. This paper presents a smart curtain using a light-dependent resistor (LDR) that contributes to operating the curtain according to the sunlight. Curtains are widely installed in home and office environments [8]. Additionally, the paper introduces the major testing and simulation of the components used to design the hardware of the smart curtain. The operation of the smart curtain has been designed as a manual mode or an auto mode. The manual mode acts as a switch that can open and close the curtain manually, in case the user wants to open the curtain at night, for example. The auto mode operates automatically depending on the external light: the curtain has been designed to open in the daytime and close at night. Moreover, the electric circuit of the smart curtain has an intelligent design in how the relays inside the circuit operate and stop depending on the mode of the curtain.
2 System Design

2.1 Block Diagram

The block diagram of the smart curtain system is shown in Fig. 1, which includes the major components in the electrical circuit. These components include a rectifier, a double pole double throw (DPDT) switch, a single pole triple throw (SP3T) switch,
Fig. 1 Block diagram of smart curtain system
a single pole double throw (SPDT) relay, a double pole double throw (DPDT) relay, an LDR, and a DC motor. The DPDT switch is used to change the operation mode of the curtain between auto and manual. Also, the SP3T switch is used for closing and opening the curtain in manual mode. The AC power supply is converted to a 16 V DC power supply by a step-down transformer with a rectifier. In turn, the AC power supply is used for the AC lamp and the DPDT relay. The LDR is fed by a DC supply to sense the external light. At a certain resistance, depending on the light intensity, the LDR feeds the DPDT relay with a DC supply to proceed with the operation of the curtain. The DPDT relay has three options to send an open or close signal to the next SPDT relay, as well as to turn the AC lamp on during the process of opening or closing the curtain. The DC motor receives power or reversed-power signals from the SPDT relays to proceed with opening or closing the curtain.
2.2 Light Sensor (LDR)

LDR stands for light-dependent resistor. It is a resistor whose value changes according to the intensity of the light falling on it. There is an inverse relationship between the light intensity and the resistance of the LDR. The value of resistance decreases when the intensity of the incident light increases. Conversely, the value of resistance increases when the LDR is shielded from light. In the dark, the resistance of the LDR is very high, reaching more than a megaohm. In the event the LDR is exposed to light, the resistance drops to a few hundred ohms [9]. The LDR is the main component of the smart curtain project. Hence, the automatic operation of the curtain depends mainly on the LDR component, as shown in Fig. 2.
Fig. 2 LDR component and resistance curve [10]
Fig. 3 Symbol of SPDT and DPDT relay [13]
2.3 SPDT and DPDT Relay

A relay is a control device that is utilized to establish or stop the flow of electrical power in a circuit [11]. Additionally, this device is employed as a switch for opening or closing the circuit. Hence, it is considered an electromechanical device. A relay circuit is often made up of switches, wires, and coils. Two models of relay are implemented in the smart curtain system: the SPDT relay and the DPDT relay. An SPDT relay has three terminals: a common port and two other ports that exchange connections to the shared port. The SPDT is very suitable for choosing between two power supplies, commutating inputs, or any other application requiring one of two circuits to be connected [12]. A DPDT relay is equivalent to two SPDT relays. The DPDT can control two circuits, but these circuits must be switched on or off together. DPDT switches consist of six terminals. The relay operates when a voltage is applied to its circuit: the coil of the relay magnetizes and moves the pole latch to touch the other terminal (from normally closed to open circuit or vice versa). Figure 3 illustrates the symbols of, and the difference between, the SPDT and DPDT relay.
2.4 LDR and Voltage Divider

The LDR in the circuit is used to sense the external light, and its resistivity is high or low depending on the intensity of the light, because there is an inverse relationship between the intensity of the light and the resistance of the device. The resistance of the LDR in the dark is very high, reaching more than a megaohm.
Fig. 4 LDR and voltage divider circuit diagram
In the event the LDR is exposed to light, the resistance drops to a few hundred ohms [9]. Figure 4 shows the circuit diagram of the LDR and voltage divider. The output voltage depends on the LDR resistance and R2. The value of R2 is set to 100 K to match the required value of the voltage when the curtain is put into automatic mode. At night (in the absence of light), the resistance of the LDR as well as the output voltage are higher, in order to operate the relay responsible for closing the curtain.
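The divider behavior can be checked numerically. A minimal Python sketch follows, assuming a 12 V DC rail (an illustrative value) and the 100 K resistor R2 named in the text; per Sect. 2.5, the output is taken across the LDR, so it rises as the environment darkens and the LDR resistance grows:

```python
# Voltage divider: Vcc -- R2 -- (node) -- LDR -- GND; output across the LDR
VCC = 12.0       # assumed DC rail in volts
R2 = 100_000.0   # 100 K fixed resistor, as in the text

def v_ldr(r_ldr):
    """Voltage across the LDR for a given LDR resistance (ohms)."""
    return VCC * r_ldr / (r_ldr + R2)

for r in (300, 10_000, 100_000, 1_000_000):   # bright daylight -> darkness
    print(f"R_LDR = {r:>9} ohm -> V_LDR = {v_ldr(r):5.2f} V")
```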
2.5 Transistor and Voltage Divider

An NPN transistor type is used in the smart curtain circuit. The major objective of utilizing a transistor in the electrical circuit is to energize or de-energize the SPDT relay responsible for closing the curtain in automatic mode. Figure 5 shows the circuit diagram of the transistor and voltage divider. At night, the light intensity is very low, and the resistance value of the LDR becomes high, reaching megaohms or possibly approaching infinity. In this case, the voltage across the LDR becomes high. In contrast, the voltage across R2 becomes relatively low (below 0.7 V). This value is not enough to activate the transistor (Q1). The collector of the transistor (Q1) is normally connected to the base–emitter of the transistor (Q2). So, the voltage across the collector of the transistor (Q1) is high (above 0.7 V), and it is enough to trigger the transistor (Q2).
Fig. 5 Transistor and voltage divider circuit diagram
The voltage supplies the SPDT relay, and the relay is triggered. The operation of the SPDT relay controls the other components of the circuit to close the curtain.
2.6 Overall Circuit

Figure 6 shows the overall design of the smart curtain circuit. The DPDT switch-1 is designed to switch the curtain circuit between manual and auto modes. For operating the curtain manually, the DPDT switch-1 should be switched to the manual position. The DP3T switch is responsible for switching on and off the
Fig. 6 Overall circuit diagram of the smart curtain
DPDT relay-2 and SPDT relay-3. When the DP3T switch is set to position no. 1, the SPDT relay-3 triggers and turns on the motor in the opening direction. When the DP3T switch is set to position no. 2, the DPDT relay-2 triggers and turns on the motor, which rotates in the reverse direction to close the curtain, and turns on the sleeping lamp. On the side of the auto mode, the DPDT switch-1 should be switched to the auto position. In this case, the operation of the curtain depends on the resistance of the LDR under the ambient light. If the intensity of the external light is high, the resistance value of the LDR is very low. Hence, the SPDT relay-1 remains off and normally closed. The power triggers SPDT relay-3 and turns on the motor to open the curtain. In case the intensity of the external light is low, the resistance value of the LDR is high. Hence, SPDT relay-1 triggers. Also, the power triggers DPDT relay-2 and turns on the motor, which rotates in the reverse direction to close the curtain and turns on the lamp. The limit switch used in the curtain circuit is normally closed. Therefore, it is set to open the circuit at a certain curtain position so that the curtain stops at the desired open or closed position.
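The relay logic of Fig. 6 can be summarized in software form. The following is a minimal Python sketch of the same decision rules, purely as an illustration of the control flow; the threshold value and names are assumptions, not values taken from the hardware:

```python
DARK_THRESHOLD = 100_000  # ohms; above this, treat the environment as dark (assumed)

def curtain_action(mode, sp3t_position=None, ldr_resistance=None):
    """Mirror the relay logic: manual mode follows the SP3T switch,
    auto mode follows the LDR resistance (high resistance = dark = close)."""
    if mode == "manual":
        return "open" if sp3t_position == 1 else "close"
    if mode == "auto":
        return "close" if ldr_resistance > DARK_THRESHOLD else "open"
    raise ValueError("mode must be 'manual' or 'auto'")

print(curtain_action("manual", sp3t_position=1))        # -> open (relay-3 path)
print(curtain_action("auto", ldr_resistance=500_000))   # -> close (relay-1/relay-2 path)
```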
3 Experiment and Results

3.1 Software Simulation

Proteus simulation software was used for circuit simulation of the smart curtain. This simulation made it possible to test the operation of the curtain project, discover defects, and find the appropriate component values to run the circuit effectively. There are four simulations of the curtain operation, including closing and opening the curtain in manual mode and closing and opening the curtain in auto mode. Figure 7 shows the circuit diagram of the smart curtain using Proteus software. The simulation completed successfully, and the curtain operation worked in the expected way. Table 1 demonstrates the results obtained during curtain operation in manual mode. On the side of the auto mode, two values of the LDR resistance were chosen: for the daytime, the LDR value was set to 100 K, and for the nighttime, the LDR value was set to 500 K. Table 2 demonstrates the results obtained during curtain operation in auto mode.
3.2 Hardware Testing

Figure 8 shows the hardware testing of the smart curtain using manual mode. When the SP3T SWT-2 was switched to position 1 (opening process), the voltage value across the DPDT relay-2 was 0 V, and the voltage value across SPDT relay-3 was 12.16 V. The output voltage value was 12.28 V. The SPDT relay-1 remained off because it only triggers in auto mode during the closing process. When the SP3T SWT-2 was switched to position 2 (closing process), the DPDT relay-2 triggered and closed the DC motor
Fig. 7 Circuit diagram of the smart curtain using Proteus software
Table 1 Results obtained in manual mode for daytime

| Position | Relay-1 Vdc | Relay-2 Vdc | Relay-3 Vdc | Lamp Vac | Motor direction | Vout Vdc |
|---|---|---|---|---|---|---|
| Open | 0 | 0 | 11.9 | Off | Clockwise | +11.8 |
| Close | 0 | 11.9 | 0 | On (169) | Anticlockwise | −11.8 |
circuit. SPDT relay-3 remains off and is normally closed. Finally, the circuit operates the DC motor in the closing direction, and the lamp turns on. The voltage across relay-2 is 12.10 V, and the voltage across relay-3 is 0 V. The manual mode testing has been carried out successfully, and the curtain closing and opening operations worked effectively. Table 3 summarizes the results obtained for the manual mode operation of the curtain. While the LDR is exposed to light, the circuit operates the DC motor in the opening direction, and the lamp remains off. The voltage across relay-1 and relay-2 is 0 V, the voltage across SPDT relay-3 is 12.09 V, and the output voltage is 12.28 V. When the external light is cut off from the LDR, the resistance of the LDR increases according to the intensity of the light falling on it. The circuit then operates the DC motor in the closing direction, and the lamp turns on. The voltage across SPDT relay-3 is 0 V, and the voltage across SPDT
Table 2 Results obtained in auto mode for daytime

| Position | LDR resistance value | LDR Vdc | Q-1 B-C | Q-1 B-E | Q-1 E-C | Q-2 B-C | Q-2 B-E | Q-2 E-C | Relay-1 Vdc | Relay-2 Vdc | Relay-3 Vdc | Lamp Vac | Motor direction | V out Vdc |
|----------|----------------------|---------|---------|---------|---------|---------|---------|---------|-------------|-------------|-------------|----------|-----------------|-----------|
| Open     | 100 K                | 6.17    | 0.4     | –       | –       | 0.7     | –       | –       | 0           | 0           | 11.9        | Off      | Clockwise       | +11.8     |
| Close    | 500 K                | 11.2    | 1.1     | –       | –       | 5.7     | –       | –       | 11.1        | 11.9        | 0           | On (169) | Anticlockwise   | −11.8     |
Fig. 8 Testing curtain operation in manual mode in daytime
Table 3 Results obtained in manual mode for nighttime

| SP3T SWT-2 | Relay-1 Vdc | Relay-2 Vdc | Relay-3 Vdc | Lamp Vac | V out Vdc |
|------------|-------------|-------------|-------------|----------|-----------|
| 1          | 0           | 0           | 12.16       | 0        | 12.28     |
| 2          | 0           | 12.10       | 0           | 236      | −12.25    |
relay-1 and DPDT relay-2 is 11.96 and 12.03 V, respectively. The output voltage is −12.22 V. Figure 9 shows the hardware testing in auto mode. The results show that the auto mode testing has been carried out successfully, and the curtain closing and opening processes worked effectively. Table 4 summarizes the results obtained for the auto mode operation of the curtain.
Fig. 9 Testing curtain operation in auto mode for nighttime
Table 4 Results obtained in auto mode for nighttime

| Light condition | LDR Vdc | Q-1 B-C | Q-1 B-E | Q-1 E-C | Q-2 B-C | Q-2 B-E | Q-2 E-C | Relay-1 Vdc | Relay-2 Vdc | Relay-3 Vdc | Lamp Vac | V out Vdc |
|-----------------|---------|---------|---------|---------|---------|---------|---------|-------------|-------------|-------------|----------|-----------|
| Light           | 11.46   | 0.14    | 0.68    | 12.23   | 0.68    | 0.67    | 12.09   | 0           | 0           | 12.09       | 0 (Off)  | 12.28     |
| Dark            | 11.41   | 0.69    | 0.76    | 0.09    | 1.33    | 0.74    | 1.94    | 11.96       | 12.03       | 0           | 233 (On) | −12.22    |
4 Conclusion

The smart curtain is one of the significant home needs of the modern era. This technology is easy for everyone to use and safe, since it works with a low DC supply. Additionally, it responds to the external light when set to automatic mode, and it can be operated manually depending on the user's need. The paper presented the block diagram and the main components used for the smart curtain. The smart curtain has also been simulated and tested, and the results obtained are evidence of the project's success.
Machine Learning Assisted Binary and Multiclass Parkinson’s Disease Detection Satyankar Bhardwaj, Dhruv Arora, Bali Devi, Venkatesh Gauri Shankar, and Sumit Srivastava
Abstract Neurodegenerative changes in Parkinson's disease primarily affect the patient's movement and speech, and they may also cause tremors. It is a central nervous system disease, with more than 10 million people affected worldwide. Research so far has not yielded any concrete remedies or cures. Researchers believe that by studying the past medical data of Parkinson's patients, we can build algorithms that diagnose the disease years before symptoms appear. In this research paper, we have analyzed the signs and symptoms facilitating the early diagnosis of Parkinson's disease by applying classification algorithms to the classical features. The subjects in this study were placed into four classes according to the Unified Parkinson's Disease Rating Scale (UPDRS) score after analyzing the various features considered in these classes. We have attained 87.83 and 98.63% accuracy using KNN (binary data) and the Decision Tree Classifier (multiclass data). Keywords Parkinson's disease · Neurodegenerative disease · Machine learning · Classifier · Regression
S. Bhardwaj · D. Arora · B. Devi (B) Computer Science and Engineering, Manipal University Jaipur, Jaipur, Rajasthan, India
e-mail: [email protected]
V. G. Shankar · S. Srivastava Information Technology, Manipal University Jaipur, Jaipur, Rajasthan, India

1 Introduction

When a person has Parkinson's disease, their movement and speech are affected, and the disease can also cause tremors. Parkinson's disease is a disease of the brain: it involves damage to the dopamine-producing nerve cells of the substantia nigra. It is one of the most common brain diseases that worsen over time. Because of the high cost of treating the disease, the patients' quality of life is harmed [1, 2]: they have a hard time socializing, and their finances worsen because of the costs. Although the cause of this disease is not fully known, it can be much easier to treat if it is
found early on, when the symptoms are not as severe. The acronym TRAP summarizes the four main motor signs: tremor at rest, rigidity, akinesia (lack of movement), and postural instability. Parkinson's disease also slows patients down and changes the way the body moves. Neurodegenerative patients do not suffer from anxiety alone; they also experience fatigue, depression, sleep problems, and cognitive problems that make everyday activities difficult [1, 3, 4]. Phonation and speech disorders are also prevalent, and they may appear up to five years before a doctor can confirm that a person has Parkinson's disease. The early stages of Parkinson's disease can be hard to identify if patients do not show any critical signs, and many of the symptoms resemble those of other disorders in the same group. A correct clinical diagnosis can therefore take five years or more, and there are often few visible signs or symptoms at the beginning of the disease. A statistical illustration of the data related to the problem objective is given in the dataset section, Sect. 4.
2 Identify Research and Literature Survey

Gunjan Pahuja [5] used a model based on the Genetic Algorithm Extreme Learning Machine (GA-ELM) trained on MRI images of Parkinson's disease patients; the classifier examines voxel-based features of brain scans that reflect how well the brain is working. GA-ELM was correct 95% of the time and was found to be both more accurate and more stable than SVM. Gurpreet Singh [6] and his team proposed a multiclass framework for diagnosing neurodegenerative diseases that uses Principal Component Analysis to reduce the dimensionality of the data and the Fisher Discriminant Ratio (FDR) to rank the selected features. The average classification accuracy of this method was more than 95% for binary data and more than 85% for multiclass data; in their opinion, this is the best accuracy reported for a multiclass Parkinson's disease diagnosis. The study of Wang [7] used a deep learning model on premotor features to automatically discriminate patients affected by Parkinson's disease from normal individuals. Thanks to the model's desirable ability to learn linear and nonlinear features from Parkinson's disease data without any handcrafted feature extraction, it showed an accuracy of 96.45% and outperformed the twelve machine learning models previously considered. Voice analysis on a dataset was used to diagnose Parkinson's disease by Benmalek [8, 9]: the dataset of 375 people was divided into four classes based on severity, and the Local Learning-Based Feature Selection (LLBFS) algorithm was used to group the features.
The random subspace algorithm was used for discriminant analysis in the multiclass setting, which helped obtain the best results. Benmalek showed how voice features at different stages of Parkinson's disease can be used to tell Parkinson's patients apart from healthy people; using Mel-frequency Cepstral Coefficients (MFCC) selected by the LLBFS algorithm together with discriminant analysis, an accuracy of 86.7% was achieved. Using additional acoustic features not included in the model building could do even better and might be more useful in medicine when determining whether someone has Parkinson's disease. In his study, Lei [10] used a sparse feature selection framework to look for early signs of Parkinson's disease together with a multiclass classification model [11], also intended for clinical use in determining the disease type. Using multimodality data from the PPMI neuroimaging dataset, they showed that their method can classify NC, PD, and SWEED simultaneously, and they identified 10 important ROIs indicating the brain regions relevant for PD diagnosis. The hybrid intelligent framework for Parkinson's disease diagnosis built by Ali [12] uses many different types of sustained phonation data. The framework has two parts: an LDA model to reduce the dimensionality, and a neural network model to classify the items in the dataset. Using all the dataset's features, the system achieved a 95% accuracy rate on the training database and a 100% accuracy rate on a test database. Because women were over-represented in the dataset, gender-related features were removed from the study's dataset; with this change, the proposed framework achieved 80% accuracy on the training database and 82% on the testing database. Jefferson S. Almeida [13] came up with a new way of detecting Parkinson's disease using a mix of feature extraction and machine learning. Eighteen feature sets and 14 classifiers were used to analyze the phonic dataset in terms of EER and accuracy. The K1 classifier performed best with the audio cardioid (AC) microphone in phonation mode, where the Yaffe (YA) and KTU (KT) individual feature sets were the strongest; with the smartphone microphone in phonation mode, K1 classified Yaffe (YA) and KT best, with accuracy rates of 94.55 and 92.94%, respectively. Senturk [14] devised a way to use features of voice signals from both healthy people and people with Parkinson's disease to determine whether someone has the disease. The experiments combined several feature selection (FS) methods and classifiers simultaneously [15, 16], applied to speech signals with many different phonetic characteristics. This model can be used when Parkinson's disease is in its early stages and could help prevent the disease's worst effects with high accuracy.
Table 1 Limitations of existing works

| S. No. | Existing methods | Limitations |
|--------|------------------|-------------|
| 1 | Enhanced fuzzy KNN approach [17] | High error rates in elder population samples |
| 2 | Linear discriminant analysis and genetically optimized neural network [12] | The biased dataset may have low accuracy |
| 3 | Cepstral analysis [9] | Required heavy preprocessing and feature extraction of voice samples |
| 4 | Sparse feature learning [10] | Dataset requires MRI images and DTI images, which require extensive preprocessing |
| 5 | Machine learning framework based on PCA, FDR, and SVM for multiclass diagnosis [6] | The framework was used on a limited dataset and not implemented on a large dataset |
Problem Statement

This research problem is about obtaining an early diagnosis of Parkinson's disease (PD) by classifying certain traits. The Unified Parkinson's Disease Rating Scale (UPDRS) score is used to classify the subjects into four groups based on their features. The research gap is summarized in Table 1.

Research Objectives and Novelty of the Work
1. To create a model that can tell whether or not a patient has Parkinson's disease based on clinical observations.
2. To create a multiclass model to predict the severity of Parkinson's disease.
3 Methodology

In our study, models were applied primarily for binary and multiclass classification. The binary data was collected from the Department of Neurology in the Cerrahpaşa Faculty of Medicine, Istanbul University, Turkey. Since the binary dataset contained highly dimensional data, we implemented Principal Component Analysis (PCA) to reduce execution time and increase accuracy. The data for multiclass classification was created with help from 10 medical centers in the United States and Intel Corporation; Athanasios Tsanas and Max Little worked on the project with help from Intel. The target variable, the Unified Parkinson's Disease Rating Scale (UPDRS) score, was between 7 and 54; the higher the rating, the more severe the disease. To conduct multiclass classification on the target variable, we scaled these continuous values down into 4 broad categories, ranging from 0 to 3, through unsupervised learning. The dataset contained no missing or repeated values. The data was highly dimensional, i.e., it had a massive number of features. This caused problems with
Fig. 1 Detailed design methodology
processing time and negatively affected the accuracy of the results. To solve this problem, Principal Component Analysis (PCA) was used. PCA projects the data onto a smaller set of components that group together correlated features, thereby reducing the number of features, or dimensionality, of the data. As stated in detail ahead, PCA also yielded higher accuracy while significantly optimizing the process (Fig. 1). The basic skeletal working of all the algorithms used is shown in Fig. 2.
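As a rough sketch of this step (scikit-learn is assumed; the feature matrix is a random placeholder, and the component count of 100 is the value reported in Sect. 4):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Placeholder matrix standing in for the high-dimensional voice-feature data.
rng = np.random.default_rng(0)
X = rng.random((600, 500))

X_scaled = StandardScaler().fit_transform(X)  # standardize before projecting
pca = PCA(n_components=100)                   # 100 components, as reported in Sect. 4
X_reduced = pca.fit_transform(X_scaled)
print(X_reduced.shape)                        # (600, 100)
print(pca.explained_variance_ratio_.sum())    # variance retained by the projection
```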
4 Dataset

We loaded labeled pickle files into an array and applied different algorithms to these models. We applied Principal Component Analysis (PCA) to reduce the dimensionality of the data, with 100 components giving the highest accuracy, and implemented data standardization to increase accuracy further [18–20]. For the binary dataset: the data was collected at the Cerrahpaşa Faculty of Medicine of Istanbul University. The study looked at 188 people with Parkinson's disease aged 33–87 (107 men and 81 women); the control group consists of 64 healthy people (23 men and 41 women) between 41 and 82 years old. As part of the data-gathering process, the microphones were set to 44.1 kHz, and after examination by a doctor, three repetitions of the vowel "a" were collected from each participant [18–20]. For the multiclass dataset: the data has been sourced from the UCI machine learning repository (Parkinson's telemonitoring dataset). The dataset had the UPDRS value (target variable) in continuous form, and it was thus transformed into 4 broad categories
Fig. 2 Detailed design flow diagram
through unsupervised scaling [18–20]. The target variable now had 4 discrete values ranging from 0 to 3, with increasing value indicating a more severe form of Parkinson's disease. Analysis of the correlation of UPDRS with the features showed that age was the most impactful feature, followed by HNR, DFA, sex, PPE, and RPDE. HNR is the ratio between the periodic and non-periodic components in a voice sample; DFA is the signal fractal scaling exponent; PPE is a nonlinear measure of fundamental frequency variation; and RPDE is a nonlinear dynamical complexity measure. A comparison of age, sex, and UPDRS showed that females were more likely to show symptoms earlier than males, while males tended to have a higher UPDRS (for reference, 0 = "Male", 1 = "Female"), as shown in Fig. 3.
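A minimal sketch of how such a discretization could be implemented is shown below. The equal-width binning over the stated 7–54 range is an assumption for illustration; the paper's own unsupervised scaling may differ.

```python
import numpy as np

def discretize_updrs(scores, n_classes=4, lo=7, hi=54):
    """Map continuous UPDRS scores (7-54) onto classes 0-3 via equal-width bins."""
    edges = np.linspace(lo, hi, n_classes + 1)[1:-1]  # interior bin edges
    return np.digitize(scores, edges)

print(discretize_updrs(np.array([8.0, 20.0, 35.0, 50.0])))  # -> [0 1 2 3]
```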
5 Results

5.1 Binary Data

We tested the data on the following models: SVM, KNN, Decision Tree, and Logistic Regression. We observed that KNN provided the highest accuracy among the tested models, with SVM following closely as the second-best performer. The
Fig. 3 UPDRS data with age
decision tree gave varying accuracy and was not a reliable option for binary classification. PCA increased the accuracy for SVM and Logistic Regression while reducing it for the decision tree and KNN.
5.1.1 SVM Algorithm
Support vector machine (SVM) is a popular classification algorithm that uses support vectors to classify data points in an n-dimensional space. It plots the data points and tries to find a boundary, or hyperplane, separating the different classes; this hyperplane can be linear or nonlinear. Unlike many other classification algorithms, SVM performs well with highly dimensional data and is less susceptible to overfitting [2, 3]. The reported accuracy was achieved by modifying the parameters as follows: Kernel—poly, Param C—1, Param Gamma—auto (Figs. 4, 5 and Table 2).
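A hedged sketch of this configuration with scikit-learn is given below; the synthetic data and the train/test split are placeholders standing in for the PCA-reduced voice features, while the kernel, C, and gamma values follow the parameters quoted above.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Placeholder data standing in for the PCA-reduced voice features.
X, y = make_classification(n_samples=600, n_features=100, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

clf = SVC(kernel="poly", C=1, gamma="auto")  # parameters reported in the text
clf.fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))
```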
5.1.2 Logistic Regression Algorithm
Logistic Regression is a classification algorithm that applies the logistic function to produce a binary-valued output; with more complex solvers, multiclass outputs can also be produced. We used this algorithm because of its popularity as a simple and efficient binary classifier. This also allowed us to use Logistic Regression in a multiclass classification, where it is less commonly used [2, 16, 21].
Fig. 4 Model comparison for binary data
Fig. 5 Confusion matrix SVM (binary data)
Table 2 Accuracy of SVM on binary data

| With PCA (110) | Without PCA |
|----------------|-------------|
| 87.30%         | 79.36%      |
This accuracy is achieved by tuning the parameters as follows: Solver—newton-cg, Param C—0.1 (Fig. 7 and Table 3).

Table 3 Accuracy of logistic regression (binary data)

| With PCA (110) | Without PCA |
|----------------|-------------|
| 85.71%         | 83.59%      |
Fig. 6 Confusion matrix KNN (binary data)
Fig. 7 Confusion matrix logistic regression (binary data)
5.1.3 KNN Algorithm
K-nearest neighbor (KNN) is one of the simplest classification algorithms: it plots the given data points and classifies an unknown data point depending on its distance to the "k" closest known data points. The user can set the value of "k" manually; hence, we could compare the accuracy of the model for different values of "k" and choose the most optimal one. We favored this algorithm because it is simple, robust, and works very well with large datasets [3, 15, 22]. The model showed an accuracy of 87.83%, which did not change with the application of PCA (Fig. 6).
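A small sketch of this model-selection idea (synthetic placeholder data; scikit-learn assumed) compares a few values of k by cross-validation:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=600, n_features=100, random_state=0)

# Compare several values of k and keep the best-scoring one.
for k in (3, 5, 7, 9):
    score = cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()
    print(f"k={k}: mean CV accuracy={score:.3f}")
```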
5.1.4 Decision Tree
A Decision Tree Classifier applies a set of learned rules to arrive at the final classification. It constructs a tree from the training data in which the leaf nodes are the final answers. A decision tree is an excellent classifier because
Table 4 Accuracy of decision tree

| With PCA (150) | Without PCA |
|----------------|-------------|
| 75.13%         | 85.18%      |
it can work with data that has not been pre-processed and supports categorical data along with numerical data [11, 21–23]. We used the decision tree classifier model to achieve the result in Table 4, with the confusion matrix of the decision tree shown in Fig. 8. This accuracy is achieved by tuning the parameters as follows: Random State—1.

Fig. 8 Confusion matrix decision tree (binary data)
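As an illustrative sketch of this classifier setup (placeholder data; only random_state=1 is taken from the text above):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=600, n_features=100, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=1)

clf = DecisionTreeClassifier(random_state=1)  # random state from the text above
clf.fit(X_tr, y_tr)
print("tree depth:", clf.get_depth(), "test accuracy:", clf.score(X_te, y_te))
```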
Fig. 9 Model comparison for multiclass data
5.1.5 Comparison of All the Classifier Algorithms (Binary Class)

The comparison table for all the classifier algorithms on the binary-class data is given in Table 5, and the model comparison for binary data is shown in Fig. 4.
5.2 Multiclass Data

The data has been sourced from the UCI machine learning repository: the Parkinson's telemonitoring dataset. The dataset had its target variable in continuous form, and it was therefore transformed into 4 broad categories (0: Normal, 1: Slight, 2: Mild, 3: Moderate). For multiclass classification, we use SVM, KNN, and the Decision Tree Classifier. We dropped Logistic Regression for multiclass classification because it works on sigmoid functions: it is excellent for binary classification but performs poorly otherwise. The Decision Tree Classifier gives the best accuracy of 98.63%. The distributions of the continuous and discrete UPDRS are shown in Figs. 10 and 11.

Table 5 Comparison table for all the classifiers algorithms (binary class)

| Model name          | Accuracy (%) |
|---------------------|--------------|
| SVM                 | 87.30        |
| Logistic regression | 85.71        |
| KNN                 | 87.83        |
| Decision tree       | 85.18        |
Fig. 10 Distribution of continuous UPDRS
Fig. 11 Distribution of discrete UPDRS
Fig. 12 Confusion matrix SVM (multiclass data)
5.2.1 SVM Algorithm
The maximum accuracy of 94.86% was achieved by scaling the data and tuning the following parameters: Kernel—poly, Degree param—3, C param—100 (Fig. 12).
5.2.2 Decision Tree Classifier
The maximum accuracy of 98.63% of the model was achieved by scaling the data and tuning the random state parameter to 3, shown in Fig. 13.
5.2.3 KNN Algorithm
The maximum accuracy of the model was 92.85% achieved by scaling the data and with the value of K as 5, shown in Fig. 14.
Fig. 13 Confusion matrix decision tree classifier (multiclass data)
Fig. 14 Confusion matrix KNN (multiclass data)
5.2.4 Logistic Regression Algorithm
The maximum accuracy achieved with Logistic Regression was 90.53% with the newton-cg solver, l1 penalty, and C = 100.
5.2.5 Comparison of All the Classifier Algorithms (Multiclass Data)

The comparison table for all the classifier algorithms on the multiclass data is given in Table 6, and the model comparison of accuracy per classifier for multiclass data is shown in Fig. 9.
Table 6 Comparison table for all the classifiers algorithms (multiclass data)

| Model name               | Accuracy (%) |
|--------------------------|--------------|
| SVM                      | 94.89        |
| Decision Tree Classifier | 98.63        |
| KNN                      | 92.85        |
| Logistic Regression      | 90.53        |
6 Conclusion

We have used machine learning models such as SVM, Logistic Regression, Decision Tree, and KNN classifiers to build models on the given datasets that distinguish and classify patient data according to the UPDRS. We have attained 87.83 and 98.63% accuracy using KNN (binary data) and the Decision Tree Classifier (multiclass data). We also observed that SVM provided consistently high accuracy across the clinical data scenarios considered. Table 7 presents a comparison summary that makes clear the differences between the proposed model and the traditional models.

Acknowledgements We would like to thank Manipal University Jaipur and CIDCR Lab-1032AB, School of Computing and IT, Manipal University Jaipur, for supporting us in conducting this research. We are also very grateful for UPDRS (Unified Parkinson's disease Rating Scale) and UCI for sharing the dataset.
Table 7 A comparison summary table that makes clear the differences between the proposed model and the traditional model

| Paper | Binary classification accuracy (%) | Multiclass classification accuracy (%) |
|-------|------------------------------------|----------------------------------------|
| Proposed model | 87.83 | 98.63 |
| Pahuja et al. [5] | 95 | – |
| Singh et al. [6] | 95 | 85 |
| Wang et al. [7] | 96.45 | – |
| Benmalek et al. [8] | – | 86.7 |
| Benmalek et al. [9] | – | 86.7 |
| Lei et al. [10] | – | 78.4 |
| Ali et al. [12] | 100 (without gender features) / 82 (with gender features) | – |
| Almeida et al. [13] | 94.55 | – |
| Senturk [14] | 93.84 | – |
| Cai et al. [17] | 97.89 | – |
References

1. V.G. Shankar, D.S. Sisodia, P. Chandrakar, DataAutism: an early detection framework of autism in infants using data science, in Data Management, Analytics and Innovation. Advances in Intelligent Systems and Computing, vol. 1016, ed. by N. Sharma, A. Chakrabarti, V. Balas (Springer, Singapore, 2020). https://doi.org/10.1007/978-981-13-9364-8_13
2. B. Devi, V.G. Shankar, S. Srivastava, D.K. Srivastava, AnaBus: a proposed sampling retrieval model for business and historical data analytics, in Data Management, Analytics and Innovation. Advances in Intelligent Systems and Computing, vol. 1016, ed. by N. Sharma, A. Chakrabarti, V. Balas (Springer, Singapore, 2020). https://doi.org/10.1007/978-981-13-9364-8_14
3. V.G. Shankar, D.S. Sisodia, P. Chandrakar, A novel discriminant feature selection-based mutual information extraction from MR brain images for Alzheimer's stages detection and prediction. Int. J. Imag. Syst. Technol. 1–20 (2021). https://doi.org/10.1002/ima.22685
4. B. Devi, S. Srivastava, V.K. Verma, Predictive analysis of Alzheimer's disease based on wrapper approach using SVM and KNN, in Information and Communication Technology for Intelligent Systems. ICTIS 2020. Smart Innovation, Systems and Technologies, vol. 196, ed. by T. Senjyu, P.N. Mahalle, T. Perumal, A. Joshi (Springer, Singapore, 2021). https://doi.org/10.1007/978-981-15-7062-9_71
5. G. Pahuja, T.N. Nagabhushan, A novel GA-ELM approach for Parkinson's disease detection using brain structural T1-weighted MRI data, in 2016 Second International Conference on Cognitive Computing and Information Processing (2016)
6. G. Singh, M. Vadera, L. Samavedham, E.C.-H. Lim, Machine learning-based framework for multiclass diagnosis of neurodegenerative diseases: a study on Parkinson's disease. IFAC-PapersOnLine 49(7), 990–995 (2016)
7. W. Wang, J. Lee, F. Harrou, Y. Sun, Early detection of Parkinson's disease using deep learning and machine learning. IEEE Access 8, 147635–147646 (2020). https://doi.org/10.1109/ACCESS.2020.3016062
8. E. Benmalek, J. Elmhamdi, A. Jilbab, Multiclass classification of Parkinson's disease using different classifiers and LLBFS feature selection algorithm. Int. J. Speech Technol. 20(1), 179–184 (2017)
9. E. Benmalek, J. Elmhamdi, A. Jilbab, Multiclass classification of Parkinson's disease using cepstral analysis. Int. J. Speech Technol. 21(1), 39–49 (2017)
10. H. Lei, Y. Zhao, Y. Wen, Q. Luo, Y. Cai, G. Liu, B. Lei, Sparse Feature Learning for Multiclass Parkinson's Disease Classification (IOS Press, 2018)
11. J.I.Z. Chen, P. Hengjinda, Early prediction of coronary artery disease (CAD) by machine learning method—a comparative study. J. Artif. Intell. 3(1), 17–33 (2021)
12. L. Ali, C. Zhu, Z. Zhang, Y. Liu, Automated detection of Parkinson's disease based on multiple types of sustained phonations using linear discriminant analysis and genetically optimized neural network. IEEE J. Transl. Eng. Health Med. (2019)
13. J.S. Almeida, P.P. Rebouças Filho, T. Carneiro, W. Wei, R. Damaševičius, R. Maskeliūnas, V.H.C. de Albuquerque, Detecting Parkinson's disease with sustained phonation and speech signals using machine learning techniques (Elsevier, 2019)
14. Z.K. Senturk, Early diagnosis of Parkinson's disease using machine learning algorithms. Med. Hypotheses 138, 109603 (2020)
15. A. Sungheetha, R. Sharma, Design an early detection and classification for diabetic retinopathy by deep feature extraction based convolution neural network. J. Trends Comput. Sci. Smart Technol. (TCSST) 3(02), 81–94 (2021)
16. J.S. Manoharan, Study of variants of extreme learning machine (ELM) brands and its performance measure on classification algorithm. J. Soft Comput. Paradigm (JSCP) 3(2), 83–95 (2021)
17. Z. Cai, J. Gu, C. Wen, D. Zhao, C. Huang, H. Huang, C. Tong, J. Li, H. Chen, An intelligent Parkinson's disease diagnostic system based on a chaotic bacterial foraging optimization enhanced fuzzy KNN approach. Comput. Math. Methods Med. (2018)
18. Multiclass dataset. https://archive.ics.uci.edu/ml/datasets/Parkinsons+Telemonitoring
19. Image dataset used, Parkinson's Progression Markers Initiative (PPMI) Database. www.ppmi-info.org/data
20. UPDRS dataset. https://www.movementdisorders.org/MDS/MDS-Rating-Scales/MDS-Unified-Parkinsons-Disease-Rating-Scale-MDS-UPDRS.htm. Online accessed 11 Sept 2021
21. V. Goel, V. Jangir, V.G. Shankar, DataCan: robust approach for genome cancer data analysis, in Data Management, Analytics and Innovation. Advances in Intelligent Systems and Computing, vol. 1016, ed. by N. Sharma, A. Chakrabarti, V. Balas (Springer, Singapore, 2020). https://doi.org/10.1007/978-981-13-9364-8_12
22. V.G. Shankar, B. Devi, A. Bhatnagar, A.K. Sharma, D.K. Srivastava, Indian air quality health index analysis using exploratory data analysis, in Micro-Electronics and Telecommunication Engineering. Lecture Notes in Networks and Systems, vol. 179, ed. by D.K. Sharma, L.H. Son, R. Sharma, K. Cengiz (Springer, Singapore, 2021). https://doi.org/10.1007/978-981-33-4687-1_51
23. V.G. Shankar, B. Devi, U. Sachdeva, H. Harsola, Real-time human body tracking system for posture and movement using skeleton-based segmentation, in Micro-Electronics and Telecommunication Engineering. Lecture Notes in Networks and Systems, vol. 179, ed. by D.K. Sharma, L.H. Son, R. Sharma, K. Cengiz (Springer, Singapore, 2021). https://doi.org/10.1007/978-981-33-4687-1_48
Category Based Location Aware Tourist Place Popularity Prediction and Recommendation System Using Machine Learning Algorithms Apeksha Arun Wadhe and Shraddha Suratkar
Abstract Tourism contributes majorly to the economic growth of a country, and a country like India is a large market for tourism. Tourism is one of the prime sectors contributing highly to the GDP; hence, it is necessary to build a smart tourism system which helps in increasing revenue from the tourism industry. In this paper, we have proposed a novel framework for place popularity prediction and recommendation using machine learning algorithms; the novel recommendation system can overcome the cold start problem. The dataset used in this research study has been gathered from popular tourism web sites. In the experiment, the category of each place has been determined using the LDA algorithm, and the location of each place has been identified using the K-means clustering algorithm. Sentiment analysis has been used for popularity rating prediction; sentiment classification has been done using the well-known supervised machine learning algorithms naive Bayes (NB), decision tree (DT), support vector machine (SVM), and random forest (RF), with performance analyzed using several evaluation factors. From the research, we conclude that random forest gave the highest performance in comparison with the decision tree, naive Bayes, and SVM classifiers. As a result, a popularity-based tourist spot classification using RF was implemented, which gave an accuracy of 88.02% on the testing data used. On the basis of category + location, top-N popular places are recommended to the end user using a combination of the LDA, RF, and K-means algorithms in a layered approach. This hybrid recommendation system combines content-based and popularity-based recommendation, which gives more precise recommendations than either a purely content-based or a purely popularity-based system. Keywords Latent Dirichlet allocation · Linear support vector machine · Multinomial naive Bayes · Bag of words · TF-IDF · K-means · Random forest · Decision tree

A. Arun Wadhe (B) · S. Suratkar Department of Computer Engineering and Information Technology, VJTI, Mumbai 400019, India
e-mail: [email protected]
S. Suratkar e-mail: [email protected]
1 Introduction

Tourism contributes majorly to the economic growth of a country, and a country like India is a large market for tourism. Tourism is one of the prime sectors contributing highly to the GDP; hence, it is necessary to build a smart tourism system which helps in increasing revenue from the tourism industry. Nowadays, social media is growing rapidly: millions of users post reviews and rate places on tourism web sites and forums, which provides a platform for popularity prediction and recommendation. Various models have already been proposed for popularity prediction and recommendation, but such systems face problems like data sparsity and the cold start problem. As tourism data is rare, it suffers from the data sparsity problem, and the cold start problem is one of the major problems in recommendation systems; so there is a need for a system which overcomes these problems. In this research study, tourist place categorization has been performed using a topic modeling algorithm for category-based recommendation, sentiment analysis has been performed for popularity prediction, and tourist place region detection has been performed for location-aware recommendation. Topic modeling has been performed using latent Dirichlet allocation (LDA); sentiment classification using popular machine learning algorithms, i.e., multinomial naive Bayes, linear support vector machine, and random forest; and place location clustering using the K-means algorithm. A comparative analysis of the supervised algorithms used in the research study has been performed with the help of several evaluation parameters, such as accuracy score, recall, precision, and F1-score. The contents of the paper are outlined as follows. Section 2 states work related to the research study; Sect. 3 explains the terminologies of the machine learning algorithms used; Sect. 4 elaborates the methodology for tourist place popularity prediction and recommendation; Sect. 5 describes the experiments performed; Sect. 6 shows the results; Sect. 7 states the comparative analysis; Sect. 8 describes the advantages of the recommendation system; Sect. 9 depicts its disadvantages; Sect. 10 concludes the research work; and the future work of the study is delineated in Sect. 11.
2 Literature Review

The author of [1] has proposed a novel hierarchical framework for point-of-interest popularity prediction using a heterogeneous tourist place dataset. He estimated TCHP using a hierarchical multi-clue fusion technique and compared various early- and late-fusion techniques, such as EF and SVM2K (SVM + KCCA), for popularity prediction. In hierarchical multi-clue modeling, multi-modal data types are taken into consideration, with the topic modeling layer implemented using latent Dirichlet allocation (LDA); the future scope of that research is to design a recommendation system. In paper [2], online news popularity has been predicted using
machine learning techniques. Author has used the UCI machine learning repository dataset. They have used SVM, neural network, random forest, KNN, naive Bayes, multilayer perceptron, bagging, adaboost, logistic regression, etc. From research study, it has found that multilayer perceptron and random forest gives best performance by setting a best value for parameters. In this paper [3], author has proposed a new method for classification of sentiment. He has used latent Dirichlet allocation (LDA) for topic detection. They have concluded that their proposed method with naive Bayes has outperformed other machine learning techniques like naive Bayes, SVM, adaboost along with naive Bayes and decision tree. In this paper [4], author has proposed a demographic-based recommendation system using machine learning algorithms. They have used naive Bayes, Bayesian networks, and support vector machine. It is found that machine learning algorithm specially SVM outperforms over baseline methods. Future scope of study is to include review textual information to improve accuracy of prediction. In this paper [5], users have implemented a hybrid recommendation system. It is a combination of content-based filtering as well as collaborative filtering. Author has used Bing data. Content-based recommendation technique is based on user’s content, whereas collaborative is based on similarity in choices of users. They had found that fusion of content-based filtering and collaborative filtering gives better recommendation quality. In this paper [6], author has proposed a novel system for personalized tourist destination recommendation on the basis of social media check-in data. Also, friend check-in data has been taken into consideration to overcome the cold start problem. They have used Facebook as a social media platform for research study. Paper concludes that by overcoming cold start problem recommendation system quality can be improved. In paper [7], author has proposed novel GuideMe model for tourist recommendation. In this research, recommendation has been given on bases of user preferences, user present location, and user past activity. They have used integration with social services like Facebook and Twitter. Future scope of research paper is to integrate user rating, friends’ similarity, and comments into consideration. As elaborated in paper [8], author has proposed user-based collaborative filtering method for tourist spot recommendation. For calculating the similarities between each user, cosine similarity method has been used. And later the recommendation of tourist spot has been given as per the visiting history of the user’s neighbors as per similarity of choices. As presented in [9] paper, author has performed in detailed survey on various types of recommendation systems like content-based filtering, collaborative filtering, hybrid recommender, and demographic recommendation system. Also, the author has discussed various challenges in the recommendation system. Advantages and disadvantages of different RS used in research study have been discussed. In this paper [10], author has proposed a novel approach for recommendation of a research paper based on topic modeling technique. Author has used latent Dirichlet allocation (LDA) for topic analysis In this way, he has calculated thematic similarity measurement for topic clustering. By adopting proposed method problems in traditional recommendation systems which are cold start problems could get overcome. This paper is extension of base paper [11].
3 Terminologies of Machine Learning Algorithms

3.1 Feature Extraction

In machine learning and pattern recognition, feature extraction is a crucial step. It helps in dimensionality reduction, as it removes redundant and irrelevant attributes; as a result, the performance of machine learning algorithms improves drastically. In NLP, a word is used as a feature, and natural language processing offers many algorithms for feature extraction. In this research study, we have used BoW and TF-IDF. Bag of words (BoW) is an approach in which a feature vector is formed from word counts: the number of times a token appears in a document determines the feature weight. TF-IDF is an extension of BoW in which, in addition to the word count, the inverse document frequency is taken into consideration when calculating the weight of a feature in a document. References [12, 13] provide more details. The equations below define TF-IDF:

TF(w, d) = (number of times word w occurs in document d) / (total number of words in document d)
IDF(w) = log(total number of documents / number of documents in which word w is present)
TF-IDF(w, d) = TF(w, d) × IDF(w)
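A small sketch of both representations using scikit-learn follows; note that scikit-learn's TfidfVectorizer additionally applies smoothing and L2 normalization on top of the textbook formulas above, and the example documents are invented.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

docs = ["beautiful beach near the old fort",
        "historic fort with a small museum",
        "quiet beach and clear blue water"]

bow = CountVectorizer().fit_transform(docs)    # bag of words: raw term counts
tfidf = TfidfVectorizer().fit_transform(docs)  # counts reweighted by term rarity
print(bow.toarray())
print(tfidf.toarray().round(2))
```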
3.2 Topic Modeling

Topic modeling is a kind of statistical modeling for finding the hidden, or latent, topics that appear in a set of documents. Latent Dirichlet allocation (LDA) is a topic modeling algorithm which is useful for assigning the words in a document to specific topics. LDA based on Gibbs sampling is used here, as it also performs better in the case of data imbalance. Gibbs-sampling LDA uses the Bayes algorithm and the Markov chain Monte Carlo (MCMC) principle, whose main aim is to construct a Markov chain that converges to the equilibrium posterior probability distribution, although it takes a long run time to converge. For more insights, refer to [14].
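The study uses a Gibbs-sampling LDA (via Mallet, per Sect. 5.1); the sketch below instead uses scikit-learn's variational implementation purely to illustrate the interface, with invented documents and a reduced topic count (the study uses T = 12 and alpha = 50/T).

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

docs = ["sandy beach sunset sea", "ancient fort palace history",
        "temple pilgrimage shrine", "beach waves surfing sea"]

counts = CountVectorizer().fit_transform(docs)   # BoW features, as in Sect. 4.3
lda = LatentDirichletAllocation(n_components=2,        # T topics (12 in the study)
                                doc_topic_prior=50 / 2,  # alpha = 50/T
                                topic_word_prior=0.001,  # beta
                                random_state=0)
doc_topics = lda.fit_transform(counts)
print(doc_topics.argmax(axis=1))  # most likely topic per document
```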
3.3 Classification Algorithms

Classification is one of the methods of supervised machine learning. Supervised algorithms are useful for training a model when the data is labeled (the label is also called the class). Apart from classification, there is another supervised technique, i.e.,
regression. Classification is a technique to determine the discrete class of data records, whereas regression is used for the prediction of continuous values. For this research, we have used classification algorithms for popularity prediction. Below are a few highly popular classification algorithms.
3.3.1 Decision Tree

The decision tree is one of the most popular supervised machine learning algorithms. It follows a rule-based approach, building a tree of nodes in which internal nodes act as decision points and leaf nodes carry the classification labels; rules are formulated from the resulting tree [15]. For more details, refer to [15].
3.3.2 Multinomial Naive Bayes

Naive Bayes is one of the most famous supervised machine learning algorithms, based on a probabilistic approach. Its most common variants are the Bernoulli and multinomial models. Multinomial naive Bayes (MNB) is the more appropriate model for text mining: it runs naive Bayes on multinomially distributed data, assuming the features follow a multinomial distribution when calculating the probability of a document belonging to each class [16, 17]. Refer to [17] for more details.
3.3.3 Support Vector Machine

The support vector machine is one of the most powerful supervised algorithms. Its aim is to find the separating hyperplane with the largest margin for generalized classification. SVM supports several kernel functions, such as linear, radial basis function, and polynomial; the linear kernel is used when the dataset is linearly separable by a hyperplane [18]. Reference [18] provides more details of SVM.
3.3.4 Random Forest

Random forest is an ensemble predictor consisting of a set of decision trees grown in randomly selected subspaces of the data. The individual trees vote, and the class with the maximum vote is taken as the final prediction [19, 20]. References [19, 20] provide more details.
3.4 Cluster Analysis

Cluster analysis is the task of grouping similar entities together. It is an unsupervised machine learning technique. For good clustering, the intra-cluster distance must be minimal and the inter-cluster distance maximal.
3.4.1 K-Means

K-means is one of the most famous and powerful partitioning methods used to cluster data. It is an unsupervised, numerical, non-deterministic, iterative clustering process. In K-means, every cluster is represented by the mean value of the elements present in the cluster: n elements are partitioned into K groups, or clusters, so that similarity among elements within a cluster is high and similarity between clusters is low. Similarity is calculated against the mean value of the elements within a cluster, and the process iterates until the cluster memberships remain constant. Li and Wu [21] give more insights.
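A short sketch of how K-means could cluster place coordinates into four regions, as done in Sect. 4.6; the (latitude, longitude) pairs are invented placeholders:

```python
import numpy as np
from sklearn.cluster import KMeans

# Placeholder (latitude, longitude) pairs standing in for the place coordinates.
coords = np.array([[28.6, 77.2], [13.1, 80.3], [19.1, 72.9], [26.9, 75.8],
                   [22.6, 88.4], [8.5, 76.9], [23.0, 72.6], [25.6, 85.1]])

kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)  # K = 4 regions
regions = kmeans.fit_predict(coords)
print(regions)  # cluster index per place, later labeled north/south/east/west
```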
4 Methodology

In this paper, category-based location-aware tourist place popularity prediction and recommendation is performed using the following methodology; refer to Fig. 1 for the proposed framework.
4.1 Dataset

Data were gathered from several heterogeneous tourism review web sites and converted into a common format. The data has the features place description, place name, latitude, longitude, review, and rating. Sentiment has been derived from the rating: if the rating is greater than 3, the sentiment is +1; if it is less than 3, the sentiment is −1; and if it equals 3, the sentiment is 0.
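The rating-to-sentiment mapping just described is simple enough to state directly in code; this sketch restates the rule, nothing more:

```python
def rating_to_sentiment(rating: float) -> int:
    """Label derivation used for the dataset: >3 positive, <3 negative, =3 neutral."""
    if rating > 3:
        return 1
    if rating < 3:
        return -1
    return 0

print([rating_to_sentiment(r) for r in (5, 2, 3)])  # [1, -1, 0]
```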
4.2 Language Detection and Data Cleansing

Data present on tourism review web sites is multilingual; hence, language detection has been performed and non-English records have been deleted. Data preprocessing is a crucial step, as social media data is noisy and raw. Data cleansing has been carried out using steps such as tokenization, stop word elimination,
Fig. 1 Proposed framework for tourist place popularity prediction and recommendation system
lower casing, stemming, and lemmatization. A customized stop word list has been used. Stop words are frequently occurring, irrelevant words that need to be eliminated to improve the performance of machine learning algorithms; the list includes words like a, this, we, you, that, and. In addition to these steps, words shorter than four characters, punctuation characters, numbers, and special symbols have been removed.
4.3 Feature Extraction

Text mining offers numerous feature extraction methods. Feature extraction for topic modeling has been performed using the bag of words (BoW) algorithm, whereas sentiment analysis uses TF-IDF.
4.4 Topic Modeling

Topic modeling has been performed to find the category of a place based on its description, using latent Dirichlet allocation (LDA) with Gibbs sampling. 3900 records were used for training and 152 records for testing the model.
4.5 Sentiment Analysis

Sentiment analysis has been performed for popularity prediction using the decision tree, naive Bayes, support vector machine, and random forest algorithms. Reviews associated with 152 tourist places have been collected; the 3209 reviews were divided into training and testing data in an 80:20 ratio.
4.6 Spatial Data Mining

Spatial data mining has been performed for place region detection, which is helpful for location-aware recommendations. Place regions are clustered using the K-means algorithm, categorizing places into four classes: north, south, east, and west.
4.7 Popularity Prediction

The popularity of the 143 testing places has been predicted. The popularity rating of a tourist place is determined from the total number of positive-sentiment reviews and the total number of reviews; the formula used for popularity rating prediction is described in Sect. 5.
4.8 Recommendation System

A top-N recommendation is given to end users based on their search query. The place in the search query is analyzed, and places of a similar category, a similar region, and a similar category + region are shown to the end user. For cold-start users, a popularity-based recommendation is given.
4.9 Visualization

Results have been visualized using pie charts, bar charts, and gmaps; with gmaps, users can view results over Google Maps. The comparative analysis of the supervised algorithms has been visualized with matplotlib bar charts.
4.10 Performance Evaluation

Machine learning models have to be evaluated to measure their efficiency. The performance of the machine learning models used in this research study has been evaluated with parameters such as accuracy score, recall, precision, and F1-score.
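A brief sketch of computing these metrics with scikit-learn (the label vectors are placeholders):

```python
from sklearn.metrics import accuracy_score, classification_report

y_true = [1, -1, 0, 1, 1, -1, 0, 1]   # placeholder ground-truth sentiments
y_pred = [1, -1, 0, 1, -1, -1, 0, 1]  # placeholder model predictions

print("accuracy:", accuracy_score(y_true, y_pred))
print(classification_report(y_true, y_pred))  # per-class precision, recall, F1
```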
5 Experiment

In the research experiment, 3900 place description records were collected for topic modeling training and 152 records for testing. Reviews associated with the 152 tourist places were collected, giving 3209 reviews in the dataset. Data were assembled from numerous tourism forums and sites into a comma-separated values file comprising the fields place description, place name, latitude, longitude, review, and rating. From the rating, sentiment was computed into three classes (+1, −1, 0), where +1 denotes the positive class, −1 the negative class, and 0 the neutral class; in this way, a labeled dataset was obtained. Latitude and longitude data for the tourist places were also collected for spatial data mining. The dataset was split into 80:20 proportions for training and testing data, respectively. The next step was language detection and data preprocessing. Feature extraction was performed using BoW for topic modeling, and topic modeling was performed using LDA Gibbs sampling to find the category of each place among 12 categories: hill station, historical place, beach, pilgrimage place, art gallery, museum, educational place, desert, botanical garden, national park, aquarium, and amusement park. For sentiment analysis, feature extraction was performed using the TF-IDF algorithm, and sentiment analysis was performed using machine learning algorithms such as naive Bayes, support vector machine, and random forest, categorizing reviews into positive (+1), negative (−1), and neutral (0). Based on the sentiment analysis, the overall popularity rating of each tourist place was predicted. Spatial data mining was performed using the K-means algorithm to categorize places into the regions north, south, east, and west. Ultimately, the top ten recommendations were given based on category, location, and popularity rating.
5.1 Experimental Framework

The experiment was executed on a system with Windows 8.1 Home 64-bit OS (×64-based processor) with the following hardware and software configuration.

Hardware Configuration
• Intel(R) Core(TM) i5-2450M processor @ 2.50 GHz
• 8 GB memory
• 512 GB ROM

Software Configuration
• Python programming language version 3.6
• PyCharm Studio 2018
• Mallet 2.8
• Java Development Kit 1.8
• MySQL Workbench
5.2 Experimental Scenarios

The motive of the experimental scenarios is to estimate the best combination of parameters and hyperparameters for the topic modeling and classification algorithms. Performance evaluation has been performed for every combination of parameters, and the parameters and hyperparameters were selected after several iterations so that the model neither overfits nor underfits. For topic modeling, LDA Gibbs sampling has been used. Its most important parameter is the number of topics, denoted by T, whose value should equal the number of classes; T has been set to 12 because the documents needed to be classified into 12 categories. The hyperparameters alpha and beta, which influence topic sparsity per document and word sparsity per topic, respectively, were set to 50/T and 0.001, and the number of iterations was set to 1000. In sentiment analysis, the classification algorithms decision tree, multinomial naive Bayes, linear support vector machine, and random forest were utilized. For naive Bayes, a multinomial event model was used with alpha assigned the value 0.25. Similarly, for SVM, the parameter gamma was assigned the value 0.01 and C the value 1, with a linear kernel function. For RF, n_estimator was assigned the value 1000 and the gini criterion was selected. For DT, the criterion was set to entropy and max_features to auto. For K-means, the K-value was set to four.
5.3 Implementation

We gathered a dataset from heterogeneous tourism review web sites and converted it into one common format containing place description, place name, latitude, longitude, and comment as feature attributes. As social media data is very raw and noisy, it needs proper preprocessing: non-English records were deleted to make the data monolingual, and data preprocessing consisted of tokenization, punctuation mark removal, stop word and short word removal, lemmatization, and stemming, using a customized stop word list instead of the NLTK stop word list. The next step was feature extraction, performed using bag of words (BoW) for topic modeling and TF-IDF for sentiment analysis. After feature extraction, topic modeling was performed using LDA Gibbs sampling to find the category of each place among the 12 categories. Sentiment analysis was performed over the labeled review data using the decision tree (DT), multinomial naive Bayes (MNB), linear support vector machine (SVM), and random forest (RF) algorithms, and the performance of the models was estimated with the help of accuracy, precision, recall, and F1-score. The popularity of a place is determined using the following equations:

P = (number of positive comments / total number of comments) × 10
Popularity rating = P / 2

If the predicted overall popularity rating is greater than or equal to three, the destination is classified as popular; otherwise, it is classified as unpopular. During the experiment, we found that RF gave the best sentiment classification results, so the popularity prediction and recommendation engine was developed using RF. The popularity prediction results were visualized using the matplotlib and gmap Python libraries; the geographic representation on Google Maps helps end users to visualize place details. The next step was to categorize places into the regions north, south, east, and west using the K-means algorithm with a K-value of four. The last step was the recommendation: a web site was developed where a top-N recommendation is given based on popularity for cold-start users, and similar-region, similar-category, and similar category + region recommendations are given to the end user on the basis of the searched place. The experimental results are shown in the Results section.
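The popularity computation above can be restated directly in code; this sketch implements the two equations and the popular/unpopular threshold exactly as described, with invented review counts:

```python
def popularity_rating(n_positive: int, n_total: int) -> float:
    """Popularity rating on a 0-5 scale: P = (positive/total)*10, rating = P/2."""
    p = (n_positive / n_total) * 10
    return p / 2

rating = popularity_rating(n_positive=41, n_total=50)
print(rating, "popular" if rating >= 3 else "unpopular")  # 4.1 popular
```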
6 Results

Results of the research study are depicted in Figs. 2, 3, 4, and 5. Figure 2 shows the popularity prediction results on gmap. Figure 3 shows the search form for tourists; based on the search, the most popular places similar to the searched place have been
Fig. 2 Geographic result of category detection and popularity prediction of places
Fig. 3 Search form
Fig. 4 Similar places to searched place
Fig. 5 Popularity-based recommendation for cold start user
given in Fig. 4. Figure 5 shows the popularity-based recommendation given to a cold start user.
Fig. 6 Popularity statistic report using random forest
Table 1 Comparison of machine learning algorithms using several factors

Parameters | DT (%) | MNB (%) | LSVM (%) | RF (%)
Accuracy | 80.37 | 84.11 | 85.35 | 86.29
Precision | 81 | 83 | 85 | 86
Recall | 80 | 84 | 85 | 86
F1-measure | 81 | 82 | 84 | 85
7 Comparative Analysis
Comparative analysis of machine learning algorithms for sentiment-based popularity prediction using several parameters is delineated in Table 1.
8 Advantages
• This hybrid recommendation system is a combination of content-based and popularity-based systems, so it will give more precise recommendations to travelers in comparison with only content-based and only popularity-based systems.
• As the popularity of a place is determined independently of user history, cold start users who have not searched much content can still be given popularity-based place recommendations.
9 Disadvantages
• In this system, social influence has not been taken into consideration.
• Also, for giving location aware recommendations, distance has not been taken into consideration.
10 Conclusion
Research presented in this paper reveals the main problem in recommendation systems and proposes a possible solution to overcome it. We can conclude that category-based location aware tourist place popularity prediction and recommendation could be beneficial for giving recommendations to cold start users, and it has improved the quality of recommendation by giving more precise recommendations to an end user, as it uses three factors for recommendation, i.e., category, location, and
popularity. In sentiment analysis, random forest (RF) has outperformed decision tree (DT), support vector machine (SVM), and naive Bayes (NB) on the research dataset used and on the basis of the various evaluation parameters used in the research. So, popularity-based tourist spot classification using RF has been implemented, which classifies spots into popular and unpopular classes and gave an accuracy of 88.02%.
11 Future Scope
A major problem in the tourism domain is data sparsity. So, in the future, we will try to extend the dataset to overcome the data sparsity problem. Also, recommendation quality can be improved using a distance-based location aware system. So, in future work, we will try to incorporate geographic distances in the design of the recommendation system. Acknowledgements We wish to express our gratitude toward Veermata Jijabai Technological Institute, Mumbai for providing laboratory and resources for conducting research work.
References 1. Y. Yang, Y. Duan, X. Wang, Z. Huang, N. Xie, H.T. Shen, Hierarchical multi-clue modelling for POI popularity prediction with heterogeneous tourist information. IEEE (2018). https://doi.org/10.1109/TKDE.2018.2842190 2. R. Joshi, R. Tekc, F. Namous, A. Rodan, Y. Javed, Online news popularity prediction, in 2018 Fifth HCT Information Technology Trends (ITT). https://doi.org/10.1109/CTIT.2018.8649529 3. S. Nakamura, M. Okada, K. Hashimoto, An investigation of effectiveness using topic information order to classify tourists reviews, in 2015 International Conference on Computer Application Technologies. https://doi.org/10.1109/CCATS.2015.32 4. Y. Wang, S.C.-f. Chan, G. Ngai, Applicability of demographic recommender system to tourist attractions: a case study on trip advisor, in 2012 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology. https://doi.org/10.1109/WI-IAT.2012.133 5. Z. Lu, Z. Dou, J. Lian, X. Xie, Q. Yang, Content-based collaborative filtering for news topic recommendation, in Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence. ISBN: 0-262-51129-0 6. K. Kesorn, W. Juraphanthong, A. Salaiwarakul, Personalized attraction recommendation system for tourists through check-in data. IEEE Access 5, 26703–26721 (2017) 7. A. Umanetsa, A. Ferreiraa, N. Leitea, GuideMe—A tourist guide with a recommender system and social interaction, in ScienceDirect Conference on Electronics, Telecommunications and Computers—CETC (2013). https://doi.org/10.1016/j.protcy.2014.10.248 8. Z. Jia, W. Gao, Y. Yang, X. Chen, User-based collaborative filtering for tourist attraction recommendations, in 2015 IEEE International Conference on Computational Intelligence & Communication Technology. https://doi.org/10.1109/CICT.2015.20 9. P.B. Thorat, R.M. Goudar, S. Barve, Survey on collaborative filtering, content-based filtering and hybrid recommendation system. Int. J. Comput. Appl. 110(4), 0975–8887 (2015) 10. C. Pan, W. Li, Research paper recommendation with topic analysis, in International Conference on Computer Design and Applications (ICCDA 2010). https://doi.org/10.1109/ICCDA.2010.5541170
11. A.A. Wadhe, S.S. Suratkar, Tourist place reviews sentiment classification using machine learning techniques, in 2020 International Conference on Industry 4.0 Technology (I4Tech) (2020) 12. J. Ramos, Using TF-IDF to determine word relevance in document queries, in Proceedings of the Twenty-First International Conference on Machine Learning (2003) 13. S.-W. Kim, J.-M. Gil, Research paper classification systems based on TF-IDF and LDA schemes. Hum.-Centric Comput. Inform. Sci. 9, Article number: 30 (2019) 14. D.M. Blei, Probabilistic topic models. Mag. Commun. ACM 55(4), 77–84 (2012) 15. J.R. Quinlan, Induction of decision trees. Mach. Learn. 1, 1 (1986) 16. K. Sarkar, Using character N-gram features and multinomial naive Bayes for sentiment polarity detection in Bengali tweets, in 2018 Fifth International Conference on Emerging Applications of Information Technology (EAIT) 17. Text Classification and Naïve Bayes. https://web.stanford.edu/~jurafsky/slp3/slides/7_NB.pdf 18. C. Burges, A tutorial on support vector machines for pattern recognition. Data Min. Knowl. Disc. 2, 121–167 (1998) 19. L. Breiman, Random Forests. Statistics Department, University of California, Berkeley. https://ieeexplore.ieee.org/document/8167934 20. C. Sheppard, Tree-Based Machine Learning Algorithms: Decision Trees, Random Forests, and Boosting (CreateSpace Independent Publishing Platform, 2017) 21. Y. Li, H. Wu, A clustering method based on K-means algorithm, in 2012 International Conference on Solid State Devices and Materials Science 22. S. Bird, E. Klein, E. Loper, Natural Language Processing with Python. O'Reilly Publication 23. T. Haslwanter, An Introduction to Statistics with Python: With Applications in the Life Sciences. Springer 24. A.C. Müller, S. Guido, Introduction to Machine Learning with Python. O'Reilly Publication 25. C.-Y. Yang, J.-S. Yang, F.-L. Lian, Safe and smooth: mobile agent trajectory smoothing by SVM. Int. J. Innov. Comput. Inform. Control 8(7B) (2012) 26. A.A. Wadhe, S.S. Suratkar, Tourist place reviews sentiment classification using machine learning techniques, in 2020 International Conference on Industry 4.0 Technology (I4Tech)
Maximization of Disjoint K-cover Using Computation Intelligence to Improve WSN Lifetime D. L. Shanthi
Abstract WSNs have been used in different sectors of applications such as industrial, environmental, and social due to the progress of technology and the necessity. Because the network's sensors are restricted by battery power, energy-aware network operation is important. The life extension of a wireless sensor network has been explored in this study by locating a large number of disjoint set covers, where each separate group of sensors covers all of the targets. Instead of keeping all sensor nodes in operation, the service life can be prolonged by about K times by using the sensors of one cover while the sensors of the other covers are in sleep mode. This approach saves both energy and time by processing useful data and reducing duplicate data coming from different sensors in a region. Different configurations of sensor networks have been tested using evolutionary computation-based computational intelligence techniques, namely a genetic algorithm and differential evolution. To make the solutions feasible, a local operator has been incorporated. With integer encoding of solutions, the genetic algorithm performs better than differential evolution in finding a good number of disjoint set covers. Over a continuous search space, DE is highly efficient, but its efficiency has been hampered by integer transformation. Keywords Computational Intelligence (CI) · Coverage · Disjoint sets · Differential evolution (DE) · Genetic algorithm (GA) · Lifetime
1 Introduction
In many applications, wireless sensor networks (WSNs) have demonstrated tremendous usefulness, including battlefield surveillance, environmental monitoring, traffic control, animal tracking, and residential applications. A large number of WSN applications require long-term monitoring in a cost-effective manner; the network needs to be operational for a longer time, so enhancing the network lifetime is of major concern. Exhaustible sensor batteries have prompted the development of techniques
D. L. Shanthi (B) BMS Institute of Technology and Management, Bengaluru, India e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_17
to extend the network's lifetime, and this has become one of the most critical and difficult challenges. Consequently, extending the life of WSNs is an essential study topic. Data processing, routing, device location, topology management, and device control are all challenges that can be addressed to extend the life of WSNs. In a highly dispersed WSN, a fraction of the devices can already address coverage and connection problems [1]. The network coverage problem is one of the issues that has to be solved; from the literature, it has been established that connectivity across the network follows from node coverage, provided that each node's communication range is at least twice its sensing range. The device control technique that schedules the devices' sleep/wakeup activities has proved to be promising in a WSN, where the coverage problem affects how well an area of interest is monitored by sensors. Random placement is used to deploy the sensors to the target location (by dropping from planes). Active and sleep modes are the most common modes of operation for sensors. A sensor needs a lot of energy to perform all of its functions, such as detecting, processing, and communicating. A sensor in sleep mode, on the other hand, uses very little energy and can be activated for complete activities within a predetermined amount of time. In an area where a subset of sensors can entirely cover the target region, the remainder of the sensors may be programmed to enter sleep mode to conserve energy; hence, if there are more such subsets, the lifetime of the WSN may be significantly extended. Another approach to improving network life would be to increase the number of fully covering subsets. The maximum number of full cover subsets is difficult to find since each subset needs to offer full coverage of the target region, yet only one subset of sensors is active at any moment to conduct the monitoring task. The disjoint set cover problem, commonly called the K-COVER problem, is an NP-complete problem in WSNs.
2 Related Work
Wireless sensor networks (WSNs) are known to be energy-constrained, and the endurance of each network sensor is heavily influenced by the battery capacity of the nodes. As a result, WSN research has focused on the network lifespan. Although various energy-efficient solutions for extending network lifespan have been studied, different definitions have been given for lifetime based on different topology settings and procedures. The most common definition of a sensor network's lifespan is the time until the first sensor node fails, which seems to be decidedly unhelpful in many possible deployment scenarios [2]. WSN protocols were compared using different network lifespan criteria, and the implications of these metrics, as well as their use in objectively measuring the performance of WSN data delivery techniques, were explored. In [3], an overview of current WSN advancements, including applicability, design restrictions, and predictive lifetime techniques, was provided. Yıldız et al. [4] used two linear programming (LP) techniques to evaluate the impact of capturing numerous key nodes on WSN life length. The findings indicate that capturing a large number of key nodes in a WSN considerably reduces network life. The challenge
of prolonging the life of dynamically varied WSNs with EH sensors was investigated in [5]. This issue was defined as finding the highest number of covers, each a subset of the sensors covering all of the targets to be tracked. The authors designed a mathematical model of the problem and proposed a search algorithm called the harmonic search algorithm with multiple populations and local search (HSAML). Routing is a vital activity in wireless sensor networks, and it is dependent on the deployment and administration of a flexible and efficient network. Both sensor nodes and the complete network must be optimized for energy consumption and resource management to provide effective routing [6]. To extend the WSN's lifespan, a multicriteria routing system with adaptability was proposed. The clustering method has been shown to extend the lifespan of WSNs. Alghamdi [7] developed a clustering approach to choose an optimum CH based on energy, distance, latency, and safety. The CH is chosen via the hybridization of fireflies and dragonflies [8, 9]. The key contribution of that work is the transformation of the single-objective disjoint set cover problem into a multi-objective problem (MOP). Maximization of the disjoint set cover (DSC) is addressed by increasing the coverage and optimizing the number of DSCs, and scheduling the nodes into a maximum DSC is an NP-hard optimization issue [10]. The authors experimented on a more realistic WSN model to maximize the set cover depending on the target, and a multi-layer genetic algorithm was proposed to determine the most set covers having minimum sensors. Depending on the required configurations for different applications, the design of a WSN is very complex, and its dynamic nature mandates finding the optimum network quickly and adaptively [11]. The authors addressed this issue by designing a genetic algorithm-based self-organizing network clustering (GASONeC) framework, an initiative providing dynamically optimized paths and cluster formation. In any WSN design, the idea to extend the network lifetime is to avoid the energy depletion of a node before other nodes in a homogeneous network [12]. To reduce network energy usage, a dynamic cluster model is suggested for selecting a good CH. One of the most essential ways to reduce energy exhaustion is to use the right routing technique, which may be improved by using the right CH based on the threshold specifications. Low Energy Adaptive Cluster Hierarchy (LEACH) selects the CH on a rotation basis, and there exists an equal probability for high-energy nodes to become CH. Ayati et al. [13] showed a three-layer routing strategy called super cluster head election using fuzzy logic in three layers, in which fuzzy logic is utilized to pick a supercluster head from a group of CHs (SCHFTL). Practically, nodes are deployed randomly in the region of interest, which causes sensor density variations in any application [14]. The problem is optimized by finding more set covers to track the targets in active mode. A greedy-based heuristic technique for scheduling sensors to act on events generated in an environment is suggested to increase network lifetime [15]. To minimize the delay in data transmission, the authors proposed the Optimal Mobility-based Data Gathering (OMDG) approach with the Time-Varying Maximum Capacity Protocol (TMCP), and CHEF is applied for cluster construction; also, a variety of sinks were deployed to collect data from various regions of the network.
3 Representation of Disjoint Set Cover Problem
Consider a given finite set of targets T to be covered by disjoint subsets drawn from a collection of sensor nodes S in a given region; the task is to find the maximum number of covers C_i (C_i ⊆ S) such that each target in the region is covered by at least one sensor of every cover, and C_i ∩ C_j = ∅ for i ≠ j. This is an NP-complete problem. As given in Fig. 1, a network with a set of five nodes and a set of four targets in a region is deployed as a sample. The relation between the sensors S1 … S5 and the targets T1 … T4 may be denoted as a bipartite graph G = (V, E), where V = S ∪ T and e_ij ∈ E if S_i covers T_j. Figure 2 depicts the bipartite graph of the WSN seen in Fig. 1, with S1 = {T1}, S2 = {T1, T2}, S3 = {T2, T3, T4}, S4 = {T3}, and S5 = {T4}. In this case, the maximum number K of disjoint covers is two: C1 = {S1, S3}; C2 = {S2, S4, S5}. Lifetime enhancement of a WSN is achieved by solving the network coverage problem, and this can be done by determining more set covers, especially by dividing the sensors into disjoint sets which cover the targets.
Fig. 1 Example deployment of WSN
Fig. 2 WSN representation using bipartite graph
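To make the cover definition concrete, here is a small illustrative check (not from the paper) that the two covers above are disjoint and that each reaches all four targets:

coverage = {1: {"T1"}, 2: {"T1", "T2"}, 3: {"T2", "T3", "T4"},
            4: {"T3"}, 5: {"T4"}}
targets = {"T1", "T2", "T3", "T4"}
covers = [{1, 3}, {2, 4, 5}]              # C1 and C2 from the text

def is_valid_disjoint_k_cover(covers, coverage, targets):
    all_sensors = [s for c in covers for s in c]
    disjoint = len(all_sensors) == len(set(all_sensors))   # no sensor reused
    complete = all(set.union(*(coverage[s] for s in c)) >= targets
                   for c in covers)                        # each cover hits every target
    return disjoint and complete

print(is_valid_disjoint_k_cover(covers, coverage, targets))  # True, so K = 2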
Fig. 3 A sample representation of solution with its fitness for sensors and the targets covered (sensors grouped into a 1st cover and a 2nd cover, with the remaining sensors unassigned)
4 Methodology
4.1 Fitness Value and Solution Representation
Sensors are given distinct numbers from 1 to the maximum number of sensors (NS) in the network area when solutions are encoded. Figure 3 illustrates a sample solution representation with ten sensors in the region and 4 targets to be covered. Fitness estimation is given in the figure for the set covers in the region as specified in Fig. 2. In Fig. 3, it can be noticed that 2 disjoint set covers have been discovered; this gives the fitness of a solution. That is, the total number of disjoint covers developed specifies the fitness of the solution, and in the given sample, the solution {7, 5, 1, 9, 10, 4, 2, 8, 6, 3} has a fitness of 2, as there are 2 disjoint covers in the region.
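A minimal sketch of this fitness evaluation (illustrative Python, not the authors' MATLAB code; the per-sensor coverage shown is a hypothetical assignment consistent with Fig. 3):

def fitness(permutation, coverage, targets):
    # Scan the permutation left to right, closing a cover as soon as the
    # accumulated sensors reach every target; fitness = number of covers.
    k, covered = 0, set()
    for sensor in permutation:
        covered |= coverage[sensor]
        if covered >= targets:        # current group covers all targets
            k += 1                    # one more disjoint cover found
            covered = set()           # start accumulating the next cover
    return k

coverage = {7: {"T3"}, 5: {"T2"}, 1: {"T1"}, 9: {"T3"}, 10: {"T4"},
            4: {"T2"}, 2: {"T1"}, 8: {"T3", "T4"}, 6: {"T3"}, 3: {"T1"}}
print(fitness([7, 5, 1, 9, 10, 4, 2, 8, 6, 3],
              coverage, {"T1", "T2", "T3", "T4"}))   # prints 2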
4.2 Locally Corrected Genetic Algorithm
The work is carried out using a genetic algorithm (GA), and offspring were generated in a more natural way by provisioning equal chances for each member. This approach differs from the conventional type of GA, in which parent selection is fitness-focused. Giving all members an equal possibility of becoming a parent is more natural and provides a greater chance to explore. The solution space was explored with a two-point crossover, while tournament selection performs the exploitation for the following generation, as illustrated in Fig. 4. The initial population is chosen by a random generator, which produces random permutations of the integer values from 1 to NS (NS: total number of sensors). Two random parents were chosen from
Fig. 4 Suggested genetic algorithm with a locally corrected operator
the population, and a two-point crossover was used to generate offspring, based on the equal-opportunity criterion defined above. The integer mutation strategy was used to further alter the children, with the mutated values ranging from 1 to NS. Following a mutation, offspring may contain the same sensor at multiple locus points at the same time, resulting in an infeasible solution. This condition is rectified by replacing the duplicated sensors with randomly selected sensors missing from the current solution, making the solution feasible. When the parents and the offspring are of the same sizes, the two populations are merged into a pool, and then, a tournament selection is applied over it to determine the sample population for the following generation. Depending on the termination
condition, the previous population will be replaced by the next generation population, and the process is repeated, or a final solution from the next generation is obtained.
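The mutation plus local correction step described above can be sketched as follows (illustrative Python, not the authors' implementation; parameter names are assumptions):

import random

def mutate_and_correct(individual, ns, p_mut=0.1):
    child = individual[:]
    for i in range(len(child)):
        if random.random() < p_mut:
            child[i] = random.randint(1, ns)     # integer mutation
    # Local correction: replace duplicated sensor IDs with the missing ones
    # so the child remains a permutation of 1..NS (a feasible solution).
    missing = list(set(range(1, ns + 1)) - set(child))
    random.shuffle(missing)
    seen = set()
    for i, gene in enumerate(child):
        if gene in seen:
            child[i] = missing.pop()             # duplicate -> missing sensor
        else:
            seen.add(gene)
    return child

print(mutate_and_correct(list(range(1, 11)), ns=10))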
4.3 Differential Evolution (DE)
DE is currently one of the most powerful evolutionary computation techniques for addressing global optimization problems. The DE population includes NP individuals, each of which is a D-dimensional vector, corresponding to the D dimensions of the task. During every generation, different mutation strategies under DE are applied to generate a D-dimensional donor vector. Two methods, namely DE/rand/1 as in Eq. (1) and DE/current-to-best/rand/1 as in Eq. (2), have been used in this research. The crossover operator in a probabilistic environment was applied to generate the trial vector, as shown in Eq. (3). CR, a crossover control parameter or factor with a range of [0, 1], represents the probability of creating parameters of the trial vector from the mutant vector. Index j_rand is a randomly selected integer in the range [1, NP]. The target and corresponding trial vectors are then chosen for the following generation using a greedy selection procedure, as described in Eq. (4). Rounding was used to arrive at integer values.

$$V_i^{(G)} = X_{r_1}^{(G)} + F \left( X_{r_2}^{(G)} - X_{r_3}^{(G)} \right) \tag{1}$$

$$V_i^{(G)} = X_i^{(G)} + F \left( X_{best}^{(G)} - X_i^{(G)} \right) + F \left( X_{r_1}^{(G)} - X_{r_2}^{(G)} \right) \tag{2}$$

$$u_{ij}^{(G)} = \begin{cases} v_{ij}^{(G)} & \text{if } rand(0,1) \le CR \text{ or } j = j_{rand} \\ x_{ij}^{(G)} & \text{otherwise} \end{cases} \tag{3}$$

$$X_i^{(G+1)} = \begin{cases} U_i^{(G)} & \text{if } f\big(U_i^{(G)}\big) \le f\big(X_i^{(G)}\big) \\ X_i^{(G)} & \text{otherwise} \end{cases} \tag{4}$$
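For illustration, the DE/rand/1 mutation of Eq. (1) and the binomial crossover of Eq. (3) can be sketched in NumPy as follows (the authors worked in MATLAB; the function names here are assumptions):

import numpy as np

rng = np.random.default_rng(0)

def de_rand_1(pop, i, F=0.5):
    """Eq. (1): donor vector from three distinct random members (excluding i)."""
    r1, r2, r3 = rng.choice([k for k in range(len(pop)) if k != i],
                            size=3, replace=False)
    return pop[r1] + F * (pop[r2] - pop[r3])

def binomial_crossover(x, v, CR=0.5):
    """Eq. (3): trial vector mixing target x and donor v."""
    d = len(x)
    j_rand = rng.integers(d)
    mask = rng.random(d) <= CR
    mask[j_rand] = True                  # guarantee at least one donor component
    return np.where(mask, v, x)

pop = rng.random((10, 6))                # NP = 10 members, D = 6 dimensions
v = de_rand_1(pop, i=0)
u = np.rint(binomial_crossover(pop[0], v))   # rounding to integer values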
5 Experimental Set-Up and Results
The suggested forms of GA and DE were applied to various configurations of a simulated WSN. Sensors have been randomly deployed into 2D space, and targets have been placed; these conditions are reproduced by assigning random (x, y) values to the sensors and targets. In the first step, the coverage matrix recording which targets are covered by each individual sensor is determined with the help of each sensor's Euclidean distance from the target. The coverage matrix construction has
been demonstrated in Fig. 5. The entire simulation procedure was built in a MATLAB setting. The population size was 100, and the number of generations permitted was 100 for both GA and DE. The mutation probability of GA was assumed to be 0.1, while the tournament size was 10%. In DE, the F value was 0.5 and CR was 0.5.
Case-1: Figure 6 depicts a network set-up with parameters as defined in Table 1 for case-1. Looking into the network, it can be seen that some sensors are redundant and do not cover any target in the region, so they can be eliminated, and the final
Fig. 5 Formation of coverage matrix
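The coverage matrix construction of Fig. 5 can be sketched as follows (illustrative NumPy, not the authors' MATLAB code), using the case-1 parameters of 100 sensors, 10 targets, and sensing range 20:

import numpy as np

rng = np.random.default_rng(1)
sensors = rng.uniform(0, 100, size=(100, 2))    # random (x, y) positions
targets = rng.uniform(0, 100, size=(10, 2))
R = 20                                          # sensing range

# Entry (s, t) is 1 when sensor s lies within sensing range of target t.
dist = np.linalg.norm(sensors[:, None, :] - targets[None, :, :], axis=2)
cover = (dist <= R).astype(int)                 # 100 x 10 coverage matrix

useful = cover.any(axis=1)                      # drop sensors covering nothing
print(cover[useful].shape)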
Fig. 6 Deployment of network for case-1 (100 nodes and 10 targets)
Table 1 Network set-up parameters

Number of sensors deployed | 100
Number of useful sensors | 86
Number of targets | 10
Sensing range | 20
coverage matrix with useful nodes is shown in Fig. 7, and the corresponding coverage matrix is shown in Table 2. From the table, it can be seen that every sensor considered covers at least a single target in the network. Figure 8 compares the performance of GA and DE in the 100th generation (the most recent generation) to the first generation. At the start, fitness varied from four to six covers, but DE created a maximum of eight covers while GA found 10 covers. Convergence in fitness is also seen in Fig. 9. The final covers found by DE and GA are presented in Table 3, where it can be seen that all targets may be covered with relatively low numbers of sensors. In the case of DE, certain unused sensors are left which cannot be reused if necessary.
Fig. 7 Network after applying coverage matrix under case-1
Table 2 Coverage matrix determined for the network in case-1

Sensor | T1 T2 T3 T4 T5 T6 T7 T8 T9 T10
1 | 0 0 0 0 1 1 0 0 0 0
2 | 0 0 0 0 1 1 0 0 0 0
3 | 0 0 1 1 0 1 0 0 0 0
4 | 0 0 1 1 1 0 0 0 0 0
5 | 0 0 0 0 0 0 0 0 1 0
6 | 0 0 0 0 0 0 0 0 0 1
7 | 0 0 0 0 0 0 1 0 0 0
8 | 0 0 0 0 0 0 0 1 0 0
9 | 1 0 1 0 0 0 0 0 0 0
10 | 0 0 1 0 0 0 0 0 0 0
11 | 0 0 1 0 1 1 0 1 0 1
12 | 1 0 0 0 0 0 0 0 0 0
13 | 1 0 0 0 0 0 0 0 0 0
14 | 0 0 0 0 0 0 1 0 0 0
15 | 0 1 0 0 0 0 0 0 0 0
16 | 0 0 0 0 0 0 1 0 1 0
17 | 1 0 0 0 0 0 0 0 0 0
18 | 0 0 0 1 0 0 0 0 0 0
19 | 0 0 0 0 0 0 0 0 1 0
20 | 0 0 0 0 0 0 1 0 1 0
21 | 1 0 0 0 0 0 0 0 0 0
22 | 0 0 0 0 0 0 1 0 0 0
23 | 0 0 1 0 0 0 0 0 0 0
24 | 0 1 0 0 0 0 0 0 0 0
25 | 1 0 0 0 0 0 0 0 0 0
26 | 0 0 1 0 0 0 0 0 0 0
27 | 0 0 0 0 0 0 1 0 1 0
28 | 0 1 0 0 0 0 0 0 0 1
29 | 0 0 0 1 0 0 0 0 0 0
30 | 0 0 1 1 1 1 0 0 0 0
31 | 0 0 0 1 0 0 0 0 0 0
32 | 0 0 0 0 0 0 0 1 0 1
33 | 1 0 0 0 0 0 0 0 0 0
34 | 0 0 0 1 0 0 0 0 0 0
35 | 0 0 0 0 0 0 1 0 0 0
36 | 0 0 0 1 1 0 0 0 0 0
37 | 0 0 0 0 0 0 1 0 0 0
38 | 0 1 0 0 0 0 0 0 0 0
39 | 0 0 0 0 0 0 1 0 1 0
40 | 0 1 0 0 0 0 0 0 0 0
41 | 0 0 0 0 0 0 0 1 0 1
42 | 0 0 1 0 0 0 0 0 0 0
43 | 0 0 0 0 0 0 1 0 1 0
44 | 0 0 0 0 0 0 0 0 1 0
45 | 1 0 0 0 0 0 0 0 0 0
46 | 0 0 0 0 0 0 1 0 1 0
47 | 1 0 0 0 0 0 0 0 0 0
48 | 0 1 0 0 0 0 0 0 0 0
49 | 1 0 0 0 0 0 0 0 0 0
50 | 0 0 1 0 0 1 0 1 0 1
51 | 0 0 0 0 0 1 0 1 0 1
52 | 1 0 0 0 0 0 0 1 0 1
53 | 1 0 0 0 0 0 0 0 0 0
54 | 0 0 1 1 1 1 0 0 0 0
55 | 0 0 0 0 0 1 0 0 0 0
56 | 0 0 0 1 0 0 0 0 0 0
57 | 0 0 0 0 0 0 0 0 1 0
58 | 0 0 1 0 0 0 0 0 0 0
59 | 0 0 0 0 0 0 0 1 0 1
60 | 0 1 0 0 0 0 0 0 0 0
61 | 0 0 0 1 1 1 0 0 0 0
62 | 0 0 0 0 0 0 1 0 1 0
63 | 0 0 0 0 0 0 1 0 1 0
64 | 0 0 0 1 0 0 0 0 0 0
65 | 0 0 0 0 0 0 1 0 1 0
66 | 0 1 0 0 0 0 0 0 0 0
67 | 0 0 0 0 0 0 0 1 0 1
68 | 0 0 0 0 0 0 0 0 1 0
69 | 0 0 0 1 0 0 0 0 0 0
70 | 0 0 1 0 0 0 0 0 0 0
71 | 1 0 0 0 0 0 0 0 0 0
72 | 1 0 0 0 0 0 0 0 0 0
73 | 0 0 0 0 0 0 0 0 1 0
74 | 0 1 0 0 0 0 0 0 0 0
75 | 1 0 0 0 0 0 0 0 0 0
76 | 1 0 0 0 0 0 0 0 0 0
77 | 0 0 0 0 0 0 0 0 1 0
78 | 0 0 1 0 1 0 0 0 0 0
79 | 0 0 0 0 0 0 0 1 0 1
80 | 0 0 1 1 1 1 0 0 0 0
81 | 0 0 0 1 1 0 0 0 0 0
82 | 0 0 0 0 0 0 0 1 0 1
83 | 0 0 0 0 0 0 0 1 0 1
84 | 0 1 0 0 0 0 0 0 0 0
85 | 0 0 1 1 1 1 0 0 0 1
86 | 0 0 1 0 1 1 0 0 0 0
Fig. 8 Solution under various generations for population and fitness case-1 (K-cover value of each population member in the 1st generation and in the 100th generation for DE and GA)
Fig. 9 Fitness convergence for best solutions in case-1 (K-cover value of the best DE and GA solutions over 100 generations)
The proposed algorithm is tested on different experimental set-ups under different network configurations, with the number of sensors varied over 100, 200, and 500 and the number of targets over 10, 10, and 15, respectively, keeping the same sensing range of 20 in all cases. In all the cases, the proposed GA has performed better by determining the maximum number of disjoint set covers, as shown in Table 4.
6 Conclusion
The adoption of a disjoint cover-based approach can save a lot of battery energy in most WSN applications where sensor placement is random and a high number of sensors are dropped. The disjoint covers are quite difficult to find, although heuristic algorithms such as GA have been highly successful. A further advantage of GA is its direct integer coding, which avoids the error of converting continuous variables to integer values. Over a continuous search space, DE is highly efficient, but its efficiency has been hampered by integer transformation.
Table 3 Disjoint set covers under DE and GA (the sensor numbers forming each K-disjoint cover found by DE and by GA, together with the sensors left unused)
Table 4 Comparison of algorithms with different network set-ups

Case | Sensors deployed | Useful sensors | Targets | Sensing range | Disjoint set covers (DE) | Disjoint set covers (GA)
1 | 100 | 86 | 10 | 20 | 8 | 10
2 | 200 | 128 | 10 | 20 | 14 | 15
3 | 500 | 442 | 15 | 20 | 26 | 28
References 1. Y. Xu, J. Fang, W. Zhu, Differential evolution for lifetime maximization of heterogeneous wireless sensor networks. Math. Probl. Eng. (2013). https://doi.org/10.1155/2013/172783 2. N.H. Mak, W.K.G. Seah, How long is the lifetime of a wireless sensor network? Int. Conf. Adv. Inf. Netw. Appl. 2009, 763–770 (2009). https://doi.org/10.1109/AINA.2009.138 3. H. Yetgin, K.T.K. Cheung, M. El-Hajjar, L.H. Hanzo, A survey of network lifetime maximization techniques in wireless sensor networks. IEEE Commun. Surv. Tutorials 19(2), 828–854, Second quarter 2017. https://doi.org/10.1109/COMST.2017.2650979 4. H.U. Yıldız, B. Tavlı, B.O. Kahjogh, Assessment of wireless sensor network lifetime reduction due to elimination of critical node sets, in 2017 25th Signal Processing and Communications Applications Conference (SIU), 2017, pp. 1–4. https://doi.org/10.1109/SIU.2017.7960228 5. C.C. Lin, Y.C. Chen, J.L. Chen et al., Lifetime enhancement of dynamic heterogeneous wireless sensor networks with energy-harvesting sensors. Mob. Netw. Appl. 22, 931–942 (2017) 6. F. El Hajji, C. Leghris, K. Douzi, Adaptive routing protocol for lifetime maximization in multi-constraint wireless sensor networks. J. Commun. Inf. Netw. 3, 67–83 (2018). https://doi.org/10.1007/s41650-018-0008-3 7. T.A. Alghamdi, Energy efficient protocol in wireless sensor network: Optimized cluster head selection model. Telecommun. Syst. 74, 331–345 (2020). https://doi.org/10.1007/s11235-020-00659-9 8. B.A. Attea, E.A. Khalil, S. Özdemir et al., A multi-objective disjoint set covers for reliable lifetime maximization of wireless sensor networks. Wirel. Pers. Commun. 81, 819–838 (2015). https://doi.org/10.1007/s11277-014-2159-3 9. M.K. Singh, Discovery of redundant free maximum disjoint Set-k-covers for WSN life enhancement with evolutionary ensemble architecture. Evol. Intel. 13, 611–630 (2020). https://doi.org/10.1007/s12065-020-00374-z 10. M.F. Abdulhalim, B.A. Attea, Multi-layer genetic algorithm for maximum disjoint reliable set covers problem in wireless sensor networks. Wirel. Pers. Commun. 80, 203–227 (2015). https://doi.org/10.1007/s11277-014-2004-8 11. X. Yuan, M. Elhoseny, H.K. El-Minir et al., A genetic algorithm-based, dynamic clustering method towards improved WSN longevity. J. Netw. Syst. Manage. 25, 21–46 (2017). https://doi.org/10.1007/s10922-016-9379-7 12. M. Elhoseny, A.E. Hassanien, Extending homogeneous WSN lifetime in dynamic environments using the clustering model, in Dynamic Wireless Sensor Networks. Studies in Systems, Decision, and Control, vol. 165 (Springer, Cham, 2019). https://doi.org/10.1007/978-3-319-92807-4_4 13. M. Ayati, M.H. Ghayyoumi, A. Keshavarz-Mohammadiyan, A fuzzy three-level clustering method for lifetime improvement of wireless sensor networks. Ann. Telecommun. 73, 535–546 (2018). https://doi.org/10.1007/s12243-018-0631-x
14. J. Sahoo, B. Sahoo, Solving target coverage problem in wireless sensor networks using greedy approach, in 2020 International Conference on Computer Science, Engineering and Applications (ICCSEA) (2020), pp. 1–4. https://doi.org/10.1109/ICCSEA49143.2020.9132907 15. A.R. Aravind, R. Chakravarthi, N.A. Natraj, Optimal mobility based data gathering scheme for life time enhancement in wireless sensor networks, in 2020 4th International Conference on Computer, Communication and Signal Processing (ICCCSP) (2020), pp. 1–5. https://doi.org/10.1109/ICCCSP49186.2020.9315275
Car-Like Robot Tracking Using Particle Filter Cheedella Akhil, Sayam Rahul, Kottam Akshay Reddy, and P. Sudheesh
Abstract This paper proposes a method that tracks a remote-controlled car-like robot in an outdoor environment by using the particle filter algorithm. The proposed model uses a sampling-based recursive Bayesian algorithm, which is implemented in the state estimator particle filter object. A geometrical model tracker distinguishes particles by resampling from invalid particles to make a more precise prediction. Commands are sent to the robot, and the robot pose measurement is provided by an on-board global positioning system (GPS). The proposed algorithm predicts the estimated path, and the error has been calculated to evaluate the performance of the tracking model. Keywords Particle filter · State-estimator · Bayesian algorithm · State transition function
1 Introduction
Particle filter techniques are an established approach for producing samples of a desired distribution without requiring strong assumptions about the state space model or the state distribution [1]. The particle filter uses a set of samples (particles) to represent the posterior distribution of a stochastic process given noisy and partial observations. A stochastic process may be a deterministic process, where there is only one possible reality, or a random process with several possible evolutions, characterized by probability distributions where the starting point might be known [2]. It is a time series model consisting of a series of random states or variables, and measurements are taken at discrete times [3]. In the state space representation, the state vector consists of all information needed to describe the investigated system and is normally multi-dimensional. The measurement vector represents observations associated with the state vector and is commonly of lower dimension than the state vector [3]. There are known motion commands sent to the robot, but the robot cannot
C. Akhil · S. Rahul · K. A. Reddy · P. Sudheesh (B) Department of Electronics and Communication Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_18
execute the exact commanded motion due to mechanical slack or model inaccuracy. This paper shows how to use a state estimator particle filter to reduce the effect of noise in the measurement data and get a more accurate estimate of the pose of the robot [4]. This paper is structured as follows. An overview of related works is presented in Sect. 2. The probabilistic inference of the sampling-based particle filter tracking framework for the car-like robot, and the detailed algorithm implemented over the framework, are elaborated in Sect. 3. The experimental results are illustrated in Sect. 4. Finally, the conclusion and future works are addressed in Sect. 5.
2 Related Work
The main goal of the paper is to have a low-cost and more informative model that has some advantages over other tracking algorithms that have been studied widely. Many researchers have worked on tracking objects given limited data, especially tracking of pedestrians. In the scenario of real vehicle tracking, a number of them employ assumptions such as background subtraction or scenes with fixed entrances and exits. Car-like robot tracking and its behavior analysis using many samples is summarized in [1]. The loss of pose caused by a roofed area or some other obstacle is quite common. Many algorithms have been proposed to recover from and track through failures caused by such interference along the moving path. Some of them focus on data association, or seek to fill in the positions within the interference area by optimizing the trajectory, etc. [1]. They can still work when the goal is to achieve greater accuracy, except for the strategies that depend on a fixed line of sight with no obstacles present along the target's path. The line graph of the robot's path and its visual representation are at greater risk in noisy and cluttered backgrounds [3]. By introducing online learning into the tracking framework, it is possible to update the appearance model when the appearance of the target changes; however, such models lack the semantic information that helps in distinguishing the target from surrounding objects. The sampling-based recursive Bayesian model is very popular in detecting and tracking objects. Without any constraint on the global appearance, when a few poses of a target are lost because of obstacles, the visible parts of the path still offer valid observations for tracking. Procedures with a sampling-based model are more effective at estimating the object. A compact model works only in a very restricted scene, which further motivates the use of the sampling-based model [1]. In plenty of research, the sampling-based model is used in object tracking for a given path, mainly focusing on predicting an accurate path of the target or studying the geometric shape of the samples extracted; however, this work is interested in car-like robot tracking [3]. Bayesian filtering strategies have been broadly used for the car tracking problem. Methods using the Kalman filter have a low computation complexity, while
the particle filter can represent a broader class of distributions and model nonlinear transformations. A car-like robot was tracked using a particle filter by modeling trajectories at intersections with particles that were propagated to determine the movement pattern. Particles of 2D feature points were tracked on a grid map, and weights can be measured with the help of the occupancy grid, which distinguishes static from moving elements. Both seek to track a car-like robot in two-dimensional space [4]. To find the mean square error of tracking, the distance between the actual pose and the estimated pose has been calculated for hypothetical samples, and the RMS of those samples was calculated [3].
3 Procedure
The sampling-based car-like robot framework is illustrated in this section. For estimating the state of the car-like robot system, the particle filter algorithm is ideally suited, since the particle filter can deal with the inherent nonlinearities. The noisy measurements of the robot pose are (x, y, θ), and $\dot{x}$, $\dot{y}$, $\dot{\theta}$, $\dot{\phi}$ are the velocities, given in Eqs. (1)–(4):

$$\dot{x} = v \cos \theta \tag{1}$$

$$\dot{y} = v \sin \theta \tag{2}$$

$$\dot{\theta} = (v/L) \tan \phi \tag{3}$$

$$\dot{\phi} = \omega \tag{4}$$
´ is The linear and angular velocities which are sent to robot are v and ω, and Ø the wheel orientation which is not included in estimation [5]. Car-like robotic drives and modifies its speed and steerage attitude continuously as shown in Eq. (5). The full state of car from observer’s point of view will be, [x, y, θ, x, ˙ y˙ , θ˙ ]
(5)
The pose of the robot is measured by some noisy external system, e.g., a Vicon or a GPS system. Along the path, the robot drives through a roofed area where no measurement can be made. The noisy measurement covers only the robot's partial pose (x, y, θ); no measurement is available for the front wheel orientation (φ) or for any of the velocities ($\dot{x}$, $\dot{y}$, $\dot{\theta}$, $\dot{\phi}$). The linear and angular velocity commands sent to the robot are vc and ωc, respectively. There will be some difference between the commanded motion and the actual motion of the robot. To estimate the partial pose (x, y, θ) of the car-like robot from the observer's perspective, the full state of the car, given
in Eq. (5), uses the state estimator PF to process the two noisy inputs and make the best estimate of the current pose. At the predict stage, the states of the particles are updated with a simplified, unicycle-like robot model, as shown in the velocity equations above. The system model used for state estimation is not an exact representation of the actual system. This is acceptable, as long as the model difference is well-captured in the system noise, as given in Eqs. (1), (2), and (4). At the correct stage, the importance weight (likelihood) of a particle is determined by its error norm from the current measurement, $\sqrt{(\Delta x)^2 + (\Delta y)^2 + (\Delta \theta)^2}$, as there are measurements only on these three components. This model configures the particle filter using 5000 particles. Initially, all particles are randomly picked from a normal distribution with mean at the initial state and unit covariance. Each particle contains six state variables, as shown in Eq. (5). The third variable is marked as circular since it is the car orientation. It is also very important to specify two callback functions, the state transition function and the measurement likelihood function. These two functions directly determine the performance of the particle filter. Here, the commanded linear and angular velocities to the robot are arbitrarily picked time-dependent functions. Also, the fixed-rate timing of the loop is realized through rate control to run the loop at 20 Hz for 20 seconds using fixed-rate support. The fixed-rate object is reset to restart the timer. The timer is reset right before running the time-dependent code.
1. State Transition Function
The sampling-based state transition function evolves the particles based on a prescribed motion model so that the particles form a representation of the proposal distribution. Below is an example of a state transition function based on the velocity motion model of a unicycle-like robot. The sd1, sd2, and sd3 values are decreased to see how the tracking performance deteriorates. Here, sd1 represents the uncertainty in the linear velocity, sd2 represents the uncertainty in the angular velocity, and sd3 is an additional perturbation on the orientation.
2. Measurement Likelihood Function
The measurement likelihood function computes the likelihood for each predicted particle based on the error norm between the particle and the measurement. The importance weight for each particle is assigned based on the computed likelihood.
Here, predict particles are a N × 6 matrix (N is the number of particles), and measurement is a 1 × 3 vector.
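For illustration, the two callbacks can be sketched in NumPy as follows (the paper's implementation uses a MATLAB-style state estimator PF object; the function names, noise levels, and the inverse-error-norm likelihood below are assumptions):

import numpy as np

rng = np.random.default_rng(2)

def state_transition(particles, v_cmd, w_cmd, dt, sd1=0.2, sd2=0.2, sd3=0.05):
    # particles is an N x 6 matrix [x, y, theta, xdot, ydot, thetadot]
    n = len(particles)
    v = v_cmd + sd1 * rng.standard_normal(n)      # noisy linear velocity
    w = w_cmd + sd2 * rng.standard_normal(n)      # noisy angular velocity
    theta = particles[:, 2].copy()                # heading before the step
    particles[:, 0] += v * np.cos(theta) * dt     # x
    particles[:, 1] += v * np.sin(theta) * dt     # y
    particles[:, 2] += w * dt + sd3 * rng.standard_normal(n)  # theta
    particles[:, 3] = v * np.cos(theta)           # xdot
    particles[:, 4] = v * np.sin(theta)           # ydot
    particles[:, 5] = w                           # thetadot
    return particles

def measurement_likelihood(particles, z):
    # z is the 1 x 3 measured pose [x, y, theta]; angle wrapping omitted.
    err = particles[:, :3] - z
    norm = np.sqrt((err ** 2).sum(axis=1))        # error norm per particle
    return 1.0 / (norm + 1e-9)                    # smaller error, higher weight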
4 Results
The tracking performance is evaluated for a car-like robot by using the particle filter. The particle filter tracks the car as it drives away from the initial pose, which is depicted in Fig. 1. The robot moves across the roofed area as shown in Fig. 2. There, measurements cannot be made, and the particles evolve based only on the prediction model (marked with orange color). The particles gradually form a horseshoe-like front, and the estimated pose gradually deviates from the actual pose until the robot has moved out of the roofed area. With the new measurements, the estimated pose gradually merges with the actual path.
Fig. 1 Environment where the car-like model travels
Fig. 2 Path of robot on different surfaces
Table 1 Calculation of RMS from values of actual path and estimated path

X-position (m) | Actual Y-position (m) | Estimated Y-position (m) | Error (m)
0 | 0 | 0 | 0
2 | 0.68 | 0.68 | 0
4 | 1 | 1 | 0
6 | 1.86 | 1.89 | 0.03
6.25 | 1.89 | 1.93 | 0.04
6.5 | 1.95 | 2.05 | 0.1
6.75 | 1.98 | 2.22 | 0.24
7 | 2 | 2.05 | 0.05
8 | 2.42 | 2.42 | 0
10 | 3.05 | 3.05 | 0
The actual path and estimated path values are calculated from the resultant output graphs, and the error is calculated at ten instants and tabulated in Table 1. Root mean square (RMS) error = 0.0871779 m
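As a quick check, the RMS value can be recomputed from the Error column of Table 1 (a sketch; the small discrepancy with the reported figure comes from rounding in the tabulated errors):

errors = [0, 0, 0, 0.03, 0.04, 0.1, 0.24, 0.05, 0, 0]
rms = (sum(e ** 2 for e in errors) / len(errors)) ** 0.5
print(round(rms, 4))   # ~0.0852 with the rounded table values vs. 0.0871779 m reported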
5 Conclusion
To cope with partial observation and varying viewpoints in car-like robot tracking, this research proposes a sampling-based particle filter and evaluates the error between the actual path and the estimated path. This approach brings a sampling-based Bayesian model into the particle filter framework. The path followed by the robot is extracted from the pre-estimated path and the geometric model, ensuring that it does not drift onto inappropriate objects. During prediction of the path, a roofed area is placed on the robot's route, where the robot slightly deviates from its actual path; this roofed-area environment was used to calculate the error. Comparing the estimated path to the actual path, i.e., the error, is of great help in obtaining better observations and an efficient prediction. Qualitative and quantitative analyses show that the proposed algorithm handles obstacles and changes of route in car-like robot tracking.
References 1. J. Hui, Tracking a Self-Driving Car with Particle Filter: A Survey. Published online at jonathanhui.medium.com, Apr 2018
2. G.M. Rao, C. Satyanarayana, Visual object target tracking using particle filter: A survey. Published online in MECS 5(6), 57–71, May (2013) 3. F. Gustafsson, F. Gunnarsson, N. Bergman, U. Forssell, J. Jansson, R. Karlsson, P.-J. Nordlund, Particle filters for positioning, navigation, and tracking, in Proceedings of the IEEE Transactions on Signal Processing, vol. 50(2) (2002) 4. P.L.M. Bouttefroy, A. Bouzerdoum, S.L. Phung, A. Beghdadi, Vehicle tracking using projective particle filter, in Sixth IEEE International Conference on Advanced Video and Signal Based Surveillance (2009), p. 9 5. S.-K. Weng, C.-M. Kuo, S.-K. Tu, Video object tracking using adaptive Kalman filter. J. Vis. Commun. Image Represent. 1190–1208 (2006) 6. K.R. Li, G.T. Lin, L.Y. Lee, J.C. Juang, Application of particle filter tracking algorithm in autonomous vehicle navigation, in CACS International Automatic Control Conference (CACS), Dec 2013, pp. 250–255 7. Y. Fang, C. Wang, H. Zhao, H. Zha, On-road vehicle tracking using part-based particle filter, in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Sept 2017, pp. 694–711 8. P.D. Moral, Nonlinear filtering using random particles. Theory Probab. Appl. 40(4), 690–701 (1995) 9. M.S. Arulampalam, M. Simon, G. Neil, C. Tim, A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking. IEEE Trans. Signal Process. 50(2) (2002) 10. C.V. Reddy, K.V. Padmaja, Estimation of SNR, MSE and BER with incremental variation of power for wireless channel. Int. J. Eng. Res. Technol. 3(7), Jul (2014) 11. K. Roy, B. Levy, C.J. Tomlin, Target tracking and estimated time of arrival (ETA) prediction for arrival aircraft, in AIAA Guidance, Navigation and Control Conference and Exhibit (2006), pp. 112–183 12. M.G. Muthukrishnan, P. Sudheesh, M. Jayakumar, Channel estimation for high mobility MIMO system using particle filter, in International Conference on Recent Trends in Information Technology (ICRTIT) (2016), pp. 1–6 13. S.J. Sreeraj, R. Ramanathan, Improved geometric filter algorithm for device free localization, in International Conference on Wireless Communications, Signal Processing and Networking (2017), pp. 914–918 14. S.K. Megha, R. Ramanathan, Impact of anchor position errors on WSN localization using mobile anchor positioning algorithm, in International Conference on Wireless Communications, Signal Processing and Networking (2017), pp. 1924–1928 15. A.S. Chhetri, D. Morrell, A. Papandreou-Suppappola, The use of particle filtering with the unscented transform to schedule sensors multiple steps ahead, in IEEE International Conference on Acoustics, Speech, and Signal Processing (2004), pp. 301–304 16. G. Ignatius, U. Murali Krishna Varma, N.S. Krishna, P.V. Sachin, P. Sudheesh, Extended Kalman filter based estimation for fast fading MIMO channels, in IEEE International Conference on Devices, Circuits and Systems (ICDCS) (2012), pp. 466–469
Secured E-voting System Through Blockchain Technology Nisarg Dave, Neev Shah, Paritosh Joshi, and Kaushal Shah
Abstract Election is the heart and soul of any democratic country. Hence, it is mandatory that elections should be conducted in a safe and secure environment. Also it is the duty of voters to present themselves on the Election Day. However, there are various security issues with present-day election processes like fake voting and vote tampering. Also, there is an issue of low voter turnout especially during a pandemic. Such issues can be solved by blockchain technology. The architecture of blockchain technology makes it difficult to alter data, or alter votes, and attack on the blockchain itself which makes it secure and reliable. Furthermore, voters could cast their vote from the comfort of their home which is helpful in a pandemic which leads to a healthy voter turnout. Blockchain could very well be the future of how a country conducts its elections. Keywords E-voting · Blockchain technology · Android · Smart contract
1 Introduction
1.1 What is Blockchain?
Blockchain is a technology designed to work in a peer-to-peer network. In this network, each user is called a node. These nodes are interconnected; hence, they can make transactions with each other. For this, the nodes' public addresses and the amount of assets to be transferred are specified in the transaction. To confirm a transaction, the sender node presents its private signature. These transactions are stored as a whole group; these whole groups are called blocks. The network also verifies each transaction by consensus achieved through [1] proof of work. Each block has a unique key that is used for identifying it and stores its predecessor's key, forming a chain of sequential blocks. Hence, blocks are connected
N. Dave (B) · N. Shah · P. Joshi · K. Shah Pandit Deendyal Energy University, Gandhinagar, Gujrat, India e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_19
in the form of chains, hence the name. Any change to any single block in the chain will change its key, which will break the chain from that block onwards. Blockchain provides transparency as well as anonymity. As the blockchain is public, each node can check that everything is as it is supposed to be. Multiple copies of the chain are present among [2] the various nodes of the network. So for someone to manipulate the chain, they would have to attack or gain access to 51% of the network to gain control over the blockchain, which is very unlikely.
1.2 Election Process
The election process is a critical part of democracy. Each democratic process requires transparency and individual voters' anonymity. Even if the election is only for a University representative, if fair means are not practiced, it may bring trouble for individual voters. According to [3], only 67% of the voters turned up to vote on the day of election. In a country with a population of 1 billion, 33% of voters not practicing their basic right is not beneficial for that country.
1.3 Why the Need for E-Voting?
According to the reports, the 33% of voters who did not vote either had doubts about casting their votes or were not able to turn up at the voting booth on time. If the voting were held using electronic media with the assurance that votes are not going to be manipulated, then the voter turnout could have been higher. If such a system were available, more votes would have been cast, which might have changed the result of the elections as well.
1.4 Why to Use Blockchain for E-Voting?
An election requires transparency, anonymity, and immutability. All these are features of blockchain, which is the reason why blockchain is the most appropriate technology for conducting elections. Blockchain is immutable; hence, it will be impossible to change the results of held elections.
2 Literature Review
Before starting with the development of the project, we referred to research papers in this domain and reviewed the related work to get more ideas in this field. Table 1 shows some observations that we made from the research papers.
3 System Model
Figure 1 depicts the system model of the Android application.
4 Proposed Method with Implementation
As a solution to our problem statement, we have developed an Android application. It will allow the user to vote for their candidate during the election phase and view the result after the election phase is over. We have developed a smart contract using a programming language called Solidity on Remix Ethereum IDE. We have deployed this smart contract on the Ethereum blockchain. We are using MetaMask as our crypto wallet, to store our test ethers. We are working on the Ropsten Test Network which is also provided by MetaMask. We have used Firebase for the database and back end operations. We have used Infura and Web3j for connecting our smart contract with our Android application.
4.1 Tools and Technologies Used
4.1.1 Remix
Remix IDE is an open source desktop and Web application. Solidity contracts can be written right from the browser, using this open source tool. The Remix IDE contains modules for testing smart contracts, debugging them, and deploying them.
4.1.2 Ropsten
Ropsten is a proof-of-work testnet. This means it is the best like-for-like representation of Ethereum.
Table 1 Literature review

References | Observations

[4]
• Suggested SHA256 and ECDSA algorithm for security
• Implementation of private key + public key infrastructure for privacy
• Implementation of digital signature for non-repudiation

[5]
• Secret-ballot election scheme for security
• Implementation of Anonymous Veto Protocol (also known as AV-net) for privacy
• Added district node and boot node inside POA network for verification
• Suggested 3 blockchain frameworks, namely Exonum, Quorum, and Geth

[6]
• Implemented digital signature for privacy and non-repudiation
• A system where one cannot get the data regarding votes, i.e., which voter voted for which candidate, until the counting phase begins
• There was inclusion of a central authority for overlooking the operations

[7]
• Implemented SHA256 and ECDSA algorithm for security
• Implementation of public key infrastructure for privacy
• Implementation of digital signature for non-repudiation
• Inclusion of a central authority for overlooking the operations
• Proposed the structure of block for blockchain where 1 block will contain transaction detail, or vote transfer details, of 1 voter

[8]
• Implemented SHA256 for security
• Proposed candidate confidentiality for privacy of candidates
• Inclusion of a central authority for overlooking the operations
• Proposed a system where each candidate will have their own blockchain and length of blockchain will determine the number of votes they acquired

[9]
• Implemented blind signature for security
• Proposed vote validation and voter confidentiality for voters' privacy
• Inclusion of a trusted third party for keeping a check on operations

[10]
• Implementation of SHA256 for security
• Proposed voters' anonymity and also votes' validation
• Inclusion of a central authority for overlooking operations
• Used a two link blockchain

[11]
• Implementation of SHA256 for security
• Implementation of public key infrastructure for privacy
• Inclusion of a trusted third party, like an Election Commission, for voters' confidentiality

[12]
• Proposed voters' anonymity and votes' validation
• Inclusion of an authority to look over election process and declare results

[13]
• A blockchain-based deniable authentication encryption storage of image data is proposed
• Cloud's use has been suggested for returning cipher text and trapdoor
• Partial private keys, system master keys, and system parameters are used for deniable authentication encryption
Fig. 1 System model flowchart
4.1.3 MetaMask
MetaMask is a software cryptocurrency wallet used to interact with the Ethereum blockchain. It allows users to access their Ethereum wallet through a browser extension or mobile app, which can then be used to interact with decentralized applications.
4.1.4 Android Studio
Android Studio is the official Integrated Development Environment (IDE) for Android app development, based on IntelliJ IDEA. On top of IntelliJ’s powerful code editor and developer tools, Android Studio offers even more features that enhance your productivity when building Android apps.
4.1.5 Kotlin
Kotlin is a cross-platform, statically typed, general-purpose programming language with type inference. Kotlin is designed to interoperate fully with Java, and the JVM version of Kotlin’s standard library depends on the Java Class Library, but type inference allows its syntax to be more concise.
4.1.6 Git
Git is a free and open-source distributed version control system designed to track changes in any set of files. It is designed for coordinating work among programmers who are collaboratively developing source code during software development. It is fast and efficient, supports distributed and non-linear workflows, and also ensures data integrity.
4.1.7 Firebase
Firebase is a platform developed by Google that provides detailed documentation and cross-platform SDKs to build and ship apps on Android, iOS, Web, etc.
4.1.8 Infura
Infura is a set of tools for creating applications that connect to the Ethereum blockchain. It interacts with the Ethereum blockchain and runs nodes on behalf of its users. It is also used by MetaMask, which we have used as a wallet.
4.1.9 Solidity
Solidity is a high-level, object-oriented programming language for developing smart contracts. It is used for implementing smart contracts on blockchain platforms, mainly Ethereum.
4.1.10 Smart Contract
A smart contract is a program, also referred to as a transaction protocol for blockchain, which automatically executes and controls relevant events and actions according to its terms.
4.2 Methodology

4.2.1 Smart Contract Development
We used Remix IDE for smart contract development. The smart contract was developed in Solidity. We created two structures, one for voters and another for candidates.

The Voter structure consists of the following parameters:

• voterNum—stores the voter number
• voterId—stores the voter ID given to them
• voterAddress—stores the wallet account address of the voter
• isvoted—stores a Boolean value of either true or false based on whether the voter has voted in the election or not

The Candidate structure consists of the following parameters:

• candidateNum—stores the candidate number
• candidateId—stores the candidate ID provided to them
• candidateAddress—stores the wallet account address of the candidate
• counter—stores the total number of votes the candidate received.
We also specified some modifiers to keep certain functions in check, that is, to restrict the execution of some functions to a specific state or phase of the election. The modifiers that we specified, which can also be seen in Fig. 2, are as follows:

• onlyOwner—This modifier allows only the owner/admin to execute the particular function; no other person can call that function.
• onlyPreElection—This modifier allows the function call only when the election has not yet started, that is, the pre-election phase. It cannot be called once the election has started or finished, that is, entered a phase other than the pre-election phase.
• onlyActiveElection—This modifier allows the function call only when the election is live, that is, the active election phase. It cannot be called in any other phase.
• onlyPostElection—This modifier allows the function call only when the election is finished, that is, the post-election phase.

We have implemented two functions to change the phase of the election. Those functions are as follows:
Fig. 2 A screenshot of Remix Ethereum IDE showing a snippet of the Solidity code
• Activate()—This is an owner/admin-only function. It allows the admin to change the election phase to active, that is, make the election live.
• Completed()—This is also an owner/admin-only function. It allows the admin to change the election to the completed phase, that is, finish the election.

We have implemented two functions to check the phase of the election. Both of them are Boolean functions:

• IsActive()—This function allows us to check whether the election is active or not.
• IsCompleted()—This function allows us to check whether the election is completed or not.

We have also implemented two functions, one for adding voters and another for adding candidates:

• addVoter()—This function can be called only before the election, as specified by the onlyPreElection modifier. It allows us to add a voter by taking in three parameters, namely voterId, voterAddress, and isvoted.
• addCandidate()—This is an admin-only function. It allows the admin to add a candidate by taking in two parameters, namely candidateAddress and candidateId.

We have implemented another function named Vote(), which allows a voter to vote for a candidate. This function can be called only when the election is live, as specified by the onlyActiveElection modifier. It first checks whether the voter has already voted, so as to meet the criterion of allowing one vote per voter. After this check, if the voter has not voted yet, the vote is transferred to the candidate of the voter's choice. The voter must confirm this transfer of vote
through his/her account (account on MetaMask) and pay the required fee to perform the transaction. Lastly, we have implemented a result() function to declare the result of the election. This function can be called only when the election is over, as specified by the onlyPostElection modifier. This function will search for the candidate with the highest number of votes and will return the winner candidate’s ID and the number of votes that candidate acquired.
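To make the phase and voting rules concrete, the following is a minimal Python sketch that models the behavior of the functions and modifiers described above. It is an illustrative model that mirrors the Solidity names used in this section, not the deployed contract code:

```python
# Toy model of the election contract's phase logic and one-vote rule.
PRE, ACTIVE, POST = "pre", "active", "post"

class ElectionModel:
    def __init__(self, owner):
        self.owner = owner
        self.phase = PRE
        self.voters = {}      # voterAddress -> {"voterId": ..., "isvoted": bool}
        self.candidates = {}  # candidateAddress -> {"candidateId": ..., "counter": int}

    def activate(self, caller):                     # Activate(), onlyOwner
        assert caller == self.owner, "onlyOwner"
        self.phase = ACTIVE

    def completed(self, caller):                    # Completed(), onlyOwner
        assert caller == self.owner, "onlyOwner"
        self.phase = POST

    def add_voter(self, voter_id, voter_address):   # addVoter(), onlyPreElection
        assert self.phase == PRE, "onlyPreElection"
        self.voters[voter_address] = {"voterId": voter_id, "isvoted": False}

    def add_candidate(self, caller, cand_id, cand_address):  # addCandidate(), onlyOwner
        assert caller == self.owner and self.phase == PRE
        self.candidates[cand_address] = {"candidateId": cand_id, "counter": 0}

    def vote(self, voter_address, cand_address):    # Vote(), onlyActiveElection
        assert self.phase == ACTIVE, "onlyActiveElection"
        voter = self.voters[voter_address]
        assert not voter["isvoted"], "one vote per voter"
        voter["isvoted"] = True
        self.candidates[cand_address]["counter"] += 1

    def result(self):                               # result(), onlyPostElection
        assert self.phase == POST, "onlyPostElection"
        winner = max(self.candidates.values(), key=lambda c: c["counter"])
        return winner["candidateId"], winner["counter"]
```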
4.2.2 Smart Contract Deployment
After the smart contract is compiled, the next step is to deploy it. For that, the first step is to select the environment as “Injected Web3” as we will be using web3j for connecting our smart contract with our Android application. Next, it will ask us to connect this smart contract to our MetaMask wallet so that we could deploy our smart contract on the Ethereum blockchain, and for that we will need to pay fees. After the connection is established, we will now see our account ID (which is also the public key of our account) and the amount of ethers present in our account. Note that these are test (or fake) ethers on the Ropsten Test Network. The Gas Limit will be established, as shown in Fig. 3, and we could now proceed to click on the Deploy button and deploy our smart contract.
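The deployment above is driven interactively through Remix and MetaMask. For readers who prefer a programmatic route, a roughly equivalent web3.py sketch against an Infura Ropsten endpoint is given below; the project ID, private key, and ABI/bytecode file names are placeholders, and web3.py v6 method names are assumed:

```python
import json
from web3 import Web3

# Connect to Ropsten through Infura (placeholder project ID).
w3 = Web3(Web3.HTTPProvider("https://ropsten.infura.io/v3/<PROJECT_ID>"))
acct = w3.eth.account.from_key("<PRIVATE_KEY>")  # account funded with test ethers

# ABI and bytecode as produced by the Solidity compiler.
with open("Election.abi") as f:
    abi = json.load(f)
with open("Election.bin") as f:
    bytecode = f.read()

contract = w3.eth.contract(abi=abi, bytecode=bytecode)
tx = contract.constructor().build_transaction({
    "from": acct.address,
    "nonce": w3.eth.get_transaction_count(acct.address),
    "gas": 3_000_000,                 # gas limit, as shown in Fig. 3
    "gasPrice": w3.eth.gas_price,
})
signed = acct.sign_transaction(tx)     # equivalent to confirming in MetaMask
tx_hash = w3.eth.send_raw_transaction(signed.rawTransaction)
receipt = w3.eth.wait_for_transaction_receipt(tx_hash)
print("Deployed at:", receipt.contractAddress)
```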
4.2.3 Android Development
Android development is the core part of this entire project; the Android application is the face of the project. When deployed publicly, software and applications are supposed to be easy to use; hence, expecting voters to perform complex tasks would only reduce the use of E-voting. We developed an Android application which is user friendly and connected to the blockchain. For development, Kotlin is used, and for connecting smart contracts, the web3j and Infura dependencies are included. For storing the information of users and candidates, we have connected the Android application with the Firebase Realtime Database, which is a NoSQL database. For authenticating users via OTP, or the admin via email and password, we have used FirebaseAuth. Figure 4 is a screenshot of a code snippet we wrote for the development of the Android application in Android Studio.
Fig. 3 A screenshot of Remix Ethereum IDE showing deployment details
4.2.4 Need for MetaMask and Ethereum
MetaMask: MetaMask is a crypto wallet which allows us to store ethers and conduct transactions. It also supports various test networks, namely Ropsten, Kovan, Rinkeby, and Goerli alongside its Mainnet network. It also allows us to create our own custom network.
Fig. 4 A screenshot of Android Studio showing a snippet of code used for Android application development
Here, for our project, we have used the Ropsten Test Network with some test ethers for carrying out transactions. For deploying our contract on the Ethereum blockchain, we need to pay a certain amount as a fee; hence, we need to connect our MetaMask account to the Remix Ethereum IDE and confirm this transaction so as to deploy our smart contract. Furthermore, for a voter to vote, he/she will be required to pay a certain amount as a fee to confirm the vote and will need a MetaMask account to carry out this transaction. Ethereum: Ethereum provides a great development environment. One of the most famous and frequently used programming languages for smart contracts, Solidity, is supported by Ethereum. It also allows the use of test ethers for deploying smart contracts on the Ethereum blockchain. Since these are test ethers, they have no market value and hence can be obtained for free, which is quite helpful for projects.
4.2.5 Setting up the Backend
We need to manage the voters' and candidates' data, and we could store it on the blockchain too. But storing it on the blockchain would increase blockchain traffic and, as we know, would require a gas fee for every write, which also wastes energy. So we implemented the backend in Firebase, which provides excellent tools for user authentication, storage, and database management.
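The Android app talks to Firebase through its SDK; purely for illustration, the same Realtime Database records can also be written over Firebase's public REST interface, as in this small Python sketch (the database URL and record fields are placeholders, not the project's actual schema):

```python
import requests

DB_URL = "https://<your-project>-default-rtdb.firebaseio.com"  # placeholder URL

# Store a voter record under /voters/<voterId> (REST mirror of the SDK call).
voter = {"name": "Alice", "phone": "+911234567890", "hasRegistered": True}
resp = requests.put(f"{DB_URL}/voters/V001.json", json=voter)
resp.raise_for_status()

# Read the record back for verification.
print(requests.get(f"{DB_URL}/voters/V001.json").json())
```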
4.2.6 Integrating the Application with the Backend and Blockchain
The blockchain is the ledger where all the important data and records of transactions are stored. All transactions are recorded and handled by the smart contract, which is developed in Solidity and exposes different methods; these methods are called from the Android application. Once the Solidity file is deployed from Remix, it is saved locally inside the Android project. Since Kotlin interoperates with Java, we convert the .sol file into a .java wrapper. After saving the .sol file inside the project directory, solc, a Node packaging module, is used for extracting data from the Solidity file.

Command: solcjs <filename>.sol --bin --abi --optimize -o <path/to/directory>

After executing the above command, the .bin and .abi files are autogenerated. Using these two files, the .java file is generated, which contains all the methods inside classes and constructors.

Command: web3j generate solidity -b <path/to/bin/file>.bin -a <path/to/abi/file>.abi -o <path/to/dir> -p <name of package>

After executing the above command, the .java file is autogenerated and contains all the methods of the smart contract. This Java class is then loaded inside the Kotlin files, and the methods are called accordingly.
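On the Android side these calls go through the generated web3j wrapper class. To illustrate the same flow, the equivalent calls expressed with web3.py are sketched below; the contract address is a placeholder, and the function names follow the contract described in Sect. 4.2.1 (their exact Solidity signatures are an assumption):

```python
# Assumes `w3`, `acct`, and `abi` from the deployment sketch in Sect. 4.2.2.
election = w3.eth.contract(address="<DEPLOYED_ADDRESS>", abi=abi)

# A voter casts a vote: a state-changing call, so it needs a signed
# transaction and a fee, matching the MetaMask confirmation step.
tx = election.functions.Vote("<CANDIDATE_ADDRESS>").build_transaction({
    "from": acct.address,
    "nonce": w3.eth.get_transaction_count(acct.address),
    "gas": 200_000,
    "gasPrice": w3.eth.gas_price,
})
signed = acct.sign_transaction(tx)
w3.eth.send_raw_transaction(signed.rawTransaction)

# Reading the result is a free, read-only call once the election is over.
winner_id, vote_count = election.functions.result().call()
```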
4.3 Features

4.3.1 User Authentication
The user is supposed to verify themselves for logging into the application with two-step authentication. The two steps are as follows:

• Password—The user will have to enter the correct password, known only to the user, to log in to our application. The password is stored in the Firebase Realtime Database as a hash digest.
• OTP—The user will also have to enter a one-time password, which is sent to their registered mobile number, in order to log in.

After entering these two parameters, the user will be allowed to log in to the application and will be able to vote, provided the election is live.
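OTP delivery and verification are handled by FirebaseAuth; the password leg can be sketched as below, where a salted SHA256 digest is stored and later compared. The salting scheme shown here is an assumption for illustration, not necessarily the app's exact storage format:

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None):
    """Return (salt, digest) suitable for storing in the database."""
    salt = salt or os.urandom(16)
    digest = hashlib.sha256(salt + password.encode()).digest()
    return salt, digest

def verify_password(password, salt, stored_digest):
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, stored_digest)  # constant-time compare

salt, digest = hash_password("correct horse battery staple")
assert verify_password("correct horse battery staple", salt, digest)
assert not verify_password("wrong password", salt, digest)
```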
4.3.2 User Confidentiality
User confidentiality and privacy are important factors not just in blockchain networks but in any network [14, 15]. Just as in real-life scenarios, where data like which voter
voted for which candidate is kept confidential, here also we provide a similar feature. In blockchain technology, transactions are kept anonymous; that is, one cannot trace back the actual person who initiated a transaction. Here also, when a voter transfers a vote to a candidate, there is no way to trace back that transaction and find out who the voter is in real life. The only public details are the public key of the voter and that of the candidate. The public key does not contain any information regarding the actual identity of the voter, such as their real name or address, and such information cannot be derived from the public key. Hence, by knowing the public key, one cannot deduce who voted for whom, which is exactly what is required for maintaining voter confidentiality.
4.3.3 Security Features
Every block of the blockchain is cryptographically linked with its previous block, as it stores the hash of the previous block. The hashing function used is SHA256, a standard hashing algorithm and a one-way function: once a hash digest of the original data is created, it is computationally infeasible to reverse the process and obtain the original data, and changing even a single bit of the original data results in a completely different hash digest. Hence, it is not possible to change the data of a block without breaking the chain. Therefore, an attacker cannot change the data of a block and switch a voter's vote to some other candidate; this prevents the stealing of votes and the changing of votes in a particular candidate's favor.
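This tamper-evidence property can be demonstrated in a few lines of Python: each block stores the SHA256 digest of its predecessor, so flipping any bit in an earlier block breaks every later link. The following is a toy chain for illustration only, not the production data structure:

```python
import hashlib
import json

def block_hash(block):
    """SHA256 digest of a block's canonical JSON serialization."""
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

# Build a tiny chain of vote transactions.
chain = []
prev = "0" * 64  # genesis placeholder
for vote in [{"voter": "pk1", "candidate": "pkA"},
             {"voter": "pk2", "candidate": "pkB"}]:
    block = {"prev_hash": prev, "tx": vote}
    chain.append(block)
    prev = block_hash(block)

def is_valid(chain):
    """Each block must store the hash of the block before it."""
    return all(chain[i]["prev_hash"] == block_hash(chain[i - 1])
               for i in range(1, len(chain)))

assert is_valid(chain)
chain[0]["tx"]["candidate"] = "pkB"  # attacker flips a vote...
assert not is_valid(chain)           # ...and every later link breaks
```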
5 Summary and Future Works

Various research papers have addressed a problem statement similar to ours; they are mentioned in the Literature Review section of this report. One of the main notable differences between those proposed works and ours is that we have actually developed an application for this problem statement, unlike the rest: we have developed an Android application. There is one more notable work in this direction, a project by Chrystopher Borba titled "Smart Contract example for elections in blockchain platform." They successfully created and deployed a smart contract to address this problem statement; however, there is no user-end application, like an Android, iOS, or Web application, whereas we have developed an Android application. Following are the future plans regarding this project:

• Conduct a small-scale election through this application before deploying it to the Google Play Store.
• Increase the reach of the project by developing a Web-based solution and also an iOS-compatible application.

We have developed a working model for conducting a safe election. Voters can cast their vote from the safety and comfort of their home, which is beneficial in the pandemic era and which would otherwise be difficult with the traditional election process. Also, the major issue of fake/false voting in the traditional election process is solved to a great extent through this solution. There are still some problems that need to be addressed, like expanding the reach of our project, and conducting a large-scale election is a good challenge in itself. Hence, we are working on both challenges and will develop applications for Web and iOS devices too.
References

1. C. Lepore, M. Ceria, A. Visconti, U.P. Rao, K.A. Shah, L. Zanolini, A survey on blockchain consensus with a performance comparison of PoW, PoS and pure PoS. Mathematics 8(10), 1782 (2020)
2. V. Patel, F. Khatiwala, K. Shah, Y. Choksi, A review on blockchain technology: components, issues and challenges, in ICDSMLA 2019 (Springer, Singapore, 2020), pp. 1257–1262
3. A. Gunasekar, C. Srinivasan, Voter turnout in Lok Sabha Polls 2019 highest ever: Election Commission, 2019. https://www.ndtv.com/india-news/general-elections-2019-record-voter-turnout-of-67-11-per-cent-in-lok-sabha-polls-2041481
4. R. Hanifatunnisa, B. Rahardjo, Blockchain based E-voting recording system design (2017)
5. F.Þ. Hjálmarsson, G.K. Hreiðarsson, Blockchain-based E-voting system
6. F.S. Hardwick, A. Gioulis, R.N. Akram, K. Markantonakis, E-voting with blockchain: an E-voting protocol with decentralization and voter privacy (2018)
7. H. Yi, Securing e-voting based on blockchain in P2P network. J. Wirel. Com. Netw. 2019, 137 (2019). https://doi.org/10.1186/s13638-019-1473-6
8. A.B. Ayed, A conceptual secure blockchain-based electronic voting system. Int. J. Netw. Secur. Appl. (IJNSA) 9 (2017)
9. Y. Liu, Q. Wang, An E-voting protocol based on blockchain (2017)
10. F. Fusco, M.I. Lunesu, F. Pani, A. Pinna, Crypto-voting, a blockchain based e-voting system, 223–227 (2018). https://doi.org/10.5220/0006962102230227
11. R. Ganji, B.N. Yatish, Electronic voting system using blockchain, in Dell EMC Proven Professional Knowledge Sharing (2018)
12. R. Taş, Ö.Ö. Tanrıöver, A systematic review of challenges and opportunities of blockchain for E-voting. Symmetry 12(8), 1328 (2020). https://doi.org/10.3390/sym12081328
13. C.V. Joe, J.S. Raj, Deniable authentication encryption for privacy protection using blockchain. J. Artif. Intell. Capsule Netw. 3(3), 259–271 (2021)
14. K.A. Shah, D.C. Jinwala, Privacy preserving, verifiable and resilient data aggregation in grid-based networks. Comput. J. 61(4), 614–628 (2018)
15. K. Shah, D. Jinwala, Privacy preserving secure expansive aggregation with malicious node identification in linear wireless sensor networks. Front. Comp. Sci. 15(6), 1–9 (2021)
A Novel Framework for Malpractice Detection in Online Proctoring Environment Korrapati Pravallika, M. Kameswara Rao, and Syamala Tejaswini
Abstract As a result of the pandemic and the advantages of remote examination, online exams and job interviews have become popular and necessary. These systems are being used by the majority of companies and academic institutions for both recruitment and online exams. In this study, a new virtual assessment system uses deep learning to continuously monitor candidates without the use of a live proctor; conducting exams in a secure environment is, however, one of the main challenges of remote examination systems. This work presents a pipeline for online interview and exam fraud analysis. Only a video of the candidate, recorded during the exam, is required by the system. Face detection, recognition, object detection, and tracking algorithms are part of this work. The proposed work focuses on techniques for identifying malpractices during online exams. Keywords Online learning · Online proctor · Student authentication · Face detection · Face recognition · Multi-person detection
1 Introduction

In today's society, online exams and interviews have become popular. A pandemic is one justification for this shift, but another important reason is that both people and enterprises get the opportunity to use their time and attention more effectively. Instead of appearing for interviews in physical offices, candidates can do so at any time and from anywhere in the world by simply going online. In this manner, online interviews allow the hiring task to proceed more easily and effectively. Exams are a common method of checking people's knowledge of specific subjects. Therefore, proctoring formative assessments in a secure atmosphere has become popular and is essential for the exams' legality. According to [1], approximately 74% of participants believed they could easily cheat in online exams. Proctoring simply entails watching a single person or a set of individuals during every exam to

K. Pravallika (B) · M. K. Rao · S. Tejaswini
Department of Electronics and Computer Engineering, KLEF, Vaddeswaram, India
e-mail: [email protected]
prevent cheating. Even if the pandemic seems to be no longer present, remote examinations remain in use. Among the biometric approaches employed by the system are facial recognition using the HOG face detector and the OpenCV face recognition algorithm. Recognition of multiple persons using OpenCV-based libraries is one of the key aspects of malpractice detection. The system is tested using SSD approaches and the COCO dataset and is put into practice as a software system.
2 Related Works

The field of education has changed dramatically in the last two years. Even the most traditional institutions have been pushed to shift to online-based learning by the COVID-19 outbreak. As a result, the demand for online proctoring has grown. Due to their hurried development, however, most software packages lack critical functionalities and settings or are not user friendly. Considering that the vast majority of people are familiar with such devices, an easily accessible, basic interface is critical. Several vendors sell paid versions of proctoring systems. One continuous verification system individually acknowledges the user's face, voice, touch, mouse, and laptop. Another work builds and deploys a network firewall system that uses network address translation, a demilitarized zone, a virtual private network, and other firewall techniques for threat detection. Deep learning-based face identification is also a solution for detecting the face in online tests. Another introduced a virtual exam system that uses fingerprint and eye-gaze technology; it also records the cheating and anti-cheating status of participants and is used to assess the recommended methods. The online proctoring system has recently become a challenge for researchers and engineers: because of COVID-19, the demand for online proctoring systems has increased rapidly, and several industrial enterprises have promoted paid versions of proctoring systems covering the entire process from the start of the test until the end of the exam. Artificial intelligence (AI) has altered our environment and lifestyle by introducing intelligence into many aspects of everyday life [2, 3], but with a few limitations [4]. Machine learning (ML) has driven innovation in various industries [5], though it has its drawbacks. Education, transportation [6], communication networks, disaster management, smart cities, and many more fields have benefited from machine and deep learning innovations. The virtual exam encounters multiple challenges; [7] discussed the online exam's various difficulties and proposed an alternative that included grouping client hostnames or IPs for a specific area and time, as well as biometric authentication technologies such as face recognition and fingerprints [8], and AI has the ability to completely transform proctoring and online learning.
3 Proposed Model

There are two parts to the proposed web-based online proctoring system; the first part is the online proctoring itself. The suggested design of the online proctoring system is shown in Fig. 1. During the exam, the online proctoring program verifies the examinee's identity and protects against unethical behavior. Even before the test begins, the application ensures that the participant has access to a screen that combines video and audio recording, and the examination does not begin until the proctor has verified the participant's identity. The main flow starts with online proctoring, in which a student takes an exam in front of a monitor using the student's camera. The webcam captures frames over the HTTPS protocol, and three types of techniques are applied: face detection, which detects the face using facial landmarks; face recognition, which uses loops to determine whether the face matches the registered student or not; and head pose estimation, which estimates the student's head movement and angle using facial landmarks and supporting functions and algorithms. When data is detected, it is saved in a CSV file, and this method is followed throughout the exam. The test is proctored continuously during this procedure. The procedure concludes once the entire examination has been completed, and the collected data is stored in a CSV file database.
Fig. 1 Proposed architecture
Algorithm 1 Master algorithm
1: start process onlineProctor
2: while True do
3:   frame ← capture frame from webcam
4:   faces ← faceDetection(frame, method = "HOG")
5:   if count(faces) == 0 then
6:     cancel exam: person not found
7:   else if count(faces) > 1 then
8:     stop exam: another person detected in the frame
9:   else
10:    faceMatch ← faceRecognition(frame)
11:    if faceMatch == False then
12:      cancel exam: unauthorized face
13:    else
14:      continue exam
15:    end if
16:  end if
17: end while
18: end process
Face Recognition

Face recognition is the most extensively used biometric for Internet authentication. Intel created the OpenCV computer vision library in 1999. Image representation, image operations, and local binary pattern histograms (LBPH) are some of the face recognition techniques supported by OpenCV. In the suggested methodology, a photograph of the participant is taken as input, and HOG methods are used to recognize face landmarks in the image. The face in the identification-field picture is estimated from its landmarks: the distance between the eyes and the size of the lips are measured, and the photo is compared with faces already identified and saved in our database. Algorithm 2 shows the facial recognition pseudocode. The face recognition algorithm uses biometrics to map facial features from a photograph or video. The geometry of the face is read by the facial recognition software; the distance between the eyes and the distance from forehead to chin are important considerations. The software recognizes facial landmarks, with 68 points identified on a face, which play a crucial role in differentiating one face from another.
Algorithm 2 Recognition of face
1: procedure faceIdentification(id, name, password)
2: while True do
3:   frame ← current frame
4:   faceLocations ← locate all faces in frame
5:   faceEncodings ← encode all located faces
6:   faceMatch ← compare encodings with the registered student's face
7:   if faceMatch == True then
8:     face is the same
9:   else
10:    face is not the same
11:  end if
12: end while
13: end procedure
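The HOG-detect/encode/compare pipeline above can be sketched with the widely used face_recognition library; the image file names and the 0.6 tolerance below are assumptions for illustration:

```python
import face_recognition

# Registered reference photo of the candidate.
ref = face_recognition.load_image_file("registered_student.jpg")
ref_encoding = face_recognition.face_encodings(ref)[0]

# One webcam frame captured during the exam.
frame = face_recognition.load_image_file("webcam_frame.jpg")
locations = face_recognition.face_locations(frame, model="hog")
encodings = face_recognition.face_encodings(frame, locations)

if len(locations) == 0:
    print("Cancel exam: person not found")
elif len(locations) > 1:
    print("Stop exam: another person detected")
else:
    same = face_recognition.compare_faces([ref_encoding], encodings[0],
                                          tolerance=0.6)[0]
    print("Continue exam" if same else "Cancel exam: unauthorized face")
```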
Object Detection

Object detection is a computer vision method that deals with detecting instances of semantic objects of a certain class (such as individuals, houses, or automobiles) in digital photographs and videos. Human detection and recognition are two well-studied areas for convolutional neural networks. Detection may be used in a wide range of computer vision tasks, such as picture retrieval and video surveillance. For object recognition, the pre-trained MobileNet SSD model, trained with three classes (body, mobile phone, and laptop) from the COCO dataset, is used to recognize a person, and it is applied to every frame Ft as well. In a similar way to face detection, object detection finds the person count with the help of MobileNet SSD, given a threshold value, and the COCO dataset is used to find the objects present in the video. Algorithm 3 shows the object detection pseudocode. During training, MobileNet SSD uses a matching phase to identify the appropriate anchor box for each ground-truth object in the frame; generally, the anchor box having the greatest overlap with an object is in charge of deciding the object's class and location.
Algorithm 3 Object detection algorithm
1: procedure objectDetection(loginId, name, password)
2: while True do
3:   frame ← capture frame from participant's webcam
4:   face ← detect face in frame
5:   objects ← object recognition on frame
6:   if a prohibited object (e.g., mobile phone) is found then
7:     cancel exam
8:   else
9:     continue exam
10:  end if
11: end while
12: end procedure
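The per-frame detection step can be reproduced with OpenCV's dnn module and pre-trained Caffe MobileNet-SSD weights. The file names and the class-id below are assumptions: the commonly distributed Caffe release is trained on 20 VOC classes (where "person" is class 15), whereas the paper uses a COCO-trained model, so the id list would need adjusting for the paper's exact weights:

```python
import cv2

net = cv2.dnn.readNetFromCaffe("MobileNetSSD_deploy.prototxt",
                               "MobileNetSSD_deploy.caffemodel")
PERSON_CLASS_ID = 15   # 'person' in the VOC-trained release (assumption)
CONF_THRESHOLD = 0.5   # confidence threshold mentioned in the text

def count_persons(frame):
    """Run MobileNet-SSD on one frame and count confident 'person' boxes."""
    blob = cv2.dnn.blobFromImage(cv2.resize(frame, (300, 300)),
                                 0.007843, (300, 300), 127.5)
    net.setInput(blob)
    detections = net.forward()          # shape: [1, 1, N, 7]
    count = 0
    for i in range(detections.shape[2]):
        confidence = detections[0, 0, i, 2]
        class_id = int(detections[0, 0, i, 1])
        if confidence > CONF_THRESHOLD and class_id == PERSON_CLASS_ID:
            count += 1
    return count
```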
Multi-person Detection

As previously mentioned, MobileNet SSD with the COCO dataset is a deep neural network approach to detect multiple objects present in an image. MobileNet SSD (single-shot object detection) relies on MobileNet as its base network and can identify objects rapidly. It takes the input images stored in the database during the examination and inspects every participant's screen; when another person appears in the frame, it is flagged as malpractice. With the help of a deep convolutional neural network, it classifies 80 object categories and is very fast. It has 53 convolutional layers, each followed by a batch normalization layer, and it needs only a single pass with SSD to identify multiple objects in the frame. Algorithm 4 shows the multi-person detection pseudocode.
Algorithm 4 Multi-person detection
1: procedure getPersonCount()
2:   cap ← cv2.VideoCapture(0)
3:   detector ← dlib.get_frontal_face_detector()
4:   while True do
5:     frame ← cap.read()
6:     frame ← cv2.flip(frame)
7:     gray ← cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
8:     faces ← detector(gray)
9:     for face in faces do
10:      (x, y) ← face.left(), face.top()
11:      (x1, y1) ← face.right(), face.bottom()
12:      cv2.rectangle(frame, (x, y), (x1, y1), (0, 255, 0), 2)
13:    end for
14:    cv2.imshow("frame", frame)
15:    if cv2.waitKey(1) signals quit then break
16:  end while
17:  cap.release()
18:  cv2.destroyAllWindows()
19: end procedure
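A runnable Python version of Algorithm 4 could look like the following, counting faces per frame with dlib's frontal face detector and flagging any frame with more than one face (the quit key is an assumption):

```python
import cv2
import dlib

detector = dlib.get_frontal_face_detector()
cap = cv2.VideoCapture(0)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = detector(gray)
    if len(faces) > 1:
        print("Malpractice: multiple persons in frame")
    for face in faces:
        cv2.rectangle(frame, (face.left(), face.top()),
                      (face.right(), face.bottom()), (0, 255, 0), 2)
    cv2.imshow("frame", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```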
4 Results

Object detection (mobile phone) and multiple person detection results are shown in Figs. 2 and 3.
5 Conclusion

In this project, we used both methodologies to create an effective pipeline for online interviews and assessments. Since a fast pipeline was required, the system's operands had to be limited: the inputs are restricted to visual data, audio data is not used, and the cheating analysis outputs are limited to three malpractice cases, namely the presence of another individual, the presence of a device, and the absence of the candidate. Moreover, the model is implemented with OpenCV. With the help of dlib and HOG methods, face detection and identification algorithms were developed, and a fraud identification mechanism is presented in this paper. The system's primary goal is to provide secure and reliable testing; discussions and assessments can be done online in this situation. It requires only a short video of the person, which can be recorded using an integrated webcam, as data. As a consequence, it is a fast and easy method to
Fig. 2 Object detection (mobile phone)

Fig. 3 Multiple person detection
implement. Face identification, recognition, monitoring, and object detection are included within the pipeline. Another person, a device, and absence were identified as the three key cheating acts. Additional work which could enhance the effectiveness of our system would include a speech processing module, as most cheating instances involve talking or are accompanied by suspicious voice behavior. The addition of eyeball detection, to analyze the participant's eyeball motion, is yet another enhancement; these two modules have the potential to significantly improve the efficiency of our system. The trials were carried out on a unique dataset consisting of three videos depicting real-life cheating acts. As a further step, features like audio analysis and eye prediction will add to the work.
References

1. F. Alam, A. Almaghthawi, I. Katib, A. Albeshri, R. Mehmood, Response: an AI and IoT-enabled framework for autonomous COVID-19 pandemic management. Sustainability 13(7), 3797 (2021). [Online]. Available: https://www.mdpi.com/2071-1050/13/7/3797
2. E. Bilen, A. Matros, Online cheating amid COVID-19. J. Econ. Behav. Organ. 182, 196–211 (2021)
3. Y. Atoum, L. Chen, A.X. Liu, S.D. Hsu, X. Liu, Automated online exam proctoring. IEEE Trans. Multimedia 19(7), 1609–1624 (2017)
4. S. Prathish, K. Bijlani, An intelligent system for online exam monitoring, in 2016 International Conference on Information Science (ICIS) (IEEE, 2016), pp. 138–143
5. H.S. Asep, Y. Bandung, A design of continuous user verification for online exam proctoring on M-learning, in 2019 International Conference on Electrical Engineering and Informatics (ICEEI) (IEEE, 2019), pp. 284–289
6. A.K. Pandey, S. Kumar, B. Rajendran, B.S. Bindhumadhava, e-Parakh: unsupervised online examination system, in 2020 IEEE Region 10 Conference (TENCON) (IEEE, 2020), pp. 667–671
7. T. Yigitcanlar, R. Mehmood, J.M. Corchado, Green artificial intelligence: towards an efficient, sustainable and equitable technology for smart cities and futures. Sustainability 13(16), 8952 (2021). [Online]. Available: https://www.mdpi.com/2071-1050/13/16/8952
8. Y. Atoum, L. Chen, A.X. Liu, S.D. Hsu, X. Liu, Automated online exam proctoring. IEEE Trans. Multimedia 19(7), 1609–1624 (2017)
Frequency Reconfigurable of Quad-Band MIMO Slot Antenna for Wireless Communication Applications in LTE, GSM, WLAN, and WiMAX Frequency Bands

B. Suresh, Satyanarayana Murthy, and B. Alekya

Abstract This study proposes a frequency-reconfigurable Quad-band MIMO slot antenna operating in the L and S frequency bands for wireless communication applications. The proposed Quad-band antenna comprises 4 PIN diodes and is electronically switchable among the LTE, GSM, WLAN, and WiMAX frequency bands, i.e., 1.9, 2.6, 2.4, and 3.8 GHz. The Quad-band antenna is evaluated in these operating wireless communication bands for different performance parameters, i.e., S-parameters, far-field radiation characteristics, voltage standing wave ratio, and surface current distribution. In the operating bands, it achieves a highest gain of 4.6 dBi. A meander-line resonator is placed between the antennas to achieve high isolation, which improves the performance of the Quad-band antenna. The proposed antenna was fabricated on an FR4 substrate with 4 PIN diodes. Compared with earlier work, the gain is improved from 2.46 to 4.62 dBi, the number of frequency bands is increased from three to four, and a radiation efficiency of 50% is obtained.

Keywords Frequency reconfigurable · PIN diode · Slot antenna · Wireless communication applications · Quad-band frequency operation · Multiple-input multiple-output (MIMO) · Diversity gain (DG) · Envelope correlation coefficient (ECC) · Total active reflection coefficient (TARC) · Mean effective gain (MEG) · Channel capacity loss (CCL)
1 Introduction The developments in current wireless communication systems has enhanced the use of antenna-based sensor technology in satellite systems, radars, robotic, healthcare, wireless communication, [1] etc. As temperature sensor [2] and sensor for soil moisture [3], patch antennas have been considered to be utilized. MIMO antennas have been recently utilized as microwave sensor in Worldwide Interoperability for Microwave Access (WiMAX), mobile network frequency bands, Bluetooth, and B. Suresh · S. Murthy (B) · B. Alekya Department of ECE, V R Siddhartha Engineering College, Vijayawada, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_21
271
Wireless Local Area Network (WLAN). To get high data rate, this MIMO technology is unified into Long-Term Evolution (LTE), Global System for Mobile Communications (GSM), WLAN, and WiMAX networks. The operating frequencies of LTE, WLAN, GSM, and WiMAX are 1.9 GHz, 2.4 GHz, 2.6 GHz, and 3.8 GHz, respectively. A compact frequency reconfigurable antenna may be utilized in space of utilizing multiple antennas, to work on various frequency bands considering its applications on various time places. The size of the compact frequency reconfigurable MIMO antennas will be compact, and the mutual coupling effect between antennas should be low, for integrating MIMO antennas into various electronic systems and devices. Different methods can be utilized to lower the mutual coupling effect among antennas, including the resonators, metamaterials, parasitic elements, employing neutralization lines, etc. Reduction of mutual coupling by utilizing the slot of meander-line resonator [4] and some defected structures of ground [5] have been recently revealed by researchers. Frequency reconfigurable antenna may be proved utilizing different techniques, like Radio Frequency Microelectromechanical Systems, PIN, varactor diodes [6], etc. Some reconfigurable antennas are recently reported in [7, 8]. In [7], for single and double band operation, to get reconfigurability of frequency, the stepped feed line was drawn to feed a slot of antenna, which connects the 3 PIN diodes. In [8], 4 MEMS switches were placed to get a reconfigurable of frequency MIMO antenna along with an isolation over the 22 dB. In any case, it is big challenge to have reconfigurable of frequency in a compact slot MIMO antenna along with greater isolation ratio. In [9], frequency reconfigurable MIMO slot antenna using 4 PIN diodes achieved two frequency bands and gain of 2.46 dBi only. This study presents an advanced frequency reconfigurable MIMO compact antenna having wireless communication applications in various frequency bands of LTE, WLAN, GSM, and WiMAX. The antenna-based sensor covers the 4 communicating frequency bands (1.9, 2.4, 2.6, and 3.8 GHz). The isolation between the two antennas in all 4 four frequency bands is over the 28 dB, and the ECC is lower than 0.08, so it is not more than desired range for ECC.
2 Antenna Design 2.1 Reconfigurable of Frequency Slot Antenna In this section of the study, we have presented analysis and design for a W-shaped reconfigurable of frequency slot antenna (RFSA). The Fig. 1 presents a RFSA antenna which can be switched between four communication bands of frequency adjusted at f 1 = 1.9, f 2 = 2.6, f 3 = 2.4, and f 4 = 3.8 GHz. By utilizing a W-shaped slot, and 2 PIN diodes, having dielectric constant (εr ) on 4.4 and loss tangent (tan δ) on
Fig. 1 Reconfigurable of frequency slot antenna, (L1 = 6 mm, L = 32.5 mm, L2 = 26 mm, L3 = 14 mm, L4 = 2 mm, L5 = 5 mm, L6 = 9 mm, L7 = 10 mm, W 1 = 2 mm, W = 20 mm, W 2 = 8 mm, W 3 = 1 mm)
Fig. 2 Various structures of RFSA at various states of PIN diodes. a Structure 1 for frequency = 1.9 GHz, b Structure 2 for frequency = 2.6 GHz, c Structure 3 for frequency = 2.4 GHz, and d Structure 4 for frequency = 3.8 GHz
0.02, the RFSA is designed on a 0.8-mm-thick FR4 substrate. The antenna is fed by a 50 Ω microstrip feedline.
Table 1 Different equivalent slot shapes for the RFSA antenna

Pin diodes | Structure I | Structure II | Structure III | Structure IV
D1 | Off | Off | On | On
D2 | Off | On | Off | On
Resonant frequency (GHz) | 1.9 | 2.6 | 2.4 | 3.8
Slot length (mm) | 58.5 | 36.25 | 49.15 | 27.4
In Fig. 2a–d, the different structures (Structures I–IV) of the slot RFSA are presented. The lengths of the slots are equal to λg/2, where λg is the guided wavelength at the resonance frequency. The various structures can be realized in one layout (Fig. 1) by placing the two PIN diodes, D1 and D2, at suitable positions. For instance, by switching D2 ON and D1 OFF, Structure II is obtained. All structures, with the corresponding switching positions of the diodes, are described in Table 1. The antenna is fabricated with BAP65-0, 115 PIN diodes, which show a low capacitance in the OFF state [10]. In the ON condition the diode is modeled with R = 1 Ω and L = 0.6 nH, and in the OFF condition with C = 0.35 pF, R = 20 kΩ, and L = 0.6 nH, where R and C are in parallel. The frequency-reconfigurable W-shaped slot antenna is thus structured using two BAP65-0, 115 PIN diodes, and the RFSA is simulated in the Ansys HFSS software. As shown in Table 1, the RFSA can be switched among four single frequency bands. Figure 3 shows the return loss of the different structures (Structures I–IV). It is evident that the desired impedance matching is obtained at f1 = 1.9, f2 = 2.6, f3 = 2.4, and f4 = 3.8 GHz.
2.2 Reconfigurable of Frequency MIMO Antenna

The frequency-reconfigurable MIMO antenna for applications in the LTE, GSM, WLAN, and WiMAX frequency bands has been developed on an FR4 substrate utilizing two RFSAs, as presented in Fig. 4. This frequency-reconfigurable MIMO antenna has a length of 70 mm, width of 20 mm, and height of 0.8 mm, with a 5 mm separation distance between its elements. The design progression of this MIMO antenna is presented in Fig. 5, and its S-parameters are shown in Fig. 7. First, two W-shaped RF slot antennas are kept head-to-head on the FR4 substrate of size Lm × Wm and thickness 0.8 mm, as visible in Fig. 5. A maze-formed meander-line resonator slot is fixed in the center between the two RF slot antennas, at equal separation from each, to improve the isolation. When the external current propagation between the antenna components is suppressed by the meander-line
Fig. 3 Return loss of reconfigurable of frequency antenna as indicated by Structures I–IV
Fig. 4 Designed reconfigurable of frequency MIMO antenna (Lm = 70 mm, L1 = 4 mm, L2 = 5 mm, L3 = 33 mm, L4 = 1.75 mm, Wm = 20 mm, Wf = 2 mm); PIN diodes are at D1 to D4. a Top view, b bottom view
resonator slot, it results in low mutual coupling. Thus, at the f1, f2, f3, and f4 resonance frequencies, the isolation among the components is improved to 20, 18, 26, and 28 dB, as shown in Figs. 6 and 7. Furthermore, the novel maze-shaped meander-line resonator (MLR) applied to the MIMO antenna to improve the isolation is shown in Fig. 5; as shown in Fig. 4, the MLR is placed at suitable positions on the two sides of the MIMO antenna. The MLR decreases the near-field coupling among the components of the antenna and also suppresses the surface current distribution; Fig. 9 shows the surface current distribution. From the S-parameter plots presented in Fig. 7, it is also evident that the values of isolation are
Fig. 5 Various arrangements of the proposed reconfigurable antenna: a stage 1, b stage 2, c stage 3
Fig. 6 Geometry of meander-line resonator (Lm = 20 mm, Wm = 5 mm, w1 = 5 mm, w2 = 2 mm, w3 = 2 mm, w5 = 0.45 mm, w6 = 0.5 mm, w7 = 0.2 mm)
more than 20, 18, 26, and 28 dB in the corresponding frequencies of 1.9, 2.6, 2.4, and 3.8 GHz, respectively.
Fig. 7 S-parameters simulated and measured a for 1.9 GHz, b for 2.6 GHz, c for 2.4 GHz, d for 3.8 GHz
2.3 S-Parameters

S-parameters are utilized to characterize electrical networks under matched impedance conditions. Figure 7 presents the simulated S-parameter plots of the MIMO antenna: (a) resonating at f1, where D1 to D4 are in the OFF state; (b) resonating at f2, where D1 and D3 are OFF and D2 and D4 are ON; (c) resonating at f3, where D1 and D3 are ON and D2 and D4 are OFF; and (d) resonating at f4, where D1 to D4 are in the ON state.
2.4 VSWR Characteristics

Voltage standing wave ratio (VSWR) is characterized by the incident and reflected voltage standing waves in an RF electrical transmission network. It is a measure of how efficiently RF power is transmitted from the power source, through a transmission line, into the load. The simulated VSWR characteristics of the proposed antenna are plotted in Fig. 8, covering the LTE band (1.9 GHz), GSM band (2.6 GHz), WLAN band (2.4 GHz), and WiMAX band (3.8 GHz).
Fig. 8 VSWR plot simulated and measured
Fig. 9 Surface current distribution of proposed MIMO antenna (a) at D1 off D2 off condition, (b) at D1 off D2 on condition, (c) at D1 on D2 off condition, (d) at D1 on D2 on condition
2.5 Surface Current Distribution

Figure 9a–d shows the surface current distribution of the frequency-reconfigurable slot antenna at 1.9, 2.6, 2.4, and 3.8 GHz, respectively. The current distribution along the inactive slot sections is very low, whereas at the operating frequencies it is high and concentrated along the radiating element, the feed line, and the corners of the rectangular patch.
2.6 Radiation Pattern

Figure 10a–d presents the 3D polar plots, and Fig. 11a–d the simulated and measured radiation patterns (co-polarization and cross-polarization) of the proposed reconfigurable antenna at the four operating frequencies 1.9, 2.6, 2.4, and 3.8 GHz. At these frequencies, the proposed antenna exhibits directional patterns with maximum gains of 2.65, 2.00, 3.64, and 4.6 dBi, respectively. The antenna gain and efficiency were additionally measured in an anechoic chamber, and the results are plotted in Figs. 10 and 12. It should be noted that the integration of the PIN diodes reduces the antenna gain and efficiency due to the additional loss introduced by the characteristic impedance of the diodes.
Fig. 10 3D polar gain plots of the proposed antenna. a Frequency at 1.9 GHz, b frequency at 2.6 GHz, c frequency at 2.4 GHz, d frequency at 3.8 GHz
Fig. 11 Radiation patterns of the proposed antenna (co-polarization phi = 0 and cross-polarization phi = 90): a frequency at 1.9 GHz, b frequency at 2.6 GHz, c frequency at 2.4 GHz, d frequency at 3.8 GHz
Fig. 12 Radiation efficiency of proposed antenna
3 Photograph of Fabricated Antenna

The proposed frequency-reconfigurable Quad-band slot antenna was fabricated on an FR4 substrate with dimensions 70 × 20 mm², thickness 0.8 mm, and dielectric constant of 4.4.
Fig. 13 Antenna testing: a top view, b bottom view, c experimental set-up with VNA for measuring the antenna
The fabricated prototype and the testing of the proposed MIMO Quad-band antenna are presented in Fig. 13, and the experimental set-up with a VNA for measuring the antenna performance parameters is shown in Fig. 13c.
4 Experimental Results

To validate the current method, a prototype of the proposed MIMO antenna is presented; after placing the 4 PIN diodes at the desired positions with proper DC biasing, the characteristics are recorded. S-parameters for all these states (F1–F4) are simulated and measured, as shown in Fig. 7. It is evident from the figure that the MIMO antenna [11–18] can be switched between four bands, i.e., LTE, GSM, WLAN, and WiMAX applications, and the isolation obtained among the antenna components is more than 28 dB. At the resonance frequencies, Fig. 11 shows that the radiation patterns of the proposed antenna are bidirectional in the E plane and omnidirectional in the H plane.
4.1 MIMO Antenna Diversity Performance Analysis

Instead of one antenna, two or more antennas can be used to enhance the quality and reliability of a wireless link. This section analyzes the diversity performance of the proposed antenna. The following parameters [19, 20], namely MEG (mean effective gain), ECC (envelope correlation coefficient), DG (diversity gain), TARC (total active reflection coefficient), and CCL (channel capacity loss), are determined (simulated and measured) and presented in this section; together they constitute the figure of merit for the MIMO antenna system.
Fig. 14 Simulated and measured of envelope correlation coefficient of proposed antenna
The radiation efficiencies of antenna elements 1 and 2 are η1 and η2, respectively. The ECC is plotted in Fig. 14, which shows that in all four frequency bands the ECC values are less than 0.08, well within the desired range for good performance, i.e., ρe < 0.5 [8]:

\rho_{e} = \frac{\left|S_{11}^{*}S_{12} + S_{21}^{*}S_{22}\right|^{2}}{\left(1-|S_{11}|^{2}-|S_{21}|^{2}\right)\left(1-|S_{22}|^{2}-|S_{12}|^{2}\right)}   (1)
A high diversity gain improves communication with a multiple-input multiple-output antenna. The diversity gain can be determined using Eq. (2); a value greater than 9.7 is achieved in the operating bands, as can be seen from Fig. 15.

\mathrm{DG} = 10\sqrt{1 - 0.99\,\rho_{e}}   (2)
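Equations (1) and (2) are straightforward to evaluate per frequency point from a measured two-port S-matrix. The following small numpy sketch does so; the sample S-parameter values are placeholders, and the DG expression follows the reconstruction of Eq. (2) above:

```python
import numpy as np

def ecc(S):
    """Envelope correlation coefficient from a 2x2 complex S-matrix, Eq. (1)."""
    num = abs(np.conj(S[0, 0]) * S[0, 1] + np.conj(S[1, 0]) * S[1, 1]) ** 2
    den = ((1 - abs(S[0, 0])**2 - abs(S[1, 0])**2) *
           (1 - abs(S[1, 1])**2 - abs(S[0, 1])**2))
    return num / den

def diversity_gain(rho_e):
    """Diversity gain as in Eq. (2)."""
    return 10 * np.sqrt(1 - 0.99 * rho_e)

# Placeholder S-matrix (linear, complex) at one frequency point.
S = np.array([[0.10 + 0.05j, 0.02 + 0.01j],
              [0.02 + 0.01j, 0.12 - 0.03j]])
rho = ecc(S)
print(f"ECC = {rho:.4f}, DG = {diversity_gain(rho):.2f}")
```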
In a multipath environment, the mean effective gain measures the ratio of the signal strength of the antenna under test to that of a reference antenna. MEG is calculated according to Eq. (3), and the plot is shown in Fig. 16.

\mathrm{MEG}_{i} = 0.5\left[1 - \sum_{j=1}^{N} \left|S_{ij}\right|^{2}\right]   (3)

Also, MEG_i − MEG_j < 3 dB; so, the MEG of each element can be written as
Fig. 15 Diversity gain of proposed antenna simulated and measured
Fig. 16 Mean effective gain of proposed antenna simulated and measured
\mathrm{MEG}_{1} = \tfrac{1}{2}\left(1 - |S_{11}|^{2} - |S_{12}|^{2} - |S_{13}|^{2} - |S_{14}|^{2}\right)
\mathrm{MEG}_{2} = \tfrac{1}{2}\left(1 - |S_{21}|^{2} - |S_{22}|^{2} - |S_{23}|^{2} - |S_{24}|^{2}\right)
\mathrm{MEG}_{3} = \tfrac{1}{2}\left(1 - |S_{31}|^{2} - |S_{32}|^{2} - |S_{33}|^{2} - |S_{34}|^{2}\right)
\mathrm{MEG}_{4} = \tfrac{1}{2}\left(1 - |S_{41}|^{2} - |S_{42}|^{2} - |S_{43}|^{2} - |S_{44}|^{2}\right)

TARC can be computed directly from the scattering matrix. Similar to the active reflection coefficient, the TARC is a function of frequency, and it also depends on scan angle and tapering. TARC relates the total incident power to the total outgoing power in an N-port microwave network; it is determined using Eq. (4) and is shown in Fig. 17.
Fig. 17 Total active reflection coefficient of proposed antenna
\mathrm{TARC} = \frac{\sqrt{\sum_{i=1}^{N}\left|S_{i1} + \sum_{m=2}^{N} S_{im}\,e^{j\theta_{m-1}}\right|^{2}}}{\sqrt{N}}   (4)
CCL is computed to quantify the loss of transmission capacity, in bits/s/Hz, in a high-data-rate transmission. The maximum acceptable limit of CCL for high-data-rate transmission is 0.4 bits/s/Hz, and it is displayed in Fig. 18; the highest CCL achieved in each frequency band is given below (Table 2).

C_{\mathrm{loss}} = -\log_{2}\det\left(\alpha^{R}\right)   (5)

Fig. 18 Channel capacity loss of proposed antenna simulated and measured
where

\alpha_{ii} = 1 - \left(\sum_{j=1}^{N} \left|S_{ij}\right|^{2}\right)   (6)

\alpha^{R} = \begin{bmatrix} \alpha_{11} & \alpha_{12} & \alpha_{13} & \alpha_{14} \\ \alpha_{21} & \alpha_{22} & \alpha_{23} & \alpha_{24} \\ \alpha_{31} & \alpha_{32} & \alpha_{33} & \alpha_{34} \\ \alpha_{41} & \alpha_{42} & \alpha_{43} & \alpha_{44} \end{bmatrix}   (7)

Table 2 Comparison of proposed antennas with other research papers

Ref. No | Structure of antenna | Material | Antenna size (mm³) | Operating frequency (GHz) | Gain (dBi)
[21] | Microstrip slot antenna | Taconic RF35, εr = 3.5, tanδ = 0.0018 | 50 × 46 × 1.52 | 2.2, 4.7 | 3.7
[22] | U slot antenna | Taconic TLT, εr = 2.55, tanδ = 0.0025 | 40 × 40 × 0.8 | 2.3, 3.6 | 4.5
[7] | L slot antenna | Rogers RO4350B, εr = 3.48, tanδ = 0.0037 | 27 × 25 × 0.8 | 2.8, 4.5, 5.8 | 3.6
[9] | C-shaped slot | FR4, εr = 4.4, tanδ = 0.02 | 60 × 20 × 1.6 | 2.4, 3.6, 5.8 | 2.4
[23] | Rhombus | RT-duroid, εr = 2.2, tanδ = 0.0009 | 55 × 52 × 1.6 | 4–7.3 | 4
[24] | H-shaped slot | FR4, εr = 4.4, tanδ = 0.02 | 50 × 50 × 1.6 | 3.5, 5.5 | 2.5, 2.6
[25] | Ring shaped | FR4, εr = 5.4, tanδ = 0.02 | 22 × 13 × 1.5 | 3.1–10.6 | 4.2
[26] | Square-shaped cell | Rogers RT/Duroid 5880, εr = 2.2, tanδ = 0.0009 | 80 × 80 × 1.6 | 1.76, 5.71 | 4.2
Proposed antenna | W-shaped slot | FR4, εr = 4.4, tanδ = 0.02 | 70 × 20 × 0.8 | 1.9, 2.6, 2.4, 3.8 | 2.6, 2, 3.6, 4.6
5 Conclusion

This work presents a frequency-reconfigurable MIMO antenna utilizing 4 PIN diodes for wireless sensor network applications in the LTE (1.9 GHz), GSM (2.6 GHz), WLAN (2.4 GHz), and WiMAX (3.8 GHz) frequency bands. Simulated and measured values of antenna performance parameters such as the VSWR characteristics, surface current distribution, and far-field radiation characteristics are investigated. The VSWR values are between 1 and 2 over the whole operating frequency range. This MIMO antenna achieves a highest gain of 4.6 dBi and a maximum radiation efficiency of 50%. In all four frequency bands, by utilizing a meander-line resonator slot and a metallic strip at the ground level, isolations of more than 20, 18, 26, and 28 dB are obtained in the corresponding frequency bands within the antenna components, with an ECC lower than 0.08, DG greater than 9.7, MEG less than 3 dB, and CCL under 0.4 bits/s/Hz. The proposed MIMO antenna has gains of 2.65, 2.0, 3.64, and 4.6 dBi at the frequencies f1, f2, f3, and f4, respectively. Its radiation is bidirectional in the E plane and omnidirectional in the H plane. The proposed MIMO antenna is well suited for wireless communication applications covering LTE, GSM, WLAN, and WiMAX.
References

1. R. Bhattacharyya, C. Floerkemeier, S. Sarma, Low-cost, ubiquitous RFID-tag-antenna-based sensing. Proc. IEEE 98(9), 1593–1600 (2010)
2. J.W. Sanders, J. Yao, H. Huang, Microstrip patch antenna temperature sensor. IEEE Sens. J. 15(9), 5312–5319 (2015)
3. H. Zemmour, G. Baudoin, A. Diet, Effect of depth and soil moisture on buried ultra-wideband antenna. Electron. Lett. 52(10), 792–794 (2016)
4. S. Hwangbo, H.Y. Yang, Y.K. Yoon, Mutual coupling reduction using micromachined complementary meander-line slots for a patch array antenna. IEEE Antennas Wirel. Propag. Lett. 16, 1667–1670 (2017)
5. S. Pandit, A. Mohan, P. Ray, A compact four-element MIMO antenna for WLAN applications. Microw. Opt. Technol. Lett. 60, 289–295 (2018)
6. L. Ge, K.M. Luk, Frequency-reconfigurable low-profile circular monopolar patch antenna. IEEE Trans. Antennas Propag. 62(7), 3443–3449 (2014)
7. L. Han, C. Wang, X. Chen, W. Zhang, Compact frequency-reconfigurable slot antenna for wireless applications. IEEE Antennas Wirel. Propag. Lett. 15, 1795–1798 (2016)
8. S. Soltani, P. Lotfi, R.D. Murch, A port and frequency reconfigurable MIMO slot antenna for WLAN applications. IEEE Trans. Antennas Propag. 64(4), 1209–1217 (2016)
9. S. Pandit, A. Mohan, P. Ray, Compact frequency-reconfigurable MIMO antenna for microwave sensing applications in WLAN and WiMAX frequency bands. IEEE Sens. Lett. (2018)
10. NXP Semiconductors, BAP65-0, 115 PIN diode, data sheet, 2010. [Online]. https://www.farnell.com/datasheets/1697008.pdf
11. S.M. Nimmagadda, A new HBS model in millimeter-wave beamspace MIMO-NOMA systems using alternative grey wolf with beetle swarm optimization. Wirel. Pers. Commun. 120, 2135–2159 (2021). https://doi.org/10.1007/s11277-021-08696-6
12. S.M. Nimmagadda, Enhancement of efficiency and performance gain of massive MIMO system using trial-based rider optimization algorithm. Wirel. Pers. Commun. 117, 1259–1277 (2021)
13. S.M. Nimmagadda, Optimal spectral and energy efficiency trade-off for massive MIMO technology: analysis on modified lion and grey wolf optimization. Soft Comput. 24, 12523–12539 (2020). https://doi.org/10.1007/s00500-020-04690-5
14. N.S. Murthy, S.S. Gowri, B.P. Rao, Non-orthogonal quasi-orthogonal space-time block codes based on circulant matrix for eight transmit antennas. Int. J. Appl. Eng. Res. 9(21), 9341–9351 (2014). ISSN 0973-4562
15. N.S. Murthy, S.S. Gowri, Full rate general complex orthogonal space-time block code for 8-transmit antenna, in Procedia Engineering, IWIEE 2012, China, Jan 9–10, 2012 (Elsevier). ISSN 1877-7058
16. N.S. Murthy, S.S. Gowri, B.P. Rao, Quasi-orthogonal space-time block codes based on circulant matrix for eight transmit antennas, in IEEE International Conference on Communication and Signal Processing (ICCSP 14), Melmaruvathur, 3–5 Apr 2014
17. N.S. Murthy, S.S. Gowri, P. Satyanarayana, Complex orthogonal space-time block codes rates 5/13 and 6/14 for 5 and 6 transmit antennas, in Wireless Communications, Networking and Mobile Computing (WiCOM), 2011 7th International Conference, Wuhan, China, Sept 23–25 (IEEE, 2011). https://doi.org/10.1109/wicom.2011.6040107
18. N.S. Murthy, Improved isolation metamaterial inspired mm-wave MIMO dielectric resonator antenna for 5G application. Prog. Electromagnet. Res. C 100, 247–261 (2020). https://doi.org/10.2528/PIERC19112603
19. A. Kayabasi, A. Toktas, E. Yigit, K. Sabanci, Triangular quad-port multi-polarized UWB MIMO antenna with enhanced isolation using neutralization ring. AEU-Int. J. Electron. Commun. 85, 47–53 (2018)
20. M. Naser-Moghadasi, R. Ahmadian, Z. Mansouri, F.B. Zarrabi, M. Rahimi, Compact EBG structures for reduction of mutual coupling in patch antenna MIMO arrays. Prog. Electromagnet. Res. C 53, 145–154 (2014)
21. H.D. Majid, M.K.A. Rahim, M.R. Hamid, M.F. Ismail, A compact frequency-reconfigurable narrowband microstrip slot antenna. IEEE Antennas Wirel. Propag. Lett. 11, 616–619 (2012)
22. Z. Ren, W.T. Li, L. Xu, X.W. Shi, A compact frequency reconfigurable unequal U-slot antenna with a wide tunability range. Prog. Electromagn. Res. Lett. 39, 9–16 (2013)
23. K. Murthy, K. Umakantham, K. Satyanarayana, Reconfigurable notch band monopole slot antenna for WLAN/IEEE-802.11n applications, Aug 2017. http://www.inass.org/2017/2017123118
24. G. Jin, C. Deng, Differential frequency reconfigurable antenna based on dipoles for sub-6 GHz 5G and WLAN applications. IEEE Antennas Wirel. Propag. Lett. https://doi.org/10.1109/LAWP.2020.2966861
25. M.U. Rahman, CPW fed miniaturized UWB tri-notch antenna with bandwidth enhancement. Adv. Electr. Eng. 2016, Hindawi Publishing Corporation, Article ID 7279056. https://doi.org/10.1155/2016/7279056
26. M. Shirazi, J. Huang, A switchable-frequency slot-ring antenna element for designing a reconfigurable array. IEEE Antennas Wirel. Propag. Lett. https://doi.org/10.1109/LAWP.2017.2781463
Intelligent Control Strategies Implemented in Trajectory Tracking of Underwater Vehicles Mage Reena Varghese and X. Anitha Mary
Abstract Underwater vehicles (UVs) are an important technology for the sustainable use of ocean resources. The ocean environment is harsh and complex and subjects a vehicle to various forces and moments, so it is a tough task to keep a UV stable at a pre-planned static position and heading angle. Trajectory tracking control is a main research area for this problem. In this paper, various control strategies used in UVs for tracking control, namely classical, robust, and intelligent controllers, are reviewed. Intelligent controllers with novel controlling techniques give extremely good results under complex environmental uncertainties such as waves, ocean currents, and propulsion disturbances.

Keywords PID control · Sliding mode control · Adaptive control · Fuzzy control · Artificial neural network · Linear quadratic Gaussian · H infinity · Intelligent controllers
1 Introduction
Nowadays, underwater vehicles are extensively used in many vital applications such as extraction rigs, the military, and gas and oil facilities. They are mainly suitable for risky conditions in which human beings would face difficulties. The two main classifications of underwater vehicles are the remotely operated vehicle (ROV) and the autonomous underwater vehicle (AUV). An ROV is a tethered robot linked through an umbilical cord for communication and power transfer, whereas an AUV is an untethered robot which has its own power and achieves a pre-planned task. An ROV is simpler and faster to set up because it can be operated on board by a person, and it is good for pre-planned missions over vast areas, such as the search for MH370. In
M. R. Varghese
Department of Robotics Engineering, Karunya Institute of Technology and Sciences, Coimbatore, India
X. A. Mary (B)
Karunya Institute of Technology and Sciences, Coimbatore, India
e-mail: [email protected]
the twenty-first century, the AUV has become more popular in two categories: hovering vehicles and flight vehicles. The former are used for physical work around fixed objects, detailed inspection, etc., and the latter for object location, object delivery, searches, and surveys. The design of AUVs is difficult, however, since it involves many manipulators and sensors. Currently, researchers all over the world are working intensively on the motion control of UVs and the development of control algorithms for them. The research is vital because it has to deal with the complex oceanic environment and disturbance agents such as sea waves and ocean currents that vary with seas and depths. Nevertheless, a UV has to track specified paths accurately, efficiently, and in a stable manner [1, 2]. The basic performance of an underwater vehicle can be improved by enhancing the trajectory tracking control, which is crucial in UV implementation. Tracking mainly includes trajectory tracking and path following; the former depends on time, while the latter is independent of time. UVs have to follow a known reference trajectory to reach the destination and complete a pre-planned task in an underwater environment, such as subsea jacket inspection on offshore oil platforms as well as maintenance and pipeline inspection. Many research works are under way in this path tracking area of the AUV. The control strategies used in former studies can be categorized into classical controllers such as PID, sliding mode, and self-adaptive control; robust controllers such as the Linear Quadratic Gaussian and H infinity controllers; intelligent controllers such as fuzzy and neural network controllers; combinations of intelligent controllers with other controllers; and various other control techniques. The paper first discusses the classical and robust tracking control algorithms used in UVs by researchers. Secondly, the fast-growing intelligent control techniques and the recent developments in this area are reviewed.
2 Conventional Trajectory Tracking Controls Used in UV
Some of the most common conventional control strategies used in underwater vehicles are classical controllers such as PID, sliding mode, and self-adaptive control, and robust controllers such as the Linear Quadratic Gaussian and H infinity controllers. These are discussed in the following sections.
2.1 Proportional Integral Derivative (PID) Control
The PID controller (Fig. 1) is a classical controller which adopts a feedback control technique to rectify the position error e(t) between an actual trajectory y(t) and a reference trajectory r(t) with corrective action [3]. K_p, K_i, and K_d are the parameters to be selected appropriately for a fast, accurate, stable, and smooth dynamic process. The PID controller's output [4] is given by

u(t) = K_p e(t) + K_i ∫ e(t) dt + K_d de(t)/dt.
Fig. 1 PID tracking control for underwater vehicle. Source [3, 4]
PID control is fundamentally a linear controller. Guerrero et al. [5] recommended replacing a PID controller's constant feedback gains with a set of nonlinear functions for trajectory tracking control of UVs; stability can then be checked using Lyapunov's design [5]. The work was done in real time, and the nonlinear PID showed good tracking performance and robustness [5]. The main task with the PID controller is tuning its parameters. The PID controller is widely used in all control applications as it is simple and reliable. One of its drawbacks is that it is not adaptable to environmental and working conditions, because the parameters are set manually. Hence, many PID controllers are combined with other intelligent and adaptive controllers to obtain more optimal results. It is also commonly chosen as the baseline against which simulations of advanced controllers are compared. Borase et al. [4] reviewed various applications of PID in different fields such as process control, robotic manipulators, electric drives, and mechanical systems [4]. A minimal discrete-time sketch of the PID law is given below.
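To make the control law concrete, the following is a minimal discrete-time sketch of the PID update; the first-order plant model, gains, and setpoint are illustrative assumptions, not values from the reviewed works.

```python
# Minimal discrete-time PID: u = Kp*e + Ki*integral(e) + Kd*de/dt.
class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, reference, measurement):
        error = reference - measurement              # e(t) = r(t) - y(t)
        self.integral += error * self.dt             # accumulate integral term
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Example: regulate a toy first-order depth model toward 5 m.
pid = PID(kp=2.0, ki=0.5, kd=0.1, dt=0.1)
depth = 0.0
for _ in range(200):
    thrust = pid.update(reference=5.0, measurement=depth)
    depth += 0.1 * (thrust - 0.2 * depth)            # crude vehicle dynamics
print(round(depth, 2))                               # settles near the setpoint
```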
2.2 Sliding Mode Control
Sliding mode control (SMC) has been a frequently used controller in underwater vehicles for three decades. It is a nonlinear robust controller able to cope with the marine environment. Many sliding mode control strategies exist for nonlinear systems, such as conventional, integral, dynamic, twisting, super-twisting, terminal, and fast terminal SMC, with different sliding manifolds and control inputs. However, SMC suffers from the chattering phenomenon due to its discontinuous control action, which reduces tracking accuracy in AUVs. The input to the SMC is the position error e(t), and the input to the UV is the force τ which comes from the controller (Fig. 2) [6–8].
Fig. 2 SMC tracking control for underwater vehicle. Source [3, 7, 8]
Vu et al. [9] proposed a dynamic sliding mode control (DSMC) to improve system robustness for the motion control of an over-actuated AUV. Lyapunov criteria are used to analyze the stability of the system. Two different methods are used for optimal allocation control in distributing proper thrust to each of the seven thrusters of the over-actuated AUV: the quadratic programming method and the least-squares method [9]. For the depth control of the AUV vertical plane, a dual-loop control methodology was adopted by Wang et al. [10]. The inner pitch controller is an SMC whose parameter tuning is done with an extreme learning machine (ELM), while the outer depth-control loop is built with a proportional controller. This design guaranteed the stability and robustness of the system [10]. Guerrero et al. [11] presented a high-order sliding mode control with auto-adjustable gain for underwater trajectory tracking under oceanic disturbances and model uncertainty [11]. A boundary-layer SMC sketch is given below.
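The following sketch illustrates the basic SMC idea on an assumed double-integrator plant: a sliding variable s = ė + λe and a smoothed switching law u = −K tanh(s/φ), where the tanh boundary layer is one common way to soften the chattering discussed above.

```python
import numpy as np

# s = e_dot + lam*e defines the sliding surface; u = -K*tanh(s/phi) is a
# boundary-layer variant of -K*sign(s) that reduces chattering.
lam, K, phi, dt = 2.0, 3.0, 0.1, 0.01
x, x_dot = 0.0, 0.0                       # vehicle position and velocity
for k in range(2000):
    ref = 1.0                             # desired position
    e, e_dot = x - ref, x_dot
    s = e_dot + lam * e                   # sliding variable
    u = -K * np.tanh(s / phi)             # smoothed switching control
    disturbance = 0.3 * np.sin(0.05 * k)  # stand-in for wave/current forces
    x_ddot = u + disturbance              # double-integrator plant
    x_dot += x_ddot * dt
    x += x_dot * dt
print(round(x, 3))                        # converges close to the reference
```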
2.3 Adaptive Control
Adaptive control is a sophisticated nonlinear feedback control method with little dependence on mathematical models. Craven et al. noted that 'the majority of successful control studies include some kind of adaptive control strategy' [12]. It does not need a known dynamic model, which is why most researchers combine their proposed controllers with adaptive schemes, especially in underwater vehicles. The four types of adaptive control are model reference adaptive control (MRAC), self-tuning, feed-forward, and feedback; the most popular is MRAC. The feedback loop permits an error measure to be calculated between the reference model and the output of the system (Fig. 3), and the controller parameters are tuned to minimize this error [7, 8, 12]. For trajectory tracking of an AUV, a novel adaptive control law is implemented in [13] for the estimation of uncertain hydrodynamic damping parameters; its stability is checked using the direct method of Lyapunov. Bandara et al. [14] addressed a vehicle-fixed-frame adaptive controller for attitude stabilization of a low-speed autonomous underwater vehicle under external disturbances [14]. However, adaptive control finds it difficult to adjust the control parameters and maintain robustness when dealing with actual motion. A minimal MRAC sketch follows Fig. 3.
Fig. 3 Model reference adaptive control tracking system for UV. Source [3, 7, 8, 12]
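As a rough illustration of the MRAC loop in Fig. 3, the sketch below adapts a single feedforward gain with the classic MIT rule on an assumed first-order plant; real UV controllers adapt many parameters and use stability-guaranteed laws, so this is only a schematic.

```python
import numpy as np

# MIT-rule adaptation of a feedforward gain theta so the plant follows
# a reference model: theta_dot = -gamma * e * r (sensitivity approximated by r).
dt, gamma = 0.01, 0.5
a, b = 2.0, 0.8          # plant: y_dot = -a*y + b*u, with b "unknown"
am, bm = 2.0, 2.0        # reference model: ym_dot = -am*ym + bm*r
y = ym = theta = 0.0
for k in range(5000):
    r = 1.0 if (k // 1000) % 2 == 0 else -1.0   # square-wave reference
    u = theta * r                                # adjustable feedforward gain
    y += dt * (-a * y + b * u)                   # plant
    ym += dt * (-am * ym + bm * r)               # reference model
    e = y - ym                                   # model-following error
    theta += dt * (-gamma * e * r)               # gradient (MIT rule) update
print(round(theta, 2))   # tends toward bm/b = 2.5 for this matched plant
```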
2.4 Linear Quadratic Gaussian (LQG)
LQG is an optimal controller for linear time-varying and linear time-invariant systems. It is a combination of a Linear Quadratic Regulator (LQR) and a Kalman filter. This controller is used to obtain the optimal parameter values which minimize a quadratic cost function [15]; a sketch of the LQR half is given below. Hasan et al. (2021) suggested two controllers for an AUV to overcome sensor noise, which is a disturbance for motion control: an LQG controller and an improved version of a FOPID controller, the latter providing more stability.
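Since LQG couples an LQR state-feedback gain with a Kalman filter, the sketch below shows only the LQR half: solving the continuous-time Riccati equation for an assumed double-integrator depth model.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# LQR gain for a double-integrator depth model, state x = [depth, velocity].
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.diag([10.0, 1.0])   # state weights in the quadratic cost
R = np.array([[1.0]])      # control-effort weight

P = solve_continuous_are(A, B, Q, R)   # solve the algebraic Riccati equation
K = np.linalg.inv(R) @ B.T @ P         # optimal state-feedback gain
print(np.round(K, 3))                  # u = -K x minimizes the quadratic cost
```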
2.5 H Infinity
The H infinity controller is a robust controller. It is based on the H infinity norm, i.e., the maximum gain taken over all frequencies and all directions, which the design seeks to minimize. Linearization and synthesis of the control law are the two main steps of H infinity design [16]. Gavrilina et al. [17] recommend an H infinity design for an underwater vehicle attitude control system to attain lower sensitivity to noise from other channels. The outcome of this suggestion improved on the existing attitude control systems of UVs [17].
3 Intelligent Control Techniques Used in UV
Beyond the basic controllers, artificial intelligence control is currently a trend in the tracking control of underwater vehicles as it gives excellent control quality. Intelligent control techniques include fuzzy control, artificial neural network control strategies, genetic algorithm techniques, machine learning techniques, etc.
3.1 Fuzzy Intelligent Control
Fuzzy control is a control method using the concept of fuzzy logic and falls under the category of intelligent controllers. It is commonly used to control uncertain or strongly nonlinear systems when a precise system model is unknown. A fuzzy controller consists of empirical rules, which is beneficial for operator-controlled plants. Simple if-then rules are supplied to a knowledge-based controller; the inference engine evaluates the rules and calculates a control signal based on the measured inputs. The fuzzy controller is built of four parts: the knowledge base, fuzzification interface, inference engine, and defuzzification interface (Fig. 4) [18]. Recent studies have shown that fuzzy logic controllers have been implemented with great success in underwater vehicles. Zhilenkov et al. [19] compared a fuzzy controller design with a PID controller for the motion control system of an autonomous underwater vehicle under uncertainties, and the proposed fuzzy controller showed excellent control quality [19]. To build an efficient obstacle avoidance approach for underwater vehicles in marine environments, Chen et al. [20] used an experimental platform with an ROV equipped with scanning sonar and applied fuzzy logic control to handle linear and nonlinear problems. With the help of an optimum navigation strategy, the fuzzy-logic-equipped ROV was able to avoid obstacles and had magnificent control stability [20]. A self-tuned nonlinear fuzzy PID (FPID) controller [21] was suggested for speed and position control of a multiple-input multiple-output (MIMO) fully actuated UV with eight thrusters to follow desired trajectories. Here, Mamdani fuzzy rules are used to tune the PID parameters, and the result is compared with a classical PID controller; the fuzzy PID controller has a quicker response and more reliable behavior. A minimal fuzzy inference sketch appears after Fig. 4.
Fig. 4 Fuzzy tracking control for UV. Source [3, 19]
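A minimal sketch of the fuzzification, rule evaluation, and defuzzification steps is given below; the triangular membership functions and the three-rule base are invented for illustration.

```python
def tri(x, a, b, c):
    """Triangular membership function with peak at b."""
    return max(min((x - a) / (b - a + 1e-9), (c - x) / (c - b + 1e-9)), 0.0)

def fuzzy_thrust(error):
    # Fuzzify the position error into three linguistic sets.
    neg = tri(error, -2.0, -1.0, 0.0)
    zero = tri(error, -1.0, 0.0, 1.0)
    pos = tri(error, 0.0, 1.0, 2.0)
    # Rule base: negative error -> reverse thrust, zero -> none, positive -> forward.
    rules = [(neg, -1.0), (zero, 0.0), (pos, 1.0)]
    # Weighted-average (centroid-like) defuzzification.
    num = sum(w * out for w, out in rules)
    den = sum(w for w, _ in rules) + 1e-9
    return num / den

for e in (-1.5, -0.3, 0.0, 0.8):
    print(e, round(fuzzy_thrust(e), 2))
```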
3.2 Artificial Neural Network or Artificial Intelligent Control
An artificial neural network, or simply neural network (NN), is an intelligent controller modeled on the mammalian neuron structure to match the biological neural system's learning ability. The general mathematical model of a neural network comprises an input layer, hidden layers, and an output layer [22]. NNs are of vital importance in nonlinear underwater vehicle control applications such as tracking control, collision avoidance control, motion control, and target searching. The recent trend is to combine neural networks with adaptive, robust, dynamic, reinforcement learning, evolutionary, convolutional, and bioinspired algorithms to get optimal control. Figure 5 illustrates the block diagram of neural network (NN) tracking control for a UV. Muñoz et al. [24] implemented a dynamic neural network (DNN)-based nonparametric identifier for the lumped disturbances, combined with parametric identification of a constant input gain given by a parameter adaptive law, known as the Dynamic Neural Control System (DNCS), for an AUV with four degrees of freedom. This resolves the trajectory tracking issue for an AUV in the harsh marine environment [24]. For efficacious real-time search of multiple targets with multiple AUVs, a bioinspired neural network (BNN) is recommended in [23]. Here, a steepest gradient descent rule is used so that the AUVs establish a search path autonomously, and fuzzy control is provided to avoid obstacles along the path of movement by optimizing the search path. The speciality of this algorithm is that its parameters do not require training or learning [23]. An evolutionary neural network (ENN) control known as assembler encoding, in recurrent and feed-forward variants, was applied to a biomimetic AUV to avoid collisions underwater; the tests showed that the recurrent ENN controller outperformed the feed-forward one [25].
Fig. 5 Neural network (NN) tracking control for UV. Source [3, 16]
3.3 Genetic Algorithm Control
It is a computational tool which follows Darwin's genetic criteria and the natural process of biological evolution [26]. In [27], parameter tuning is done with the help of a genetic algorithm (GA) and a harmonic search algorithm (HSA); this is suggested to obtain robust steering control of an autonomous underwater vehicle. Another advanced method of tuning parameters with a GA is the cloud model-based quantum GA (CQGA). It is utilized in tuning fractional-order PID control parameters to increase the motion-control performance of an AUV; a conventional integer-order PID controller can thus be generalized to this newly suggested method [28]. A CQGA-based FOPID controller for motion control is presented in [29]; CQGA is used to tune the optimal parameters of the FOPID controller, and the simulations show better results than using the GA alone. A toy GA for PID-gain tuning is sketched below.
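The toy GA below tunes PID gains by minimizing tracking error on an assumed first-order plant; selection, crossover, and mutation follow the standard pattern, while the CQGA and HSA variants cited above differ in their operators.

```python
import random

def track_error(gains):
    """Integrated squared error of a toy closed loop under PID 'gains'."""
    kp, ki, kd = gains
    y, integ, prev_e, err = 0.0, 0.0, 1.0, 0.0
    for _ in range(300):
        e = 1.0 - y
        integ += e * 0.05
        u = kp * e + ki * integ + kd * (e - prev_e) / 0.05
        prev_e = e
        y += 0.05 * (u - y)          # first-order plant
        err += e * e
    return err

pop = [[random.uniform(0, 5) for _ in range(3)] for _ in range(20)]
for gen in range(30):
    pop.sort(key=track_error)                    # fitness = tracking error
    parents = pop[:10]                           # selection: keep best half
    children = []
    for _ in range(10):
        pa, pb = random.sample(parents, 2)
        child = [random.choice(g) for g in zip(pa, pb)]           # crossover
        child = [max(0.0, g + random.gauss(0, 0.2)) for g in child]  # mutation
        children.append(child)
    pop = parents + children
print([round(g, 2) for g in pop[0]])             # best (Kp, Ki, Kd) found
```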
3.4 Machine Learning Control
This is a control technique used mainly in complex nonlinear systems. It teaches the machine to control the plant from previous experience or from examples, via different learning control methods [26]. Deep learning and reinforcement learning are the common methods used in underwater vehicles for trajectory tracking control. Liu et al. [30] recommended the Deep Deterministic Policy Gradient (DDPG), an intelligent control algorithm, for the lower-layer motion control of a vectored-thruster AUV. The main advantage of this design is that a system model is not required; only some input coefficients of the AUV are obtained from the sensors [30]. An adaptive controller based on a deep reinforcement learning structure has been suggested by Carlucho et al. [31] for low-level control of an AUV [31]. They were able to control all six degrees of freedom by giving low-level commands to the thrusters of a real-time autonomous underwater vehicle with this machine learning technique. Different NNs, such as Faster Region-based Convolutional NN and Single Shot Multibox Detector structures, were trained and validated using deep learning technologies for automatic target recognition by an AUV on optical and acoustic subsea imagery [32]. The reinforcement-learning loop underlying such controllers is sketched below.
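DDPG and the other deep-RL controllers cited above need actor-critic networks; the sketch below instead uses tabular Q-learning on a toy one-dimensional depth-keeping task purely to show the reinforcement-learning loop (state, action, reward, value update).

```python
import random

# Tabular Q-learning on a 1-D depth-keeping task: states are depth bins,
# actions are thrust down/stay/up, reward penalizes distance to the target bin.
n_states, actions, target = 11, (-1, 0, 1), 5
Q = [[0.0] * len(actions) for _ in range(n_states)]
alpha, gamma, eps = 0.1, 0.9, 0.2

for episode in range(2000):
    s = random.randrange(n_states)
    for _ in range(50):
        a = random.randrange(3) if random.random() < eps \
            else max(range(3), key=lambda i: Q[s][i])      # epsilon-greedy
        s2 = min(max(s + actions[a], 0), n_states - 1)     # apply thrust
        r = -abs(s2 - target)                              # distance penalty
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

policy = [actions[max(range(3), key=lambda i: Q[s][i])] for s in range(n_states)]
print(policy)   # expected: +1 below the target bin, 0 at it, -1 above
```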
4 Comparison of Conventional and Intelligent Control Algorithms Used in Underwater Vehicles
Table 1 shows the pros and cons of the main trajectory tracking control algorithms in common use.
Table 1 Comparison of various control algorithms used in UV

Conventional controllers

PID [3]
Advantages: (1) Simple and reliable. (2) Most widely used. (3) Parameters can be automatically adjusted by self-tuning or intelligent algorithms.
Disadvantages: (1) No adaptability to changes because of the manual setup of the parameters. (2) Optimal control will not be achieved.

Sliding mode [3]
Advantages: (1) An accurate dynamic model is not necessary. (2) More reliable and robust.
Disadvantages: (1) High-frequency chattering. (2) Hence intensive heat losses and premature wear in the thrusters.

Self-adaptive [3]
Advantages: (1) Highly adaptive. (2) Automatically adjusts the control parameters.
Disadvantages: (1) Based on an accurate mathematical model. (2) Adjusting the control parameters and keeping robustness when dealing with actual motion is very difficult.

LQG [2]
Advantages: (1) Optimal state-feedback controller. (2) Accurate control design.
Disadvantages: (1) Sensitive to model accuracy. (2) Inefficient at handling nonlinearity.

H infinity [16]
Advantages: (1) Robust. (2) Sharp tracking performance. (3) Fast response speed.
Disadvantages: (1) Complexity of design. (2) An experienced designer is required.

Intelligent controllers

Fuzzy logic [2, 19]
Advantages: (1) Accurate knowledge of the system model is not needed. (2) Popularly used for nonlinear and uncertain systems. (3) Easy to design, good stability, faster response. (4) User friendly since it uses natural language.
Disadvantages: (1) Absence of a learning function. (2) Difficult to tune the fuzzy rules. (3) Overshoot prediction has to be smoothed. (4) Time consuming.

Neural network [2, 22]
Advantages: (1) An exact model is not required. (2) Commonly used in nonlinear systems. (3) Self-learning ability is the great strength of NN. (4) Ability to tolerate faults.
Disadvantages: (1) Real-time application of the control system is difficult as the sample learning process lags. (2) Complex for real-time application.

Genetic algorithm [26]
Advantages: (1) Commonly used as an optimization tool with other controllers. (2) A wide range of data types can be processed.
Disadvantages: (1) Cost is high. (2) Software packages are less available.

Machine learning technique [26, 31]
Advantages: (1) Model-based or model-free variants can be used. (2) Used in complicated systems. (3) Different frameworks can be used. (4) Good portability.
Disadvantages: (1) Efficiency depends upon the data. (2) Much time consuming. (3) Computation is complex. (4) High cost.
5 Conclusion
The ultimate aim of a trajectory tracking controller in underwater vehicles is to keep the system stable under ocean disturbances, unpredictable perturbations, and model uncertainties. In Sect. 2, the basic tracking controls used in current research were discussed. These are commonly used in linear systems, and many of the traditional controllers require a known model. PID, the most commonly used controller, is difficult to tune; this can be done with the help of intelligent controllers, and the result obtained is then close to optimal. Intelligent controllers are best for trajectory tracking of UVs: they can be applied to nonlinear models of complex nature, and no known model is necessary. This is the main advantage of intelligent controllers over traditional controllers, which is especially valuable for underwater vehicles subject to unpredictable disturbances. The comparison in Table 1 clearly points out the limitations of traditional controllers. So, in practice, the basic controllers are commonly combined with intelligent controllers to compensate for their individual drawbacks and to obtain better performance; many researchers have pursued this direction. Machine learning control is a very good intelligent control technique for trajectory tracking of autonomous underwater vehicles, and Sect. 3.4 reviewed some research papers based on it. These methods are time consuming but can be used in the most complicated cases. Research is progressing to discover more advanced and innovative intelligent control techniques for the trajectory tracking of underwater vehicles.
References
1. F.U. Rehman, G. Thomas, E. Anderlini, Centralized control system design for underwater transportation using two hovering autonomous underwater vehicles (HAUVs). IFAC-PapersOnLine 52(11), 13–18 (2019). ISSN 2405-8963
2. M. Aras, M. Shahrieel, S. Abdullah, F. Abdul Azis, Review on auto-depth control system for an unmanned underwater remotely operated vehicle (ROV) using intelligent controller. J. Telecommun. Electron. Comput. Eng. 7, 47–55 (2015)
3. W.-Y. Gan, D.-Q. Zhu, W.-L. Xu, B. Sun, Survey of trajectory tracking control of autonomous underwater vehicles. J. Mar. Sci. Technol. (Taiwan) 25, 722–731 (2017). https://doi.org/10.6119/JMST-017-1226-13
4. R.P. Borase, D.K. Maghade, S.Y. Sondkar et al., A review of PID control, tuning methods and applications. Int. J. Dynam. Control 9, 818–827 (2021)
5. J. Guerrero, J. Torres, V. Creuze, A. Chemori, E. Campos, Saturation based nonlinear PID control for underwater vehicles: design, stability analysis and experiments. Mechatronics 61, 96–105 (2019). ISSN 0957-4158
6. M. Mat-Noh, M.R. Arshad, Z.M. Zain, Q. Khan, Review of sliding mode control applications in autonomous underwater vehicles. Indian J. Geo-Mar. Sci. (2019)
7. J.E. Slotine, W. Li, Applied Nonlinear Control (Prentice Hall, 1991)
8. H.K. Khalil, Nonlinear Systems, 3rd edn. (Prentice Hall, 2002)
9. M.T. Vu, T.-H. Le, H.L.N.N. Thanh, T.-T. Huynh, M. Van, Q.-D. Hoang, T.D. Do, Robust position control of an over-actuated underwater vehicle under model uncertainties and ocean current effects using dynamic sliding mode surface and optimal allocation control. Sensors 21(3), 747 (2021)
10. D. Wang et al., Controller design of an autonomous underwater vehicle using ELM-based sliding mode control, in OCEANS 2017 (Anchorage, 2017), pp. 1–5
11. J. Guerrero, E. Antonio, A. Manzanilla, J. Torres, R. Lozano, Autonomous underwater vehicle robust path tracking: auto-adjustable gain high order sliding mode controller. IFAC-PapersOnLine 51(13), 161–166 (2018). ISSN 2405-8963
12. P.J. Craven, R. Sutton, R.S. Burns, Control strategies for unmanned underwater vehicles. J. Navig. 51(1), 79–105 (1998)
13. B.K. Sahu, B. Subudhi, Adaptive tracking control of an autonomous underwater vehicle. Int. J. Autom. Comput. 11, 299–307 (2014)
14. C.T. Bandara, L.N. Kumari, S. Maithripala, A. Ratnaweera, Vehicle-fixed-frame adaptive controller and intrinsic nonlinear PID controller for attitude stabilization of a complex-shaped underwater vehicle. J. Mechatron. Rob. 4(1), 254–264 (2020)
15. M.W. Hasan, N.H. Abbas, Controller design for underwater robotic vehicle based on improved whale optimization algorithm. Bull. Electr. Eng. Inf. 10(2), 609–618 (2021). ISSN 2302-9285
16. K. Vinida, M. Chacko, An optimized speed controller for electrical thrusters in an autonomous underwater vehicle. Int. J. Power Electron. Drive Syst. (IJPEDS) 9(3), 1166–1177 (2018). ISSN 2088-8694
17. E.A. Gavrilina, V.N. Chestnov, Synthesis of an attitude control system for unmanned underwater vehicle using H-infinity approach. IFAC-PapersOnLine 53(2), 14642–14649 (2020). ISSN 2405-8963
18. R.S. Burns, R. Sutton, P.J. Craven, Computational intelligence in ocean engineering: a multivariable online intelligent autopilot design study (2000)
19. A. Zhilenkov, S. Chernyi, A. Firsov, Autonomous underwater robot fuzzy motion control system with parametric uncertainties. Designs 5(1), 24 (2021)
20. S. Chen, T. Lin, K. Jheng, C. Wu, Application of fuzzy theory and optimum computing to the obstacle avoidance control of unmanned underwater vehicles. Appl. Sci. 10, 6105 (2020). https://doi.org/10.3390/app10176105
21. M.M. Hammad, A.K. Elshenawy, M.I. El Singaby, Trajectory following and stabilization control of fully actuated AUV using inverse kinematics and self-tuning fuzzy PID. PLoS One 12(7), e0179611 (2017)
22. Y. Jiang, C. Yang, J. Na, G. Li, Y. Li, J. Zhong, A brief review of neural networks based learning and control and their applications for robots. Complexity (2017). Article ID 1895897
23. A. Sun, X. Cao, X. Xiao, L. Xu, A fuzzy-based bio-inspired neural network approach for target search by multiple autonomous underwater vehicles in underwater environments. Intell. Autom. Soft Comput. 27(2), 551–564 (2021)
24. F. Muñoz, J.S. Cervantes-Rojas, J.M. Valdovinos, O. Sandre-Hernández, S. Salazar, H. Romero, Dynamic neural network-based adaptive tracking control for an autonomous underwater vehicle subject to modeling and parametric uncertainties. Appl. Sci. 11(6), 2797 (2021)
25. T. Praczyk, Neural collision avoidance system for biomimetic autonomous underwater vehicle. Soft Comput. 24, 1315–1333 (2020)
26. K.-C. Chang, K.-C. Chu, Y.C. Lin, J.-S. Pan, Overview of some intelligent control structures and dedicated algorithms (2020)
27. M. Kumar, Robust PID tuning of autonomous underwater vehicle using harmonic search algorithm based on model order reduction. Int. J. Swarm Intell. Evol. Comput. 4 (2015)
28. J. Wan, B. He, D. Wang, T. Yan, Y. Shen, Fractional-order PID motion control for AUV using cloud-model-based quantum genetic algorithm. IEEE Access 7, 124828–124843 (2019)
29. M. Wang, B. Zeng, Q. Wang, Study of motion control and a virtual reality system for autonomous underwater vehicles. Algorithms 14(3), 93 (2021)
30. T. Liu, Y. Hu, H. Xu, Deep reinforcement learning for vectored thruster autonomous underwater vehicle control. Complexity 2021 (2021). Article ID 6649625
31. I. Carlucho, M. De Paula, S. Wang, Y. Petillot, G.G. Acosta, Adaptive low-level control of autonomous underwater vehicles using deep reinforcement learning. Robot. Auton. Syst. 107, 71–86 (2018). ISSN 0921-8890
32. L. Zacchini, A. Ridolfi, A. Topini, N. Secciani, A. Bucci, E. Topini, B. Allotta, Deep learning for on-board AUV automatic target recognition for optical and acoustic imagery. IFAC-PapersOnLine 53(2), 14589–14594 (2020). ISSN 2405-8963
33. H. Tariq et al., A hybrid linear quadratic regulator controller for unmanned free-swimming submersible. Appl. Sci. 11(19), 9131 (2021)
Fused Feature-Driven ANN Model for Estimating Code-Mixing Level in Audio Samples K. Priya, S. Mohamed Mansoor Roomi, R. A. Alaguraja, and P. Vasuki
Abstract Code-mixing spoken language identification (CM-SLI) from speech signals plays a vital role in many computer-aided voice analysis applications. To identify the level of code-mixed language in a speech signal, a powerful feature is needed for training the model. In the proposed work, such a feature is extracted from the speech via mel frequency cepstral coefficients (MFCC), Delta Delta MFCC (D2 MFCC), and pitch. These features are fused, and a multilayer perceptron (MLP) neural network (NN) is trained with a Bayesian regularization (BR) function. This classifies a given audio sample as Tamil or English and achieves an accuracy of 97.6%. The level of language mixing is then estimated by classifying fragments of the audio, which indicates the speaker's acquaintance with the chosen language.

Keywords Code mixing · Language identification · Mel frequency cepstral coefficients · Pitch · Multilayer perceptron · Neural network
1 Introduction
Speech is the most important communication modality, conveying roughly 80% of the information in face-to-face interaction and 100% of the information in a telephonic conversation. The process of determining the language of an anonymous speaker's utterance, regardless of gender, accent, or pronunciation, is known as spoken language identification (SLI). SLI plays a significant role in speech processing applications [1, 2] like
K. Priya (B) · S. Mohamed Mansoor Roomi · R. A. Alaguraja
ECE, Thiagarajar College of Engineering, Madurai, Tamil Nadu, India
e-mail: [email protected]
S. Mohamed Mansoor Roomi
e-mail: [email protected]
R. A. Alaguraja
e-mail: [email protected]
P. Vasuki
ECE, Sethu Institute of Technology, Madurai, Tamil Nadu, India
e-mail: [email protected]
automatic speech recognition (ASR), speech coding, speech synthesis, speech-to-text communication, speaker and language identification, etc. People can distinguish one language from another without prior knowledge of the language's terms, but this is a tedious process for machines. In service centers, SLI systems can be used to route foreign inquiries to an operator who speaks the recognized language proficiently [3]. It is also used in voice-controlled information retrieval systems like Amazon Alexa, Google Assistant, and Apple Siri. Several research methods are available to tackle the SLI problem with acoustic or phonotactic features. Acoustic features include a wide range of vocal tract characteristics such as rhythm, pitch, and stress. Lin and Wang [4] proposed a language identification system using a Gaussian mixture model (GMM) based on pitch contours. Other acoustic features such as linear prediction (LP), linear prediction cepstral coefficients (LPCC), and mel frequency cepstral coefficients (MFCC) are used in speech processing applications. In SLI, extra features are sometimes fused with these basic features and classified with artificial neural networks (ANN) [5]. A fresh universal acoustic description method for language recognition was developed by Siniscalchi et al. [6], investigating a universal set of fundamental units that can be defined in any language. Singh et al. [7] presented work on language identification with deep neural networks that uses the spectrogram of the speech signal as input. In speech processing analysis, the MFCC is the dominant acoustic feature: it reflects the shape of human voice perception and provides speaker-independent features [8, 9]. Venkatesan et al. [10] proposed a framework for language identification by extracting MFCCs and classifying these features with a support vector machine (SVM). Zissman [11] presented work identifying language using pitch and syllable timing as features, classified by GMM. An SLI system fusing the acoustic features MFCC and fundamental frequency, recognized by a hidden Markov model (HMM), was proposed by Sadanandam [12]. Sarthak et al. [13] proposed language identification using attention-module-based convolutional networks operating on log mel spectrogram images of the speech signals in the Voxforge dataset. Multilingual speakers frequently switch between languages in social media conversations, and automatic language recognition becomes both an essential and a difficult task in such an environment. Code-mixing is a common term in multilingual societies, referring to intra-sentential switching between two different languages. So, for any given audio sample containing a mix of two languages, the proposed algorithm finds the level of mixing by classifying fragments of the audio as Tamil or English. This provides information on how the speaker mixes two languages, voluntarily or involuntarily, leading to a gauge of the versatility and acquaintance of the speaker with the chosen language. The major contributions of the work are:
• Proposal of a multilayer perceptron (MLP) neural network (NN) with Bayesian regularization (BR)-based code-mixing spoken language identification (CM-SLI).
• Selection and fusion of acoustic features such as MFCC, D2 MFCC, and pitch.
• Classification of the fused features by ANN.
• Estimating the level of code-mixing in spoken language.
• Performance comparison on the speech dataset against various ANN classifiers.
The rest of this work is structured as follows. Section 2 describes the speech database collection, and Sect. 3 describes the proposed methodology, covering feature extraction and the ANN classification algorithm. Section 4 explains the experimental results and their comparison with other ANN models, and Sect. 5 concludes the paper.
2 Database Collection
The objective of the proposed work is to identify code-mixed language in speech signals collected from YouTube. The collected data consist of speech samples from two languages, Tamil and English. The total number of files in the language database is 2500, with a sampling frequency of 44.1 kHz. Each class contains 50% of the total files, i.e., 1250. The length of the speech files varies from 5 to 15 s; in this work, speech files with a length of 5 s from both classes are used to identify the language.
3 Proposed Methodology
The flow of the proposed methodology is shown in Fig. 1. It comprises four stages: speech signal detection and segmentation, framing and windowing, feature extraction, and an ANN model that classifies the given code-mixed speech signal as Tamil or English.

Fig. 1 Block diagram of the code-mixing spoken language identification system
3.1 Speech Signal Detection and Segmentation
The speech signal may contain silent parts as well as speech. Since a silent part carries no information, speech signal analysis requires its removal. Equation 1 represents the input speech signal as silence and speech parts. In this work, voice activity detection is used to remove the silence from the input speech signal (I). The silence-removed speech signal (y) is shown in Eq. 2.

I = Silence(S_s) + Speech(S_p)  (1)

y = [t_1, t_2, ..., t_T]  (2)
3.2 Framing and Windowing
The speech signal is time-varying and non-stationary. Over a long period the characteristics of the signal change, which is reflected in the different sounds spoken by the same speaker. If the length of a speech segment is short (10-30 ms), it can be considered stationary. Framing and windowing are therefore used to exploit the speech signal characteristics properly. Framing [14] converts the non-stationary signal into a stationary one by splitting the speech signal into short segments.

F(n) = {y_1 | y_2 | y_3 | ... | y_n}  (3)

F_w(n) = F(n) * w(n)  (4)

w(n) = 0.54 − 0.46 cos(2πn / (N − 1)),  0 ≤ n ≤ N − 1;  0 otherwise  (5)
The framed signal F(n) is obtained from Eq. 3. Then, the Hamming window function w(n) of Eq. 5 is applied to the framed signal, as shown in Eq. 4, which eradicates spectral distortions in the signal. A framing-and-windowing sketch follows.
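A numpy sketch of Eqs. 3-5, framing a signal into overlapping segments and applying the Hamming window; the 44.1 kHz rate matches the database, while the frame sizes are illustrative.

```python
import numpy as np

def frame_and_window(y, frame_len=441, hop=220):
    """Split signal y into overlapping frames (Eq. 3) and apply a Hamming
    window (Eqs. 4-5). 441 samples = 10 ms at 44.1 kHz."""
    n_frames = 1 + (len(y) - frame_len) // hop
    w = np.hamming(frame_len)           # 0.54 - 0.46*cos(2*pi*n/(N-1))
    frames = np.stack([y[i * hop: i * hop + frame_len] for i in range(n_frames)])
    return frames * w                   # element-wise windowing per frame

y = np.random.randn(44100)              # stand-in for 1 s of speech
print(frame_and_window(y).shape)        # (n_frames, 441)
```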
3.3 Feature Extraction
Mel Frequency Cepstral Coefficients (MFCC)
Feature extraction is the process of extracting significant information and removing unrelated information. In this work, the MFCC, Delta Delta MFCC (D2 MFCC), and pitch are extracted to identify the language from the speech signal. MFCC is the most important acoustic feature, representing the short-term power and shape of the speech signal. It provides compressed information about the vocal tract in the form of a small number of coefficients. It is based on the linear cosine transform of the log power spectrum on a nonlinear mel scale of frequency. The flow of the MFCC is shown in Fig. 2. First, the windowed signal is converted into the frequency domain by the discrete Fourier transform (DFT) using Eq. 6.

X(k) = Σ_{n=0}^{N−1} F_w(n) e^{−j2πnk/N},  0 ≤ k ≤ N − 1  (6)
where X(k) is the DFT of the signal and N is the number of DFT points. The DFT of the signal is applied to mel filter banks to compute the mel spectrum. The mel is a frequency measure that imitates human auditory perception. The physical frequency f is mapped to the mel scale f_mel using Eq. 7.

f_mel = 2595 log10(1 + f/700)  (7)
where f is the physical frequency and f_mel is the perceived frequency. Triangular filters are used to compute the magnitude spectrum envelope s(m) in Eq. 8.

s(m) = Σ_{k=0}^{N−1} |X(k)|² H_m(k),  0 ≤ m ≤ M − 1  (8)
where M is the total number of weighting filters and H_m(k) denotes the weight applied to the kth energy spectrum bin. The energy levels of adjacent bands are correlated because of the smoothness of the vocal tract. They are decorrelated by the discrete cosine transform, which produces the mel cepstral coefficients. The MFCC is computed using Eq. 9.
Fig. 2 Block diagram of MFCC
c(n) = Σ_{m=0}^{M−1} log10(s(m)) cos(πn(m − 0.5)/M),  n = 0, 1, 2, ..., C − 1  (9)
where c(n) are the mel cepstral coefficients and C is the number of coefficients. They provide discriminative spectral information about the signal.

Delta Delta (D2) MFCC
The second-order derivative of the MFCC is called the D2 MFCC; it carries temporal-dynamics information and the acceleration of the speech. Delta is the difference between the current and previous coefficients, as represented in Eq. 10.

d_i(j) = Σ_{j=1}^{g} (c_{i+j} − c_{i−j}) / (2 Σ_{j=1}^{g} j²)  (10)
where d_i(j) is the delta coefficient and i is the frame index. The D2 MFCC is returned when the derivative order N = 2.

Pitch
The periodic excitation of the vocal folds produces a voiced signal, whose pitch is estimated either as a period in time or as the fundamental frequency (f_0) in the frequency domain. In this proposed methodology, the pitch is calculated as the fundamental frequency (f_0), the number of vocal-fold vibrations per second during voiced speech. Here, the spectral-based harmonic-to-noise ratio (HNR) [15] method is used to predict the pitch. First, the Hamming window function is multiplied with the framed signal using Eq. 4. Then, the fast Fourier transform (FFT) is applied to the windowed signal in Eq. 11. The resulting array s(kw) has complex values; its magnitudes are doubled except for the first value, and the first N/2 values are taken as the energy spectrum E(mw), as calculated in Eq. 12. The pitch (f_c) is estimated using the HNR in Eq. 13.

s(kw) = (1/(NT)) Σ_{n=0}^{N−1} F_w(nT) e^{−j2πnk/N}  (11)

E(mw) = |s(kw)| for k = 0; |2s(kw)| for k = 1, 2, ..., N/2  (12)

HNR(f_c) = [E(f_c) + E(2f_c) + E(3f_c)] / E(f < 3f_c, f ≠ f_c, f ≠ 2f_c)  (13)
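The sketch below approximates the feature pipeline of Eqs. 6-13 and the fusion of Eq. 14 with librosa, assuming it is installed; librosa's yin pitch tracker stands in for the paper's HNR method, and "sample.wav" is a hypothetical file name.

```python
import numpy as np
import librosa

def fused_features(path):
    """27-dim fused vector: 13 mean MFCCs, 13 mean delta-delta MFCCs,
    and one pitch estimate (yin used here instead of the HNR method)."""
    y, sr = librosa.load(path, sr=44100)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)     # (13, frames)
    d2 = librosa.feature.delta(mfcc, order=2)              # second derivative
    f0 = librosa.yin(y, fmin=50, fmax=500, sr=sr)          # per-frame pitch
    return np.hstack([mfcc.mean(axis=1), d2.mean(axis=1),
                      [np.nanmean(f0)]])                   # Eq. 14 fusion

# vec = fused_features("sample.wav"); print(vec.shape)  # (27,)
```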
Table 1 Fused acoustic feature descriptions

Acoustic features | Indices | Number of features
MFCC | 1–13 | 13
D2 MFCC | 14–26 | 13
Pitch | 27 | 1
Total | – | 27
3.4 Fusion of Acoustic Features
The extracted acoustic features are fused into a single vector, denoted U (Eq. 14); the length of the fused feature vector is 27. The first 13 features are the MFCC coefficients, represented as U_1 to U_13. The next features, U_14 to U_26, are extracted from the D2 MFCC. The final feature, U_27, is the pitch. Table 1 shows the acoustic feature descriptions.

U = [U_1, U_2, U_3, ..., U_27]  (14)
3.5 Artificial Neural Network
ANN is a common technique that has been used in various image and signal processing applications [16–18]. It learns the complex mapping between the input and output layers, as shown in Fig. 3. In this work, a multilayer perceptron neural network (MLP-NN) is used for language identification. The size of the input layer depends on the features, while the output layer matches the number of languages to detect. A trial-and-error method is used to choose the number of hidden units. The following equations for the backpropagation algorithm are presented in the order in which they might be employed during training for a single training vector V (Eq. 15), where 'w' stands for weights, 'f' for activation functions, 'b' for bias, and 'y' for the target.

net_j^h = Σ_{i=1}^{N} W_ji^h V_i + b_j^h  (15)

i_j = f_j^h(net_j^h)  (16)
Equation 16 gives the hidden-layer output i_j, obtained by applying the activation function to the net input of Eq. 15; the input vector is thus propagated through the hidden layer, and the hidden-layer outputs are applied to the output layer (Eq. 17).
Fig. 3 Block diagram of artificial neural network
net_k^o = Σ_{j=1}^{L} W_kj^o i_j + b_k^O  (17)

O_k = f_k^O(net_k^O)  (18)
Then, the hidden-layer outputs are applied to the output layer, as represented by Eqs. 17 and 18. The results from the output layer provide the classification result, and the error terms are computed in Eqs. 19 and 20.

δ_k^O = (y_k − O_k) f_k^{O′}(net_k^O)  (19)

δ_j^h = f_j^{h′}(net_j^h) Σ_{k=1}^{N} δ_k^O W_kj^o  (20)
The MLP uses Eqs. 19 and 20 to calculate the error terms for the output layer (δ_k^O) and the hidden layer (δ_j^h).

w_kj^o(t + 1) = w_kj^o(t) + η δ_k^O i_j  (21)

w_ji^h(t + 1) = w_ji^h(t) + η δ_j^h X_i  (22)

The NN backpropagates these weight corrections to minimize the error using Eqs. 21 and 22.

E_p = (1/2) Σ_{k=1}^{M} δ_k²  (23)
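A minimal numpy sketch of the forward and backward passes of Eqs. 15-22 on synthetic 27-dimensional vectors; it uses a single sigmoid output and plain gradient steps, so the Bayesian regularization used in the paper is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
V = rng.standard_normal((200, 27))                   # 200 fused feature vectors
y = (V[:, 0] > 0).astype(float).reshape(-1, 1)       # toy binary labels

Wh, bh = rng.standard_normal((27, 20)) * 0.1, np.zeros(20)   # 27-20 hidden layer
Wo, bo = rng.standard_normal((20, 1)) * 0.1, np.zeros(1)
sig = lambda z: 1.0 / (1.0 + np.exp(-z))
eta = 0.5
for epoch in range(200):
    i_j = sig(V @ Wh + bh)                 # Eqs. 15-16: hidden activations
    O = sig(i_j @ Wo + bo)                 # Eqs. 17-18: output
    d_o = (y - O) * O * (1 - O)            # Eq. 19: output delta
    d_h = (d_o @ Wo.T) * i_j * (1 - i_j)   # Eq. 20: hidden delta
    Wo += eta * i_j.T @ d_o / len(V)       # Eq. 21: output weight update
    bo += eta * d_o.mean(axis=0)
    Wh += eta * V.T @ d_h / len(V)         # Eq. 22: hidden weight update
    bh += eta * d_h.mean(axis=0)
print(round(float(np.mean((O > 0.5) == y)), 2))      # training accuracy
```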
The network is trained until the error E_p (Eq. 23) becomes minimal for each of the input vectors. Finally, the classified speech signal is categorized as Tamil or English. Then, the level of spoken language is calculated using the following steps. Let the length of the input speech signal (L) comprise the length of the silence signal (S) and the length of the speech signal (SP), as shown in Eq. 24.

L = S + SP  (24)

S = S_1, S_2, ..., S_N  (25)

SP_T = L − Σ_{n=1}^{N} S_n  (26)

The length of the total speech signal is calculated by subtracting the total silence from the input speech signal. The total silence signal (S) and the total speech signal (SP_T) are represented in Eqs. 25 and 26, respectively.

SP_T = SP_1 + SP_2 + ... + SP_m  (27)

SP_ET = Σ_{i=1}^{k} SP_Ti  (28)

SP_TT = Σ_{j=1}^{l} SP_Tj  (29)

The total speech time SP_T consists of the segmented speech times between the silent parts, as shown in Eq. 27. The total English (SP_ET) and Tamil (SP_TT) speech times are obtained by Eqs. 28 and 29.

LSP_ET = SP_ET / SP_T  (30)

LSP_TT = SP_TT / SP_T  (31)

The levels of English (LSP_ET) and Tamil (LSP_TT) speech time are calculated as the ratio of the individual language time to the total speech time, as shown in Eqs. 30 and 31.
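The level computation of Eqs. 27-31 reduces to simple ratios once each segment is labeled; the sketch below reuses the segment durations from the worked example in Table 4.

```python
# Sketch of Eqs. 27-31: segment durations (s) and per-segment labels
# taken from the worked example in Table 4.
segments = [(0.69, "tamil"), (1.82, "english"), (1.89, "english")]

sp_total = sum(t for t, _ in segments)                        # Eq. 27
sp_en = sum(t for t, lang in segments if lang == "english")   # Eq. 28
sp_ta = sum(t for t, lang in segments if lang == "tamil")     # Eq. 29
print(round(100 * sp_en / sp_total, 2))   # Eq. 30: ~84.3% English
print(round(100 * sp_ta / sp_total, 2))   # Eq. 31: ~15.7% Tamil
```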
4 Experimental Results
The CM-SLI system for the collected speech database was developed using MATLAB 2020a on a Core i7 laptop with 8 GB RAM and a GeForce GTX GPU. 70% of the total speech files are used for training the network, and 30% are used for validation and testing. The input speech signal containing a mix of two languages, shown in Fig. 4, is silence-removed and segmented, and then framed to 20 ms length with 10 ms overlap. The speech-detected signal is shown in Fig. 5. A Hamming window is applied to the framed signal for feature extraction, and MFCC, D2 MFCC, and pitch are extracted from the windowed signal. The number of MFCC coefficients for each frame is 13; the mean of each coefficient over the windowed signal is used as a feature. Likewise, the means of the D2 MFCC coefficients are extracted as the next 13 features. Finally, the pitch is calculated as the last feature. These acoustic features are fused to form the input vector of length 27. The extracted feature vector is the input of the MLP-NN, which is trained with the Bayesian regularization (BR) function to achieve language identification. In this work, the input layer contains 27 features, the hidden layer has 20 neurons, and the output layer has 2 neurons, so the network architecture is input-hidden-output, that is, 27–20–2. The number of training speech samples is 1750, so the number of training factors is 1750 * 27. The network modeling parameters are shown in Table 2.

Fig. 4 Sample input speech signal
Fig. 5 Sample detected speech signal
Table 2 Network modeling parameters

Training algorithm | Transfer function | Network structure | Training data | Verifying data | Testing data
MLP | Bayesian regularization | 27–20–2 | 1750 * 27 | 375 * 27 | 375 * 27
The MLP-NN was trained with the Bayesian regularization (BR) training function for 100 epochs. The number of hidden neurons is 20, and the training time is 26 s with a performance of 0.000862. After training the MLP, the performance function reaches the target for all samples. This model achieves 97.9% classification accuracy on the testing data, as shown in Table 3. The sample code-mixed input speech signal was segmented into three segments, as shown in Fig. 6. The identified language of the first segment is Tamil with a length of 0.69 s, and the other two segments are English with lengths of 1.82 s and 1.89 s, respectively. This gives a code-mixing level of 84.31%, as shown in Table 4.

Table 3 Training parameters

Training algorithm | Number of hidden neurons | Epochs | Performance | Training time (s) | Accuracy (%)
MLP-BR | 20 | 100 | 0.000862 | 26 | 97.9
Fig. 6 Detected language (Tamil and English)
Table 4 Level of code-mixing spoken language identification

Segment | Language | Time (s) | % of language detection
SP1 | Tamil (SPT) | 0.69 | 15.69
SP2 | English (SPE) | 1.82 | 41.36
SP3 | English (SPE) | 1.89 | 42.95
Total speech time = SP1 + SP2 + SP3 | – | 4.4 | –

Total % of language detection (English): 84.31
4.1 Evaluation Metrics for Language Identification
In this section, the evaluation metrics for language identification are discussed. The measures of precision, recall, and F1-score characterize the classification performance of the MLP-NN classifier. The confusion matrix of the MLP-NN-BR is shown in Fig. 7. These performance metrics indicate whether the classifier performs well. In this work, the proposed classifier achieves a better accuracy of 97.9%, and the other performance metrics are shown in Table 5. Figure 8 compares language identification results using various MLP training algorithms: Levenberg-Marquardt (LM), scaled conjugate gradient (SCG), Polak-Ribiere conjugate gradient (CGP), conjugate gradient with Powell/Beale restarts (CGB), BFGS quasi-Newton (BFG), and Fletcher-Powell conjugate gradient (CGF). Among these NN training functions, BR achieves the best classification accuracy.
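For reference, the per-class precision, recall, and F1 metrics of Table 5 can be reproduced from predictions with scikit-learn, assuming it is available; the labels below are toy values.

```python
from sklearn.metrics import classification_report

# Precision/recall/F1 on a toy set of Tamil-vs-English predictions.
y_true = ["tamil", "tamil", "english", "english", "tamil", "english"]
y_pred = ["tamil", "english", "english", "english", "tamil", "english"]
print(classification_report(y_true, y_pred, digits=3))
```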
Fig. 7 Confusion matrix of the MLP-NN-BR

Table 5 Experimental results of the speech fluency recognition

Database | Class | Precision % | Recall % | F1-score | Accuracy (%)
Collected from YouTube | Tamil | 98.4 | 97.4 | 97.89 | 97.9
Collected from YouTube | English | 97.3 | 98.4 | 97.84 | 97.9

Fig. 8 Comparison results of the various MLP training algorithms (LM, SCG, CGP, CGB, BFG, CGF, and BR); BR attains the highest accuracy of 97.9%
Table 6 Comparison of the proposed method with the state-of-the-art

Database | Features | Classifier | Accuracy (%)
Three Indian languages [3] | Revised perceptual linear prediction | GMM | 88.75
Constructed database [5] | MFCC | SVM | 74
Four Indian languages [10] | MFCC | SVM | 76
Six Indian languages [12] | MFCC | HMM | 90.63
Voxforge dataset [13] | Log mel spectrogram | ConvNet | 96.3
Collected from YouTube (proposed) | MFCC, D2 MFCC, Pitch | ANN | 97.6
Table 6 compares the performance of the proposed method with state-of-the-art methods. Most of those methods use MFCC alone to identify the language on Indian-language databases. The proposed method uses fused features to identify the language and also estimates the level of language use, achieving good classification accuracy among these state-of-the-art methods.
5 Conclusions
A powerful acoustic-feature-based MLP-BR NN for code-mixing spoken language identification is proposed. The proposed method determines the level of language mixing in speech signals through signal detection and segmentation, fused acoustic feature extraction, and classification. The pre-processing step normalizes the speech signal, and the acoustic features MFCC, D2 MFCC, and pitch are extracted from the pre-processed signal. These features are classified with different NN classifiers, among which the MLP-BR NN classifier achieves the highest accuracy for language identification. The level of the identified language is then estimated for the improvement of language processing applications. The proposed method achieves 97.9% classification accuracy. As a result, it is highly recommended for estimating the level of code-mixing based on the speech signal.

Acknowledgements This work was supported by the Thiagarajar Research Fellowship (TRF) at Thiagarajar College of Engineering, Madurai.
References 1. A.P. Pandian, Performance evaluation and comparison using deep learning techniques in sentiment analysis. J. Soft Comput. Paradigm (JSCP) 3(2), 123–134 (2021)
2. M. Tripathi, Sentiment analysis of Nepali COVID19 tweets using NB, SVM AND LSTM. J. Artif. Intell. 3(03), 151–168 (2021) 3. P. Kumar, A. Biswas, A.N. Mishra, M. Chandra, Spoken language identification using hybrid feature extraction methods. arXiv prepr. arXiv:1003.5623 (2010) 4. C.Y. Lin, H.C. Wang, Language identification using pitch contour information. Department of Electrical Engineering, National Tsing Hua University, Hsinchu, Taiwan 2005 5. B. Aarti, S.K. Kopparapu, Spoken Indian language classification using artificial neural network—an experimental study. in 2017 4th IEEE International Conference on Signal Processing and Integrated Networks (SPIN), pp. 424–430 Sept (2017) 6. S.M. Siniscalchi, J. Reed, T. Svendsen, C.-H. Lee, Universal attribute characterization of spoken languages for automatic spoken language recognition. Comput. Speech Lang. 27(1), 209–227 (2013) 7. G. Singh, S. Sahil, V. Kumar, M. Kaur, M. Baz, M. Masud, Spoken language ıdentification using deep learning. Comput. Intell. Neurosci. Article ID 5123671, 12 (2021) 8. S. Jothilakshmi, V. Ramalingam, S. Palanivel, A hierarchical language identification system for Indian languages. Digital Signal Process. 22(3), 544–553 (2012) 9. M.B. Alsabek, I. Shahin, A. Hassan, in Studying the Similarity of COVID-19 Sounds based on Correlation Analysis of MFCC. https://dblp.org/rec/journals/corr/abs-2010-08770.bib (2020) 10. H. Venkatesan, T.V. Venkatasubramanian, J. Sangeetha, Automatic language ıdentification using machine learning techniques, in Proceedings of the International Conference on Communication and Electronics Systems (2018). IEEE Xplore Part Number: CFP18AWO-ART; ISBN:978-1-5386-4765-3 11. M.A. Zissman, Automatic language identification using Gaussian mixture and hidden Markov models, in 1993 IEEE International Conference on Acoustics, Speech, and Signal Processing. (Minneapolis, MN, USA, 1993), pp. 399–402 12. M. Sadanandam, HMM based language identification from speech utterances of popular Indic languages using spectral and prosodic features. Traitement Signal 38(2), 521–528 (2021) 13. Sarthak, S. Shukla, G. Mittal, Spoken Language Identification Using ConvNets, arXiv:1910.04269v1 [cs.CL] 9 Oct (2019) 14. O.K. Hamid, Frame Blocking and Windowing Speech Signal. J. Inf. Commun. Intell. Syst. (JICIS) 4(5) (2018).ISSN: 2413–6999 15. S. Markov, A. Minaev, I. Grinev, D. Chernyshov, B. Kudruavcev, V. Mladenovic, A spectralbased pitch detection method. AIP Conf. Proc. 2188, 050005 (2019). https://doi.org/10.1063/ 1.5138432 16. S.R. Mugunthan, T. Vijayakumar, Design of improved version of sigmoidal function with biases for classification task in ELM domain. J. Soft Comput. Paradigm (JSCP) 3(02), 70–82 (2021) 17. H.K. Andi, An accurate bitcoin price prediction using logistic regression with LSTM machine learning model. J. Soft Comput. Paradigm 3(3), 205–217 (2021) 18. J.L.Z. Chen, K.L. Lai, Deep convolution neural network model for credit card fraud detection and alert. J. Artif. Intell. 3(02),101–112 (2021)
Pre-emptive Caching of Video Content Using Predictive Analysis Rohit Kumar Gupta, Atharva Naik, Saurabh Suthar, Ashish Kumar, and Ankit Mundra
Abstract Pre-emptive caching is a technique to pre-fetch data based on the outcome of algorithmic predictions. In this paper, we use machine learning models that account for time-based trends, instead of metadata, to estimate a popularity score; this score is integrated with existing caching schemes to pre-fetch trending videos or to replace cached content with newer videos that are receiving more user requests. The paper primarily focuses on regression models for prediction, as they are faster to train, which is crucial for a resource-intensive task like caching.

Keywords Popularity prediction · Video content · Caching · CDN · Regression model
1 Introduction
With increasing Internet usage, passive entertainment consumption, and the booming scale of video hosting websites, managing content distribution has become very important in recent years, so that a feed can be streamed fast enough to minimize lags in transmission. To overcome this persistent scaling issue, servers have become more decentralized through content delivery networks (CDNs), which are server proxies that host cached data received from a centralized database. But the upkeep of a CDN [1–3] can be very expensive if resources are not efficiently managed. Caching every video type is impractical and commercially unfeasible due to the high space and server infrastructure requirements. Instead, CDNs rely on popularity estimation to determine what kind of video content is more in demand and cache that content on multiple edge nodes, which then beam the stream to numerous users across spread-out locations.
R. K. Gupta (B) · A. Naik · S. Suthar · A. Kumar · A. Mundra (B)
Manipal University Jaipur, Jaipur, India
e-mail: [email protected]
A. Mundra
e-mail: [email protected]
A. Kumar
e-mail: [email protected]
Commonly used CDN caching techniques rely solely on the number of user requests a particular content item receives before caching it; this is an example of non-pre-emptive caching. Newer CDNs use more complex models to detect rising popularity and pre-fetch data before it receives many user requests. We have performed a comparative study using country-wise statistical data from YouTube and machine learning models from Microsoft Azure ML Studio to gauge the best-performing algorithms for predicting video popularity, and we utilize them to set a custom caching policy, sketched below. Compared to industry-grade caching, pre-emptive policies improve average response time and minimize transmission latency, as content with increasing popularity is pre-cached on CDN edge nodes, limiting network overheads.
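As a sketch of what such a policy can look like, the toy cache below admits and evicts items by predicted popularity; the scores would come from the regression models studied later in the paper, and the class and thresholds are illustrative assumptions.

```python
import heapq

class PreemptiveCache:
    """Toy popularity-score cache: items with the lowest predicted score
    are evicted first, so predicted-hot items can be pre-fetched."""
    def __init__(self, capacity):
        self.capacity, self.store, self.scores = capacity, {}, []

    def admit(self, video_id, content, predicted_views):
        if len(self.store) >= self.capacity:
            score, victim = heapq.heappop(self.scores)   # least popular item
            if score >= predicted_views:                 # newcomer is colder
                heapq.heappush(self.scores, (score, victim))
                return False
            self.store.pop(victim, None)                 # evict the victim
        self.store[video_id] = content
        heapq.heappush(self.scores, (predicted_views, video_id))
        return True

cdn = PreemptiveCache(capacity=2)
cdn.admit("a", b"...", predicted_views=10_000)
cdn.admit("b", b"...", predicted_views=500)
cdn.admit("c", b"...", predicted_views=50_000)   # evicts "b"
print(sorted(cdn.store))                          # ['a', 'c']
```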
2 Related Work
Existing approaches rely on video metadata for predicting future popularity [4, 5]. These methods are robust but consume a lot of time training the learning models, and most applications focus on keyword-based prediction models. We intend to use a trend-based methodology instead. The system establishes relationships between the most important metrics of a video [2, 6], which include but are not limited to the numbers of dislikes, likes, comments, etc. By using regression models instead of neural networks, training time and model complexity can be lowered. Faster predictions [7] are optimal for pre-emptive caching, where speedy delivery of content is a major priority. A custom caching policy [2, 8] can be created to override a cached item when another video with a higher request count is not captured by the prediction model. In [1], human perception models are used together with network parameters like topology, data links, and strategies using mixed-linear functions; this approach utilizes image data and continuous feeds from video sources to train models, but thumbnails are user-defined and may not represent the real video content. In [3], an LSTM approach is used, which is highly suitable for predictive analysis and provides high accuracy in its outcomes; however, it requires time series data that is unavailable in the datasets used here. In [9], CDNs are shown to play a major role in delivering content at reduced latency and providing a high quality of experience to users. Due to the limited storage on caching servers [10], it is of utmost importance to manage data transmission on these servers, which requires optimizing the caching policies to maximize cache hits.
3 Methodology

We use a real-world YouTube dataset [11] taken from Kaggle to predict view counts and cache videos in the content delivery network (CDN) based on their predicted score. We have compared predictions using different models: Poisson regression [12], linear regression [13], decision forest regression [14], boosted decision tree regression [15], and neural network-based regression [16]. The learning model with the highest accuracy across the video content of multiple countries in the dataset is finalized for use in the caching scheme.
3.1 Dataset

The dataset used for the proposed approach includes the following features:

• video_id
• trending_date
• title
• channel_title
• category_id
• publish_time
• tags
• views
• likes
• dislikes
3.2 Data Pre-processing

Data pre-processing includes cleaning the dataset by removing null values and outliers, analysing the data distribution, converting categorical data into numeric data (since the machine learning models only accept numeric data for training), and performing normalization or standardization according to the use case. The dataset contains irrelevant features such as video_id and category_id, which do not affect video popularity. The trending date is converted to the day of the week; since the day is a categorical variable, it is converted into dummy variables, creating a column for every category. We do not use features like title and channel_title, as converting such categorical data to numeric form adds many new columns and increases the computational complexity. The data is densely populated at the centre, so we normalize it using log normalization.
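As a minimal sketch of these pre-processing steps, assuming a pandas DataFrame loaded from one country's CSV of the Kaggle dataset (the file name and date format below are assumptions, not values from the paper):

```python
import numpy as np
import pandas as pd

# Load one country's trending-video statistics (file name assumed).
df = pd.read_csv("CAvideos.csv").dropna()

# Drop features that do not affect video popularity.
df = df.drop(columns=["video_id", "category_id", "title", "channel_title"])

# Convert the trending date into a day-of-week category, then into dummy
# (one-hot) columns, since the models accept only numeric inputs.
days = pd.to_datetime(df["trending_date"], format="%y.%d.%m").dt.day_name()
df = df.drop(columns=["trending_date"]).join(pd.get_dummies(days, prefix="day"))

# Log-normalize the heavily skewed count features (Sect. 3.3).
for col in ["views", "likes", "dislikes", "comment_count"]:
    df[col] = np.log1p(df[col])
```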
Fig. 1 Distribution of comment_count versus frequency after normalization
3.3 Normalization

Initial visualizations of the dataset indicated that most parameters were positively skewed. As demonstrated for one of the parameters, viz. comment_count for the Canadian YouTube dataset, videos with fewer than 100,000 comments outnumber every other category. The dataset was therefore normalized [17] using a log transform to reduce the uneven skewness of the data distribution. Post normalization, the distributions were observed as shown in Figs. 1 and 2.
3.4 Correlation Matrix The results of an initial correlation matrix indicated many features with low correlation with the view counts of the dataset. We dropped these features in favour of three highly correlated ones, viz. likes, dislikes and comment_count (Fig. 3).
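A short illustrative snippet of this selection step, assuming the pre-processed DataFrame `df` from the sketch in Sect. 3.2:

```python
# Correlation of each candidate feature with the (log-normalized) view
# count; only the strongly correlated features are kept, as in Fig. 3.
corr = df[["views", "likes", "dislikes", "comment_count"]].corr()
print(corr["views"].sort_values(ascending=False))

X = df[["likes", "dislikes", "comment_count"]]  # predictors
y = df["views"]                                 # target
```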
Fig. 2 Scatterplot distribution of comments versus views after normalization
Fig. 3 Correlation matrix showing relationships between the four finalized attributes—likes, dislikes, comment count and views
3.5 Predictive Analysis Methods

a. Linear regression
Linear regression [13] predicts an outcome by establishing a linear relationship between the predictor (independent) variables and the dependent variable. Simple linear regression is one of the most common regression approaches for finite continuous data.

b. Poisson regression
Poisson regression [12] predicts outcomes from count data and contingency tables. The output or response variable is assumed to have a Poisson distribution.

c. Decision forest regression
Decision forest regression is an ensemble learning technique that works well on nonlinear data distributions by creating decision trees. The output of each tree is a Gaussian distribution, and these are aggregated to find a resultant value.

d. Boosted decision tree
Boosted decision tree regression utilizes the MART gradient boosting algorithm to efficiently build a regression tree in a series of steps. Each step has a predefined loss function that is corrected in the next step. It is optimal for linear as well as nonlinear distributions.

e. Neural networks
A neural network [16] is a network of small computing units, i.e. neurons, which perform the computation. It can be used for any task with big data, which will eventually increase the accuracy of the model. But sometimes classical machine learning methods perform better than neural networks [16] on regression problems. A sketch comparing these five model families is given below.
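The paper trains these models in Microsoft Azure ML Studio; as a hedged stand-in, the following scikit-learn sketch compares roughly analogous estimators on the features selected in Sect. 3.4 (the estimator choices and hyperparameters are assumptions, not the paper's configuration):

```python
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression, PoissonRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

# X, y: feature matrix and log-view target prepared in the earlier steps.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=42)
models = {
    "linear regression": LinearRegression(),
    "poisson regression": PoissonRegressor(),
    "decision forest": RandomForestRegressor(n_estimators=100),
    "boosted decision tree": GradientBoostingRegressor(),
    "neural network": MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500),
}
for name, model in models.items():
    pred = model.fit(X_train, y_train).predict(X_test)
    print(f"{name}: MAE={mean_absolute_error(y_test, pred):.4f}  "
          f"RMSE={mean_squared_error(y_test, pred) ** 0.5:.4f}  "
          f"R2={r2_score(y_test, pred):.4f}")
```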
3.6 Caching Methodology

The prediction model can be coupled with existing caching policies; for faster pre-emption, a score-gated least recently used (SG-LRU) [8] caching policy can be used. It caches videos based on the prediction score of the highest-accuracy model, i.e. the boosted decision tree. The CDN caches videos according to the highest prediction scores and the space available in the CDN. View counts are then tallied from the hits (user requests) received on a particular video. Pre-fetched videos with a higher hit ratio are retained in the CDN, while those with a lower hit ratio are replaced by pre-fetched content estimated by the learning algorithm. After a preset time frame, if a video gets less than 80% of its predicted views at that instant, it is allotted a lower priority and eventually replaced by another video with a higher estimated popularity score (Fig. 4).
Fig. 4 Content caching methodology (pipeline: content request → popularity predictor → score → apply caching policy → pre-emptive cache → hit/miss)
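As an illustration only (a toy sketch, not the SG-LRU algorithm of [8]), a score-gated pre-emptive cache can be outlined as follows; the demotion factor and all thresholds besides the 80% rule above are assumptions:

```python
class ScoreGatedCache:
    """Toy score-gated pre-emptive cache: a video is admitted only if its
    predicted popularity score beats the lowest-scored resident video."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.scores = {}  # video_id -> current priority score
        self.hits = {}    # video_id -> observed request count

    def admit(self, video_id, predicted_score):
        if video_id in self.scores:
            return True
        if len(self.scores) < self.capacity:
            self.scores[video_id], self.hits[video_id] = predicted_score, 0
            return True
        # Evict the lowest-scored video only if the newcomer scores higher.
        victim = min(self.scores, key=self.scores.get)
        if predicted_score > self.scores[victim]:
            del self.scores[victim], self.hits[victim]
            self.scores[video_id], self.hits[video_id] = predicted_score, 0
            return True
        return False

    def request(self, video_id):
        hit = video_id in self.scores
        if hit:
            self.hits[video_id] += 1
        return hit

    def demote_underperformers(self, predicted_views, observed_views):
        # After a preset time frame, a video below 80% of its predicted
        # views gets a lower priority so newer predictions can replace it.
        for vid in list(self.scores):
            if observed_views.get(vid, 0) < 0.8 * predicted_views.get(vid, 0):
                self.scores[vid] *= 0.5  # demotion factor assumed
```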
Table 1 Error metrics for different models

Metric | Linear regression | Poisson regression | Decision tree | Boosted decision tree | Neural network
Mean absolute error | 0.507864 | 0.202652 | 0.465561 | 0.451615 | 0.475683
Root mean-squared error | 0.7072567 | 0.689224 | 0.612788 | 0.583466 | 0.616534
Relative absolute error | 0.437201 | 0.435296 | 0.400783 | 0.388778 | 0.408233
Relative squared error | 0.219402 | 0.21147 | 0.166911 | 0.15132 | 0.168054
Coefficient of determination | 0.780596 | 0.78853 | 0.833089 | 0.84868 | 0.831946
4 Results and Analysis

We used five models to predict the popularity of a video: linear regression, Poisson regression, decision forest, boosted decision tree, and neural network regression. The coefficient of determination is used to measure the accuracy [18] of each model (Table 1). The graph of the accuracies (coefficients of determination) of the different models shows that the highest accuracy is achieved with the boosted decision tree (Fig. 5), which is also trained in a short time (Fig. 6). The only other model that trains within 5 s is linear regression, but at the cost of accuracy, which is the lowest of all models at 78.05%. Therefore, the boosted decision tree is the preferred choice for popularity prediction (Table 2).
5 Conclusion

Out of the different learning models used, boosted decision tree regression showed the highest accuracy, with an R² score of 84.86%. Combining Azure ML Studio [19] and local Python resources, we proceeded with this algorithm to predict popularity scores, which are transformed into view counts using an antilog transform. Unlike metadata-based predictions that rely on complex topologies with interrelated learning techniques, the proposed model relies on changes in engagement rates captured from real-time trends; this eliminates the need for a constant live data feed or image processing, since predictions can be performed on statistical snapshots taken at regular intervals. The proposed prediction methodology can be seamlessly integrated with existing caching policies, which makes the approach versatile. Optimization can be carried out for pre-fetching content using a combination of pre-emptive analysis and a demand-based priority system.

Fig. 5 Accuracy results after training the five types of regression models

Fig. 6 Graphical representation of training times for different learning models (lower is better)
Table 2 Accuracy metrics along with training time for different regression algorithms

Model | Accuracy (R² score) | Training time (s)
Linear regression | 78.05 | 3
Poisson regression | 78.88 | 8
Decision forest | 83.30 | 9
Boosted decision tree | 84.86 | 5
Neural network | 83.22 | 18
This technique, when combined with the score-gated LRU caching scheme demonstrated in [8], yields a higher hit ratio, which results in more efficient and faster content delivery through advanced prefetching, and it handles dynamic requests from the predictive model without a higher cache memory footprint. The SG-LRU policy and the learning model can be further customized as per network requests and scaling demand.
References 1. S.M.S. Tanzil, W. Hoiles, V. Krishnamurthy, Adaptive scheme for caching youtube content in a cellular network: a machine learning approach. IEEE Access 5 (Apr 2017). https://doi.org/ 10.1109/ACCESS.2017.2678990 2. A. Masood, T.V. Nguyen, S. Cho, Deep regression model for videos popularity prediction in mobile edge caching networks. in 2021 International Conference on Information Networking (Feb 2021).https://doi.org/10.1109/ICOIN50884.2021.9333920 3. R. Viola, A. Martin, J. Morgade, S. Masneri, M.Z.P. Angueira, J. Montalbán, Predictive CDN selection for video delivery based on LSTM network performance forecasts and cost-effective trade-offs. IEEE Trans. Broadcast. (Nov 2020). https://doi.org/10.1109/TBC.2020.3031724 4. J. Chorowski, J. Wang, J.M. Zurada, Review and performance comparison of SVM- and ELMbased classifiers. Sci. Dir. Neuro Comput. 128, 507–516 (Mar 2014) 5. W. Ding, Y. Shang, L. Guo, X. Hu, R. Yan, T. He, Video popularity prediction by sentiment propagation via implicit network, in ACM International Conference on Information Knowledge Management (Oct 2015), pp. 1621–1630 6. H. Zhu, Y. Cao, W. Wang, T. Jiang, S. Jin, Deep reinforcement learning for mobile edge caching: review, new features, and open issues. IEEE Netw. 32(6), (2018). https://doi.org/10. 1109/MNET.2018.1800109 7. A. Bielski, T. Trzcinski, Understanding multimodal popularity prediction of social media videos with self-attention. IEEE Access 6 (Dec 2018). https://doi.org/10.1109/ACCESS.2018.288 4831 8. G. Hasslinger, K. Ntougias, F. Hasslinger, O. Hohlfeld, Performance evaluation for new web caching strategies combining LRU with score based object selection. Science Direct (12 Apr 2017) 9. R.K. Gupta, R. Hada, S. Sudhir, 2-Tiered cloud based content delivery network architecture: An efficient load balancing approach for video streaming. Int. Conf. Signal Proc. Commun. (ICSPC) 2017, 431–435 (2017). https://doi.org/10.1109/CSPC.2017.8305885
10. R.K. Gupta, V.K. Verma, A. Mundra, R. Kapoor, S. Mishra, Improving recommendation for video content using hyperparameter tuning in sparse data environment, in ed. by P. Nanda, V.K. Verma, S. Srivastava, R.K. Gupta, A.P. Mazumdar, Data Engineering for Smart Systems. Lecture Notes in Networks and Systems, vol. 238 (Springer, Singapore, 2022). https://doi.org/ 10.1007/978-981-16-2641-8_38 11. Trending youtube video Statistics dataset. https://www.kaggle.com/datasnaek/youtube-new? select=USvideos.csv. Date of Access: 18 Feb 2021; Time of Access: 19:36 IST 12. P.C. Consul, F. Famoye, Generalized poisson regression model. Commun. Stat.-Theory Methods, 89–109 (Jul 2007). https://doi.org/10.1080/03610929208830766 13. O.O. Aalen, A linear regression model for the analysis of life times. Stat. Med. first published: Aug 1989, online issue: (Oct 2006). https://doi.org/10.1002/sim.4780080803 14. W. Tong, H. Hong, H. Fang, Q. Xie, R. Perkins, Decision forest: combining the predictions of multiple independent decision tree models. Am. Chem. Soc. 525–531 (Feb 2003). https://doi. org/10.1021/ci020058s 15. A. Poyarkov, A. Drutsa, A. Khalyavin, G. Gusev, Boosted decision tree regression adjustment for variance reduction in online controlled experiments, in 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (Aug 2016), pp. 235–244.https://doi. org/10.1145/2939672.2939688 16. S.C. Wang, Artificial Neural Network, in Interdisciplinary Computing in Java Programming (2003) 17. T. Jayalakshmi, A. Santhakumaran, Statistical normalization and back propagation for classification. Int. J. Comput. Theor. Eng. 3(1), 1793–8201 (2011) 18. D.M. Allen, Mean square error of prediction as a criterion for selecting variables. Technometrics 13, 469–475 (Apr 2012). https://doi.org/10.1080/00401706.1971.10488811 19. Microsoft Azure Machine Learning Studio https://studio.azureml.net/.Platform used for creating learning models and visualizing data
Information Dissemination Strategies for Safety Applications in VANET: A Review Mehul Vala and Vishal Vora
Abstract The intelligent transportation system (ITS) aims to improve the performance of the transportation systems. Vehicular ad hoc networks (VANETs) are the potential mechanism by which ITS can realize its goal. In VANET, moving vehicles form ad hoc networks through wireless connection for exchanging critical information also known as information dissemination. Safety-related information dissemination is multicast or broadcast communication, and it must be fast and reliable. This criterion draws the researchers’ focus to develop efficient dissemination schemes. This review paper discusses safety-related message dissemination strategies, along with comprehensive classification, challenges, and future research direction. Keywords VANET · Ad hoc network · Broadcasting · Multi-hop · Data dissemination
1 Introduction

The ultimate goal of the intelligent transportation system (ITS) is to improve the performance of transportation systems [8]. The vehicular ad hoc network (VANET) is the key enabling technology by which ITS can realize this goal. Next-generation vehicles are intelligent in the sense that they are equipped with processing and communication technologies. VANET supports the idea of communication among moving vehicles [12]: moving vehicles form ad hoc networks through wireless connections for exchanging critical information. Such information exchange is called information dissemination. Standards have been developed to govern this kind of communication, known as wireless access in vehicular environments (WAVE). The WAVE standards are a combination of the dedicated short-range communication (DSRC) and IEEE 1609 standards [13].

M. Vala (B) · V. Vora Atmiya University, Rajkot, Gujarat, India e-mail: [email protected]; V. Vora e-mail: [email protected]
Fig. 1 DSRC protocol stack (layer diagram: application layer with safety, traffic management, and other applications; SAE J2735 message sublayer; network and transport layer with IEEE 1609.2 security, IEEE 1609.3 WSMP, and TCP/UDP over IPv6; IEEE 802.2 LLC sublayer; IEEE 1609.4 MAC sublayer extension; IEEE 802.11p MAC and PHY layers)
Figure 1 shows the DSRC protocol stack. The wireless connectivity can be categorized as vehicle-to-vehicle (V2V) or vehicle-to-infrastructure (V2I), depending on whether the connection is between two moving vehicles or between a vehicle and a stationary node [7]. The DSRC standards support both V2V and V2I communications with ranges up to 1000 m, at data rates from 3 to 27 Mb/s over a bandwidth of 10 MHz [2]. Though DSRC supports V2I communication, installing roadside infrastructure is a costly affair, so to make the technology practically viable, infrastructure-less pure ad hoc communication is preferred among researchers [18, 35]. The practical transmission range is less than 1 km, and in certain situations safety messages need to be sent over longer distances. In such situations, multi-hop broadcasting is crucial; hence, it is drawing the attention of many researchers toward developing efficient and reliable dissemination schemes that reach beyond the transmission range of the sender [24]. The structure of the paper is as follows: Sect. 2 provides an overview of VANET technology, Sect. 3 describes the classification of data dissemination strategies, and Sect. 4 covers safety data dissemination methods. The discussion and future scope are covered in Sect. 5, and finally Sect. 6 concludes the paper.
2 VANET Overview Vehicular ad hoc network (VANET) is a special case of mobile ad hoc networks (MANETs), in which every moving vehicle forms wireless connectivity with other moving vehicles for information sharing purposes [13].
2.1 VANET Architecture

The devices that constitute the VANET architecture are defined as follows:

– On-Board Unit (OBU): equipment installed in vehicles; it establishes wireless connectivity with other OBUs and RSUs while on the move.
– Road Side Unit (RSU): RSUs are installed at regular intervals along roads and constitute the infrastructure in a VANET. Technically, RSUs are similar to OBUs but are stationary and are used to establish wireless connectivity with moving vehicles; an RSU may also work as a bridge for Internet connection. The maximum DSRC range is 1 km, so to realize a fully connected network, RSUs would need to be placed at every kilometer, which raises the cost.

Smart vehicles are equipped with many sensors and processing devices that can collect and process crucial information. Through V2V and V2I communication, as shown in Fig. 2, they can share it with other vehicles. For example, a vehicle can share its location, speed, and direction with other vehicles to realize cooperative safety applications [10].
Fig. 2 VANET architecture [4]
Fig. 3 VANET applications
2.2 VANET Standard

The DSRC standards are designed for short-to-medium-range communication, and their aim is to offer the least delay and a high data rate in VANETs. The US Federal Communications Commission (FCC) has allocated 75 MHz of spectrum at 5.9 GHz (5.85–5.925 GHz) for V2V and V2I communication [14]. The DSRC standards are composed of two standards, IEEE 802.11p and IEEE 1609: IEEE 802.11p governs the operation of the medium access control (MAC) and physical (PHY) layers, while IEEE 1609 governs the higher-layer functions for vehicular communication [36].
2.3 VANET Applications

Overall, VANET applications can be broadly classified into three categories, as shown in Fig. 3 [2, 6, 26].

Active safety applications The main aim of active safety applications is to reduce life-threatening accidents by warning drivers so that collisions can be avoided. Information such as vehicle positions, speeds, and braking events can be shared with other vehicles; by processing this collective information, vehicles can locate hazards. A few representative active safety applications are shown in Fig. 4.

Traffic management applications This category of applications attempts to reduce road congestion, increase fuel efficiency, and support cooperative navigation. Example applications are speed limit warning, optimal speed for green light, cruise control, and platooning.

Infotainment applications This class of applications covers local as well as global services offered to drivers, for example, the nearest fueling station and Internet access.

Three different classes have been presented above for VANET applications. The goal of VANET is to be able to provide all three classes of services with their respective
Fig. 4 Active safety applications
QoS requirements. Active safety applications are time sensitive, so speedy propagation through the network is crucial; the information must also be propagated beyond the transmission range of the sending vehicle. This is where multi-hop information dissemination strategies need to be implemented [17].
3 Message Dissemination Strategies

In VANET safety-related applications, the shared data is usually important to a group of nodes. Due to the highly dynamic topology and short wireless link lifetimes, traditional routing strategies are ill-suited to VANET applications; hence, most research work explores broadcasting-based data dissemination strategies. These can be classified into two broad categories, single-hop broadcast and multi-hop broadcast [24], which differ in the way information is disseminated through the network.
3.1 Single-hop Broadcast

In this method, the sender shares information with its immediate neighbor vehicles. Receiving vehicles keep this information for their own use and periodically broadcast some of it to their single-hop neighbors. Many safety-related applications are implemented through single-hop broadcast, for example, braking event warning, blind spot warning, and lane change warning. Based on the frequency of broadcast, single-hop strategies can be divided into fixed broadcast and adaptive broadcast [21].

Fixed broadcast In fixed broadcast, a vehicle periodically broadcasts crucial information to its immediate neighbors. The vehicles that receive this information update their databases with it, and at a fixed interval they likewise share information with their neighbors. By cooperatively sharing information with single-hop neighbors, they ultimately enhance transport safety. As the broadcast interval is fixed, the key design interest lies in information selection and information aggregation. The fixed interval must be chosen optimally: it should neither promote congestion in the network nor create a scarcity of data [23].

Adaptive broadcast In adaptive broadcasting, the broadcast interval is selected based on need; for example, if congestion is detected in the network, the broadcast rate is reduced. Single-hop broadcast schemes utilize a store-and-forward strategy to convey information. Hence, they are best suited for applications where information needs to be shared over short distances and the timing criteria are not very strict [29].
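As a toy sketch of the adaptive case (the busy-ratio thresholds and interval bounds are assumptions, not values from the literature reviewed here):

```python
def next_beacon_interval(current_interval, channel_busy_ratio,
                         min_interval=0.1, max_interval=1.0):
    """Adaptive single-hop broadcast: lengthen the beacon interval when
    the channel looks congested, shorten it when the channel is idle."""
    if channel_busy_ratio > 0.6:    # congestion detected: broadcast less often
        return min(current_interval * 2, max_interval)
    if channel_busy_ratio < 0.2:    # idle channel: broadcast more often
        return max(current_interval / 2, min_interval)
    return current_interval
```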
3.2 Multi-hop Broadcast

In the DSRC standard, the transmission range for communication is 1 km, but experimental results show that the practical range is not more than 300 m. In such cases, to propagate safety-related messages over longer distances, multi-hop message forwarding schemes need to be utilized. In an ad hoc network, central coordination is missing, so establishing multi-hop message dissemination in VANET is a challenging task; the severity of the problem increases in the extremely dense and sparse networks that are typical of vehicular communication [34]. The broadcast mechanism in its native sense is simple flooding: the sender broadcasts the data to all single-hop neighbors, and in multi-hop broadcasting this data is further propagated to the receivers' neighbors, and so on. In simple flooding, many vehicles broadcast the same packets and waste bandwidth, and in a dense network such flooding easily creates congestion; this is referred to as the broadcast storm problem. Plain flooding leads to the following problems in information dissemination [27]:
– Excessive redundant data
– Channel contention
– Large packet drops
– Delay in message delivery.
A summary of the comparison between the single-hop and multi-hop broadcast techniques is given in Table 1.
4 Safety Message Dissemination Methods

As discussed above, plain broadcasting is very inefficient and leads to the broadcast storm problem. To alleviate this, methods of selective broadcasting are practiced: upon receiving a packet, out of all the one-hop receiving nodes, only one or a few are selected as relay candidates to further broadcast the packet, while the other nodes keep the data for their own use. The popular relay node selection strategies in the literature are distance-dependent, link quality-based, probability-based, counter-based, cluster-based, network coding-based, neighbor knowledge-based, and hybrid strategies, as shown in Fig. 5.

Distance-dependent: Based on the distance between sender and receiver, the farthest node is selected to relay the message. By selecting the farthest node for relaying, the largest area can be covered with the minimum hop count.

Link quality-based: Realistic channel conditions are considered for next-hop selection; the next broadcasting node is selected based on the received RSSI value or other channel conditions.
Table 1 Comparative analysis of single-hop and multi-hop broadcast schemes

                | Single-hop broadcast | Multi-hop broadcast
Characteristics | • Message exchange between immediate neighbors only [24] • Message exchange rate can be fixed or adaptive as per design • Delay-tolerant scheme | • Message exchange beyond one-hop distance [17] • Can cover a large area through multi-hop message propagation • Prone to the broadcast storm problem
Advantage | • Less redundancy [2] • Avoids the broadcast storm problem | • Long-distance propagation of critical messages [17]
Application | Cooperative awareness applications, e.g. • Blind spot alert • Lane change and collision warning [26] | Emergency applications, e.g. • Post-crash alert • Road condition alert [13]
Fig. 5 Complete broadcast classification
Probability-based: Among the nodes available to relay a message, each node is assigned a different probability of relaying. The node with the highest probability broadcasts the message, and the other nodes discard their scheduled broadcasts when they hear the relay node's broadcast. The probability assignment depends on parameters such as distance, vehicle density, direction, and speed.

Counter-based: Whenever a node receives a broadcast packet, it first sets a random wait time before relaying it further. During the wait time, it counts the number of retransmissions of the same packet. If the total is less than a predetermined threshold, the node rebroadcasts the packet; otherwise, it discards the scheduled broadcast.

Cluster-based: A group or cluster is formed among neighbor vehicles with common features, including but not limited to relative velocity, acceleration, position, direction, vehicle density, and transmission range. A cluster head (CH) is selected from among the cluster members (CMs), and on behalf of all members only the cluster head broadcasts the message toward other clusters.

Neighbor knowledge-based: Vehicles exchange key information such as position, direction, and speed. By processing this information, every vehicle builds knowledge of the surrounding network conditions and uses this knowledge to choose the optimum relay candidate.

Network coding-based: Transmitted data is encoded and decoded to enhance network throughput. Relay nodes combine several received packets before transmitting, the aim being to reduce the net number of transmissions compared with broadcasting without network coding.

Hybrid: To improve performance and alleviate the limitations of the above methods, researchers sometimes combine more than one method in the relay node selection process; all such methods belong to the hybrid category. A sketch of the two simplest selection rules is given below.
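As a minimal sketch of the distance-dependent and counter-based rules above (the range, contention window, and threshold values are assumptions):

```python
TX_RANGE = 300.0         # assumed practical transmission range in metres
MAX_WAIT = 0.05          # assumed maximum contention wait in seconds
COUNTER_THRESHOLD = 3    # assumed duplicate-count threshold

def rebroadcast_wait_time(distance_to_sender):
    """Distance-dependent selection: the farther a receiver is from the
    sender, the shorter it waits, so the farthest node relays first and
    covers the largest area per hop."""
    d = min(distance_to_sender, TX_RANGE)
    return MAX_WAIT * (1.0 - d / TX_RANGE)

def should_rebroadcast(duplicates_heard_while_waiting):
    """Counter-based suppression: cancel the scheduled rebroadcast if the
    same packet was overheard too many times during the wait."""
    return duplicates_heard_while_waiting < COUNTER_THRESHOLD
```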
4.1 Beacon-Assisted Versus Beacon-Less

All the methods used for relay node selection can be either beacon-assisted or beacon-less. Beacon-assisted methods require the periodic exchange of hello messages, while beacon-less methods have no such requirement [25]. Periodic exchange of beacons increases overhead, but at the same time it improves performance. Bandwidth is a very precious resource in VANET, so to reduce its wastage, beacon-less methods can be utilized [11]. The following section reviews and classifies papers based on beacon-less and beacon-assisted data dissemination strategies.

Beacon-Assisted Protocols DV-CAST exchanges periodic messages with one-hop neighbors and builds local topology knowledge, which makes it robust against diverse traffic conditions. Each node continuously checks the local topology to find a node in the same or opposite direction to broadcast to. It applies a store-carry-forward mechanism when no node is available in a sparse network; otherwise, it applies rebroadcast suppression and efficiently forwards the packet. A weighted p-persistent suppression scheme is used to reduce the broadcast storm problem. Beacons carrying direction and position information need to be exchanged continuously, and in diverse dense and sparse scenarios the optimum frequency of these beacons is crucial to the performance of the proposed work [31]. Inter-vehicle geocast (IVG) shares position, direction, and acceleration information to calculate the area of interest, from which it selects the best forwarding nodes. A timer-based approach is used for broadcasting: whenever a message is received for the first time, the node waits for a specific time and retransmits the message upon expiration of the timer. This timer-based next-forwarder selection reduces redundant transmissions [3]. The distributed optimized time (DOT)-based approach provides beacon-assisted time-slot density control, addressing the scalability issue in dense traffic by reducing the density of vehicles in each time slot. One-hop neighborhood information is exchanged through beacons to select the farthest vehicle for rebroadcast [28]. MOZO is a clustering-based protocol [19] in which vehicles collaborate through hello messages to form dynamic moving zones, each consisting of vehicles with similar moving patterns connected by one-hop links. The captain vehicle maintains a combined location and velocity tree (CLV-tree) to estimate the positions of vehicles in the cluster, and whenever a vehicle leaves the cluster, the leaving event queue (LE) is updated. Although less data needs to be exchanged than with position sharing, MOZO still needs neighbor information to perform data dissemination. The major problems beacon-assisted protocols face are frequent contention and the broadcast storm. AddP adjusts the retransmission rate based on node density to reduce the broadcast storm problem, and it also selects the most suitable relay candidate based on local density and distance. To alleviate the hidden node problem, it proposes a transmitted-packet monitoring mechanism to confirm
whether the relay node has transmitted the message or not. A network coding-based data aggregation mechanism is utilized to reduce duplicate packets propagating in the network [22]. Zhang et al. [37] proposed an adaptive link quality-based safety message (ALQSM) forwarding scheme for vehicular networks, with a physical channel connectivity checking method. Based on the calculated connectivity probability among vehicles, different scores are assigned to potential forwarders, and a score-oriented priority method selects the optimal forwarder; this reduces contention among vehicles during broadcasting. The data dissemination scheme presented in [20] is based on clustering and probabilistic broadcasting (CPB). A clustering algorithm forms clusters of vehicles moving closely in the same direction, which allows vehicles to exchange received messages with the cluster head. During this phase, probabilistic forwarding is used, where the probability is calculated from how many times the message is received during a defined interval; only the cluster head forwards the received message in its transmission direction. The enhanced counter-based broadcast protocol for urban VANETs (ECUV) improves data dissemination using a road topology-based approach to select the best relay nodes for coverage in urban vehicle-to-vehicle (V2V) scenarios. This protocol avoids the broadcast storm problem by reducing the transmission probability at high vehicle density, and it increases coverage in low-density scenarios [15].

Beacon-Less Protocols SEAD utilizes a beacon-less method to estimate node density and, based on the estimate, dynamically defines the rebroadcast probability. The redundancy ratio is computed at each node to infer node density as per the following equation [1]:

R = Total received messages (original + duplicated) / Total new messages (original)

This locally measured metric offers a beacon-less and adaptive dissemination scheme that helps reduce the broadcast storm problem. The distance between the sending and receiving nodes is used to compute the wait time, while the node density is used to compute the retransmit probability; in this sense, it is a hybrid beacon-less protocol. The range-based relay node selecting (RBRS) protocol describes an emergency warning dissemination scheme. The receiver refrains from immediate broadcasting and waits a random time before retransmission, with the wait time inversely proportional to the distance between the sending and receiving vehicles; in this way, the chosen relay vehicle is the farthest vehicle from the sender. When boundary vehicles are not available, the chosen relay vehicle waits unnecessarily long and the covered area is smaller due to the short distance from the sender. RBRS helps reduce the broadcast storm problem by discarding the scheduled transmission when a node hears the same message transmitted by another relay node [16].
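A small illustrative snippet of SEAD's density estimate (the mapping from redundancy ratio to rebroadcast probability below is an assumed linear rule, not SEAD's exact formula):

```python
def redundancy_ratio(total_received, total_new):
    """SEAD's locally measured ratio R: received messages (original plus
    duplicated) per new message; a large R indicates a dense neighborhood."""
    return total_received / max(total_new, 1)

def rebroadcast_probability(r, r_min=1.0, r_max=10.0):
    # Assumed mapping: the denser the neighborhood (larger R), the lower
    # the forwarding probability, to avoid the broadcast storm.
    r = min(max(r, r_min), r_max)
    return 1.0 - (r - r_min) / (r_max - r_min)
```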
The SAB protocols estimate traffic conditions from speed observations, exploiting the negative correlation between speed and density. Three versions of the speed adaptive broadcast (SAB) protocol are provided, namely Probabilistic-SAB, Slotted-SAB, and Grid-SAB, of which Grid-SAB yields the lowest packet redundancy. Without extra beacon overhead, this work addresses the issues of scalability and reliability [5]. In [30], the authors present a novel way to use bandwidth optimally by reducing the large number of data packets and thus the wastage of bandwidth. A fuzzy-based beacon-less probabilistic broadcasting algorithm (FBBPA) is proposed, in which the broadcasting probability is calculated by considering distance, direction, angular orientation, and buffer load; the packet with the highest probability in the buffer is transmitted first. DRIVE aims to mitigate the broadcast storm problem and network partitions by disseminating data within an area of interest (AoI). It does not require vehicles to maintain a neighbor table; instead it uses a sweet spot to alleviate the broadcast storm problem and increase the coverage range. A vehicle located within the sweet spot is more likely to disseminate data, enhancing coverage compared with distance-based broadcasting [32]. In [33], Wang designed a distributed relay selection method that considers the locations, channel quality, velocities, and message receiving statuses of vehicles to improve performance in highly mobile vehicular ad hoc networks. Instantly decodable network coding is used by the next relay vehicle to retransmit packets, resulting in significant improvements in both network throughput and transmission delay; simulation results show that the proposed strategy effectively reduces the delay of data dissemination in highway scenarios. In [9], a beacon-less traffic-aware geographical routing protocol (BTA-GRP) is proposed that tries to eliminate mobility-induced unreliability in VANETs. BTA-GRP is an improved geographic routing strategy adapted to high mobility and link disconnection issues; it considers traffic density, distance, and direction when choosing the next broadcast node, and it is suitable for dense as well as sparse traffic. Table 2 summarizes all reviewed papers by method used, research objective, and evaluation scenario.
5 Discussion and Future Scope

The message dissemination process depends heavily on the type of traffic, the type of application, and its QoS requirements. The forwarding strategy may be single-hop or multi-hop, depending on the distance between sender and receiver as well as the performance criteria. The chosen scheme needs to ensure that all neighbor nodes receive crucial information through broadcast without network congestion or excessive delay, and with good efficiency. Single-hop communication can provide acceptable throughput, but the data delivery time is large due to the store-and-forward nature of the communication.
Table 2 Information dissemination approaches

Protocol | Strategy | Forwarding method | Objective | Scenario | Simulator
IVG [3] | Beacon-assisted | Distance-based | Broadcast storm | Highway | GloMoSim
DV-CAST [31] | Beacon-assisted | Neighbor knowledge | Broadcast storm, disconnected network | Highway and Urban | NS-2
AddP [22] | Beacon-assisted | Density and distance based | Broadcast storm, hidden node | Highway and Urban | OMNeT++
DOT [28] | Beacon-assisted | Location-based | Redundancy reduction | Highway | –
ALQSM [37] | Beacon-assisted | Link quality-based | Redundancy reduction | Urban | OMNeT++
CPB [20] | Beacon-assisted | Clustering and probability-based | Delay reduction, improve coverage | Highway | NS-2
MoZo [19] | Beacon-assisted | Cluster-based | Broadcast storm | Highway and Urban | NS-2
ECUV [15] | Beacon-assisted | Counter-based | Broadcast storm | Highway and Urban | –
RBRS [16] | Beacon-less | Distance-based | Delay reduction | Highway | NS-2
FBBPA [30] | Beacon-less | Fuzzy-based | Delay reduction | Highway and Urban | NS-3
SEAD [1] | Beacon-less | Probability-based | Broadcast storm | Highway | –
SAB [5] | Beacon-less | Density-based | Scalability, redundancy reduction | Highway and Urban | OMNeT++
DRIVE [32] | Beacon-less | Location-based | Overhead reduction | Highway and Urban | OMNeT++
NCRS-NC [33] | Beacon-less | Network coding-based | Delay reduction | Highway | –
BTA-GRP [9] | Beacon-less | Position-based | Delay reduction, disconnection issue | Highway and Urban | NS-2
Hence, it is suitable for delay-tolerant applications, while performing poorly in delay-sensitive ones. Due to the limitations of single-hop communication, considerable research activity is directed toward multi-hop data dissemination schemes. A good multi-hop dissemination strategy elects only a subset of neighbor nodes to rebroadcast the message in the network; the redundancy rate and congestion in the network depend on the elected dissemination scheme.
6 Conclusion

This paper provides a review of VANET technology, including a discussion of the VANET architecture, the VANET protocol stack, and its applications, and it highlights the importance of VANET in establishing ITS applications. Broadcasting is the basic mechanism for information dissemination in vehicular networks, and due to high mobility and the absence of centralized coordination, the task of message dissemination becomes very challenging. Safety-related applications are the most important of all and need special consideration; a comprehensive classification of safety message dissemination has been provided. The choice between a beacon-less and a beacon-assisted strategy is a trade-off between reliability and bandwidth saturation: to efficiently utilize the available bandwidth, beacon-less schemes are suitable.
References 1. I. Achour, T. Bejaoui, A. Busson, S. Tabbane, Sead: a simple and efficient adaptive data dissemination protocol in vehicular ad-hoc networks. Wirel. Netw. 22(5), 1673–1683 (2016) 2. S. Al-Sultan, M.M. Al-Doori, A.H. Al-Bayatti, H. Zedan, A comprehensive survey on vehicular ad hoc network. J. Netw. Comput. Appl. 37, 380–392 (2014) 3. A. Bachir, A. Benslimane, A multicast protocol in ad hoc networks inter-vehicle geocast, in The 57th IEEE Semiannual Vehicular Technology Conference, 2003. VTC 2003-Spring, vol. 4, pp. 2456–2460. IEEE (2003) 4. R. Chandren Muniyandi, M.K. Hasan, M.R. Hammoodi, A. Maroosi, An improved harmony search algorithm for proactive routing protocol in vanet. J. Adv. Transp. 2021 (2021) 5. M. Chaqfeh, A. Lakas, A novel approach for scalable multi-hop data dissemination in vehicular ad hoc networks. Ad Hoc Netw. 37, 228–239 (2016) 6. F.D. Da Cunha, A. Boukerche, L. Villas, A.C. Viana, A.A. Loureiro, Data Communication in VANETs: A Survey, Challenges and Applications. Ph.D. thesis, INRIA Saclay, INRIA (2014) 7. K.C. Dey, A. Rayamajhi, M. Chowdhury, P. Bhavsar, J. Martin, Vehicle-to-vehicle (v2v) and vehicle-to-infrastructure (v2i) communication in a heterogeneous wireless networkperformance evaluation. Transp. Res. Part C: Emerg. Technol. 68, 168–184 (2016) 8. G. Dimitrakopoulos, P. Demestichas, Intelligent transportation systems. IEEE Veh. Technol. Mag. 5(1), 77–84 (2010) 9. S. Din, K.N. Qureshi, M.S. Afsar, J.J. Rodrigues, A. Ahmad, G.S. Choi, Beaconless trafficaware geographical routing protocol for intelligent transportation system. IEEE Access 8, 187671–187686 (2020)
10. Y.P. Fallah, C.L. Huang, R. Sengupta, H. Krishnan, Analysis of information dissemination in vehicular ad-hoc networks with application to cooperative vehicle safety systems. IEEE Trans. Veh. Technol. 60(1), 233–247 (2010) 11. R. Fracchia, M. Meo, D. Rossi, Vanets: to beacon or not to beacon?. in IEEE Globecom Workshop on Automotive Networking and Applications (AutoNet 2006) (2006) 12. H. Hartenstein, L. Laberteaux, A tutorial survey on vehicular ad hoc networks. IEEE Commun. Mag. 46(6), 164–171 (2008) 13. G. Karagiannis, O. Altintas, E. Ekici, G. Heijenk, B. Jarupan, K. Lin, T. Weil, Vehicular networking: a survey and tutorial on requirements, architectures, challenges, standards and solutions. IEEE Commun. Surv. Tutorials 13(4), 584–616 (2011) 14. J.B. Kenney, Dedicated short-range communications (dsrc) standards in the united states. Proc. IEEE 99(7), 1162–1182 (2011) 15. L. Khamer, N. Labraoui, A.M. Gueroui, S. Zaidi, A.A.A. Ari, Road network layout based multihop broadcast protocols for urban vehicular ad-hoc networks. Wirel. Netw. 27(2), 1369–1388 (2021) 16. T.H. Kim, W.K. Hong, H.C. Kim, Y.D. Lee, An effective data dissemination in vehicular adhoc network, in International Conference on Information Networking, pp. 295–304. Springer (2007) 17. S. Latif, S. Mahfooz, B. Jan, N. Ahmad, Y. Cao, M. Asif, A comparative study of scenariodriven multi-hop broadcast protocols for vanets. Veh. Commun. 12, 88–109 (2018) 18. W. Liang, Z. Li, H. Zhang, S. Wang, R. Bie, Vehicular ad hoc networks: architectures, research issues, methodologies, challenges, and trends. Int. J. Distrib. Sens. Netw. 11(8), 745303 (2015) 19. D. Lin, J. Kang, A. Squicciarini, Y. Wu, S. Gurung, O. Tonguz, Mozo: a moving zone based routing protocol using pure v2v communication in vanets. IEEE Trans. Mob. Comput. 16(5), 1357–1370 (2016) 20. L. Liu, C. Chen, T. Qiu, M. Zhang, S. Li, B. Zhou, A data dissemination scheme based on clustering and probabilistic broadcasting in vanets. Veh. Commun. 13, 78–88 (2018) 21. M. Naderi, F. Zargari, M. Ghanbari, Adaptive beacon broadcast in opportunistic routing for vanets. Ad Hoc Netw. 86, 119–130 (2019) 22. R. Oliveira, C. Montez, A. Boukerche, M.S. Wangham, Reliable data dissemination protocol for vanet traffic safety applications. Ad Hoc Netw. 63, 30–44 (2017) 23. B. Pan, H. Wu, J. Wang, Fl-asb: a fuzzy logic based adaptive-period single-hop broadcast protocol. Int. J. Distrib. Sens. Netw. 14(5), 1550147718778482 (2018) 24. S. Panichpapiboon, W. Pattara-Atikom, A review of information dissemination protocols for vehicular ad hoc networks. IEEE Commun. Surv. Tutorials 14(3), 784–798 (2011) 25. B. Paul, M. Ibrahim, M. Bikas, A. Naser, Vanet Routing Protocols: Pros and Cons. arXiv preprint arXiv:1204.1201 (2012) 26. A. Rasheed, S. Gillani, S. Ajmal, A. Qayyum, Vehicular ad hoc network (vanet): a survey, challenges, and applications, in Vehicular Ad-Hoc Networks for Smart Cities, pp. 39–51. Springer (2017) 27. T. Saeed, Y. Mylonas, A. Pitsillides, V. Papadopoulou, M. Lestas, Modeling probabilistic flooding in vanets for optimal rebroadcast probabilities. IEEE Trans. Intel. Transp. Syst. 20(2), 556–570 (2018) 28. R.S. Schwartz, K. Das, H. Scholten, P. Havinga, Exploiting beacons for scalable broadcast data dissemination in vanets, in Proceedings of the Ninth ACM International Workshop on Vehicular Inter-Networking, Systems, and applications, pp. 53–62 (2012) 29. C. Sommer, O.K. Tonguz, F. Dressler, Traffic information systems: efficient message dissemination via adaptive beaconing. 
IEEE Commun. Mag. 49(5), 173–179 (2011) 30. A. Srivastava, A. Prakash, R. Tripathi, Fuzzy-based beaconless probabilistic broadcasting for information dissemination in urban vanet. Ad Hoc Netw. 108, 102285 (2020) 31. O.K. Tonguz, N. Wisitpongphan, F. Bai, Dv-cast: a distributed vehicular broadcast protocol for vehicular ad hoc networks. IEEE Wirel. Commun. 17(2), 47–57 (2010) 32. L.A. Villas, A. Boukerche, G. Maia, R.W. Pazzi, A.A. Loureiro, Drive: an efficient and robust data dissemination protocol for highway and urban vehicular ad hoc networks. Comput. Netw. 75, 381–394 (2014)
33. S. Wang, J. Yin, Distributed relay selection with network coding for data dissemination in vehicular ad hoc networks. Int. J. Distrib. Sens. Netw. 13(5), 1550147717708135 (2017) 34. L. Wu, L. Nie, J. Fan, Y. He, Q. Liu, D. Wu, An efficient multi-hop broadcast protocol for emergency messages dissemination in vanets. Chinese J. Electron. 26(3), 614–623 (2017) 35. Z. Xu, X. Li, X. Zhao, M.H. Zhang, Z. Wang, Dsrc versus 4g-lte for connected vehicle applications: a study on field experiments of vehicular communication performance. J. Adv. Transp. 2017 (2017) 36. S. Zeadally, R. Hunt, Y.S. Chen, A. Irwin, A. Hassan, Vehicular ad hoc networks (vanets): status, results, and challenges. Telecommun. Syst. 50(4), 217–241 (2012) 37. X. Zhang, Q. Miao, Y. Li, An adaptive link quality-based safety message dissemination scheme for urban vanets. IEEE Commun. Lett. 22(10), 2104–2107 (2018)
Tech Stack Prediction Using Hybrid ARIMA and LSTM Model Radha SenthilKumar , V. Naveen, M. Sri Hari Balaji, and P. Aravinth
Abstract A tech stack is the set of tools developers use to make an application; it consists of the software applications, frameworks, and programming languages that realize aspects of the program. With the advent of the future tech world, student communities and the developing computer society are eager to lay their hands on new frameworks and tech stacks, and learning a suitable technology increases their chances of industrial growth and career enhancement. It also channels their productivity toward the right choice to get better outcomes. We therefore develop a hybrid model using the autoregressive integrated moving average (ARIMA) and long short-term memory (LSTM) to forecast the growth of popular tech stacks for the upcoming decade, making it feasible to find the technologies on a rising curve with high accuracy. The outcome of the prediction further paves the way for exciting opportunities and paths for growth. The reliability of these predictions rests on data from popular search engines and developer communities such as Stack Overflow.

Keywords Time series forecasting · Tech stack analysis · Machine learning · Grid search method
1 Introduction

Tech stack prediction focuses on bridging the gap between college education and industrial demand in terms of developing students' skills. Learning the right thing channels a student's career goals in the right manner, so we focus mainly on forecasting the growth curves of developing tech stacks over the next decade. Thus, we can foresee the outcome of a specific developing tech stack before trying our hands at it. Many forecasting models have been built for use cases such as weather forecasting and stock price prediction, but giving the student community a helping hand by telling them what to learn will bring rapid change among students and lead to higher productivity.
R. SenthilKumar · V. Naveen (B) · M. Sri Hari Balaji · P. Aravinth Anna University, Madras Institute of Technology, Chennai, Tamil Nadu, India e-mail: [email protected]
Fig. 1 For perspective, the growth of the open source projects and technologies in the past decade
Figure 1 shows the growth of open-source projects and technologies in the past decade. Currently, there is no existing system that tells the student community which developing technologies to focus their productivity on to develop their careers, although there are many other prediction systems that focus on stock markets, gold rate prediction, and financial systems. Taking those approaches into consideration, we built a system to predict the most popular technology to learn in a concerned domain with the help of search tags from popular search engines. Thus, students' productivity can be improved and industrial knowledge obtained through effective self-learning.
2 Related Work

A method to check whether the data shows a trend and whether it is stationary is proposed in [1]; performance estimation is carried out through in-sample and out-of-sample forecasting of the predicted values. The work in [2] observes and analyses data to find trends and uses Box–Jenkins forecasting to project the trend for future years; trend levels over a period of time are categorized as high, low, medium, and no trend. Parameter estimation, and checking whether the data is stationary, are treated in [3]; since the ARIMA model only accepts stationary values, a conversion to a stationary process is carried out. In [4], LSTM is used to analyse and forecast population data for a specific province: from the collected past data, the growth trend of the population is identified to forecast the next set of years. An idea of modeling and forecasting demand in a food company using a time series approach is discussed in [5], where historical demand data is utilized to forecast demand and these forecasts feed supply chain prediction. The work in [6] illustrates stock price prediction carried out with the autoregressive integrated moving average (ARIMA) model because of its wide acceptability; it is applied to various sets of previous data to predict accurate stock prices, focusing on a good model fit for stocks of various sectors. Parameter selection for ARIMA using an iterative method, along with performance metrics to determine the accuracy of the model, is discussed in [7]. An LSTM technique applied over decomposed sub-components for model fitting, with short-term future values forecasted from the obtained model, is discussed in [8]. For the model, the dataset collection proposed in [9] includes the popular search engines, Stack Overflow, and GitHub, with data fetched based on the respective tags. Various methodologies to forecast data with high accuracy and the least error rate are discussed in [10]. The LSTM model uses a decomposable time series model with three major components, trend, seasonality, and holidays, and tries to fit the linear and nonlinear functions as components [11, 12]. Using LSTM machine learning, a Bitcoin dataset is trained and deployed to obtain more accurate in-sample and out-of-sample forecasts of the Bitcoin price in [13]. A combination of ensembles is carried out in [14], where model performance is evaluated to determine the best-fitted model in accordance with its parameters. The work in [15] illustrates that an upgraded version of the extreme learning machine (ELM) can obtain good accuracy and also reduces the classification error.
3 Proposed Work

3.1 Tech Stack Architecture

In predicting tag counts, we consider only the past data over the years, so no external factors contribute. Past values affect future values (autoregression), but a sudden variation in the past data would affect the forecast to a large extent. To overcome the impact of short-term fluctuations, the moving average part of the ARIMA model analyses the data points with respect to the averages of subsets of the data. For data recorded at regular intervals, and for measuring events that happen over a period of time, a hybrid model is a sophisticated choice. The entire architecture of the hybrid tech stack prediction model is shown in Fig. 2. It includes the conversion from non-stationary to stationary data, parameter estimation for the model, fitting the model, and forecasting the predicted values.
Fig. 2 Tech stack architecture for hybrid ARIMA and LSTM model
3.2 Data Collection

The main resource for building our model is the past years' data on search tags from the popular developer site Stack Overflow, which is ranked as the top developer community of the emerging programming world. Programmers regularly turn to Stack Overflow when learning new tech skills and tools. The tech domains the developer community concentrates on can be observed through the tags linked to every question and answer posted in the community. Tag counts give an idea of how many developers are currently posting questions or answers on a tech topic, which supports our goal of visualizing and forecasting the growth of a specific tech domain. These tags can thus precisely summarize how many developers are currently working in a specific domain, yielding a year-by-year view of how the community engaged with it; from this, we can define the popularity of a particular domain in the upcoming years. So far, we have collected 146 rows and 30 columns of data for the initial level of analysis, covering the years 2009 to 2020, and we chose the four most used and popular tech domains.
3.3 Model Selection

We need to forecast specific technologies' growth for the next few years from the data obtained for past years. We also experimented with other models such as long short-term memory (LSTM) alone, Holt–Winters, and ensembles, but we settled on the hybrid ARIMA-LSTM model because it has a lower error rate than the others and also yields greater accuracy. For that purpose, we take a hybrid combination of the autoregressive integrated moving average (ARIMA) and long short-term memory (LSTM) models.

ARIMA Model ARIMA models are a class of models for forecasting a time series using the series' past values. We focus on analyzing the data by plotting charts to see its trend over the years. After that, testing the data for stationarity is a must before feeding the dataset to the ARIMA model. After making sure each attribute is stationary, we identify seasonality in the dependent series and extract knowledge from the autocorrelation, partial autocorrelation, and inverse autocorrelation plots to decide whether any autoregressive or moving average component should be used in the model, while testing whether the estimated model conforms to the specifications of a stationary univariate process. The residuals should be independent of each other and constant in mean and variance over the period of time.

LSTM Model LSTM is an artificial recurrent neural network technique. It is used because it has comparatively longer-term memory than a plain RNN, and it helps mitigate the vanishing gradient problem commonly seen in neural networks. It uses a series of gates contained in memory blocks that are connected through layers. The three types of gates are the input gate, which writes input to the cell; the forget gate, which resets the old cell value; and the output gate, which reads output from the cell. A number of hyperparameters need to be specified for the LSTM model to predict optimally: one layer suffices for simple problems, two for complex features, and more than two layers make the dataset harder to train.

Hybrid Model (ARIMA-LSTM) To successfully identify the behavior of a time series, it is essential to control its two main components: the trend and the cyclical component. The first describes the overall movement, while the second points to periodic fluctuations. The idea is, therefore, to transform the original time series into a smooth function by isolating it from its seasonal component. The residuals from the ARIMA model are trained using LSTM, an artificial recurrent neural network technique. Based on their performance, the ARIMA model is optimal for the seasonal component and the LSTM is effective for the trend. Once this is done, we can cross-check the predicted trend against the reproduced cyclical component to construct a forecast of the original series. The predictions from the three models, viz. ARIMA, LSTM, and the ARIMA-LSTM hybrid, are finally tested on the collected 10-year dataset through evaluation metrics such as RMSE and MAE, and the one with better performance is finalized for further development of the system.
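As a minimal end-to-end sketch of this hybrid, with synthetic stand-in data and an assumed ARIMA order (statsmodels and Keras are used here as illustrative libraries; none of the hyperparameters come from the paper):

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.models import Sequential

# Stand-in monthly tag-count series (the real input is Stack Overflow tags).
rng = np.random.default_rng(0)
series = pd.Series(100 + 2 * np.arange(120) + rng.normal(0, 5, 120))

# Stage 1: ARIMA captures the linear/seasonal structure (order assumed;
# in the paper it is chosen via ACF/PACF plots and grid search, Sect. 3.5).
arima = ARIMA(series, order=(2, 1, 2)).fit()
residuals = arima.resid

# Stage 2: an LSTM learns whatever nonlinear pattern remains in the residuals.
def windows(values, lag=12):
    X = np.array([values[i:i + lag] for i in range(len(values) - lag)])
    return X.reshape(-1, lag, 1), np.array(values[lag:])

X, y = windows(residuals.to_numpy())
lstm = Sequential([LSTM(32, input_shape=(X.shape[1], 1)), Dense(1)])
lstm.compile(optimizer="adam", loss="mse")
lstm.fit(X, y, epochs=100, verbose=0)

# Final forecast = ARIMA forecast + LSTM-predicted residual.
arima_fc = float(arima.forecast(steps=1).iloc[0])
last_window = residuals.to_numpy()[-12:].reshape(1, 12, 1)
resid_fc = float(lstm.predict(last_window, verbose=0)[0, 0])
print("hybrid one-step forecast:", arima_fc + resid_fc)
```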
Fig. 3 Observing data as non-stationary data
3.4 Data Processing

If our dataset is non-stationary, we transform it into stationary data, because non-stationary data are unpredictable and cannot be modeled or forecasted. To make a series stationary, we difference it: we subtract the previous value from the current value. If the data is still not stationary, we may perform a nonlinear transformation such as a log transformation or a square-root transformation.

Test for Stationarity. To check our dataset for stationarity, we look at plots for trends or seasonality and at the results of a statistical test, the augmented Dickey-Fuller (ADF) test, available through the adfuller() function of the statsmodels package. From the p-value returned by adfuller(), we can tell whether the series is stationary. If the p-value is greater than the significance level of 0.05, the time series is non-stationary, as shown in Fig. 3; since ARIMA must be fed stationary data, such a series needs further transformation. If the p-value is less than the significance level of 0.05, the time series is stationary, as shown in Fig. 4.
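The workflow above maps directly onto statsmodels. The snippet below is a hedged sketch: the values in `series` are placeholders for one technology's yearly tag counts, and the 0.05 threshold follows the text.

```python
# Sketch of the ADF stationarity test and differencing described above.
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import adfuller

series = pd.Series([110, 150, 210, 300, 420, 590, 800, 1050, 1330, 1680, 2050, 2500])

def is_stationary(s: pd.Series, alpha: float = 0.05) -> bool:
    """ADF null hypothesis: the series has a unit root (is non-stationary)."""
    p_value = adfuller(s.dropna())[1]
    return p_value < alpha

if not is_stationary(series):
    differenced = series.diff().dropna()  # subtract the previous value
    if is_stationary(differenced):
        series = differenced
    else:
        # Nonlinear transform of the original (positive) series, then difference.
        series = np.log(series).diff().dropna()
```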
Fig. 4 Observing data as stationary data
3.5 Model Building and Parameter Estimation

Before proceeding to the ARIMA model, we examine the autocorrelation of the series indexed in time. A stationary process has the property that its mean, variance, and autocorrelation structure do not change over time; from this we can conclude that the data is stationary when the autocorrelation does not change over time. We then build the ARIMA model for the selected technology domains to forecast the future tag counts and obtain an overview of the best tech stack to learn for building promising applications.

1. For building ARIMA, the parameters p, d, and q are pre-determined to feed the model with the ARIMA order:
   i. p is the order of the autoregressive (AR) term
   ii. q is the order of the moving average (MA) term
   iii. d is the number of differencing steps required to make the time series stationary
2. Order p is the lag value obtained by analyzing the PACF (partial autocorrelation) plot: the lag at which the plot crosses the upper confidence interval for the first time.
3. Order q is obtained from the ACF (autocorrelation) plot in the same way: the lag at which the plot crosses the upper confidence interval for the first time.
Fig. 5 In-sample forecasting for Python technology in programming domain
4. A grid search is also used to find the ARIMA order values that yield the least mean squared error; these order values are used to build the ARIMA model (a minimal sketch of this grid search is given after the list).
5. Once the ARIMA model has produced its forecast, the residuals are evaluated, and the actual differences are fed to the hybrid LSTM model. The overall forecast is then produced based on the obtained residuals.
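A minimal version of the grid search in step 4 might look as follows; the candidate order ranges and the chronological train/test split are illustrative assumptions, not the values actually used in the study.

```python
# Sketch of a grid search over ARIMA orders by out-of-sample MSE (step 4).
import itertools
import warnings
import numpy as np
from sklearn.metrics import mean_squared_error
from statsmodels.tsa.arima.model import ARIMA

def grid_search_order(train, test, p_vals=(0, 1, 2), d_vals=(0, 1), q_vals=(0, 1, 2)):
    best_order, best_mse = None, np.inf
    for order in itertools.product(p_vals, d_vals, q_vals):
        try:
            with warnings.catch_warnings():
                warnings.simplefilter("ignore")
                preds = ARIMA(train, order=order).fit().forecast(steps=len(test))
            mse = mean_squared_error(test, preds)
            if mse < best_mse:
                best_order, best_mse = order, mse
        except Exception:
            continue  # some orders fail to converge; skip them
    return best_order, best_mse

history = np.array([110, 150, 210, 300, 420, 590, 800, 1050, 1330, 1680, 2050, 2500], float)
best_order, best_mse = grid_search_order(history[:-3], history[-3:])
```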
4 Experimental Analysis

With the help of past values, the model is able to predict the upcoming two years using the hybrid model (ARIMA-LSTM). The model is trained up to the previous value to make the next prediction. Figure 5 shows the in-sample forecast for the Python technology in the programming language domain, and Fig. 6 the out-sample forecast for the same technology. Figure 7 shows the in-sample and out-sample forecasts for the Java technology in the programming language domain. Figure 8 compares the forecasts for all technologies in the programming language domain; it clearly indicates that Python will dominate the other languages in the upcoming months. Figure 9 compares the forecasts for all technologies in the frontend domain.
Fig. 6 The rising curve for Python in the programming domain through out-sample forecasting
Fig. 7 In-sample and out-sample forecasting for Java technology in programming domain
Based on the forecast, ReactJS will be the most widely used frontend technology in the upcoming months compared with the other technologies.
5 Performance Metrics

For the analysis of the tech stack prediction, the evaluation metrics used are mean square error (MSE), root mean square error (RMSE), and mean absolute error (MAE). Compared with the other models, the hybrid model (ARIMA-LSTM) has the lowest error rate and also yields greater accuracy.
Fig. 8 Comparison of all technologies in programming domain including past and forecasting data
Fig. 9 Comparison of all technologies in frontend domain including past and forecasting data

Table 1 Performance metrics for Python—programming domain

| Metrics | ARIMA model | LSTM model | Hybrid model (ARIMA-LSTM) |
|---------|-------------|------------|---------------------------|
| MAE | 1129.36 | 1133.54 | 1013.83 |
| MSE | 1936264.65 | 1986264.46 | 1653274.09 |
| RMSE | 1391.49 | 1409.34 | 1285.79 |
The parameters of the ARIMA model are chosen based on the above metrics, and the model is fitted and forecasted with the obtained parameters. Table 1 compares all the evaluation metrics for the hybrid-model forecast of the Python technology; it is clearly evident that the hybrid model yields a lower error rate than the other models. The technologies studied in each domain are:

Frontend: Angular, Node.js, Vue.js, jQuery, React
Backend: MongoDB, PostgreSQL, MySQL, SQLite, Elasticsearch
Machine Learning: NumPy, Pandas, MATLAB, PyTorch, Keras
Programming Language: Python, Java, R, C++, Kotlin
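For reference, the error metrics in Table 1 can be computed as follows; the `actual` and `predicted` arrays are illustrative stand-ins, not values from the study.

```python
# How MAE, MSE and RMSE (Table 1) are computed for any model's forecasts.
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

actual = np.array([21000.0, 22400.0, 24100.0, 26000.0])     # placeholder values
predicted = np.array([20210.0, 23050.0, 23480.0, 27210.0])  # placeholder values

mae = mean_absolute_error(actual, predicted)
mse = mean_squared_error(actual, predicted)
rmse = np.sqrt(mse)
print(f"MAE={mae:.2f}  MSE={mse:.2f}  RMSE={rmse:.2f}")
```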
6 Limitations of the Model

The hybrid model, based on its performance and accuracy, worked well on a small amount of data, and its predictions on unseen data correlate well with the tuned model's predictions. The limitation lies in the collection of data: even though the model's forecasts fit reality quite well, we have a very small volume of data. At present, the hybrid model can predict accurately up to two years ahead; as we collect more data from different sources and communities, we can feed the model multivariate data for the further development process.
7 Conclusion

In this paper, we dealt with forecasting the growth of popular technologies in each domain. Predicting which technology will flourish in the next decade allows student developer communities to spend their free time learning it, increasing their skill set and job opportunities. Through our proposed hybrid model, we can pick out one technology in each domain that is going to be trendy by evaluating its current state. The proposed system is based on a specific search engine's search tag counts analyzed over a specific time. In addition, an asynchronous task processing system can be implemented to expose the forecasting models to the front end for user interaction.
Deceptive News Prediction in Social Media Using Machine Learning Techniques

Anshita Malviya and Rajendra Kumar Dwivedi
Abstract Our society has witnessed numerous incidents concerning fake news, and social media has always played a significant role in them. People with notorious mindsets are often the generators and spreaders of such incidents; these mischievous people spread fake news without realizing the effect it has on naive people, who believe it and start behaving accordingly. Fake news appeals to our emotions: it plays with our feelings and can make us angry, happy or scared, and it can lead to hatred or anger towards a specific person. Nowadays, people easily befool each other using social media as a tool to spread fake news. In this paper, machine learning-based models are proposed for the detection of such deceptive news, which creates disturbances in our society. The models are implemented in Python using logistic regression, multinomial naive Bayes, passive-aggressive classifier and multinomial classifier with hyperparameter algorithms. Results show that the logistic regression algorithm outperforms the others in terms of accuracy in detecting fake news.

Keywords Logistic regression · Multinomial NB algorithm · Passive-aggressive classifier · Multinomial classifier with hyperparameter · Count vectorizer · Tfidf vectorizer · Hash vectorizer
A. Malviya (B) · R. K. Dwivedi, Department of Information Technology and Computer Application, MMMUT Gorakhpur, Gorakhpur, India

1 Introduction

Social media is an ineluctable part of our community, yet the news sources present on it cannot be trusted. Nowadays, these social platforms are a medium for spreading fake news consisting of forged stories, vague quotes, facts and sources. These hypothetical and spurious stories are used to influence people's opinions on any issue. The term fake news amalgamates different conceptions such as misinformation, disinformation and mal-information. For the past few years, fake news has spread mainly through social media platforms like Twitter, Facebook,
WhatsApp and YouTube, in the form of videos, memes, advertisements, imposing content and many more. This has become a serious issue, as it causes serious crimes and affects the peace and brotherhood among people.

As noted, fake news comprises misinformation, disinformation and mal-information, and this information goes viral because of the overload of information on social media and its users; the intent of the people or the medium of transfer determines the kind of fake news. Misinformation means false information: news that is not true. Public opinion is changed using this form of fake news. Intentionally changing real news to fake is referred to as disinformation, which can be dangerous on social platforms because a large amount of news is present there and many people use social media. Mal-information refers to true news that, when it goes viral, causes harm to a society, organization or community. All three together constitute fake news, and they are circulated for fun or for political or business interests. People do not give a second thought to the news they read on social media; because the processing capability of our brain is limited, it makes judgements according to previous issues or facts.

During the COVID-19 pandemic, much misinformation was present on social media, including home remedies that were not authentic, false advisories and conspiracy theories. There was news about a financial crisis being foisted on India during the pandemic, which was fact-checked by the Press Information Bureau on 24 March 2020; even our Prime Minister requested people not to believe these rumours and to stay calm and motivated. A swamp of fake news was seen on social media when the government brought in the CAA act in 2019: people thought that their citizenship would be taken away because of this act, which was not true, and the Supreme Court of India asked the government to explain the act to citizens and remove their doubts and misconceptions. Spreading of fake news is also seen at election time.

There are various ways by which fake news can be identified. Readers' emotions are influenced by fake news, which leads to anger and disputes. If the website address is fake, the author is anonymous, the source of the news is misrepresented or the article is grammatically incorrect, these all clearly indicate that the news is fake. People should also check the publication date of an article to establish the authenticity of the news. We readers can also take action against misinformation spread on social media: if we see a post trending with misinformation, we should report it, or we can report the people who spread it and create disturbance in the society. We could be the editors, finding the truth behind articles and protecting our community from this problem.

The rest of the paper is organized as follows. Section 2 presents the literature survey related to deceptive news prediction using machine learning algorithms. The proposed methodology is presented in Sect. 3. Section 4 gives an overview of the machine learning techniques used in the prediction of fake news. The experimental work is explained in Sect. 5, and the conclusion and future directions are given in Sect. 6.
2 Literature Survey

This section presents a detailed review of the detection of fake news and the classification of fake and real news on social media platforms. Researchers have shown great interest in this area over the past few years.

Mandical et al. [1] discussed the problem of fake news spreading worldwide and the importance of machine learning in its detection. They proposed a fake news classification system using three machine learning algorithms (naïve Bayes, passive-aggressive classifier and deep neural networks) on eight datasets; with the correct approach, the task of identifying fake news can be made straightforward.

Ksieniewicz et al. [2] described two methods of identifying fake news: fact-checking, which involves volunteers, and intelligent systems. They used a different method for detecting false news involving a stream data classification approach and concept drift, evaluated on benchmark data.

Bhogade et al. [3] focused on how the growth and popularity of social media contribute to the spread of false news, disturbing people's positive mindset. The authors collected various news stories and discussed the use of natural language processing, machine learning and artificial intelligence models for predicting fake news and checking the authenticity of news stories.

Ashtaputre et al. [4] proposed a model involving machine learning techniques and natural language processing for fake news detection. They compared different classification models and techniques to determine the best among them, using TFIDF vectorization for data preprocessing.

Baarir and Djeffal [5] described the wide spread of illegal information in the form of news on social media as a current threat to mankind; the detection and prevention of fake news is one of the most interesting research topics of the present time. The tfidf vectorizer was used for data preprocessing, and a support vector machine classifier for building the detection system.

Bharath et al. [6] explored five machine learning algorithms, including logistic regression, support vector machines, naïve Bayes and recurrent neural network models, and evaluated their effectiveness for the fake news detection task. They also performed sentiment analysis and concluded that naïve Bayes and SVM were the best among the methods used.

Sharma et al. [7] elaborated on the consequences and disadvantages of using social media platforms, which create tendentious opinions among people on any issue. The authors performed binary classification of different news articles using artificial intelligence, natural language processing and machine learning.

Ahmed et al. [8] proposed techniques to check the authenticity of news and concluded that machine learning methods are reliable for detecting fake news. They evaluated the accuracy of their proposed model against other systems by applying
various combinations of machine learning algorithms such as support vector machine, passive-aggressive classifier, logistic regression and naïve Bayes.

Kumar et al. [9] presented the significance of natural language processing techniques for classifying news as fake or real. They used text classification with various classification models to predict the result and concluded that the LSTM application was the best among all models used.

Waghmare and Patnaik [10] showed their concern for developing a technique to predict fake news and control its widespread transmission. They used a blockchain framework with machine learning methods for the detection and classification of social news; the blockchain framework is important for revoking the illegal gains of spreading false news.

Khivasara et al. [11] proposed a web-based augmentation to provide readers with the authenticity of content. Deep learning models are used for the web augmentation; LSTM and GPT-2 are the two algorithms used by the authors to distinguish real from fake news articles.

Mishra [12] found influence associations among readers by proposing a HiMap model for fake news detection, capturing direct and indirect associations among readers. Experiments were performed on two Twitter datasets, and accuracy was calculated for state-of-the-art models.

Shaikh and Patil [13] measured the accuracy of fake social news detection with various machine learning approaches. Many fields, such as politics and education, are influenced by the spread of fake news, and limited resources make the removal of fake news complicated. TFIDF vectorization was used for feature extraction.

Chokshi and Mathew [14] reported the percentage of users tricked by the spread of fake news: in 2020, 95% of the people protesting against the CAA law thought that their citizenship would be taken away, becoming victims of this problem. The authors used different deep learning methods, including artificial neural network and convolutional neural network architectures, to address it.

Wang et al. [15] investigated issues related to the prediction of fake news. Because models based on deep learning were not capable of tackling the dynamic nature of articles on social media, they proposed a novel set-up consisting of a fake article detector, an annotator and a reinforced selector, which improved the efficiency of finding fake news.

Lee et al. [16] proposed a deep learning architecture to detect fake news in Korean articles. Sentences written in Korean are shorter than English sentences, which creates problems; therefore, different CNN-based deep learning architectures were used to resolve the issue.

Qawasmeh et al. [17] acknowledged that the detection of fake news is tougher than the text-based analysis of traditional approaches, and classical machine learning methods have been seen to be less efficient than neural network models. They used the latest machine learning approaches for the automatic identification of fake news.
Agarwalla et al. [18] discussed people's intentions in spreading fake news and creating disturbance in society. They tried to find an efficient and relevant model for fake news prediction using classification algorithms and natural language processing.

Han and Mehta [19] used naïve Bayes and hybrid CNN and RNN approaches to mitigate the problem of fake news. They compared machine learning and deep learning approaches to find accurate systems for false news detection.

Manzoor and Singla [20] described the success of fake news detection using various machine learning algorithms, noting that the ongoing change in the features and orientation of fake news on different social media platforms can be addressed by deep learning approaches such as CNN, deep Boltzmann machine, deep neural network, natural language processing and many more.
3 Proposed Methodology

The methodology used to build models based on different machine learning algorithms, analyse model accuracy, and distinguish fake from true news consists of the following algorithm, also depicted in Fig. 1.

Algorithm: Deceptive news prediction
Input: News dataset
Output: Accuracy of different models for predicting fake news
Begin
Step 1: Take a news dataset consisting of fake and real news articles
Step 2: Preprocess the data
Step 3: Split the data into train and test datasets
Step 4: Design and train the models with machine learning algorithms like logistic regression, multinomial naïve Bayes, etc.
Step 5: Test the models
Step 6: Analyse the models
End
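As a hedged end-to-end sketch of this algorithm (not the authors' exact code), the pipeline below pairs one of the vectorizers with one of the classifiers used later in the paper; the file name train.csv and the column names text and label follow the Kaggle dataset described in Sect. 5.1, and the 33% test split follows Sect. 5.3.

```python
# Compact sketch of the proposed pipeline: load, preprocess, split, train, test.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

df = pd.read_csv("train.csv").dropna(subset=["text", "label"])  # assumed file name

X_train, X_test, y_train, y_test = train_test_split(
    df["text"], df["label"], test_size=0.33, random_state=42)

model = make_pipeline(TfidfVectorizer(stop_words="english"),
                      LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))
```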
4 Selection of Machine Learning Techniques

Machine learning, a subset of artificial intelligence, is so versatile today that we use it several times a day without even knowing it.
Fig. 1 Methodology for fake news detection (flowchart: Start → get the news dataset → preprocess the dataset → split into train and test datasets → design and train the models → test the models → analyse the accuracy of the models → Stop)
We cannot imagine this world without machine learning, given how much we have already gained from it and will gain in the future. Learning is a native behaviour of living beings: they gain new knowledge from their surroundings and modify it through experiences, both joys and hurdles, that come their way. Simulating this learning ability of living beings in machines is what we know as machine learning. The machine learning algorithms used in the experimental work are discussed below.
4.1 Logistic Regression (LR)

Logistic regression is a supervised machine learning technique and one of the most widely used algorithms. The output of a categorical dependent attribute is predicted with the help of a set of independent attributes, and the result of the prediction is a probabilistic value between 0 and 1. It is used to solve classification problems. An S-shaped logistic (sigmoid) function takes the place of a regression line for predicting the two values (0 or 1), and classification can be performed on both continuous and discrete datasets. Logistic regression uses the concept of a threshold value. Its two main assumptions are that the dependent attribute is categorical and that there is no multi-collinearity among the independent variables. There are three types of logistic regression: binomial, multinomial and ordinal. The equation of logistic regression is

log[y/(1 − y)] = b0 + b1x1 + b2x2 + ··· + bnxn
4.2 Multinomial Naïve Bayes (MNB)

The multinomial naïve Bayes algorithm is commonly used in natural language processing and is a probabilistic learning approach based on Bayes' theorem. The probability of each class is calculated for a given item, and the output is the class with the highest probability. Many algorithms fall under naïve Bayes, all sharing the principle that each attribute is independent of the others. Bayes' theorem is

P(A|B) = P(A) × P(B|A) / P(B)

The advantages of multinomial NB are that it is easy to implement, can be used for continuous and discrete data, predicts real-time applications simply, and can handle huge datasets. The algorithm is not suitable for regression; it is suitable for textual data classification rather than predicting numerical data.
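A toy illustration of multinomial naive Bayes on word-count features follows; the two-document corpus and its labels are invented purely for the example.

```python
# Multinomial naive Bayes on bag-of-words counts (toy example).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

docs = ["breaking exclusive shocking secret claim",
        "official report confirms policy update"]
labels = [1, 0]  # 1 = fake, 0 = real (made-up labels)

vec = CountVectorizer()
clf = MultinomialNB().fit(vec.fit_transform(docs), labels)
print(clf.predict(vec.transform(["shocking exclusive claim spreads"])))
```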
4.3 Passive-Aggressive Classifier (PAC)

The passive-aggressive classifier is a machine learning algorithm that is not very popular among enthusiasts, yet it is very efficient for various applications such as the detection of fake news on social media. The algorithm is similar to a perceptron model: it uses a regularization parameter but no learning rate. The term passive refers
to making no changes to the model when the prediction is correct, while aggressive refers to updating the model when the prediction is incorrect. It is a classification algorithm for online learning, one of the categories of machine learning alongside supervised, unsupervised, batch, instance-based and model-based learning. A system can be trained with the passive-aggressive classifier by feeding it instances incrementally: continuously, sequentially, individually or in small batches.
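The incremental behaviour described above can be sketched with scikit-learn's PassiveAggressiveClassifier and its partial_fit method; the random batches below are synthetic placeholders for streaming news features.

```python
# Sketch of online (incremental) learning with the passive-aggressive classifier.
import numpy as np
from sklearn.linear_model import PassiveAggressiveClassifier

clf = PassiveAggressiveClassifier(C=0.5)
classes = np.array([0, 1])  # all labels must be declared for incremental learning

rng = np.random.default_rng(0)
for _ in range(5):                                # instances arrive in small batches
    X_batch = rng.normal(size=(32, 10))           # placeholder features
    y_batch = (X_batch[:, 0] > 0).astype(int)     # placeholder labels
    clf.partial_fit(X_batch, y_batch, classes=classes)
```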
4.4 Multinomial Classifier with Hyperparameter (MCH)

The multinomial classifier with hyperparameter is a naïve Bayes algorithm that is not widely known among machine learning enthusiasts. It is mostly suitable for text data and involves tuning the multinomial naïve Bayes algorithm with a hyperparameter.
5 Experimental Work

5.1 Data Collection

The first step in developing the classification model is collecting data. The goodness of a predictive model depends on the quality and quantity of the data collected, which makes this one of the most important steps in developing a machine learning model. The news dataset is taken from the Kaggle repository. The dataset consists of 20,800 instances and five attributes, namely ID, title, author, text and label. The ID attribute is a unique identifier for the news article, the title attribute gives the title of the news, the author attribute gives the name of the article's author, the text attribute contains the entire news story, and the label attribute indicates the authenticity of the article as zero or one. Five instances (a portion) of the dataset are shown in Fig. 2.
Fig. 2 Five instances of the dataset
5.2 Preprocessing of the Fake News Dataset

In data preprocessing, we take the text attribute from the dataset, which comprises the actual news articles. To make the model more predictive, we modify this text attribute so that more information can be extracted. This is done using the nltk library. First, we remove the stopwords present in each article; stopwords connect words and indicate the tense of sentences, carry little meaning in context, and can therefore be removed. Tokenization is then performed, followed by vectorization. Vectorization is a natural language processing technique in which words are mapped to vectors of real numbers for semantic prediction. Three vectorization techniques are used in this paper, namely the count, hash and tfidf vectorizers; these vectorizers extract features from the text in order to build the models. Tfidf stands for term frequency-inverse document frequency; this vectorizer transforms text into a significant numerical representation and is very commonly used to prepare inputs for machine learning algorithms. The count vectorizer transforms a given text into a vector based on the occurrence of each word in the text and is helpful when there are multiple texts. The hashing vectorizer uses hashing to map string tokens to integers, transforming a collection of documents into a sparse matrix of token counts.
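A condensed sketch of this preprocessing step is given below; the two example articles are placeholders, and only the pieces named in the text (nltk stopwords plus the three scikit-learn vectorizers) are used.

```python
# Stopword removal with nltk, then the three vectorizers described above.
import nltk
from nltk.corpus import stopwords
from sklearn.feature_extraction.text import (CountVectorizer, HashingVectorizer,
                                             TfidfVectorizer)

nltk.download("stopwords", quiet=True)
stop_words = set(stopwords.words("english"))

articles = ["The government has announced a new policy today",
            "Shocking claim spreads rapidly across social platforms"]
cleaned = [" ".join(w for w in doc.lower().split() if w not in stop_words)
           for doc in articles]

X_tfidf = TfidfVectorizer().fit_transform(cleaned)
X_count = CountVectorizer().fit_transform(cleaned)
X_hash = HashingVectorizer(n_features=2 ** 10).fit_transform(cleaned)
```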
5.3 Design and Train/Test the Models

Before building the models, we divided the dataset into two parts, namely the train and test datasets. The train dataset consists of 67% of the total instances and the test dataset of 33%. The text attribute is the independent attribute used for training the models, and the label attribute is the dependent one.
5.4 Analyse the Models

We evaluated the models using the confusion matrix and accuracy report. Table 1 compares the accuracy of the different models under the three types of vectorizers. When the data is preprocessed with the tfidf vectorizer, the highest accuracy is achieved by the logistic regression and passive-aggressive classifier models. The logistic regression model gives the best performance, 95% and 93%, when its data is preprocessed using the count vectorizer and hash vectorizer, respectively.
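The evaluation itself reduces to two scikit-learn calls; the label vectors below are tiny placeholders for a model's real test outputs.

```python
# Accuracy and confusion matrix, as reported in Tables 1 and 2.
from sklearn.metrics import accuracy_score, confusion_matrix

y_test = [1, 0, 1, 1, 0, 0, 1, 0]  # placeholder true labels
y_pred = [1, 0, 1, 0, 0, 0, 1, 1]  # placeholder predictions

print("accuracy:", accuracy_score(y_test, y_pred))
print(confusion_matrix(y_test, y_pred))  # rows: true label, columns: predicted
```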
Table 1 Comparison of accuracy

| S. No. | Machine learning algorithms | Tfidf vectorizer | Count vectorizer | Hash vectorizer |
|--------|-----------------------------|------------------|------------------|-----------------|
| 1. | Multinomial naïve Bayes | 0.900 | 0.898 | 0.876 |
| 2. | Passive-aggressive classifier | 0.951 | 0.935 | 0.925 |
| 3. | Multinomial classifier with hyperparameter | 0.900 | 0.898 | 0.876 |
| 4. | Logistic regression | 0.950 | 0.949 | 0.926 |
Figure 3 depicts the accuracy report for each algorithm, in which the X-axis represents the machine learning techniques and the Y-axis their accuracy, and Fig. 4 shows the accuracy of the algorithms versus the vectorization techniques, in which the X-axis shows the different vectorization techniques and the Y-axis their accuracy.
Fig. 3 Accuracy of ML techniques (accuracy report of MNB, PAC, MCH and LR under the Tfidf, Count and Hashing vectorizers)
Fig. 4 Accuracy of algorithms versus vectorizers (accuracy of the four models grouped by Tfidf, Count and Hashing vectorization)
Table 2 Confusion matrix (rows give the true label; columns the predicted label)

| Vectorization | Machine learning algorithm | True label | Fake news | Real news |
|---------------|----------------------------|------------|-----------|-----------|
| Tfidf vectorizer | Multinomial naïve Bayes | Fake news | 3238 | 151 |
| | | Real news | 453 | 2193 |
| | Passive-aggressive classifier | Fake news | 3233 | 156 |
| | | Real news | 142 | 2504 |
| | Multinomial classifier with hyperparameter | Fake news | 3237 | 152 |
| | | Real news | 448 | 2198 |
| | Logistic regression | Fake news | 3259 | 130 |
| | | Real news | 170 | 2476 |
| Count vectorizer | Multinomial naïve Bayes | Fake news | 3136 | 253 |
| | | Real news | 364 | 2282 |
| | Passive-aggressive classifier | Fake news | 3168 | 221 |
| | | Real news | 170 | 2476 |
| | Multinomial classifier with hyperparameter | Fake news | 3136 | 253 |
| | | Real news | 364 | 2282 |
| | Logistic regression | Fake news | 3236 | 153 |
| | | Real news | 155 | 2491 |
| Hashing vectorizer | Multinomial naïve Bayes | Fake news | 3297 | 92 |
| | | Real news | 659 | 1987 |
| | Passive-aggressive classifier | Fake news | 3155 | 234 |
| | | Real news | 216 | 2430 |
| | Multinomial classifier with hyperparameter | Fake news | 3293 | 96 |
| | | Real news | 656 | 1990 |
| | Logistic regression | Fake news | 3183 | 206 |
| | | Real news | 242 | 2404 |
Table 2 presents the confusion matrices; a confusion matrix is used to compare the performance of a classification model on test data for which the true values are known.
6 Conclusions and Future Directions

In this paper, machine learning models such as the logistic regression model, multinomial naïve Bayes model, passive-aggressive classifier model and
multinomial classifier with hyperparameter are developed to predict fake news. Results show that overall the logistic regression model gives the best accuracy among all the proposed models with all the data preprocessing algorithms used for the detection of deceptive news spreading on social platforms and influencing human behaviour. It is also found that with the tfidf vectorizer, the passive-aggressive classifier model gives the best result with 95.1% accuracy. As future work, more experiments can be done with additional machine learning algorithms, preprocessing techniques and datasets to find an efficient system for detecting fake news.
References

1. R.R. Mandical, R. Monica, N. Mamatha, A.N. Krishna, N. Shivakumar, Identification of fake news using machine learning (IEEE, 2020). ISBN: 978-1-7281-6828-9/20
2. P. Ksieniewicz, P. Zyblewski, M. Choras, R. Kozik, A. Giełczyk, M. Wozniak, Fake news detection from data streams (IEEE, 2020). ISBN: 978-1-7281-6926-2/20
3. M. Bhogade, B. Deore, A. Sharma, O. Sonawane, M.S. Changpeng, A research paper on fake news detection. Int. J. Adv. Sci. Res. Eng. Trends 6(6) (2021). ISSN (Online): 2456-0774. https://doi.org/10.51319/2456-0774.2021.6.0067
4. P. Ashtaputre, A. Nawale, R. Pandit, S. Lohiya, A machine learning based fake news content detection using NLP. Int. J. Adv. Sci. Technol. 29(7), 11219–11226 (2020)
5. N.F. Baarir, A. Djeffal, Fake news detection using machine learning, in 2020 2nd International Workshop on Human-Centric Smart Environments for Health and Well-being (IHSH). ISBN: 978-1-6654-4084-4/21
6. G. Bharath, K.J. Manikanta, G.B. Prakash, R. Sumathi, P. Chinnasamy, Detecting fake news using machine learning algorithms, in 2021 International Conference on Computer Communication and Informatics (ICCCI 2021), Coimbatore, 27–29 Jan 2021 (IEEE). ISBN: 978-1-7281-5875-4/21
7. U. Sharma, S. Saran, S.M. Patil, Fake news detection using machine learning algorithms. Int. J. Eng. Res. Technol. (IJERT), NTASU 2020 conference proceedings (2021). ISSN: 2278-0181
8. S. Ahmed, K. Hinkelmann, F. Corradini, Development of fake news model using machine learning through natural language processing. World Acad. Sci. Eng. Technol. Int. J. Comput. Inf. Eng. 14(12) (2020)
9. K.A. Kumar, G. Preethi, K. Vasanth, A study of fake news detection using machine learning algorithms. Int. J. Technol. Eng. Syst. (IJTES) 11(1), 1–7 (2020). ISSN: 0976-1345
10. A.D. Waghmare, G.K. Patnaik, Fake news detection of social media news in blockchain framework. Indian J. Comput. Sci. Eng. (IJCSE) 12(4) (2021). https://doi.org/10.21817/indjcse/2021/v12i4/211204151. e-ISSN: 0976-5166, p-ISSN: 2231-3850
11. Y. Khivasara, Y. Khare, T. Bhadane, Fake news detection system using web-extension, in 2020 IEEE Pune Section International Conference (PuneCon), Vishwakarma Institute of Technology, Pune, India, 16–18 Dec 2020 (IEEE, 2020). ISBN: 978-1-7281-9600-8/20
12. R. Mishra, Fake news detection using higher-order user to user mutual-attention progression in propagation paths, in Computer Vision Foundation 2020 Workshop
13. J. Shaikh, R. Patil, Fake news detection using machine learning, in International Symposium on Sustainable Energy, Signal Processing and Cyber Security (iSSSC) (IEEE, 2020). ISBN: 978-1-7281-8880-5/20. https://doi.org/10.1109/iSSSC50941.2020.93588
14. A. Chokshi, R. Mathew, Deep learning and natural language processing for fake news detection: a survey, in International Conference on IoT Based Control Networks and Intelligent Systems (ICICNIS) (2020)
15. Y. Wang, W. Yang, F. Ma, J. Xu, B. Zhong, Q. Deng, J. Gao, Weak supervision for fake news detection via reinforcement learning. Assoc. Adv. Artif. Intell. (2020). www.aaai.org
16. D.H. Lee, Y.R. Kim, H.J. Kim, S.M. Park, Y.J. Yang, Fake news detection using deep learning. J. Inf. Process Syst. 15(5), 1119–1130 (2019). ISSN: 1976-913X (Print), 2092-805X (Electronic). https://doi.org/10.3745/JIPS.04.0142
17. E. Qawasmeh, M. Tawalbeh, M. Abdullah, Automatic identification of fake news using deep learning, in Sixth International Conference on Social Networks Analysis, Management and Security (SNAMS) (IEEE, 2019). ISBN: 978-1-7281-2946-4/19
18. K. Agarwalla, S. Nandan, V.A. Nair, D.D. Hema, Fake news detection using machine learning and natural language processing. Int. J. Recent Technol. Eng. (IJRTE) 7(6) (2019). ISSN: 2277-3878
19. W. Han, V. Mehta, Fake news detection in social networks using machine learning and deep learning: performance evaluation, in International Conference on Industrial Internet (ICII) (IEEE, 2019). ISBN: 978-1-7281-2977-8/19. https://doi.org/10.1109/ICII.2019.00070
20. S.I. Manzoor, D.J.N. Singla, Fake news detection using machine learning approaches: a systematic review, in Proceedings of the Third International Conference on Trends in Electronics and Informatics (ICOEI 2019). ISBN: 978-1-5386-9439-8
Several Categories of the Classification and Recommendation Models for Dengue Disease: A Review

Salim G. Shaikh, B. Suresh Kumar, and Geetika Narang
Abstract Dengue fever is becoming more common with each passing year. To control the disease, it is necessary to conduct a complete analysis of the dengue-affected regions and the condition's symptoms. Dengue fever is caused by a family of viruses called Flaviviridae, with four genetic variants spread by the bite of infected Aedes mosquitoes. Dengue fever affects over 2.5 billion people worldwide, with approximately 100 million new cases reported each year, and its global prevalence has risen considerably in recent years. Upward of 100 countries in the Americas, East Asia, the western Pacific, Africa, and the eastern Mediterranean now have the disease. In this paper, the various signs and symptoms of the dengue virus are discussed, and dengue classification and recommendation-based techniques are surveyed and analyzed. Different recommendation and classification models such as artificial neural network (ANN), support vector machine (SVM), ensemble learning, random forest, and decision tree are compared with the help of performance metrics such as accuracy, specificity, and sensitivity rate.

Keywords Dengue fever · Artificial neural network · Support vector machine
S. G. Shaikh (B), Department of CSE, Amity University Jaipur, Jaipur, India
B. Suresh Kumar, Department of CSE, SGU Kolhapur, Kolhapur, India
G. Narang, Department of CSE, TCOER, Pune, India

1 Introduction

Dengue fever is a mosquito-borne infection caused by the DENV virus and spread by the Aedes aegypti mosquito. According to the World Health Organization (WHO), around 4.2 million suspected dengue cases were identified globally in 2019. Earlier, the same organization released an advisory designating dengue fever
as one of the most dangerous infections of the year. In 2019, a large epidemic of dengue fever occurred throughout Brazil, with a 149% increase in prevalence in certain regions, the majority of cases being due to a single virus type (DENV-2) [1]. Aedes aegypti mosquitoes replicate in wet conditions and prefer to breed in tropical environments with high rainfall. To prevent infection, governments fund public information campaigns encouraging people to properly dispose of tires and bottles outdoors, because these can collect water and provide a suitable long-term habitat for mosquitoes. Dengue fever affects over 2.5 billion people worldwide, with approximately 100 million new cases reported each year. Dengue fever's global prevalence has risen considerably in recent years; upward of 100 countries in the Americas, East Asia, the western Pacific, Africa, and the eastern Mediterranean now have the disease, with the eastern Mediterranean, Africa, the western Pacific, the Americas, and Southeast Asia perhaps the most severely affected [2].

Dengue fever is also known as dandy fever, dengue hemorrhagic fever, or breakbone fever. It is an infectious disease caused by a virus spread by mosquitoes; the mosquito species that transmits the dengue virus also spreads chikungunya, yellow fever, and the Zika virus. Dengue fever is spread through the bite of a dengue virus-infected Aedes mosquito: whenever a mosquito bites an individual with the infectious agent in their bloodstream, the mosquito becomes contaminated. The disease is not contagious and cannot be passed from one individual to the next.

Signs and symptoms of dengue often appear 4-6 days after infection and can continue for up to ten days. Sudden fever with high temperature is the most common sign; other symptoms include unbearable headaches, ache behind the eyes, excruciating joint and muscular pain, fatigue, sickness, and vomiting, with rashes emerging on the skin 2-5 days after the onset of fever [3], along with moderate bleeding signs such as easy bruising, nose bleeds, and bleeding gums. Signs and symptoms can be mild and misinterpreted as the flu or other chronic conditions. The virus tends to affect young children and people who have never had it before more gently than older children and young adults; yet significant issues can arise. Dengue hemorrhagic fever is a clinical syndrome marked by a high temperature, damage to lymphatic and blood vessels, bleeding from the nose and mouth, hepatic swelling, and cardiovascular malfunction. Excessive bleeding, shock, and mortality may follow; this condition is known as dengue shock syndrome (DSS). Dengue hemorrhagic sickness is more common in immunocompromised patients and those with a second or subsequent infection [4].

Over the last two decades, several machine learning (ML) approaches have been applied in various areas, including geography, the ecosystem, and health care, to extract significant results from complex and heterogeneous datasets. Machine learning, unlike statistical approaches, allows the combination of a substantial number of variables, the modeling of multiple interrelationships among different factors, and the fitting of complex systems without assuming a functional form (e.g., linear, exponential, or logistic).
In dengue-forecasting studies, support vector machines, artificial neural networks, decision trees, ensemble learning, and random forest are common machine learning-based prediction techniques.
The sections of this paper are organized as follows: a brief introduction and history of dengue are given in Sect. 1. In Sect. 2, various existing methods for dengue prediction are surveyed. Problems with dengue prediction, along with the different types of data sources for dengue classification and prediction systems, are described in Sect. 3. Section 4 explains the various recommendation and classification models for dengue disease, and Sect. 5 concludes the paper.
2 Literature Review

Lee et al. [5] studied the virus's dynamic dissemination structure from the perspective of a stochastic process. With cross-infection of individuals and mosquitoes in multiple disease phases, the authors determined the parameters of an epidemiological compartment system and investigated the atmospheric and insect factors that might influence the pandemic in order to anticipate dengue cases and outbreaks. They used time-series data from metropolitan areas of 4.7 million people, a finite-difference framework with variance decomposition, the Markov chain Monte Carlo technique, and an integral boundary technique to analyze the epidemic intervals of dengue infection under the influence of environmental variables.

Gangula et al. [6] designed a classification model to identify the key traits that propagate dengue fever, observing that machine learning is among the essential methodologies in modern assessment and that numerous strategies are in clinical use. Since dengue fever is among the most dangerous infections, it necessitates optimistic predictions by a well-trained model; in a hybrid implementation, the authors used an ensemble machine learning approach to uncover the variables linked to the transmission of dengue fever and boost efficiency.

Silitonga et al. [7] presented a random forest classification technique with a tenfold cross-validation framework to generate a more accurate and consistent model that predicts the clinical severity of dengue within the critical phase when laboratory attribute results are known, as it produced the maximum accuracy (58%) among the classification models. The goals of the study were to measure the performance of models built with an artificial neural network classifier and a random forest classifier independently on a given dataset, and to discover the best-performing classifier; its results would be employed in creating machine learning systems that forecast the clinical severity of dengue fever in the critical stage when testing parameters are available.

Chakraborty and Chandru [8] developed a strategic framework for dengue detection. Several contemporary dengue-forecasting models take advantage of established correlations involving environmental and socio-demographic parameters and transmission rates, but they are not dynamic enough to account for the rapid and
abrupt growth and decrease in occurrence count data. The authors developed a nonparametric, adaptable probability distribution process and proposed a vector error correction model based on previous dengue occurrence counts and potential climatological confounders. They demonstrated that the proposed method outperforms other methodological approaches and showed that it is a brilliant strategy and robust framework for health experts.

Balamurugan et al. [9] designed an innovative feature selection method, the entropy weighted score-based optimal ranking algorithm. The proposed framework proved to be a valuable and effective technique for healthcare prediction and diagnosis, with outstanding feature selection for quickly identifying the attributes (components) accountable for the condition's primary causes. The dengue dataset was constructed by gathering clinical laboratory test results of adult and paediatric patients as real-time samples from several medical clinics in Tamil Nadu's Thanjavur region. The prediction model followed a statistical methodology, and the project's outcomes surpass conventional systems.

Mussumeci and Coelho [10] presented a machine learning algorithm to anticipate weekly dengue prevalence in 790 Brazilian cities. The algorithm includes feature selection via the least absolute shrinkage and selection operator (LASSO), random forest regression, and a deep recurrent neural network with long short-term memory (LSTM). To identify the geographic dimension of illness propagation, the authors employed a multidimensional statistical model as classifier and time-series data from comparable cities; the LSTM recurrent neural network outperformed all others in forecasting future dengue outbreaks in towns of various shapes and sizes.

Table 1 summarizes these existing techniques with their research gaps and performance metrics, and Table 2 lists the different methods of dengue prediction with their merits and demerits.
3 Issues Occurred in Dengue Fever

Dengue fever is a significant global emerging sickness that puts a heavy strain on the medical systems of affected countries, necessitating a high-performance machine learning prediction model. It is hard to distinguish dengue from other familiar fulminant chronic conditions before abnormalities appear, so a fast and inexpensive approach is desperately required to promote timely detection, both to improve patient treatment and to enable the efficient use of available assets, as well as to identify patients at elevated risk of adverse effects. It could also be advantageous to analyze early disease attributes with commonly available diagnostic procedures across the wide variety of dengue illnesses encountered in practice, to build a robust case for dengue in the construction of predictive classifiers [11, 12]. Timely identification of infections and an early response to looming epidemics may be tremendously beneficial in reducing the number of dengue illnesses globally.
Table 1 Dengue prediction and recommendation existing models

| Author's year | Proposed method | Gap/problem definition | Performance metrics |
|---------------|-----------------|------------------------|---------------------|
| Lee et al. (2021) [5] | Vector compartment technique for dengue prediction | Computational time and memory consumption are high | Accuracy |
| Gangula et al. (2021) [6] | Ensemble machine learning-based framework | Overfitting issues; takes more time to train | Accuracy |
| Silitonga et al. (2021) [7] | Machine learning-based approach | Overlapping issues in a large dataset | Accuracy |
| Chakraborty and Chandru (2020) [8] | Gaussian process-based regression methodology | Overfitting issues | Root mean square error; mean absolute deviation |
| Balamurugan et al. (2020) [9] | Entropy weighted score-based optimal ranking technique | Limited dataset; feature selection technique needs enhancement | Accuracy; recall; precision; receiver operating characteristic (ROC); true positive rate; false positive rate |
| Mussumeci and Coelho (2020) [10] | Large-scale multivariate prediction model | Training time is enormous; requires large memory for training | Mean squared log error; mean squared error |
When combined with risk mapping to determine places prone to dengue infection, these strategies might deliver significant health benefits by preventing emergence and spread, reducing case numbers, and lowering overall illness and dengue fatality.
3.1 Data Sources for Dengue Classification and Prediction Systems

The various data sources for dengue classification and prediction systems are depicted in Fig. 1.
3.2 Traditional Data Sources

Healthcare and epidemiology information from conventional health storage systems (such as hospitals), ecological and climatic data from forecasting organizations,
Table 2 Existing techniques of dengue prediction models with merits and demerits

| Author's year | Techniques | Merits | Demerits |
|---------------|------------|--------|----------|
| Lee et al. (2021) [5] | Markov chain Monte Carlo technique | The unknown parameters can be reconstructed, with parameter estimation using the technique's entire probability distribution function | Convergence issues |
| Gangula et al. (2021) [6] | Decision tree; support vector machine (SVM); naive Bayes | Reduces dispersion issues and provides robustness | High deployment cost |
| Silitonga et al. (2021) [7] | Artificial neural network (ANN); random forest | Handles high-dimensional data | High computational cost |
| Chakraborty and Chandru (2020) [8] | Generalized additive model (GAM); random forest | Nonparametric, with the capability to simulate massively complex exponential associations | As the amount of possible determinants is considerable, its computational complexity is high |
| Balamurugan et al. (2020) [9] | Support vector machine; naive Bayes; multilayer perceptron | Provided efficient and effective results | High computing power |
| Mussumeci and Coelho (2020) [10] | Long short-term memory (LSTM); random forest regression technique | Low complexity to update the weights | Overfitting issues; highly sensitive to random weights |
and geographic and demographic statistics from other relevant government resources are examples of traditional data sources.
3.3 Modern Data Sources

Due to advancements in technology, new data sources that can help predict dengue outbreaks have become accessible, with large volumes of information available on the Web. Nowadays, epidemiology researchers pay more attention to this reasonably new and distinctive data source [13].
Fig. 1 Various data sources for dengue classification and prediction systems (traditional data sources: WHO, hospitals, department of meteorology; modern data sources: phone calls, biosensors/sensors, social networks)
4 Different Categories of Recommendation and Classification Models for Dengue Disease

Dengue is a significant universal medical concern that affects and harms people worldwide. Various recommendation and classification models have been developed for the early detection of dengue. This section explains several categories of recommendation and classification models, such as artificial neural network (ANN), ensemble learning, random forest, and support vector machine (SVM), for dengue disease, and compares these recommendation and classification methods.
4.1 Ensemble Learning

Ensemble learning is used for classification purposes and discards useless features that are not essential for model training; it can be used for classification in the dengue disease recommendation model. Ensemble learning is built on the premise that combining the outcomes of multiple models produces better results than using a single model. The logic is premised on the idea of constructing a set of hypotheses using several approaches and afterward combining
them to achieve better performance than a single hypothesis obtained with one particular method [6].
4.2 Artificial Neural Network

An artificial neural network (ANN) is a technique inspired by the way the central nervous system works. Such frameworks are modeled on the biological nervous system, although they employ only a subset of the principles found in biological nervous systems; in particular, ANN models mimic the electrical impulses of the central nervous system. Components, often referred to as neurons or perceptrons, are linked to one another. Neural network models are a class of computing methods, motivated by the biological central and peripheral nervous system, that are trained to identify complex features and solve classification tasks without explicit programming [14]. The algorithms autonomously recognize distinctive traits in the evaluated instances, and artificial neurons are the nodes that make up these computational models. ANN can be used as a dengue detector and classifier in dengue recommendation systems.
4.3 Support Vector Machine A support vector machine (SVM) is one of the simplest ways to classify two or more data classes. SVM is a machine learning algorithm based on supervised learning, and it supports classification as well as regression. An SVM is a margin-based linear classification model that may also be used in nonlinear situations. SVMs are powerful and demanding data categorization technologies. An SVM divides the dataset into two segments using hyperplanes to classify data. It is a sophisticated process that outlines the association between attributes and outcomes using multivariate levels. Despite its complexity, it may be used for real-world issues requiring categorization and forecasting [15]. There are several symptoms of dengue, as mentioned in the previous section; to classify such symptoms, an SVM classifier can be used, and it has provided efficient results in detection and classification.
4.4 Random Forest Random forest is a flexible, straightforward computational model that, in the vast majority of circumstances, produces tremendous success with or without hyper-parameter modification. Along with its simplicity and versatility, it has become one of the most commonly used approaches for the classification of dengue disease, and it can be used for both classification and regression tasks.
Table 3 Comparison of different recommendation and classification models

Recommendation and classification models | Accuracy (%) | Specificity | Sensitivity
Artificial neural network [14] | 96 | 97% | 96%
Support vector machine (SVM) [15] | 82 | 87% | 76%
Ensemble learning [6] | 95 | Nil | Nil
Random forest [16] | 92.3 | 92.1% | 94%
Decision tree [17] | 99 | 84% | 90%
Essential characteristics of the random forest algorithm are that it can manage datasets containing both categorical and continuous variables, in both regression and classification problems. Random forest is a supervised machine learning (ML) method. It composes many decision trees, typically trained using the "bagging" approach, into a "forest." The essential idea of the bagging process is that integrating many techniques and algorithms boosts output tremendously [16]. A minimal sketch of this idea is given below.
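The bagging idea can be illustrated with a short scikit-learn sketch; the data is synthetic and the hyper-parameters are illustrative assumptions, not values from the surveyed studies.

```python
# Sketch of the bagging idea behind random forest: each tree trains on a
# bootstrap sample of the data, and the trees' predictions are aggregated.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=12, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

forest = RandomForestClassifier(
    n_estimators=100,  # number of bagged decision trees in the "forest"
    bootstrap=True,    # each tree sees a bootstrap sample ("bagging")
    random_state=0,
)
forest.fit(X_train, y_train)
print("Random forest accuracy:", forest.score(X_test, y_test))
```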
4.5 Decision Tree Decision trees are a data mining method that combines computational and statistical approaches to aid in the characterization, classification, and refinement of a database. Tree-shaped arrangements that reflect sets of decisions are known as decision trees. Such choices result in classification rules for the dataset. The decision tree's principal goal is to reveal the structure's hidden patterns and relationships [17]. In a dengue classification model, decision trees are used in two ways: for classification and for regression. The comparison analysis of the surveyed dengue recommendation and classification models is depicted in Table 3. The accuracy, specificity, and sensitivity performance metrics are used for the survey comparison. Figures 2 and 3 present the results of the different dengue recommendation and classification models graphically. The graphical comparison shows that the decision tree methodology attained the maximum accuracy rate, while the ANN model achieved the maximum sensitivity and specificity, because the ANN model used the maximum iterations in the neural network and thus obtained high performance metrics.
5 Conclusion and Future Scope There is no specific treatment for dengue fever, and viable vaccinations are still under research. The most efficient and straightforward method of managing dengue infection is to disrupt pathogen circulation through mosquito control. In this paper, problems with dengue classification models are discussed. One shortcoming of dengue
Fig. 2 Performance analysis of dengue recommendation and classification models: accuracy (comparison analysis with various models, percentage of accuracy)
Fig. 3 Parameter analysis with specificity and sensitivity rate (comparison analysis with various models, percentage of specificity and sensitivity)
viral prediction methods is that individuals without technical experience, training, or skill may find it challenging to evaluate the predicted outcomes. Healthcare experts find it extremely difficult to grasp clinical records, and most individuals, lacking knowledge or expertise in what is to many an obscure and complicated subject, find machine learning algorithms hard to comprehend. There are various signs and symptoms of dengue infection, such as high fever and joint and body pain. The several data sources for dengue classification and prediction systems are discussed. The traditional data sources include the World Health Organization (WHO), hospitals, etc.; sensors, social networks, biosensors, phone calls, etc., are modern data sources for dengue prediction systems. Several recommendation and classification models are compared with evaluation parameters such as accuracy, specificity, and sensitivity. The comparison analysis showed that the decision tree methodology attained the maximum accuracy, while the ANN model achieved the maximum sensitivity and specificity. In the future, more classification and prediction models of dengue will be compared for better analysis. Recommendation and prediction systems depend on data sources; therefore, in the future, more data will be collected from different data sources for training and testing of the recommendation and prediction models. A novel model will be developed to reduce the issues of existing systems.
References 1. E.D. de Araujo Batista, F.M. Bublitz, W.C. de Araujo, R.V. Lira, Dengue prediction through machine learning and deep learning: a scoping review protocol. Res. Square 2(04), 1–9 (2020) 2. V.R. Louis, R. Phalkey, O. Horstick, P. Ratanawong, A. Wilder-Smith, Y. Tozan, P. Dambach, Modeling tools for dengue risk mapping-a systematic review. Int. J. Health Geogr. 13(1), 1–14 (2014) 3. A. Wilder-Smith, E.E. Ooi, O. Horstick, B. Wills, Dengue. The Lancet 393(10169), 350–363 (2019) 4. C. Cobra, J.G. Rigau-Pérez, G. Kuno, V. Vomdam, Symptoms of dengue fever in relation to host immunologic response and virus serotype, Puerto Rico, 1990–1991. Am. J. Epidemiol. 142(11), 1204–1211 (1995) 5. C.H. Lee, K. Chang, Y.M. Chen, J.T. Tsai, Y.J. Chen, W.H. Ho, Epidemic prediction of dengue fever based on vector compartment model and Markov chain Monte Carlo method. BMC Bioinform. 22(5), 1–11 (2021) 6. R. Gangula, L. Thirupathi, R. Parupati, K. Sreeveda, S. Gattoju, Ensemble machine learning based prediction of dengue disease with performance and accuracy elevation patterns. Mater. Today Proc. (2021) 7. P. Silitonga, B.E. Dewi, A. Bustamam, H.S. Al-Ash, Evaluation of dengue model performances developed using artificial neural network and random forest classifiers. Procedia Comput. Sci. 179, 135–143 (2021) 8. A. Chakraborty, V. Chandru, A robust and non-parametric model for prediction of dengue incidence. J. Indian Inst. Sci. 1–7 (2020) 9. S.A. Balamurugan, M.M. Mallick, G. Chinthana, Improved prediction of dengue outbreak using combinatorial feature selector and classifier based on entropy weighted score based optimal ranking. Inform. Med. Unlocked 20, 100400 (2020) 10. E. Mussumeci, F.C. Coelho, Large-scale multivariate forecasting models for dengue-LSTM versus random forest regression. Spat. Spatio-Temporal Epidemiol. 35, 100372 (2020) 11. J.D. Mello-Román, J.C. Mello-Román, S. Gomez-Guerrero, M. García-Torres, Predictive models for the medical diagnosis of dengue: a case study in Paraguay. Comput. Math. Methods Med. (2019) 12. A.L. Buczak, B. Baugher, L.J. Moniz, T. Bagley, S.M. Babin, E. Guven, Ensemble method for dengue prediction. PLoS ONE 13(1), e0189988 (2018) 13. P. Siriyasatien, S. Chadsuthi, K. Jampachaisri, K. Kesorn, Dengue epidemics prediction: a survey of the state-of-the-art based on data science processes. IEEE Access 6, 53757–53795 (2018) 14. N. Zhao, K. Charland, M. Carabali, E.O. Nsoesie, M. Maheu-Giroux, E. Rees, K. Zinszer, Machine learning and dengue forecasting: comparing random forests and artificial neural networks for predicting dengue burden at national and sub-national scales in Colombia. PLoS Negl. Trop. Dis. 14(9), e0008056 (2020) 15. N.I. Nordin, N.M. Sobri, N.A. Ismail, S.N. Zulkifli, N.F. Abd Razak, M. Mahmud, The classification performance using support vector machine for endemic dengue cases. J. Phys. Conf. Ser. 1496(1), 012006 (2020). IOP Publishing 16. G.M. Hair, F.F. Nobre, P. Brasil, Characterization of clinical patterns of dengue patients using an unsupervised machine learning approach. BMC Infect. Dis. 19(1), 1–11 (2019) 17. D.S.R. Sanjudevi, D. Savitha, Dengue fever prediction using classification techniques. Int. Res. J. Eng. Technol. (IRJET) 6(02), 558–563 (2019)
Performance Analysis of Supervised Machine Learning Algorithms for Detection of Cyberbullying in Twitter Nida Shakeel and Rajendra Kumar Dwivedi
Abstract These days, the use of social media is inevitable. Social media is beneficial in several ways, but it also has severe negative influences. A crucial difficulty that needs to be addressed is cyberbullying. Social media, especially Twitter, raises numerous concerns due to misunderstandings concerning the notion of freedom of speech. One of those problems is cyberbullying, which affects both individual victims and societies. Harassment by cyberbullies is a big issue on social media. Cyberbullying affects a person both mentally and emotionally. So there is a need to design a technique to detect and inhibit cyberbullying in social networks. To overcome cyberbullying, numerous methods have been devised. This paper helps to comprehend methods and procedures like logistic regression (LR), naïve Bayes (NB), support vector machine (SVM), and term frequency–inverse document frequency (TF-IDF), which are used by numerous social media websites, especially Twitter. In this paper, we have worked on the accuracy of the SVM, LR, and NB algorithms to detect cyberbullying. We observed that SVM outperforms the others. Keywords Cyberbullying · Social networks · Machine learning · Twitter · Victims · Logistic regression (LR) · Naïve Bayes (NB) · Support vector machine (SVM)
1 Introduction Due to the large improvement of Internet technology, social media websites such as Twitter and Facebook have become famous and play a massive role in transforming human life. Millions of youths spend their time on social media devotedly and exchange records online. Social media has the potential to connect and share facts with everyone at any time, with many people concurrently. Cyberbullying exists via the internet, where cell phones, video game N. Shakeel (B) · R. K. Dwivedi Department of Information Technology and Computer Application, MMMUT Gorakhpur, Gorakhpur, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 J. S. Raj et al. (eds.), Intelligent Sustainable Systems, Lecture Notes in Networks and Systems 458, https://doi.org/10.1007/978-981-19-2894-9_29
platforms, or other media are used to send or post text, images, or videos to hurt or embarrass another person deliberately. Cyberbullying can happen at any time of day, any day of the week, and can reach a person anywhere via the internet. Cyberbullying texts, images, or videos may be posted anonymously and distributed immediately to a very large audience. Twitter is among the most frequently used social networking applications; it permits people to micro-blog about an extensive variety of areas. It is a community platform for communication, creativity, and public contribution, with nearly 330 million active monthly users, more than 100 million daily active users, and about 500 million tweets produced on average every day. Conversely, with Twitter becoming a prominent, real-time communication network, a study has stated that Twitter exists as a "cyberbullying playground". This paper focuses on the detection of cyberbullying, which is one of the critical problems. The methodologies used here are SVM, LR, NB, and TF-IDF. SVM is one of the most famous strategies used for classification and regression in machine learning. TF-IDF is a term used within information retrieval; it measures the frequency of a word in a document as well as its inverse document frequency. NB is a set of classification procedures based on the Bayes theorem, in which every pair of features being classified is assumed independent of each other.
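To make the TF-IDF weighting concrete, a standard formulation (one common variant; the paper does not state exactly which variant it uses) is:

tf-idf(t, d) = tf(t, d) × log(N/df(t))

where tf(t, d) is the number of times term t occurs in document d, N is the total number of documents, and df(t) is the number of documents that contain t. A rare insult word appearing in only a few tweets thus receives a high weight, while words common to most tweets receive low weights.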
1.1 Motivation Today, social media has become part of everyone's life, which makes it easy for bullies to pressurize others, as many people take what is said online very seriously. Many instances have been reported within the past few years, and the rate of cyberbullying has risen over the last 4–5 years. As the rate increases, many people have committed suicide because they were distressed by bullies' hate messages and saw no other way to cope. Given the increase in these cases, it is very important to take action against bullies.
1.2 Contribution This paper makes the following contributions: • First, it informs us about the effects and causes of cyberbullying and how fast it is growing day by day. • We have also discussed a few technologies that are very useful for detecting positive and negative comments, feedback, or posts on social networks. • Implementation and comparison of the accuracy of SVM, LR, and NB.
1.3 Organization The rest of the paper is organized as follows. Section 2 presents the background. Section 3 presents related work. Section 4 presents the proposed approach. Section 5 presents the novelty of the proposed approach, and Sect. 6 presents the conclusion and future work.
2 Background Bullying is commonly described as repeated antagonistic behavior. Bullying has traditionally covered physical acts, verbal abuse, and social exclusion. The growth of electronic communications technology exposes teens and children to a brand new manner of bullying. There are many different kinds of cyberbullying, including harassment, flaming, exclusion, outing, and masquerading. Reducing cyberharassment is vital because numerous negative health outcomes have been observed among people affected by cyberbullying, including depression, anxiety, loneliness, and suicidal conduct. The major contributors to cyberbullying are social networking websites. While social media systems offer great communication opportunities, they also increase the exposure of young people to intimidating circumstances online. Cyberbullying on social media networks is a worldwide phenomenon due to their vast volumes of active users. The trend suggests that cyberbullying in social networks is growing rapidly every day. The dynamic nature of these websites assists the growth of online aggressive behavior, and the anonymous nature of user profiles increases the difficulty of identifying the intimidator. Social media owes much of its value to its connectivity in the shape of networks, but this can be dangerous when rumors or intimidating posts spread through the community in ways that cannot easily be managed. Twitter and Facebook can be taken as examples that are popular among numerous social media websites. Facebook users form more than 150 billion links, which gives a clue as to how intimidating content can spread within the network in a fraction of the time. Manually identifying these intimidating messages over such a massive system is hard. It has been recognized that, because of cyberbullying, victims become dangerously timid and may develop violent thoughts of revenge or even suicidal thoughts. They suffer from despair, low self-worth, and anxiety. It is worse than physical bullying because cyberbullying is "behind-the-scenes" and "24/7". The bully's tweets or remarks do not vanish; they stay for a long period and constantly affect the victim mentally. It is almost like ragging, except it occurs in front of thousands of mutual friends, and the scars stay forever because the messages live forever on the net. The hurtful and tormenting messages embarrass the victims to a degree that cannot be imagined. The outcomes are even worse and more excessive. In most instances, that is, nine out of ten instances, the young
victims do not inform their parents or guardians out of embarrassment and fall into depression or, worse, suicide.
2.1 Categories of Cyberbullying Since the internet is globally inclusive, there are many ways to use it for any purpose. There exist websites, media, and many platforms that can be found for one's purpose. Some of the most common kinds of cyberbullying are: • Threatening emails and messages through WhatsApp and SMS. • Making fun of the victim by posting private material on social media platforms like Facebook, Instagram, WhatsApp, Twitter, and so on. • Creating and using false profiles to attract attention, collect some data, and then embarrass the targeted person. • Creating a bad image of the intimidated person by posting such material. • Using false pictures and scripts to embarrass the individual.
2.2 Cause and Effect of Cyberbullying The cause of cyberbullying lies in the motivation of the bully. Why do they choose to intimidate, and what makes them so confident? It all begins when an individual chooses to cross the boundaries to disgrace another. So, what are the motivating factors of cyberbullying? Figure 1 depicts the causes of cyberbullying: an absence of understanding, the belief that the victims deserve it, self-loathing, and the fact that it becomes an obsession. These causes are described below.
a. An Absence of Understanding
While technology has opened up the world, it has also granted the right to state any view and to criticize anybody while sitting at home. It is very easy to distance yourself from extreme situations on the net by simply shutting down. That is why those who do not appreciate the level of pain they might inflict on the other person are the ones who become aggressors. This makes them experience power.
b. The Victim Deserves It
The notion of having the right to decide who deserves what is one of the major reasons for cyberbullying. When it happens around school, teens frequently feel that they need to do something to make themselves feel superior. For this, they are inclined to dishonor or persecute other people to create a sense of inferiority. Somehow, they think that it is okay to bully others because of their status.
Fig. 1 Causes of cyberbullying
c. Self-loathing
Studies have observed that there is a strong link between people who were bullied formerly and those who are the tyrants now. Individuals who were once victims may return as bullies to vent the anger that they harbored. Somehow the cycle continues, and they end up hurting innocent children as well.
d. It Becomes an Obsession
If you have experience using social media systems like Facebook or Instagram, you will realize how hard it is to disregard the messages and notifications. Consequently, when an intimidator starts something on such platforms, the non-stop engagement makes him get hooked on it. Cyberbullying comes in numerous forms, some of which involve flaming, maltreatment, criticism, impersonation, cyber-stalking, and threats. Cyberbullying acts include making personal threats, sending scary abuses or tribal offenses, seeking to load the victim's computer with worms, and spamming his/her inbox with emails. The victim can cope with cyberbullying up to a certain level by limiting his/her computer connection time, ignoring intimidating messages, and ignoring emails from unknown sources. Additional operational measures include frequently swapping ISPs, swapping cellular mobile accounts, and attempting to trace the source. As we all understand, this generation is socializing the world and is soon going to adopt new technology. We can get connected to the world very easily; however, along with the positive aspects there are negative ones, as the internet is not only distracting youngsters: many crimes are committed over the internet each day, and one of the most committed crimes over the internet is cyberbullying.
Cyberbullying is a practice of victimization or harassment carried out through electronic means. Cyberbullying and cyberharassment are also called online bullying. It is common among youths; we can say that when somebody annoys another individual on the internet or on some particular social media website, it leads to certain harmful oppression behaviors like posting gossip, threats, and so forth. Bullying or harassment through the internet is very dangerous for a whole segment of people, as victims go through mental harm; while harassing, several people threaten others that they will post their pictures or nudes over the internet unless the victims fulfill the demands of the culprits. Cyberbullying is an unlawful and illicit action. The internet is a place where cyberbullying is very common, on social media sites like Facebook, Snapchat, Instagram, and other common websites.
3 Related Work Different cyberbullying detection techniques have been invented over the last few years. All techniques have shown better performance on certain specific datasets. There are several strategies for identifying cyberbullying; this section offers a short survey of such strategies. Nurrahmi [1] described cyberbullying as a repetitive action that annoys, disgraces, threatens, or bothers other people via digital gadgets and online social networking websites. Cyberbullying via the internet is riskier than traditional victimization, as it can potentially spread the disgrace to a vast online audience. The authors aimed to detect cyberbullying actors based on texts and the credibility analysis of users and to inform them about the harm of cyberbullying. They applied SVM and KNN to learn and detect cyberbullying texts. Prasanna Kumar et al. [2] described the fast growth of the net and the development of communication technology. With the rise in popularity of social networking, offensive behaviors have emerged, one of the critical problems being cyberbullying. Further, the authors mentioned that cyberbullying has become a menace in social networks and requires extensive research for its identification and detection among web users. The authors used technologies such as semi-supervised targeted event detection (STED) and the Twitter-based event detection and analysis system (TEDAS). Pradheep et al. [3] explained that social networking platforms have become very famous in the last few years. Through social media, people interact, share, communicate, and disseminate knowledge for the benefit of others through multimodal capabilities like multimedia text, photographs, videos, and audio. Cyberbullying affects a person both mentally and emotionally; therefore, it is essential to conceive a technique for detecting and inhibiting cyberbullying in social networks. Cyberbullying images can be detected using computer vision algorithms, which incorporate methods like image similarity and optical character recognition (OCR). Cyberbullying video can be detected
using the shot boundary detection algorithm, where the video is broken into frames and analyzed using numerous techniques. The proposed framework also helps identify cyberbullying audio in the social network. Mangaonkar et al. [4] explained that as the usage of Twitter data grows day by day, so do undesirable behaviors of its users. One of the undesirable behaviors is cyberbullying, which may even result in a suicide attempt. Different collaborative paradigms are proposed and discussed in their paper, using strategies like naive Bayes (NB). Al-Ajlan and Ykhlef [5] explained that tools now rule our lives, and we rely on technology to perform most of our everyday activities. The anonymous environment of social networks, in which users adopt nicknames in place of their real names, makes their actions very hard to trace and has led to a growing range of online crimes like cyberbullying. They proposed optimized Twitter cyberbullying detection based on deep learning (OCDD); in the classification phase, deep learning is used along with a metaheuristic optimization algorithm for parameter tuning. Agrawal and Awekar [6] described how internet devices have affected every aspect of human existence, bringing ease in linking people around the globe and making records available to massive segments of society with the click of a button. Harassment by cyberbullies is an enormous phenomenon on social media; in their paper, a deep neural network (DNN) technique was used. Meliana and Fadlil [7] explained that nowadays social media is very crucial for certain people; by its nature it can make people waste time, community interaction has been reduced because of it, and activities linked to the physical world have decreased. Social media can be positive or negative: positive if used to reconnect with old friends who have not met for a long time, but negative if used for crime or for unsuitable purposes. Keni et al. [8] described how today's youngsters have grown up in an era ruled by new technologies, where communication is typically accomplished through social media. The rapidly growing use of social networking sites by young adults has made them susceptible to bullying. Cyberbullying is the usage of technology as a medium to bully a person; although it has been a difficulty for many years, its effect has increased due to the rapid adoption of social media. Wade et al. [9] explained that in recent years social networking sites have been used as platforms for leisure, job opportunities, and advertising; however, this has also led to cyberbullying. Generally, cyberbullying uses technology as a medium to bully someone. Nirmal et al. [10] described how the growth in the use of the Internet and easy access to online communities, including social media, have brought about the emergence of cybercrime; cyberbullying is very common nowadays. Patidar et al. [11] described the speedy growth of social networking websites. First, the authors cited the positive aspects of social networking sites; after that, they cited some of their downsides, namely that on social media people can be humiliated, insulted, bullied, and pressured by anonymous users, outsiders, or
friends. Cyberbullies use these tools to embarrass social media users. Ingle et al. [12] described how in recent years Twitter has become a popular medium for users to show their day-by-day events, thoughts, and emotions via texts and snapshots. Because of the rapid use of social networking websites like Twitter, cases of cyberbullying have also boomed. Desai et al. [13] describe how the use of the internet and social media accounts leads to the sharing, receiving, and posting of negative, dangerous, fake, or mean content about another person, which amounts to cyberbullying. Cyberbullying has caused intense growth in mental health problems, mainly among the younger generation. Khokale et al. [14] described how the internet has affected every phase of social life, bringing ease in linking people around the globe and making facts available to large sections of society at the click of a button. Cyberbullying is a practice of electronic communication which harms the reputation or privacy of an individual, or threatens or teases them, leaving a long-lasting effect. Mukhopadhyay et al. [15] defined social media as the usage of a digital platform for connecting, interacting, and distributing content and opinions around the globe. Shah et al. [16] described that in the modern, technologically sound world, the use of social media is inevitable; along with the blessings of social media, there are serious negative influences as well, and a crucial problem that needs to be addressed is cyberbullying. Zhang et al. [17] explained that cyberbullying can have a deep and lasting effect on its victims, who are frequently youths. Precisely detecting cyberbullying helps prevent it; however, the noise and errors in social media posts and messages make accurately identifying cyberbullying hard. Dwivedi et al. [18] explained that nowadays IoT-based structures, which contain diverse forms of wireless sensor networks, are developing very rapidly; the authors addressed anomaly or outlier detection in their paper. Singh and Dwivedi [19] described human identification as a part of providing security to societies; in their paper, the authors worked on techniques to find which methodology has better overall performance in order to protect human identification. Sahay et al. [20] described how cyberbullying affects more than half of young social media users globally, who are afflicted by prolonged and/or coordinated digital harassment. Dwivedi et al. [21] proposed a machine learning-based scheme for outlier detection in a smart healthcare sensor cloud; the authors used numerous performance metrics to evaluate the proposed work. Shakeel and Dwivedi [22] explained that social media plays an important role in today's world; further, they described influence maximization as a problem and how to conquer it in order to find the most influential node in social media. Malpe and Vaikole [23] depicted cyberbullying as an activity where a person or a group of people use social networking websites on the internet through smartphones, computers, and tablets to cause misfortune, depression, injury, or damage to another person. Dwivedi et al. [24] described how a smart information system based on sensors generates a massive amount of records, which can be stored in the cloud for additional processing. Further, the authors described a healthcare monitoring sensor cloud and the integration of numerous body sensors of various patients with the cloud. Chatzakou
et al. [25] described cyberbullying and cyber-aggression as increasingly worrisome phenomena affecting people across all demographics; the authors further noted that more than half of young people face cyberbullying and cyber-aggression through social media. Rai and Dwivedi [26] noted that many technologies have made credit cards common for both online and offline purchases, so protection is expected to prevent fraudulent transactions; the authors worked on this using different methodologies and found the accuracy of each methodology. Shakeel and Dwivedi [27] explained the positive and negative impacts of using social media, focusing mainly on the negative impacts, one of them being cyberbullying, because of which many people have committed suicide. Zhao et al. [28–30] described how the rate of cyberbullying is increasing because of the heavy use of social media; the authors worked on how to reduce and stop cyberbullying and designed software to automatically detect bullying content on social media. Pasumpon Pandian [31] described some common applications of deep learning, such as sentiment analysis, which possess better-performing and more efficient automatic feature extraction techniques compared with conventional methodologies like surface approaches. Andi [32] described how the demand for machine learning and AI-assisted trading has increased; the author proposed an algorithm for predicting the bitcoin price based on the present global stock market. Chen and Lai [33] explained how, with the usage of the net, numerous enterprises including the financial industry have grown exponentially; further, their paper addresses numerous challenges that have compounded despite the emerging technological growth and reliance. Tripathi [34] discussed the lack of physical contact with each other due to the COVID-19 lockdown, which led to an increase in social media communication; social media like Twitter became the most popular place for people to express their opinions and to communicate with each other.
3.1 Research Gap The existing work’s result concluded that the voting classifier has the best accuracy and the support vector machine has the lowest accuracy. Thus, in the proposed paper, we have worked on the accuracy of SVM, NB, and LR and compared the accuracy of a majority of these three classifiers. Table 1 presents a comparative study of related work. Based on the survey, inside the comparative table of related work, we have got referred to the author’s name, years of the paper, name of the paper, and methodology used.
Table 1 Comparative study of related work

S. No. | Authors | Year | Title | Algorithm used | Limitations
1. | Nurrahmi [1] | 2016 | Indonesian Twitter cyberbullying detection using text classification and user trustworthiness | SVM and KNN | There are some wrong tags in the Indonesian POS tagger
2. | Prasanna Kumar et al. [2] | 2017 | A survey on cyberbullying | Semi-supervised targeted event detection (STED), Twitter-based event detection and analysis system (TEDAS) | Not any
3. | Pradheep et al. [3] | 2017 | Automatic multimodel cyberbullying detection from social networks | Naïve Bayes | Sometimes the proposed model is not able to control the stop words
4. | Mangaonkar et al. [4] | 2018 | Collaborative detection of cyberbullying behavior in Twitter data | Naive Bayes (NB), logistic regression, and support vector machine (SVM) | When the true negatives increase, the models do not work
5. | Al-Ajlan and Ykhlef [5] | 2018 | Improved Twitter cyberbullying detection based on deep learning | Convolutional neural network (CNN) | A metaheuristic optimization algorithm is incorporated to find the optimal or near-optimal value
6. | Agrawal and Awekar [6] | 2018 | Deep learning for detecting cyberbullying across multiple social media platforms | Deep neural network (DNN) | Some models do not work properly
7. | Meliana and Fadlil [7] | 2019 | Identification of cyberbullying by using clustering approaches on social media Twitter | Naïve Bayes and decision tree | Naïve Bayes was not able to find the hate comments as well as decision tree J48 did
8. | Keni et al. [8] | 2020 | Cyberbullying detection using machine learning algorithms | Principal component analysis (PCA) and latent semantic analysis (LSA) | The performance of the other classifiers is not good
9. | Wade et al. [9] | 2020 | Cyberbullying detection on Twitter mining | Convolutional neural network (CNN) and long short-term memory (LSTM) | CNN-based models do not perform better than DNN models
10. | Nirmal et al. [10] | 2020 | Automated detection of cyberbullying using machine learning | Naïve Bayes model, SVM model, and DNN model | Difficult to detect some hate words because of specific code
11. | Ingle et al. [12] | 2021 | Cyberbullying monitoring system for Twitter | Gradient boosting | Naïve Bayes and logistic regression do not give good results
12. | Desai et al. [13] | 2021 | Cyberbullying detection on social media using machine learning | BERT | The accuracy of SVM and NB is not as good as that of pre-trained BERT
13. | Khokale et al. [14] | 2021 | Review on detection of cyberbullying using machine learning | Support vector machine (SVM) classifier, logistic regression | The authors did not use K-folds across the techniques
14. | Mukhopadhyay et al. [15] | 2021 | Cyberbullying detection based on Twitter dataset | Convolutional neural network (CNN) | Not good performance
15. | Malpe et al. [23] | 2020 | A comprehensive study on cyberbullying detection using machine learning technique | Deep neural network (DNN) | Not any
16. | Chatzakou et al. [25] | 2019 | Detecting cyberbullying and cyber-aggression in social media | LDA | Effective tools for detecting harmful actions are scarce, as this type of behavior is often ambiguous
4 Proposed Approach In this paper, we develop the system using Python. First, we search for and download the dataset to train the model. After downloading, we preprocess the data and then pass it to TF-IDF. Then, with the help of the NB, SVM, and LR algorithms, we train on the dataset and generate models one after the other. We then develop web-based software using the Anaconda framework. We fetch real-time tweets from Twitter, apply the generated models to those fetched tweets, and test whether the text or images constitute cyberbullying or not. Figure 2 shows the proposed framework of this paper.

Algorithm 1 Detection of cyberbullying and non-cyberbullying words
Input: Twitter datasets
Output: identifies cyberbullying or non-cyberbullying
Begin
Step 1: Take the input data from Twitter
Step 2: Start preprocessing
Step 3: Divide the processed data into individual comments
Step 4: Classify the feature selection
Step 5: Apply machine learning algorithms: (i) NB (ii) SVM (iii) LR
Step 6: If cyberbullying words occur
Then identify and classify the cyberbullying words
Else count the non-cyberbullying words
End

Fig. 2 Proposed framework

In Algorithm 1, we first take datasets from Twitter. Then we preprocess the data and transfer it to TF-IDF. After preprocessing, we divide the processed data into cyberbullying or non-cyberbullying comments. After that, we classify the feature selection and apply machine learning algorithms like NB, SVM, and LR and check for cyberbullying and non-cyberbullying words. A minimal code sketch of this pipeline is given below.
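The following is a minimal Python/scikit-learn sketch of Algorithm 1; the example tweets, labels, and model settings are hypothetical placeholders, not the actual Twitter dataset or tuned models used in this paper.

```python
# Sketch of Algorithm 1: TF-IDF features + three classifiers (NB, SVM, LR).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC

tweets = ["you are awesome", "everyone hates you loser", "nice game today"]
labels = [0, 1, 0]  # 1 = cyberbullying, 0 = non-cyberbullying (toy labels)

# Steps 2-4: preprocess and vectorize each comment with TF-IDF.
vectorizer = TfidfVectorizer(lowercase=True, stop_words="english")
X = vectorizer.fit_transform(tweets)

# Step 5: train NB, SVM, and LR on the same features.
models = {"NB": MultinomialNB(), "SVM": LinearSVC(), "LR": LogisticRegression()}
for name, model in models.items():
    model.fit(X, labels)

# Step 6: classify a new tweet as cyberbullying or non-cyberbullying.
new = vectorizer.transform(["nobody likes you"])
for name, model in models.items():
    verdict = "cyberbully" if model.predict(new)[0] == 1 else "non-cyberbully"
    print(name, "->", verdict)
```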
4.1 Preprocessing In the proposed model, it is useful to eliminate unwanted noise and clean the text before detection. After data cleaning, the dataset is divided into two groups, a training set and a testing set, where every instance is labeled as cyberbullying or non-cyberbullying. The first part includes 70% of the tweets, used for training purposes; the second part comprises the remaining 30%, used for prediction. The data has been fetched from Twitter. The information collected should encompass three attributes: user attributes, class, and format. The user attributes are used for the identification of a user, the class feature is used to identify groups, and the format expresses the user's comment on various statuses/groups. Once the dataset has been organized, it must be broken up into texts, which encompass remarks, conversations, and so on. The selected attributes used to classify the tweets are shown in Table 2. A sketch of this cleaning and splitting step appears below.
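A hedged sketch of the cleaning and 70/30 split described above; the paper does not specify its exact cleaning rules, so the regular expressions here are assumptions.

```python
# Assumed tweet-cleaning rules plus the 70/30 train-test split from the paper.
import re
from sklearn.model_selection import train_test_split

def clean_tweet(text: str) -> str:
    """Remove common Twitter noise: URLs, @mentions, and non-letter characters."""
    text = re.sub(r"http\S+|www\.\S+", " ", text)  # strip URLs
    text = re.sub(r"@\w+", " ", text)              # strip @mentions
    text = re.sub(r"[^a-zA-Z\s]", " ", text)       # keep letters only
    return re.sub(r"\s+", " ", text).strip().lower()

tweets = [
    "Check this out http://t.co/xyz @user YOU loser!!",
    "Great match today :)",
    "@troll nobody likes you",
    "Lovely weather in town",
]
labels = [1, 0, 1, 0]  # toy labels: 1 = cyberbullying
cleaned = [clean_tweet(t) for t in tweets]

# 70% of tweets for training, 30% for prediction, as stated above.
X_train, X_test, y_train, y_test = train_test_split(
    cleaned, labels, test_size=0.3, random_state=42
)
```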
4.2 Feature Extraction After cleaning the dataset in the above stages, tokens can be extracted from it. The process of extracting tokens is known as tokenization, wherein we take the extracted records as sentences or passages and then output the text as separate words, characters, or sub-phrases in the form of a list. These tokens then need to be transformed into numerical values so that every document can be represented in the form of numerical data, as sketched below.
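The sketch below illustrates tokenization and the conversion of tokens into numerical TF-IDF vectors; the toy documents and the resulting vocabulary are illustrative only.

```python
# Tokenize toy documents and turn each one into a numeric TF-IDF vector.
from sklearn.feature_extraction.text import TfidfVectorizer

docs = ["she posted a mean comment", "a mean and hateful comment"]
vectorizer = TfidfVectorizer()          # tokenizes into words internally
X = vectorizer.fit_transform(docs)      # each document -> numeric vector

print(vectorizer.get_feature_names_out())  # the extracted tokens
print(X.toarray().round(2))                # TF-IDF weights per document
```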
4.3 Feature Engineering and Feature Selection One of the most common methods to improve cyberbullying detection is to carry out feature engineering, and the most shared features that improve the overall performance of a cyberbullying detection classifier are textual, community, user, sentiment, and word embedding features. We tried to construct features based on the textual context and its semantic alignment. A hypothetical sketch of such hand-crafted features follows Table 2.

Table 2 Selected attributes to classify the tweets

Attributes | Class | Format
Noun | CB/non-CB | Text
Pronoun | CB/non-CB | Text
Adjective | CB/non-CB | Text
Local features | The basic features extracted from a tweet | Text
Contextual features | Professional, religious, family, legal, and financial factors specific to CB | Text
Sentiment features | Positive or negative (foul words specific to CB), direct or indirect CB, polite words, modal words, unknown words, number of insults and hateful blacklisted words | Text
Emotion features | Harming with detailed description, power differential, any form of aggression, targeting one person, targeting more persons, intent, repetition, one-time CB, harm, perception, reasonable person/witness, and racist sentiments | Text
Gender specific | Male/female | Text
User features | Network information, user information, his/her activity information, tweet content, account creation time, and verified account time | Numeric
Twitter basic features | Number of followers, number of mentions, number of following, favorite count, popularity, number of hashtags, and status count | Numeric
Linguistic features | Other-language words, punctuation marks, and abbreviated words rather than abusive sentence judgments | Text
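As an illustration (not the authors' implementation), the following sketch derives a few hand-crafted features in the spirit of Table 2; the blacklist, feature names, and inputs are hypothetical.

```python
# Assumed hand-crafted features: insult counts plus simple user/tweet features.
BLACKLIST = {"loser", "idiot", "hate"}  # hypothetical foul-word list

def extract_features(tweet: str, followers: int, mentions: int) -> dict:
    tokens = tweet.lower().split()
    return {
        "num_insults": sum(tok in BLACKLIST for tok in tokens),  # sentiment feature
        "num_mentions": mentions,                                # Twitter basic feature
        "followers": followers,                                  # user feature
        "tweet_length": len(tokens),                             # local feature
    }

print(extract_features("you are a loser and everyone hates you",
                       followers=120, mentions=1))
```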
4.4 Classification Techniques In this section, several classifiers are used to categorize whether a tweet is cyberbullying or non-cyberbullying. The classifier models constructed are NB, LR, and SVM.
a. Naïve Bayes (NB)
Naïve Bayes is broadly used for document/text classification problems. In the cyberbullying detection area, naïve Bayes has been the most commonly used model for cyberbullying prediction.
b. Logistic Regression (LR)
Logistic regression is one of the most common machine learning algorithms and comes under the supervised learning category. Logistic regression can be used to classify opinions using dissimilar forms of records and can readily determine the most useful variables for the classification.
c. Support Vector Machine (SVM)
An SVM model is a representation of the data as points in a space, mapped so that the samples of the distinct classes are separated by a clear gap that is as wide as possible. SVMs can effectively perform nonlinear classification by implicitly mapping their inputs into a high-dimensional feature space. A sketch of the margin idea is given below.
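The margin intuition can be sketched as follows: decision_function returns the signed distance of a sample from the separating hyperplane. The toy texts and labels are assumptions, not data from the paper.

```python
# Sketch of the SVM margin idea on TF-IDF features.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

texts = ["you are pathetic", "have a good day", "nobody wants you here", "well played"]
labels = [1, 0, 1, 0]  # 1 = cyberbullying (toy labels)

svm = make_pipeline(TfidfVectorizer(), LinearSVC())
svm.fit(texts, labels)

# Positive distance -> cyberbullying side of the hyperplane, negative -> clean side.
for t in ["you are pathetic and useless", "good day everyone"]:
    print(t, "->", svm.decision_function([t])[0])
```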
4.5 Performance Evaluation The proposed work of this paper covers the performance measurement, dataset, and results. For performance measurement, we have used metrics such as accuracy, recall, precision, F1-score, and specificity, computed on Twitter datasets for each of the classifiers.
A. Performance Measurement
To examine our classifiers, several evaluation metrics should be used. We have adopted the most commonly used criteria, namely accuracy, precision, recall, and F1-measure. These criteria are defined as follows:
i. Accuracy: the ratio of the number of correctly predicted opinions to the total number of opinions present in the corpus.
Accuracy = (TP + TN)/(TP + TN + FP + FN) (1)
ii. Precision: the correctness of the classifier; the ratio of the number of correctly predicted positive opinions to the total number of opinions predicted as positive.
Precision = TP/(TP + FP) (2)
iii. Recall: the ratio of the number of correctly predicted positive opinions to the actual number of positive opinions present in the corpus.
Recall = TP/(TP + FN) (3)
iv. F1-Score: the harmonic mean of precision and recall. The F1-measure has a best value of 1 and a worst value of 0.
F-measure = 2 ∗ (Recall ∗ Precision)/(Recall + Precision) (4)
Table 3 Confusion matrix corresponding to several classifiers

Approaches used | True negative (TN) | False positive (FP) | False negative (FN) | True positive (TP)
Naïve Bayes (NB) | 10,501 | 1569 | 1409 | 11,088
Support vector machine (SVM) | 10,812 | 1688 | 1398 | 11,102
Logistic regression (LR) | 10,772 | 1298 | 1161 | 10,499
v. Specificity: the ratio of true negatives to the sum of true negatives and false positives.
Specificity = TN/(TN + FP) (5)
where a true positive (TP) is a hit, correctly classified as positive; a true negative (TN) is a correct rejection, correctly classified as negative; a false positive (FP) is a false alarm, falsely classified as positive; and a false negative (FN) is a miss, falsely classified as negative. These definitions translate directly into the helper functions below.
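Equations (1)–(5) translate directly into small helper functions; this is a straightforward sketch, not code from the paper.

```python
# Direct translation of Eqs. (1)-(5) into Python helper functions.
def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)   # Eq. (1)

def precision(tp, fp):
    return tp / (tp + fp)                    # Eq. (2)

def recall(tp, fn):
    return tp / (tp + fn)                    # Eq. (3)

def f1_score(tp, fp, fn):
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)               # Eq. (4)

def specificity(tn, fp):
    return tn / (tn + fp)                    # Eq. (5)
```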
B. Dataset
Detecting cyberbullying in social media via cyberbullying keywords, and using machine learning for its detection, pose theoretical and practical challenges. In this paper, we used an international dataset of 37,373 tweets to evaluate classifiers that are typically utilized in cyberbullying content detection.
C. Confusion Matrix
A confusion matrix characterizes the overall performance of a classifier. The confusion matrices of the NB, SVM, and LR classifiers used in cyberbullying detection are shown in Table 3. LR gives the lowest false positives, lowest false negatives, and lowest true positives; SVM provides the maximum true negatives, false positives, and true positives; while NB yields the highest false negatives and lowest true negatives. As a sanity check, the metric definitions above can be applied directly to these counts, as sketched below.
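For instance, applying Eqs. (1)–(3) and (5) to the SVM counts in Table 3 gives the following; minor differences from Table 4 may stem from rounding or from metrics computed on different runs.

```python
# Recomputing metrics from the SVM confusion counts reported in Table 3.
tn, fp, fn, tp = 10812, 1688, 1398, 11102

print("Accuracy:   ", round((tp + tn) / (tp + tn + fp + fn), 3))  # ~0.877
print("Precision:  ", round(tp / (tp + fp), 3))                   # ~0.868
print("Recall:     ", round(tp / (tp + fn), 3))                   # ~0.888
print("Specificity:", round(tn / (tn + fp), 3))                   # ~0.865
```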
D. Results
The existing work found that the accuracy of SVM was the lowest compared with that of the voting classifier. In the proposed work of this paper, we have worked on the accuracy of the support vector machine and established that SVM has the best accuracy among the classifiers considered, including NB and LR. The cyberbullying detection algorithm is implemented with three classifiers: NB, SVM, and LR. Their overall performance comparison is shown in Table 4, from which it can be seen that the accuracy of SVM is the best among the three classifiers.
Table 4 Performance comparison of several classifiers

Metrics | Naïve Bayes (NB) | Support vector machine (SVM) | Logistic regression (LR)
Precision | 0.85 | 0.87 | 0.88
Recall | 0.87 | 0.89 | 0.86
Specificity | 0.88 | 0.88 | 0.89
F1-score | 0.86 | 0.87 | 0.90
Accuracy | 0.87 | 0.94 | 0.89
Fig. 3 Comparison of precision
Figure 3 gives a comparison of the precision of the classifiers NB, SVM, and LR; it can be seen that SVM has the best precision. Figure 4 shows the comparison of recall for these classifiers, where the recall of SVM is the best. Figure 5 provides the evaluation of the specificity of these classification methods, and the specificity of LR is found to be the highest. Figure 6 shows the F1-score of these schemes, where LR has the highest F1-score. Figure 7 depicts the accuracy of NB, SVM, and LR, and SVM gives the highest accuracy. Thus, we can say that SVM is the best scheme for cyberbullying detection. Table 5 shows the comparison of the existing work and the proposed work over metrics such as precision, recall, specificity, F1-score, and accuracy. Comparing these metrics, we observe that the precision of the existing work is 0.87 while the precision of our proposed work is 0.88, so the proposed work has the higher precision. Next, we consider the recall metric and observe that the recall of the existing work is 0.81 whereas the recall
Fig. 4 Comparison of recall
Fig. 5 Comparison of specificity
Fig. 6 Comparison of F1-score
Fig. 7 Comparison of accuracy
Table 5 A comparison study of existing and proposed work
Metrics
Existing work
Proposed work
Precision
0.87
0.88
Recall
0.81
0.86
Specificity
0.87
0.88
F1-score
0.88
0.90
Accuracy
0.93
0.94
metric of our proposed work is 0.86, so the proposed work also has the higher recall. We then compare the F1-score metric and observe that the F1-score of the existing work is 0.88, while the F1-score of our proposed work is 0.90, again in favor of the proposed work. Finally, comparing the accuracy of SVM, we find that the accuracy of the existing work is 0.93 and the accuracy of the proposed work is 0.94, which shows that the accuracy of the proposed work is better than that of the existing work.
5 Novelty of the Proposed Approach The existing work's results concluded that the voting classifier has the best accuracy and the support vector machine has the lowest accuracy. Thus, in this paper, we have worked on the accuracy of SVM, NB, and LR, compared the accuracy of all three classifiers, and determined that the accuracy, F1-score, specificity, and recall of SVM are the highest among the three classifiers.
6 Conclusion and Future Work Although social media platforms have become an essential entity for all of us, cyberbullying has numerous negative influences on a person's life, including sadness, nervousness, irritation, worry, trust concerns, low self-esteem, withdrawal from social activities, and occasionally desperate behavior as well. Cyberbullying incidents do not take place through texts alone; audio and video also play an essential role in spreading cyberbullying. This study has presented a detailed and comprehensive review of the preceding research completed within the subject of cyberbullying. The existing work found that the accuracy of SVM was the lowest compared with the voting classifier. Further, in this paper, we have worked on the accuracy, F1-score, specificity, recall, and precision of NB, SVM, and LR and observed that the accuracy, F1-score, specificity, and recall of SVM are better in comparison to those of NB. The accuracy of SVM is 0.94, which outperforms the existing work. In the future, this work may be extended to examine distinct Twitter groups or community pages to perceive unfamiliar or violent posts made by communities in opposition to government organizations or others.
References 1. H. Nurrahmi, D. Nurjanah, Indonesian Twitter cyberbullying detection using text classification and user credibility, in International Conference on Information and Communications Technology (ICOIACT) (2016), pp. 542–547 2. G. Prasanna Kumar et al., Survey on cyberbullying. Int. J. Eng. Res. Technol. (IJERT) 1–4 (2017) 3. T. Pradheep, J.I. Sheeba, T. Yogeshwaran, Automatic multimodal cyberbullying detection from social networks, in International Conference on Intelligent Computing Systems (ICICS) (2017), pp. 248–254 4. A. Mangaonkar, A. Hayrapetian, R. Raje, Collaborative detection of cyberbullying behavior in Twitter, in IEEE (2018) 5. M.A. Al-Ajlan, M. Ykhlef, Optimized cyberbullying detection based on deep learning (2018) 6. S. Agrawal, A. Awekar, Deep learning for cyberbullying across multiple social media platforms (2018), pp. 2–12 7. N. Meliana, A. Fadlil, Identification of cyberbullying by using clustering method on social media Twitter, in The 2019 Conference on Fundamental and Applied Science for Advanced Technology (2019), pp. 1–12 8. A. Keni, Deepa, M. Kini, K.V. Deepika, C.H. Divya, Cyberbullying detection using machine learning algorithms. Int. J. Creat. Res. Thoughts (IJCRT) 1966–1972 (2020) 9. S. Wade, M. Parulekar, K. Wasnik, Survey on detection of cyberbullying. Int. Res. J. Eng. Technol. (IRJET) 3180–3185 (2020) 10. N. Nirmal, P. Sable, P. Patil, S. Kuchiwale, Automated detection of cyberbullying using machine learning. Int. Res. J. Eng. Technol. (IRJET) 2054–2061 (2021) 11. M. Patidar, M. Lathi, M. Jain, M. Dharkad, Y. Barge, Cyber bullying detection for Twitter using ML classification algorithms. Int. J. Res. Appl. Sci. Eng. Technol. (IJRASET) 24–29 (2021) 12. P. Ingle, R. Joshi, N. Kaulgud, A. Suryawanshi, M. Lokhande, Cyberbullying monitoring system for Twitter. Int. J. Sci. Res. Publ. 540–543 (2021)
Text Summarization of Legal Documents Using Reinforcement Learning: A Study Bharti Shukla, Sonam Gupta, Arun Kumar Yadav, and Divakar Yadav
Abstract Studying and analyzing judicial documents are challenging tasks for the common person. The basic reason for this complexity is the length of the documents and their complex language. In this regard, a summarized document in simple language is required that is understandable to the common person. Manual summarization of legal documents is tedious, so an automatic legal document text summarization technique is needed. In the current scenario, deep learning and machine learning play a crucial role in text processing and text summarization. This paper presents a detailed survey of legal document summarization and of the datasets used for various legal documents. It also discusses various machine learning models and analyzes their evaluation metrics to identify the best summarization model. Secondly, this study compares the quality of summarization with and without reinforcement learning. The analysis concluded that the rationale-augmented model (RAG) with deep reinforcement learning performs best for legal document text summarization, with 79.8% accuracy, 70.5% precision, 90.75% recall, and 75.09% F-measure. Without deep reinforcement learning, a long short-term memory network model performs best, with 0.93 precision and 0.98 recall. Keywords Text summarization · Deep reinforcement learning · Legal documents · Text extraction · Deep learning
1 Introduction In the digital world, the growth of big data is increasing day by day on a large scale. A massive amount of data is available in English or Hindi on social media, the World Wide Web, etc. However, users need summarized information in an optimal form. B. Shukla · S. Gupta (B) Ajay Kumar Garg Engineering College, Ghaziabad, India e-mail: [email protected] A. K. Yadav · D. Yadav National Institute of Technology, Hamirpur, H.P., India e-mail: [email protected]
Text summarization is a big challenge in the world of legal documents, because if summarized information is available for a particular document, it becomes easy to predict the judgment and retrieve the exact information [1]. In other words, text summarization condenses the original legal document into easily understandable and readable content, without the need to read individual documents, and is helpful for predicting the judgment of legal cases [2]. Automatic text summarization is especially important in the legal domain for retrieving legal text. However, summarizing legal text is challenging because it is stored in various formats, i.e., structured, unstructured, different styles, etc. Automatic summarization of legal text from legal documents can be done with the help of natural language processing (NLP) and machine learning. Natural language processing plays a crucial role in text summarization, information retrieval, automatic extraction, data classification, etc. In the legal domain, keywords play an important role in creating the index and help to predict the judgment of legal text. Previously, supervised and unsupervised techniques were also used to retrieve text from the legal domain [3, 4]. Reinforcement learning also performs well for automatic text summarization, working as an agent while summarizing the legal text. With the help of reinforcement learning, we can decide on the better summarization of documents. The summarized information of original documents helps in understanding the brief idea of legal cases [5]. Nowadays, deep learning methods are also used for document summarization and can perform better. In this regard, researchers have noted that the bidirectional long short-term memory network approach (BI-LSTM) can analyze the whole original document to summarize data and can explore more contextual information from the legal domain. This model is executed after retrieving or extracting the legal text [6, 7]. Figure 1 represents the overall working of text summarization in legal documents. Fig. 1 Text summarization process [8]
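To make the summarization pipeline of Fig. 1 concrete, the following is a minimal, illustrative sketch of frequency-based extractive summarization in Python. It is not the method of any paper surveyed here; the stopword list, scoring rule, and sample text are assumptions chosen for demonstration only.

```python
# Minimal extractive summarizer: sentences containing frequent content
# words are selected for the summary (illustrative, not production-grade).
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "in", "to", "and", "is", "that", "for", "on", "was", "were"}

def summarize(text: str, n_sentences: int = 2) -> str:
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]
    freq = Counter(words)

    # Score each sentence by the average frequency of its content words.
    def score(sent: str) -> float:
        tokens = [w for w in re.findall(r"[a-z']+", sent.lower()) if w not in STOPWORDS]
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)

    selected = set(sorted(sentences, key=score, reverse=True)[:n_sentences])
    # Preserve the original sentence order in the output summary.
    return " ".join(s for s in sentences if s in selected)

if __name__ == "__main__":
    doc = ("The court examined the appeal in detail. The appeal was dismissed "
           "on procedural grounds. The appellant may file a review petition. "
           "Costs were awarded to the respondent.")
    print(summarize(doc))
```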
There are some other challenges of automatic text summarization for legal documents, classified as follows:

a. USAGE
• Multi-document summarization
• User-specific summarization
• Application of text summarization

b. INPUT
• Input and output formats
• Length of input documents
• Supported language

c. METHOD
• Text summarization approaches
• Statistical and linguistic features
• Use of deep learning for text summarization

d. OUTPUT
• Stop criteria of the summarization process
• Quality of the generated summary
• Evaluation of the generated summary.
After understanding the importance of the summarization of legal documents, this study motivated the authors to examine the challenges, working strategies, and summarization methods, and to evaluate the best method for producing an optimal summarized document.
• If text summarization is applied in the legal domain to support judgment prediction, humans can predict legal judgments more easily.
• In this research paper, the authors suggest the best text summarization technique with and without reinforcement learning for legal documents.
The rest of the paper is organized as follows: Sect. 2 presents the methods used for text summarization. The analysis of datasets used in the study for legal documents is done in Sect. 3. In Sect. 4, a performance comparison of the proposed methods for legal documents is presented. After the comparison, we discuss the best technique in Sect. 5, and the conclusion of the research paper is given in Sect. 6.
2 Approaches/Methods Used for Text Summarization In this section, the authors describe a detailed study of the approaches used for text summarization of legal documents. For the survey, the authors searched for documents on Google with different combinations of keywords
such as: “legal document text summarization”, “legal document summarization using machine learning”, and “legal document summarization using deep learning”. The authors collected many research papers from various sources for the literature survey; of around 30 papers found related to the research topic, 21 were selected as relevant for analyzing the work done so far in the domain. The paper [9] applies XML technology and a transductor for text summarization and text extraction from legal documents; the authors used this idea on an information management platform for organizing legal documents. The authors proposed two classifiers, i.e., a catchphrase extraction task and a catchphrase prediction task, and the model achieved 70% accuracy. The paper [10] introduced a multi-model architecture for text summarization, classification, and prediction problems; the authors try to reduce the lack-of-data problem in the legal domain, and the model shows state-of-the-art performance. In the paper [11], the researchers provide a detailed survey of automatic text summarization (ATS) techniques, introducing various algorithms used by previous researchers as well as the issues and challenges of automatic text summarization. In the paper [12], the researchers provide a detailed survey of extractive text summarization techniques, reviewing the various challenges and datasets of extractive summarization and providing details of the most popular benchmarking datasets. In the paper [13], the authors implement the LetSum model for legal text summarization; LetSum produces a table-style summary of legal documents for an improved text view and ease of reading, and it achieves 57.5% accuracy. After surveying the literature on text summarization in Table 1, we found several important points. One significant observation is that the text summarization field still attracts interest among researchers, so there is room for further improvement in text summarization of legal documents. Another important point is that, using machine learning and deep learning, not much work has been done on legal document text summarization; only a few techniques have been discussed for text summarization with reinforcement learning on legal documents, and an accurate approach for text summarization with reinforcement learning is still required. Based on the above research gaps, we defined the following research questions (RQs) for our research work. (i) RQ1: Which technique gives the best comparison result for legal document text summarization with reinforcement learning? (ii) RQ2: Which technique gives the best comparison result for text summarization without reinforcement learning?
3 Datasets Used in the Study Various researchers have used multiple datasets for text summarization in legal documents. The datasets are available in multiple languages like English, Chinese, Japanese, etc.
Table 1 Comparison of various techniques of text summarization

| Key No. | Publication (year) | Methodology/finding | Proposed classifiers | Performance/result |
|---|---|---|---|---|
| L1 | A comparative study of summarization algorithms applied to legal case judgments (2019) [14] | Summarization algorithms [supervised and unsupervised] | Summarization algorithms | Shows best performance on a real-world dataset |
| L2 | A deep learning approach to contract element extraction (2017) [15] | 1. The authors implement a BILSTM-LR model for linear regression operating on a fixed-size window without involving any manual rules. 2. A BILSTM-CRF model is introduced for extracting multiple tokens of contract elements | 1. BILSTM-LSTM-LR 2. BILSTM-CRF | The BILSTM-CRF model has the best performance compared to the BILSTM-LR model |
| L3 | An automatic system for summarization and information extraction of legal information (2010) [9] | The authors described an idea based on XML technologies and a transductor, elaborated on an information management platform, to organize linguistic cues and semantic rules and achieve precise information extraction in different fields | XML technologies and transductor | Better performance on the basis of parameters |
| L4 | Overview of the FIRE 2017 IRLeD track: information retrieval from legal documents (2017) [3] | This model is used for retrieving information from legal documents | (i) Catchphrase extraction task, (ii) precedence retrieval task | Better performance |
| L5 | Text summarization from legal documents: a survey [16] | A detailed survey on text summarization for legal text | Text summarization techniques | Survey of all summarization algorithms |

(continued)
Table 1 (continued)

| Key No. | Publication (year) | Methodology/finding | Proposed classifiers | Performance/result |
|---|---|---|---|---|
| L6 | Interpretable rationale augmented charge prediction system (2018) [17] | The authors proposed a rationale-augmented (RA) model for improving prediction accuracy | Rationale-augmented classification model | The performance is good, with comparable accuracy |
| L7 | Multi-task deep learning for legal document translation, summarization and multi-label classification (2018) [10] | The authors developed a multi-model architecture to reduce the problem of data scarcity in the legal domain | Multi-model architecture | The multi-task deep learning model achieved state-of-the-art results on all tasks |
| L8 | Automatic text summarization: a comprehensive survey [11] | A comprehensive survey of ATS | ATS | The paper provides a systematic review of ATS approaches |
| L9 | A survey on extractive text summarization [12] | The various techniques, popular benchmarking datasets, and challenges of extractive summarization have been reviewed | ETS | The paper interprets extractive text summarization methods with a less redundant summary |
| L10 | Legal texts summarization by exploration of the thematic structures and argumentative roles [13] | The model builds a LetSum table-style summary for improving coherency and readability of the text | LetSum | The preliminary evaluation results are very promising |
| L11 | Robust deep reinforcement learning for extractive legal summarization [5, 18] | Training deep summarization models with the help of reinforcement learning | ELS | The model improves the performance of text summarization in the legal domain |
Table 2 presents the research studies organized by dataset. The PESC dataset [18] is the best dataset for text summarization in legal documents; it is used with deep reinforcement learning for summarizing the text and achieved the best accuracy compared to other datasets in the deep reinforcement learning setting.
Table 2 Datasets used for text summarization

| Key No. | Dataset | Domain | URL |
|---|---|---|---|
| L1 | 17,347 legal case documents [14] | Supreme Court of India | https://www.westlawasia.com/ |
| L2 | 3500 English contracts [15] | UK legal documents | http://nlp.cs.aueb.gr/publications.html |
| L3 | Legal information [9] | 1. Canadian federals 2. QuickLaw 3. Westlaw-Carswell | http://www.lexisnexis.ca, http://www.carswell.co |
| L4 | 1. A collection of legal case documents with their catchphrases 2. A collection of legal case documents and prior cases [3] | LII of India | www.liiofindia.org |
| L5 | Text Retrieval Conference (TREC), Message Understanding Conference (MUC), Document Understanding Conference (DUC4), Text Analysis Conference (TAC5), and Forum for Information Retrieval Evaluation (FIRE6) [16] | MEAD open-source dataset for summarization | http://www.trec.nist.gov, http://www-nlpir.nist.gov/related_projects/muc/ |
| L6 | Chinese legal dataset [17] | CAIL2018 | 1. http://wenshu.court.gov.cn/ 2. https://github.com/hankcs/HanLP |
| L7 | Digital corpus of the European parliament (DCEP) and Joint Research Centre—Acquis Communautaire (JRC-Acquis) [10] | European parliament | https://mediatum.ub.tum.de/1446650, https://mediatum.ub.tum.de/1446648, https://mediatum.ub.tum.de/1446655, https://mediatum.ub.tum.de/1446654, https://mediatum.ub.tum.de/1446653 |
| L8 | The most common benchmarking datasets [11] | Essex Arabic summaries corpus (EASC) dataset | https://www.lancaster.ac |
| L10 | 3500 judgments of the Federal Court of Canada [13] | Corpus | http://www.canlii.org/ca/cas/fct/ |
| L11 | Legal snippets containing approximately 595 characters [5, 18] | PESC dataset | NA |
4 Performance Comparison of Proposed Methods in the Study In this section, we compare the previous techniques based on various parameters. The techniques use different datasets in different languages. Some of the previous techniques provide results for text summarization of legal documents with reinforcement learning, while others are used for text summarization without applying reinforcement learning. Table 3 shows the performance comparison of the techniques based on these parameters. Figure 2 presents the comparison graph of text summarization techniques for legal documents; in Fig. 2, we combine all techniques with and without reinforcement learning for the performance comparison. All techniques try to achieve the best parameters, but some techniques cannot summarize the text for legal documents. The LSTM model achieves the highest value among all models: 98.8% recall without deep reinforcement learning. At the same time, the rationale-augmented model achieved 90.75% recall with deep reinforcement learning.
5 Discussion As per the findings of the performance comparison, we can say that text summarization plays a crucial role for legal documents. Text summarization produces a summary of the original legal document as easily understandable and readable content, without the need to read individual documents; moreover, it is helpful for predicting the judgment of legal documents. In Table 3, we have summarized the comparison of previous techniques to address our research questions, and we come to the following conclusions: RQ1: Which technique gives the best comparison result for text summarization with reinforcement learning?
After analyzing Fig. 3, we can say that the rationale-augmented model (RAG) is the best model for summarizing data, because the RAG model extracts the information from a legal document and is also helpful for predicting the judgment in a legal document. The RAG model achieves the best results compared to other models: its accuracy, precision, recall, and F-measure values are 79.8%, 70.5%, 90.75%, and 75.09%, respectively. The LetSum model is also used with deep reinforcement learning for text summarization; after analyzing the performance of the LetSum model, we conclude that the RAG model is best for text summarization in legal documents. RQ2: Which technique gives the best comparison result for text summarization without reinforcement learning?
After analyzing Table 3, we can say that the long short-term memory network model is the best model for text summarization without reinforcement learning. The LSTM model achieves 93.1% precision and 98.8% recall (Table 3).
Table 3 Performance comparison of text summarization techniques

| Key No. | Text summarization techniques | Proposed models | Accuracy (%) | Precision (%) | Recall (%) | F-measure (%) | Remark |
|---|---|---|---|---|---|---|---|
| L1 | Summarize algorithm [14] | Elison | NA | NA | 50.5 | 37.10 | The model achieves better performance |
| L3 | XML technologies and transductor [9] | XML | 70 | NA | NA | NA | Better performance on the basis of parameters |
| L4 | Long short-term memory network (catchphrase) [3] | LSTM | NA | 93.1 | 98.8 | NA | The model outperforms the other models |
| L5 | Text summarization [16] | Hybrid automatic summarization system | NA | NA | 62.4 | 54.9 | The model achieves better performance |
| L7 | Multi-task deep learning [10] | Multimodal architecture | NA | NA | NA | 82 | The model outperforms the other models, but the remaining parameters could not be calculated |
| L11 | PESC dataset [18] | ELS | 25.70 | NA | NA | NA | Better performance on the basis of parameters |
| L10 | Deep reinforcement learning—summarization approaches on legal documents [13] | LetSum | 57.5 | NA | NA | NA | Better performance on the basis of parameters |
| L6 | Deep reinforcement learning [17] | Rationale-augmented model | 79.8 | 70.5 | 90.75 | 75.09 | The model achieves all parameters |
Fig. 2 Performance comparison of accuracy, F-measure, and recall in text summarization techniques: (a) performance comparison of the accuracy parameter, (b) performance comparison of the F-measure parameter, (c) performance comparison of the recall parameter
The remaining models also try to achieve the best results, but compared with the other models, the LSTM model performs best for text summarization without applying reinforcement learning.
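As a side note on how such scores are computed: the following minimal sketch treats extractive summarization as a binary per-sentence decision and derives the reported accuracy, precision, recall, and F-measure with scikit-learn. Whether each surveyed paper scores at the sentence level is an assumption here, and the label vectors are invented purely for illustration.

```python
# Illustrative metric computation for extractive summarization, viewed as a
# binary classification problem (sentence selected for the summary or not).
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # gold: does the sentence belong in the summary?
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]  # a model's selections (made-up example)

print(f"Accuracy : {accuracy_score(y_true, y_pred):.2f}")
print(f"Precision: {precision_score(y_true, y_pred):.2f}")
print(f"Recall   : {recall_score(y_true, y_pred):.2f}")
print(f"F-measure: {f1_score(y_true, y_pred):.2f}")
```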
6 Conclusion Data summarization provides brief information about the original documents. Data summarization is most important in the legal domain.
Fig. 3 Text summarization with deep reinforcement learning: accuracy comparison of the rationale-augmented model and LetSum
After summarizing a document, it is straightforward to predict the judgment of the legal document. We analyzed the different datasets used by researchers in multiple languages. We also compared previous baseline models based on their advantages, findings, and outcomes. After comparing the performance of the various baseline models, we conclude that the rationale-augmented model (RAG) with reinforcement learning is the best model for data summarization of legal documents. Also, the long short-term memory network model without deep reinforcement learning is the best model for text summarization; the LSTM model achieves 93.1% precision and 98.8% recall. In future work, we will use the automatic text summarization technique with reinforcement learning to predict legal documents' judgment, because predicting the judgment of lengthy documents is a very costly and time-consuming process; hence, we can apply the text summarization model with reinforcement learning to predict legal documents' judgment. Acknowledgements This research is supported by the Council of Science and Technology, Lucknow, Uttar Pradesh via Project Sanction letter number CST/D-3330.
References 1. S.P. Singh, A. Kumar, A. Mangal, S. Singhal, Bilingual automatic text summarization using unsupervised deep learning, in 2016 International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT) (2016), pp. 1195–1200. https://doi.org/10.1109/ICEEOT.2016.7754874 2. S. Ryang, T. Abekawa, Framework of automatic text summarization using reinforcement learning, in Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (2012)
3. A. Mandal et al., Overview of the FIRE 2017 IRLeD track: information retrieval from legal documents, in FIRE (Working Notes) (2017) 4. T.N. Le, M. Le Nguyen, A. Shimazu, Unsupervised keyword extraction for Japanese legal documents. JURIX (2013) 5. D.-H. Nguyen et al., Robust deep reinforcement learning for extractive legal summarization, in International Conference on Neural Information Processing (Springer, Cham, 2021) 6. F.A. Braz et al., Document classification using a Bi-LSTM to unclog Brazil’s Supreme Court. arXiv preprint arXiv:1811.11569 (2018) 7. J.S. Manoharan, Capsule network algorithm for performance optimization of text classification. J. Soft Comput. Paradigm (JSCP) 3(01), 1–9 (2021) 8. https://marketbusinessnews.com/automatic-text-summarization-in-business/267019/ 9. E. Chieze, A. Farzindar, G. Lapalme, An automatic system for summarization and information extraction of legal information, in Semantic Processing of Legal Texts (Springer, Berlin, Heidelberg, 2010), pp. 216–234 10. A. Elnaggar et al., Multi-task deep learning for legal document translation, summarization and multi-label classification, in Proceedings of the 2018 Artificial Intelligence and Cloud Computing Conference (2018) 11. W.S. El-Kassas et al., Automatic text summarization: a comprehensive survey. Expert Syst. Appl. 165, 113679 (2021) 12. N. Moratanch, S. Chitrakala, A survey on extractive text summarization, in 2017 International Conference on Computer, Communication and Signal Processing (ICCCSP) (2017), pp. 1–6. https://doi.org/10.1109/ICCCSP.2017.7944061 13. A. Farzindar, G. Lapalme, Legal text summarization by exploration of the thematic structure and argumentative roles, in Text Summarization Branches Out (2004) 14. P. Bhattacharya et al., A comparative study of summarization algorithms applied to legal case judgments, in European Conference on Information Retrieval (Springer, Cham, 2019) 15. I. Chalkidis, I. Androutsopoulos, A deep learning approach to contract element extraction. JURIX (2017) 16. A. Kanapala, S. Pal, R. Pamula, Text summarization from legal documents: a survey. Artif. Intell. Rev. 51(3), 371–402 (2019) 17. X. Jiang et al., Interpretable rationale augmented charge prediction system, in Proceedings of the 27th International Conference on Computational Linguistics: System Demonstrations (2018) 18. L. Manor, J.J. Li, Plain English summarization of contracts. arXiv preprint arXiv:1906.00424 (2019)
Use of Near-field Communication (NFC) and Fingerprint Technology for Authentication of ATM Transactions K. Renuka, R. P. Janani, K. Lakshmi Narayanan, P. Kannan, R. Santhana Krishnan, and Y. Harold Robinson
Abstract The automated teller machine (ATM) is a handy solution for users to meet their banking needs. However, using a bank card or check card during ATM cash transactions or withdrawals has some drawbacks, such as vulnerability to ATM rigging, destruction of the card's magnetic strip, card manufacture and transportation costs, and longer customer authentication times. This study focuses on near-field communication (NFC) card-emulation mode and fingerprint technology as cash card alternatives that can be employed at the user's discretion. NFC requires a very short distance between the two devices (usually less than 4 cm), making it well suited for transactions involving important data. To collect user information, a fingerprint sensor can also be utilized instead of an NFC tag and reader. Although a cash card is not mandatory for authentication, the system will nonetheless be more secure than the current approach, which uses an ATM card. This ensures a high level of security during authentication. Keywords ATM · Card cloning · Fingerprint technology · Secure authentication · OTP · Rigging · Near-field communication
K. Renuka · R. P. Janani Francis Xavier Engineering College, Tirunelveli, Tamil Nadu, India K. Lakshmi Narayanan (B) · P. Kannan ECE Department, Francis Xavier Engineering College, Tirunelveli, Tamil Nadu, India e-mail: [email protected] P. Kannan e-mail: [email protected] R. Santhana Krishnan ECE Department, SCAD College of Engineering and Technology, Tirunelveli, Tamil Nadu, India Y. Harold Robinson School of Information Technology and Engineering, Vellore Institute of Technology, Vellore, India
1 Introduction Due to the increased use of plastic money in today's world, the use of ATM counters has also increased considerably. In reality, however, the transactions we make at ATM counters are not safe. Users are provided with a card that contains a magnetic strip, and every time the user makes a transaction, he/she must enter the personal identification number (PIN) into the ATM, which, if revealed to others, creates a serious issue. Furthermore, the repeated usage of the card during every transaction damages its magnetic strip and can cause the card to malfunction. Moreover, there are many critical problems, like ATM skimming, card cloning, etc., which make transactions even more threatening. The ultimate aim of this paper is to ensure completely secure ATM transactions without any fear of the user's details being stolen. The two technologies used in "Use of Near-field Communication (NFC) and Fingerprint Technology for Authentication of ATM Transactions" for secure authentication of ATM transactions are near-field communication (NFC) technology and fingerprint technology. NFC technology permits data transmission to a receiving device in real time [1]. With technology embedded into mobile phones to make people's lives easier, it is not necessary to carry cash or a credit card to perform bank transactions. NFC payments make payments faster and easier, and they are more convenient than the old ways [2]. Fingerprint technology is used as follows: the fingerprint module scans the customer's biometric data, generates a four-digit code as a message, and sends it to the registered customer's mobile number via a GSM modem connected to the microcontroller. Therefore, the system we have proposed will be very effective for safe ATM transactions [3, 4]. Say, for example, a person has lost their ATM card: if hackers come to know the PIN, money can easily be taken from the account; or if the ATM machine is fitted with a skimming device, there is a high risk of money theft from the user's account, and the user is left helpless, as the user's details are collected without their knowledge. To solve this problem, our system comes with NFC and fingerprint technology. Even if the user loses the card with the NFC tag, no one can take money from the user's account, as an OTP will be sent to the user's mobile number registered with the bank account; money can be withdrawn only with the OTP [5, 6]. The other option is to use the fingerprint sensor: here, even the card is not needed, as the fingerprint and the OTP are enough to access the bank account [7, 8]. Thus, high-level safety can be ensured by the proposed system. The main reason for choosing fingerprints for ATM transactions is that every user has a unique fingerprint pattern, and no fingerprint can be the same as another. This makes the user's ATM transactions more secure than other types of transactions. This system is one-of-a-kind in its own way. To access the system, a user can use a card with an NFC tag or a fingerprint. The system will link to the user's details in either of the two ways. The system asks the user to enter a four-digit number to determine whether or not they are human. Following the identification of the user's information, an OTP will be delivered to the user's
registered mobile number. The user is next requested to enter the needed amount. The transaction will then be processed.
1.1 Motivation The primary goal of our proposed work is the replacement of conventional ATM cards with an NFC tag and reader and fingerprint technology, to provide users with the best ATM transaction experience. NFC technology has a variety of applications, like contactless payment, easy pairing of a Bluetooth device using an NFC tag and reader, record keeping, secure exchange of contact details and other confidential data, etc. The NFC reader supports four modes, namely reader/writer mode, peer-to-peer mode, card-emulation mode, and wireless charging mode. Of these modes, the card-emulation mode acts as a contactless smart card that can be used for secure and contactless ATM transactions. Any NFC-enabled card or NFC-enabled smartphone can be shown above the NFC reader for the user's data to be read. The fingerprint technology makes use of a fingerprint sensor to scan the fingerprint of the user and gather the information that is registered with the user's fingerprint. Because fingerprint technology eliminates the need for ATM cards, ATM transactions are safer and more secure.
2 Related Works Hassan et al. [9] have suggested a system in which the card is replaced with a fingerprint that is linked to the bank account, and the PIN is input on a shuffled keypad. The technology was created in such a way that it prevents the misuse of actual ATM cards and allows for secure transactions. Christian et al. [2] have shown that e-service quality influences perceived usefulness, which influences intention to use, and that indicators influence perceived usefulness and NFC indicators influence perceived ease of use. Mahansaria and Roy [10] have discussed the security analysis and threat modeling, which highlight the system’s security strength during authentication. Deelaka Ranasinghe and Yu [11] have presented a unique concept design for a device that serves as an RFID or NFC tag with fingerprint authentication in their paper. Govindraj et al. [12] have proposed and addressed the challenges that the systems experience by lowering authentication time with the use of a biometric fingerprint sensor and adding an extra layer of security by generating OTP authentication using a local server. In comparison to existing methodologies, their implementation has produced better outcomes and a greater performance rate. Kolev [13] has used the advent of NFC modules to lower the cost of access control of commercial and office buildings and has devised a system that is meant to be utilized for patrol team control. Embarak [14] has presented a two-step strategy for completing ATM transactions
utilizing a closed end-to-end fraud prevention system, adding a smartphone as an additional layer for ATM transactions, and employing authentic user smartphone ID numbers to safeguard ATM transactions using current technology. Lazaro et al. [15] have investigated the feasibility of employing an NFC-enabled smartphone as a reader to read implanted sensors based on battery-free near-field communication (NFC) integrated circuits. Gupta et al. [16] have proposed a USB fingerprint-based login key.
3 Technologies Used 3.1 Near-field Communication (NFC) Technology Near-field communication (NFC) technology is widely used for short-range contactless communication. It uses electromagnetic radio fields to enable communication between any two electronic devices within a short range using an NFC tag and an NFC reader. Because transactions take place over such a short distance, NFC technology requires an NFC reader on the receiving device and an NFC tag/chip on the transmitting device. To transfer data, NFC-enabled devices must be physically touching or within a few centimeters of each other; the receiving device reads the data as soon as it is sent. Near-field communication (NFC) establishes communication between two electronic devices (or a device and an NFC tag) by bringing them close to each other. NFC builds on radio frequency identification (RFID) and other communication technologies to carry out secure transactions, which gives it benefits over Bluetooth. One of the benefits of an NFC tag is that it does not require its own power supply, because it is self-contained [10].
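As an illustration of reading a tag from a host computer, the following sketch uses the open-source nfcpy library with a USB reader. This is an assumption made for demonstration only; it is not the hardware stack of the proposed system, in which a PIC microcontroller drives the NFC reader directly.

```python
# Minimal nfcpy sketch: wait for a tag and print its UID, which an
# application could then use to look up the registered account.
import nfc

def on_connect(tag):
    print("Tag UID:", tag.identifier.hex())  # tag.identifier is raw bytes
    return True  # returning True releases the tag after this read

clf = nfc.ContactlessFrontend("usb")  # assumes a USB NFC reader is attached
try:
    clf.connect(rdwr={"on-connect": on_connect})
finally:
    clf.close()
```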
3.2 Fingerprint Biometrics There are two approaches for recognizing a fingerprint: minutiae-based and image-based methods. In the minutiae-based technique, the fingerprint is represented by local features such as terminations and bifurcations. The other method is image-based: it matches fingerprints based on the global properties of the entire image. It is a sophisticated strategy, and in this process all of the images are saved. The use of fingerprint technology makes ATM transactions extremely safe and straightforward. Because fingerprints are unique to each individual, no one else will be able to access an individual's information [17]. Physical ATM cards can be replaced with fingerprint technology to avoid issues such as ATM card skimming and card cloning.
A fingerprint verification system uses a one-to-one comparison to establish a person's genuine identity by comparing their fingerprint to the fingerprint that was acquired and saved initially. To recognize an individual, an identification system performs one-to-many comparisons, searching across all templates in the database [9, 18, 19].
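To illustrate the one-to-one comparison, the following toy sketch pairs minutiae points between a stored template and a live scan. All coordinates, tolerances, and the decision threshold are invented for demonstration; real matchers are far more sophisticated.

```python
# Toy one-to-one minutiae matching: each minutia is (x, y, angle); a stored
# template and a live scan "verify" if enough minutiae pair up within
# distance/angle tolerances (illustrative only, not a production matcher).
import math

def match_score(template, scan, dist_tol=10.0, angle_tol=0.3):
    matched, used = 0, set()
    for (x1, y1, a1) in template:
        for j, (x2, y2, a2) in enumerate(scan):
            if j in used:
                continue
            if math.hypot(x1 - x2, y1 - y2) <= dist_tol and abs(a1 - a2) <= angle_tol:
                matched += 1
                used.add(j)
                break
    return matched / max(len(template), 1)

stored = [(10, 12, 0.5), (40, 44, 1.2), (70, 20, 2.0)]   # enrolled template
live   = [(11, 13, 0.52), (39, 45, 1.18), (90, 90, 0.1)] # fresh scan
print("verified" if match_score(stored, live) >= 0.6 else "rejected")
```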
3.3 Hardware and Specifications Table 1 lists the hardware components and their specifications (Fig. 1).

Fig. 1 Proposed system

Table 1 Hardware components

| S. No. | Hardware component | Specification |
|---|---|---|
| 1 | PIC microcontroller | – |
| 2 | NFC reader and NFC tag | – |
| 3 | Fingerprint sensor | – |
| 4 | GSM module | – |
| 5 | Relay | 12 V |
| 6 | DC motor | 5 V |
| 7 | LCD | 16 × 2 display |
| 8 | Keypad | 4 × 3 matrix |
| 9 | Connecting wires | As required |
4 Proposed System The PIC microcontroller [20] has a RISC design that allows for quick performance as well as simple programming and interfacing. It is the central component to which all other hardware is attached. A fingerprint sensor that scans the user's fingerprint and an NFC reader that reads the NFC tag are the two major devices used as input sources [21]. To provide input to the system, the fingerprint sensor and NFC reader are connected to the PIC microcontroller. The output is produced by the sequential operation of three devices: the GSM module, the LCD, and the motor. Using the GSM phone network, the GSM module is utilized to receive data from a remote place; in this system, it is used to validate the OTP. The transaction status is displayed on the LCD (16 × 2). To signal that the transaction is taking place, a DC motor (5 V) is employed. Finally, for power, the entire system is connected to a nearby power socket. Table 2 compares the existing system with the proposed system.
4.1 Working The PIC microcontroller is the ultimate processing unit of the entire system. It receives the input through an NFC reader and fingerprint sensor, processes it, and gives the output with the help of a GSM module, LCD display, and motor. It is the user's choice to use either an NFC tag or a fingerprint. If the user wants to use the NFC reader, he/she must show the ATM card with an NFC tag above the NFC reader at a distance of less than 4 cm [22], as shown in Fig. 3.

Table 2 Existing system versus proposed system

| Existing system | Proposed system |
|---|---|
| ATM hacking is a serious issue; nowadays, online fraud, cloning, and counterfeiting are commonplace | Fingerprint technology and NFC are employed; therefore, hacking is impossible |
| An ATM card is a mandatory requirement for a transaction | Fingerprint technology is also incorporated, so there is no need to always carry an ATM card for money transactions |
| There is a greater risk of the user's personal information being misused | Users' information will be highly safeguarded and safe in both the NFC and fingerprint technology scenarios |
| If an ATM card is used repeatedly, the magnetic strips will be damaged | The NFC tag does not need to be in contact with the system |
| If the PIN is found by card skimming and the user's card is stolen, money can be taken from the user's account without their knowledge | If a transaction is requested, an OTP will be issued to the user's mobile number, letting them know if their card is being used by someone else |
Fig. 2 Experimental setup
Once the NFC reader reads the NFC tag and gathers the user information, and the information is valid, the PIC microcontroller continues the process by transmitting a signal to the GSM module (Fig. 2).
422
K. Renuka et al.
Fig. 3 NFC tag and reader
Fig. 4 a Verification of user with a four-digit number. b OTP is sent to registered mobile
Fig. 5 Enter the received OTP
such as contactless and cardless. The LCD displays a message such as SHOW YOUR CARD when we choose a contactless transaction (i.e., one that does not require a card). The card must next be inserted into the NFC reader as shown in the diagram. The NFC card reader displays a blue light when the card is shown to it, and it is properly identified. The graphic below illustrates this. When the card is properly recognized, it prompts us to verify it by displaying ANY 4 DT 4 HUMAN on the LCD, requesting us to enter any four digits to verify. It sends a one time password
Use of Near-field Communication (NFC) and Fingerprint …
423
Fig. 6 Keypad for entering the OTP
to your registered mobile phone and displays a message reading “CHECK YOUR MOBILE” to validate that the user is the correct user of the appropriate account after validating that the user is human. The LCD displays a message as soon as the OTP is received to the registered mobile number. ENTER UR DYNAMIC NO, THEN TYPE IN YOUR ONE TIME PASSWORD. Following the entry of the OTP, the authentication process is signaled by the operation of a motor coupled to a relay. When the user has been correctly validated, the relay will flash a blue light [12]. When we choose a cardless transaction (i.e., a transaction that uses a fingerprint), the LCD displays a notification that says “PUT YOUR FINGER IN THE DISPLAY.” The finger must then be inserted into the fingerprint reader. If the put fingerprint is correctly identified, the fingerprint controller displays a green light, otherwise, it displays a red light [27]. When a fingerprint is correctly identified, it requests human verification, and the procedure is repeated in the same way as with an NFC tag reader.
5 Conclusion This paper went over every aspect of near-field communication (NFC) technology. NFC’s range can be extended by combining it with current infrared and Bluetooth technologies. NFC is a safe and convenient means to transfer data between two
424
K. Renuka et al.
electronic devices. NFC’s interoperability with RFID technology is another benefit. NFC is based on the RFID technology. Magnetic field induction is used by RFID to establish communication between electrical devices in close proximity. NFC works at a frequency of 13.56 MHz and has a maximum data transfer rate of 424 kbps. ATMs are a convenient way for users to meet their banking demands. ATM machines are found all over the world and are utilized by a vast number of people. As a result, it is critical that ATM transactions be secure and rapid. The usage of a debit card or other type of card during ATM transactions has a number of drawbacks, including the possibility of ATM skimming, card magnetic strips becoming destroyed, card production and transit costs, and a longer time to identify users. The advantages of utilizing a smartphone in NFC card-emulation mode over an ATM card have been discussed, and it is clear that it is a viable option [14]. The suggested system’s security robustness against vulnerable assaults during authentication is highlighted by security analysis and threat modeling. In the future, we intend to replace the PIC microcontroller with a Raspberry Pi to increase the system’s speed and networking. Also, by using a smartphone to replace the NFC card, cardless ATM transactions are now possible.
5.1 Future Scope • Fingerprint technology keeps the customer’s information safe, as it is the most advanced technology to safeguard any kind of user information. • Security analysis and threat modeling shown in this work highlight the security strength of the system during authentication. • As tags and readers are integrated into NFC, privacy can be maintained more easily than with other tags. • An ATM transaction in this project does not necessitate the use of a card because it improves ATM transactions for users by introducing cardless ATM transactions. The objective for our future work is to concentrate on the security features of the custom authentication software that will be installed on the phone [28]. In addition, an in-depth examination of the probable security assaults during ATM transactions will be conducted, as part of the scope of this project. Acknowledgements This work was supported in part by the Embedded and IoT Applied Laboratory at Francis Xavier Engineering College, Tamil Nadu, India. Also, we would like to thank the anonymous reviewers for their valuable comments and suggestions.
Use of Near-field Communication (NFC) and Fingerprint …
425
References 1. C. Shuran, Y. Xiaoling, A new public transport payment method based on NFC and QR code, 2020 IEEE 5th International Conference on Intelligent Transportation Engineering (ICITE) (2020), pp. 240–244. https://doi.org/10.1109/ICITE50838.2020.9231356 2. L. Christian, H. Juwitasary, Y.U. Chandra, E.P. Putra, Fifilia, Evaluation of the E-service quality for the intention of community to use NFC technology for mobile payment with TAM, in 2019 International Conference on Information Management and Technology (ICIMTech) (2019), pp. 24–29. https://doi.org/10.1109/ICIMTech.2019.8843811 3. Shakya, S., Smys, S.: Big data analytics for improved risk management and customer segregation in banking applications. J. ISMAC 3(03), 235–249 (2021) 4. Ityala, S., Sharma, O., Honnavalli, P.B.: Transparent watermarking QR code authentication for mobile banking applications, in International Conference on Inventive Computation Technologies (Springer, Cham, 2019), pp. 738–748 5. M. Satheesh, M. Deepika, Implementation of multifactor authentication using optimistic fair exchange. J. Ubiquitous Comput. Commun. Technol. (UCCT) 2(02), 70–78 (2020) 6. Manoharan, J.S., A novel user layer cloud security model based on chaotic Arnold transformation using fingerprint biometric traits. J. Innov. Image Process. (JIIP) 3(01), 36–51 (2021) 7. Joe, C.V., Raj, J.S.: Deniable authentication encryption for privacy protection using blockchain. J. Artif. Intell. Capsule Netw. 3(3), 259–271 (2021) 8. Pathak, B., Pondkule, D., Shaha, R., Surve, A.: Visual cryptography and image processing based approach for bank security applications, in International Conference on Computer Networks and Inventive Communication Technologies (Springer, Cham, 2019), pp. 292–298 9. A. Hassan, A. George, L. Varghese, M. Antony, K.K. Sherly, The biometric cardless transaction with shuffling keypad using proximity sensor, in 2020 Second International Conference on Inventive Research in Computing Applications (ICIRCA) (2020), pp. 505–508. https://doi.org/ 10.1109/ICIRCA48905.2020.9183314 10. D. Mahansaria, U.K. Roy, Secure authentication for ATM transactions using NFC technology, in 2019 International Carnahan Conference on Security Technology (ICCST) (2019), pp. 1–5. https://doi.org/10.1109/CCST.2019.8888427 11. R.M.N. Deelaka Ranasinghe, G.Z. Yu, RFID/NFC device with embedded fingerprint authentication system, in 2017 8th IEEE International Conference on Software Engineering and Service Science (ICSESS) (2017), pp. 266–269. https://doi.org/10.1109/ICSESS.2017.8342911 12. V.J. Govindraj, P.V. Yashwanth, S.V. Bhat, T.K. Ramesh, Smart door using biometric NFC band and OTP based methods, in 2020 International Conference for Emerging Technology (INCET) (2020), pp. 1–4. https://doi.org/10.1109/INCET49848.2020.9153970 13. S. Kolev, Designing a NFC system, in 2021 56th International Scientific Conference on Information, Communication and Energy Systems and Technologies (ICEST) (2021), pp. 111–113. https://doi.org/10.1109/ICEST52640.2021.9483482 14. O.H. Embarak, A two-steps prevention model of ATM frauds communications, in 2018 Fifth HCT Information Technology Trends (ITT) (2018), pp. 306–311. https://doi.org/10.1109/CTIT. 2018.8649551 15. A. Lazaro, M. Boada, R. Villarino, D. Girbau, Feasibility study on the reading of energyharvested implanted NFC tags using mobile phones and commercial NFC IC, in 2020 IEEE MTT-S International Microwave Biomedical Conference (IMBioC) (2020), pp. 1–3. https://doi. org/10.1109/IMBIoC47321.2020.9385033 16. R. Gupta, G. 
Arora, A. Rana, USB fingerprint login key, in 2020 8th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO) (2020), pp. 454–457. https://doi.org/10.1109/ICRITO48877.2020.9197785 17. Amruth, Y., Gopinatha, B.M., Gowtham, M.P., Kiran, M., Harshalatha, Y., Fingerprint and signature authentication system using CNN, in 2020 IEEE International Conference for Innovation in Technology (INOCON) (2020), pp. 1–4. https://doi.org/10.1109/INOCON50539.2020. 9298235
18. A. Sathesh, Enhanced soft computing approaches for intrusion detection schemes in social media networks. J. Soft Comput. Paradigm (JSCP) 1(02), 69–79 (2019) 19. S.R. Mugunthan, Soft computing based autonomous low rate DDOS attack detection and security for cloud computing. J. Soft Comput. Paradigm (JSCP) 1(02), 80–90 (2019) 20. S.S. Devi, T.S. Prakash, G. Vignesh, P.V. Venkatesan, Ignition system based licensing using PIC microcontroller, in 2021 Second International Conference on Electronics and Sustainable Communication Systems (ICESC) (2021), pp. 252–256. https://doi.org/10.1109/ICESC51422. 2021.9532920 21. A. Mandalapu, V. Daffney Deepa, L.D. Raj, J. Anish Dev, An NFC featured three level authentication system for tenable transaction and abridgment of ATM card blocking intricacies, in 2015 International Conference and Workshop on Computing and Communication (IEMCON) (2015), pp. 1–6. https://doi.org/10.1109/IEMCON.2015.7344491 22. A. Albattah, Y. Alghofaili, S. Elkhediri, NFC technology: assessment effective of security towards protecting NFC devices & services, in 2020 International Conference on Computing and Information Technology (ICCIT-1441) (2020), pp. 1–5. https://doi.org/10.1109/ICCIT-144 147971.2020.9213758 23. A. Khatoon, M. Sharique, Performance of GSM and GSM-SM over α-μ fading channel model, in TENCON 2019—2019 IEEE Region 10 Conference (TENCON) (2019), pp. 1981–1985. https://doi.org/10.1109/TENCON.2019.8929377 24. S. Sridharan, K. Malladi, New generation ATM terminal services, in 2016 International Conference on Computer Communication and Informatics (ICCCI) (2016), pp. 1–6. https://doi.org/ 10.1109/ICCCI.2016.7479928 25. P. Poonia, O.G. Deshmukh, P.K. Ajmera, Adaptive quality enhancement fingerprint analysis, in 2020 3rd International Conference on Emerging Technologies in Computer Engineering: Machine Learning and Internet of Things (ICETCE) (2020), pp. 149–153. https://doi.org/10. 1109/ICETCE48199.2020.9091760 26. J. Li, Research on DC motor driver in automobile electric power steering system, in 2020 International Conference on Intelligent Transportation, Big Data & Smart City (ICITBS) (2020), pp. 449–453. https://doi.org/10.1109/ICITBS49701.2020.00097 27. R.B.S. Prajeesha, N. Nagabhushan, T. Madhavi, Fingerprint-based licensing for driving, in 2021 6th International Conference for Convergence in Technology (I2CT) (2021), pp. 1–6. https://doi.org/10.1109/I2CT51068.2021.9418134 28. Y. Kim, M. Jun, A design of user authentication system using QR code identifying method, in 2011 6th International Conference on Computer Sciences and Convergence Information Technology (ICCIT) (2011), pp. 31–35
Light Gradient Boosting Machine in Software Defect Prediction: Concurrent Feature Selection and Hyper Parameter Tuning Suresh Kumar Pemmada, Janmenjoy Nayak, H. S. Behera, and Danilo Pelusi Abstract Predicting software defects is critical for ensuring software quality. Many supervised learning approaches have been used to detect defect-prone instances in recent years. However, the efficacy of these supervised learning approaches is still inadequate, and more sophisticated techniques are required to boost the effectiveness of defect prediction models. In this paper, we present a light gradient boosting methodology based on ensemble learning that uses simultaneous feature selection (Recursive Feature Elimination (RFE)) and hyperparameter tuning (random search). Our proposed LGBM + Randomsearch + RFE method is evaluated using the AEEEM dataset, including Apache Lucene, Eclipse JDT Core, Equinox, Mylyn, and Eclipse PDE UI. The experimental findings demonstrate that the proposed approach outperforms LGBM + Randomsearch, LGBM, and the top classical machine learning algorithms on all performance criteria considered. Keywords Light gradient boosting machine · Recursive feature elimination · Software defect prediction · Ensemble learning
S. K. Pemmada (B) Department of Computer Science and Engineering, Aditya Institute of Technology and Management (AITAM), Tekkali 532201, India e-mail: [email protected] J. Nayak Department of Computer Science, Maharaja Sriram Chandra Bhanja Deo University, Baripada, Odisha 757003, India S. K. Pemmada · H. S. Behera Department of Information Technology, Veer Surendra Sai University of Technology, Burla 768018, India D. Pelusi Communication Sciences, University of Teramo, Teramo, Italy e-mail: [email protected]
1 Introduction In the software development life cycle, software testing is a vital but costly procedure. Defects are inevitable in contemporary software development due to its complexity. Defect-prone software projects may have unintended repercussions when deployed, resulting in massive losses for businesses or even endangering people's lives. Most approaches for assessing defects rely on historical information and pertinent data to build a prediction model. They allow the software industry to inspect source files and code before release; as a result, any defects can be identified and corrected on time [1]. Currently, defect fixing consumes more than 80% of software maintenance and development expenditures. If these defects could be caught early in the software development cycle, the cost would be significantly reduced. As a consequence, a number of academics have sought to build defect prediction models to assist programmers in discovering probable defects ahead of time [2]. Software defect prediction (SDP) has attracted a lot of attention as a quality factor that may assist software developers in finding more defects in their systems. It entails using machine learning (ML) approaches on software metrics generated from software system repositories to anticipate a software system's quality and dependability [3]. Software engineers may utilize the knowledge acquired from SDP procedures to improve software development processes and manage constrained software resources [4]. Due to the constraints of software testing abilities and performance, defect prediction methodologies are still unsatisfactory in practice. The software may be harmed if the prediction model deviates from the norm or has low prediction performance. Because most of the instances are defect-free and only a few are defect-prone, class imbalance is a major factor impacting prediction performance in SDP; that is, the dataset is imbalanced. As a result, a significant number of researchers have focused on unbalanced learning for SDP, and several earlier studies [5, 6] have presented various ways of dealing with dataset imbalance. In this article, the class imbalance problem is addressed using the Synthetic Minority Over-sampling Technique (SMOTE). SMOTE is an over-sampling approach in which synthetic samples are created for the minority class; it concentrates on the feature space and produces new examples by interpolating between positive instances that are close together. However, the performance of such models alone is still insufficient to accomplish the intended outcomes. Feature selection is the process of selecting features from the large number of features in a dataset. It is one of the most important fields of study in high-dimensional data analysis, and it is crucial for constructing highly effective machine learning algorithms. There are three types of feature selection approaches: filter, wrapper, and embedded methods. In this article, the wrapper method is used. Wrapper approaches choose subsets of features from the whole feature set and then train the model; the features to be removed from or added to the feature subset are determined depending on the results of the preceding model. Recursive Feature Elimination (RFE) is a wrapper-based feature selection concept that employs a greedy algorithm in its execution.
Feature selection is the process of selecting informative features from the large number of features in a dataset, and it is one of the most important areas of research in high-dimensional data analysis. The feature selection process is crucial for constructing highly effective machine learning models. Three types of feature selection approaches are available: filter, wrapper, and embedded methods; this article uses a wrapper method. Wrapper approaches choose subsets of features from the whole feature set and then train the model, and the features to be removed from, or added to, the feature subset are determined by the results of the preceding model. Recursive Feature Elimination (RFE) is a wrapper-style feature selection method that employs a greedy algorithm. The RFE algorithm begins with the entire feature set and selects feature subsets based on classification accuracy: the features are ranked at the end of each iteration, and the feature with the least importance is eliminated. This process is repeated until only the relevant features remain in the feature set [7].
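A minimal RFE sketch, continuing the placeholder example above; the choice of 25 retained features and a step size of 1 mirrors the settings reported later in Sect. 4.2, while the estimator and variable names are illustrative.

```python
from lightgbm import LGBMClassifier
from sklearn.feature_selection import RFE

# Greedy wrapper selection: start from all 61 features and, per iteration,
# drop the single least important feature (step=1) until 25 remain.
selector = RFE(estimator=LGBMClassifier(random_state=25),
               n_features_to_select=25, step=1)
selector.fit(X_bal, y_bal)

X_sel = selector.transform(X_bal)        # reduced training matrix
X_test_sel = selector.transform(X_test)  # same 25 columns for the test split
# selector.support_ is a boolean mask of kept features;
# selector.ranking_ assigns rank 1 to every retained feature.
```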
Aiming to improve defect prediction performance, Logistic Regression (LR) [8], Naive Bayes (NB) [9], Support Vector Machine (SVM) [10], and Random Forest (RF) [11] have been implemented effectively to predict different defects in software systems. However, on skewed and duplicated defect datasets these techniques are sub-optimal, and their prediction performance deteriorates when the defect datasets include missing or irrelevant information. Individual classifiers, such as support vector machines (SVMs) and artificial neural networks (ANNs), are biased toward the majority class and disregard the minority class, resulting in a high false negative rate (FNR) [12]. It is worth noting that ensemble learning models are well suited to the data difficulties stated above. Ensembles are models that combine the predictions of two or more different models. Ensemble learning approaches lower the spread in a predictive model's average skill, enhance the average prediction performance over any contributing classifier in the ensemble, and frequently reduce the variance component of the prediction errors made by the contributing models. Although ensemble learning techniques increase computational cost and complexity, they may provide better predictions and performance than any individual contributing model. The main contributions of this article are as follows:

(A) It demonstrates the utilization of a light gradient boosting machine, an ensemble learning methodology that integrates distinct machine learning algorithms, for identifying defect-prone and defect-free modules.
(B) The class imbalance problem is addressed by employing SMOTE to generate synthetic minority samples.
(C) Recursive feature elimination is employed to identify significant features that improve performance.
The rest of this work is organized as follows: the literature on software defect prediction using different machine learning and ensemble learning techniques is reviewed in Sect. 2. The proposed methodology, together with its framework, is presented in Sect. 3. Section 4 covers the empirical data, simulation environment, model parameter configuration, and analysis of results. Finally, Sect. 5 concludes the work and suggests potential future directions.
2 Literature Study

This section reviews the literature on software defect prediction that deals with class imbalance and feature selection. Catherine and Djodilatchoumy [13] investigated the usage of a Multi-Layer Perceptron Neural Network (MLP-NN) for effective prediction of defects.
They used MLP-NN as an attribute evaluator with a subset of features selected by a collection-based multi-filter selection technique and correlation-based feature selection. Five datasets available in AEEEM were used to test the model, and the results were compared with those of other well-known classifiers such as Logistic Regression, Random Tree, and MLP-NN. The results show that the proposed approach outperformed the others and that feature selection significantly enhanced prediction accuracy.

To deal with unbalanced data in software defect prediction, Guo et al. [14] employed a random over-sampling strategy to construct minority class instances in a high-dimensional sample space. To make the generation of new synthetic samples robust, two limits were imposed: scaling the random over-sampling scope to a sufficient region and distinguishing the majority class samples in critical positions. They experimentally validated the proposed technique on the software project datasets ivy-2.0, log4j-1.1, xalan-2.5, velocity-1.4, redaktor, synapse-1.1, arc, lucene-2.4, and MW1, and the results outperform typical unbalanced-processing algorithms. Table 1 summarizes some further literature on software defect prediction.
3 Proposed Method

This section outlines the proposed intelligent method: a light gradient boosting machine (LGBM) with random search and RFE for predicting defect-free and defect-prone modules, built on tree learning algorithms that follow the boosting concept [23]. The approach couples the light gradient boosting model with simultaneous hyperparameter tuning and feature selection: random search is utilized to tune the model parameters, while a ranking-based feature selection method, recursive feature elimination, is employed to enhance the proposed method's performance. LGBM accelerates the training phase and reduces memory use by using a leaf-wise growth process with depth restrictions and histogram-based techniques. The level-wise growth strategy of a conventional decision tree is inefficient because it treats all leaves of the same layer equally, incurring a large amount of superfluous computation and memory. Leaf-wise growth is a more effective method that finds the leaf with the greatest branching advantage and splits it in each branching cycle. To keep the resulting trees from growing too deep, LGBM adds a maximum depth limit to leaf-wise growth, preventing overfitting while maintaining high performance [24]. The framework of the proposed method is presented in Fig. 1.
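As a sketch of the learner just described, the configuration below caps leaf-wise growth with both num_leaves and max_depth; the parameter values here are illustrative stand-ins, not the tuned values reported later in Table 3, and the training data names carry over from the earlier sketches.

```python
from lightgbm import LGBMClassifier

# Leaf-wise (best-first) growth: each step splits the leaf with the largest
# gain; max_depth bounds tree depth to curb overfitting, and histogram-based
# binning keeps training fast and memory-light.
lgbm = LGBMClassifier(
    boosting_type="goss",  # gradient-based one-side sampling
    num_leaves=30,         # cap on leaves per tree
    max_depth=10,          # explicit depth limit on leaf-wise growth
    learning_rate=0.1,
    n_estimators=150,
    random_state=25,
)
lgbm.fit(X_sel, y_bal)
```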
4 Experimental Setup and Result Discussion

This section describes the dataset and the simulation environment, and discusses the findings of the proposed and compared methods.
Table 1 Literature on software defect prediction

| S. No | Dataset (source) | Method | Performance | Evaluation factor | References |
|---|---|---|---|---|---|
| 1 | PROMISE, AEEEM | Support vector machine + SMOTE | Mylyn: F-measure, AUC: 0.74 | F-measure, AUC | [15] |
| 2 | NASA, AEEEM | Sampling with the majority (SWIM) boost | G-mean: 0.72, AUC: 0.74 | G-mean, AUC | [16] |
| 3 | AEEEM | Extended nearest neighbor algorithm, SMOTE, simulated annealing | Accuracy: 87.19 | Accuracy | [17] |
| 4 | PROMISE | Naïve Bayes | Accuracy: 98.7 | Accuracy | [18] |
| 5 | NASA | Linear regression, Naïve Bayes | Naïve Bayes accuracy: 98.1 | Accuracy | [19] |
| 6 | PROMISE | Random forest | Precision: 0.893, recall: 0.919, accuracy: 0.946, F-measure: 0.919 | Precision, recall, accuracy, F-measure | [20] |
| 7 | NASA, PROMISE, ReLink, AEEEM | Naïve Bayes, decision tree | ReLink AUC: 95% | AUC | [21] |
| 8 | AEEEM (JDT, PDE, Mylyn), jm1, kc1, PR, ECLIPSE | Multiview transfer learning for SDP | Eclipse AUC: 0.8964 | AUC | [1] |
| 9 | NASA, PROMISE, AEEEM, ReLink | Deep forest for defect prediction | NASA AUC: 0.92 | AUC | [2] |
| 10 | AEEEM, NASA, PROMISE | Distance-based classifiers, connectivity-based classifiers | AEEEM AUC: 0.83 | AUC | [22] |
4.1 Empirical Data

D'Ambros et al. collected the AEEEM dataset [25] from Apache and Eclipse projects. AEEEM contains 61 different metrics for each program, integrating numerous traditional source code metrics that characterize the software, together with change metrics, entropy of changes, and churn of source code metrics [26].

Fig. 1 The framework of the proposed method
Table 2 shows the AEEEM dataset's precise details, including the total number of modules, the number of software metrics, the number of majority (defect-free) and minority (defect-prone) samples, and the imbalance rate (IR) for each dataset.

Table 2 Detailed statistics of the AEEEM repository

| Dataset program | Modules | Features | Majority samples | Minority samples | Imbalance rate |
|---|---|---|---|---|---|
| Apache Lucene | 691 | 61 | 627 | 64 | 9.8 |
| Equinox | 324 | 61 | 195 | 129 | 1.5 |
| Eclipse JDT Core | 997 | 61 | 791 | 206 | 3.8 |
| Mylyn | 1862 | 61 | 1617 | 245 | 6.6 |
| Eclipse PDE UI | 1497 | 61 | 1288 | 209 | 6.2 |
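The imbalance rate in Table 2 is simply the ratio of majority to minority samples; a quick check of the tabulated counts:

```python
# Imbalance rate (IR) = majority samples / minority samples, per Table 2.
counts = {
    "Apache Lucene": (627, 64),
    "Equinox": (195, 129),
    "Eclipse JDT Core": (791, 206),
    "Mylyn": (1617, 245),
    "Eclipse PDE UI": (1288, 209),
}
for project, (majority, minority) in counts.items():
    print(f"{project}: IR = {majority / minority:.1f}")
# Apache Lucene 9.8, Equinox 1.5, Eclipse JDT Core 3.8, Mylyn 6.6,
# Eclipse PDE UI 6.2
```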
Table 3 LGBM parameters in AEEEM

| Dataset | LGBM parameters (RFE + random search) |
|---|---|
| Apache Lucene | boosting_type = 'goss', learning_rate = 0.29788814813308806, max_depth = 10, n_estimators = 150, num_leaves = 30, random_state = 25 |
| Equinox | boosting_type = 'goss', class_weight = None, colsample_bytree = 1.0, importance_type = 'split', learning_rate = 0.29923603106613095, max_depth = 12, min_child_samples = 20, min_child_weight = 0.001, min_split_gain = 0.0, n_estimators = 150, n_jobs = -1, num_leaves = 30, objective = None, random_state = 25, reg_alpha = 0.0, reg_lambda = 0.0, silent = True, subsample = 1.0, subsample_for_bin = 200000, subsample_freq = 0 |
| Eclipse JDT Core | boosting_type = 'goss', class_weight = None, colsample_bytree = 1.0, importance_type = 'split', learning_rate = 0.2975322352484806, max_depth = 12, min_child_samples = 20, min_child_weight = 0.001, min_split_gain = 0.0, n_estimators = 150, n_jobs = -1, num_leaves = 150, objective = None, random_state = 25, reg_alpha = 0.0, reg_lambda = 0.0, silent = True, subsample = 1.0, subsample_for_bin = 200000, subsample_freq = 0 |
| Mylyn | learning_rate = 0.25589949734941975, max_depth = 10, n_estimators = 150, num_leaves = 50, random_state = 25 |
| Eclipse PDE UI | learning_rate = 0.20777294930484105, max_depth = 12, n_estimators = 150, num_leaves = 100, random_state = 25 |
4.2 Simulation Environment and Parameter Setting

The proposed approach and all comparative machine learning methods have been implemented on a Lenovo IdeaPad Flex 5 machine with the following system settings: processor: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80 GHz, RAM: 16 GB, GPU: Intel(R) Iris(R) Xe Graphics, OS: Windows 11 Home, with Python modules such as NumPy, Pandas, scikit-learn, lightgbm, and matplotlib. Recursive Feature Elimination is used to choose features, while random search is utilized to tune the parameters of the light gradient boosting machine for software defect prediction. In the recursive feature elimination process, we retained 25 of the 61 features, used a step size of 1, and continued the process for up to 22 iterations. Table 3 shows the parameter configuration used to evaluate the proposed approach on the various AEEEM projects.
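A hedged sketch of the tuning step: the search space below is merely shaped after the parameter values in Table 3 (the exact distributions used are not stated in this article), and it is fitted on the RFE-reduced, SMOTE-balanced training data from the earlier sketches.

```python
from lightgbm import LGBMClassifier
from scipy.stats import randint, uniform
from sklearn.model_selection import RandomizedSearchCV

# Hypothetical distributions inspired by Table 3, not the exact search space.
param_dist = {
    "boosting_type": ["goss", "gbdt"],
    "learning_rate": uniform(0.01, 0.29),   # samples from [0.01, 0.30)
    "max_depth": randint(6, 13),
    "num_leaves": randint(20, 160),
    "n_estimators": randint(50, 200),
}
search = RandomizedSearchCV(
    LGBMClassifier(random_state=25),
    param_distributions=param_dist,
    n_iter=50, cv=5, scoring="accuracy", random_state=25, n_jobs=-1,
)
search.fit(X_sel, y_bal)
print(search.best_params_)
```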
4.3 Result Discussion

The dataset described above contains a number of software projects whose modules have been categorized as defect-free or defect-prone. A light gradient boosting approach has been proposed to detect the defect-prone modules in the AEEEM data.
According to the literature review, most research papers regard accuracy as the significant measure in classification problems. Apart from accuracy, metrics such as true positive rate (TPR), false positive rate (FPR), precision, true negative rate (TNR), F1-score, and ROC-AUC are useful in gaining a better understanding of the simulated outcomes. The proposed method has been validated against different ML algorithms, namely KNN, RF, SGD, DT, LR, GNB, LDA, and QDA, using accuracy, TPR, FPR, precision, TNR, F1-score, and ROC-AUC as performance measures [27]. Table 4 presents the performance of the proposed LGBM + Randomsearch + RFESHAP, of LGBM + Randomsearch and LGBM, and of several machine learning algorithms (KNN, RF, SGD, DT, LR, GNB, LDA, and QDA) on the AEEEM data.

In Apache Lucene, LGBM + Randomsearch + RFE and LGBM + Randomsearch achieve a testing accuracy of 0.952, whereas LGBM, KNN, RF, SGD, DT, GNB, LDA, LR, and QDA achieve testing accuracies of 0.932, 0.793, 0.916, 0.713, 0.900, 0.685, 0.785, 0.761, and 0.737, respectively. In terms of TPR, LGBM + Randomsearch and LGBM did well with 0.945; however, the proposed LGBM + Randomsearch + RFE outperformed the others in TNR, F1-score, ROC-AUC, FPR, and precision. In the case of Equinox, LGBM + Randomsearch + RFESHAP ranks first with 0.872 testing accuracy, followed by LGBM + Randomsearch and LDA with 0.859 and 0.821, respectively; LGBM performed best in TPR, while the proposed approach did better in precision, FPR, TNR, ROC-AUC, and F1-score. In the case of Eclipse JDT Core, the proposed LGBM + Randomsearch + RFESHAP obtained a testing accuracy of 0.912, outperforming all other approaches, with 0.920, 0.097, 0.909, 0.903, 0.914, and 0.911 for TPR, FPR, precision, TNR, F1-score, and ROC-AUC, respectively. LGBM + Randomsearch + RFESHAP also produced the best results for Mylyn, followed by LGBM, which equals the proposed LGBM + Randomsearch; KNN, SGD, RF, DT, GNB, LR, LDA, and QDA achieved accuracies of 0.833, 0.669, 0.918, 0.867, 0.631, 0.767, 0.757, and 0.675, respectively. In the case of Eclipse PDE UI, LGBM + Randomsearch + RFESHAP is the most accurate, with a score of 0.928, followed by 0.926, 0.923, and 0.919 for LGBM + Randomsearch, LGBM, and random forest, respectively; the accuracies of KNN, SGD, DT, GNB, LR, LDA, and QDA are 0.841, 0.611, 0.866, 0.640, 0.723, 0.738, and 0.690. The proposed technique outperformed the other techniques on all the performance criteria considered, which indicates that it is a reliable model.

The ROC curves of the proposed approach for both classes, together with their coverage in analyzing the separation between the classes of the Apache Lucene, Equinox, Eclipse JDT Core, Mylyn, and Eclipse PDE UI datasets, are shown in Fig. 2a-e. The proposed method has a higher coverage area in both micro and macro average than all the ROC curves generated for the ensemble and machine learning-based models, demonstrating a better capacity to distinguish between defect-free and defect-prone modules.
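Continuing the running example, the measures reported in Table 4 can be derived from the confusion matrix on the held-out split; the variable names carry over from the earlier sketches and are illustrative, not the evaluation code used for the reported results.

```python
from sklearn.metrics import confusion_matrix, f1_score, roc_auc_score

best = search.best_estimator_
y_pred = best.predict(X_test_sel)

tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
tpr = tp / (tp + fn)        # true positive rate (recall)
tnr = tn / (tn + fp)        # true negative rate
fpr = fp / (fp + tn)        # false positive rate
precision = tp / (tp + fp)
f1 = f1_score(y_test, y_pred)
auc = roc_auc_score(y_test, best.predict_proba(X_test_sel)[:, 1])
print(f"TPR={tpr:.3f} TNR={tnr:.3f} FPR={fpr:.3f} "
      f"precision={precision:.3f} F1={f1:.3f} AUC={auc:.3f}")
```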
Table 4 Performance of the proposed and several comparison models in AEEEM

[Table 4 reports training accuracy, testing accuracy, precision, true positive rate, true negative rate, false positive rate, F1-score, and ROC-AUC for the proposed method, LGBM + RS, LGBM, KNN, SGD, RF, DT, GNB, LR, LDA, and QDA on each AEEEM project; the individual cell values could not be reliably recovered from the source extraction. Key: Proposed Method: LGBM + Randomsearch + RFESHAP; LGBM + RS: LGBM + Randomsearch; LU: Apache Lucene; JD: Eclipse JDT Core; EQ: Equinox; My: Mylyn; PD: Eclipse PDE UI]
Fig. 2 ROC-AUC curve of the proposed method with a Apache Lucene, b Equinox, c Eclipse JDT Core, d Mylyn, e Eclipse PDE UI
Table 5 Performance comparison of the proposed method with previous articles

[Table 5 lists, per project, the proposed method's scores against previously reported results from DPCMM [28], MLP + FS and LR [13], MDA-O and ManualDown [29], DPDF and related baselines (NB, SVM, DBN, LR) [2], SC [22], and MTDT [1]; the cell-by-cell layout could not be reliably recovered from the source extraction. Proposed LGBM + Randomsearch + RFESHAP results: Apache Lucene: accuracy 0.952, AUC 0.955, F1-score 0.959; Equinox: accuracy 0.872, AUC 0.872, F1-score 0.872; Eclipse JDT Core: accuracy 0.912, AUC 0.911, F1-score 0.914; Mylyn: accuracy 0.9212, AUC 0.9211, F1-score 0.9224; Eclipse PDE UI: accuracy 0.926, AUC 0.927, F1-score 0.927]
Table 5 compares the proposed approach's performance with that of several earlier articles; the results reveal that the proposed method outperformed them on all performance measures. In Apache Lucene, the proposed approach scored 0.952 accuracy, 0.955 AUC, and 0.959 F1-score, compared with 0.93 accuracy, 0.82 AUC, and 0.761 F1-score in the previous studies. In Equinox, the proposed method achieved accuracy, AUC, and F1-score of 0.872 each, while the greatest previously reported accuracy, AUC, and F1-score were 0.80, 0.85, and 0.75, respectively. In Eclipse JDT Core, the accuracy, AUC, and F1-score of the proposed technique are 0.912, 0.911, and 0.914, respectively, while the maxima in prior publications were 0.85, 0.86, and 0.699. In Mylyn, the proposed technique's accuracy, AUC, and F1-score are 0.9212, 0.9211, and 0.9224, respectively, against previous maxima of 0.87, 0.82, and 0.63. In Eclipse PDE UI, the proposed approach scored 0.926 accuracy, 0.927 AUC, and 0.927 F1-score, whereas the best prior results were 0.87 accuracy, 0.82 AUC, and 0.66 F1-score. These findings indicate that the proposed approach performs considerably better than earlier publications in terms of accuracy, AUC, and F1-score.
5 Conclusion

Despite the availability of various machine learning-based algorithms, detecting and identifying software defects has always been difficult. To address this research gap, this work proposes an ensemble learning-based LGBM model.
This research aims to resolve high dimensionality and to choose the best hyperparameters for the learning algorithm. The efficacy of filter feature selection techniques varies, making it difficult to choose an appropriate and relevant filter approach for SDP. In the proposed light gradient boosting approach, recursive feature elimination is therefore utilized for feature selection while random search is simultaneously employed for hyperparameter tuning. The effectiveness of the proposed method, LGBM + Randomsearch + RFESHAP, has been validated against LGBM + Randomsearch, LGBM, and several ML techniques such as SGD, KNN, RF, GNB, DT, LR, LDA, and QDA. Based on various performance measures and extensive experiments, it is clear that the proposed approach is effective in detecting software defects: the proposed model identified defects in software modules with 0.952, 0.872, 0.912, 0.921, and 0.926 accuracy for Apache Lucene, Equinox, Eclipse JDT Core, Mylyn, and Eclipse PDE UI, respectively. In the future, we will look at the possibility of applying the proposed technique to cross-project defect prediction on additional projects. Finally, we would like to investigate the SMOTUNED technique for handling the class imbalance issue in defect prediction and to conduct comprehensive research comparing SMOTE and SMOTUNED on unbalanced datasets.
References

1. J. Chen, Y. Yang, K. Hu, Q. Xuan, Y. Liu, C. Yang, Multiview transfer learning for software defect prediction. IEEE Access 7, 8901–8916 (2019). https://doi.org/10.1109/ACCESS.2018.2890733
2. T. Zhou, X. Sun, X. Xia, B. Li, X. Chen, Improving defect prediction with deep forest. Inf. Softw. Technol. 114, 204–216 (2019). https://doi.org/10.1016/j.infsof.2019.07.003
3. P. Suresh Kumar, H.S. Behera, J. Nayak, B. Naik, A pragmatic ensemble learning approach for effective software effort estimation. Innov. Syst. Softw. Eng. (2021). https://doi.org/10.1007/s11334-020-00379-y
4. P. Suresh Kumar, H.S. Behera, J. Nayak, B. Naik, Bootstrap aggregation ensemble learning-based reliable approach for software defect prediction by using characterized code feature. Innov. Syst. Softw. Eng. 17(4), 355–379 (2021). https://doi.org/10.1007/s11334-021-00399-2
5. R. Shatnawi, Improving software fault-prediction for imbalanced data, in 2012 International Conference on Innovations in Information Technology (IIT), Mar 2012, pp. 54–59. https://doi.org/10.1109/INNOVATIONS.2012.6207774
6. R. Chen, S.-K. Guo, X.-Z. Wang, T.-L. Zhang, Fusion of multi-RSMOTE with fuzzy integral to classify bug reports with an imbalanced distribution. IEEE Trans. Fuzzy Syst. 27(12), 2406–2420 (2019). https://doi.org/10.1109/TFUZZ.2019.2899809
7. S. Mehta, K.S. Patnaik, Improved prediction of software defects using ensemble machine learning techniques. Neural Comput. Appl. 33(16), 10551–10562 (2021). https://doi.org/10.1007/s00521-021-05811-3
8. V.U.B. Challagulla, F.B. Bastani, I.-L. Yen, R.A. Paul, Empirical assessment of machine learning based software defect prediction techniques, in 10th IEEE International Workshop on Object-Oriented Real-Time Dependable Systems (2005), pp. 263–270. https://doi.org/10.1109/WORDS.2005.32
9. Ö.F. Arar, K. Ayan, A feature dependent Naive Bayes approach and its application to the software defect prediction problem. Appl. Soft Comput. 59, 197–209 (2017). https://doi.org/10.1016/j.asoc.2017.05.043
10. X. Rong, F. Li, Z. Cui, A model for software defect prediction using support vector machine based on CBA. Int. J. Intell. Syst. Technol. Appl. 15(1), 19 (2016). https://doi.org/10.1504/IJISTA.2016.076102
11. H. Lu, B. Cukic, M. Culp, Software defect prediction using semi-supervised learning with dimension reduction, in Proceedings of the 27th IEEE/ACM International Conference on Automated Software Engineering (ASE 2012) (2012), p. 314. https://doi.org/10.1145/2351676.2351734
12. I.H. Laradji, M. Alshayeb, L. Ghouti, Software defect prediction using ensemble learning on selected features. Inf. Soft