Lecture Notes in Networks and Systems 719
Jyoti Choudrie Parikshit N. Mahalle Thinagaran Perumal Amit Joshi Editors
ICT with Intelligent Applications ICTIS 2023, Volume 1
Lecture Notes in Networks and Systems Volume 719
Series Editor Janusz Kacprzyk , Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland Advisory Editors Fernando Gomide, Department of Computer Engineering and Automation—DCA, School of Electrical and Computer Engineering—FEEC, University of Campinas—UNICAMP, São Paulo, Brazil Okyay Kaynak, Department of Electrical and Electronic Engineering, Bogazici University, Istanbul, Türkiye Derong Liu, Department of Electrical and Computer Engineering, University of Illinois at Chicago, Chicago, USA Institute of Automation, Chinese Academy of Sciences, Beijing, China Witold Pedrycz, Department of Electrical and Computer Engineering, University of Alberta, Alberta, Canada Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland Marios M. Polycarpou, Department of Electrical and Computer Engineering, KIOS Research Center for Intelligent Systems and Networks, University of Cyprus, Nicosia, Cyprus Imre J. Rudas, Óbuda University, Budapest, Hungary Jun Wang, Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
The series “Lecture Notes in Networks and Systems” publishes the latest developments in Networks and Systems—quickly, informally and with high quality. Original research reported in proceedings and post-proceedings represents the core of LNNS. Volumes published in LNNS embrace all aspects and subfields of, as well as new challenges in, Networks and Systems. The series contains proceedings and edited volumes in systems and networks, spanning the areas of Cyber-Physical Systems, Autonomous Systems, Sensor Networks, Control Systems, Energy Systems, Automotive Systems, Biological Systems, Vehicular Networking and Connected Vehicles, Aerospace Systems, Automation, Manufacturing, Smart Grids, Nonlinear Systems, Power Systems, Robotics, Social Systems, Economic Systems and other. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution and exposure which enable both a wide and rapid dissemination of research output. The series covers the theory, applications, and perspectives on the state of the art and future developments relevant to systems and networks, decision making, control, complex processes and related areas, as embedded in the fields of interdisciplinary and applied sciences, engineering, computer science, physics, economics, social, and life sciences, as well as the paradigms and methodologies behind them. Indexed by SCOPUS, INSPEC, WTI Frankfurt eG, zbMATH, SCImago. All books published in the series are submitted for consideration in Web of Science. For proposals from Asia please contact Aninda Bose ([email protected]).
Jyoti Choudrie · Parikshit N. Mahalle · Thinagaran Perumal · Amit Joshi Editors
ICT with Intelligent Applications ICTIS 2023, Volume 1
Editors Jyoti Choudrie Hertfordshire Business School University of Hertfordshire Hatfield, Hertfordshire, UK Thinagaran Perumal Department of Computer Science, Faculty of Computer Science and Information Technology Universiti Putra Malaysia Serdang, Selangor, Malaysia
Parikshit N. Mahalle Department of AI and DS Vishwakarma Institute of Information Technology Pune, India Amit Joshi Global Knowledge Research Foundation Ahmedabad, Gujarat, India
ISSN 2367-3370 ISSN 2367-3389 (electronic) Lecture Notes in Networks and Systems ISBN 978-981-99-3757-8 ISBN 978-981-99-3758-5 (eBook) https://doi.org/10.1007/978-981-99-3758-5 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
Preface
The Seventh International Conference on Information and Communication Technology for Intelligent Systems (ICTIS 2023) targets state-of-the-art as well as emerging topics pertaining to information and communication technologies (ICTs) and effective strategies for their implementation in engineering and intelligent applications. The conference aimed to attract a large number of high-quality submissions; stimulate cutting-edge research discussions among academic pioneering researchers, scientists, industrial engineers and students from all around the world; provide a forum for researchers to propose new technologies, share their experiences and discuss future solutions for designing ICT infrastructure; provide a common platform for academic pioneering researchers, scientists, engineers and students to share their views and achievements; enrich technocrats and academicians by presenting their innovative and constructive ideas; and focus on innovative issues at the international level by bringing together experts from different countries. The conference was held during April 27–28, 2023: physically on April 27, 2023, at Hotel Pride Plaza, Bodakdev, Ahmedabad, and digitally on April 28, 2023, on the Zoom platform. It was organized and managed by Global Knowledge Research Foundation and G R Scholastic LLP in collaboration with Knowledge Chamber of Commerce and Industry. Research submissions in various advanced technology areas were received, and after a rigorous peer-review process with the help of program committee members and external reviewers, 160 papers were accepted with an acceptance rate of 17%. All 160 papers of the conference are accommodated in 3 volumes, and the papers in these volumes comprise authors from 5 countries. This event's success was possible only with the help and support of our team and organizations. With immense pleasure and honor, we would like to express our sincere thanks to the authors for their remarkable contributions, all the technical program committee members for their time and expertise in reviewing the papers within a very tight schedule, and the publisher Springer for their professional help. We are overwhelmed by our distinguished scholars and appreciate them for accepting our invitation to join us through the virtual platform and deliver keynote speeches, and the technical session chairs for analyzing the research work presented by
the researchers. Most importantly, we are also grateful to our local support team for their hard work for the conference.

London, UK    Jyoti Choudrie
Pune, India    Parikshit N. Mahalle
Seri Kembangan, Malaysia    Thinagaran Perumal
Ahmedabad, India    Amit Joshi
Contents
Monitoring Algorithm for Datafication and Information Control for Data Optimization . . . . 1
Mita Mehta

Understanding Customer Perception Regarding Branded Fuel ‘XP-95’ in IOCL Retail Outlets Using Business Intelligence (BI) by the Help of NVIVO and SPSS Software . . . . 9
Arunangshu Giri, Dipanwita Chakrabarty, Adrinil Santra, and Soumya Kanti Dhara

Investigation of Aadhaar Card Enrolment in Government Schools Using R . . . . 17
Sameer Ahsan, V Sakthivel, T. Subbulakshmi, and Vallidevi Krishnamurthy

Automated Spurious Product Review Auditing System . . . . 29
R. Sainivedhana, K. Koushik Khanna, Ch. Mahesh, and N. Sabiyath Fatima

A Comparison of the Key Size and Security Level of the ECC and RSA Algorithms with a Focus on Cloud/Fog Computing . . . . 43
Dhaval Patel, Bimal Patel, Jalpesh Vasa, and Mikin Patel

Data Prevention Protocol for Cloud Computing Security Using Blockchain Technology . . . . 55
Priyanka Mishra and R. Ganesan

Intelligence Monitoring of Home Security . . . . 69
Hampika Gorla, Mary Swarna Latha Gade, Babitha Lokula, V. Bindusree, and Pranay Kumar Padegela

TB Bacteria and WBC Detection from ZN-Stained Sputum Smear Images Using Object Detection Model . . . . 77
V. Shwetha
AuraCodes: Barcodes in the Aural Band . . . . 87
Soorya Annadurai

A Scrutiny and Investigation on Student Response System to Assess the Rating on Profuse Dataset—An Aerial View . . . . 95
Shweta Dhareshwar and M. R. Dileep

Analysis of Hospital Patient Data Using Computational Models . . . . 107
Impana Anand, M. Madhura, M. Nikita, V. S. Varshitha, Trupthi Rao, and Ashwini Kodipalli

A Research on the Impact of Big Data Analytics on the Telecommunications Sector . . . . 121
Ashok Kumar, Nancy Arya, and Pramod Kumar Sharma

Face Mask Isolation Canister Design for Healthcare Sector Towards Preventive Approach . . . . 129
Sushant Dalvi, Aditya Deore, Rishi Mutagekar, Sushant Kadam, Avinash Somatkar, and Parikshit N. Mahalle

Modeling the Impact of Access to Finance on MSMEs’ Contribution Towards Exports of the Economy . . . . 139
Md. Motahar Hossain and Nitin Pathak

Privacy Challenges and Solutions in Implementing Searchable Encryption for Cloud Storage . . . . 149
Akshat Patel, Priteshkumar Prajapati, Parth Shah, Vivek Sangani, Madhav Ajwalia, and Trushit Upadhyaya

Multi-objective Optimization with Practical Constraints Using AALOA . . . . 165
Balasubbareddy Mallala, P. Venkata Prasad, and Kowstubha Palle

An Implementation of Lightweight Cryptographic Algorithm for IOT Healthcare System . . . . 179
Ahina George, Divya James, and K. S. Lakshmi

Home Appliances Automation Using IPv6 Transmission Over BLE . . . . 193
Lalit Kumar and Pradeep Kumar

IoT Based Collision Avoidance System with the Case Study Using IR Sensor for Vehicles . . . . 205
Manasvi, Neha Garg, and Siddhant Thapliyal

Airline Ticket Price Forecasting Using Time Series Model . . . . 215
A. Selvi, B. Sinegalatha, S. Trinaya, and K. K. Varshaa

AI Powered Authentication for Smart Home Security—A Survey . . . . 227
P. Priya, B. Gopinath, M. Mohamed Ashif, and H. S. Yadeshwaran
WSRR: Weighted Rank-Relevance Sampling for Dense Text Retrieval . . . . 239
Kailash Hambarde and Hugo Proença

Tweet Based Sentiment Analysis for Stock Price Prediction . . . . 249
K. Abinanda Vrishnaa and N. Sabiyath Fatima

A Study on the Stock Market Trend Predictions . . . . 261
Rosemol Thomas, Hiren Joshi, and Hardik Joshi

Optimization of Partial Products in Modified Booth Multiplier . . . . 267
Sharwari Bhosale, Ketan J. Raut, Minal Deshmukh, Abhijit V. Chitre, and Vaibhav Pavnaskar

Comparative Study on Text-to-Image Synthesis Using GANs . . . . 279
Pemmadi Leela Venkat, Veerni Venkata Sasidhar, Athota Naga Sandeep, Anudeep Peddi, M. V. P. Chandra Sekhara Rao, and Lakshmikanth Paleti

Online Hate Speech Identification Using Fine-tuned ALBERT . . . . 289
Sneha Chinivar, M. S. Roopa, J. S. Arunalatha, and K. R. Venugopal

A Real Time Design Pattern Using MapReduce Strategy Visit for Distributed Service Oriented Strategy . . . . 301
K. Selvamani, S. Kanimozhi, and H. Riasudheen

Deep Learning Based Gesture Recognition System Using EMG Spectrogram Data . . . . 309
Praahas Amin and Airani Mohammad Khan

Optimization in Fuzzy Clustering: A Review . . . . 321
Kanika Bhalla and Anjana Gosain

Explainable AI for Intrusion Prevention: A Review of Techniques and Applications . . . . 339
Pankaj R. Chandre, Viresh Vanarote, Rajkumar Patil, Parikshit N. Mahalle, Gitanjali R. Shinde, Madhukar Nimbalkar, and Janki Barot

Multi-level Association Based 3D Multiple-Object Tracking Framework for Self-driving Cars . . . . 351
Divyajyoti Morabad, Prabha Nissimagoudar, H. M. Gireesha, and Nalini C. Iyer

Automatic Extraction of Software Requirements Using Machine Learning . . . . 361
Siddharth Apte, Yash Honrao, Rohan Shinde, Pratvina Talele, and Rashmi Phalnikar

Real-Time Pedestrian Detection . . . . 371
Sagar Kumbari and Basawaraj
Android Based Mobile Application for Hockey Tournament Management System . . . . 381
K. Selvamani, H. Riasudheen, S. Kanimozhi, and S. Murugavalli

Hybrid Intrusion Detection System Using Autoencoders and Snort . . . . 391
Yudhir Gala, Nisha Vanjari, Dharm Doshi, and Inshiya Radhanpurwala

A Hybrid Image Edge Detection Approach for the Images of Tocklai Vegetative Tea Clone Series TV1, TV9, TV10 of Assam . . . . 403
Jasmine Ara Begum and Ranjan Sarmah

Gesture Controlled Iterative Radio Controlled Car for Rehabilitation . . . . 417
Nabamita Ghosh, Prithwiraj Roy, Dibyarup Mukherjee, Madhupi Karmakar, Suprava Patnaik, and Sarita Nanda

Using Ensemble of Different Classifiers for Defect Prediction . . . . 427
Ruchika Malhotra, Ankit Kumar, and Muskan Gupta

Designing an ABM that Can Be Used to Predict the Impact of the Number Portability Regulation in Namibia Using Netlogo . . . . 435
Henok Immanuel, Attlee Gamundani, and Edward Nepolo

Performance Analysis of Deep Neural Network for Intrusion Detection Systems . . . . 445
Harshit Jha, Maulik Khanna, Himanshu Jhawar, and Rajni Jindal

Classification of Choroidal Neovascularization (CNV) from Optical Coherence Tomography (OCT) Images Using Efficient Fine-Tuned ResNet and DenseNet Deep Learning Models . . . . 457
Megha Goriya, Zeel Amrutiya, Ayush Ghadiya, Jalpesh Vasa, and Bimal Patel

Early Heart Disease Prediction Using Support Vector Machine . . . . 471
T. Bala Krishna, Neelapala Vimala, Pakala Vinay, Nagudu Siddhardha, and Padilam Murali Manohar

An Analysis of Task-Based Algorithms for Cloud Computing . . . . 481
Murli Patel, Abhinav Sharma, Priyank Vaidya, and Nishant Doshi

Performance Analysis of SOC Estimation Approaches for Lithium-Ion Batteries . . . . 493
R. Shanmugasundaram, C. Ganesh, B. Adhavan, M. Mohamed Iqbal, B. Gunapriya, and P. Tamilselvi

An Analysis of Resource-Oriented Algorithms for Cloud Computing . . . . 505
Abhinav Sharma, Priyank Vaidya, Murli Patel, and Nishant Doshi
Study of Architectural Designs for Underwater Wireless Sensors Network in View of Underwater Applications . . . . 517
Pooja A. Shelar, Parikshit N. Mahalle, Gitanjali R. Shinde, and Janki Barot

IoT Based Single Axis Sun Tracking Solar Panel . . . . 533
D. Priyanka, M. Yasaswi, M. Akhila, M. Leela Rani, and K. Pradeep Kumar

Blockchain Protocols: Transforming the Web We Know . . . . 545
Hitesh Jangid and Priyanka Meel

An Internet of Things-Based Smart Asthma Inhaler Integrated with Mobile Application . . . . 559
P. Srivani, A. Durga Bhavani, and R. Shankar

Machine Learning Prediction Models to Predict Long-Term Survival After Heart and Liver Transplantation . . . . 567
Vandana Jagtap, Monalisa Bhinge, Neha V. Dwivedi, Nanditha R. Nambiar, Snehal S. Kankariya, Toshavi Ghatode, Rashmita Raut, and Prajyot Jagtap

Using Blockchain Smart Contracts to Regulate Forest Policies . . . . 579
Aaryan Dalal and Reetu Jain

Curbing Anomalous Transactions Using Cost-Sensitive Learning . . . . 589
S. Aswathy and V. Viji Rajendran

Comparative Analysis of Machine Learning Algorithms on Mental Health Dataset . . . . 599
Lakshmi Prasanna and Shashi Mehrotra

Comparative Analysis of Machine Learning Algorithms for Predicting Mobile Price . . . . 607
D. N. V. S. Vamsi and Shashi Mehrotra

Investigating Library Usage Patterns . . . . 617
Inutu Kawina and Shashi Mehrotra

Advanced Application Development in Agriculture—Issues and Challenges . . . . 625
Purnima Gandhi

Naïve Bayesian Classifier Committee (NBCC) Using Decision Tree (C4.5) for Feature Selection . . . . 637
Manmohan Singh, Monika Vyas, Kamiya Pithode, Nikhat Raja Khan, and Joanna Rosak-Szyrocka
Editors and Contributors
About the Editors Prof. Jyoti Choudrie is Professor of Information Systems in Hertfordshire Business School, Management, Leadership and Organisation (MLO) department where she previously led the Systems Management Research Unit (SyMRU) and currently is Convenor for the Global Work, Economy and Digital Inclusion group. She is also Editor-in-Chief for Information, Technology and People journal (An Association of Business School 3 grade journal). In terms of research, Prof. Choudrie is known as Broadband and Digital Inclusion Expert in University of Hertfordshire, which was also the case in Brunel University. To ensure her research is widely disseminated, Prof. Choudrie co-edited a Routledge research monograph with Prof. C. Middleton: The Management of Broadband Technology Innovation and completed a research monograph published by Routledge Publishing and focused on social inclusion along with Dr. Sherah Kurnia and Dr. Panayiota Tsatsou titled: Social Inclusion and Usability of ICT-Enabled Services. She also works with Age (UK) Hertfordshire, Hertfordshire County Council, and Southend YMCA where she is undertaking a Knowledge Transfer Partnership project investigating the role of Online Social Networks (OSN). Finally, she is focused on artificial intelligence (AI) applications in organizations and society alike, which accounts for her interests in OSN, machine and deep learning. She has been Keynote Speaker for the International Congress of Information and Communication Technologies, Digital Britain conferences and supervises doctoral students drawn from around the globe. Presently, she is seeking 3–4 doctoral students who would want to research AI in society and organizations alike. Dr. Parikshit N. Mahalle is Senior Member IEEE and is Professor, Head of Department of Artificial Intelligence and Data Science at Vishwakarma Institute of Information Technology, Pune, India. He completed his Ph.D. from Aalborg University, Denmark, and continued as Postdoc Researcher at CMI, Copenhagen, Denmark. He has 23+ years of teaching and research experience. He is Member of the Board of
Studies in Computer Engineering and Ex-chairman Information Technology, Savitribai Phule Pune University and various universities and autonomous colleges across India. He has 12 patents, 200+ research publications (Google Scholar citations-2450 plus, H index-22, and Scopus Citations are 1300 plus with H index-16, Web of Science H index-10) and authored/edited 50+ books with Springer, CRC Press, Cambridge University Press, etc. He is Editor-in-Chief for IGI Global—International Journal of Rough Sets and Data Analysis, Inter-science International Journal of Grid and Utility Computing, Member—Editorial Review Board for IGI Global—International Journal of Ambient Computing and Intelligence, and Reviewer for various journals and conferences of the repute. His research interests are machine learning, data science, algorithms, Internet of Things, and identity management and security. He is guiding 8 Ph.D. students in the area of IoT and machine learning, and recently, 5 students have successfully defended their Ph.D. under his supervision. He is also Recipient of “Best Faculty Award” by Sinhgad Institutes and Cognizant Technologies Solutions. He has delivered 200 plus lectures at national and international level. Dr. Thinagaran Perumal received his B.Eng. in Computer and Communication System Engineering from Universiti Putra Malaysia in 2003. He completed his M.Sc. and Ph.D. Smart Technologies and Robotics from the same university in 2006 and 2011, respectively. Currently, he is appointed as Senior Lecturer at the Department of Computer Science, Faculty of Computer Science and Information Technology, Universiti Putra Malaysia. He is also currently appointed as Head of Cyber-Physical Systems in the university and also been elected as Chair of IEEE Consumer Electronics Society Malaysia Chapter. Dr. Thinagaran Perumal is Recipient of 2014 IEEE Early Career Award from IEEE Consumer Electronics Society for his pioneering contribution in the field of consumer electronics. His research interests are toward interoperability aspects of smart homes and Internet of Things (IoT), wearable computing, and cyber-physical systems. His recent research activities include proactive architecture for IoT systems; development of the cognitive IoT frameworks for smart homes and wearable devices for rehabilitation purposes. He is Active Member of Electronics Society and its Future Directions Committee on Internet of Things. He has been invited to give several keynote lectures and plenary talk on Internet of Things in various institutions and organizations internationally. Dr. Amit Joshi is currently Director of Global Knowledge Research Foundation, also Entrepreneur and Researcher who has completed his graduation (B.Tech.) in Information Technology and M.Tech. in Computer Science and Engineering and completed his research in the areas of cloud computing and cryptography in medical imaging with a focus on analysis of the current government strategies and world forums needs in different sectors on security purposes. He has an experience of around 10 years in academic and industry in prestigious organizations. He is Active Member of ACM, IEEE, CSI, AMIE, IACSIT-Singapore, IDES, ACEEE, NPA, and many other professional societies. Further currently, he is also International Chair of InterYIT at International Federation of Information Processing (IFIP, Austria), He has presented and published more than 50 papers in national and international
journals/conferences of IEEE and ACM. He has also edited more than 20 books which are published by Springer, ACM, and other reputed publishers. He has also organized more than 40 national and international conferences and workshops through ACM, Springer, and IEEE across 5 countries including India, UK, Thailand, and Europe.
Contributors K. Abinanda Vrishnaa Department of Computer Science and Engineering, B.S.A. Crescent Institute of Science and Technology, Chennai, Tamil Nadu, India B. Adhavan PSG Institute of Technology and Applied Research, Coimbatore, India Sameer Ahsan School of Computer Science and Engineering, Vellore Institute of Technology, Chennai, Tamil Nadu, India Madhav Ajwalia Chandubhai S. Patel Institute of Technology (CSPIT), Faculty of Technology and Engineering (FTE), Charotar University of Science and Technology (CHARUSAT), Changa, Gujarat, India M. Akhila Seshadri Rao Gudlavalleru Engineering College, Gudlavalleru, Andhra Pradesh, India Praahas Amin Department of Electronics, Mangalore University, Mangalore, KA, India Zeel Amrutiya Smt. K. D. Patel Department of Information Technology (KDPIT), Faculty of Technology and Engineering (FTE), Chandubhai S. Patel Institute of Technology (CSPIT), Charotar University of Science and Technology (CHARUSAT), Changa, Gujarat, India Impana Anand Department of Artificial Intelligence and Data Science, Global Academy of Technology, Bangalore, India Soorya Annadurai Microsoft Corporation, Redmond, WA, USA Siddharth Apte School of Computer Engineering and Technology, Dr. Vishwanath Karad MIT World Peace University, Pune, India J. S. Arunalatha Department of CSE UVCE, Bangalore University, Bengaluru, India Nancy Arya Department of Computer Science and Engineering, Shree Guru Gobind Singh Tricentenary University, Gurgaon, India S. Aswathy Computer Science and Engineering, NSS College of Engineering, Palakkad, India T. Bala Krishna Gudlavalleru Engineering College, Gudlavalleru, India Janki Barot Silver Oak University, Ahmedabad, India
Basawaraj KLE Technological University, Hubballi, Karnataka, India Jasmine Ara Begum Assam Rajiv Gandhi University of Cooperative Management, Sivasagar, India Kanika Bhalla USICT, Guru Gobind Singh Indraprastha University, Delhi, India A. Durga Bhavani BMS Institute of Technology and Management, Bangalore, India Monalisa Bhinge Dr. Vishwanath Karad, MIT World Peace University, Pune, India Sharwari Bhosale Department of Electronics and Telecommunication Engineering, Vishwakarma Institute of Information Technology, Pune, Maharashtra, India V. Bindusree Department of Electronics and Communication Engineering, Institute of Aeronautical Engineering, Hyderabad, Telangana, India Dipanwita Chakrabarty Haldia Institute of Technology, Haldia, West Bengal, India M. V. P. Chandra Sekhara Rao Department of CSBS, R.V.R & J.C College of Engineering, Guntur, India Pankaj R. Chandre Department of Computer Science and Engineering, MIT School of Computing, MIT ADT University, Loni Kalbhor, India Sneha Chinivar Department of CSE UVCE, Bangalore University, Bengaluru, India Abhijit V. Chitre Department of Electronics and Telecommunication Engineering, Vishwakarma Institute of Information Technology, Pune, Maharashtra, India Aaryan Dalal Jakarta Intercultural School, South Jakarta, Indonesia Sushant Dalvi Vishwakarma Institute of Information Technology, Pune, India Aditya Deore Vishwakarma Institute of Information Technology, Pune, India Minal Deshmukh Department of Electronics and Telecommunication Engineering, Vishwakarma Institute of Information Technology, Pune, Maharashtra, India Soumya Kanti Dhara Haldia Institute of Technology, Haldia, West Bengal, India Shweta Dhareshwar Department of Master of Computer Applications, Nitte Meenakshi Institute of Technology, Yelahanka, Bengaluru, Karnataka, India M. R. Dileep Department of Master of Computer Applications, Nitte Meenakshi Institute of Technology, Yelahanka, Bengaluru, Karnataka, India Dharm Doshi K. J. Somaiya Institute of Engineering and Information Technology, Mumbai, India
Nishant Doshi Department of Computer Science and Engineering, Pandit Deendayal Energy University, Gandhinagar, Gujarat, India Neha V. Dwivedi Dr. Vishwanath Karad, MIT World Peace University, Pune, India N. Sabiyath Fatima Department of Computer Science and Engineering, B S Abdur Rahman Crescent Institute of Science and Technology, Chennai, India Mary Swarna Latha Gade Department of Electronics and Communication Engineering, Institute of Aeronautical Engineering, Hyderabad, Telangana, India Yudhir Gala K. J. Somaiya Institute of Engineering and Information Technology, Mumbai, India Attlee Gamundani Namibia University of Science and Technology, Windhoek, Namibia Purnima Gandhi Institute of Technology, Nirma University, Ahmedabad, India R. Ganesan Department of Computer Science Engineering, Vellore Institute of Technology, Chennai, Tamil Nadu, India C. Ganesh Rajalakshmi Institute of Technology, Chennai, India Neha Garg Department of Computer Science and Engineering, Graphic Era Deemed to be University, Dehradun, India Ahina George Department of IT, Rajagiri School of Engineering and Technology, Kakkanad, Kerala, India Ayush Ghadiya Smt. K. D. Patel Department of Information Technology (KDPIT), Faculty of Technology and Engineering (FTE), Chandubhai S. Patel Institute of Technology (CSPIT), Charotar University of Science and Technology (CHARUSAT), Changa, Gujarat, India Toshavi Ghatode Dr. Vishwanath Karad, MIT World Peace University, Pune, India Nabamita Ghosh Kalinga Institute of Industrial Technology, Bhubaneswar, India H. M. Gireesha KLE Technological University, Hubballi, Karnataka, India Arunangshu Giri Haldia Institute of Technology, Haldia, West Bengal, India B. Gopinath Department of Computer Science and Engineering, M. Kumarasamy College of Engineering, Karur, Tamilnadu, India Megha Goriya Smt. K. D. Patel Department of Information Technology (KDPIT), Faculty of Technology and Engineering (FTE), Chandubhai S. Patel Institute of Technology (CSPIT), Charotar University of Science and Technology (CHARUSAT), Changa, Gujarat, India Hampika Gorla Department of Electronics and Communication Engineering, Institute of Aeronautical Engineering, Hyderabad, Telangana, India
Anjana Gosain USICT, Guru Gobind Singh Indraprastha University, Delhi, India B. Gunapriya New Horizon College of Engineering, Bangalore, India Muskan Gupta Delhi Technological University, Delhi, India Kailash Hambarde IT: Instituto de Telecomunicações, University of Beira Interior, Covilha, Portugal Yash Honrao School of Computer Engineering and Technology, Dr. Vishwanath Karad MIT World Peace University, Pune, India Henok Immanuel Namibia University of Science and Technology, Windhoek, Namibia Nalini C. Iyer KLE Technological University, Hubballi, Karnataka, India Prajyot Jagtap Dr. Vishwanath Karad, MIT World Peace University, Pune, India Vandana Jagtap Dr. Vishwanath Karad, MIT World Peace University, Pune, India Reetu Jain On My Own Technology Pvt Ltd, Mumbai, India Divya James Department of IT, Rajagiri School of Engineering and Technology, Kakkanad, Kerala, India Hitesh Jangid Department of Information Technology, Delhi Technological University, New Delhi, India Harshit Jha Delhi Technological University, New Delhi, 110042, India Himanshu Jhawar Delhi Technological University, New Delhi, 110042, India Rajni Jindal Delhi Technological University, New Delhi, 110042, India Hardik Joshi Gujarat University, Ahmedabad, Gujarat, India Hiren Joshi Gujarat University, Ahmedabad, Gujarat, India Sushant Kadam Vishwakarma Institute of Information Technology, Pune, India S. Kanimozhi Panimalar Engineering College, Chennai, India Snehal S. Kankariya Dr. Vishwanath Karad, MIT World Peace University, Pune, India Madhupi Karmakar Kalinga Institute of Industrial Technology, Bhubaneswar, India Inutu Kawina Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Guntur, India Airani Mohammad Khan Department of Electronics, Mangalore University, Mangalore, KA, India
Nikhat Raja Khan Computer Science and Engineering Department, IES College of Technology, Bhopal, MP, India K. Koushik Khanna Department of Computer Science and Engineering, B S Abdur Rahman Crescent Institute of Science and Technology, Chennai, India Maulik Khanna Delhi Technological University, New Delhi, 110042, India Ashwini Kodipalli Department of Artificial Intelligence and Data Science, Global Academy of Technology, Bangalore, India Vallidevi Krishnamurthy School of Computer Science and Engineering, Vellore Institute of Technology, Chennai, Tamil Nadu, India Ankit Kumar Delhi Technological University, Delhi, India Ashok Kumar Department of Computer Science and Engineering, Shree Guru Gobind Singh Tricentenary University, Gurgaon, India K. Pradeep Kumar Seshadri Rao Gudlavalleru Engineering College, Gudlavalleru, Andhra Pradesh, India
Lalit Kumar J.C.Bose University of Science and Technology, YMCA, Faridabad, Haryana, India Pradeep Kumar J.C.Bose University of Science and Technology, YMCA, Faridabad, Haryana, India Sagar Kumbari KLE Technological University, Hubballi, Karnataka, India Lakshmi Prasanna Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur, Andhra Pradesh, India K. S. Lakshmi Department of IT, Rajagiri School of Engineering and Technology, Kakkanad, Kerala, India Babitha Lokula Department of Electronics and Communication Engineering, Institute of Aeronautical Engineering, Hyderabad, Telangana, India M. Madhura Department of Artificial Intelligence and Data Science, Global Academy of Technology, Bangalore, India Parikshit N. Mahalle Department of Artificial Intelligence and Data Science, Vishwakarma Institute of Information Technology, Savitribai Phule Pune University, Kondhawa, Pune, India Ch. Mahesh Department of Computer Science and Engineering, B S Abdur Rahman Crescent Institute of Science and Technology, Chennai, India Ruchika Malhotra Delhi Technological University, Delhi, India Balasubbareddy Mallala Chaitanya Bharathi Institute of Technology, Hyderabad, India
Manasvi Department of Computer Science and Engineering, Graphic Era Deemed to be University, Dehradun, India Padilam Murali Manohar Gudlavalleru Engineering College, Gudlavalleru, India Priyanka Meel Department of Information Technology, Delhi Technological University, New Delhi, India Shashi Mehrotra Faculty of Engineering, College of Computing Science and Technology, Teerthanker Mahaver University, Moradabad, India; College of Computer Science and Engineering, Teerthanker Mahaver University, Moradabad, India Mita Mehta Symbiosis Institute of Health Sciences, Symbiosis International (Deemed University), Pune, India Priyanka Mishra Department of Computer Science Engineering, Vellore Institute of Technology, Chennai, Tamil Nadu, India M. Mohamed Ashif Department of Computer Science and Engineering, M. Kumarasamy College of Engineering, Karur, Tamilnadu, India M. Mohamed Iqbal PSG Institute of Technology and Applied Research, Coimbatore, India Divyajyoti Morabad KLE Technological University, Hubballi, Karnataka, India Md. Motahar Hossain Department of Business Management, University School of Business, Chandigarh University, Mohali, Punjab, India Dibyarup Mukherjee Kalinga Institute of Industrial Technology, Bhubaneswar, India S. Murugavalli Panimalar Engineering College, Chennai, India Rishi Mutagekar Vishwakarma Institute of Information Technology, Pune, India Nanditha R. Nambiar Dr. Vishwanath Karad, MIT World Peace University, Pune, India Sarita Nanda Kalinga Institute of Industrial Technology, Bhubaneswar, India Edward Nepolo Namibia University of Science and Technology, Windhoek, Namibia M. Nikita Department of Artificial Intelligence and Data Science, Global Academy of Technology, Bangalore, India Madhukar Nimbalkar Department of Computer Science and Engineering, MIT School of Computing, MIT ADT University, Loni Kalbhor, India Prabha Nissimagoudar KLE Technological University, Hubballi, Karnataka, India
Pranay Kumar Padegela Department of Electronics and Communication Engineering, Institute of Aeronautical Engineering, Hyderabad, Telangana, India Lakshmikanth Paleti Department of CSBS, R.V.R & J.C College of Engineering, Guntur, India Kowstubha Palle Chaitanya Bharathi Institute of Technology, Hyderabad, India Akshat Patel Chandubhai S. Patel Institute of Technology (CSPIT), Faculty of Technology and Engineering (FTE), Charotar University of Science and Technology (CHARUSAT), Changa, Gujarat, India Bimal Patel Smt. K. D. Patel Department of Information Technology (KDPIT), Faculty of Technology and Engineering (FTE), Chandubhai S. Patel Institute of Technology (CSPIT), Charotar University of Science and Technology (CHARUSAT), Changa, Gujarat, India Dhaval Patel Smt. K D Patel Department of Information Technology (KDPIT), Chandubhai S. Patel Institute of Technology (CSPIT), Faculty of Technology and Engineering (FTE), Charotar University of Science and Technology (CHARUSAT), Changa, Gujarat, India Mikin Patel Smt. K D Patel Department of Information Technology (KDPIT), Chandubhai S. Patel Institute of Technology (CSPIT), Faculty of Technology and Engineering (FTE), Charotar University of Science and Technology (CHARUSAT), Changa, Gujarat, India Murli Patel Department of Computer Science and Engineering, Pandit Deendayal Energy University, Gandhinagar, Gujarat, India Nitin Pathak Department of Commerce, University School of Business, Chandigarh University, Mohali, Punjab, India Rajkumar Patil Department of Information Technology, MIT School of Computing, MIT ADT University, Loni Kalbhor, India Suprava Patnaik Kalinga Institute of Industrial Technology, Bhubaneswar, India Vaibhav Pavnaskar Microchip, Cork, Ireland Anudeep Peddi Department of CSBS, R.V.R & J.C College of Engineering, Guntur, India Rashmi Phalnikar School of Computer Engineering and Technology, Dr. Vishwanath Karad MIT World Peace University, Pune, India Kamiya Pithode Computer Science and Engineering Department, IES College of Technology, Bhopal, MP, India Priteshkumar Prajapati Chandubhai S. Patel Institute of Technology (CSPIT), Faculty of Technology and Engineering (FTE), Charotar University of Science and Technology (CHARUSAT), Changa, Gujarat, India
P. Priya Department of Computer Science and Engineering, M. Kumarasamy College of Engineering, Karur, Tamilnadu, India D. Priyanka Seshadri Rao Gudlavalleru Engineering College, Gudlavalleru, Andhra Pradesh, India Hugo Proença IT: Instituto de Telecomunicações, University of Beira Interior, Covilha, Portugal Inshiya Radhanpurwala K. J. Somaiya Institute of Engineering and Information Technology, Mumbai, India M. Leela Rani Seshadri Rao Gudlavalleru Engineering College, Gudlavalleru, Andhra Pradesh, India Trupthi Rao Department of Artificial Intelligence and Data Science, Global Academy of Technology, Bangalore, India Ketan J. Raut Department of Electronics and Telecommunication Engineering, Vishwakarma Institute of Information Technology, Pune, Maharashtra, India Rashmita Raut Dr. Vishwanath Karad, MIT World Peace University, Pune, India H. Riasudheen CEG Campus, Anna University, Chennai, India M. S. Roopa Department of CSE, Dayananda Sagar College of Engineering, Bengaluru, India Joanna Rosak-Szyrocka Department of Production Engineering and Safety, Faculty of Management, Czestochowa University of Technology, Czestochowa, Poland Prithwiraj Roy Kalinga Institute of Industrial Technology, Bhubaneswar, India N. Sabiyath Fatima Department of Computer Science and Engineering, B.S.A. Crescent Institute of Science and Technology, Chennai, Tamil Nadu, India R. Sainivedhana Department of Computer Science and Engineering, B S Abdur Rahman Crescent Institute of Science and Technology, Chennai, India V Sakthivel School of Computer Science and Engineering, Vellore Institute of Technology, Chennai, Tamil Nadu, India Athota Naga Sandeep Department of CSBS, R.V.R & J.C College of Engineering, Guntur, India Vivek Sangani Chandubhai S. Patel Institute of Technology (CSPIT), Faculty of Technology and Engineering (FTE), Charotar University of Science and Technology (CHARUSAT), Changa, Gujarat, India Adrinil Santra Haldia Institute of Technology, Haldia, West Bengal, India Ranjan Sarmah Assam Rajiv Gandhi University of Cooperative Management, Sivasagar, India
Veerni Venkata Sasidhar Department of CSBS, R.V.R & J.C College of Engineering, Guntur, India K. Selvamani CEG Campus, Anna University, Chennai, India A. Selvi Department of Computer Science and Engineering, M. Kumarasamy College of Engineering, Karur, Tamilnadu, India Parth Shah Chandubhai S. Patel Institute of Technology (CSPIT), Faculty of Technology and Engineering (FTE), Charotar University of Science and Technology (CHARUSAT), Changa, Gujarat, India R. Shankar BMS Institute of Technology and Management, Bangalore, India R. Shanmugasundaram Sri Ramakrishna Engineering College, Coimbatore, India Abhinav Sharma Department of Computer Science and Engineering, Pandit Deendayal Energy University, Gandhinagar, Gujarat, India Pramod Kumar Sharma Department of Computer Science and Engineering, Shree Guru Gobind Singh Tricentenary University, Gurgaon, India Pooja A. Shelar Smt. Kashibai Navale College of Engineering, Savitribai Phule Pune University, Pune, India Gitanjali R. Shinde Department of Computer Science and Engineering, Vishwakarma Institute of Information Technology, Savitribai Phule Pune University, Kondhawa, Pune, India Rohan Shinde School of Computer Engineering and Technology, Dr. Vishwanath Karad MIT World Peace University, Pune, India V. Shwetha Electrical and Electronics Department, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka, India Nagudu Siddhardha Gudlavalleru Engineering College, Gudlavalleru, India B. Sinegalatha Department of Computer Science and Engineering, Kumarasamy College of Engineering, Karur, Tamilnadu, India
Manmohan Singh Computer Science and Engineering Department, IES College of Technology, Bhopal, MP, India Avinash Somatkar Vishwakarma Institute of Information Technology, Pune, India P. Srivani BMS Institute of Technology and Management, Bangalore, India T. Subbulakshmi School of Computer Science and Engineering, Vellore Institute of Technology, Chennai, Tamil Nadu, India Pratvina Talele School of Computer Engineering and Technology, Dr. Vishwanath Karad MIT World Peace University, Pune, India P. Tamilselvi Sri Eshwar College of Engineering, Coimbatore, India
Siddhant Thapliyal Department of Computer Science and Engineering, Graphic Era Deemed to be University, Dehradun, India Rosemol Thomas Gujarat University, Ahmedabad, Gujarat, India S. Trinaya Department of Computer Science and Engineering, M. Kumarasamy College of Engineering, Karur, Tamilnadu, India Trushit Upadhyaya Chandubhai S. Patel Institute of Technology (CSPIT), Faculty of Technology and Engineering (FTE), Charotar University of Science and Technology (CHARUSAT), Changa, Gujarat, India Priyank Vaidya Department of Computer Science and Engineering, Pandit Deendayal Energy University, Gandhinagar, Gujarat, India D. N. V. S. Vamsi Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur, Andhra Pradesh, India Viresh Vanarote Department of Computer Science and Engineering, MIT School of Computing, MIT ADT University, Loni Kalbhor, India Nisha Vanjari K. J. Somaiya Institute of Engineering and Information Technology, Mumbai, India K. K. Varshaa Department of Computer Science and Engineering, Kumarasamy College of Engineering, Karur, Tamilnadu, India
V. S. Varshitha Department of Artificial Intelligence and Data Science, Global Academy of Technology, Bangalore, India Jalpesh Vasa Smt. K. D. Patel Department of Information Technology (KDPIT), Faculty of Technology and Engineering (FTE), Chandubhai S. Patel Institute of Technology (CSPIT), Charotar University of Science and Technology (CHARUSAT), Changa, Gujarat, India Pemmadi Leela Venkat Department of CSBS, R.V.R & J.C College of Engineering, Guntur, India P. Venkata Prasad Chaitanya Bharathi Institute of Technology, Hyderabad, India K. R. Venugopal Department of CSE UVCE, Bangalore University, Bengaluru, India V. Viji Rajendran Computer Science and Engineering, NSS College of Engineering, Palakkad, India Neelapala Vimala Gudlavalleru Engineering College, Gudlavalleru, India Pakala Vinay Gudlavalleru Engineering College, Gudlavalleru, India Monika Vyas Civil Engineering Department, IES College of Technology, Bhopal, MP, India
H. S. Yadeshwaran Department of Computer Science and Engineering, M. Kumarasamy College of Engineering, Karur, Tamilnadu, India M. Yasaswi Seshadri Rao Gudlavalleru Engineering College, Gudlavalleru, Andhra Pradesh, India
Monitoring Algorithm for Datafication and Information Control for Data Optimization

Mita Mehta
Abstract Datafication is a technique of collecting, storing and analyzing data in digital format. This process has become progressively predominant as advances in technology have made it easier to collect and store large amounts of data, and monitoring algorithms are applied to this data to identify patterns, trends and anomalies. These algorithms are also used to ensure the integrity and quality of the data, and to detect and prevent fraudulent activities. Information control is the process of managing and regulating access to data. This study emphasizes that monitoring algorithms are used to restrict access to sensitive data based on the identity of the user, their role and the context of the request. These algorithms can also encrypt or obscure sensitive data to protect it from unauthorized access. In total, monitoring algorithms play an important part in datafication and information control, as they help organizations to make sense of the vast amounts of data they collect and to protect sensitive information from unauthorized access. However, it is important to note that these algorithms are not foolproof and should be part of a larger security strategy.

Keywords Datafication · Information control · Algorithm
1 Introduction

Datafication refers to the process of converting various aspects of our lives into data, which can then be collected, analyzed and used for various purposes. This includes things like tracking our online behavior, monitoring our physical movements, and collecting information about our personal preferences and habits [1]. In today’s digital world, data is being generated at unprecedented rates, with billions of people connected to the internet and using smartphones, social media and other digital devices. This has led to a vast amount of data being collected and stored, creating an unprecedented volume of data to be analyzed and used [2].

M. Mehta (B) Symbiosis Institute of Health Sciences, Symbiosis International (Deemed University), Pune, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_1
In this research, the author presents a critical review of the literature on datafication and information control, and the study finds that monitoring algorithms can be used optimally for improved results. Such a review has not appeared earlier in the literature, so this study brings a unique and novel extension to the field of algorithms. Data has become part of every citizen’s life, and this article discusses how datafication, as a new set of algorithms, can change the experience of business and society at large. The proposed findings and conclusion reveal the future implications of this study.
1.1 Datafication

Datafication can be seen in many aspects of our lives, including our online interactions, physical movements and even our health. For example, our online behavior, such as the websites we visit and the search terms we use, is tracked and analyzed to create a detailed profile of our interests, habits and preferences [3]. Similarly, the location data from our smartphones is used to track our physical movements, allowing companies and organizations to gain insights into how we move through the world. Even our health data is being collected and analyzed, with wearable devices and apps tracking our steps, heart rate and other health metrics.
1.2 Information Control

Information control refers to the ways in which this data is used and managed. This can include things like restriction, surveillance and the manipulation of information for political or economic gain [4]. The increasing datafication of our lives has raised concerns about privacy and the potential for abuse of this data by governments, companies and other entities. Governments and corporations may use data to control information, influence public opinion and manipulate the market. This can lead to a variety of negative consequences, such as the erosion of privacy, the spread of misinformation and the manipulation of public opinion.
2 Literature Review

Datafication and information control can also have a significant impact on marginalized communities, who may be disproportionately affected by these activities. For example, data-driven systems may preserve existing biases and discrimination, leading to further marginalization and inequality. Therefore, it is important to consider the implications of datafication and information control and take steps to ensure that these practices are used ethically and responsibly. Datafication is the process of converting various aspects of our lives into data, while the collection, analysis and use of this data is known as information control.
This can have both positive and negative effects on our lives, and it is important to consider the implications of these practices and take steps to ensure they are used responsibly. A review can also examine the existing research on the topic, including studies on the ways in which data is collected and used, which further leads to the implications of datafication for securing privacy while keeping the social gain at large in focus. Datafication also focuses on how data is collected and exploited or used. The literature examines the different types of data that are collected, such as location data, behavioral data and health-related data, and how such data can be analyzed and used. Research also indicates that datafication is cleverly used in targeted advertising and in developing new products [5, 6]. Another important area of exploration in datafication is the investigation of the inferences drawn from data for security and privacy reasons. Authors found that datafication can lead to threats [7]. Due to continuous tracing of online behavior, there is a high possibility that privacy may be breached and heavily compromised [8]. Communities that fall into the marginalized bracket may be greatly affected by datafication [9], and the impact on such communities would be disproportionate [10]. Authors have also mentioned the tremendous potential of datafication for the social and economic upliftment of any nation, and studies have discussed how datafication can be optimally utilized in public transport. Social gain is one of the most discussed implications across the literature; health and transport are among the most latent yet social-gain-oriented areas where datafication can be used [11]. Along with these merits come threats as well; the literature also cites the corresponding disadvantage in terms of manipulation of data. A literature review on datafication would inspect the existing research on the topic, including studies on the ways in which data is collected and used, the implications of datafication for privacy and security, and the potential for datafication to be used for social and economic gain. It is important to consider the potential negative impact and the ethical concerns surrounding datafication while exploring its benefits.

A literature review on information control would examine the existing research on the topic, including studies on the ways in which information is controlled, the implications of information control for society, and the potential for information control to be used for political or economic gain. One key area of research in information control is the study of how information is controlled. This includes examination of the different types of information that are controlled, such as news, social media content, and government data, and the ways in which this information is controlled and restricted. Research in this area has shown that information control can take many forms, from censorship to propaganda to disinformation campaigns [12, 13]. Another important area of research in information control is the examination of the implications of information control for society. Studies in this area have shown that information control can have a significant impact on democracy, freedom of expression, and human rights [14, 15]. Research has also shown that information control can lead to a loss of trust in media and government institutions [16].
Additionally, there are studies exploring the potential for information control to be used for political or economic gain. For example, research has shown that information control can be used to manipulate public opinion and influence elections (e.g., [17]). However, there are also concerns about the potential for information control to be used to perpetuate existing power imbalances and undermine democratic institutions [18]. A literature review on information control would examine the existing research on the topic, including studies on the ways in which information is controlled, the implications of information control for society, and the potential for information control to be used for political or economic gain. It is important to consider the potential negative impact and the ethical concerns surrounding information control while exploring its benefits.
3 Datafication and Information Control

There have been several studies conducted on the topics of datafication and information control. These studies have examined various aspects of these phenomena, including how data is collected and used, the implications of datafication and information control for privacy and security, and the potential for these practices to be used for social and economic gain. One study [6] looked at the potential economic benefits of datafication, arguing that the collection and analysis of large amounts of data can lead to new business opportunities and improved public services. The study also highlighted the potential for datafication to be used for nefarious purposes, such as political manipulation and economic exploitation. Another study examined the implications of datafication for privacy and security, finding that the collection and analysis of large amounts of personal data can lead to a loss of privacy and increased risks of data breaches. The study also highlighted the potential for datafication to have a disproportionate impact on marginalized communities. A study by [19] looked at the ways in which information control is used to manipulate public opinion and influence elections. The study found that the spread of false and misleading information through social media and other digital platforms can have a significant impact on political outcomes. A study by [18] examined the implications of information control for democracy, freedom of expression, and human rights, arguing that the manipulation of information can undermine democratic institutions and lead to a loss of trust in media and government institutions. In sum, these studies have highlighted the potential benefits and drawbacks of these phenomena, as well as the potential for datafication and information control to be used for nefarious purposes and to have a disproportionate impact on marginalized communities.
4 Algorithm for Datafication and Information Control
Through this study the author aims to propose a monitoring algorithm that addresses concerns of data privacy and management. Many types of algorithms can be developed for information control, depending on the specific use case and the desired outcome. Some examples include:
• Filtering algorithms, which can be used to identify and remove unwanted or irrelevant information from a dataset.
• Censorship algorithms, which can be used to block or remove certain types of content from being accessed or shared.
• Propaganda algorithms, which can be used to influence public opinion by selectively presenting certain information and suppressing other information.
• Personalization algorithms, which can be used to tailor the information presented to an individual based on their interests, preferences, and behavior.
• Monitoring algorithms, which can be used to track and analyze the spread of information across different channels and platforms.
Monitoring algorithms are a set of techniques used to track and analyze data, systems, and processes. They can be used to identify patterns, trends, and anomalies in the data, as well as to ensure the integrity and quality of the data. Monitoring algorithms for data privacy and fraud control refer to the methods and techniques used to protect sensitive personal information and to detect and prevent fraudulent activities. Some examples of monitoring algorithms include:
• Anomaly detection: algorithms that analyze data to detect unusual or suspicious behavior, such as large transactions or login attempts from unusual locations.
• Identity verification: algorithms that verify the identity of a person or entity, for example by comparing a live photo or video to a government-issued ID.
• Access control: algorithms that restrict access to sensitive data based on the identity of the user, their role, and the context of the request.
• Data encryption: algorithms that encrypt data to protect it from unauthorized access.
• Data masking: algorithms that obscure sensitive data to protect it from unauthorized access.
• Privacy-preserving data mining: algorithms that analyze data while preserving the privacy of individual data subjects.
Propaganda algorithms in particular can influence the public or a community at large through social media and public advertisements.
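The following is a minimal sketch of one of the monitoring techniques listed above, anomaly detection, implemented as a simple z-score rule over transaction amounts; the field names and the threshold are illustrative assumptions rather than a prescribed design.

```python
# Minimal anomaly-detection sketch for a monitoring algorithm (illustrative only).
# Field names ("user", "amount") and the z-score threshold are assumptions.
from statistics import mean, stdev

def flag_anomalies(transactions, threshold=2.0):
    """Flag transactions whose amount deviates strongly from the observed history."""
    amounts = [t["amount"] for t in transactions]
    if len(amounts) < 2:
        return []
    mu, sigma = mean(amounts), stdev(amounts)
    if sigma == 0:
        return []
    # A transaction is suspicious if its z-score exceeds the threshold.
    return [t for t in transactions if abs(t["amount"] - mu) / sigma > threshold]

history = [{"user": "u1", "amount": a} for a in (40, 55, 38, 60, 47, 52, 4500)]
print(flag_anomalies(history))   # with this illustrative threshold, only the 4500 entry is flagged
```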
5 Conclusion
These algorithms can be applied to a wide range of data types and formats, including text, images, and sensor data. It is important to note that these algorithms are not foolproof and should be part of a larger security strategy. Datafication is a technological trend entering business and government at a great pace. Artificial intelligence comes with great advantages but with responsibility as well. There is no doubt that its applications are a huge blessing, from startups to the healthcare sector, and governments can trace potential threats in advance. The future of datafication lies in its careful application under appropriate regulation. In the context of decision making and analysis, datafication is an efficient tool, and combined with blockchain technology it can deliver more secure and authentic results. In further studies this work can serve as a base for identifying future applications of datafication, such as cognitive computing or machine learning. Datafication with a monitoring algorithm can be applied in the field of marketing, for example brand-to-persona targeting, and in higher education to keep records under surveillance for accreditation purposes. In short, this study highlights the powerful aspects of datafication and the need for its careful application across business, government and society at large.
Acknowledgements This research received no funding and there is no conflict of interest.
References 1. Moyo DZ, Dixon DP (2022) Datafication, digitisation, and the narration of agriculture in Malawi. In: Routledge handbook of the digital environmental humanities, 1st edn. Routledge, London, pp 172–181 2. Bibri SE, Allam Z (2022) The metaverse as a virtual form of data-driven smart cities: the ethics of the hyper-connectivity, datafication, algorithmizing, and platformization of urban society. Comput Urban Sci 2(1):22 3. Wang G, Zhao J, Van Kleek M, Shadbolt N (2023) Treat me as your friend, not a number in your database’: co-designing with children to cope with datafication online 4. Pettigrew AM (1972) Information control as a power resource. Sociology 6(2):187–204 5. Domingo-Ferrer J, Blanco-Justicia A (2020) Privacy-preserving technologies. In: The international library of ethics, law and technology. Springer International Publishing, Cham, pp 279–297 6. Manyika J et al (2011) Social media and fake news in the 2016 election. J Econ Perspect 31(2):211–236 7. Acquisti A (2015) Privacy in the age of big data: a research agenda. Inf Syst Res 26(2):167–182 8. Barocas S, Selbst SD (2016) Big data’s disparate impact. SSRN Electron J 9. Boyd D, Crawford K (2012) Critical questions for big data. Inform Commun Soc 15 10. Joffe G (2017) Data colonialism: the exploitation of black and brown bodies in the age of big data. Int J Commun 11 11. Gerbaudo P (2012) Tweets and the streets: social media and contemporary activism. Pluto Press 12. Benkler Y, Roberts H, Zuckerman E (2017) A duty of care in the algorithmic society. Harvard Business Review
13. Tufekci Z (2014) Big questions for social media big data: representativeness, validity and other methodologies. Proc Int AAAI Conf Web Social Media 8:505–514 14. Howard PN (2011) Reply to Evgeny Morozov's review of the digital origins of dictatorship and democracy: Information technology and political Islam. Perspect Politics 9(4):900–900 15. King G, Rieder B, Gottfried J (2016) How the news media amplify political polarization. Am J Political Sci 60(3):515–530 16. Stoker G (2018) Why democracy needs a strong state. Cambridge University Press 17. Allcott H, Gentzkow M (2017) Social media and fake news in the 2016 election. J Econ Perspect 31(2):211–236 18. Epstein B, Gitlin T (1996) The twilight of common dreams: Why America is wracked by culture wars. Contemp Sociol 25(6):725 19. Woolley SC, Howard PN (2016) Automation, algorithms, and politics | political communication, computational propaganda, and autonomous agents—introduction. Int J Commun 10:9
Understanding Customer Perception Regarding Branded Fuel ‘XP-95’ in IOCL Retail Outlets Using Business Intelligence (BI) by the Help of NVIVO and SPSS Software Arunangshu Giri , Dipanwita Chakrabarty , Adrinil Santra , and Soumya Kanti Dhara
Abstract The petrochemical industry plays an important role in a country, and IOCL (Indian Oil Corporation Ltd.) is one of the most eminent names in the Indian petrochemical industry. IOCL's marketing division was keen to understand customers' behavioural intentions towards the purchase of XP-95. XP-95 is a comparatively high-priced product, though it protects the engine and hence gives better mileage than ordinary petrol, as claimed by the company. This research was executed with the aim of answering the major research question: how do customers perceive the performance of XP-95? After collecting data from 1000 respondents through the convenience sampling method, we used both qualitative analysis (thematic analysis) and quantitative analysis (collinearity testing and multiple regression analysis) using business intelligence (BI) with the help of NVIVO and SPSS software. This research reveals that factors such as mileage, anti-knocking and pollution reduction are the major underlying reasons behind a favourable purchase decision of XP-95 by customers. Keywords Customer perception · Branded fuel (XP-95) · Retail outlets · Business intelligence (BI) · NVIVO · SPSS software
A. Giri (B) · D. Chakrabarty · A. Santra · S. K. Dhara
Haldia Institute of Technology, Haldia, West Bengal 721657, India
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_2
1 Introduction
IOCL (Indian Oil Corporation Limited) is a diversified, integrated energy major with a presence in almost all the streams of oil, gas, petrochemicals and alternative energy sources. It is a world of high-calibre people, state-of-the-art technologies and cutting-edge R&D; a world of best practices, quality-consciousness and transparency; and a world where energy in all its forms is tapped most responsibly and delivered to the
consumers most affordably. IOCL is the highest-ranked Indian energy PSU (Rank 212) in Fortune's 'Global 500' listing, with the vision of becoming 'A Globally Admired Company'. XP-95 is mainly purchased by owners of two- or four-wheeler sports vehicles, as observed by the assistant manager of the COCO (company-owned, company-operated) outlet. XP-95 is a comparatively high-priced product, though it protects the engine and hence gives better mileage than ordinary petrol, as claimed by the company [1]. This research aimed to assess the performance of XP-95 as far as customer reach is concerned. To answer this, it is important to know whether customers are aware of the product. The perception of consumers regarding the product is also an important consideration before deciding on promotional strategies [2]. Therefore, we framed the major research question: 'How do customers perceive the performance of XP-95?'
2 Research Methodology
In this research paper, we used a cross-sectional descriptive research design framework [3]. Both secondary and primary data were used for establishing the research model. We collected secondary data from different research reports and websites [4], and primary data from targeted respondents with the help of a pre-tested structured questionnaire using a 5-point Likert scale ranging from 'Strongly Agree-5' to 'Strongly Disagree-1' [5]. We distributed this questionnaire among 1000 customers who visited IOCL retail outlets. In the structured questionnaire, we used close-ended questions for quantitative analysis and open-ended questions for thematic and qualitative analysis. We used the convenience sampling method for collecting primary data [6, 7]. The survey period was from 15th July 2022 to 15th October 2022. In this study, we executed collinearity testing and multiple regression analysis using SPSS-28 software and thematic analysis using NVIVO software. Initially, we checked the reliability of the primary data [8].
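As a minimal sketch of the quantitative steps described above (reliability checking with Cronbach's Alpha and collinearity checking with VIF, which SPSS performs internally), the following Python code shows the underlying computations on made-up 5-point Likert data; the sample values and the random seed are illustrative assumptions only.

```python
# Hedged sketch of the reliability and collinearity checks described in the
# methodology. NumPy only; the Likert responses below are made-up illustrative
# data, not the actual survey dataset.
import numpy as np

def cronbach_alpha(items):
    """items: respondents x items matrix of Likert scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_var / total_var)

def vif(X):
    """Variance Inflation Factor for each column of X (predictors only)."""
    X = np.asarray(X, dtype=float)
    out = []
    for j in range(X.shape[1]):
        y = X[:, j]
        Z = np.delete(X, j, axis=1)
        Z1 = np.column_stack([np.ones(len(Z)), Z])          # add intercept
        beta, *_ = np.linalg.lstsq(Z1, y, rcond=None)        # regress X_j on the rest
        resid = y - Z1 @ beta
        r2 = 1 - resid.var() / y.var()
        out.append(1.0 / (1.0 - r2))
    return out

rng = np.random.default_rng(0)
base = rng.integers(1, 6, size=(50, 1))                      # latent attitude per respondent
X = np.clip(base + rng.integers(-1, 2, size=(50, 6)), 1, 5)  # six correlated Likert items
print(round(cronbach_alpha(X), 3), [round(v, 2) for v in vif(X)])
```

The same least-squares pattern extends to the multiple regression of the dependent factor on the six independent factors.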
3 Data Analysis and Results See Table 1.
3.1 Qualitative Analysis for Customer Perception (Regarding XP-95) Measurement From Word-Mapping of NVIVO Software (Fig. 1), we can interpret that almost all the customers reported that they received better mileage from XP-95. This was the
Table 1 General information about customer-respondents (demographic profile)

Categories | Characteristics | Frequency | Percentage (%)
Gender | Male | 658 | 65.8
 | Female | 342 | 34.2
Age | Less than 25 years | 172 | 17.2
 | 26–35 years | 485 | 48.5
 | 36–45 years | 196 | 19.6
 | More than 45 years | 147 | 14.7
Family income/annum | Less than Rs. 250,000 | 408 | 40.8
 | Rs. 250,001 to Rs. 500,000 | 294 | 29.4
 | Rs. 500,001 to Rs. 1,000,000 | 182 | 18.2
 | More than Rs. 1,000,000 | 116 | 11.6
Educational qualification | Below 10th Std | 113 | 11.3
 | Up to 10th Std | 265 | 26.5
 | Up to 12th Std | 398 | 39.8
 | UG | 142 | 14.2
 | PG | 82 | 8.2
Residence of respondents | Rural | 295 | 29.5
 | Urban | 343 | 34.3
 | Semi-urban | 256 | 25.6
 | Metro-city | 106 | 10.6
major cause for customers opting for XP-95, apart from promotions. XP-95 users also stated that it kept the engine in good health due to its superior anti-knocking quality compared with MS (Motor Spirit). Environment-conscious customers also preferred XP-95 because it reduces hydrocarbon emissions. One of the major causes for which customers did not opt for XP-95 was that it is more expensive than normal petrol. Many customers preferred using XP-95 because they felt better performance from their vehicles while using it.
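The following is not NVIVO output, but a minimal word-frequency sketch of the kind of word-mapping summarized in Fig. 1; the sample review strings and the stop-word list are illustrative assumptions.

```python
# Toy word-frequency count in the spirit of the NVIVO word-mapping (illustrative only).
from collections import Counter
import re

reviews = [
    "Better mileage and smooth engine after switching to XP-95",
    "Expensive compared to normal petrol but mileage is good",
    "Less knocking, engine feels better, worth the money",
]
stop_words = {"and", "to", "the", "is", "but", "after", "of"}

words = [w for r in reviews for w in re.findall(r"[a-z0-9-]+", r.lower())
         if w not in stop_words]
print(Counter(words).most_common(5))   # most frequent themes across the sample reviews
```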
Fig. 1 NVIVO Output-thematic analysis regarding the feedback on customer perception regarding XP-95
3.2 Quantitative Analysis for Customer Perception (Regarding XP-95) Measurement
Reliability Testing. Cronbach's Alpha was used for measuring the reliability of the primary dataset. An Alpha value of more than 0.7 indicates that the data are within the acceptable range of reliability [9, 10]; it measures the internal consistency of the primary dataset, including the dependent and independent variables. The Alpha result of 0.895 (Table 2) was a strongly acceptable value. Here, we identified 7 factors: 'Customer Perception regarding XP-95' was taken as the dependent factor and the remaining 6 factors as independent factors. Also, 'Variance Inflation Factor' (VIF) values of less than 3 (Table 3), within the acceptable range, indicated that the independent factors influencing the dependent factor are free from multicollinearity [11].
Multiple Regression Analysis. Here, we used multiple regression to find the impact of the independent factors on the dependent factor 'Customer Perception regarding XP-95'. From the model summary of the regression analysis (Table 4), we observed that

Table 2 Reliability statistics for customer perception (regarding XP-95) measurement

Cronbach's alpha | N of items
0.895 | 6
Table 3 Collinearity statistics for customer perception (regarding XP-95) measurement

Factors | Tolerance | VIF
Pollution reduction (X1) | 0.302 | 2.135
Mileage (X2) | 0.283 | 2.431
Anti-knocking (X3) | 0.275 | 1.279
Acceleration (X4) | 0.353 | 2.106
Expensive (X5) | 0.416 | 1.348
Quality (X6) | 0.221 | 1.822
Dependent variable: customer perception regarding XP-95
the R-value (correlation coefficient) was 0.897 at the 1% level of significance, indicating a strong correlation between the dependent and independent variables [11]. We also obtained an acceptable R-Square value of 0.805, or 80.5%. On the other hand, the Durbin-Watson value (
0): Add the product;
Compute Average of: Product_quality_percentage; Worth_the_money_percentage; comfort_percentage; User_experience_percentage; recommend_others_percentage
Display the percentage of: Product_quality_percentage; Worth_the_money_percentage; comfort_percentage; User_experience_percentage; recommend_others_percentage;
Display the reviews with highest priority on top;
if(Delete == 1): Delete the product;
Stop
4 Result and Analysis
The performance of the enhanced online product review monitoring system, which works on the basis of the signed interference algorithm and the genetic algorithm, is compared with spam scanners that filter reviews using machine-learning-based spam filtering software. Such a model is non-deterministic in nature and can produce different outputs when run repeatedly from the same initial state, that is, it may classify the same review differently when it is provided multiple times. It also requires a huge dataset for training the model and needs human intervention to preprocess and analyse the data before applying it as input to the corresponding model, which is time-consuming. The time consumed by the online product review monitoring system is 37% less than the machine learning approach and 75% less than the collective hyping mechanism. The comparison between time and number of reviews is shown in Table 1 and Fig. 5.
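A minimal sketch of the percentage decrease rate (PDR) reported in Tables 1, 2 and 3 follows; the formula is an assumption inferred from the reported figures (for the error counts at 18 reviews it reproduces the 71.4% and 80% entries of Table 2).

```python
# Assumed percentage decrease rate (PDR): relative reduction of the proposed
# system's value against a baseline approach.
def percentage_decrease_rate(proposed, baseline):
    return round((baseline - proposed) / baseline * 100, 1)

print(percentage_decrease_rate(proposed=2, baseline=7))    # vs machine learning: 71.4
print(percentage_decrease_rate(proposed=2, baseline=10))   # vs collective hyping: 80.0
```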
Fig. 4 Genetic algorithm flowchart
The enhanced online product review monitoring system's performance is also compared with the collective hyping mechanism, which works by recognizing identical patterns in spam reviews and labelling an input as spam if it contains the same pattern that spam reviews contain. However, it does not label reviews properly, because all kinds of spam cannot be provided as input for training. If there is an unseen pattern of spam on which the system has not been trained, or if the pattern is not recorded in the evaluating system, then it does not identify the input as spam and wrongly predicts a genuine label.
Table 1 Time versus number of reviews

Reviews under consideration | Time (s), online product review monitoring system | Time (s), machine learning approach | Time (s), collective hyping mechanism | PDR (%) vs machine learning approach | PDR (%) vs collective hyping mechanism
(1) 12 | 32 | 52 | 122 | 42 | 77
(2) 18 | 47 | 82 | 202 | 45 | 79
(3) 22 | 62 | 97 | 252 | 38 | 78
(4) 27 | 77 | 142 | 322 | 48 | 79
(5) 32 | 92 | 172 | 382 | 49 | 78
(6) 37 | 107 | 212 | 452 | 52 | 78
Fig. 5 Number of reviews versus time taken
Even in the worst case, the online product review monitoring system is 63% more accurate than the machine learning approach and possesses 72% more accuracy than the collective hyping mechanism. The accuracy of the review monitor is high compared to the machine learning and collective hyping approaches since it uses the simple signed interference and genetic algorithms. The comparison between the number of reviews and the number of error counts is shown in Table 2 and Fig. 6. The error percentage for the online product review monitoring system is 50% lower than the machine learning approach and 40% lower than the collective hyping mechanism, since the system does not depend on linguistic features to detect spam reviews; rather, it works on solid conditions, such as whether the total number of reviews given by the customer is 1 for this product and whether the content of the
Table 2 Number of reviews versus number of error count

Reviews under consideration | Error count, online product review monitoring system | Error count, machine learning approach | Error count, collective hyping mechanism | PDR (%) vs machine learning approach | PDR (%) vs collective hyping mechanism
(1) 12 | 0 | 5 | 7 | 100 | 100
(2) 18 | 2 | 7 | 10 | 71.4 | 80
(3) 22 | 2 | 8 | 15 | 75 | 86.6
(4) 27 | 3 | 10 | 17 | 70 | 88.23
(5) 32 | 6 | 18 | 18 | 66.6 | 66.6
(6) 37 | 8 | 20 | 24 | 60 | 66.6
Fig. 6 Number of reviews and number of error count
review matches the product which is being reviewed. The comparison between the number of spams recognized and the number of reviews is shown in Table 3 and Fig. 7. The review monitoring system takes less time than the machine learning system and the collective hyping mechanism to verify the same number of reviews. This speed efficiency is due to the fact that it is built using simple if-then rules, whereas the collective hyping mechanism needs to identify the pattern in the review and compare whether it is the same as the spam reviews' pattern, which takes more time to analyse.
Table 3 Spams recognized versus number of reviews

Reviews under consideration | Spams recognized, online product review monitoring system | Spams recognized, machine learning approach | Spams recognized, collective hyping mechanism | PDR (%) vs machine learning approach | PDR (%) vs collective hyping mechanism
(1) 12 | 12 | 6 | 5 | 50 | 58.3
(2) 18 | 18 | 10 | 9 | 44.4 | 50
(3) 22 | 21 | 12 | 10 | 42.8 | 52.3
(4) 27 | 26 | 15 | 10 | 42.3 | 61.5
(5) 32 | 30 | 20 | 18 | 33.3 | 40
(6) 37 | 36 | 24 | 21 | 33.3 | 38.8
Fig. 7 Number of spams recognized versus number of reviews
5 Conclusion
An analysis was made to compare the performance of the three mechanisms based on speed, error metrics and spam detection. As a result of the analysis, the enhanced online product review monitoring system proved to be the best when compared with the machine learning system and the collective hyping mechanism. Besides identifying fake reviews, this system also lists a prioritized review list so that the user can learn about the product more efficiently, which helps to increase the sale of the product. It also serves to ensure that all users give reviews only after using the product, and it gives the user a clear overview of the product across
sectors such as product quality, whether the product is worth the money, comfort of using the product, whether the user would recommend the product to others, and user experience, each in percentage form. If the user is not interested in typing a review, the system keeps things simple by letting the user fill in stars for the corresponding sector if they are satisfied. Thus, using this product reviewer increases business revenue and also makes customers' lives easier.
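A minimal sketch of the sector-wise percentage summary described above follows: star ratings (1-5) per aspect are averaged and rescaled to a percentage. The aspect names mirror the algorithm's variables, while the ratings themselves are made-up illustrative values.

```python
# Sector-wise percentage summary from star ratings (illustrative data only).
ratings = {
    "product_quality":   [5, 4, 5, 3, 4],
    "worth_the_money":   [4, 3, 4, 4, 5],
    "comfort":           [5, 5, 4, 4, 4],
    "user_experience":   [4, 4, 5, 3, 4],
    "recommend_others":  [5, 4, 4, 4, 3],
}

for aspect, stars in ratings.items():
    percentage = sum(stars) / (5 * len(stars)) * 100   # rescale 1-5 stars to 0-100%
    print(f"{aspect}: {percentage:.0f}%")
```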
References 1. Shehnepoor S, Salehi M, Farah-bakhsh R, Crespi N (2017) NetSpam: a network-based spam detection framework for reviews in online social media. IEEE Trans Inform Forens Sec 12(7):1585–1595 2. Radovanovi´c D, Krstaji´c B (2018) Review spam detection using machine learning. In: 2018 23rd international scientific-professional conference on information technology (IT) 3. Jia S, Zhang X, Wang X, Liu Y (2018) Fake reviews detection based on LDA. In: 2018 4th IEEE international conference on information management 4. Zhang Q, Wu J, Zhang P, Long G, Zhang C (2017) Collective hyping detection system for identifying online spam activities. IEEE Intell Syst 32(5):53–63 5. Rajamohana SP, Umamaheswari K, Dharani M, Vedackshya R A survey on automated spurious spam detection techniques. IEEE international conference on innovations in green energy and healthcare technologies (ICIGEHT’17) 6. Istiaq Ahsan MN, Nahian T, Kafi AA, Hossain MI, Shah FM (2016) Review spam detection using active learning. In: 2016 IEEE 7th annual information technology, electronics and mobile communication conference (IEMCON) 7. Kumar S, Gao X, Welch I, Mansoori M (2016) A machine learning based web spam filtering approach’. In: 2016 IEEE 30th international conference on advanced information networking and applications 8. Yang X One methodology for spam review detection based on review coherence metrics. In: Proceedings of 2015 international conference on intelligent computing and internet of things 9. Algur SP, Biradar JG (2015) Rating consistency and review content based multiple stores review spam detection. In: 2015 International conference on information processing (ICIP) 10. Lin Y, Zhu T, Wu H, Zhang J, Wang X, Zhou A (2014) Towards online AntiOpinion spam: spotting fake reviews from the review sequence. In: 2014 IEEE/ACM international conference on advances in social networks analysis and mining (ASONAM 2014) 11. Liang D, Liu X, Shen H (2014) Detecting spam reviewers by combing reviewer feature and relationship. In: 2014 International conference on informative and cybernetics for computational social systems (ICCSS) 12. Wang G, Xie S, Liu B, Yu PS (2011) Review graph based online store review spammer detection. In: 2011 IEEE 11th international conference on data 13. Lai CL, Xu KQ, Lau RYK, Li Y, Jing L (2010) Toward a language modeling approach for consumer review spam detection. In: 2010 IEEE 7th international conference on e-business engineering 14. Algur SP, Patil AP, Hiremath P, Shivashankar S (2010) Conceptual level similarity measure based review spam detection. In: 2010 International conference on signal and image processing
A Comparison of the Key Size and Security Level of the ECC and RSA Algorithms with a Focus on Cloud/Fog Computing Dhaval Patel, Bimal Patel, Jalpesh Vasa, and Mikin Patel
Abstract The rapid growth of data generation and the demand for real-time processing and analysis in the Internet of Things (IoT) have led to the emergence of new issues. In response, an extension of cloud computing known as fog computing aims to overcome these problems by bringing computation, storage, and network services closer to the end-user using a distributed computing infrastructure. With the ever-increasing need for secure communication in the digital age, the importance of encryption algorithms cannot be overstated. This study provides an analysis of two commonly used encryption techniques, Elliptic Curve Cryptography (ECC) and Rivest-Shamir-Adleman (RSA), with a specific focus on cloud/fog computing. The research compares the key size and security level of the ECC and RSA algorithms and evaluates their suitability for resource-constrained fog computing environments. ECC, the newer technique, offers the same level of security as RSA while using smaller key sizes, making it more resource-efficient. RSA, on the other hand, has been widely accepted and has proven its security over the years. The study concludes that the choice of encryption algorithm for cloud/fog computing depends on the specific requirements of the system, and that ECC can be recommended for improved security and faster performance without putting unnecessary strain on computing resources.
D. Patel (B) · B. Patel · J. Vasa · M. Patel Smt. K D Patel Department of Information Technology (KDPIT), Chandubhai S. Patel Institute of Technology (CSPIT), Faculty of Technology and Engineering (FTE), Charotar University of Science and Technology (CHARUSAT), Changa, Gujarat, India e-mail: [email protected] B. Patel e-mail: [email protected] J. Vasa e-mail: [email protected] M. Patel e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_5
Keywords RSA · ECC · Cryptography · Computation speed · Computation cost · IoT · Cloud computing
1 Introduction
Human civilizations today have enthusiastically embraced physically smaller computer devices, such as mobile phones, digital assistants, smart cards, and other embedded systems, to enhance daily life. Technology is advancing quickly, which has led to a surge in cloud-connected gadgets producing enormous amounts of data. The cloud is the largest data unit, where processing and storage are done primarily to deliver the resources and services users are looking for [1]. This applies on larger scales, in scenarios like smart factories, farms, and cities, and on smaller devices, in situations like smart chips, cell phones, and portable gadgets [2].
1.1 Cloud Computing
The basic goal of employing cloud computing is to provide computer services over a wide range of networks using remote servers, data, and software. Cloud-based services make use of a global network of servers connected by various web resources. In the cloud, physical computers connected over the internet work with both wired and wireless gadgets, in addition to actuators and readers. The different cloud nodes serve as middlemen between service providers and end consumers. Because they must provide scalable processing power and big data analysis, storage locations specific to cloud architecture are required [3]. The cloud is a crucial component of effective Industrial Internet of Things (IIoT) applications [4].
1.2 Fog Computing According to Cisco, the expansion of cloud computing used to analyze large amounts of data and provide better services with lower latency rates is known as fog computing [5, 6]. For the purpose of providing data services, Fog has a significant number of geographic nodes set up over a big area [7]. Every server in Fog is a more energy-efficient update to the cloud network while still offering services close to the customers [8, 9]. The benefit of having Fog servers close to devices is that they can store data and operate in real-time [10, 11]. Fog networks were created to supplement cloud computing while reducing reliance on the cloud for resource availability [12]. One advantage of deploying fog in the Internet of Things is lower data transfer latency. It supports IoT applications and offers low latency, mobility,
A Comparison of the Key Size and Security Level of the ECC and RSA …
45
location recognition, scalability, security, and integration with heterogeneous devices [13, 14]. The smaller devices are highly prone to security breaches since they have constrained computational power, less memory, and inefficient energy resources, whether they are served through cloud computing or fog architecture networks. Doubts about the service provider's honesty and the reliability of the technologies used to handle the sensitive data of the end user exacerbate the issue. Despite the numerous studies that have delved into these problems, and the remarkable answers that have been offered within their inherent limits, the residual problems have not yet been fully or properly remedied. Encryption procedures are the most effective part of stopping data theft; data encryption is a technique for limiting access to sensitive data on any user's device without that user's express agreement. Due to the massive volume of data being generated by connected devices and processed, filtered, and analyzed in the cloud, there are now problems with traffic congestion, delays, and privacy concerns. Many limitations also become clear because the cloud is unable to provide some necessities, such as support for heterogeneous devices, low latency, mobility, and position identification [15]. Moreover, the IoT's current infrastructure is not designed to handle the volume and speed of data that it generates [16].
1.3 Internet of Things (IoT)
One way to describe the Internet of Things is as a collection of interconnected devices and internet networks whose data must be efficiently processed and stored so that it can be retrieved whenever necessary in a safe and understandable manner. For that, cloud architecture is a key component for effective data processing and storage in the IoT area [17, 18]. The Internet of Things relies on cloud computing to function globally, as this technology makes it possible for data to be accessed safely and privately from anywhere in the world. When a situation involving limited resources occurs, IoT devices are sensitive to external threats since, unlike other industrial settings, they cannot depend on dependable communication [19–21].
2 Background
In earlier cryptosystems, all messages were encrypted and decrypted using symmetric keys. They were called symmetric because users at both endpoints used the same key. One of their worst drawbacks is the loss of data security once the shared key is compromised. Among asymmetric encryption techniques, the Rivest-Shamir-Adleman (RSA) and ECC algorithms are among the top PKC techniques, and they withstand several crypto-analytical methods when used collectively [22].
46
D. Patel et al.
Even so, there will always be a trade-off between a computationally constrained device's advantages and disadvantages. The CPU, a specific type of data storage memory, and power-supplying batteries are always present in devices with modest computing capabilities. Since chips, transistors, and microprocessors keep getting smaller in line with Moore's law, a more effective, though more expensive, method of securing the information processed and stored through them is required. When an algorithm like the Rivest-Shamir-Adleman (RSA) algorithm is employed as a Public Key Cryptography (PKC) method on such limited computational capacity, extra safeguards become all the more necessary [23]. In this study, a comparison between the RSA and ECC algorithms is made based on key sizes. A thorough literature analysis was conducted to gather information for analyzing these two algorithms. The goal is to supply researchers with a superior algorithm as a replacement for one that has already been used, in order to create a more reliable and secure computing device. The results would be very helpful in identifying a solution that can use resources wisely and overcome existing weaknesses, and would provide a more secure option with better performance for devices with limited resources.
2.1 Rivest-Shamir-Adleman Algorithm (RSA)
The Rivest-Shamir-Adleman (RSA) algorithm, developed in 1978 by Rivest and his colleagues, is widely recognized as a highly reliable, secure, and trustworthy asymmetric algorithm [24, 25]. RSA is a very flexible asymmetric algorithm since it can use variable-length keys; however, employing RSA has a limitation because speed is slightly reduced in exchange for the data's security. The size of an RSA key ranges from 512 to 2048 bits. The encryption and decryption keys in RSA are kept distinct, with the encryption key being publicly accessible and the decryption key being kept in a secure location. Two huge prime numbers are used to form the RSA public key, which is used to encrypt the message [26]. The private key is needed for message decryption, and only parties who know the prime factors can derive it. Numerous cryptographic evaluations over the years have deemed RSA to be a very trustworthy algorithm. The difficulty of factorizing the huge numbers used in RSA is what gives it its resilience; conversely, RSA's effectiveness would be readily jeopardized if the vast numbers it relies on could be easily factored. The creators of the RSA algorithm proposed a decoding challenge in 1991, the 'factorization attack', to test how hard or easy it is to break. The attack pattern and timing are critical factors in the RSA algorithm. The benefits of employing the RSA method lie in its faster and easier encryption stages. RSA is simpler for developers to comprehend and also simpler to implement. It has received significant support from businesses and industries and, as a result, is widely used in many different sectors worldwide. Even with its widespread use, RSA is not without flaws. RSA is vulnerable to GCD attacks in the event of a subpar implementation. The GCD attack is simpler to launch, and readily downloadable source code is available on sites like
A Comparison of the Key Size and Security Level of the ECC and RSA …
47
GitHub. It can run on smaller servers and does not require much additional RAM. When two RSA moduli share a common prime factor, the GCD attack becomes straightforward. The slower rate at which keys are created is another problem with RSA. Additionally, the slow rate of signing and decryption makes it difficult to deploy RSA unobtrusively and securely.
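To illustrate the scheme described above, the following is a toy RSA key setup and encryption/decryption with tiny textbook primes; real deployments use 2048-bit or larger moduli together with a padding scheme, so this sketch is illustrative only.

```python
# Toy RSA key setup and encryption/decryption with tiny primes (illustrative only).
p, q = 61, 53                      # two (small) primes, kept secret
n = p * q                          # public modulus
phi = (p - 1) * (q - 1)            # Euler's totient of n
e = 17                             # public exponent, coprime with phi
d = pow(e, -1, phi)                # private exponent (modular inverse of e, Python 3.8+)

message = 42
cipher = pow(message, e, n)        # encryption with the public key (e, n)
plain = pow(cipher, d, n)          # decryption with the private key (d, n)
print(cipher, plain)               # plain == 42
```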
2.2 Elliptical Curve Cryptography Algorithm (ECC)
Elliptical Curve Cryptography (ECC) is a PKC that makes use of the algebraic structure of elliptical curves over finite fields [27]. The brilliance of ECC lies in its use of smaller keys to achieve security comparable to non-ECC schemes, and the security level is maintained despite the keys' smaller size. Like RSA, with which it is frequently compared, ECC produces pairs of public and private keys that are necessary for ECC-based authentication. Koblitz [28] and Miller [29] each proposed the use of elliptical curves for cryptographic purposes in the 1980s (Tables 1, 2, 3 and 4). Because virtually no known sub-exponential algorithm can solve the discrete logarithm problem on a well-chosen elliptical curve, the originality of ECC lies in using smaller parameters compared to RSA or any non-ECC scheme. Therefore, employing ECC ultimately results in faster computing and more security. ECC carries out its task together with a scalar multiplication algorithm.

Table 1 Comparison of key size in symmetric and asymmetric algorithms as suggested by NIST

Symmetric algorithm key size (in bits) | RSA key size (in bits) | ECC key size (in bits)
80 | 1024 | 160
112 | 2048 | 224
128 | 3072 | 256
192 | 7680 | 384
256 | 15,360 | 512

Table 2 ECC and RSA comparison in terms of cost

RSA key size | ECC key size | Ratio of cost
1024 | 160–223 | 3:1
2048 | 224–255 | 6:1
3072 | 256–383 | 10:1
7680 | 384–511 | 32:1
15,360 | 512 | 64:1
Table 3 Key size ratio for RSA and ECC algorithms

RSA key size | ECC key size | Key size ratio
512 | 112–159 | 5:1
1024 | 160–223 | 6:1
2048 | 224–255 | 9:1
3072 | 256–383 | 12:1
7680 | 384–511 | 20:1
15,360 | 512+ | 30:1
Table 4 ECC and RSA comparison on various aspects

Algorithm | Computation | Key size | Bandwidth saving | Key generation | Encryption | Decryption | Small devices efficiency
ECC | Fast | Small | High | Fast | Fast | Fast | High
RSA | Slow | Large | Low | Slow | Slow | Slow | Low
ECC uses extremely effective algorithms for handling numbers greater than 10^100, such as General Number Field Sieve (GNFS) techniques, among others. ECC is frequently used on hardware and software platforms to improve security while transferring information through untrusted networks. A few examples of software that uses ECC are the Bitcoin digital signature protocol [30], OpenSSL [31], and image encryption. When employing ECC in the IoT domain, wireless medical devices, Android applications, and radio frequency identifiers require the deployment of the necessary hardware. The primary factor affecting a scalar-multiplication algorithm's effectiveness is its Hamming weight, which measures the number of nonzero digits in the private key; any scalar algorithm is made faster by a reduction in Hamming weight, so scalar recoding can be used in the ECC representation of private keys to lower it. ECC offers benefits in terms of smaller keys and ciphertexts and quicker key creation, and the computation of signatures is likewise quicker. Because the signing procedure is split into two parts for computing purposes, the latency of ECC is significantly lower compared to RSA. ECC boasts a strong authentication system that is necessary for key exchange. Recently, administrations and governmental organizations have supported ECC more than other schemes, and new cryptographic constructions can be built on ECC thanks to special curves developed for pairing in bilinear processes. The implementation of signatures is quicker, as shown in Fig. 1.
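To illustrate the scalar multiplication (kP) on which ECC security rests, the following is a toy implementation of elliptic-curve point addition and double-and-add multiplication over a very small prime field; the curve, base point and private scalar are illustrative assumptions, and production systems use standardized curves such as NIST P-256.

```python
# Toy elliptic-curve arithmetic over F_97 for the curve y^2 = x^3 + 2x + 3 (illustrative only).
P_FIELD, A, B = 97, 2, 3
INF = None                                  # point at infinity (group identity)

def ec_add(p1, p2):
    if p1 is INF: return p2
    if p2 is INF: return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_FIELD == 0:
        return INF                          # p1 == -p2
    if p1 == p2:
        s = (3 * x1 * x1 + A) * pow((2 * y1) % P_FIELD, -1, P_FIELD) % P_FIELD   # tangent slope
    else:
        s = (y2 - y1) * pow((x2 - x1) % P_FIELD, -1, P_FIELD) % P_FIELD          # chord slope
    x3 = (s * s - x1 - x2) % P_FIELD
    y3 = (s * (x1 - x3) - y1) % P_FIELD
    return (x3, y3)

def scalar_mult(k, point):
    """Double-and-add computation of k*point."""
    result, addend = INF, point
    while k:
        if k & 1:
            result = ec_add(result, addend)
        addend = ec_add(addend, addend)
        k >>= 1
    return result

G = (3, 6)                                  # base point on the toy curve
private_key = 13                            # toy private scalar
public_key = scalar_mult(private_key, G)    # public key = k*G
print(public_key)                           # prints (80, 87) on this toy curve
```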
Fig. 1 Elliptical curve [36]
Fig. 2 Comparison of ECC and RSA in terms of Key size
2.3 ECC Standardization Process
To achieve practical implementation and widespread acceptance, standardization is crucial for Elliptic Curve Cryptography (ECC). The National Institute of Standards and Technology (NIST), a federal agency of the United States government, has developed essential standards to ensure the secure application of ECC in various crypto-analytical domains. For safer elliptical curves over binary fields, NIST recommends fields of 2^163, 2^233, 2^283, 2^409, and 2^571 elements (Figs. 2, 3 and 4).
3 Comparison of ECC and RSA Algorithm
The efficiency of ECC is determined by the computation of the employed scalar multiplication (kP). Here, a 160-bit ECC key is sufficient to match a 1024-bit RSA key: ECC offers the same level of security as other algorithms like RSA, AES, etc. with
Fig. 3 Cost Ratio of RSA and ECC
Fig. 4 A key size ratio for similar level of security of RSA and ECC
a tiny key size. A smaller key size for the ECC method can offer the same level of security as a larger key size for the RSA technique. For instance, a 160-bit ECC key matches a 1024-bit RSA key, a 224-bit ECC key matches the widely used 2048-bit corporate RSA key, and ECC with 512 bits is as secure as RSA with 15,424 bits. The sizes of RSA keys and ECC keys do not correlate linearly; in other words, doubling the size of an RSA key does not double the size of the corresponding ECC key. This stark contrast demonstrates how much faster ECC key creation and signing are than RSA, as well as how much less memory ECC consumes. ECC has smaller ciphertexts, keys, and signatures, as well as faster key and signature creation. Deploying Elliptic Curve Cryptography (ECC) securely is challenging, which is considered its biggest flaw. ECC has a larger learning curve and takes more time to produce usable results than RSA, which is simpler on both the validation and encryption sides. Additionally, the private and public keys in ECC are not interchangeable in the same way as in RSA, where both are integers: in ECC the private key is still an integer, whereas the public key is a point on a curve. ECC enables systems with limited resources, such as mobile devices, embedded computers, and
cryptocurrency networks, to consume only 10% of the bandwidth and storage space required by RSA. The benefits of ECC are most apparent in the following circumstances: where there is a lot of online commerce or a heavy demand on web servers, for instance internet banking or e-commerce; and on devices with slower processing rates, limited memory and storage, and reduced CPU speeds, with typical examples including mobile devices, tablets, smartphones, and smart cards.
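A rough timing sketch of the key-generation comparison discussed above is given below, assuming a recent version of the third-party Python cryptography package is installed; RSA-3072 and the P-256 curve are paired as comparable security levels per Table 1, and the measured times will vary by machine.

```python
# Rough key-generation timing sketch (requires the "cryptography" package).
import time
from cryptography.hazmat.primitives.asymmetric import rsa, ec

t0 = time.perf_counter()
rsa_key = rsa.generate_private_key(public_exponent=65537, key_size=3072)
t1 = time.perf_counter()
ec_key = ec.generate_private_key(ec.SECP256R1())
t2 = time.perf_counter()

print(f"RSA-3072 key generation: {t1 - t0:.3f} s")
print(f"ECC P-256 key generation: {t2 - t1:.3f} s")
```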
4 Conclusion
The ever-increasing number of smart devices necessitates the use of more advanced data security technologies. This analysis concludes that the Elliptic Curve Cryptography (ECC) algorithm is significantly superior to widely used algorithms like RSA in terms of the security level obtained per key size. Therefore, ECC is recommended for improved security and faster performance without adding unnecessary strain to computers. ECC-like algorithms are not only more secure than RSA but also more practical for everyday use, especially in a world where smaller devices have inherent limitations in CPU power and battery life. ECC also uses less energy and costs less than RSA. However, like RSA, ECC is susceptible to side-channel attacks and fault attacks, so further development is necessary to improve its security. In conclusion, ECC is a viable alternative to RSA for securing data in resource-limited environments, but ongoing improvements are necessary to ensure its continued security [32].
References 1. Paharia B, Bhushan K (2018) Fog computing as a defensive approach against distributed denial of service (ddos): a proposed architecture. In: 2018 9th International conference on computing, communication and networking technologies (ICCCNT) (2018). 2. Sittón-Candanedo I, Alonso RS, Rodríguez-González S, García Coria JA, De La Prieta F (2020) Edge computing architectures in industry 4.0: a general survey and comparison springer international publishing 3. Díaz M, Martín C, Rubio B (2016) State-of-the-art, challenges, and open issues in the integration of internet of things and cloud computing. J Netw Comput Appl 67:99–117 4. Shu Z, Wan J, Zhang D, Li D (2016) Cloud-integrated cyber-physical systems for complex industrial applications. Mob Netw Appl 21(5):865–878 5. Chen Y, Sun E, Zhang Y (2017) Joint optimization of transmission and processing delay in fog computing access networks. In: 2017 9th international conference on advanced infocomm technology (ICAIT) 6. Sarkar S (2016) Theoretical modelling of fog computing: a green computing paradigm to support IoT applications. IET Netw 5(6):23–29
52
D. Patel et al.
7. Sari A (2018) Context-aware intelligent systems for fog computing environments for cyberthreat intelligence. In: Fog computing. Springer International Publishing 8. Khakimov A, Muthanna A, Muthanna MSA (2018) Study of fog computing structure. In: NW Russia young researchers in electrical and electronic engineering conference (EIConRusNW). IEEE 9. Pan J, McElhannon J (2018) Future edge cloud and edge computing for Internet of things applications. IEEE Int Things J 5:439–449 10. Lin C, Yang J (2018) Cost-efficient deployment of fog computing systems at logistics centers in industry 4.0. IEEE Trans Ind Inform 14:4603–4611 11. Grover J, Jain A, Singhal S, Yadav A (2018) Real-time vanet applications using fog computing. In: Proceedings of first international conference on smart system, innovations and computing. Singapore 12. Shukla S, Hassan MF, Jung LT, Awang A (2019) Architecture for latency reduction in healthcare iot. Springer International Publishing 13. Manju AB, Sumathy S (2019) Efficient load balancing algorithm for task preprocessing in fog computing environment. In: Smart intelligent computing and applications, pp 291–298 14. Vasa J et al (2023) Architecture, applications and data analytics tools for smart cities: a technical perspective. In: Sentiment analysis and deep learning: proceedings of ICSADL 2022. Springer Nature Singapore, Singapore, pp 859–873 15. Mohamed N, Al-Jaroodi J, Jawhar I, Noura H, Mahmoud S (2017) Uavfog: a uav-based fog computing for internet of things. In: 2017 IEEE SmartWorld, ubiquitous intelligence & computing, advanced & trusted computed, scalable computing & communications, cloud & big data computing, internet of people and smart city innovation (smartWorld/SCALCOM/ UIC/ATC/CBDCom/IOP/SCI) 16. Ashrafi TH, Hossain MA, Arefin SE, Das KDJ, Chakrabarty A (2017) Iot infrastructure: fog computing surpasses cloud computing. In: Hu Y-C, Tiwari S, Mishra KK, Trivedi MC (eds) Intelligent communication and computational technologies, chapter: internet of things. Springer, Singapore 17. Campeanu G (2018) A mapping study on microservice architectures of Internet of things and cloud computing solutions. In: 2018 7th Mediterranean conference on embedded computing (MECO) 18. Chovatiya F et al (2018) A research direction on data mining with IoT. In: Inform Commun Technol Intell Syst (ICTIS 2017) 1:183–190 19. Gyandev Prajapati A, Jayantilal Sharma S, Sahebrao Badgujar V (2018) All about cloud: a systematic survey. In: 2018 international conference on smart city and emerging technology (ICSCET) 20. Rui J, Danpeng S (2016) Architecture design of the Internet of things based on cloud computing. In: 2015 seventh international conference on measuring technology and mechatronics automation 21. Pan J, Wang J, Hester A, Alqerm I, Liu Y, Zhao Y (2018) EdgeChain: an edge-IoT framework and prototype based on blockchain and smart contracts. IEEE Internet of Things J 6(3):4719– 4732 22. Kumar S (2006) Elliptic curve cryptography for constrained devices. Dissertation, RurhUniversity Bochum 23. Wang Y, Streff K, Raman S (2012) Smartphone security challenges. Computer 45(12):52–58 24. Rivest RL, Shamir A, Adelman L (1978) A method for obtaining digital signature and publickey cryptosystems. Commun ACM 21:120–126 25. Mansour A, Davis A, Wagner M, Bassous R, Fu H, Zhu Y (2017) Multi-asymmetric cryptographic RSA scheme. In Proceedings of the 12th Annual Conference on Cyber and Information Security Research (pp. 1–8) 26. 
Hong J-H (2000) RSA public key crypto-processor core design and hierarchical system test using IEEE 1149 family. Ph.D. dissertation, Dept. Elect. Eng., National Tsing Hua Univ., Hsinchu, Taiwan R.O.C
A Comparison of the Key Size and Security Level of the ECC and RSA …
53
27. Nimbhorkar SU, Malik LG (2012) A survey on elliptic curve cryptography (ECC). Int J Adv Stud Comput Sci Eng 1(1):1–5 28. Koblitz N (1987) Elliptic curve cryptosystems. Math Comput 48:2003–2009 29. Miller V (1986) Use of elliptic curves in cryptography. Adv Cryptol CRYPTO ’85 LNCS 218(483) 30. Khoirom MS, Laiphrakpam DS, Themrichon T (2018) Cryptanalysis of multimedia encryption using elliptic curve cryptography. Optik 31. Li C, Zhang Y, Xie EY (2019) When an attacker meets a cipher-image in 2018: A year in review. J Inf Secur Appl 48:102361 32. Käsper E (2011) Fast elliptic curve cryptography in OpenSSL. In: Proceedings of the 2011 international conference on financial cryptography and data security, Gros Islet, St. Lucia, 28 February–4 March 2011. Springer, Berlin/Heidelberg, Germany
Data Prevention Protocol for Cloud Computing Security Using Blockchain Technology Priyanka Mishra and R. Ganesan
Abstract Cloud computing is a newly developing paradigm which provides many platforms to researchers. It is a promising technology which has continuously revolutionized different sectors and which offers new dimensions, growth and tremendous advantages to users as well as researchers. In this synergy, the fusion of cloud and blockchain technology helps service providers and users to provide and utilize resources that can be authenticated while privacy is also preserved. This joint approach is unique and eco-friendly, and it opens up further dimensions of computing that will be able to solve more than security-related issues alone. The major concern in the cloud now is security: it is essential for service providers to sustain and maintain customers' privacy and to prevent data loss. The present review paper focuses on data security and blockchain-based solutions for customers' data privacy. An in-depth review is also made of ways to enhance the functionality of cloud systems and to improve security. Blockchain technology will help cloud systems in different aspects, and it shows potential for performance and privacy improvement. Keywords Cloud computing · Security · Block-chain · Data security · Bitcoin
P. Mishra (B) · R. Ganesan
Department of Computer Science Engineering, Vellore Institute of Technology, Chennai, Tamil Nadu 600127, India
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_6
1 Introduction
Cloud computing is an emerging technology which provides various paths for users and service providers to allocate their resources across wider dimensions. These factors have encouraged researchers to give more of their thought and input to this particular field [1]. In the present scenario, security is most important to protect data, applications, services and infrastructure. Cloud computing also delivers data service requirements at low cost with minimal effort and provides high scalability. This
technology includes profitable, user-defined tools with which users can easily implement applications. Such implementation may blend applications by creating appropriate tools within the computing technology [2]. A tool related to a particular application that runs from one end to the other represents the end-user direction, while the advancing side of cloud technology refers to the portion where the security of data and information and the movement of the application are treated alike; the other tools serve to retreat the applications [3]. The different tools developed on cloud computing technology and the security of the customer's data are the two major functions that need to be concentrated on. Due to the high demand for the cloud among users, various applications have been developed, prompting users to think more about new inventions and to adapt their tools to the current era. Both users and service providers benefit from this technology, which generates curiosity among people. The cloud mainly attracts users due to its high scalability, low cost and minimal effort [4].

Cloud computing has made the software development process rapid and flexible, but on the other hand it has raised safety issues, and users who keep their information in the cloud may face several attacks [5]. People who work in cloud companies have the privilege of retrieving user information by interacting with the authentication procedure; with blockchain technology, however, it is difficult for any insider to change a user's login information [6]. Under a scheme of distributed ledger-based authentication, no insider can enter an authenticated user's information. To obtain authentication, insiders and outsiders are provided with their own IDs and signatures, which makes it easy for service providers to trace insiders, and users accessing the cloud database are also authenticated [7]. Owing to this demand, cloud computing has emerged as a key technology that delivers architecture and data service requirements at an effective cost, with minimal effort and a high level of flexibility, and it is therefore widely implemented across the software industry [8]. Rapid growth in cloud computing adoption has been observed, but information security concerns have not been fully countered and are still hindering the growth of cloud computing to some extent; they need to be resolved. At the same time, blockchain has been established as a key technology for providing security, mainly in terms of nobility, authenticity and confidentiality [9]. This paper explains the various aspects of safety in blockchain and cloud computing and further investigates the implementation of blockchain technology for cloud computing security. The users or clients of the cloud request resources from cloud service providers (CSPs). The CSP is a third party that supplies cloud storage services to clients; an attribute authority and a third-party auditor also work as third-party providers that are supposed to supply security services in the cloud [10]. The information of users is at high risk of being lost, attacked or leaked, but there is little recourse for recovering data from such a substandard situation.
Users do not know with whom they are sharing their information; therefore, transparency between users and providers is required to keep all the information of the users safe and secure [11].
2 Block-Chain Technology
Block-chain is an emerging and novel technology that can be implemented by users to increase trust and provide data safety when utilizing appropriate services from the cloud. Block-chain can provide better security than centralized database security. A block-chain is used to track a history of records that are linked and authenticated, each block referring to the previous one through a cryptographic hash function in a continuous manner [12]. A block-chain is a distributed ledger that records transactions and resists tampering. It is typically managed through a peer-to-peer network and designed to disable arbitrary tampering, and it can provide security on a par with data storage in a central database [13]. From a management perspective, damage to and attacks on data storage can be prevented. Moreover, since the block-chain has an open attribute, transparency can be provided when it is applied in an area requiring the disclosure of data [14]. Owing to such strengths, it can be utilized in diverse areas such as the financial sector, banking, government, health management and many more, and its applications are expected to expand in the near future. Cloud computing has been adopted by many industrial sectors because of its efficiency, availability and usability [15]. Block-chain helps remove the restrictions of many tools in prevailing technologies and improves system performance [16]. Each block-chain identity is associated with a public key, while ownership of the corresponding private key rests with the rightful user. User credentials are verified against the public key stored in the block-chain identity [17]. A block-chain application is resistant to the remodelling of existing information. It removes the need for an assumed arbitrary party in the verification process. Moreover, it works node-to-node in a scattered system, where the nodes support a distributed ledger and the network has no central control. The block-chain mechanism is based on a decentralized, distributed approach, which provides various benefits over customary authentication methodologies; in particular, it helps in tracking the previous records and activities of a particular user [18].
3 Structure of Block-Chain
Normally, in a block-chain, a block carries important information such as the hash of the current block, a timestamp and other information.
• Main data: depends on the type of service provided by the block, for example bank transaction records, contract records, clearance records or data records [19].
• Hash: after a transaction is executed, it is hashed to a code and then broadcast to the other nodes. Block-chain uses techniques to reduce data transmission and computing resources, because each block may contain thousands of transaction records; the final hash is recorded in the block header.
• Timestamp: the time at which the block was generated.
Other information: mainly the block signature, user-defined data, or a nonce value fall under this category [20].
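To make the structure described above concrete, the following is a minimal Python sketch of such a block. The field names and the use of SHA-256 over a JSON serialization are illustrative assumptions, not the exact layout of any particular block-chain.

```python
import hashlib
import json
import time

class Block:
    """Simplified block holding the fields described above."""

    def __init__(self, main_data, previous_hash, nonce=0):
        self.main_data = main_data          # e.g. a list of transaction records
        self.previous_hash = previous_hash  # hash linking to the preceding block
        self.timestamp = time.time()        # time of block generation
        self.nonce = nonce                  # "other information"

    def compute_hash(self):
        # Serialize the block contents and hash them with SHA-256;
        # a real chain hashes only a compact header (e.g. a Merkle root).
        payload = json.dumps(self.__dict__, sort_keys=True)
        return hashlib.sha256(payload.encode()).hexdigest()

# Linking two blocks: each block stores the hash of its predecessor,
# so altering an earlier block changes every later hash.
genesis = Block(["genesis record"], previous_hash="0" * 64)
second = Block(["payment A->B"], previous_hash=genesis.compute_hash())
print(second.previous_hash[:16], second.compute_hash()[:16])
```

This tamper-evidence is exactly why the features listed in the next section, such as transparency and security, follow from the structure.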
4 Features of Block-Chain System A block-chain system operates under constraints within which it must work effectively and efficiently, so the overall functionality of the block-chain has a large influence on the system design. The following are the non-functional attributes of a block-chain network: • Openness: because the nodes behave compatibly, the block-chain can use and exchange data at transaction time. • Concurrency: performance improves because the nodes process transactions concurrently. • Scalability: the ability to attach and remove nodes makes the block-chain scalable; as far as scalability is concerned, the focus is mainly on three parameters, namely transaction processing rate, size of the distribution, and latency or manageability. • Fault tolerance: a fault at any node is transparent to all other nodes in the block-chain, so the network keeps working properly in the presence of a fault. • Transparency: block-chain transactions are visible to every node in the network. • Security: cryptographic protocols are used in the block-chain network to keep the information safe. • Failure management: a process exists that makes the block-chain network robust and identifies the cause of a failure.
5 Challenges Block-chain technology is directly related to computer-generated, virtual money and is used by all kinds of users, yet several block-chain security challenges are still reported. They are as follows: • Block-chain agreement: a block-chain is a sequential connection of generated blocks, and it may temporarily split into two if two distinct peers produce a result at the same moment during mining, since the last two generated blocks can then be followed by two different users. A block becomes insignificant if the Bitcoin network does not pick it as the latest block, and continued mining on it becomes pointless; in Bitcoin, the network follows the peers that hold more than fifty per cent of the mining power.
• Transactions security: various contract forms can be created using a flexible programming language together with well-written scripts that handle security matters. Bitcoin contracts are used for verification, validation, and financial services; the most common method builds on a script called multisig, in which a multiple-signature scheme is incorporated. • Wallet security: a Bitcoin address is derived from the hash value of a public key, and the transaction script locked with it can only be unlocked with data derived from the private key, which is kept inside the Bitcoin wallet. If this information is lost, the corresponding bitcoin is effectively lost as well, so keeping the wallet out of the hands of attackers remains the prime concern. • Software security: bugs in the software used by Bitcoin are a major concern, despite the fact that the certified developers of the Bitcoin documentation describe all the related processes as proficient and impactful [21].
6 Secure Solution for Cloud In a cloud system, psychological and economic losses occur when a user's personal information is revealed or leaked, and much research therefore examines whether users' private information is kept secure in the cloud environment while data is moved and stored. Block-chain technology can ensure the safety of the data that remains in the cloud environment. When block-chain technology is used, a secure e-wallet is put in place; if the e-wallet is not removed properly, the residual data it leaves behind can be misused to extract user information. Incidents of double spending and of forging the ledger or bitcoins are major challenges, so a safe and reliable e-wallet is required to handle such security issues. E-wallets are normally installed on PCs, but mobile devices are becoming more popular every day, so the security of e-wallets on mobile devices must be verified all the more stringently. A transaction on a mobile device is complete only when its accuracy and integrity are established, which ensures the safety of the transaction. A secure e-wallet should be designed so that problems are reduced, verified, and validated at every stage: planning, requirements analysis, implementation, testing, and maintenance. A secure and reliable restoration of the e-wallet must be provided in case its security is compromised by attackers, and the wallet must protect both the user's transaction data and the settings required to manage and operate it. Finally, a mechanism is required to remove the remaining user information effectively and safely when the e-wallet is no longer used, discarding the rest of the data afterwards.
7 Requirements of Cloud Table 1 describes the requirements of the cloud that help in changing its multiple design patterns.
Table 1 Necessities of cloud for changing the design pattern
Scalability: millions of nodes that can be scaled in and out
Privacy: the system helps to secure all the personal information of the users
Infinite computing resources: users do not need advance planning or provisioning for the services
Pricing: different applications and services demand different prices depending on their utilization
Utilization: utilization under a variable load should be finely tuned
Performance: depends on the number of applications running in the system
Flexibility: supports sharing of files and provides the best services to the users
Furthermore, only an authentic user should be able to access the data stored in the cloud, and that data should remain unreachable for an attacker. Researchers have previously established policies such as multi-factor authentication and user authentication control authority for complex systems. A newer policy, ledger-based authentication, runs on a peer-to-peer basis; so far, however, only limited work has applied the block-chain mechanism to the insider problem. With the available techniques, an attacker can benefit by manipulating the user and exploiting insider data, which can be countered by keeping the authentication details of the insider unchangeable. Whether the client is a device or a human being, authentication is required to access the cloud, and if a violation is observed in the authentication data supplied by any third party, the motive of that third party should be nullified. Using the block-chain mechanism, a distributed ledger-based authentication policy has been introduced for outsiders, which makes it difficult for an outsider to obtain a ledger entry and complicates matters for an attacker. A proper authentication agreement is therefore needed to prevent unnecessary modification of data by any user, including an authenticated one. As the demand for data security grows, the complexity of the threat structure diminishes and efficiency increases. Sharma et al. explained the strategic properties of insider and outsider threats: most insider threats arise from the transfer of data over the cloud, and insider threats are considered very dangerous because they have harmed several large organizations. This happens because insider threats are caused by attackers who have a privileged link to the accessible data.
These attackers are already acquainted with the system framework, which makes it easy for them to acquire valuable data records and to open a convenient passage through which an outsider can also access the records. As the amount of data transferred over the cloud increases, insider attacks may increase and lead to further threats; roughly one-third of users assume that an insider can cause more damage than an outsider. New security measures have therefore been adopted that are inaccessible to inside attackers, and many applications were adopted earlier to keep data safe from them. Shah et al. demonstrated a multifactor authentication scheme for cloud servers with login issues and described the devices involved, but this scheme remained easy for insiders to attack: passwords could be guessed easily, insider attacks were possible, and no session key was computed by a temporary device. Habib et al. used a method in which several users can authenticate data at the same time through data encryption; the agreement strengthens security with the signature of an authenticated user from any of the groups, a single user of a group can allow any number of people to join with their own access permissions, and the authority then provides them with an access key bound to a name, time, service, and so on [22]. Almost every such agreement is executed to resist inside and outside attackers in the cloud. All user data in the cloud is managed by employees who work for the cloud service providers; these employees act as insiders to the provider and can draw the maximum benefit from data management. The activities of an attacker who remains inside can be monitored, initially through proper authentication, and once these insider attackers are fully authenticated it becomes simple to recognize and monitor their activities. Authentication must therefore be implemented in a way that remains unchangeable for insiders as well; this keeps user data secure over the cloud and allows it to be accessed only by authenticated, genuine users. Attackers sometimes take advantage of managing users' data for certain applications, so if an insider threat occurs, the responsibility of the employee can be fixed, and the employee must not be allowed to change the authentication details in order to escape correct tracking after committing the insider threat. Whether for a normal human or a system, authentication is required for every cloud user. In the same way as for inside attackers, a similar block-chain-based authentication scheme should be applied to attackers coming from outside, which makes it strenuous for them to replace the stored authenticated data. Block-chain helps remove the boundaries of many applications and increases device performance. Zheng et al. found that block-chain is very necessary for user applications: the block-chain needs an identification that is bound to a private key whose ownership is moved to the intended user, and the user's key impressions help substantiate the public key stored in the block-chain identification. Thanks to its decentralized approach, this block-chain mechanism offers various advantages over conventional authentication methods.
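As a rough illustration of the ledger-based authentication idea discussed above, the sketch below appends every login event to a hash-chained log, so that silently altering an earlier authentication record breaks the chain. The record fields and the verification routine are simplified assumptions for illustration only, not the scheme proposed in the cited works.

```python
import hashlib
import json
import time

def _digest(record, previous_digest):
    body = json.dumps(record, sort_keys=True) + previous_digest
    return hashlib.sha256(body.encode()).hexdigest()

class AuthLedger:
    """Append-only, hash-chained log of authentication events."""

    def __init__(self):
        self.entries = []          # list of (record, digest) pairs

    def record_login(self, user_id, role):
        record = {"user": user_id, "role": role, "time": time.time()}
        prev = self.entries[-1][1] if self.entries else "0" * 64
        self.entries.append((record, _digest(record, prev)))

    def verify(self):
        prev = "0" * 64
        for record, digest in self.entries:
            if _digest(record, prev) != digest:
                return False       # an earlier record was tampered with
            prev = digest
        return True

ledger = AuthLedger()
ledger.record_login("insider-42", "cloud-admin")
ledger.record_login("outsider-7", "client")
print(ledger.verify())                       # True
ledger.entries[0][0]["user"] = "someone-else"
print(ledger.verify())                       # False: tampering is detected
```

In a distributed deployment every node would hold a copy of this log, so an insider could not quietly rewrite their own authentication history.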
Block-chain plays a vital role in financial technology, which has recently been growing rapidly. For digital dealings, block-chain is used as a public ledger that helps block hacking while transactions are being exchanged; by rule, transactions can be encrypted and then used for the smooth running of the block-chain software. Because block-chain supplies a high level of security, all user data kept in a central database remains protected, and damage to the database can be prevented from the data storage and management side alone. Owing to these features, block-chain provides lucidity in data whenever disclosure of the data is required. Block-chain technology is a mechanism in which data blocks are combined in a sequential list; it is essentially a decentralized ledger that can store data safely and encloses numerous applications, from finance to the programmable society. Block-chains are classified into three types: private, public, and consortium. In a private block-chain the nodes belong to a single organization while still following the block-chain technology; in a public block-chain any type of node can connect to the network and review the information related to the other nodes; and in a consortium block-chain a node must be authorized to connect to the network and is then allowed to share node information. The main benefit of using block-chain technology is that none of the data already on the chain can be shuffled or altered. A block-chain architecture consists of several layers: the application, network, contract, data, consensus, and incentive layers. The application layer mostly defines the different application scenarios; the network layer contains the primary mechanisms of block-chain technology, such as transmission and data verification; the contract layer holds the script code and the smart contracts that form the programmable base of the upper application layer; the data layer confines the underlying data blocks and data encryption; the consensus layer includes the algorithms used to reach agreement across the network; and the incentive layer governs block issuance, reward mechanisms, and their distribution. Block-chain technology is used to obtain secure protection for a cloud data federation: it permits the cloud service providers to manage the obscuring operations while letting users focus on their data requirements. A structure has been suggested that combines attribute-based encryption with block-chain to achieve better access control over data stored in the cloud. In that structure only one particular user can access the data at a time and the confidential key is handed over to that user, which resolves the key-management issues of customised attribute-based encryption and makes it more adaptable.
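The access pattern just described, in which data is encrypted once and the decryption key is released only to a user whose attributes satisfy the owner's policy, can be illustrated with the toy sketch below. It uses symmetric Fernet encryption from the third-party cryptography package as a stand-in; real attribute-based encryption binds the policy into the ciphertext itself, which this simplified example does not do, and the policy format is an assumption.

```python
from cryptography.fernet import Fernet

def satisfies(policy, attributes):
    # Toy policy check: every required attribute must be present.
    return policy.issubset(attributes)

# The data owner encrypts the record once and keeps the key.
key = Fernet.generate_key()
ciphertext = Fernet(key).encrypt(b"patient record #1001")
policy = {"department:cardiology", "role:doctor"}

def request_access(user_attributes):
    # The authority hands the key over only if the policy is met.
    if satisfies(policy, user_attributes):
        return Fernet(key).decrypt(ciphertext)
    raise PermissionError("attributes do not satisfy the access policy")

print(request_access({"role:doctor", "department:cardiology", "shift:night"}))
```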
8 Solutions for Cloud Data Security Based on Block-Chain Technology Cloud data privacy usually means that only the data owner should have the authority to access the information kept secured in the cloud; once that secured data is breached, it becomes exposed and generates a huge loss. Multiple technologies are already used to keep users' confidential information secure in the cloud, but these techniques remain inefficient and insecure. Block-chain has been introduced as a promising method of certification, and when this new technology is amalgamated with the cloud environment it starts reforming it into an appropriate service that can produce well-built safety. To secure the user's data, the electronic wallet that is used must be removed properly when block-chain technology is employed, or else residual user data is left behind and can be used to extract user information. Block-chain activities such as the ledger, bitcoin, or both have created substantial security problems; to avoid them, an electronic wallet was introduced, and it has proved most useful in mobile systems. In mobile systems a safe transaction is only possible once individual unification and the validity of the time sequence are established. Authentication alone does not guarantee confidentiality here, because the private key can be disclosed by hacking and then used to attack the block-chain. Moreover, if the complete elimination of the electronic wallet is not certified, data protection cannot continue, and because the removal of the electronic wallet is not verified it is impossible to guarantee complete information security. The lack of dual verification neither supplies availability nor ensures integrity for the enhanced block-chain process, and in general leftover data is not protected unless the removal of the electronic wallet is substantiated. Complete data protection becomes possible when the data is encrypted with a public key that also substantiates the uprooting of the electronic wallet.
9 Examples of Cloud Data Security Based on Block-Chain Technology 9.1 Secure Cloud Data Privacy Public key cryptography and an access control list are the two methods used to secure private data in the cloud. Public key cryptography, also called asymmetric cryptography, uses a public and a private key: the public key is accessible to everyone, whereas the private key is kept strictly secret. The main intention of using asymmetric cryptography is that encryption is completed with one key while decryption is carried out with the other. An access control list is specified by providing a set of rules that settle which users may access which databanks. For any organization to access data in the cloud, proper permission from the access control list is needed, and the service providers can then encode the data with public key cryptography. To control risk and avoid losses, the two methods discussed above are combined into a single entity.
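A minimal sketch of the two mechanisms just described follows, combining a toy access control list with RSA public-key encryption from the third-party cryptography package; the ACL structure, principals, and rule format are illustrative assumptions rather than any particular provider's interface.

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

# Toy access control list: which operations each principal may perform.
acl = {"alice": {"read", "write"}, "audit-service": {"read"}}

def check_acl(principal, operation):
    return operation in acl.get(principal, set())

# The provider encrypts with the data owner's public key;
# only the holder of the private key can decrypt.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()
oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

if check_acl("alice", "write"):
    ciphertext = public_key.encrypt(b"quarterly report", oaep)
    print(private_key.decrypt(ciphertext, oaep))   # b'quarterly report'
```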
9.2 Cloud Data Unity The unity, or integrity, of data in the cloud is certified by the accuracy and sustainability of the information stored in the cloud environment, and it is guaranteed on the basis of public key cryptography and data versions. In this asymmetric cryptography process, users can sign the data with a keyless signature infrastructure tool, so that any change to the information is detected by the signature verification technique. Copies of the data are stored periodically across the cloud, and through a polling process all the copies are collated without any interchange. The block-chain provides node-to-node integrity confirmation that maintains stability, reliability, and dependability throughout the circulation of the data by means of smart contracts. Decentralization removes the need for a trusted centre, and the consensus structure assures trust among the nodes, which is helpful for demonstrating the unity of the information.
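The polling-style integrity check mentioned above can be illustrated with a short sketch that compares SHA-256 digests of the replicas held by different storage nodes; the replica layout and the majority rule are simplifying assumptions, not the exact protocol of any cited scheme.

```python
import hashlib
from collections import Counter

def digest(blob: bytes) -> str:
    return hashlib.sha256(blob).hexdigest()

# Copies of the same object as reported by several storage nodes.
replicas = {
    "node-1": b"invoice 2023/118, total 4,250",
    "node-2": b"invoice 2023/118, total 4,250",
    "node-3": b"invoice 2023/118, total 9,999",   # corrupted copy
}

digests = {node: digest(blob) for node, blob in replicas.items()}
majority, _ = Counter(digests.values()).most_common(1)[0]

for node, d in digests.items():
    status = "consistent" if d == majority else "integrity violation"
    print(f"{node}: {status}")
```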
9.3 Cloud Data Detectability Cloud data detectability tracks the origin of the data stored in the cloud so that private data and unusual operations can be identified. Information produced by operations such as input to and output from the cloud environment is also recorded in the cloud itself. The conventional detectability of information relies on log and audit techniques which, when combined with zero-knowledge proofs, help to inspect modifications in the flow of data over the cloud and secure the solidity of the informational sources.
10 Review Outcomes and Research Gap The applications used in the cloud computing environment together with block-chain technology have a major effect on users and cloud service providers, and they also affect data owners who want their data to be protected. From the discussion above it is clear that security issues and challenges are consequential drawbacks of the cloud, which still faces obstacles in data security, data and information management, and the stability and reliability of data in the cloud computing environment. • Quality of service is very critical to measure in cloud environments, since reliability and scalability depend on the performance of the devices; the main cloud problems are the security and privacy of the stored data. • Block-chain is an essential and secure technology for analysing data detectability over the cloud, but the confirmation of data still needs to become more decentralized, and providing more guidance during the collaboration of service providers may enhance the future growth of cloud computing. This poses a major challenge for data allocation and requires further improvement. • Some of the findings and limitations of block-chain technology discussed by many researchers are summarised below in Table 2.
11 Conclusions The following can be concluded from the above discussions: • This paper has mainly focused on applying block-chain technology to cloud data security in the wake of the threats present in the cloud computing environment. • Many tools currently exist to fortify the seclusion, detectability, and reliability of cloud data. • Block-chain still presents open issues, such as transaction security and software management, and multiple applications have been developed to rectify them. • The obscurity of user data must be ensured while utilizing block-chain technology; to this end, the user's data should be removed when the service is withdrawn.
Table 2 Findings and limitations of block-chain technology
1. Yadav et al. Findings: block-chain techniques for securing data, cloud computing, healthcare, security, AES. Limitations: data security and protection assurance issues apply to both the equipment and the programming in the cloud design; healthcare sectors rarely focus on storing data in the cloud and maintaining its security, and mostly prefer to keep copies of the report.
2. Darwish et al. Findings: block-chain, cloud storage, hybrid encryption, data privacy. Limitations: encrypted data is kept away from the cloud provider or attackers, and data integrity is preserved because authentication is carried out on a decentralized platform by the cloud auditors; quality of service also remains very critical to measure in cloud environments, as scalability depends on the performance of the system.
3. Mubarakali et al. Findings: healthcare services, block-chain, insurance agent, wearable devices. Limitations: implements a framework model based on the block-chain; the deployment model is highly sensitive for securing private data for the determination and transformation of information in the healthcare domain.
4. Uddin et al. Findings: block-chain, cloud data centre, cloud vulnerabilities, virtualization vulnerabilities. Limitations: even though cloud storage has many benefits, the cloud still has vulnerabilities and many other security issues; when it comes to the adoption of cloud storage, security is seen as the first barrier for online shopping and other uses, so both customers and vendors face security threats.
5. Sharma et al. Findings: medical big data, cloud computing, data security, data classification, cryptography. Limitations: block-chain is a distributed ledger of which all parties hold a copy; the framework can be made increasingly centralized or decentralized.
6. Shah et al. Findings: block-chain, data security, encryption, smart contract, cloud storage. Limitations: the system enhances data security by encrypting the data and distributing it across multiple peers; by encrypting and scattering the data across multiple nodes, a high level of data security can be achieved.
7. Peng Cheng Wei. Findings: block-chain cloud data storage, cipher-text control technology, integrity verification technology, cloud data integrity. Limitations: cloud data integrity based on network coding technology for storage and integrity protection has achieved certain results, but its computational overhead in the integrity protection process is too large, seriously affecting the operational efficiency of the storage node and reducing the availability of the cloud storage service.
8. Alkadi et al. Findings: intrusion detection system, collaborative anomaly detection, cloud systems, block-chain applications. Limitations: the challenges of using block-chain together with an intrusion detection system should be examined and analysed; significant work is still needed, as the processes of cloud systems represent a potential threat that exposes the organizations making use of cloud features.
9. Sohrabi et al. Findings: cloud services, smart contract, access control, Shamir secret sharing scheme. Limitations: smart contracts provide decentralized access control for cloud data storage; the data owner uploads the data only after it is encrypted, i.e. the cloud servers store only the encrypted data and do not have access to the decryption key.
10. Murthy et al. Findings: cloud computing, data management, data security, decentralization, block-chain technology. Limitations: security issues and cloud challenges are significant drawbacks hampering the cloud, which still faces many challenges such as data security, data management, compliance, and reliability.
References 1. Sarmah SS (2019) Application of blockchain in cloud computing. Int J Innov Technol Explor Eng 8(12):4698–4704 2. Yadav D, Shinde A, Nair A, Patil Y, Kanchan S (2020) Enhancing data security in cloud using blockchain. In: 4th international conference on intelligent computing and control systems, pp 753–757 3. Darwish MA, Yafi E, Al Ghamdi MA, Almasri A (2020) Decentralizing privacy implementation at cloud storage using blockchain-based hybrid algorithm. Arab J Sci Eng 45(4):3369–3378 4. Mubarakali A (2020) Healthcare services monitoring in cloud using secure and robust healthcare-based BLOCKCHAIN (SRHB) approach. Mobile Netw Appl 25(4):1330–1337 5. Uddin M, Khalique A, Jumani AK, Ullah SS, Hussain S (2021) Next-generation blockchainenabled virtualized cloud security solutions: review and open challenges. Electronics 10(20):2493 6. Sharma S, Mishra A, Singhai D (2020) Secure cloud storage architecture for digital medical record in cloud environment using blockchain. In: Proceedings of the international conference on innovative computing & communications 7. Shah M, Shaikh M, Mishra V, Tuscano G (2020) Decentralized cloud storage using blockchain. In: 4th International conference on trends in electronics and informatics, pp 384–389 8. Wei P, Wang D, Zhao Y, Tyagi SKS, Kumar N (2020) Blockchain data-based cloud data integrity protection mechanism. Futur Gener Comput Syst 102:902–911 9. Alkadi O, Moustafa N, Turnbull B (2020) A review of intrusion detection and blockchain applications in the cloud: approaches, challenges and solutions. IEEE Access 8:104893–104917 10. Gai K, Guo J, Zhu L, Yu S (2020) Blockchain meets cloud computing: a survey. IEEE Commun Surv Tutorials 22(3):2009–2030 11. Sohrabi N, Yi X, Tari Z, Khalil I (2020) Blockchain-based access control for cloud data. In: Proceedings of the Australasian computer science week multiconference, pp 1–10 12. Murthy CVB, Shri ML, Kadry S, Lim S (2020) Blockchain based cloud computing: architecture and research challenges. IEEE Access 8:205190–205205 13. Sharma P, Jindal R, Borah MD (2020) Blockchain technology for cloud storage: a systematic literature review. ACM Comput Surv 53(4):1–32
14. Vivekanadam B (2020) Analysis of recent trend and applications in block chain technology. J ISMAC 2(04):200–206 15. Gai R, Du X, Ma S, Chen N, Gao S (2020) A summary of the research on the foundation and application of blockchain technology. J Phys Conf Ser 1693(1):012025 16. Patil A, Patil S, Sharma V, Rokade S, Sambare GB (2021) Securing Cloud Based Data Storage using Blockchain. Int J Eng Res Technol 10(06):517–521 17. Liang J, Li T, Song C (2021) Application of cloud computing and blockchain technology in intelligent information security. In: IEEE international conference on emergency science and information technology, pp 819–823 18. Kollu PK (2021) Blockchain techniques for secure storage of data in cloud environment. Turk J Comput Math Educ 12(11):1515–1522 19. Kaushal RK, Kumar N, Panda SN (2021) Blockchain technology, its applications and open research challenges. J Phys Conf Ser 1950(1):012030 20. Kumar R, Tripathi R (2021) Scalable and secure access control policy for healthcare system using blockchain and enhanced Bell–LaPadula model. J Ambient Intell Humaniz Comput 12(2):2321–2338 21. Li W, Wu J, Cao J, Chen N, Zhang Q, Buyya R (2021) Blockchain-based trust management in cloud computing systems: a taxonomy, review and future directions. J Cloud Comput 10(1):1– 34 22. Habib G, Sharma S, Ibrahim S, Ahmad I, Qureshi S, Ishfaq M (2022) Blockchain technology: benefits, challenges, applications, and integration of blockchain technology with cloud computing. Future Internet 14(11):341 23. Khanna A, Sah A, Bolshev V, Burgio A, Panchenko V, Jasi´nski M (2022) Blockchain-cloud integration: a survey. Sensors 22(14):5238 24. Guo H, Yu X (2022) A survey on blockchain technology and its security. Blockchain Res Appl 3(2):100067 25. Ke Z, Badarch T (2022) Research on information security and privacy technology based on blockchain. Am J Comput Sci Technol 5(2):49–55
Intelligence Monitoring of Home Security Hampika Gorla, Mary Swarna Latha Gade, Babitha Lokula, V. Bindusree, and Pranay Kumar Padegela
Abstract Major cities are under grave danger from crimes conducted in a variety of ways, including chain robberies, robberies carried out by people within establishments, and home invasions. Residential crimes are committed by forcing open closed doors. State police claim that, by having officers patrol the area at night and implementing other security measures, it is occasionally feasible to stop the killing of an owner during a late-night heist. This technique, however, cannot prevent home invasion robberies because officials are unable to search every single dwelling. The suggested generic framework includes several modules, such as auto-configuration and management, communication protocol, auto-monitoring and control, and object security systems, and it is designed to work on all vendor boards and on variants of the Linux and Windows operating systems. Keywords Raspberry Pi · Spy cam · Door sensor · LCD · GSM · GPS · Buzzer 1 Introduction Nowadays many robberies take place. They are mainly of four types, the most common being home robberies, also called HB (house breaking). HB itself has two types: HB(D), a robbery committed by breaking into the house during the day, and HB(N), a house-breaking robbery that takes place at night. During these robberies many precious things such as gold, money, and other valuables are stolen, and in some areas the intruders hurt or even kill people
1 Introduction Now a days many robberies are taking place mainly the robberies are of four types in that the main robberies are gome robberies they are also call as HB which also called as house breaking. In that HB they are mainly two types one is HB(D) it means the robbery which took place by breaking the house in day time and the second one is HB(N) which means the house braking robberies which are taking place during. During the robberies many of the precious thing (important) like Gold, money and many other are getting stolen. In some areas they are hurting and killing the people H. Gorla (B) · M. S. L. Gade · B. Lokula · V. Bindusree · P. K. Padegela Department of Electronics and Communication Engineering, Institute of Aeronautical Engineering, Hyderabad, Telangana, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_7
during the robbery. Our proposed device connects directly to the police (dial 100) and also alerts the user. A smart home security system was designed by Aman Sharma and Anjana Goen in 2018; the project was good, but its limitations are that it is too costly, very complex to design, and, most importantly, not everyone can understand it. An IoT-based technology for smart home security, presented as a generic framework in 2017, is also good but has limitations: although the device can be operated over long ranges, many robberies still take place. Research on an intelligent home security surveillance system based on ZigBee remains research only, with no product built on it. Many similar projects are very complex to design and very costly, whereas our main motto is to build something that can be used by everyone; even so, many systems cannot be used by illiterate users.
2 Proposed Device This system advances the existing smart door lock technology and enhances its security features, a significant step forward for this market. Our suggested work includes models of the various components of the process: the power supply unit, the door sensor detection unit, the camera, the system locking component, the start-up strategy, monitoring, the alarm component, and the indication. The block diagram of the method is shown in Fig. 1 [7]. The hardware components of this system include a power supply, an Arduino UNO, key sensors, an LCD, a buzzer, and LEDs that communicate with the program; embedded Linux is the software platform in use [7]. The system quickly locks the door after detecting an intrusion attempt in a closed home, so the strategy reduces theft, the number of street robberies, and the fatalities that could otherwise follow. An effective association is provided to create the intelligent procedure for the house switch, which exposes irregular door-opening events within a regular time period and delivers this information to the base unit. In the future, we would like to offer a design that fully integrates the base station and its communication capabilities by using GSM and by changing each of the style parameters [2]. This approach is also crucial for considering security indications when unknown persons touch nearby doors [8]. A liquid crystal display may be installed inside the entrance to serve as an indication of the triggering cause for everyone inside the property. The microcontroller workflow is shown in Fig. 2. The Arduino organizational structure can be used to keep acquiring output from the vibration sensor and door sensor modules. The design works as intended, and a sensor can immediately report a suitable voltage to the processor [9]; this event causes the procedure to halt and a notice to be sent to the users. The code is written under the Linux operating system, the ecosystem used for creating the programs; it is then compiled into a hex file and loaded onto the controller. The simulation setup of the proposed procedure has been validated (see Figs. 2 and 3).
Fig. 1 Block diagram
Fig. 2 Flow chart
Fig. 3 Arduino uno
3 Components Specifications In this section we explain the entire hardware of the securing home robbery detection device (see Fig. 2). A. Arduino It is a microcontroller board based on the ATmega328P. It has 14 digital input/output pins, of which 6 can be used as PWM (pulse width modulation) outputs. The major parts of the Arduino are [7]:
• USB connector
• Power port
• Microcontroller
• Analog input pins
• Digital pins
• Reset switch
• Crystal oscillator
• USB interface chip
• TX/RX LEDs
B. Monitoring Unit A sensor is a device that transforms physical quantities into electrical signals. The types of sensor used by this system are listed below [7]. I. Obstacle sensors are used to detect objects outside the premises. II. Door sensors are used to determine the state of the doors. III. Vibration sensors are used to identify unidentified human touches. C. Buzzer A buzzer is a device that makes beeping sounds. It needs 5 V DC and is connected to a BC547 transistor to amplify the low drive current to a sufficient level. In this project it informs the users when an intruder enters the room. D. LCD A liquid crystal display (LCD) is used to show alphabets, Greek letters, special characters, and mathematical symbols; Hitachi's HD44780 controller is employed in this design. E. GSM The Global System for Mobile Communication (GSM) is one of the most popular and pervasive low-cost wireless communication technologies available today. It can communicate in full duplex with an Arduino
at a 9600 baud rate. AT instructions are the foundation of this technology: when an intruder enters the room, the module initiates a call and alerts the user. F. GPS The Global Positioning System (GPS) is used to determine the location; no phone or internet connection is needed. The module is compatible with TTL logic and can easily interface with an Arduino microcontroller. When a sensor is activated, the GPS system detects the location parameters and sends them to the Arduino microcontroller. G. Spy Cam A webcam is a digital camera that streams or transmits real-time video to or through a computer to a network. The term "webcam" (a clipped compound) can also refer to a video camera that is linked to the internet for an extended period rather than for a single session, typically providing a view for anyone who visits its website. Webcams typically contain a lens, a photo sensor, support electronics, and one or sometimes two microphones. A minimal control-loop sketch tying these components together is given below.
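The sketch is illustrative only: the reported prototype is Arduino-based, whereas this Python version assumes the Raspberry Pi mentioned in the keywords, a GPIO-wired door sensor and buzzer, and a serial-connected GSM modem driven by AT commands. The pin numbers, serial port, and phone number are placeholder assumptions.

```python
import time
import RPi.GPIO as GPIO   # GPIO access on a Raspberry Pi
import serial             # pyserial, for the GSM modem

DOOR_PIN, BUZZER_PIN = 17, 27               # assumed wiring
GSM_PORT, POLICE_NUMBER = "/dev/ttyS0", "100"

GPIO.setmode(GPIO.BCM)
GPIO.setup(DOOR_PIN, GPIO.IN, pull_up_down=GPIO.PUD_UP)
GPIO.setup(BUZZER_PIN, GPIO.OUT)
gsm = serial.Serial(GSM_PORT, 9600, timeout=1)

def dial(number):
    # ATD<number>; places a voice call on most GSM modems.
    gsm.write(f"ATD{number};\r".encode())

try:
    while True:
        if GPIO.input(DOOR_PIN) == GPIO.LOW:      # door forced open
            GPIO.output(BUZZER_PIN, GPIO.HIGH)    # sound the alarm
            dial(POLICE_NUMBER)                   # alert the police
            time.sleep(30)                        # avoid repeated dialling
            GPIO.output(BUZZER_PIN, GPIO.LOW)
        time.sleep(0.1)
finally:
    GPIO.cleanup()
    gsm.close()
```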
4 Future Scope Smart locks, which can be operated from cellphones, also use AI. AI-enabled smart locks have many security advantages, including a reduced reliance on physical keys for access, temporary access for visitors, and regular video streams of people ringing the doorbell. The smart home ecosystems of Google, Samsung, and Amazon can accommodate biometric door locks from Kwikset, August, and Samsung. Residential households are becoming more and more "smart", yet analysts predicting residential robberies estimate that the count will top 300 million in 2023. A rise in new security concerns is anticipated as the market for smart homes expands, because wirelessly connected gadgets are more susceptible to cyber-attacks. Protecting connected devices from security risks and weaknesses is therefore crucial to gaining homeowners' trust and boosting the sale of smart home technology. For instance, the Mirai IoT botnet took over more than 600,000 smart home devices globally in 2016, including security cameras, routers, and air quality monitors; this led to a significant rerouting of web traffic and the suspension of services for media platforms such as Twitter and Netflix (Figs. 4 and 5).
Fig. 4 Implementation of project
Fig. 5 Call to a contact upon any suspect detection
References 1. Narsimha Reddy K, Bojja P (2022) A novel method to solve visual tracking problem: hybrid algorithm of grasshopper optimization algorithm and differential evolution. Evol Intell 15(1):785–822 2. Bojja P et al Experimental analysis and improvements of a visible spectro photometer fordetection of nano materials. Int J Chem Eng, April, 22 3. Bojja P et al (2021) A novel method to solve visual tracking problem: hybrid algorithm of grass hopper optimization algorithm and differential evolution. Evol Intell 4. Narsimha Reddy K, Bojja P (2021) Algorithm and differential evolution [Emid: 6e7b1b6195f4b6c0], Springer. J Evol Intell. https://doi.org/10.1007/S12065-021-00567-0 5. https://www.researchgate.net/publication/332835074_Research_Paper_On_Home_Automat ion_Using_Arduino 6. https://www.researchgate.net/publication/332578743_Web_Based_Home_Security_and_Aut omation_System 7. Hampika G, Rohit V, Abhishek P, Manisha G (2020) Design of drunk and drive detection device, vol 664. CCSN2019, Springer, pp 315–323 8. https://ieeexplore.ieee.org/document/8821885 9. https://www.ripublication.com/ijaer18/ijaerv13n2_57.pdf 10. https://acadpubl.eu/hub/2018-119-15/5/801.pdf 11. https://www.academia.edu/25030686/Design_and_Implementation_of_Security_Systems_ for_Smart_Home_based_on_GSM_technology
12. https://www.academia.edu/36782305/Review_On_Home_Security_System 13. https://doi.org/10.1088/1742-6596/953/1/012128/pdf 14. https://www.technicaljournalsonline.com/ijeat/VOL%20II/IJAET%20VOL%20II%20I SSUE%20IV%20%20OCTBER%20DECEMBER%202011/ARTICLE%2066%20IJAET% 20VOLII%20ISSUE%20IV%20OCT%20DEC%202011.pdf
TB Bacteria and WBC Detection from ZN-Stained Sputum Smear Images Using Object Detection Model V. Shwetha
Abstract Manual evaluation of TB bacteria in Ziehl-Neelsen images is a tedious process, and computer-aided diagnosis for image analysis helps increase the throughput. In Ziehl-Neelsen sputum smear images the bacilli appear very small and often overlap with WBC, which sets back the automation process. Recent advances in CNNs for medical image diagnosis enable faster diagnosis. The present study focuses on the application of an object detection model to the TB bacteria detection process. The study shows that a RetinaNet-based object detection model is suitable for microscopic ZN images, resulting in an Average Precision of 0.91 for WBC and 0.94 for bacilli. Keywords Tuberculosis · Mycobacterium tuberculosis · Ziehl-Neelsen images · WBC · Bacilli · Object detection model · CNN · YOLO · RetinaNet 1 Introduction Tuberculosis (TB) is one of the leading causes of death in the world [1]. Early treatment of TB is important to avoid clinical risk to the patient's health [2]. TB is caused by Mycobacterium tuberculosis [3], rod-shaped bacteria known as bacilli. For low-income countries, the Ziehl-Neelsen (ZN) staining procedure is considered a feasible method to identify the bacteria [4] because of its speed, high true-positive detection rate, and cost; it is a morphological identification procedure. Bacteria detection using computer-aided diagnosis procedures has been reported in earlier work [5]. Machine learning based [6–8] and CNN-based methods [9, 10] are used to detect the bacteria. Bacteria colony counting using a CNN [11] was reported by Ferrari et al., and UNet-based segmentation of Raman spectroscopy images for bacteria detection was reported by Al-Shaebi et al. [12]. Object detection models such as YOLO and RetinaNet have been used for population, vehicle, and aerial object
1 Introduction Tuberculosis (TB) is one of the leading causes of death in the world [1]. Early treatment caused by TB is important to avoid the clinical risk to the patient’s health [2]. Mycobacterium tuberculosis bacteria cause TB [3]. They are rod-shaped bacteria known as Bacilli bacteria. For low-income countries, Ziehl-Neelsen (ZN) staining procedure is considered a feasible method to identify the bacteria [4] because of its speed, high true positive detection rate and cost. It is a morphological identification procedure. Bacteria detection using Computer Aided diagnosis procedure in earlier work Bacteria detection [5]. Machine Learning based [6–8] and CNN-based methods [9, 10] are used to detect the bacteria. Bacteria colony counting using CNN [11] was reported by Ferrari et al.. UNet-based segmentation for Raman Spectroscopy images for bacteria detection reported by Shaebi et al. [12]. Object detection models such as YOLO and RetinaNets were used in the population, vehicle and aerial object V. Shwetha (B) Electrical and Electronics Department, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal 576104, Karnataka, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_8
detection [13–15]. Crop disease detection using object detection models, namely YOLO and RetinaNet, has also been reported [16, 17]. Object detection models have further been employed in several medical image segmentation tasks [18]: lesion detection in brain MRI images [19, 20], mammogram segmentation using YOLO [21] and RetinaNet [22, 23], and breast cancer detection in histopathology images [24]. Detection of microscopic pollen grain images [25], malaria parasite detection from thick blood smears [26], and cell detection [27] using the RetinaNet model have been reported as well. In ZN images, bacilli and WBC appear very small and overlap. The novelty of this work is to detect the bacilli and WBC using an object detection model. The contributions of the proposed work are as follows: 1. This paper presents a novel RetinaNet-based TB bacteria detection model. 2. The proposed methodology detects the bacilli bacteria and WBC from Ziehl-Neelsen stained images. 3. This study provides a comparison with other object detection models and their variants, in which RetinaNet with a ResNet-101 backbone gives an mAP value of 0.91.
2 Methodology The input images used in this work are Ziehl-Neelsen (ZN) stained images. These images consist of bacilli bacteria and WBC. These are further resized before sending them into the object detection model. In the current study, RetinaNet and its variants are used as object detection models. The proposed method is also compared with various other object detection models. The overall methodology is shown in Fig. 1.
2.1 Dataset ZN images are obtained from a publicly available dataset, the ZNSM-iDB database [28]. It contains 2000 images obtained with different staining methods, out of which 800 full images are ZN images, captured at 25× magnification. Figure 2 shows the appearance of bacteria and WBC in the ZN images.
Fig. 1 Overall proposed methodology
Fig. 2 a and b are the sample images used in the study
2.2 TB Bacteria Detection Using RetinaNet Object Detection Model The RetinaNet model is employed as the object detection model in the current study. RetinaNet primarily consists of three subnetworks: two fully convolutional networks (FCNs), a feature pyramid network (FPN), and a residual network (ResNet) [29]. Figure 3 summarises the RetinaNet network design. The fundamental contribution of ResNet is the concept of residual learning, which enables the original input information to be passed directly to the succeeding layers [29]. ResNet comes in several depths; the three most popular architectures have 50, 101, and 152 layers. ResNet is used to extract the characteristics of the bacteria, which are then forwarded to the subsequent FPN subnetwork. A single input image is first fed into ResNet; starting from the second layer of the convolutional network, the FPN then selects features from all the layers and concatenates them to create the final combined feature output.
2.3 Implementation Details The detection model was implemented on the PyTorch platform. A total of 1200 sub-images are obtained from the 800 full images; 850 images are used for the training set and 350 images for the test set. The images are pre-processed by selecting and resizing them to 512 × 512, and the grey values are normalized to [0, 1] by dividing by 256. ResNet extracts the features of the ZN image and the FPN creates the final features. The IoU threshold is 0.5, the batch size is 16, and the kernel size is 3. All convolutional layers of the deep networks use the ReLU activation function, while the prediction layers use the softmax function. With an SGD optimizer and a learning rate of 1e-3, the network is trained for 50 epochs.
Fig. 3 RetinaNet architecture contains feed-forward ResNet and FPN as the backbone for bacteria detection [29]
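A condensed PyTorch sketch of the training configuration just described (512 × 512 inputs, SGD with learning rate 1e-3) is given below. Torchvision's built-in RetinaNet with a ResNet-50-FPN backbone is used here as a stand-in, since the ResNet-101 variant requires a custom backbone, and the data loader is assumed to yield images with targets in torchvision's detection format; exact argument names may differ slightly between torchvision versions.

```python
import torch
import torchvision

# Stand-in model: torchvision ships a ResNet-50-FPN RetinaNet out of the box.
# num_classes = 3 -> background, bacilli, WBC.
model = torchvision.models.detection.retinanet_resnet50_fpn(
    weights=None, num_classes=3)

optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

def train_one_epoch(model, loader, optimizer, device="cuda"):
    model.train().to(device)
    for images, targets in loader:   # targets: dicts with "boxes" and "labels"
        images = [img.to(device) for img in images]
        targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
        losses = model(images, targets)   # dict of classification/box losses
        loss = sum(losses.values())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```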
2.4 Comparison with Other Object Detection Models In this study, the You Only Look Once (YOLO) model is used for the object detection comparison. YOLO is a real-time object detection model originally used to detect vehicles, parking signals, and people in groups and individually; here an attempt is made to apply it to microscopic images. The pre-processing steps include dividing each image into four sub-images, after which each sub-image is around 512 × 384 pixels. The images were annotated using the LabelImg tool; overall, 600 images were annotated with LabelImg to train the object detection model. YOLO is a convolutional neural network consisting of 24 convolutional layers followed by two fully connected layers; each layer has its own importance, and the layers are distinguished by their functionality. The YOLOV3, YOLOV4, and YOLOV5 Darknet models are used for the comparison.
2.5 Performance Metrics The suitability of the object detection model for TB bacteria detection is evaluated using the performance metrics Average Precision (AP) and mean Average Precision (mAP). AP is the area under the precision-recall curve and is specific to a particular class, while mAP is the mean of the per-class AP values.
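As an illustration of how these metrics relate, the snippet below computes AP as the area under a precision-recall curve by trapezoidal integration and mAP as the mean over the two classes; the precision-recall points are made-up numbers, and practical evaluations normally use an interpolated AP at a fixed IoU threshold.

```python
import numpy as np

def average_precision(recall, precision):
    # Area under the precision-recall curve (trapezoidal rule).
    order = np.argsort(recall)
    return float(np.trapz(np.asarray(precision)[order],
                          np.asarray(recall)[order]))

# Hypothetical per-class precision-recall points.
ap_bacilli = average_precision([0.0, 0.5, 0.9, 1.0], [1.0, 0.95, 0.90, 0.70])
ap_wbc     = average_precision([0.0, 0.5, 0.9, 1.0], [1.0, 0.92, 0.85, 0.60])

mean_ap = np.mean([ap_bacilli, ap_wbc])
print(round(ap_bacilli, 3), round(ap_wbc, 3), round(mean_ap, 3))
```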
3 Result and Discussion The training and validation accuracy curves and the training and validation loss curves for 25 epochs are shown in Fig. 4. RetinaNet with various backbone networks was used for TB bacteria detection, and RetinaNet with ResNet-50, ResNet-101, and ResNet-152 showed the better performance; these three models are used further in the study to measure performance metrics such as the class-wise AP and mAP. The proposed method is compared with other object detection models: Table 1 shows that YOLOV5 performed better than the YOLOV3 and YOLOV4 Darknet models, while the RetinaNet model showed better results than the other object detection models used in the literature. Table 2 shows that RetinaNet with ResNet-101 gives good AP and mAP compared with its variants, and this model is further deployed to detect TB bacteria and WBC. The RetinaNet performance mAPs are listed in Table 3, where IoU thresholds of 0.3 and 0.7 are tabulated. We used ResNet-101 as the backbone network because
Fig. 4 a Training and validation accuracy curve and b training and validation loss curve of RetinaNet with ResNet-101 Table 1 Result obtained for various object detection model Object detection Class AP model YOLOV3 YOLOV4 YOLOV5 Proposed RetinaNet
Bacilli WBC Bacilli WBC Bacilli WBC Bacilli WBC
0.74 0.76 0.78 0.76 0.77 0.85 0.84 0.93
mAP 0.76 0.77 0.82 0.87
82
V. Shwetha
Table 2 Result obtained for RetinaNet with various backbone Backbone model Dataset AP-WBC AP-Bacilli (train/test) VGG-16 VGG-19 ResNet-50 ResNet-101 ResNet-152
Training set Test set Training set Test set Training set Test set Training set Test set Training set Test set
0.71 0.70 0.81 0.80 0.93 0.84 0.92 0.81 0.91 0.82
0.73 0.77 0.83 0.84 0.91 0.87 0.95 0.86 0.90 0.83
Table 3 Result obtained for RetinaNet with various backbone Backbone model [email protected] [email protected] VGG-16 VGG-19 ResNet-50 ResNet-101 ResNet-152
0.69 0.71 0.82 0.89 0.82
0.71 0.81 0.93 0.92 0.90
mAP 0.78 0.81 0.84 0.85 0.89 0.82 0.92 0.89 0.87 0.80
Parameters 12,333,523 22,423,624 33,414,221 77,812,333 57,393,213
Table 4 Performance comparison of RetinaNet with ResNet101 backbone using various filter size Filter size
mAP
3. × 3 4. × 4 5. × 5 6. × 6 7. × 7
0.82 0.88 0.92 0.87 0.88
it produced better mAP than other backbone networks. The RetinaNet with ResNet 101, was also able to obtain the more significant parameter compared with other backbone networks. It is listed in Table 3. Performance of RetinaNet model with ResNet 101 as backbone for various Kernel sizes is shown in Table 4. Model with the 5. × 5 kernel size shows better performance than the larger kernel size 6. × 6 and 7. × 7. The filter size of 5. × 5 is suitable for bacteria detection in ZN images.
Fig. 5 a Input image and b result obtained using RetinaNet with ResNet-101 backbone
Figure 5 shows the results obtained: WBC are detected with bounding-box confidence values ranging from 92 to 100%, and bacilli are detected with confidence values ranging from 90 to 100%.
4 Conclusion TB bacteria detection is a crucial step in the clinical diagnosis process. Bacilli often appear small and overlapped, and they also appear together with WBC, which often
causes difficulties in the manual evaluation process. This paper presents an object detection model, previously used for vehicle and crowd detection, applied to small microscopic objects. The present work also compares the detection process across the object detection models YOLOV3, YOLOV4, and YOLOV5. The RetinaNet model is found to be suitable for bacilli and WBC detection; with ResNet-101 as the backbone it achieves the highest mAP of 0.92. The proposed model is also compared across different filter sizes, and a kernel size of 5 × 5 is found to be suitable for bacteria detection.
References 1. Bhandari J, Thada PK. NCBI Bookshelf. Aservice of the National Library of Medicine, National Institutes of Health 2. Waitt CJ, Banda NPK, White SA, Kampmann B, Kumwenda J, Heyderman RS, Pirmohamed M, Squire SB (2011) Early deaths during tuberculosis treatment are associated with depressed innate responses, bacterial infection, and tuberculosis progression. J Infect Dis 204(3):358–362 3. Kuria Joseph KN (2019) Diseases caused by bacteria in cattle: tuberculosis. In: Bacterial cattle diseases. IntechOpen 4. Koch ML, Cote RA (1965) Comparison of fluorescence microscopy with Ziehl-Neelsen stain for demonstration of acid-fast bacilli in smear preparations and tissue sections. Am Rev Respir Dis 91(2):283–284 5. Wang H, Ceylan Koydemir H, Qiu Y, Bai B, Zhang Y, Jin Y, Tok S, Yilmaz EC, Gumustekin E, Rivenson Y et al (2020) Early detection and classification of live bacteria using time-lapse coherent imaging and deep learning. Light: Sci Appl 9(1):1–17 6. Kotwal S, Rani P, Arif T, Manhas J, Sharma S (2021) Automated bacterial classifications using machine learning based computational techniques: architectures, challenges and open research issues. Arch Comput Methods Eng 1–22 7. Qu K, Guo F, Liu X, Lin Y, Zou Q (2019) Application of machine learning in microbiology. Front Microbiol 10:827 8. Sun H, Yang C, Chen Y, Duan Y, Fan Q, Lin Q (2022) Construction of classification models for pathogenic bacteria based on libs combined with different machine learning algorithms. Appl Opt 61(21):6177–6185 9. Yang M, Nurzynska K, Walts AE, Gertych A (2020) A CNN-based active learning framework to identify mycobacteria in digitized Ziehl-Neelsen stained human tissues. Comput Med Imaging Graphics 84:101752 10. Lo CM, Wu YH, Li YC, Lee CC (2020) Computer-aided bacillus detection in whole-slide pathological images using a deep convolutional neural network. Appl Sci 10(12):4059 11. Ferrari A, Lombardi S, Signoroni A (2017) Bacterial colony counting with convolutional neural networks in digital microbiology imaging. Pattern Recognit 61:629–640 12. Al-Shaebi Z, Uysal Ciloglu F, Nasser M, Aydin O (2022) Highly accurate identification of bacteria’s antibiotic resistance based on Raman spectroscopy and U-Net deep learning algorithms. ACS Omega 7(33):29443–29451 13. Wang Y, Wang C, Zhang H, Dong Y, Wei S (2019) Automatic ship detection based on RetinaNet using multi-resolution Gaofen-3 imagery. Remote Sens 11(5):531 14. Liu M, Wang X, Zhou A, Fu X, Ma Y, Piao C (2020) Uav-yolo: small object detection on unmanned aerial vehicle perspective. Sensors 20(8):2238 15. Lu J, Ma C, Li L, Xing X, Zhang Y, Wang Z, Xu J (2018) A vehicle detection method for aerial image based on yolo. J Comput Commun 6(11):98–107 16. Morbekar A, Parihar A, Jadhav R (2020) Crop disease detection using yolo. In: 2020 international conference for emerging technology (INCET). IEEE, pp 1–5
17. Peng H, Li Z, Zhou Z, Shao Y (2022) Weed detection in paddy field using an improved RetinaNet network. Comput Electron Agric 199:107179 18. Yang R, Yu Y (2021) Artificial convolutional neural network in object detection and semantic segmentation for medical imaging analysis. Front Oncol 11:638182 19. Liang M, Hu X (2015) Recurrent convolutional neural network for object recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3367– 3375 20. Yang M, Xiao X, Liu Z, Sun L, Guo W, Cui L, Sun D, Zhang P, Yang G (2020) Deep RetinaNet for dynamic left ventricle detection in multiview echocardiography classification. Sci Progr 2020 21. Su Y, Liu Q, Xie W, Hu P (2022) Yolo-logo: a transformer-based yolo segmentation model for breast mass detection and segmentation in digital mammograms. Comput Methods Progr Biomed 106903 22. Jung H, Kim B, Lee I, Yoo M, Lee J, Ham S, Woo O, Kang J (2018) Detection of masses in mammograms using a one-stage object detector based on a deep convolutional neural network. PLoS ONE 13(9):e0203355 23. Ueda D, Yamamoto A, Onoda N, Takashima T, Noda S, Kashiwagi S, Morisaki T, Fukumoto S, Shiba M, Morimura M et al (2022) Development and validation of a deep learning model for detection of breast cancers in mammography from multi-institutional datasets. PLoS ONE 17(3):e0265751 24. Bozaba E, Solmaz G, Yazıcı Ç, Özsoy G, Tokat F, Iheme LO, Çayır S, Ayaltı S, Kayhan CK, ˙Ince Ü (2021) Nuclei detection on breast cancer histopathology images using RetinaNet. In: 2021 29th signal processing and communications applications conference (SIU). IEEE, pp 1–4 25. Kubera E, Kubik-Komar A, Kurasi´nski P, Piotrowska-Weryszko K, Skrzypiec M (2022) Detection and recognition of pollen grains in multilabel microscopic images. Sensors 22(7):2690 26. Nakasi R, Mwebaze E, Zawedde A, Tusubira J, Akera B, Maiga G (2020) A new approach for microscopic diagnosis of malaria parasites in thick blood smears using pre-trained deep learning models. SN Appl Sci 2(7):1–7 27. Waithe D, Brown JM, Reglinski K, Diez-Sevilla I, Roberts D, Eggeling C (2020) Object detection networks and augmented reality for cellular detection in fluorescence microscopy. J Cell Biol 219(10) 28. Shah MI, Mishra S, Yadav VK, Chauhan A, Sarkar M, Sharma SK, Rout C (2017) ZiehlNeelsen sputum smear microscopy image database: a resource to facilitate automated bacilli detection for tuberculosis diagnosis. J Med Imaging 4(2):027503 29. Ren S, He K, Girshick R, Zhang X, Sun J (2016) Object detection networks on convolutional feature maps. IEEE Trans Pattern Anal Mach Intell 39(7):1476–1481
AuraCodes: Barcodes in the Aural Band Soorya Annadurai
Abstract Barcodes are well-known tools initially developed in the twentieth century to tackle issues of inventory management in supermarkets, by using optical scanners to turn visible black-and-white lines into machine-readable binary code. This evolved over the years into many forms, like Codabar, EAN-8, and UPC-A. A major advancement in this technology was the representation of these codes twodimensionally, giving rise to Data Matrixes, MaxiCodes, and the popular QR Codes. All these variants use visible surface area as the mode of transmission. However, these forms require line of sight, close physical proximity, and adequate lighting. This work proposes a novel way to generate and detect barcodes using audio frequencies spread across time, instead of surface area. Keywords Barcodes · QR codes · Audio frequencies · Aural band
1 Introduction In today’s day and age of digital technology and innovation, data collection, representation, and interpretation has become a centric focus for most technologies such as machine learning techniques, medical care, advertising, and so many others. One of the most prominent breakthroughs occurred in the late 1960s, when onedimensional barcodes were used to label railroads by the Association of American Railroads. A barcode is a machine-readable optical label that contains information about the item to which it is attached [5]. It is compiled according to a set of rules to embed information for machine reading [4]. A matrix code, also termed a 2D barcode, is a two-dimensional method to represent information. It is similar to a linear (1-dimensional) barcode, but can represent more data per unit area. A QR code is a type of two-dimensional code that can store data information and designed to be read by smart-phones. QR stands for Quick Response, indicating that the code contents should be decoded very quickly at high speed. The code consists of black modules S. Annadurai (B) Microsoft Corporation, Redmond, WA 98052, USA e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_9
arranged in a square pattern on a white background. The information encoded may be text, a URL or other data. The popularity of QR codes is growing rapidly all around the world. Nowadays, mobile phones with built-in cameras are widely used to recognize QR codes. The QR code system consists of a QR code encoder and decoder. The encoder is responsible for encoding data and generating the QR code, while the decoder decodes the data from the QR code. Two-dimensional barcodes were a success because of the greater efficiency they possess over traditional one-dimensional barcodes. Here, efficiency refers to the density of information that can be packed into a given amount of area. It can be observed that virtually all existing two-dimensional barcodes utilize a Cartesian arrangement of square tiles that may be either of two opposing colors. Other examples of two-dimensional barcodes are the MaxiCode, Data Matrix, and ShotCode. One interesting aspect of some of these barcodes is the utilization of an error correction code. The Reed-Solomon error correction code is one of the most powerful and standardized techniques [7] to detect the presence of an error in the transmission of a message, as well as to possibly correct the error. The Reed-Solomon algorithm is represented as an (n, k) RS code defined over a Galois field GF(2^m). There exists work by Jalaly et al. [2] to represent barcodes using radio frequencies. However, these RF barcodes require dedicated hardware to be manufactured for each code, and the code cannot be overwritten. The detection of today's barcodes requires line of sight, close physical proximity, and adequate lighting. That is because the scanner must accurately detect [1] the position and color of each of the tiles (for example, rectangles in barcodes, squares in QR codes, etc.). This work attempts to overcome these limitations by proposing a novel way to generate, transmit, and receive barcode-like information. A technical analysis of how the one-dimensional EAN13 barcode and the two-dimensional QR code are constructed [3, 6] is presented, along with justifications for each of their components (or "zones"). These zones are used as motivation for the components of the proposed audible barcode, or AuraCode.
2 Technical Analysis 2.1 Components of the EAN13 Barcode An example of a one-dimensional barcode is the EAN13 barcode. It comprises several components, or zones, as seen in Fig. 1.
Fig. 1 Components of an EAN13 barcode
• Quiet Zone: The barcode is surrounded on the left and right extremes by empty white space. This helps the detector visually interpret the left and right extremities of a barcode and distinguish it from its surroundings.
• Left and Right Guards: The first and last 3 lines follow the pattern of 1 black line, 1 white line, and 1 black line. A black line represents 1 in binary, and a white line represents 0. These lines help the detector calibrate the thickness of a black line and a white line. The first 3 lines comprise the Left Guard, and the last 3 lines comprise the Right Guard.
• Center Guard: This zone is in the center of the barcode and comprises black and white lines representing the binary code '01010'. This is a well-defined pattern in the standard and will always be present. If the detector is unable to find this sequence in the middle of the supposed barcode, it will not consider the pattern to be a valid EAN13 barcode.
• Checksum: This is the last digit of the code in the barcode. It is not a part of the original message transmitted in the barcode. It helps the detector validate whether the rest of the code was correctly scanned. It is a one-digit value generated by passing the original message through a well-defined mathematical function (see the sketch after this list).
• Left-hand and Right-hand encoding: The space between the left guard and the center guard comprises the left-hand encoding, and the space between the center guard and the right guard comprises the right-hand encoding.
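The checksum rule above can be made concrete with a short sketch. This is a minimal illustration of the standard EAN-13 check-digit calculation (weights of 1 and 3 alternating over the first twelve digits); it is not taken from this paper's own code, and the sample payload is arbitrary.

```python
def ean13_check_digit(first_12_digits: str) -> int:
    """Compute the EAN-13 check digit for the first 12 digits of the code."""
    assert len(first_12_digits) == 12 and first_12_digits.isdigit()
    total = 0
    for i, ch in enumerate(first_12_digits):
        weight = 1 if i % 2 == 0 else 3   # odd positions weight 1, even positions weight 3
        total += int(ch) * weight
    return (10 - total % 10) % 10

payload = "590123412345"                      # arbitrary example payload
print(payload + str(ean13_check_digit(payload)))  # -> 5901234123457
```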
2.2 Components of the QR Code An example of a two-dimensional barcode is the Quick Response (QR) code. It comprises of the following zones, as seen in Fig. 2. • Version information: This sequence of black and white squares (representing binary 1 and 0 respectively) help determine which version of the QR code standard was used in generating the code. • Format information: This records two things: the error correction level and the mask pattern used for the symbol. The Reed-Solomon error correction code is employed to tolerate errors/inaccuracies in detecting the color of position of a few tiles when scanning the QR code. The format information contains data on how much error tolerance is used, so that the detector may accurately decode the data. Masking is used to break up patterns in the data area that might confuse a scanner, such as large blank areas or misleading features that look like the locater marks.
Fig. 2 Components of a QR code
The mask patterns are defined on a grid that is repeated as necessary to cover the whole symbol. • Data and error correction keys: This field contains the actual data embedded with error-correcting padding generated by the Reed-Solomon algorithm. This helps in accurately decoding the entire QR code. • Required patterns: For an image processing algorithm to accurately detect both the boundaries and the grid of black and white squares, certain patterns are defined by the standard. The Position markers are placed on the top right, top left, and bottom left corners; these are what the scanner will look for to demarcate the corners of the QR code, and further image processing will take place within these limits. The Alignment markers are present in regularly spaced intervals vertically and horizontally, to allow the detector to accurately infer gridlines even when the QR code image is deformed. And the Timing markers are alternating black and white squares meant to determine the count of squares along the length of the QR code, as well as the width of any one square which can represent binary 0 or 1. • Quiet zone: This is an empty space around the boundaries of the QR code that help the detector distinguish the QR code from its surroundings.
3 Proposition This work proposes a novel type of barcode which represents its “tiles” using audio frequencies instead of a cartesian tiling on a surface. Put simply, it is a barcode that is generated using sound frequencies, and is detected by listening to it. In contrast to conventional 1-dimensional and 2-dimensional barcodes, the audible barcode does
not require line of sight, close physical proximity, or adequate lighting. Possible use cases are: • device pairing: to pair a Bluetooth speaker to one’s phone, one typically needs to find the correct device ID on their phone’s Bluetooth devices menu, click on it, and manually pair using a 6 digit code. An audible barcode can streamline that entire process. Let the Bluetooth speaker play a tune (audible barcode) that identifies itself, and let the phone listen to it to interpret which device to pair with. • data broadcasts: when at a railway station or airport in a foreign country, one might not understand the language or the announcements being made over the loudspeaker. If every announcement was augmented with this audible barcode on the same loudspeaker, people’s devices can listen to the announcement and translate it locally on the personal device’s screen. • assistive technology in non-visual circumstances: if a product needs to be identified in poor visibility conditions (for example, a visually impaired person searching for a product in a supermarket, or in an elevator finding the buttons), audible barcodes serve well here because a visually impaired person cannot reasonably find these objects without sight. For example, most elevator buttons are augmented with Braille code markings, but it is difficult for the visually impaired to locate them. Instead, let the person play a barcode identifying the item they need (or the button to press in an elevator) using their personal device, and the store/elevator can recognize and assist.
4 Implementation The Aural Barcode, hereby referred to as the Auracode, is generated by mapping each byte of data to well-defined audio frequencies. In this implementation, 4 bits of data are represented by a specific frequency. Several critical ideas are inspired by their counterparts in the EAN13 and QR code algorithms, like the Left and Right Guard zones and the Error Correction information. The algorithm to generate an AuraCode is: • Collect the data to encode. • Using the Reed-Solomon algorithm with a Galois field of size 256, and generating 10 symbols, generate a padding of 10 bytes containing the error-correction code (ECC) data. • Use a well-defined 4-byte pattern “0909” as the Left and Right guard pattern. • The binary data to encode will be the sequential concatenation of Left Guard, original message bytes, ECC padding, and Right Guard. • A mapping of half-byte (4 bits) to audio frequency is required. Let .b0000 => 200 Hz, b0001 => 800 Hz, b0010 => 1400 Hz, b0011 => 2000 Hz, b0100 => 2600 Hz, b0101 => 3200 Hz, b0110=>3800 Hz, b0111=>4400 Hz, b1000 => 5000 Hz, b1001 => 5600 Hz, b1010=>6200 Hz, b1011=>6800 Hz, b1100 =>
7400 Hz, b1101 => 8000 Hz, b1110 => 8600 Hz, b1111 => 9200 Hz (sixteen tones spaced 600 Hz apart). • Convert the binary data to encode into its corresponding audio frequencies, playing each frequency for 0.1 s. The proposed implementation allows for a microphone to listen to the audio stream and, using audio processing techniques like Fast Fourier Transforms, reconstruct the frequencies played between the known Left Guard and Right Guard frequencies. In this case, because the guard pattern is defined as "0909", or hexadecimal 0x30393039, it translates to an audio segment of 2000 Hz, 200 Hz, 2000 Hz, 5600 Hz, 2000 Hz, 200 Hz, 2000 Hz, 5600 Hz, each tone played for 0.1 s. So, when the microphone detects this particular 0.8 s sequence, it has detected either the Left or the Right Guard zone (timestamps). The detecting algorithm can then proceed to reconstruct the frequencies of the notes played between these timestamps. The frequency map specified in this implementation does not have any specific significance; any set of audio frequencies can be chosen. Further research can be undertaken to see whether a specific set of fine-tuned frequencies might serve better for an audio detection algorithm (for example, frequencies from pentatonic or heptatonic scales).
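A compact sketch of the generation path described above follows. It is only an illustration, assuming the third-party reedsolo package for the Reed-Solomon padding and NumPy for tone synthesis; the helper names (generate_auracode, nibble_to_freq), the sample rate, and the tone formula 200 + 600·n are assumptions consistent with the map and guard example given in the text, not the author's actual implementation.

```python
import numpy as np
from reedsolo import RSCodec  # assumed third-party Reed-Solomon package

SAMPLE_RATE = 44100           # assumed audio sample rate in Hz
SYMBOL_SECONDS = 0.1          # each half-byte is played for 0.1 s
GUARD = b"0909"               # left/right guard pattern (hex 0x30393039)

def nibble_to_freq(nibble: int) -> float:
    # 16 evenly spaced tones: b0000 -> 200 Hz, b0001 -> 800 Hz, ..., b1111 -> 9200 Hz
    return 200.0 + 600.0 * nibble

def generate_auracode(message: bytes, ecc_symbols: int = 10) -> np.ndarray:
    rsc = RSCodec(ecc_symbols)                     # GF(256), 10 ECC bytes as in the paper
    framed = GUARD + bytes(rsc.encode(message)) + GUARD
    t = np.linspace(0.0, SYMBOL_SECONDS, int(SAMPLE_RATE * SYMBOL_SECONDS), endpoint=False)
    samples = []
    for byte in framed:
        for nibble in (byte >> 4, byte & 0x0F):    # high half-byte first, then low
            samples.append(np.sin(2.0 * np.pi * nibble_to_freq(nibble) * t))
    return np.concatenate(samples)                 # raw waveform; write to WAV or play back

audio = generate_auracode(b"AuraCodes are fun")
```

Note that the guard bytes b"0909" (0x30, 0x39, 0x30, 0x39) expand to the tone sequence 2000, 200, 2000, 5600, 2000, 200, 2000, 5600 Hz, matching the 0.8 s guard segment described above.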
5 Experimental Results A sample message "AuraCodes are fun" is chosen to be encoded. The first step is to represent it as a byte sequence. This results in the sequence: b'AuraCodes are fun'
Then, we pass it through the Reed-Solomon error correction algorithm, and retrieve 10 bytes of error correction keys, which is appended to the original message. This results in the sequence: b’AuraCodes are funI>\xb3\t2\xb0\xdc\x81\x06f’
Then, we add the left and right guards. This results in the sequence: b’0909AuraCodes are funI>\xb3\t2\xb0\xdc\x81\x06f0909’
This is the final sequence that needs to be represented using the audio frequencies specified in the map described in the implementation. Each frequency is rendered for 0.1 s. The generated AuraCode has a spectrogram as seen in Fig. 3.
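The receiving side described earlier (recovering one dominant tone per 0.1 s window and locating the "0909" guards) could look roughly like the idealized sketch below. It assumes a clean, time-aligned recording at a known sample rate and the same 600 Hz-spaced frequency map; the helper names are illustrative, and a practical detector would additionally need synchronization, windowing and noise handling.

```python
import numpy as np

SAMPLE_RATE = 44100          # assumed capture rate
SYMBOL_SECONDS = 0.1
GUARD_NIBBLES = [3, 0, 3, 9, 3, 0, 3, 9]   # "0909" -> 2000, 200, 2000, 5600 Hz (twice)

def dominant_nibble(window):
    """FFT one 0.1 s window, pick the strongest bin, map it back to the nearest half-byte."""
    spectrum = np.abs(np.fft.rfft(window))
    freqs = np.fft.rfftfreq(len(window), d=1.0 / SAMPLE_RATE)
    peak = freqs[np.argmax(spectrum)]
    return int(round((peak - 200.0) / 600.0))

def decode_nibbles(audio):
    step = int(SAMPLE_RATE * SYMBOL_SECONDS)
    return [dominant_nibble(audio[i:i + step])
            for i in range(0, len(audio) - step + 1, step)]

def extract_payload(nibbles):
    """Return the half-bytes between the first and last occurrence of the guard pattern."""
    g = GUARD_NIBBLES
    hits = [i for i in range(len(nibbles) - len(g) + 1) if nibbles[i:i + len(g)] == g]
    if len(hits) < 2:
        raise ValueError("guard pattern not found")
    return nibbles[hits[0] + len(g):hits[-1]]
```

The recovered half-bytes would then be paired back into bytes and passed through the Reed-Solomon decoder (for example, reedsolo.RSCodec(10).decode) to correct residual symbol errors before the message is read out.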
Fig. 3 Spectrogram of an AuraCode with the message “AuraCodes are fun”
6 Future Work This implementation is a proof of concept, and can be enhanced in the following ways: • The message can be encrypted using either a symmetric or asymmetric encryption algorithm (AES or RSA). This assumes the key data (symmetric key, or private key) can be independently and securely transmitted to the intended recipient(s). • Currently, each half-byte of data is represented by an individual frequency played over 0.1 s. However, because multiple non-destructive frequencies can be detected by a Fast Fourier transformation, multiple frequencies can be played simultaneously over the same span of time (0.1 s), effectively creating a chord of frequencies. This allows for the time taken for the entire AuraCode to be played to be reduced by a factor of the number of frequencies played in a chord. • Currently, it is possible for the binary data stream to be repetitive in nature. For example, it is possible to continuously play 2000 Hz for most of the duration of the AuraCode. This problem can be eliminated by using Masking patterns, similar to the implementation of the QR code, to ensure the bytes of the data stream are non-repetitive in bursts. Information about which mask pattern was used would need to be transmitted within the AuraCode, similar to the Mask data zone in the QR code. • The current implementation limits the number of Reed-Solomon error correction symbols to 10. This can be made dynamic and customized to the size of the message. Shorter messages can demand fewer error correction symbols, and larger messages can demand more. This would imply that another zone representing the number of Reed-Solomon ECC symbols needs to be reserved in the structure of the AuraCode.
7 Conclusion And so, a novel implementation of a barcode in the aural band (the AuraCode) is proposed. It circumvents the limitations of all current visual barcodes that require line of sight, close physical proximity, and adequate lighting. The AuraCode can be detected across physical barriers, can be detected at a distance from the audio source, and does not require light for generation or detection. It takes advantage of hardware that is very commonly found in the modern environment—a speaker and a microphone. The AuraCode can become the basis of a new standard of device pairing algorithms, assistive technology for public announcements, or identifying items/commands without visual reference.
References 1. Creusot C, Munawar A (2015) Real-time barcode detection in the wild. In: 2015 IEEE winter conference on applications of computer vision, pp 239–245 2. Jalaly I, Robertson ID (2005) RF barcodes using multiple frequency bands. In: IEEE MTT-S international microwave symposium digest, 2005, pp 139–142 3. Li Y, Zeng L (2010) Research and application of the EAN-13 barcode recognition on iphone. In: 2010 international conference on future information technology and management engineering, vol 2, pp 92–95 4. Lin S-C, Wang P-H (2014) Design of a barcode identification system. In: 2014 IEEE international conference on consumer electronics—Taiwan, pp 237–238 5. Palmer RC (2007) The bar code book: a comprehensive guide to reading, printing, specifying, evaluating, and using bar code and other machine-readable symbols. Trafford Publishing 6. Tiwari S (2016) An introduction to QR code technology. In: 2016 international conference on information technology (ICIT), pp 39–44 7. Zhang J, Fan G, Kuang J, Wang H (2005) A high speed low complexity Reed-Solomon decoder for correcting errors and erasures. In: IEEE international symposium on communications and information technology, 2005. ISCIT 2005, vol 2, pp 1009–1012
A Scrutiny and Investigation on Student Response System to Assess the Rating on Profuse Dataset—An Aerial View Shweta Dhareshwar and M. R. Dileep
Abstract Sentiment analysis (SA) has grown in popularity among researchers in several fields, including the field of education, in recent times. In this paper, a vast study and examination is made to analyze students’ feedback to assess the efficiency of educating and learning. In this study, the algorithms and techniques of Natural Language Processing (NLP) and Machine Learning (ML) are considered for mining the student feedback on teaching and learning process. Universities and organizations collect the student feedback on the various online courses offered by them. The student feedbacks are classified into three sentiment classes Positive, Negative or Neutral. Traditionally Questionnaires are used to collect feedback. To overcome this, an automated system is proposed using NLP. In NLP Lexicon and Dictionary approaches are implemented for Sentiment Analysis. In this study Machine Learning techniques such as Support Vector Machines, Multinomial Naive Bayes are considered for analysis. In this paper, online learning platform such as Coursera, Udemy, LinkedIn Learning are used to gather student feedback on instructors’ performance, learning experience and other course features. Many online learning portals like Coursera works with universities and other organizations to offer free online courses for students on variety of subjects. Additionally, students blog about their academic understandings. This paper focusses on three topics: Assessing MOOC contentsIt is considered from student perspective related to learning materials, assignment, and project. Course structure-related to course module, difficulty level and learning objectives. Design—different courses offered are well organized and the way it has been presented content wise. Keywords Sentiment analysis · Student feedback · Natural language processing · Assessment · Feature selection · Educational data mining
S. Dhareshwar (B) · M. R. Dileep Department of Master of Computer Applications, Nitte Meenakshi Institute of Technology, Yelahanka, Bengaluru, Karnataka, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_10
1 Introduction Feedback is an integral element of the system in academic attainment. One of the most effective methods of educational practice is through student feedback, which has a clear impact on students and a significant effect in both the teaching and learning processes. In learning online, response is information provided to the learner regarding their performance compared to the learning objectives or outcomes. It should be able to yield improvement in student’s learning. Feedback comes in many shapes and forms. This assignment was carried out manually using paper and pens, which has several limitations such as evaluation of these handwritten forms is a tedious process. The faculty gathers papers and computes the final grade for each course and lecturer after receiving comments from all students. Recent advancements in online learning have created good chances to move away from traditional lecture approaches and towards more creative and effective teaching techniques. These pedagogies encourage group collaboration among students and give a wide number of student’s free access to course resources. Massive Open Online Courses (MOOCs) are one such learning approach that has drawn a lot of interest. These Online courses (MOOCs) provide an open access to number of students across the world via the Web. The Online Course Feedback and Analysis System is a webbased system which gathers the opinion from every distinct student and gives an overall result about the following course, which can be analyzed directly. In this online-era, where everything is going web based, there is a need to develop an online portal which is very useful to maintain feedback reports. Due to this, the evaluation reports can be accessed by the university more quickly and without any data loss. Students will fill out an online questionnaire using a standard format after logging into the website. By this process, the student may submit the feedback successfully and error-free. One can access the feedback given by the students and it can be analyzed accordingly. Sentiment analysis is particularly useful for reviewing course remarks in vast online courses (MOOCs), which could make it simple for trainers to assess their courses. In this paper, in-depth scrutiny, and investigation on the application of sentiment analysis to the assessment of student feedback on MOOC course is conducted.
2 Literature Survey Sentiment Analysis is also known as opinion mining is a subfield of Natural Language Processing that tries to identify and abstract opinions within a given document, sentence and transcript level across blogs, evaluations, social media forums, news etc. Emotion detection conducted on document level is difficult since large number of documents are present. Sentiment analysis is used in an inclusive range of fields, including business, education, medicine, and politics, and has emerged as one of the
most fiercely disputed concerns in education. Particularly education sector focus on multiple state holders—teacher, student, decision maker and institution. Kumar et al. [1] has conducted a detailed study on various educational firms. In this study, any institute must have feedback evaluation in order to sustain and track the system’s academic standard. A questionnaire-based technique is typically used to assess an institute’s teachers’ performance. It suggests a sentiment analysisbased autonomous valuation system that will be more resilient and insightful than the recent approach. Ahmad et al. [2] has suggest that compared to lexicon-based methods, supervised learning algorithms yield higher results. Dinu et al. [3] suggests a novel approach to gathering and making use of useful student feedback on the effectiveness of the electronic classroom as a learning environment. Dake et al. [4] summarizes Sentiment analysis and qualitative feedback from students are important for institution. Using sentiment analysis enables it to understand how much students value their lectures, which they often communicate in prolonged texts with little or no limitations. Dalipi et al. [5] demonstrate implementation of sentiment analysis methods to interpret student input in MOOCs is a crucial step towards enhancing the learning opportunity. It can be used for examining how students behave toward courses, platforms, and instructors, sentiment analysis can be used to enhance instruction. Bhanukiran et al. [6] has explored the Student Feedback System is used to gather student feedback. Using the feedback provided by the students, it produces reports for the instructors. It includes modules for students, faculty, and administration. It is possible to determine the faculty’s effectiveness from the perspective of the students. Ibrahim et al. [7] has focused to identify problems with assessment methods, analyse student comments on assessments, and recommend an effective data mining methodology to decision-makers so they may update and alter their practices accordingly. The primary goal is to make learning more effective. Mabunda et al. [8] evaluating instructors based on student feedback is crucial because it enables teachers to assess the effectiveness of their instruction. This research offers a sentiment analysis model to analyse student feedback to be able to evaluate the success of teaching and learning in order to address this issue. This study uses student datasets from Kaggle to train machine learning models such as Support Vector Machines, Multinomial Naive Bayes and Neural Networks on feature engineering and resampling techniques to categorize student feedback into three sentiment classes. Kastrati et al. [9] invented an algorithm using deep learning in the field of education to investigate students’ attitudes, ideas, and actions about diverse learning facets. Knöös et al. [10] focusedon MOOCs struggle with significant number of dropouts, and only an insignificant portion of students finish the classes they registered for despite their rising popularity. It is to understand how MOOC learners behave and to determine learner motivations and how they vary between students who finished a course and those who back out. Kotikalapudi et al. [11] explained the scenario that student feedback can be processed using knowledge engineering and expert system to help academia administrators and instructors solve challenging areas in education. 
The suggested approach uses online resources and course exams to measure student answers to identify psychological
polarity, the ideas expressed and satisfaction versus dissatisfaction. The sustainability of the system can be seen when findings are compared to those from direct assessments. Liu et al. [12] has conducted study on lexical (unsupervised) and machine learning methods used to do sentiment analysis on messages received from a MOOC forum (supervised). These methodologies are compared, study reveals that machine learning techniques have certain limitations over lexical. Mehta et al. [13] concentrate on the fundamentals of opinion mining and its dimension and results of an analysis using various ML and Lexicon exploration techniques. Altrabsheh et al. [14] gave an approach on real-time student feedback provides many benefits for education, but it can be frustrating and time-consuming to analyze input while instructing. This Paper suggest employing sentiment analysis to automatically analyze feedback in order to solve this issue. Support Vector Machines (SVM), with the highest level of pre-processing, unigrams and absence of neutral class, produced the best results for the four attributes, with an accuracy of 95%. Nasim et al. [15] the sentiment analysis of student evaluation is presented in this work using a combination of machine learning and lexicon-based methodologies. The textual feedback, which is normally gathered at the end of a semester, offers helpful insights into the overall quality of the instruction, and makes insightful recommendations for ways to enhance the process. In this paper, three aspect of MOOC course viz. Assessing MOOC contents, Course structure, Design is considered. Different courses are considered further rating given by the student for course are compared and feedback which is divided into three categories: positive, negative, and neutral.
3 Motivation and Background Study Massive Open Online Course is referred to as a MOOC, a portal serves as the host for courses. The Courses are open enrolment, free courses that are provided by several websites. The course’s framework includes areas for discussion, practice, and learning. Users can refer to all the reading material, lectures, and videos in MOOCs to further learning. Giving students the freedom to learn at their own speed is the main goal of MOOCs. The students can access the course at any time by logging on to the site. Due to the accessibility and flexibility, it provides in terms of learning, online education has experienced a major upsurge in recent years. The main objective of this paper is to critically analyse student reviews of courses available on the Coursera platform by responding to research questions and doing systematic reviews in a step-by-step manner. Additionally, the Covid-19 epidemic raised that ratio. According to statistics, the online learning sector has risen by 70% since 2015 and is predicted to continue growing in the years to come. The quality of online
Fig. 1 Correlation of traditional education and the MOOC courses
education has grown over time, which has enhanced its reputation. There are several advantages of sentiment analysis in MOOC mainly 1. It is crucial for MOOC coordinators to assess student comments and make improvements by understanding the learners and properly focusing the feedback. 2. It is vital to understand the opinions of the students in order to determine whether the course is appreciated by the students. The reviews or comments might be used to infer the student perspective. 3. Monitoring negative words or phrases might assist in resolving immediate concerns. The root reason of any negative remarks can be found by looking for them in comments, and the issue can then be fixed to improve the MOOC. Due to pandemic many students feel MOOC courses offered in online platform are better than traditional education. Since there are many benefits of MOOC course over traditional education and it is depicted using Fig. 1. A concise description on the components of the sentiment analysis architecture is given in the below section.
3.1 Data Collection Text that has been written by users can be found on social networking sites like Twitter, Facebook, and Instagram as well as weblogs, forums, news, and reviews.
3.2 Pre-processing The first stage in sentiment analysis systems for text mining to get the data ready for categorization is pre-processing. At this phase, the SA system gets the collected data ready for execution. Text cleaning, selection and semantic segmentation are all included in this phase. Several methods are used to reduce text noise in textual data and reduce dimensionality, which aids in effective categorization.
3.3 Tokenization Tokenization entails breaking up the raw text into manageable components. The original text is tokenized, or broken up into tokens like words and sentences.
3.4 Stemming It is a method of eliminating the word’s suffix and restoring it to its original form.
3.5 Feature Selection This is a method to identify a subset of features to achieve various goals.
3.6 Classification The classification process involves selecting a set of categories or modules for a novel observation based on a training sample. There are various methods for classifying sentiment. However, lexicon-based implementation and using machine learning methods are the two approaches that are most widely utilized.
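As a concrete illustration of the lexicon-based route mentioned above, the sketch below uses NLTK's VADER analyzer to map a feedback string to one of the three sentiment classes. VADER is an illustrative choice of lexicon rather than the one used in any particular surveyed study, and the ±0.05 thresholds are the conventional VADER defaults.

```python
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)   # one-time download of the lexicon
sia = SentimentIntensityAnalyzer()

def label_feedback(text: str) -> str:
    score = sia.polarity_scores(text)["compound"]   # compound polarity in [-1, 1]
    if score >= 0.05:
        return "Positive"
    if score <= -0.05:
        return "Negative"
    return "Neutral"

print(label_feedback("Excellent course and the training was very detailed"))  # Positive
```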
3.7 Evaluation A classifier's performance is often assessed using four metrics: accuracy, precision, recall, and F-measure.
3.8 Visualization The evaluation process is visualized using well-known methods like graphs, histograms, and confusion matrices (Fig. 2). The NRC Emotion Lexicon and natural language processing are both used by the system to identify emotion and thoughts after pre-processing input data from student comments received from MOOC forums. The approach determines whether a person is satisfied or dissatisfied by categorizing feelings into three categories: positive and
negative, and neutral. The SA system contains a data-visualization component to aid in analysis and is capable of processing content in multiple languages.
Fig. 2 Sample sentiment analysis system architecture of the dataset: Data Collection → Preprocessing (Stemming, Stop Word Removal, Tokenization) → Feature Selection → Classification → Evaluation → Visualization
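The pipeline of Fig. 2 maps naturally onto a small scikit-learn baseline. The sketch below is illustrative only, not code from any of the surveyed systems; the toy texts and labels stand in for a real feedback corpus, and MultinomialNB can be swapped for an SVM (for example, sklearn.svm.LinearSVC) to obtain the other classifier considered in this study.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Toy stand-ins for the real feedback corpus and its sentiment labels
texts = [
    "Excellent course and easy to follow",
    "Great lectures, very detailed training",
    "Terrible pacing and poor explanations",
    "Too many mistakes in the quizzes",
    "It was okay overall",
    "Average content, nothing special",
]
labels = ["Positive", "Positive", "Negative", "Negative", "Neutral", "Neutral"]

X_train, X_test, y_train, y_test = train_test_split(texts, labels, test_size=0.33, random_state=42)

model = Pipeline([
    ("tfidf", TfidfVectorizer(lowercase=True, stop_words="english")),  # tokenization, stop-word removal, weighting
    ("clf", MultinomialNB()),                                          # Multinomial Naive Bayes classifier
])
model.fit(X_train, y_train)

# Accuracy, precision, recall and F-measure for the three sentiment classes
print(classification_report(y_test, model.predict(X_test), zero_division=0))
```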
4 Dataset A set of course reviews was collected from Kaggle for this paper: about 1,500 student comments posted between August 2019 and August 2021. Each feedback record includes the following fields, as presented in Table 1: 1. Course_id, 2. Course_Certificate_Type, 3. Course_Rating, 4. Review_by_Course. Course_id contains the course title. Course_Certificate_Type contains information about the type of certificate offered for the course. Course_Rating holds the rating associated with each course, given by students and ranging between 4 and 5. Review_by_Course contains the comment given by the student based on their learning experience. The feedback is free text that carries emotion, for example positive (Excellent, Great), negative (Terrible, Poor), or neutral.
Table 1 Sample student review on courses from Kaggle Dataset
Course_id | Course_Certificate_Type | Course_Rating | Review_by_Course
machine-learning | Specialization | 4.7 | Pretty dry, but I was able to pass with just two complete watches so I'm happy about that. As usual there were some questions
data-visualization-tableau | COURSE | 4.7 | Would be a better experience if the video and screen shots would show on the side of the text
supply-chain-logistics | COURSE | 4.7 | A few grammatical mistakes on test made me do a double take but all in all not bad
natural-language-processing-tensorflow | COURSE | 4.8 | Excellent course and the training provided was very detailed and easy to follow
python-crash-course | COURSE | 4.6 | Great course, lectures were straight forward and easy to follow along. The course provided all the information necessary to pass the CPI examination for certification
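A few lines of pandas are enough to reproduce the kind of inspection behind Table 1. The file name coursera_reviews.csv and the exact column spellings below are assumptions based on the fields listed above, not a documented export of the original Kaggle dump.

```python
import pandas as pd

# Assumed local export of the Kaggle course-review dump
df = pd.read_csv("coursera_reviews.csv")

print(df.shape)                                      # roughly 1,500 feedback records
print(df["Course_Rating"].describe())                # ratings lie between 4 and 5
print(df["Course_Certificate_Type"].value_counts())  # COURSE vs Specialization entries
print(df["Review_by_Course"].head())                 # free-text feedback to be classified
```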
5 Observations and Analysis The observations from this study of sentiment analysis techniques applied to student feedback for enhancing teaching and learning are displayed in Table 2. The approaches used to conduct sentiment analysis in MOOCs can be classified into four basic categories: many papers use either Naïve Bayes (NB), Support Vector Machines (SVM), or lexicon-based methods, a few use the PRISMA framework or their own method, and many of the studies did not specifically describe the method.
6 Results and Discussions This section discusses in detail the tentative findings of student sentiment prediction using an existing dataset created from student feedback on Coursera and obtained from the Kaggle website. The dataset consists of 1,500 records of student evaluations of the effectiveness of the courses in which they enrolled, as shown in Figs. 3 and 4.
Table 2 Broad analysis, observations and success rates on various SA techniques
Problem selected | Method | Observation | Success rate (%)
Sentiment analysis of students' feedback with NLP and deep learning: a systematic mapping study | PRISMA framework | Deep learning techniques are applied in the educational field to evaluate students' opinions and behavior about various elements of teaching | 56
A sentiment analysis system to improve teaching and learning | NLP with NRC emotion lexicon | It supports 40 languages including several Indian ones like Hindi, Tamil, Gujarati, Marathi, and Urdu | 80
Sentiment analysis and feedback evaluation | Supervised and semi-supervised machine learning techniques | Feedback is collected in the form of running text and sentiment analysis; the existing questionnaire-based system is replaced by an automatic system | Not reported
Sentiment analysis: towards a tool for analyzing real-time students feedback | SVM, naive Bayes, complement naive Bayes (CNB) | SVM performed better | 95
Sentiment analysis of student textual feedback to improve teaching | Count vectorizer, TFIDF (term frequency inverse document frequency), SMOTE technique | SVM, MNB, and RF classifiers are compared to machine learning classifiers like neural networks and K-nearest neighbors in this study. In comparison to the other models, neural networks fared better | 79, 68.08
7 Conclusions The pandemic crisis sped up online learning across the globe and expanded the MOOC marketplace. The use of sentiment analysis methods to evaluate student opinion in MOOCs is a crucial step toward enhancing the educational experience. Additionally, by examining how students behave toward courses, platforms, and instructors, sentiment analysis can be used to enhance instruction. This paper examined student viewpoints from the numerous comments left by students in online forums through a detailed survey. Through textual data mining and sentiment analysis, it is observed that most MOOC learners' comments were favorable and that the sentiment score was positively correlated with the MOOCs' rating. The two supervised machine learning algorithms that are most frequently used are the Naive Bayes classifier and SVMs.
Fig. 3 Sample course rating by a student of based on learning experience
Fig. 4 Positive, negative and neutral review by the student
Future Works Overall, students have a positive opinion of the course content when taking into account the research articles described in the studied sample when evaluating MOOC content. As was mentioned above, most of the literature on MOOC content assessment focuses on learner remarks; however, future researchers may want to think about looking into teachers’ feedback and perspectives on the development of the course material, teaching pedagogy, experience, and assessment, among other things. Sentiment analysis can be applied for different MOOC platform like Udemy to improve teaching learning process. To increase the system’s multilingual capability, add other languages.
References 1. Kumar A, Jain R (2015) Sentiment analysis and feedback evaluation. In: 2015 IEEE 3rd international conference on MOOCs, innovation and technology in education (MITE), pp 433–436. https://doi.org/10.1109/MITE.2015.7375359 2. Ahmad GI, Singla J (2019) Machine learning techniques for sentiment analysis of Indian languages. Int J Recent Technol Eng 8(2):3630–3636 3. Dinu L, Auter P, Arceneaux P (2015) Gathering, analyzing, and implementing student feedback to online courses: is the quality matters rubric the answer? Istanbul J Open Distance Educ 1:15–28 4. Dake DK, Gyimah E (2022) Using sentiment analysis to evaluate qualitative students’ responses. EducInfTechnol. https://doi.org/10.1007/s10639-022-11349-1 5. Dalipi F (2021) Sentiment analysis of students’ feedback in MOOCs: a systematic literature review 6. Bhanukiran G (2018) Student feedback system. IRE J 1(10). ISSN: 2456-8880 7. Ibrahim ZM, Bader-El-Den MB, Haig E (2018) A data mining framework for analyzing students’ feedback of assessment. EC-TEL 8. Mabunda JGK (2021) Sentiment analysis of student textual feedback to improve teaching 9. Kastrati Z, Dalipi F, Imran A, PirevaNuci K, Wani M (2021) Sentiment analysis of students’ feedback with NLP and deep learning: a systematic mapping study. Appl Sci 11.https://doi. org/10.3390/app11093986 10. Knöös J, Rääf SA (2021) Sentiment analysis of MOOC learner reviews: what motivates learners to complete a course? (Dissertation). Retrieved from: http://urn.kb.se/resolve?urn=urn:nbn:se: lnu:diva-105919 11. Devi KV, Yamini UM, Sanjusri K, Virinchi G. Traditional Sentiment and emotion identification system to improve teaching and learning 12. Liu T, Hu W, Liu F, Li Y (2021) Sentiment analysis for MOOC course reviews. In: International conference of pioneering computer scientists, engineers and educators. Springer, Singapore, pp 78–87 13. Mehta P, Pandya S (2020) A review on sentiment analysis methodologies, practices and applications. Int J Sci Technol Res 9(2):601–609 14. Altrabsheh N, Cocea M, Fallahkhair S (2014) Sentiment analysis: towards a tool for analysing real-time students feedback. In: 2014 IEEE 26th international conference on tools with artificial intelligence, pp 419–423. https://doi.org/10.1109/ICTAI.2014.70 15. Nasim Z, Rajput Q, Haider S (2017) Sentiment analysis of student feedback using machine learning and lexicon-based approaches, pp 1–6. https://doi.org/10.1109/ICRIIS.2017.8002475
Analysis of Hospital Patient Data Using Computational Models Impana Anand, M. Madhura, M. Nikita, V. S. Varshitha, Trupthi Rao, and Ashwini Kodipalli
Abstract A recommendation system can help users with their different demands. This paper takes medical services as its objective and aims to improve hospital recommendation on the internet using K-means clustering, conforming to the user's requirements such as the desired locality of the hospital in the city and the required specialty. K-means clustering is a machine learning algorithm used to deal with unsupervised data, where the target variable is not available. It divides the data set into various clusters based on their similarity, iteratively assigning each data point to one of the K clusters built on the attributes provided. Some of the applications of K-means clustering include image segmentation, customer segmentation, species clustering and clustering languages. Considering 'n' hospitals of different specialties such as general, IVF, eye care, maternity, skin and cosmetic, located in different parts of the city, K-means clustering assists the end user in choosing the required hospital based on their desired location as well as specialty. Keywords Hospital management · Label encoder · PCA · Autoencoders · Elbow method and Silhouette analysis
1 Introduction World’s largest industry is health care and managing this industry itself is another industry called Hospital management. Hospital Management is the process of efficiently and effectively administering all the things from appointment scheduling, patient registration to consultation management, lab management, document I. Anand · M. Madhura · M. Nikita · V. S. Varshitha · T. Rao (B) · A. Kodipalli Department of Artificial Intelligence and Data Science, Global Academy of Technology, Bangalore, India e-mail: [email protected] A. Kodipalli e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_11
management, report generation, drug safety, outpatient management, staff management, and so much more [1, 2]. Decades ago, a single doctor would manage the total hospital administration. Now that is not the case. All thanks to the engineers who made doctors' lives easy by building a software-based system called the Hospital Management System (HMS) [3]. The necessity of hospital management is:
• Track and increase revenue
• Data security and integrity
• Paperless operation and minimised errors
• Efficient and accurate hospital administration
• Increased patient satisfaction.
Few of the modern technologies in the healthcare industry are:
1. AI: In 2018 Forbes stated that few of the important areas of AI are image analysis, robotics surgery, administrative workflows, clinical decision support and virtual assistants. It is anticipated that AI tools will complement and facilitate human work but will not take the place of healthcare professionals or physicians [4, 5]. 2. ML: Machine Learning can more accurately detect a disease at a premature stage, helping in reducing the patients’ readmissions in hospitals. 3. DL: Deep learning a branch of AI is used in new drug discovery, recognition of potentially cancerous cells in radiology images [6]. 4. Fuzzy logics: wherever there is non-countable and uncertain data there comes intrusion and the best method to solve them is fuzzy [7]. 5. Big data: collecting, analyzing, and leveraging consumer, patient data is too vast or complex to be understood. So big data will outreach and guides the hospital by doing all those jobs and giving the best next action to be taken. Nowadays Hospitals have to meet the patient needs and the complete focus has shifted to ‘patient services’ from medical and surgical therapies. Good Hospital Management starts by ensuring anytime accessibility, availability and high-quality treatments [3]. In this study, we used k-means clustering, a machine learning method, to propose hospitals to patients based on their location and specialty [8].
2 Literature Survey Stuart Lloyd proposed a technique for pulse code modulation as the standard algorithm in 1957. In 1965. The identical technique was published by E W Forgy, therefore it became known as Lloyd-Forgy. For Big Data [9, 10], in 1967 MacQueen many clustering algorithms are proposed based on parallel and distributed computation [5]. Abdul Nazeer [11, 12] asserts that choosing centroids that result in various clusters is the main drawback of k-means because this decision affects the quality of the cluster that is produced. Choosing centroids and assigning data points to nearby clusters are
the first two phases in k means, followed by the mean calculation. The number of clusters and the requirement for the k value as an input remains a restriction [6]. In their comparison of representative object-based Fuzzy C-Means clustering with centroidbased K-Means clustering, Soumi Ghosh et al. Because fuzzy measure calculations take longer than K-means, the FCM approximates the results of K-means [13, 14]. A number of datasets, including Wine, Vowel, Ionosphere, Iris, crude oil data, and numerous other distance measures are used to calculate the algorithm’s performance, according to Thakare [15, 16]. When there are two or more algorithms belonging to the same category, R Amutha et al. proposed that clustering approaches are employed to obtain the best results. Two k-means algorithms exist: Numerous data sets Clustering with parallel k/h-Means and high dimensional data sets—New Clustering Algorithm Based on K-Means. Parallel k/h-Means performs well on large datasets, and a unique k-means-based clustering method combines the advantages of HC and K-Means. These algorithms expand the space and similarity of the two data sets that are present at each node [17]. Since data is present everywhere, Sakthi et al. argued that evaluating the data is a highly difficult procedure that necessitates data mining ways to learn about, comprehend, and arrange the data into exceptional collections [18]. Shi Na recommended looking into the K-Means algorithm’s shortcomings. Each iteration’s distance between the centroid and each object is calculated using K-means. This process’s repetition has an impact on clustering’s effectiveness [19]. Nidhi Singh recommended using comparative analysis and agglomerative K-Means (a clustering approach that uses one partition) (a clustering algorithm that employs one hierarchy). Both of the aforementioned algorithms’ performance, running time, and accuracy are calculated using WEKA tools [20]. The results demonstrate that for datasets with real characteristics, K-Means is more accurate than the agglomerative for, real attributes, whereas for datasets with integer, the agglomerative is more accurate than K-Means. The K-Means approach is hence effective for huge datasets. POSIX threads on individual nodes, Yan [21, 22] devised an MPI-based spectral clustering technique. The distribution and data partitioning amongst the nodes must be explicitly handled by the programmer because MPI is unable to solve the issue of node failure on its own [16, 23]. Using Map-Reduce, which lessens the operations’ reliance on iterations, Cui Xiaoli offered big data optimization. Three MR (Map-Reduce) jobs were fully utilized. The last stage involves using the Voronoi diagram to map the data set to centroids, while sampling techniques are only used at the beginning [18]. For improving group quality and choosing the ideal number of bunches, Shafeeq et al. provide a modified K-implies algorithm. Information from the client was submitted, including the K-implies calculation and the quantity of bunches (K). In any case, selecting the number of groups in advance under ideal conditions is tough. The strategy presented in this work can be used in both situations, where there is an unknown number of groups and one that is known in advance. The number of bunches can be chosen by the client, or they can choose the bare minimum required [15].
3 Methodology The following method is used in this paper.
Dataset Creation → Data Preprocessing → Model Building → Evaluation → Clustering
3.1 Dataset Creation Creation includes compiling information to create a dataset that helps in model building. This model entails a hospital dataset created by collecting the raw data comprising attributes like Hospital Name, Area, Pin code, Contact details, Ratings, Specialization (General, Eye Care, IVF, Maternity and Skincare) and Online Appointments. The information was obtained from various sources viz., medical journals, telephone directories, and other sites.
3.2 Data Preprocessing In this step, basic EDA of the dataset is done, and checking if there is any correlation between the attributes, as it is necessary to make a relation between characteristics. The ones chosen will directly impact the results. Data pre-processed by normalizing,
eliminating duplicates, and making error corrections using keys such as Unique, Label encoder, Pair plot, heatmap, is null, and not null. • Unique: number of distinct values in a particular attribute. • Label encoder: it converts non-numeric values into numeric values, for instance, “specialty”. • is null and not null: checks whether there are any missing or null values. • Pair plot (Fig. 1): the pairwise relationship between the elements. Here is a graph showing a pairwise comparison between the 5 attributes. • Heatmap (Fig. 2): represents each value of a matrix using colour. Correlation between the attributes is shown below. The maximum correlation has a value1 or − 1 (i.e., 1 is a positive and − 1 is negative correlation). In comparison to lighter colors, deeper shades indicate higher values. Different hues of color stand in for the middle values. There is an entirely new color for each variation in value.
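The preprocessing keys listed above translate almost directly into pandas, seaborn and scikit-learn calls. The sketch below is a generic illustration; the file name hospitals.csv and the column name Specialty are assumptions, not the authors' exact notebook.

```python
import pandas as pd
import seaborn as sns
from sklearn.preprocessing import LabelEncoder

df = pd.read_csv("hospitals.csv")          # assumed file with Area, Specialty, Ratings, ... columns

print(df.isnull().sum())                   # is null / not null check for missing values
print(df["Specialty"].unique())            # distinct values of a categorical attribute

df = df.drop_duplicates()                  # remove duplicate hospital records
df["Specialty"] = LabelEncoder().fit_transform(df["Specialty"])  # non-numeric values -> numeric codes

sns.pairplot(df.select_dtypes("number"))                    # pairwise relationships, as in Fig. 1
sns.heatmap(df.select_dtypes("number").corr(), annot=True)  # correlation heatmap, as in Fig. 2
```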
Fig. 1 Pair plot
Fig. 2 Heatmap
3.3 Model Building Several models such as PCA, Autoencoder, k-means clustering, DBN, RBM, HTM, CNNs, and SVMs can be chosen according to the objective. This model is compiled using k-means clustering as the data set has not been tagged with labels identifying characteristics, properties, or classifications. To identify the value of k, which aids in proper dataset clustering, the elbow method is specifically used on the dataset. This value can also be used in graphical representation.
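A minimal version of the clustering step is sketched below. The synthetic blobs are only a stand-in for the encoded hospital features (specialty, location, rating) prepared in Sect. 3.2, and the parameter values are illustrative.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Stand-in for the pre-processed hospital feature matrix
X, _ = make_blobs(n_samples=300, centers=5, n_features=3, random_state=42)

kmeans = KMeans(n_clusters=5, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)

print(kmeans.inertia_)           # within-cluster sum of squared distances, used in Sect. 4
print(kmeans.cluster_centers_)   # one centroid per cluster
```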
3.4 Evaluation Homogeneity: 0.1836464702880451 Completeness: 0.1837407327840609 V-measure: 0.18369358944333708
The quality of the clustering operation has been calculated with homogeneity of 0.1836464702880451, completeness of 0.1837407327840609 and v-measure as 0.18369358944333708. The standard values lies between 0.0 and 1.0.
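The three scores quoted above correspond to scikit-learn's clustering metrics, which compare the cluster assignments against a reference labelling (the text does not state which reference labels were used, so y_true below is a stand-in). A minimal sketch:

```python
from sklearn.metrics import homogeneity_score, completeness_score, v_measure_score

# y_true: reference labels, y_pred: cluster assignments from K-means (both assumed available)
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

print("Homogeneity:", homogeneity_score(y_true, y_pred))
print("Completeness:", completeness_score(y_true, y_pred))
print("V-measure:", v_measure_score(y_true, y_pred))   # harmonic mean of the two
```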
3.5 Clustering Kernelized K-Means is implemented within Spectral Clustering in our data (Fig. 3). The graph uses the nearest neighbours to give a higher-dimensional representation of the data and assigns labels.
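Scikit-learn's SpectralClustering with a nearest-neighbours affinity is one way to realize the kernelized variant described here. The sketch below uses a synthetic two-moons set purely for illustration; it is not the authors' configuration.

```python
from sklearn.cluster import SpectralClustering
from sklearn.datasets import make_moons

X, _ = make_moons(n_samples=300, noise=0.05, random_state=42)

spectral = SpectralClustering(n_clusters=2, affinity="nearest_neighbors",
                              n_neighbors=10, assign_labels="kmeans", random_state=42)
labels = spectral.fit_predict(X)   # graph-based cluster labels, as visualized in Fig. 3
```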
4 Results This paper shows the implementation of K-Means clustering, the most popular unsupervised clustering algorithm. The K-Means algorithm is based on finding the clusters and data labels for a pre-defined value of K. Different values of K are used so that the results can be compared, and an optimal value of K is then picked such that it provides the best performance. Inertia shows how well the given data is clustered. For the dataset used, the value of K was first chosen as 5 and the clustering was done; the result is shown in Fig. 4. This is not a good fit, as the inertia of the data was 30188.768781342063, which is considered high, so the value of K has to be changed and compared. As this method of trial and error can be exhausting, the elbow method and silhouette analysis are used to determine the optimal value of K.
Fig. 3 Spectral clustering
Fig. 4 Scatter plot
4.1 Elbow Method The elbow method is shown in Fig. 5. From the diagram (Fig. 5), we can conclude that the optimal K value for the given data set is 3 and the distortion score (calculating the distance between points of one cluster to the nearest cluster center) is 132.445. The inertia for the 3 clusters in the given data is equal to 238.6454124349772, which is comparatively less. As the inertia decreases, accuracy of the given model increases, proving the optimal value of K.
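The elbow curve of Fig. 5 can be reproduced by sweeping k and recording the inertia (the distortion score quoted above resembles the output of a KElbowVisualizer-style tool, but a plain inertia sweep illustrates the same idea). A minimal sketch on stand-in data:

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=42)  # stand-in for the hospital features

ks = range(1, 9)
inertias = [KMeans(n_clusters=k, n_init=10, random_state=42).fit(X).inertia_ for k in ks]

plt.plot(ks, inertias, marker="o")
plt.xlabel("k (number of clusters)")
plt.ylabel("Inertia")
plt.show()   # the bend ("elbow") in the curve suggests the optimal k, here k = 3
```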
Fig. 5 Elbow plot
Fig. 6 Scatter plot for K = 3
The result obtained by considering three clusters is represented as a scatter plot in Fig. 6.
4.2 Silhouette Analysis The silhouette analysis is implemented to cross validate the distortion score. Silhouette values near + 1 show that the sample value is distant from the neighboring clusters. When a sample value is 0 it shows that the sample is either on the decision boundary or bordering in between two neighboring clusters and samples are considered to be assigned to the wrong cluster if the values are negative (Figs. 7, 8, 9, 10 and 11). For K = 2, The average silhouette score is 0.636079328831427.
Fig. 7 Silhouette analysis for K = 2
Fig. 8 Silhouette analysis for K = 3
Fig. 9 Silhouette analysis for K = 4
Fig. 10 Silhouette analysis for K = 5
Fig. 11 Silhouette analysis for K = 6
For K = 3, the average silhouette score is 0.7927900477647479.
For K = 4, the average silhouette score is 0.5373697118400748.
For K = 5, the average silhouette score is 0.28926236262963273.
For K = 6, the average silhouette score is 0.041655367366686694.
The silhouette score ranges from − 1 to + 1; a high value indicates that the given object is well matched to its own cluster and poorly matched to its neighboring clusters. When the number of clusters is 3, the average score is 0.7927, which is high compared to the others, so 3 is considered the optimal K value.
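The average scores listed above correspond to sklearn.metrics.silhouette_score computed for each candidate k. A compact sketch on stand-in data:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=300, centers=3, random_state=42)  # stand-in for the prepared features

for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(X)
    print(k, silhouette_score(X, labels))   # the k with the highest average score is preferred
```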
5 Conclusion This paper includes information that helps people to find the best hospital in their locality by using K-Means clustering. It comprises a dataset that has a list of hospitals that provide the best care according to the patient’s requirement. Although there are many algorithms, K-Means clustering is the most suitable algorithm on grounds of it being an unsupervised dataset. To find an optimal value of the k, the elbow method is most efficient which is cross validated with silhouette analysis. The quality of clustering operation depends on normalized conditional entropy measures. The final output is obtained through clustering, finding the accurate value of K for the dataset.
References 1. Nazirun NNN et al (2017) User acceptance analysis of hospital asset management system. In: 2017 international conference on robotics, automation and sciences (ICORAS), pp 1–5. https:/ /doi.org/10.1109/ICORAS.2017.8308054 2. Bohr A, Memarzadeh K (2020) The rise of artificial intelligence in healthcare applications. Artif Intell Healthc 3. Reddy S, Fox J, Purohit MP (2019) Artificial intelligence-enabled healthcare delivery. J Roy Soc Med 112(1):22–28 4. Lv W, Tang W, Huang Hl, Chen T (2021) Research and application of intersection clustering algorithm based on PCA feature extraction and K-means. J Phys: Conf Ser 5. Malingering MB (2015) A survey on various K-means algorithms for clustering. IJCSNS Int J Comput Sci Netw Secur 15(6) 6. Abdul Nazeer KA, Sebastian MP (2009) Improving the accuracy and efficiency of the k-means clustering algorithm. In: Proceedings of the world congress on engineering 2009, vol 1, London, UK, 1–3 July 2009 7. Ghosh S, Dubey SK (2013) Comparative analysis of K-means and fuzzy C-means algorithms. Int J Adv Comput Sci Appl 4(4) 8. Thakare YS, Bagal SB (2015) Performance evaluation of K-means clustering algorithm with various distance metric. Int J Comput Appl (0975 n 8887) 110(11) 9. Amutha R, Renuka K (2015) Different data mining techniques and clustering algorithms. Int J Technol Enhanc Emerg Eng Res 3(11). ISSN: 2347-4289 10. Sakthi M, Thanamani A (2013) An enhanced K means clustering using improved Hopfield artificial neural network and genetic algorithm. Int J Recent Technol Eng (IJRTE) 2. ISSN: 2277-3878 11. Na S, Xumin L, Yong G (2010) Research on K-means clustering algorithm: an improved K-means clustering algorithm. In: 2010 IEEE third international symposium on intelligent information technology and security informatics, 2–4 Apr 2010, pp 63–67 12. Singh N, Singh D (2012) Performance evaluation of K-means and hierarchal clustering in terms of accuracy and running time. Int J Comput Sci Inf Technol 3(3):4119–4412 13. Yan W, Brahmakshatriya U, Xue Y, Gilder M, Wise B (2013) p-PIC: parallel power iteration clustering for big data. J Parallel Distrib Comput 73(3):352–359 14. Cui X, Zhu P, Yang X, Li K, Ji C (2014) Optimized big data K-means clustering using MapReduce. J Supercomput 70:1249–1259 15. Shafeeq A, Hareesha K (2012) Dynamic clustering of data with modified K-means algorithm. In: International conference on information and computer networks, vol 27 16. Kodipalli A, Guha S, Dasar S, Ismail T (2022) An inception-ResNet deep learning approach to classify tumours in the ovary as benign and malignant. Expert Syst e13215 17. Ruchitha PJ, Richitha YS, Kodipalli A, Martis RJ (2021) Segmentation of ovarian cancer using active contour and random walker algorithm. In: 2021 5th international conference on electrical, electronics, communication, computer technologies and optimization techniques (ICEECCOT). IEEE, pp 238–241 18. Kodipalli A, Devi S, Dasar S, Ismail T (2022) Segmentation and classification of ovarian cancer based on conditional adversarial image to image translation approach. Expert Syst e13193 19. Ruchitha PJ, Sai RY, Kodipalli A, Martis RJ, Dasar S, Ismail T (2022) Comparative analysis of active contour random walker and watershed algorithms in segmentation of ovarian cancer. In: 2022 international conference on distributed computing, VLSI, electrical circuits and robotics (DISCOVER). IEEE, pp 234–238
20. Gururaj V, Ramesh SV, Satheesh S, Kodipalli A, Thimmaraju K (2022) Analysis of deep learning frameworks for object detection in motion. Int J Knowl-Based Intell Eng Syst 26(1):7– 16 21. Guha S, Kodipalli A, Rao T (2022) Computational deep learning models for detection of COVID-19 using chest X-ray images. In: Emerging research in computing, information, communication and applications: proceedings of ERCICA 2022. Springer, Singapore, pp 291–306 22. Rachana PJ, Kodipalli A, Rao T (2022) Comparison between ResNet 16 and inception V4 network for COVID-19 prediction. In: Emerging research in computing, information, communication and applications: proceedings of ERCICA 2022. Springer, Singapore, pp 283–290 23. Zacharia S, Kodipalli A (2022) Covid vaccine adverse side-effects prediction with sequenceto-sequence model. In: Emerging research in computing, information, communication and applications: proceedings of ERCICA 2022. Springer, Singapore, pp 275–281
A Research on the Impact of Big Data Analytics on the Telecommunications Sector Ashok Kumar, Nancy Arya, and Pramod Kumar Sharma
Abstract The telecommunications industry is making significant strides in technological advancements, and Big Data Analytics is playing a crucial role. Unlike in the past, the industry is no longer confined to providing phone and internet services. Big Data Analytics and AI have been successful in replacing obsolete and time-consuming techniques with modern algorithms that simplify the analysis and handling of large amounts of data from diverse consumer bases. Computer vision techniques have proven particularly useful in this regard. This article explores how Big Data Analytics is disrupting the telecoms sector by breaking down boundaries using technologies like Self-optimizing Networks (SON), Robotics Process Automation (RPA), and Chatbots. Keywords Telecommunication · Optimization · Big data analytics · Artificial intelligence · Challenges · Future trends
1 Introduction The telecommunications industry is rapidly adopting big data analytics, with the market estimated to be worth USD 679.0 million in 2019 and expected to grow at a CAGR of 38.4%. As one of the fastest-growing industries, telecoms rely heavily on AI for various aspects of their operations, such as enhancing customer satisfaction and maintaining network integrity. Customer service applications are one of the most common areas where telecom companies utilize AI. Chatbots and virtual assistants are used to handle a significant portion of installation, maintenance, and troubleshooting support inquiries. This approach automates support responses, improves customer satisfaction, and lowers costs for the business. Vodafone Ltd.’s chatbot, A. Kumar (B) · N. Arya · P. K. Sharma Department of Computer Science and Engineering, Shree Guru Gobind Singh Tricentenary University, Gurgaon, India e-mail: [email protected] N. Arya e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_12
Fig. 1 Asia Pacific telecommunication big data analytics and AI market size. Image source: https://www.grandviewresearch.com/
TOBi, for instance, resulted in almost a 68% improvement in customer experience after its introduction to address customer queries (Fig. 1). Telecom companies can improve their customer service and save costs by implementing AI and automation. Big Data Analytics has disrupted the telecommunications sector’s core, with robots taking on human-only tasks like routing traffic through network content analysis. It can also be used to develop self-optimizing networks that can adapt to current conditions and limitations set by designers. Big data analytics helps identify and fix network issues, including Service Level Agreement (SLA) breaches. The rise of Over-The-Top (OTT) services like video streaming has transformed the distribution and consumption of audio-visual content, increasing the demand for bandwidth and operational expenses. The use of big data analytics and AI in network configuration and maintenance can reduce operating costs in the telecom industry. Automation in the telecom sector expedites the delivery of new services and onboarding of new customers. The telecom industry is transitioning from 4 to 5G mobile communications, promising high-speed data transfer rates with low latency, and the development of infrastructure that can support IoT-controlled sectors. Google and AT&T Intellectual Property, for instance, announced a collaboration in March 2020 to make Google Cloud services more accessible to enterprises through 5G connectivity. Operators and vendors are combining AI/machine learning with networking capabilities to create innovative 5G solutions, with research and trials ongoing for 6G Networks.
2 Trends in the Implementation of Big Data Analytics in the Telecom Industry The telecommunications industry utilizes Big Data Analytics and Artificial Intelligence (AI) to enhance customer satisfaction and gain valuable business insights. The AI value chain comprises Big Data Analytics and AI solution providers, system integrators, and end-users who contribute to product development. Various infrastructural facilities, technological tools, and data formats are required for training and implementing machine learning and computer vision applications. Companies such as IBM, Microsoft, Nvidia, and Intel offer machine learning training resources. System integrators and infrastructure builders play a crucial role in the AI value chain. Original equipment manufacturers (OEMs) collaborate with system integrators to provide comprehensive support and preventive maintenance. AI applications such as predictive maintenance, network optimization, anomaly detection, robotic process automation, and fraud detection are widely adopted in the telecommunications industry. For example, Nokia introduced AVA Telco AI as a Service in May 2021, which uses cloud-based big data analytics technologies to automate network management and service assurance for communication service providers. Vodafone introduced its machine learning chatbot “TOBi” in June 2019 to enable customer support agents to focus on more complex issues. Governments worldwide are expediting their efforts to deploy 5G networks to facilitate digital transformation, edge-enabled IoT, and AI in wireless communication. The trend toward AI adoption in the telecommunications industry is driven by the demand for better services and user experience. As more 5G deployments occur, new markets are expected to emerge for telecom operators to provide process automation services and outsourced IT services powered by edge computing and AI. Leading telecom companies such as Charter Communications, AT&T, and Verizon are expected to make significant AI investments. Manufacturers can offer additional services such as self-diagnostics software and remote monitoring, and a service agreement with manufacturers can provide application industries with design, integration, and delivery benefits. Field experts with factory training assist with installing and maintaining these systems in the telecommunications industry (Fig. 2). The telecommunications industry can save significant amounts of money by automating customer support, which is expected to lead to the rapid growth of the virtual assistance sector. Machine learning algorithms, such as chatbots, can automate customer inquiries and direct them to the most qualified personnel in the telecom industry. The customer analytics sub-segment is expected to dominate the market due to the increasing demand for real-time behavioral insights. Big Data Analytics can help service providers collect and analyze subscriber data, which can be used for targeted subscription offers and advertising, among other things.
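To make the customer-analytics workload described above more concrete, the following hypothetical Python sketch (not drawn from the paper or from any operator's system; the features, coefficients, and data are synthetic stand-ins) scores churn propensity from basic subscriber usage attributes, the kind of model that underpins targeted subscription offers and retention campaigns:

```python
# Hypothetical illustration: a minimal churn-propensity model on synthetic
# subscriber features (data usage, dropped calls, tenure).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 1000
X = np.column_stack([
    rng.gamma(2.0, 5.0, n),        # monthly data usage (GB)
    rng.poisson(3, n),             # dropped calls per month
    rng.integers(1, 60, n),        # tenure in months
])
# Synthetic label: churn is more likely with many dropped calls and short tenure.
logits = 0.4 * X[:, 1] - 0.05 * X[:, 2] - 1.0
y = (rng.random(n) < 1 / (1 + np.exp(-logits))).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
scores = model.predict_proba(X_te)[:, 1]       # churn propensity per subscriber
print("ROC AUC:", round(roc_auc_score(y_te, scores), 3))
```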
Fig. 2 Global big data analytics and AI in telecommunication market share, by application. Image source: https://www.grandviewresearch.com/
3 Regional Insight North America is leading the market as an early adopter of advanced technologies, particularly in the use of automation and AI in customer service and network optimization. AT&T launched mobile 5G in the US in 2018 with edge AI computing, while US telecom companies are using Big Data Analytics-based solutions from companies like CUJO LLC to secure their networks. Meanwhile, European telecom companies are investing heavily in AI, with a capital expenditure of $72.21 billion on 5G and FTTH networks in 2020. While Europe has strict regulatory requirements, projections suggest that the region will continue to increase its use of Big Data Analytics (AI). Asia–Pacific is expected to have the highest compound annual growth rate (CAGR) due to rapid technological development in countries like China and India. Middle Eastern and African markets are also seen as potential growth areas for the telecom industry, with recent 5G network deployments in Qatar, Saudi Arabia, Bahrain, and the United Arab Emirates. Cooperation among the top five telecom providers in the Middle East to adopt Open RAN solutions is expected to drive demand for Big Data Analytics solutions. Finally, South America is expected to experience increasing prosperity in the telecom industry, with a significant impact from the European and American markets (Table 1).
Table 1 Telecommunication big data analytics and AI market scope

Report attribute | Details
Market size value in 2020 | USD 962.2 million
Revenue forecast in 2027 | USD 9350.3 million
Growth rate | CAGR of 38.4% from 2020 to 2027
Base year | 2019
Historical data | 2016–2018
Forecast period | 2020–2027
Quantitative units | Revenue in USD million and CAGR from 2020 to 2027
Report coverage | Revenue forecasting, company rank, competitive landscape, growth factors, and trends
Segments covered | Application, region
Regional scope | North America, Europe, Asia Pacific, South America, MEA
Country scope | U.S., Canada, Mexico, Germany, U.K., France, China, Japan, India, Brazil
Key companies | Cisco Systems, IBM, Microsoft, Intel Corporation, Google, AT&T Intellectual Property, Nuance Communications Inc., Evolv Technology, H2O.ai, Infosys Limited, Salesforce.com Inc., NVIDIA, and many more

Information source: https://www.grandviewresearch.com/
4 Challenges of the Telecommunication Industry 4.1 Operation-Related Costs Handling complex data sets can be a difficult task that requires a significant amount of resources. Collecting and processing data in real-time can come with high operational costs, including additional time, money, and supplies to maintain quality standards and protect a company’s reputation. Companies are not willing to compromise on such important matters, and thus, they bear all the operational costs themselves.
4.2 Needs for Consumers Telecommunications service providers receive a vast number of requests for network connections and recharge every day. Meeting the needs of consumers has become increasingly challenging for companies, and failing to communicate with them in a timely manner can lead to a loss of customer trust.
4.3 Information Security and Breaches When a telecommunications company tries to expand its business while still safeguarding client information, it faces a big issue. Monitoring call and message quality is crucial if a telecoms operator is to deliver the level of service that customers demand. Numerous surveys show that concerns over data theft, identity fraud, and financial losses continue to exist. Despite the fact that this is a common desire, many telecom providers simply do not have the resources to grant customers complete access to and control over their data. For the ecosystem to remain risk-free and secure, customer trust is essential. Data breaches would be completely avoided if this were put into practice.
4.4 Collaboration Remote collaboration became a major challenge during the peak of the COVID-19 epidemic: while several governments imposed lockdowns and mandated "work from home" policies, traditional telecom companies carried on with their operations from physical sites across the globe for their contact centers and customer-service franchises. Many companies were forced to close, which resulted in many individuals losing their jobs. Remote work was a substantial obstacle for these organizations because much of the necessary equipment was kept on-site. The telecommunications industry needs modern technological solutions that allow staff to continue functioning in the face of catastrophes like this.
4.5 High-Speed Data-in-Motion Petabytes of data can be produced in a matter of minutes by large-scale industrial IoT, smart cities, and autonomous vehicles. This data speed will increase, thanks to 5G’s connection and low-latency transmission. For low-latency computation and storage architectures on the cloud, lightning-fast read/writes will require advanced fog/cloud infrastructure support.
4.6 Support for Application and Network Intelligence The 5G network needs to be much more than just a massive data conduit. Big data needs to be integrated into the fabric of 5G architectures as well as supported by analytics in order to support dispersed network and application intelligence use cases.
4.7 Real-Time Actionable Insights Low latency is an essential feature of 5G networks, especially in applications that require fast data transfers to the cloud, real-time analytics at the edge, and ultra-low-latency data persistence for real-time decision-making in critical applications like public safety, emergency care, and security surveillance.
5 Telecommunication Industry, Big Data Analytics, and Customer Services Customers today have high expectations when it comes to customer service, and they expect personalized experiences and quick resolutions to their problems. AI can help facilitate fruitful client connections and offer support through individualized follow-up. Additionally, AI is essential for data security, as it can detect 95% of threats using AI-based techniques. AI can also be utilized for mobile-tower optimization, reducing labor costs and enabling real-time notifications in the case of a threat. The telecommunications industry is rapidly adopting AI, with big data analytics expected to increase by 42% in the coming year. Telecom companies all over the world are exploring the potential of AI algorithms. AT&T, the largest telecommunications company in the world, has observed a 5% increase in output after incorporating various types of machine learning and big data analytics. Vodafone, in collaboration with IBM, offers AI support through a chatbot, which has cutting-edge features and is used as a travel guide and an HR assistant.
6 Conclusion Big Data Analytics is a valuable tool for many companies, including those in the telecommunications sector. By providing relevant services and information, telecom providers can better understand their customers and build trust-based relationships. Additionally, carriers can monitor equipment conditions and prevent fraud. To utilize this tool effectively, companies need to find appropriate software or vendors. The use of Big Data Analytics is rapidly increasing in the global telecommunications industry, with top companies investing in cognitive technologies to improve their products and services. To stay competitive, the telecoms sector needs to incorporate advanced AI ideas. The integration of AI and cloud-based technology has made the already competitive telecom industry even more so, leading to expansion, new prospects, and innovation. 5G, the Internet of Things, and cloud computing have all contributed to better network and traffic maintenance, increased capacity at lower costs, and satisfied
customers. However, widespread implementation of Big Data Analytics and AI also presents new challenges that need to be addressed. It is important for all those interested in using this technology to complete this stage in their own call centers or offices.
References 1. Guzman AL, Lewis SC (2019) Artificial intelligence and communication: a human-machine communication research agenda. Sage Publishing, vol 65, p 5 2. Khatib MME, Nakeeb AA, Ahmed G (2019) Integration of cloud computing with artificial intelligence and its impact on telecom sector—a study case. iBusiness 11:8 3. Khrais LT (2020) Role of artificial intelligence in shaping consumer demand in e-commerce. J Future Internet 12:12 4. Kumar A (2018) Artificial intelligence for place-time convolved wireless communication networks. ICT Discov 1:83 5. Laghari KUR, Yahia GI, Crespi N (2019) Analysis of telecommunication management technologies. Institute Telecom, vol 2, p 6 6. Li C, Fan J, Li M (2019) Navigation performance comparison of ACE-BOC signal and TD-Alt BOC signal. Springer, vol 287, p 52 7. Li Z, Wang X, Li M, Han S (2019) An adaptive window time-frequency analysis method based on Short-Time Fourier Transform, vol 287. Springer, p 91 8. Liang W, Sun M, He B (2018) New Technology brings new opportunity for telecommunication carriers: artificial intelligent applications and practices in telecom operators. ICT Discov 1:121 9. Ma J, Shi B, Che F, Zhang S (2019) Research on evaluation method of cooperative jamming effect in cognitive confrontation, vol 287. Springer, p 40 10. Thakkar HK, Desai A, Ghosh S, Singh P, Sharma G (2022) Clairvoyant: AdaBoost with cost enabled cost-sensitive classifier for customer churn prediction. Comput Intell Neurosci: 9028580. https://doi.org/10.1155/2022/9028580 11. Optimization of artificial intelligence in telecommunication. In: Proceedings of the second Asia Pacific international conference on industrial engineering and operations management Surakarta, Indonesia, p 09 (2021) 12. A churn prediction model using random forest: analysis of machine learning techniques for churn prediction and factor identification in telecom sector, vol 7 (2019) 13. Artificial intelligence driven 5G and beyond networks. Telecom IT 10(2):1–13 (2022). https:/ /doi.org/10.31854/2307-1303-2022-10-2-1-13 14. Anomaly detection in telecommunication network performance data. In: Proceedings of the 2007 international conference on artificial intelligence, ICAI 2007, vol 2, pp 433–438, 01/12/ 2007 15. Customer satisfaction index. World Academy of Science, Engineering and Technology, 69 (2012) 16. Hair JF et al (1998) Multivariate data analysis, 5th edn. Prentice-Hall, Inc., New Jersey 17. Ali T, Ozkan C (2007) Development of a customer satisfaction index model. J Industr Manage Data Syst 107(5) 18. Optimization of artificial intelligence in telecommunication proceedings of the second Asia Pacific international conference on industrial engineering and operations management Surakarta, Indonesia, 14–16 Sept 2021. Int J Econ Bus Manage Stud 3(2):55–66
Face Mask Isolation Canister Design for Healthcare Sector Towards Preventive Approach Sushant Dalvi, Aditya Deore, Rishi Mutagekar, Sushant Kadam, Avinash Somatkar, and Parikshit N. Mahalle
Abstract In recent times we have seen the importance of wearing a face mask to protect ourselves from various bacteria and viruses. Just as wearing a mask protects us, it is equally important to store and isolate the mask properly. Keeping a used face mask on a surface without disinfecting and isolating it can cause a serious health hazard both to other people and to the wearer, because the surface on which the mask is kept also carries a number of bacteria and viruses. There are currently no special products available in the market to store and isolate the face masks of multiple users at the same time. Looking at this scenario, we decided to design a face mask isolation canister that is simple, sturdy, and easy to use. The Face Mask Isolation Canister is designed in such a way that the mask does not have any contact with a surface, and it is capable of storing and isolating the masks of multiple users at the same time. Keywords Facemask isolation canister · Healthcare sector · Viruses · Bacteria
S. Dalvi (B) · A. Deore · R. Mutagekar · S. Kadam · A. Somatkar · P. N. Mahalle Vishwakarma Institute of Information Technology, Pune, India e-mail: [email protected] A. Deore e-mail: [email protected] R. Mutagekar e-mail: [email protected] S. Kadam e-mail: [email protected] A. Somatkar e-mail: [email protected] P. N. Mahalle e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_13
1 Introduction The face mask isolation canister can be used efficiently in a variety of places where a mask has to be kept and isolated for a certain amount of time. Isolation of a mask means its complete separation from the surroundings. When all family members reach home, they often put their masks on the same hanger regardless of the contact between the masks, which is not only unhygienic but also dangerous, as we have seen during the pandemic [1]. Masks, once removed, should not be touched or brought into contact with surfaces until they are disinfected. Face masks are used to prevent viral or other airborne infections [2, 3]. Different types of masks are available in the market, such as disposable surgical masks [4], N95, KN95, and cotton masks. Disposable surgical masks [4, 5] are thin and paper-like, generally light blue and white in colour, and can be used only once. N95, KN95, and surgical masks are more effective than others, but their lifespan is limited [6]. Cotton masks are washable and can therefore be reused, and they are used extensively [7]. The main precaution with a washable mask [8] is that, if it is removed for a moment to eat or on returning home, it should be properly isolated and disinfected before being used again throughout the day; the key point is to put it in a proper place after disinfection with a sanitizer spray bottle. People tend to keep their masks anywhere: when guests visit a house, masks end up on sofas, in pockets, purses, or bags, and the same picture can be seen in restaurants and cafes, where customers put their masks on the table, in their pockets, or anywhere else. For the reasons mentioned above, something is needed in which masks can be kept separately. To solve this problem, we decided to design a face mask isolation canister that keeps the mask isolated and safe. According to the guidelines given by the WHO on how to remove a mask [9, 10], masks should be removed using the straps from behind the head or ears, without touching the front of the mask; to remove the mask, lean forward and pull the mask away from your face. The face mask isolation canister is designed so that the front side of the mask remains untouched and contact occurs only with the straps throughout the isolation process. The design of the canister is simple, so it is easy to handle; it requires little space, its maintenance cost is low, and the overall model is economical. For families, restaurants, and cafes, keeping a mask in a face mask isolation canister can be a great option.
2 Motivation Surgeons and doctors work tirelessly for long, inhumane hours and constantly wear face masks or surgical masks. Surveys and studies have shown that prolonged use of N95 and surgical masks by healthcare professionals causes adverse effects such as headaches, rash, acne, skin breakdown, and impaired cognition in the majority of those surveyed.
Tight straps and pressure on the superficial facial and cervical nerves are the mechanical features causing headaches [11]. In this profession one is expected to stay calm and relaxed throughout the day, so it is recommended to take a few moments of short break to relax in the cabin; taking more rest breaks is also associated with lower levels of overall fatigue during day shifts [12]. Therefore, after removing the mask for a moment, there is a need for a reliable solution that can keep the mask safe and isolated for a certain amount of time. Face masks have become an inevitable part of everyone's life since the outbreak of the coronavirus pandemic. Things are now getting back to normal, but the pandemic has taught us several lessons over these years, and there remains a possibility of an increase in cases leading to the next wave of the pandemic. If this happens, it will again be mandatory to wear a face mask while leaving the house for work, enjoyment, or any other task. In the moment of removing the mask and feeling free, it has been observed that people forget to place their mask in an isolated and clean place and end up putting it on tables, hangers, and in pockets, eventually increasing its contact with more harmful or unsafe surfaces. Even when the surface seems clean, people forget that their masks are contaminated, and by placing them on the table they contaminate the table as well. The present invention relates to the storage of face masks in a canister, which isolates each mask and provides separate compartments to keep each mask separately. Nowadays, every citizen needs to wear a mask in public for their own safety. Different types of masks are available in the market, and these masks can be stored for reuse throughout the day. If masks are placed on sofa seats or dining tables, the surfaces get contaminated by the bacteria present on the outer walls of the mask; these places remain contaminated when the masks are picked up again and may come in contact with some other person. Families with babies should take even more care when placing their masks, as babies have no idea whether they should touch certain surfaces, or that they should clean their hands if they come in contact with masks or with places where a mask was kept. The mask should therefore be isolated in a closed container without touching any surface. A few people keep their mask in a box or container to isolate it, but only one mask can be stored in one particular container at a time, and it still does not completely isolate the mask, because the outer surface of the mask is in direct contact with the surface of the container or box. So it is neither convenient for all family members to use throughout the day, nor does it completely isolate the mask and keep it safe.
3 Related Work The ultraviolet face mask disinfection device [13] disclosed in patent CN103566501A comprises a supporting piece, at least two ultraviolet LEDs (light-emitting diodes), and a face mask, wherein the ultraviolet LEDs are fixedly arranged inside the supporting piece and the light emitted by the at least two ultraviolet LEDs has different dominant wavelengths.
This is good for disinfection, but the product is not meant for storing the masks of a number of persons separately, it is not cost effective, and it requires electric power to operate. A face mask container [14] is a design consisting of a plastic box and a leather pouch to store a mask temporarily when not in use. Within the containers, the masks are secured by their straps and held flat in their natural positions without the risk of squeezing and self- or environmental contamination. This way they can easily be picked up by the straps and worn safely without touching the front or inside of the mask. Vents are created on both sides of the face mask containers to improve ventilation when they are closed.
4 Gap Analysis As of now, no product is available for households to keep the masks of every individual in a single place yet separately. Some products are available in the market: the ultraviolet face mask disinfection device [13] is good for disinfection, but it does not store the masks of several persons separately, is not cost effective, and requires electric power to operate. The face mask container [14] is good for storing a mask temporarily when not in use, but there is still a chance of direct contact between the mask and the surface, specifically in the leather pouch, and there is a limit on the number of masks that can be kept isolated at the same time. To overcome these issues, we have designed a face mask isolation canister, which is the most reliable solution as of now. It is cheap, sturdy, and easy to handle, and it is a must-have product for every house and every table in a restaurant or cafe. Keeping the mask away from surfaces that may contain bacteria or viruses increases the life of the mask and protects the user, and the face mask isolation canister brings exactly what is needed at this stage. Placing a mask in the canister ensures that it does not touch any surface or come in contact with any other mask. Compartments are provided to store and isolate the face masks of a number of persons at the same time. The construction of the canister makes it sturdy as well as attractive. It does not contain any electronic parts and is therefore simple to use as well as cheaper, and it can be used even by the smallest child in the family because no complex mechanisms are involved. Hence, people should turn to the face mask isolation canister to keep their masks safe from bacteria.
5 Proposed Design
Figure 1: The frame is the framework that supports the entire assembly of the canister. It is made of wood for a strong and sturdy design; the wooden material also makes it easy to assemble all the components of the canister, which keeps it economical. It consists of 5 compartments to keep and isolate the masks of 4 persons and 1 guest. The number of compartments may vary depending upon the number of family members living in the house.
Figure 2: The slider is made of a transparent material such as fibre. At the front of the slider a wooden strip is provided for an easy grip to hold and move the slider back and forth to open and close the compartment. A clearance is also provided for easy access to move the slider.
Figure 3: The front side of the canister consists of a transparent wall made of fibre material, giving an easy view of the mask inside each person's compartment.
Figure 4: A simple sliding mechanism is provided for the slider. An extension is given on the top side, support strips are mounted inside each compartment, and a clearance is provided. The slider slides easily through the clearances provided between the extension and the support strip, so the sliding mechanism does not require any kind of maintenance.
Fig. 1 Frame
Fig. 2 Slider
Fig. 3 Transparent front side
Fig. 4 Slider mechanism
Figure 5: Hooks are placed at the centre of the support strip of the slider and are used for hanging the mask; this position of the hook ensures that there is enough space to hang any type of mask. This way, the mask does not have direct contact with the walls of the compartment.
Fig. 5 Hook
Fig. 6 Vacuum cups
Fig. 7 Sectional side view
Figure 6: Vacuum cups are used as grippers to stop the canister from slipping. They prevent the entire assembly from being shaken loose when a mask is being put in or taken out. When the vacuum is created, the cup rim seals entirely against atmospheric air and the inside air is promptly evacuated, giving the canister a secure hold (Figs. 7 and 8).
6 Working The Face Mask Isolation Canister is very efficient for isolating and keeping a mask safe. The process begins with placing the mask in the container, but before that, the mask and the compartment in which it is to be stored must be properly sanitized with the help of a sanitizer spray bottle. Once the mask and the respective compartment are properly sanitized, the canister is ready to store and isolate the mask. On the top side of each compartment, a wooden strip attached to the slider is provided, with a clearance to hold it with the fingers; by simply sliding the slider outwards the compartment can be opened. Inside the compartment a hook is provided to hang the mask; the hook is fitted exactly at the centre of each compartment so that the mask does not have direct contact with any surface such as the compartment walls. Now take the mask without touching its front side, hold it by its straps
Fig. 8 Sectional front view
and gently place it on the hook inside the compartment. After placing the mask inside, push the slider back to its initial position, which closes the compartment. Once the compartment is completely closed, the mask is properly isolated. To take the mask out of the compartment, follow the same procedure: slide the slider outwards, hold the straps of the mask and gently take it out of the canister, then push the slider back in and close the compartment.
7 Applications
• Cabins of doctors and surgeons
• Home-based small-scale industries
• Hotels and lodges
• Restaurants and cafes
• Canteens of industries
8 Future Scope The Face Mask Isolation Canister can be modified depending upon the requirements.
• UV rays can be added in the canister to disinfect the mask at the same time as isolating it.
• Automatic sanitizing sprays can also be installed in the compartments to disinfect the mask instead of manual sanitization.
• Automatic opening of the sliders can be provided using proximity sensors and optical sensors, and can be implemented where physical touching is strictly prohibited.
9 Conclusion Currently there are some devices available in the market for sterilizing masks, but they cannot store the mask. We routinely see people in families, restaurants, cafes, and industries storing their masks in pockets, on hangers, on tables, and anywhere else, which means that the mask keeps the person safe for a while but contaminates the surfaces it comes in contact with. The reason people turn to these places is the lack of a proper gadget for storing used masks. To overcome this problem we have come up with the Face Mask Isolation Canister. This device stores used masks without allowing them to touch any surface or any other mask, which minimizes further contamination. The Face Mask Isolation Canister provides a separate compartment for each mask, and the number of compartments can be changed as per the needs of the users. The canister is a very economical solution to this daily problem. Another reason why the face mask isolation canister is a useful solution is that the lack of electronics in the gadget makes it more user-friendly.
References 1. Ippolito M, Vitale F, Accurso G, Iozzo P, Gregoretti C, Giarratano A, Cortegiani A (2020) Medical masks and Respirators for the protection of healthcare workers from SARS-CoV2 and other viruses. Pulmonology 26(4):204–212. https://www.sciencedirect.com/science/art icle/pii/S253104372030088X 2. Matuschek C, Moll F, Fangerau H, Fischer JC, Zänker K, van Griensven M, Schneider M, Kindgen-Milles D, Knoefel WT, Lichtenberg A, Tamaskovics B, Haussmann J (2020) Face masks: benefits and risks during the COVID-19 crisis. Euro J Med Res 25(1):1–8. https://doi. org/10.1186/s40001-020-00430-5 3. MacIntyre CR, Cauchemez S, Dwyer DE, Seale H, Cheung P, Browne G, Fasher M, Wood J, Gao Z, Booy R, Ferguson N (2009) Face mask use and control of respiratory virus transmission in households. Emerg Infect Dis 15(2):233. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC 2662657/ 4. Watts JHG (1965) U.S. patent no. 3,170,461. U.S. Patent and Trademark Office, Washington, DC. https://patents.google.com/patent/US3170461A/en 5. Rockwood CA, O’donoghue DH (1960) The surgical mask: its development, usage, and efficiency: a review of the literature, and new experimental studies. AMA Archiv Surg 80(6):963–971. https://jamanetwork.com/journals/jamasurgery/article-abstract/558661 6. Wang MX, Gwee SXW, Chua PEY, Pang J (2020) Effectiveness of surgical face masks in reducing acute respiratory infections in non-healthcare settings: a systematic review and metaanalysis. Front Med 582. https://doi.org/10.3389/fmed.2020.564280/full 7. Duncan S, Bodurtha P, Naqvi S (2021) The protective performance of reusable cloth face masks, disposable procedure masks, KN95 masks and N95 respirators: filtration and total
inward leakage. PLoS One 16(10):e0258191. https://doi.org/10.1371/journal.pone.0258191
8. Lee KP, Yip J, Kan CW, Chiou JC, Yung KF (2020) Reusable face masks as alternative for disposable medical masks: factors that affect their wear-comfort. Int J Environ Res Publ Health 17(18):6623. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7558362/
9. https://www.who.int/news-room/questions-and-answers/item/coronavirus-disease-covid-19-masks
10. World Health Organization (2022) Infection prevention and control in the context of coronavirus disease (COVID-19): a living guideline, 25 Apr 2022: updated chapter: mask use, part 1: health care settings (No. WHO/2019-nCoV/ipc/guideline/2022.2). World Health Organization. https://apps.who.int/iris/bitstream/handle/10665/353565/WHO-2019-nCoV-ipc-guideline-2022.2-eng.pdf?sequence=1
11. Rosner E (2020) Adverse effects of prolonged mask use among healthcare professionals during COVID-19. J Infect Dis Epidemiol 6(3):130. https://pdfs.semanticscholar.org/bdea/3aef30775ad4505dc7a7c19e9b41ff89baef.pdf
12. Blasche G, Arlinghaus A, Crevenna R (2022) The impact of rest breaks on subjective fatigue in physicians of the General Hospital of Vienna. Wiener klinische Wochenschrift 134(3):156–161. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8857152/
13. Pohle L (1995) US photovoltaic patents: 1991–1993 (No. NREL/SP-410-7689). National Renewable Energy Lab. (NREL), Golden, CO (United States). https://patents.google.com/patent/CN103566501A/en
14. Ogoina D (2020) Improving appropriate use of medical masks for COVID-19 prevention: the role of face mask containers. Am J Trop Med Hygiene 103(3):965. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7470593/
Modeling the Impact of Access to Finance on MSMEs’ Contribution Towards Exports of the Economy Md. Motahar Hossain
and Nitin Pathak
Abstract A trade surplus contributes greatly towards the growth and development of an economy. Higher exports indicate a high level of goods and services coming from a country's industrial units, which leads to a greater number of jobs in industrial operations. The Micro, Small, and Medium Enterprise (MSME) sector accounts for a large portion of India's total exports. This research study aimed at exploring the trends of the sector's export contribution. Moreover, the study analyzed the relationship between exports and the loans and advances received by the MSME sector from financial institutions (FIs). The study revealed that the MSME sector's exports accounted for nearly 47% (on average) of total Indian exports during the study period of 2011–2021. The study also found a high correlation between exports and the loan outstanding (received from different FIs) of MSMEs. Finally, the study recommended a lower lending rate, limited collateral, a shorter time to disburse loans, and easy access to finance for accelerating the MSMEs' export contribution to the country's economy. Keywords Access to finance · Financial institutions · Loans and advances · Exports
1 Introduction Micro, Small, and Medium Enterprises (MSMEs) are the backbone of economies in many countries of the universe. These small business enterprises require relatively low funding, mediocre technological integration, and semi-skilled labor force to conduct their business functions [1]. MSMEs have been contributing significantly Md. Motahar Hossain (B) Department of Business Management, University School of Business, Chandigarh University, Mohali, Punjab 140413, India e-mail: [email protected] N. Pathak Department of Commerce, University School of Business, Chandigarh University, Mohali, Punjab 140413, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_14
towards employment generation, production, and exports worldwide. Small business ventures are more likely to contribute to the developing countries like Bangladesh, India, Sri Lanka, Tanzania, etc., due to their ease of formation and operations. About 63.30 million registered and non-registered small and medium-sized businesses make up 95% of the industrial units in India. MSMEs also employed around 106 million, which accounts for 40% of the Indian total workforce. Moreover, manufacturing MSMEs contribute approximately 6.11% of GDP, while the service sector represents approximately 24.63%. Furthermore, 40% of total Indian exports are contributed by MSMEs [2]. Firms’ capability to expand the business activities beyond the domestic boarder increases total export of every economy in the world. Exporting goods and services to foreign markets require various costs and information concerning foreign markets, marketing strategies, local preferences, and distribution networks etc. [3]. The firms need to incur the costs associated with export in advance. Besides, firms must focus on quality assurance and productivity to ensure competitive advantages in the highly competitive market. Sustainable productivity linked with sophisticated technological integration. Hence, a big amount of investment is associated with exporting goods and services by crossing the border of a country. So, access to finance is playing a crucial role in this connection. The financial inclusion of Micro, Small and Medium Enterprises (MSMEs) has been of extraordinary attention among strategy developers, financial analysts, brokers, business specialists and researchers in view of their importance in driving of monetary development in private sectors all over the world. A strong MSME area contributes definitely to the economy by setting out greater business open doors, creating higher production volumes, expanding trades and presenting development and business venture abilities. Public sector banks, private sector banks, foreign banks, scheduled commercial banks, and Non-Banking Financial Institutions (NBFIs) have been offering decisive monetary assistance to the small and medium undertakings in India [4]. MSME funding has been recognized as a significant obstruction to MSME development. Miserably, the issue is by all accounts as unsettled as it generally has been. However most of the issues are with the MSME themselves, the banks likewise have significant issues in planning financial products for the MSMEs [5]. As a developing country India has a lot of potentiality in the prominent sector. Most of the financial institutions have extended their support towards MSMEs, especially after the covid-19 pandemic. Many small businesses have been struggling to continue their business after dramatic change caused by the covid-19 pandemic [6]. Besides, Service sector MSMEs were hit hardest during the pandemic situation. So the banks and non-banking financial institutions have significant role to rescue the sector from financial deficiencies.
2 Reviews of Related Literatures Increased productivity or sustainable productivity growth leads higher volume of exports. Sophisticated technological integration, financial inclusion, effective management, and business conducive environment can ensure the desired export earning of a country [7]. MSMEs encourage private ownership and entrepreneurial skills, can quickly adjust to shifting market conditions, generate employment, aid in the diversification of economic activities, and significantly contribute to Indian exports and trade. But these small businesses have been facing numerous obstacles in conducting their business activities smoothly and perfectly. Poor infrastructure, unavailability of raw materials, inadequate financial support, inefficient management capability, information gap, and lack of effective marketing strategies are the major constraints of MSME sector in India [8]. Internationalization of MSMEs’ products and services is crucial to enhance the growth of export contribution by the sector. Inadequate knowledge of foreign markets, lack of expertise in conducting international trade is the major barriers of Indian MSMEs for maintaining required sustainable growth towards export contribution [9]. Besides, quality assurance, effective marketing channel, demand forecasting ability, and proper management of supply chain can enhance the export volume of the sector [10]. Financial inclusion is one of the major weapons to reduce the barriers of micro, small, and medium enterprises in India. Most of the enterprises have been facing problems in getting financial assistance from banking and non-banking financial institutions. The availability of appropriate economic resources is important for the sustainable growth and development of MSMEs. Lack of education, proper documentation, and collateral is the major constraints in this connection [11]. Formalized financial institutions don’t rely on the owner-manager of small business ventures. Moreover, they don’t afford adequate collateral and documentation required by the financial institutions for getting financial assistance. Sometimes financial institutions take longer time to disburse loans to the enterprises, average grace period is 2–3 months or above which creates undue pressure on smooth business operations [12]. Finance is known as blood of an organization and recent covid-19 pandemic has been created financial deficiency of every business organization. Indian small business sector is not the exception. Considering the paramount importance of small business ventures, Government should encourage the financial institutions to provide financial assistance in a large scale with minimum formalities. Additionally, financial institutions can reduce their lending rate as well as make ease the loan application procedure to rescue the sector from the damages [13]. The primary sources of finance are private sector banks, public sector banks, scheduled commercial banks, foreign banks and non-bank financial institutions in India. MSMEs have been depending greatly on the financial institution to collect funds for startups, upgraded technological integration, production growth, business internationalization and expansion of the business areas [14]. Timely and adequate
financial support can enhance the contribution growth of MSME sector noticeably over the periods.
3 Objectives of the Research The research has been conducted with the following objectives:
(1) To examine the trends of export contribution by the MSME sector towards the Indian economy during 2011–2021.
(2) To analyze the impact of financial assistance on MSMEs' export contribution during the study period.
4 Methodology of the Study The research study is empirical and quantitative. Secondary data have been collected to investigate the relationship between loan outstanding and the contribution of MSMEs towards exports in the Indian economy. The secondary data sources are annual reports of the M/o MSME, the RBI, the World Bank, newspapers, and the Internet. Additionally, SPSS software has been used to analyze the data.
5 Linear Regression Model A regression model has been developed to test the impact of MSMEs' access to finance on the export share of the sector, where the loans and advances received by the MSME sector from Financial Institutions (FIs) (loan outstanding) is the independent variable and the export contribution of MSMEs is the dependent variable. SPSS version 26 has been used to analyze the data. Hypothesis: H01: There is no significant correlation between access to finance and the export contribution of MSMEs in the economy.
Table 1 Year-wise export contribution by Indian MSMEs

Year | Share of MSMEs in total export (%) | Growth (%)
2011–12 | 43 | –
2012–13 | 43 | 0.00
2013–14 | 42 | (0.02)
2014–15 | 45 | 0.07
2015–16 | 50 | 0.11
2016–17 | 50 | 0.00
2017–18 | 49 | (0.02)
2018–19 | 48 | (0.02)
2019–20 | 49.75 | 0.04
2020–21 | 49.35 | (0.01)
Average | 46.91 | 0.15

Source: Financial Express, published on July 19, 2022, 2:37:49 pm
6 Analysis of Data and Discussion
6.1 Export Contribution of Indian MSME Sector
The MSME sector has been contributing significantly to the export earnings of the Indian economy. The government as well as private organizations place great importance on the growth of MSMEs to maintain the sustainable growth of the economy. More importantly, nearly 47% of total Indian exports have been contributed by the sector. The growth rates of the contribution were negligible over the years from 2011 to 2021, and the COVID-19 pandemic restricted the enterprises from conducting their business activities smoothly (Table 1). Figure 1 portrays the trend of MSMEs' export contribution to the Indian economy from 2011–12 to 2020–21. The figure shows that MSMEs contributed the highest share of exports in 2015–16 and 2016–17, after which the share started to decline. After the financial year 2018–19, the trend increased in 2019–20 and then declined again. The lowest share of MSMEs' exports was in the financial year 2013–14.
6.2 Regression Analysis and Their Interpretation
See Table 2.
Fig. 1 The trend of export contribution by Indian MSME sector
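As a rough, assumed illustration (not the authors' workflow), the short Python sketch below recomputes the year-over-year growth figures and the Fig. 1 trend directly from the values reported in Table 1:

```python
# Assumed illustration: recomputing Table 1's growth column and the Fig. 1
# trend from the reported MSME export-share figures.
import pandas as pd

share = pd.Series(
    [43, 43, 42, 45, 50, 50, 49, 48, 49.75, 49.35],
    index=["2011-12", "2012-13", "2013-14", "2014-15", "2015-16",
           "2016-17", "2017-18", "2018-19", "2019-20", "2020-21"],
    name="MSME share of total exports (%)",
)

growth = share.pct_change().round(2)   # year-over-year growth, matching Table 1
print(growth)                          # first value is NaN, shown as '-' in Table 1
print("Average share over 2011-2021: %.2f%%" % share.mean())   # ~46.91

# Plotting the trend shown in Fig. 1 (requires matplotlib):
# import matplotlib.pyplot as plt
# share.plot(marker="o", title="MSME share of Indian exports"); plt.show()
```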
Table 2 MSMEs' share of export and loan outstanding (amount in Rs. billion)

Year | Share of MSMEs in total export (%) | Loans and advances from private and public banks
2011–12 | 43 | 10,568.34
2012–13 | 43 | 13,744.22
2013–14 | 42 | 17,009.29
2014–15 | 45 | 19,291.13
2015–16 | 50 | 23,420.50
2016–17 | 50 | 25,927.95
2017–18 | 49 | 26,484.78
2018–19 | 48 | 31,551.31
2019–20 | 49.75 | 36,198.01
2020–21 | 49.35 | 33,579.60

Source: World Bank and Reserve Bank of India
6.2.1 Model Validity
The regression model is considered significant when the p-value is less than 0.05. The test result shows a p-value of 0.003 (Table 3), which is less than 0.05, so the model is significant at the 95% confidence level.
6.2.2 How the Independent Variable Explains the Dependent Variable
The adjusted R-square value represents how well the independent variable explains the dependent variable when the number of observations is small.
Table 3 ANOVA test result (ANOVAa)

Model 1 | Sum of squares | df | Mean square | F | Sig.
Regression | 67.401 | 1 | 67.401 | 18.215 | 0.003b
Residual | 29.603 | 8 | 3.700 | |
Total | 97.004 | 9 | | |

a. Dependent variable: Exports
b. Predictors: (Constant), Loan outstanding
Table 4 Explanation of dependent variable by the independent variable (Model summaryb)

Model | R | R square | Adjusted R square | Std. error of the estimate
1 | 0.834a | 0.695 | 0.657 | 1.923640

a. Predictors: (Constant), Loan outstanding
b. Dependent variable: Exports
The value of the adjusted R-square in our study is 0.657 (Table 4), i.e., the independent variable explains nearly 66% of the variation in the dependent variable in the model.
6.2.3 Hypothesis Acceptance or Rejection Decision
From the correlations result (Table 5), the study found that the correlation between the dependent and independent variables is 0.834, which indicates a significant correlation between the variables. Therefore, the null hypothesis is rejected and the alternative hypothesis is accepted. It can thus be concluded from the test result that the export contribution of MSMEs in India depends significantly on the loans and advances provided by private banks, public banks, foreign banks, and scheduled commercial banks in India.
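For readers who wish to reproduce these statistics outside SPSS, the following Python sketch (an assumed illustration, not the authors' workflow) fits the same simple linear regression to the year-wise values listed in Table 2 and recovers, approximately, the quantities reported in Tables 3, 4 and 5:

```python
# Assumed illustration: simple linear regression of MSME export share on
# loan outstanding, using the figures reported in Table 2.
import numpy as np
from scipy import stats

loans = np.array([10568.34, 13744.22, 17009.29, 19291.13, 23420.50,
                  25927.95, 26484.78, 31551.31, 36198.01, 33579.60])  # Rs. billion
exports = np.array([43, 43, 42, 45, 50, 50, 49, 48, 49.75, 49.35])    # % share

result = stats.linregress(loans, exports)          # ordinary least squares
n = len(exports)
r_squared = result.rvalue ** 2
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - 2)

print(f"Pearson r        : {result.rvalue:.3f}")   # ~0.834 (Table 5)
print(f"R-square         : {r_squared:.3f}")       # ~0.695 (Table 4)
print(f"Adjusted R-square: {adj_r_squared:.3f}")   # ~0.657 (Table 4)
print(f"p-value (2-sided): {result.pvalue:.3f}")   # ~0.003 (Table 3)
```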
7 Scope for Future Research The study has been conducted by limiting the time frame from 2011 to 2021. Further the data have been collected from secondary sources related to the Indian economy in this study. So, future research can be conducted using wide range of time duration and considering more factors affecting the export contribution by the MSME sector. Moreover, this study has been conducted focusing on Indian economy. Future investigation concerning other developing countries like Bangladesh, China, Malaysia etc. and least developed economies such as Nepal, Bhutan, may generate better result in this regard.
Table 5 Correlations result

Correlations | | Exports | Loan outstanding
Pearson correlation | Exports | 1.000 | 0.834
Pearson correlation | Loan outstanding | 0.834 | 1.000
Sig. (1-tailed) | Exports | | 0.001
Sig. (1-tailed) | Loan outstanding | 0.001 |
N | Exports | 10 | 10
N | Loan outstanding | 10 | 10
8 Conclusion and Recommendations The MSME sector has been playing an important role in the total exports of India; Indian MSMEs have contributed, on average, about 47% of the country's total exports. Access to finance is one of the most significant factors affecting the export contribution of the sector. Private, public, and scheduled commercial banks, foreign banks, and non-banking financial institutions have been prominent providers of financial services to the MSME sector. The study analyzed the correlation between the export contribution and the loan outstanding of the MSME sector during the period 2011–2021. The regression result reveals that MSMEs' export contribution depends significantly on loan outstanding, i.e., loans and advances received from public sector banks, private sector banks, foreign banks, and scheduled commercial banks. The FIs should therefore disburse loans with lower interest, affordable collateral, and a favorable grace period [15]. Besides, the time taken between loan application and loan disbursement should be shortened to maintain the credible growth of the sector. Finally, it can be concluded that institutional financing enhances the pace of growth in production and marketing strategies, which leads to sustainable export growth of the MSME sector.
References 1. Nagaraj R, Vaibhav V (2020) Revising the definition of MSMEs: who is likely to benefit from it? Indian J Labour Econ 63:119–126 2. Statistics, SME Sectors in India. Retrieved from www.msme.gov.in/KPMG/CRISIL/CII on 31 Jan 2023 3. Mukherjee S, Chanda R (2021) Financing constraints and exports: evidence from manufacturing firms in India. Empirical Econ 61(1):309–337 4. Meher BK, Hawaldar IT, Mohapatra L, Spulbar C, Birau R, Rebegea C (2021) The impact of digital banking on the growth of Micro, Small and Medium Enterprises (MSMEs) in India: a case study. Bus Theor Pract 22(1):18–28
5. Chowdhury SA et al (2013) Problems and prospects of SME financing in Bangladesh. Asian Bus Rev 2(1)(4):51 6. Dubey P, Sahu KK (2020) MSMEs in COVID-19 crisis and India’s economic relief package: a critical review. AIJR Preprints, p 207 7. Singh D (2019) Implementation of technology innovation in MSMEs in India: case study in select firms from Northern Region. J Sci Technol Policy Manage 8. Khatri P (2019) A study of the challenges of the Indian MSME sector. IOSR J Bus Manag 21(2):05–13 9. Mukherjee S, Tomar R Internationalization of Indian MSMEs with reference to the Post-Covid Era. SMEs Int Bus 9 10. Jraisat LE, Sawalha IH (2013) Quality control and supply chain management: a contextual perspective and a case study. Supply Chain Manage Int J 18(2):194–207 11. Abdesamed KH, Abd Wahab K (2014) Financing of small and medium enterprises (SMEs): determinants of bank loan application. Afr J Bus Manage 8(17):1 12. Chowdhury MSA, Azam MKG, Islam S (2013) Problems and prospects of SME financing in Bangladesh. Asian Bus Rev 2(2):109–116 13. Ajayi OI, Ajuwon OS, Ikhide S (2021) Access to finance and performance of services sector MSMEs in Nigeria. Oradea J Bus Econ 6(2):8–20 14. Chakraborty DK (2012) Empowering MSMEs for financial inclusions and growth-role of banks and industry association. Change 9(4785.27):32-08 15. Uz Zaman AH, Islam MJ (2011) Small and medium enterprises development in Bangladesh: problems and prospects. ASA Univ Rev 5(1):145–160
Privacy Challenges and Solutions in Implementing Searchable Encryption for Cloud Storage Akshat Patel , Priteshkumar Prajapati , Parth Shah , Vivek Sangani , Madhav Ajwalia , and Trushit Upadhyaya
Abstract With the rapid growth of cloud usage, IoT devices, smart cities, 4G communication, augmented reality, and virtual reality, the requirement for digital storage has also accelerated. By delivering data storage and managing data, cloud storage has become a crucial part of the new eon. The usage of cloud computing has expanded dramatically, and shifting data and information to cloud servers has removed much of the anxiety of storing it on physical storage devices. Protecting the privacy of submitted data is a vital duty in safeguarding the security of sensitive data. Generally, cloud servers are prone to cyberattacks that can lead to the exploitation of sensitive data of enterprises or institutions. Hence, there is a need for efficient strategies that not only preserve the information maintained on the server but also do not hamper the facility of using its services. Searchable Encryption (SE) refers to a technique that enables users to run search operations over encrypted data. Encryption of data is needed to ensure its privacy. Searchable Symmetric Encryption (SSE) allows us to outsource our data secretly while keeping the possibility to search through the encrypted data. In this paper, we discuss searchable encryption in depth, including the efficiency of contemporary SE schemes, privacy challenges, different single and multiple read/write schemes, and some probable future improvements and ideas.
A. Patel (B) · P. Prajapati · P. Shah · V. Sangani · M. Ajwalia · T. Upadhyaya Chandubhai S. Patel Institute of Technology (CSPIT), Faculty of Technology and Engineering (FTE), Charotar University of Science and Technology (CHARUSAT), Changa, Gujarat, India e-mail: [email protected] P. Prajapati e-mail: [email protected] P. Shah e-mail: [email protected] V. Sangani e-mail: [email protected] M. Ajwalia e-mail: [email protected] T. Upadhyaya e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_15
Keywords Cloud security · Cloud storage · Searchable encryption · Searchable symmetric encryption · Search token · Data privacy
1 Introduction Cyberspace plays a vital part in our everyday lives; little is possible without it. The number of internet users has clearly grown over time, and people lead interactive, engaged social lives online. Given these circumstances and the extent of internet use, it is no surprise that cyberattacks have surged, climbing roughly sixfold over the previous several months. Adversaries gain control of whole systems and exploit the data taken from them. Higher participation in online applications and computing implies a significant volume of data around the globe, and storing this vast volume is an essential concern. Cloud computing seems the most realistic answer to this problem: it comprises numerous services, such as storage and servers, covering the whole of cyberspace. As per [12], a total of 5.1 billion breaches were documented, and these attacks cost 22.40 million USD in 2021. Encrypting the data before delivering it to the server is the most acceptable approach for attaining privacy. This approach may ensure privacy, but it presents challenges for the CSP (Cloud Service Provider) while executing specific activities, most notably the search function on the encrypted data: if we want to work on a certain part of the data, this is difficult because we have to go through the complete dataset. SE plays a vital part in fixing this challenge. It permits the data owner to outsource its data to a remote server while preserving the ability to search through it. A substantial quantity of data is compromised owing to accessible SE schemes, which finally leads to cyberattacks. The quantity of compromised data may be decreased using methods like ORAM (Oblivious Random Access Memory) and PIR (Private Information Retrieval); woefully, these options are highly pricey and impracticable. To avoid this complexity, there is also a straightforward approach: the user first downloads all the material locally, then decrypts it, and may, at last, execute a search on it. This method does not seem viable either, since if the database is huge and the user needs only a particular portion of the data, the complete information still has to be downloaded, along with add-ons such as a local environment to decrypt the content and execute the search procedure. The notion of SE offers an optimistic approach to protecting outsourced data from unauthorized access by CSPs. Encrypted data is identified by attaching an encrypted keyword, also known as a search token. CSPs scan the records for a keyword that is stored with the encrypted data and matches the search term; by comparing the records, CSPs are able to examine which encrypted data was accessed by the query search. SE additionally includes insert and delete operations, which are known as the dynamic
SE scheme. The dynamic scheme may leak some data if it doesn’t obey forward and backward privacy. If we utilize forward privacy, then CSPs will not be able to know if the data owner has updated or added new data to the previously run query. In the backward privacy, CSPs will not be able to know whether the removed data matches to new queries. To benefit both, forward and backward privacy may restrict the CSPs from amassing the data. But the fact is that, presently, no scheme offers both forward and backward privacy simultaneously. Searchable encryption has two classifications. First is symmetric search in which the user is allowed to perform search operations within their own data, on the other hand, we have asymmetric search which is the opposite of symmetric search that is anyone can perform search operation on the user data. There are various methods that encrypt the plaintext data directly using keywords. Because each phrase must be searched linearly on the server’s encrypted data, the time complexity will grow at the same time. So, to simplify the search process, we frequently develop an index of the database that is built using unencrypted data. By establishing the index, you may greatly decrease search complexity and, as a consequence, enhance search speed. We obtain an increase in search performance at the price of preprocessing operations. Since the indexes are built using unencrypted data, it’s not always possible to build an index. It is largely dependent on the data being encrypted. Basically, we have two basic approaches to structuring the index. With the development in the use of cloud servers, individuals are outsourcing data and performing data sharing with other people. At the same time, data integrity is the greatest concern for the security of data. It is impossible to confirm the integrity of data without downloading it locally. Later, numerous schemes were devised, including Provable Data Possession (PDP) by Authors [1, 16, 25, 26, 28] and Proof of Retrievability (POR) by Naveed et al. [27]. Although later on, the integrity of outsourced data auditing systems were also advocated. To store user’s data, normally, it is divided across several servers, which may lead to the sharing of data among different users who may or may not know each other. Let’s imagine if someone wishes to erase a certain file from his/her computer locally, then the quickest method to get rid of it is to burn it or shred it, but this does not seem feasible in the case of cloud storage. Servers utilize logical deletion to erase undesired files, which unavoidably conceals associated data instead of genuine deletion, which may result in a user’s privacy being disclosed among others. Also, at certain times, CSPs make fraudulent deletions and scam their clients for their own commercial curiosity. Hence, knowing if the data has been erased genuinely is also a vital component of data security and the data life cycle. The data uploaded on the cloud servers need to stay safe, and users also anticipate that there should be a guarantee that users data and information will remain secure. A SE scheme should meet certain security needs. Some of them include the uploaded document’s privacy, search indexes, and the keywords used to get data from the server (i.e., query keyword). They should all be safeguarded, as should the search pattern and access pattern. The query keyword we offer to obtain our data from the server should also be secured, along with that search pattern and access pattern. 
We also have to maintain safety to keep our submitted data shielded. A search pattern is
nothing but whatever information we get from the knowledge of whether two search results are for the same keyword. On the other side, the access pattern is defined as the sequence of search results. Let us comprehend it by considering one example. Here, A(a1 ), …, A(an ) is the sequence. Where A(ai ) is the search result of ai . Simply put, A(ai ) is the container of documents in A that include the keyword a. We live in a time when there is an excess of data that has to be kept on cloud servers, but the dilemma is how to deal with that vast volume of data while protecting privacy and integrity. We have to design an algorithm for an efficient dynamic SSE scheme. There are several algorithms existing for the same, which have since proved inadequate. All SE schemes utilize various security models, which makes it impossible to examine all the scheme’s security side by side. Plus, most of the techniques often compromise search and access patterns. So, it’s necessary to build an efficient scenario that does not emanate search patterns and access patterns.
2 Literature Review The aim of utilizing searchable encryption systems is to search through the encrypted data/information placed on the cloud servers. The point is that, those users who don’t have space locally to keep the data may encrypt their data and send it to the cloud servers. The user may then access his or her data from the local system by using some keyword known as a query keyword. They look through the encrypted data without revealing the content’s privacy to the servers. The established mass systems are based on single-data-owner or multiple-data-owners. To take advantage of methods that are provided for single-data-owner and multi-user, each uses a distinct key every time. Let us grasp this by considering a simple example. We can make use of a cloud server to upload the patient’s information and previous health records, which will not only allow the patient to go wherever they want to take medication, but the hospitals will also be able to give patient treatment more appropriately by going through the patient’s previous records. Here, if hospitals are using single-data owner or multipleuser searchable encryption schemes, then hospitals have to share the different keys with other hospitals, which have to get encrypted multiple times, which will result in high computation and communication overhead that will prove difficult to cope with the different keys. However, the SE schemes intended for multiple-data owners employ just a single key. So, as a consequence, the computation and communication costs will reduce, and the only drawback we find in this scheme is that there should be confidence between the two organizations to employ a multi-data owner SE scheme. So, we may infer that single-data-owner and multi-user SE methods are not at all efficient since they need a big number of keys. Generally, what should happen is that the information and data of the patient should not be abused. In other words, privacy should be respected. For that, we have to build a scheme that will not disclose patients’ information in specified trapdoors under the known ciphertexts. This will make the server not retrieve any information. Also, CSPs should not be able to understand whether the offered trapdoors are for the given information/ciphertext or
not. In addition, the auditor/attacker should not be able to track or compute the similarity score, which should be done exclusively by the servers. The situation is such that most schemes do not principally concentrate on search patterns and access patterns, which leads to the exploitation of data: because of search and access patterns, servers may quickly be able to track the information. So, this is a serious problem with some of the present SE methods. However, authors have devised a manner in which external auditors can be blocked from getting at our information and data. For instance, if U_i is one of the index matrices, we can randomly split it into two or three new matrices and then multiply these with other random matrices (say, new matrices Y_1 and Y_2). By doing so, attackers would never be able to access the true encrypted data, so there would be no risk of information leaking; this is crucial since it prevents attackers from decoding the data/information on the servers. We may also utilize random matrices in the keys. In this scenario, we are employing a multiple-data-owner SE scheme, where the key has to be shared; this may allow easy decryption of other companies' information or the invasion of someone's data. With the aid of random matrices, however, it is near to impossible to decode the trapdoor. By creating a secret key, we will be able to prohibit intruders from attacking. If we use F_2 as our secret server code, attackers will not be able to decrypt any information (since the secret key is only known by the server), and as attackers would have to remove F from the indices and trapdoors, there will be no loss of information from the trapdoors. At the same time, however, processing and transmission overhead may grow.
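A rough sketch of the matrix-splitting idea described above is given below. It is only a toy demonstration of splitting an index matrix into random shares and blinding them with random matrices; the names U, Y1, Y2 follow the text, but the construction details are assumptions, not the cited authors' exact scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy index matrix U (e.g., a per-document keyword-weight matrix).
U = rng.random((4, 4))

# Additive split U = U1 + U2, so neither share alone reveals U.
U1 = rng.random(U.shape)
U2 = U - U1

# Blind each share with secret random matrices Y1, Y2 (random real matrices
# are invertible with probability 1).
Y1, Y2 = rng.random(U.shape), rng.random(U.shape)
blinded = (U1 @ Y1, U2 @ Y2)          # what an outside party could observe

# Only the key holder, knowing Y1 and Y2, can undo the blinding and recombine.
recovered = blinded[0] @ np.linalg.inv(Y1) + blinded[1] @ np.linalg.inv(Y2)
assert np.allclose(recovered, U)
```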
2.1 Proxy Re-encryption with and without Keyword Search Searchable encryption is the greatest solution among others, since it allows the data kept on servers or databases to remain searchable at low cost. Cao et al. [31] were the first to propose a symmetric searchable encryption scheme. Later on, several more schemes were developed, but most of them were adequate only for single read/write settings and were not appropriate for multi-writer/single-reader use, owing to the high expense of building a secure route for transferring keys. In 2004, [4] invented the first public-key searchable encryption scheme, called Public Key Encryption with Keyword Search, popularly known as PEKS. After that, more and more public-key schemes with multi-keyword search were presented. Chase and Kamara [21] suggested a method known as "secure channel free certificate-less searchable public-key encryption with multiple keywords," which utilizes public channels to deliver the message. This proved to be insecure and risky, as there were certain possibilities of a keyword guessing attack. Kadvani et al. [3] proposed Proxy Re-Encryption (PRE), which involves a third party that transforms a document encrypted under the owner's key into a ciphertext the user can decrypt; while doing this, we have to guarantee that the server does not receive any sensitive information. This scheme was also bidirectional (i.e., from data user to data owner and back) and was consequently exposed to the prospect of a plaintext attack. Author [1] devised the first one-sided scheme that
exploits the idea of bilinear pairing. This scheme was more efficient, guarded against chosen-plaintext attacks, and was also quicker. Chang and Mitzenmacher [29] designed a new notion called Proxy Re-Encryption with keyword search in 2010, referred to as "building a trapdoor using the private key"; with the aid of proxy servers, it then scans through the ciphertexts. The scheme of [29] was based on a single keyword search, which became its main restriction. Afterward, [38] presented a scheme based on multiple keyword searches, and many alternative schemes were built by addressing the constraints of [38]'s scheme, including adding conjunctive keyword search with a chosen tester. The function Re-dtPECT essentially lets the user perform a search operation and decode the encrypted data within a given time range. This scheme enables only selected authorized servers to search the query terms, which decreases the danger of keyword guessing attacks; however, as it is time-sensitive, it may not meet all the privacy criteria. Baek et al. [39] devised a scheme called "time-controlled public key encryption with delegated conjunctive keyword search" (tc-PEDCKS). This scheme is capable of searching for conjunctive terms and grants authorization to share the information/data of other data owners with one data user for a defined duration or a specific time.
Major Operations of SSE
Keyword-based Index: This is another sort of secure index, known as a keyword-based secure index, in which one keyword refers to multiple documents and IDs. Search time for the query term in a keyword-based index is linear, and the scheme is more efficient than a document-based index; however, updating or deleting entries of the keyword-based index when a new document is introduced is strongly affected.
Dynamic SSE Scheme: [35] developed a dynamic SSE scheme that can cope with document updates; its search time is logarithmic. It is further divided into two sub-schemes that both enable document updates: the first is interactive, whereas the other is non-interactive. The modified scheme supports updates that are based on XORs and PRFs. The problem with this scheme is that it discloses some information regarding trapdoors. Later on, [17] presented another dynamic tree-based scheme that tackles the information leakage issues following update operations.
SSE schemes with secure index: To optimize search efficiency, [14] proposes a secure index construction which employs pseudorandom functions and bloom filters. Using a bloom filter, we may rapidly evaluate whether an element exists in a given collection. In this scheme, the pseudorandom function is applied twice, and the result is then matched against the bloom filter, which makes it quite simple to verify whether a page includes the needed term. This scheme is fully dependent on a secure index; its key benefit over the one described by Cao et al. [31] is that we simply search over the index instead of over the complete ciphertext.
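As a rough illustration of the bloom-filter secure index just described, the sketch below applies a keyed PRF (HMAC as a stand-in) twice, once to the keyword and once binding the result to the document identifier, and stores the outcome in a per-document bloom filter so the server can test a trapdoor without seeing plaintext keywords. It is a simplified toy under these assumptions, not the construction of [14].

```python
import hmac, hashlib

BITS = 256  # toy bloom-filter size

def prf(key: bytes, msg: bytes) -> bytes:
    return hmac.new(key, msg, hashlib.sha256).digest()

def positions(tag: bytes, k: int = 3):
    # Derive k bit positions from a 32-byte tag (toy hash splitting).
    return [int.from_bytes(tag[4*i:4*i+4], "big") % BITS for i in range(k)]

def build_index(doc_id: str, words, key: bytes):
    bloom = [0] * BITS
    for w in words:
        trap = prf(key, w.encode())            # first PRF: keyword trapdoor
        tag = prf(trap, doc_id.encode())       # second PRF: bind to document
        for p in positions(tag):
            bloom[p] = 1
    return bloom

def server_test(bloom, doc_id: str, trapdoor: bytes) -> bool:
    tag = prf(trapdoor, doc_id.encode())
    return all(bloom[p] for p in positions(tag))

key = b"owner-secret-key"
index = build_index("doc-17", ["cloud", "privacy", "encryption"], key)

# The client sends only the trapdoor; the server never sees the keyword.
print(server_test(index, "doc-17", prf(key, b"privacy")))   # True
print(server_test(index, "doc-17", prf(key, b"finance")))   # False (w.h.p.)
```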
Underlying Concept of SSE Structure The core principle of searchable symmetric encryption is to generate secret keys for the communications that have been encrypted and also for hiding indices; Encryption of data or collections of data using safe symmetric schemes. Further, this encrypted message/information needs to be connected with coordinating keywords with the assistance of either an index as the data structure or tags to the block; Hiding the indices tags using Pseudo-Random Functions (PRFs) and Pseudo Random Permutation (PRPs) in such a manner that the connection formed between the appropriate keyword and the encrypted content is retained and allow for the protection of query privacy via the use of keywords and masked indices/tags. Overall, there are four key recommendations. This includes No index creation, Direct index building, Building using an inverted index, and Tree-based structure. The very first approach is to generate an SSE without an index. Cao et al. [31] was the one who suggested this in his important article. In this, the data or information is encrypted in such a manner that search operations may be carried out without the necessity of distinct metadata. In a direct index, an index is generated with the aid of information or a list of keywords that is a tuple. In this method, an index is in the form of a key and value. Here, key refers to a keyword and value refers to a collection of data/messages which are associated with keywords. A considerable number of schemes have gained this technique. However, [11] brings this approach in our sight. Later on, the subsequent schemes were specified by authors [7, 9, 18–20, 22–24, 23, 24]. The execution of direct and inverted index may be based on a tree structure, although other schemes are presented, particularly employing trees, for example, binary tree search. Multi-user SE scheme in the Cloud (MUSE) Following the traditional method of retrieving our data, it is not feasible for many applications. This involves downloading the whole set of documents, decrypting them locally, and then performing a search operation. SE allows us to do search operations on the encrypted data and we’ll be able to retrieve only the required amount of information, which saves us a lot of time. A significant challenge faced by the SE scheme is in the multi-user setting. The term “multiuser SE scheme” refers to users who have access to some set of encrypted information stored by other users. Multiuser SE systems allow users to use and search through certain encrypted material by following some permissions determined by the owner of the relevant segment. There are a lot of security criteria in these systems. We don’t just have to worry about the data segment, but also the security of the queries should be protected against attackers and poisonous cloud service providers. Lately, some researchers have proposed some new schemes which fulfill these security measures. It includes either key-sharing between users or building on Trusted Third Parties (TTP). As we noted, the shielding of the access pattern is discretionary in the single-user SE method, but it is a large element and plays a very essential function in the privacy of submitted data. This is necessary for the event of multi-user configurations. Regrettably, most of the existing MUSE systems have a weakness in the needed security.
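Before turning to the multi-user read/write roles, the inverted-index approach mentioned in the "Underlying Concept of SSE Structure" discussion above can be made concrete. The sketch below is a bare-bones toy, not any of the cited constructions: keywords are masked with a PRF (HMAC), and document identifiers are replaced by pseudonyms that only the client can map back.

```python
import hmac, hashlib, secrets

def prf(key: bytes, msg: bytes) -> bytes:
    return hmac.new(key, msg, hashlib.sha256).digest()

def build_inverted_index(docs, k_tag, k_id):
    index, lookup = {}, {}
    for doc_id, words in docs.items():
        pseudo = prf(k_id, doc_id.encode())[:8].hex()   # pseudonymous doc id
        lookup[pseudo] = doc_id                         # kept client-side only
        for w in set(words):
            tag = prf(k_tag, w.encode())                # PRF-masked keyword tag
            index.setdefault(tag, []).append(pseudo)
    return index, lookup

k_tag, k_id = secrets.token_bytes(16), secrets.token_bytes(16)
docs = {"doc1": ["cloud", "privacy"], "doc2": ["cloud", "finance"]}
index, lookup = build_inverted_index(docs, k_tag, k_id)

# The server stores `index`; a search token is just prf(k_tag, keyword),
# so the server sees neither the keyword nor the real document identifiers.
token = prf(k_tag, b"cloud")
print([lookup[p] for p in index.get(token, [])])        # ['doc1', 'doc2']
```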
The Multi-user SE technique extends the present keyword search method into a multiple read–write mechanism that accommodates a large number of users. The first role is as a writer, in which the user may outsource data to the server and entrust other users by granting them the power to execute keyword searches; the second is as a reader, in which the user may perform keyword search operations on the documents for which he/she has permission. Searchable schemes have privacy issues, and so do multi-user SE schemes. MUSE poses two key privacy problems and needs. (1) Index Privacy: Unauthorized parties should not be able to view the data that has been uploaded to the cloud. (2) Query Privacy: No one other than the reader who submitted the query keyword should be able to view the submitted queries or the results created for them. A cloud service provider is considered to be colluding if it seeks access to sensitive information together with someone who is not permitted to access the rights on an index. This assumption helps us develop a powerful adversary model for non-delegated users, and it may be incorporated into MUSE schemes that lack such an adversary model whenever a non-authorized user attempts to collude with the cloud service provider. Now let us explore the impact of not protecting the access pattern. Suppose K1 is a user who has permission to access and query both indexes, i1 and i2. The CSP can identify the resemblance between i1 and i2 by monitoring the access pattern of the queries presented by K1. In another scenario, K2 is another approved user who has access only to i2. Since the CSP observes the contents behind both i1 and i2 through K1's queries, and already knows the content behind i2 via user K2, it can straightforwardly infer the information available behind i1. Consequently, we may assume that index privacy has been breached here. To salvage this, we may make use of a third party called a proxy, which carries out an algorithm called TransformQuery. TransformQuery turns a query of a single user into a single query per chosen document. For every query produced, the proxy sends the cloud service provider a specific Private Information Retrieval (PIR) request. The PIR protocol refers to the secret retrieval of the required data from the data server without letting anybody know what has been obtained. Because of the PIR protocol, CSPs cannot learn the information included in the query or the response provided for it, which yields a scheme achieving both query and access pattern privacy. PIR, without a doubt, protects us against CSPs, but it also creates a new privacy risk at the same time: implementing TransformQuery enables the proxy to learn the correlation between a query and the multiple ciphertexts that correspond to the intended indices. One of the benefits of employing a proxy in the advanced solution is scalability, and by using the TransformQuery technique, customers do not have to make several PIR queries for a given or similar keyword.
Type of Searches
Single keyword search: A solution is provided in [31] for searching the encrypted data stored in the cloud without any loss of data confidentiality. The main contribution provided by them is that even if an untrusted server gets the encrypted data, it cannot recover the original message.
Authorization techniques used by their schemes prevent the untrusted server or attackers from getting access to the stored encrypted data
and query isolation functionality provides privacy against the untrusted server or attackers. Their schemes divide the whole data into fixed-length words (W), either with an index or without an index. An index can be used when the data is large in size, but it carries the disadvantage of overhead caused by storing and updating the index, so the authors use their schemes without an index. Their basic scheme uses a pseudorandom generator G (stream cipher), a pseudorandom function F, a pseudorandom permutation E (block cipher), pseudorandom values S_i, fixed keys k_i, and bit-wise exclusive or (XOR). This has the major disadvantage that the user should be aware of the precise location of the word (W) or must provide all the keys k_i, which reveals the whole data. So, they proposed a controlled searching scheme with a secure pseudorandom function F to avoid the disadvantage of their basic scheme. In addition, for hidden search, the authors pre-encrypt all words (W) using deterministic encryption, which works similarly to Electronic Codebook (ECB) encryption of the words using some block cipher. Finally, they provided a scheme in which the encryption of each word is further divided into two parts (L_i, R_i) and key generation uses a pseudorandom function of L_i to provide query isolation. The major disadvantage of this scheme is that all the keywords of the file have to be decrypted at the time of the search.
Multi-Keyword search: Single keyword search results may not be exact, so we may not be able to retrieve the needed location of the file. Since the efficiency of single keyword search is quite poor, multi-keyword schemes are devised to enhance the efficiency and precision of the results. A basic technique to extend the single keyword scheme to a multiple keyword scheme would be to dispatch the list of query keywords to the servers, and each server would then deliver all results related to the given set of query terms. Multi-keyword search makes it possible to search the encrypted data in a conjunctive fashion. Such a naive scheme may leak information, and it is inefficient and computationally expensive for the user. To put it simply, we can ask the server to compute the intersection of the results and provide it back to the user, but this would disclose the intersection keyword pattern, which may not be reasonable. Moreover, [8, 14] proposed secure searchable encryption for single keyword search. Blaze et al. [11] reviewed the searchable symmetric encryption scheme and created a more powerful scheme employing an inverted index. Shah and Prajapati [4] were the first to introduce a searchable encryption PEKS which supports the multi-user mode. Van Liesdonk et al. [15] created two searchable encryption schemes, each of which works with conjunctive keyword search. There is also an enhanced search capability named ranked keyword search, a form of mature search system: if a user performs a keyword search in a plaintext search system, the search engine delivers the relevant results and documents ordered by rank. Prajapati et al. [6] developed a ranked keyword search scheme utilizing "coordinate matching", where ranking is based on the number of matched terms between the search queries and the data files, although the outcome of this scheme is not flawless.
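Before moving on to the remaining search types, the mechanics of the basic single-keyword scheme described above, i.e., splitting each word code into (L_i, R_i), deriving a per-word key from a PRF of L_i, and XOR-masking with a pseudorandom stream, can be sketched as follows. HMAC stands in for the PRF and for the deterministic word encoding, and decryption is omitted; this is a toy of the search step only, under those assumptions.

```python
import hmac, hashlib

def F(key: bytes, msg: bytes) -> bytes:          # PRF (HMAC-SHA256)
    return hmac.new(key, msg, hashlib.sha256).digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

k_enc, k_prime, k_gen = b"k-enc", b"k-prime", b"k-gen"

def word_code(word: str) -> bytes:               # deterministic encoding X_i (toy)
    return F(k_enc, word.encode())               # 32 bytes: L_i = [:16], R_i = [16:]

def encrypt_word(word: str, i: int) -> bytes:
    X = word_code(word)
    S = F(k_gen, i.to_bytes(4, "big"))[:16]      # pseudorandom stream value S_i
    k = F(k_prime, X[:16])                       # per-word key k_i = F(L_i)
    T = S + F(k, S)[:16]                         # T_i = <S_i, F_{k_i}(S_i)>
    return xor(X, T)                             # C_i = X_i xor T_i

def server_search(ciphertexts, X_w: bytes, k_w: bytes):
    hits = []
    for i, C in enumerate(ciphertexts):
        p = xor(C, X_w)                          # if the word matches, p == T_i
        if p[16:] == F(k_w, p[:16])[:16]:
            hits.append(i)
    return hits

doc = ["cloud", "storage", "privacy", "cloud"]
cts = [encrypt_word(w, i) for i, w in enumerate(doc)]
Xq = word_code("cloud")
kq = F(k_prime, Xq[:16])
print(server_search(cts, Xq, kq))                # [0, 3]
```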
Fuzzy keyword search: Basically, in SSE schemes, we have to input the trapdoor of the needed keyword, and the server returns the content relating to the supplied
keyword. The server does not return anything in case the query keyword didn’t match with any of the pages. Let’s assume, our preset keyword is “House” and the query keyword is “Hose”. In this situation, the server will not return anything. Providentially, the fuzzy keyword-based search can cope with it. The fuzzy keyword search might permit such modest errors. Further, various fuzzy schemes were built employing “edit distance”, gram along LSH, and bloom to construct a ranked keyword search scheme since it’s likely that a semi-honest server may provide just a few or a fraction of results. Sun et al. [36] presented a scheme that not only provides fuzzy keyword search but also helps to verify that the server has returned all the search results. Conjunctive Keyword search: In conjunctive keywords search, users can get documents containing several keywords during a single query submission. We have to follow the trivial method in order to do a single keyword search which sounds pretty inefficient. It also leaks some data to the server. Sun et al. [36] proposes two conductive keyword schemes. In the first scheme, major work is done in offline mode whereas the second scheme does not require anything to be done offline that is it only requires continuous communication. Wang et al. [13] proposed a scheme that assists phrase queries, a sub-string. In this technique, the least repeated keyword is queried first then this result further gets matched with the other keywords. Ranked and Verifiable keyword search: The ranked keyword search may increase the search results by returning the most important and relevant content. Also, it will boost the system’s serviceability since it minimizes network traffic. Authors [34, 36, 40] successfully achieved ranked keyword search in single keyword search by utilizing the order-preserving function. Later, [6] proposed a multi-keyword search with the help of “coordinate matching” measurement. But both the methods did not yield correct results. As the results were dependent on the amount of matched keywords and not on the various keywords. Faber et al. [33] present the multi-keyword ranked keyword scheme, which was based on the “cosine measure” approach. This technique generated some decent and accurate results. Ma et al. [10] defined a multikeyword ranked search scheme that is constructed on a hierarchical congregate index to boost search efficiency. Subsequently, this scheme has linear search time growth, but the data collection has an aggressive growth. The verified keyword search may examine whether the results after the search is completed are valid or not. This may also verify the inaccuracy of the result produced due to hardware or software failure, inadequate storage, or malicious acts induced by the semi-honest server. Single read/single write and multiple write/single read In the single reader/writer scheme, users have the authorization to produce searchable data or information and also have permission to build trapdoors to execute the search operation. In this situation, the secret key is normally known by a single person who is both the writer and the reader of the uploaded and created document/data utilizing the SSE scheme. This scheme can also be performed by various alternative methods, including the use of PKE and by keeping a public key hidden, but the outcome will be less efficient. In multi-writer single-read schemes, there are numerous users in
terms of having the public key, who can produce searchable ciphertext for the one who holds the private key and has access to the searchable material.
General Framework of PEKS
In 2004, [4] built the first PEKS framework. For a considerable part, PEKS is based on a public-key encryption technique and generally consists of three components: the data sender, the data receiver, and the cloud service provider (CSP). In the first stage, customers encrypt their data with the public key and transfer it to their cloud service providers. The data receiver should hold a private key in order to execute a search operation on the data: once the recipient has access to the appropriate private key, he/she can undertake search operations by constructing a trapdoor of the query terms. Once the server gets the trapdoor, it determines whether the stored ciphertext includes the query keyword or not, without acquiring any knowledge about the plaintext of the encrypted data or the query keywords, and then returns the result requested for the provided keyword. It is more acceptable to utilize PEKS on untrusted servers since it does not demand that the encrypting and decrypting parties prepare a shared key in advance. Now let us describe the algorithms. The framework provides four randomized polynomial-time algorithms. (1) KeyGen(lambda): The data receiver executes the key generation algorithm; it accepts the security parameter lambda as input and produces the public key pK and private key sK. (2) PEKS(pK, U): This encryption algorithm is utilized by the data sender; it receives the public key pK and a keyword U as input and produces the keyword ciphertext S(U). (3) Trapdoor(sK, U): The data receiver runs the trapdoor algorithm to create a trapdoor of keywords; it accepts the private key sK and keyword U as input and delivers the trapdoor of the query keyword, T_U (Tu), as output. (4) Test(pK, S, Tu): The cloud service provider executes the test operation; it accepts the public key pK, ciphertext S, and trapdoor Tu as input, and if it discovers a corresponding result it outputs "YES", otherwise "NO".
Attack Model
Almost all known PEKS systems are susceptible to the keyword guessing attack; there is also the file injection attack. Furthermore, users may access their data from the server while still having the option to look through encrypted data selectively. Keyword guessing (KG) attacks are a very critical problem for PEKS. Prajapati and Shah [5] were the first to suggest a keyword guessing attack on a handful of PEKS methods: by attacking, the adversary is able to guess the enciphered keyword inside a particular keyword trapdoor. In KG, attackers may be grouped into two classes: (1) outside attackers and (2) inside attackers. An outside attacker is someone who has no contact with the cloud service provider; such attackers spy on the public channel between the service provider and the user, customer, or recipient. Lately, [2] created the first secure channel free PEKS scheme in which, even if an outside attacker receives the keyword ciphertext while the trapdoor keyword is being communicated over the public channel, the attacker won't be able to run test
algorithms over it. An internal attacker is often mentioned as a poisonous server provider. Insider attackers may get encrypted keywords from any sender and can also gain data through trapdoors from the recipient. This will grow worse when the PEKS scheme is built on secure channel free, where malevolent server providers would undertake tests to identify the association between encrypted keyword and trapdoor using a public key. It is quite tough to rebuff inside attackers. File Injection Attack The File injection attack, also termed the FI, is another highly critical exploit for public-key searchable encryption. Shen et al. [41] was the first one to uncover/expose the FI attack in searchable encryption. A rogue server provider may gain information about the query keyword in this attack by inserting or injecting certain files into the user. It is relatively simple for attackers to obtain into certain sensitive data of the consumers. Because of these reasons, it is now extremely necessary to create some PEKS scheme that can resist this sort of attacker. Search Functionalities of PEKS The PEKS scheme has been particularly appealing for cryptography researchers. It is difficult to execute a search over encrypted data, but PEKS is developing and improving on a regular basis to enable as many ciphertext functions as feasible, such as plaintext information outsourcing. Till now, PEKS systems enable searches such as single keyword search, conjunctive keyword search, fuzzy keyword search, multi-keyword search, ranking keyword search, verified keyword search, similarity keyword search, semantic keyword search, range query, subset query, etc. Discussion on Possible Developments Machine learning is boomingly being utilized today for data mining, picture recognition, DNA sequencing, and many more applications. Lately, the Ministry of Transportation has shifted a huge quantity of data to the cloud. Let’s imagine all the data submitted gets completely mined, this will minimize traffic standstill and traffic accidents. In addition to this, we will be able to anticipate the daily traffic in a given region and at a specific moment. Also, if we utilize data together from the ministry of transportation and the ministry of public security, then it would aid in curbing/ limiting the criminal activities. Combined usage of both Cloud Server and Machine Learning is helpful for us as it may also aid to boost national security. But at the same time, there are two key challenges we may confront. First, certain organizations/departments may decline to provide their information in order to safeguard data security. Secondly, in the era of high and vast cloud data, organizations/departments with a lesser quantity of data may not be able to execute data mining efficiently since it requires a very high amount of computation and communication. Various research has been undertaken on cloud-based machine learning. Machine learning with public auditing, machine learning training, and classification systems based on homomorphic encryption, homomorphic deep learning, etc. are some of the subjects under development.
3 Conclusion The simplest and most fundamental benefit of cloud computing is that users can access their data from anywhere and at any time; it also frees users from physical storage, which is why users commonly decide to store their data on cloud servers. Uploading sensitive data to a cloud server is still an ongoing concern, however, as servers are prone to attacks. Searchable encryption enables servers to conduct search operations on the encrypted data in place of the user or data owner without violating the privacy of the plaintext. In order to preserve sensitive data, one should encrypt it before uploading. Encrypting the data conceals the plaintext, decreases the opportunities for attacks, and makes it near impossible for most insiders and outsiders to get at the data without knowing the keys. At the same time, it may also restrict the utilization of existing functions, which does not seem rational. We can revive the search function by applying a straightforward procedure, which basically entails downloading the full database locally, decrypting the whole database, and eventually conducting a search operation over the whole plaintext; this approach is not at all healthy and is unworkable for most use cases. As we know, the IoT is built of millions of devices that save their data directly on cloud servers, and running searches on the encrypted data is still a hurdle. Many methods are available for Secure Searchable Encryption (SSE), although all of them lag somewhere. The current searchable encryption protocols comprise Searchable Symmetric Encryption and public key encryption with keyword search (PEKS). We have identified several restrictions while employing both the SSE and PEKS protocols: key management and delivery are very tough as well as expensive, while PEKS is prone to several attacks, especially keyword guessing attacks (KGA). Many existing systems leak some amount of information about the plaintext, which is widely split into three groups: (1) leakage of information in the index, (2) access pattern leakage, and (3) search pattern leakage. Almost every available scheme spills a trifle of data, except for a handful of schemes. Lately, [30] presented a scheme that doesn't leak index information or search patterns, which gives us a flawless definition of total security. Research on the issue of searchable encryption has risen greatly with the advent of schemes based on SSE and PEKS. There are three important directions in which considerable progress has been made: query expressiveness, efficiency, and security. The main advantage of utilizing SE schemes is that the cloud server supplies only the desired result, so we do not have to download the whole document; that saves a lot of time and, in many schemes, offers significant efficiency. In this paper, we have discussed the efficiency of existing SE schemes, the challenges we may face while using existing schemes, privacy issues in SE schemes, core operations of SSE schemes, multi-user searchable encryption in the cloud, types of searches, single write/read and multiple write/read schemes, and a general framework of PEKS. We have also given some possible developments and ideas.
References 1. 2022. [online]. https://www.itgovernance.co.uk/blog/list-of-data-breaches-and-cyber-attacksin-may-2022-49-8-million-records-breached 2. Ateniese G, Burns R, Curtmola R, Herring J, Kissner L, Peterson Z, Song D (2007) Provable data possession at untrusted stores. In: Proceedings of the 14th ACM conference on computer and communications security, pp 598–609 3. Kadvani S, Patel A, Tilala M, Prajapati P, Shah P Provable data possession using identity-based encryption. In: Information and communication technology for intelligent systems 4. Shah P, Prajapati P (2020) Provable data possession using additive homomorphic encryption. J King Saud Univ-Comput Inform Sci 5. Prajapati P, Shah P (2014) Efficient cross user data deduplication in remote data storage. In: International conference for convergence for technology. IEEE, pp 1–5 6. Prajapati P, Shah P, Ganatra A, Patel S (2017) Efficient cross user client side data deduplication in Hadoop. JCP 12(4):362–370 7. Shacham H, Waters B (2008) Compact proofs of retrievability. In: International conference on the theory and application of cryptology and information security. Springer, pp 90–107 8. Song DX, Wagner D, Perrig A (2000) Practical techniques for searches on encrypted data. In: Proceeding 2000 IEEE symposium on security and privacy. S&P 2000. IEEE, pp 44–55 9. Boneh D, Di Crescenzo G, Ostrovsky R, Persiano G Public key encryption with keyword search. In: International conference on the theory and applications of cryptographic techniques. Springer, pp 506–522 10. Ma M, He D, Kumar N, Choo K-KR, Chen J (2017) Certificateless searchable public key encryption scheme for industrial internet of things. IEEE Trans Industr Inf 14(2):759–767 11. Blaze M, Bleumer G, Strauss M (1998) Divertible protocols and atomic proxy cryptography. In: International conference on the theory and applications of cryptographic techniques. Springer, pp 127–144 12. Shao J, Cao Z, Liang X, Lin H (2010) Proxy re-encryption with keyword search. Inf Sci 180(13):2576–2587 13. Wang XA, Huang X, Yang X, Liu L, Wu X (2012) Further observation on proxy re-encryption with keyword search. J Syst Softw 85(3):643–654 14. Xu L, Li J, Chen X, Li W, Tang S, Wu H-T (2019) Te-pedcks: towards time controlled public key encryption with delegatable conjunctive keyword search for internet of things. J Netw Comput Appl 128:11–20 15. Van Liesdonk P, Sedghi S, Doumen J, Hartel P, Jonker W (2010) Computationally efficient searchable symmetric encryption. In: Workshop on secure data management. Springer, pp 87–100 16. Kamara S, Papamanthou C, Roeder T (2012) Dynamic searchable symmetric encryption. In: Proceedings of the 2012 ACM conference on computer and communications security, pp 965– 976 17. Curtmola R, Garay J, Kamara S, Ostrovsky R (2011) Searchable symmetric encryption: improved definitions and efficient constructions. J Comput Secur 19(5):895–934 18. Kamara S, Papamanthou C (2013) Parallel and dynamic searchable symmetric encryption. In: International conference on financial cryptography and data security. Springer, pp 258–274 19. Goh E-J et al (2003) Secure indexes. IACR Cryptol ePrint Arch 2003:216 20. Wang C, Cao N, Li J, Ren K, Lou W (2010) Secure ranked keyword search over encrypted cloud data. In: 2010 IEEE 30th international conference on distributed computing systems. IBEE, pp 253–262 21. Chase M, Kamara S (2010) Structured encryption and controlled disclosure. In: International conference on the theory and application of cryptology and information security. 
Springer, pp 577–594 22. Kamara S, Papamanthou C, Roeder T (2011) Cs2: a searchable cryptographic cloud storage system. Microsoft Research, TechReport MS-TR-2011-58
23. Kurosawa K, Ohtaki Y (2012) Uc-secure searchable symmetric encryption. In: International conference on financial cryptography and data security. Springer, pp 285–298 24. Poh GS, Mohamad MS, Z’aba MR (2012) Structured encryption for conceptual graphs. In: International workshop on security. Springer, pp 105–122 25. Moataz T, Shikfa A, Cuppens-Boulahia N, Cuppens F (2013) Semantic search over encrypted data. In: ICT 2013. IEEE, pp 1–5 26. Cash D, Jaeger J, Jarecki S, Jutla CS, Krawczyk H, Rosu M-C, Steiner M (2014) Dynamic searchable encryption in very-large databases: data structures and implementation. In: NDSS, vol 14. Citeseer, pp 23–26 27. Naveed M, Prabhakaran M, Gunter CA (2014) Dynamic searchable encryption via blind storage. In: 2014 IEEE symposium on security and privacy. IEEE, pp 639–654 28. Stefanov E, Papamanthou C, Shi E (2013) Practical dynamic searchable encryption with small leakage. Cryptology Print Archive 29. Chang Y-C, Mitzenmacher M (2005) Privacy preserving keyword searches on remote encrypted data. In: International conference on applied cryptography and network security. Springer, pp 442–455 30. Golle P, Staddon J, Waters B (2004) Secure conjunctive keyword search over encrypted data. In: International conference on applied cryptography and network security. Springer, pp 31–45 31. Cao N, Wang C, Li M, Ren K, Lou W (2013) Privacy-preserving multi-keyword ranked search over encrypted cloud data. IEEE Trans Parallel Distrib Syst 25(1):222–233 32. Wang B (2016) Search over encrypted data in cloud computing. Ph.D. dissertation, Virginia Tech 33. Faber S, Jarecki S, Krawezyk H, Nguyen Q, Rosu M, Steiner M (2015) Rich queries on encrypted data: beyond exact matches. In: European symposium on research in computer security. Springer, pp 123–145 34. Swaminathan A, Mao Y, Su G-M, Gou H, Varna AL, He S, Wu M, Oard DW (2007) Confidentiality-preserving rank-ordered search. In: Proceedings of the 2007 ACM workshop on storage security and survivability, pp 7–12 35. Zerr S, Olmedilla D, Nejdi W, Siberski W (2009) Zerber+ r: Top-k retrieval from a confidential index. In: Proceedings of the 12th international conference on extending database technology: advances in database technology, pp 439–449 36. Sun W, Wang B, Cao N, Li M, Lou W, Hou YT, Li H (2013) Privacy-preserving multi-keyword text search in the cloud supporting similarity-based ranking. In: Proceedings of the 8th ACM SIGSAC symposium on information, computer and communications security, pp 71–82 37. Chen C, Zhu X, Shen P, Hu J, Guo S, Tari Z, Zomaya AY (2015) An efficient privacy-preserving ranked keyword search method. IEEE Trans Parallel Distrib Syst 27(4):951–963 38. Byun JW, Rhee HS, Park H-A, Lee DH (2006) Off-line keyword guessing attacks on recent keyword search schemes over encrypted data. In: Workshop on secure data management. Springer, pp 75–83 39. Baek J, Safavi-Naini R, Susilo W (2008) Public key encryption with keyword search revisited. In: International conference on computational science and its applications. Springer, pp 1249– 1259 40. Zhang Y, Katz J, Papamanthou C (2016) All your queries are belong to us: the power of file-injection attacks on searchable encryption. In: 25th {USENIX} security symposium ({USENIX} security 16), pp 707–720 41. Shen E, Shi E, Waters B (2009) Predicate privacy in encryption systems. In: Theory of cryptography conference. Springer, pp 457–473
Multi-objective Optimization with Practical Constraints Using AALOA Balasubbareddy Mallala , P. Venkata Prasad , and Kowstubha Palle
Abstract This paper uses the Ameliorated Ant Lion Optimization (AALO) technique for solving OPF problems with the considered objectives. To test the efficiency of the proposed method, standard test functions such as the Sphere and Rastrigin functions, as well as the IEEE 30-bus system, are used. The basic ant lion algorithm is hybridized with the Lévy flight operation, and the proposed algorithm is named the ameliorated ant lion optimization (AALO) algorithm; the considered objective functions are tested while fulfilling equality, inequality, and practical constraints. Finally, the results are satisfactory. Keywords AALO algorithm · Multi-objective · Practical constraint
1 Introduction Optimal power flow (OPF) plays a key role in the power system: it achieves the reduction of power generation cost, loss, voltage deviation, etc. by adjusting control parameters of the power system while satisfying operating limits and practical constraints. The present literature describes many optimization techniques, such as salp swarm [1], hybrid moth-flame [2], grasshopper optimization [3], the teaching-learning-based algorithm [4], the Grey Wolf Optimizer [5], ameliorated ant lion [6], Manta Ray Foraging Optimization [7], the PSO algorithm [8], the Hybrid Sine-Cosine Algorithm [9], Peafowl Optimization [10], and Criss Cross Optimization [11]. Various multi-objective optimization techniques, with and without FACTS devices, also appear in the literature, such as improved GA [12], the artificial immune algorithm [13], the hybrid cuckoo search algorithm with and without FACTS devices [14-16], the hybrid fruit fly algorithm [17], NSGA-III [18], hybrid gravitational and fruit fly-based ABC algorithms [19, 20], Differential Evolution [21], and the effect of FACTS devices based on converter location [22-24].
B. Mallala (B) · P. Venkata Prasad · K. Palle Chaitanya Bharathi Institute of Technology, Hyderabad, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_16
This paper proposes a novel AALOA technique for solving the optimization problem. The effects of practical constraints such as ramp rate limits and prohibited operating zones are analyzed, and the results are compared with the literature.
2 Problem Formulation The problem formulation is given as follows:

\min J_p(a, b), \quad \forall p = 1, 2, \ldots, O_f \qquad (1)

subject to

u(x, y) = 0 \qquad (2)

v_{\min} \le v(x, y) \le v_{\max} \qquad (3)

x^{T} = [P_{g1}, V_{l1} \ldots V_{l\,NPQ}, Q_{g1} \ldots Q_{g\,NPV}, S_{l1} \ldots S_{l\,NTL}] \qquad (4)

y^{T} = [P_{g2} \ldots P_{g\,NPV}, V_{g1} \ldots V_{g\,NPV}, Q_{sh\,1} \ldots Q_{sh\,NC}, T_{t1} \ldots T_{t\,NT}] \qquad (5)
2.1 Objective Functions

a. Fuel cost minimization

J_1 = \min F_c(P_{gm}) = \sum_{m=1}^{NPV} \left( x_m P_{gm}^{2} + y_m P_{gm} + z_m \right) \ \$/\mathrm{h} \qquad (6)

b. Emission minimization

J_2 = \min \mathrm{Emission}(P_{gm}) = \sum_{m=1}^{NPV} \left( \alpha_m + \beta_m P_{gm} + \gamma_m P_{gm}^{2} + \xi_m \exp(\lambda_m P_{gm}) \right) \ \mathrm{ton/h} \qquad (7)

c. Total power loss minimization

J_3 = \min(P_{loss}) = \sum_{m=1}^{NTL} P_{loss_m} \ \mathrm{MW} \qquad (8)
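The three objectives in Eqs. (6)-(8) are plain sums over the generator and line sets and can be coded directly, as in the sketch below; the coefficient values shown are placeholders, not IEEE 30-bus data.

```python
import numpy as np

def fuel_cost(Pg, x, y, z):
    """Eq. (6): total fuel cost in $/h, quadratic in each generator output."""
    return float(np.sum(x * Pg**2 + y * Pg + z))

def emission(Pg, a, b, g, xi, lam):
    """Eq. (7): total emission in ton/h (quadratic plus exponential term)."""
    return float(np.sum(a + b * Pg + g * Pg**2 + xi * np.exp(lam * Pg)))

def total_loss(P_loss_per_line):
    """Eq. (8): total real power loss in MW, summed over transmission lines."""
    return float(np.sum(P_loss_per_line))

# Placeholder coefficients for a 3-generator example (illustrative only).
Pg  = np.array([80.0, 50.0, 30.0])                     # MW
x_c, y_c, z_c = np.array([0.004, 0.006, 0.009]), np.array([2.0, 1.8, 2.2]), np.array([120.0, 100.0, 80.0])
a_e, b_e = np.array([0.04, 0.05, 0.06]), np.array([-0.005, -0.004, -0.006])
g_e, xi_e, lam_e = np.array([6e-5, 5e-5, 7e-5]), np.array([2e-4, 3e-4, 2e-4]), np.array([0.02, 0.015, 0.02])

print(fuel_cost(Pg, x_c, y_c, z_c), emission(Pg, a_e, b_e, g_e, xi_e, lam_e))
```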
2.2 Constraints

a. Equality constraints

\sum_{m=1}^{NPV} P_{gm} - P_d - P_l = 0; \qquad \sum_{m=1}^{NPV} Q_{gm} - Q_d - Q_l = 0 \qquad (9)

b. Inequality constraints

V_{gm}^{\min} \le V_{gm} \le V_{gm}^{\max}, \quad P_{gm}^{\min} \le P_{gm} \le P_{gm}^{\max}, \quad Q_{gm}^{\min} \le Q_{gm} \le Q_{gm}^{\max} \quad \forall m \in NPV;
T_{tm}^{\min} \le T_{tm} \le T_{tm}^{\max} \quad \forall m \in NT; \quad S_{sh\,m}^{\min} \le S_{sh\,m} \le S_{sh\,m}^{\max} \quad \forall m \in NC;
V_{lm}^{\min} \le V_{lm} \le V_{lm}^{\max} \quad \forall m \in NPQ; \quad S_{lm} \le S_{lm}^{\max} \quad \forall m \in NTL \qquad (10)

Ramp-rate limits

\max\left(P_g^{\min}, P_g^{0} - R_2\right) \le P_g \le \min\left(P_g^{\max}, P_g^{0} + R_1\right) \qquad (11)

Prohibited Operating Zone (POZ) limits

P_g^{\min} \le P_g \le P_{g(1)}^{L}; \quad P_{g(p-1)}^{U} \le P_g \le P_{g(p)}^{L}; \quad P_{g(n)}^{U} \le P_g \le P_g^{\max} \qquad (12)
The practical limits are described in [14].
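Equations (11) and (12) reduce to simple interval checks per generator; a minimal sketch, with placeholder limits rather than data from [14], is given below.

```python
def ramp_rate_ok(Pg, Pg0, Pmin, Pmax, R_up, R_down):
    """Eq. (11): output must respect both capacity and ramp limits."""
    lo = max(Pmin, Pg0 - R_down)
    hi = min(Pmax, Pg0 + R_up)
    return lo <= Pg <= hi

def poz_ok(Pg, Pmin, Pmax, zones):
    """Eq. (12): output must avoid every prohibited operating zone (L, U)."""
    if not (Pmin <= Pg <= Pmax):
        return False
    return all(not (L < Pg < U) for L, U in zones)

# Placeholder limits for one generator (illustrative only).
print(ramp_rate_ok(Pg=72.0, Pg0=65.0, Pmin=20.0, Pmax=80.0, R_up=10.0, R_down=15.0))  # True
print(poz_ok(Pg=55.0, Pmin=20.0, Pmax=80.0, zones=[(30.0, 40.0), (52.0, 58.0)]))      # False
```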
3 Proposed Algorithm The proposed AALO algorithm is described in the step-by-step process given below.

Step-1 Stochastic Walk of Ants

a(r) = [0, \, cs(2w(r_1) - 1), \, cs(2w(r_2) - 1), \ldots, cs(2w(r_{IT}) - 1)] \qquad (13)

where cs denotes the cumulative sum, and

w(r) = \begin{cases} 1 & \text{if } rand > 0.5 \\ 0 & \text{if } rand \le 0.5 \end{cases} \qquad (14)

The ants' positions are:

X_{Ant} = \begin{bmatrix} Ant_{11} & Ant_{12} & \ldots & Ant_{1D} \\ Ant_{21} & Ant_{22} & \ldots & Ant_{2D} \\ \vdots & \vdots & & \vdots \\ Ant_{N1} & Ant_{N2} & \ldots & Ant_{ND} \end{bmatrix} \qquad (15)

A_i^{t} = \frac{\left(A_i^{t} - m_i\right) \times \left(k_i^{t} - j_i^{t}\right)}{n_i - m_i} + j_i^{t} \qquad (16)

Step-2 Ants Trapped in Ant Lions' Traps

j_i^{t} = AntLion_j^{t} + j^{t}; \qquad k_i^{t} = AntLion_j^{t} + k^{t} \qquad (17)

Step-3 Building the Traps
For the selection of ant lions, the roulette wheel is used; through this mechanism the fitter ant lions have a higher chance of trapping ants.

Step-4 Slipping Ants

j^{t} = \frac{j^{t}}{Z}; \qquad k^{t} = \frac{k^{t}}{Z} \qquad (18)

Step-5 Hunting the Ant

AntLion_j^{t} = Ant_i^{t} \quad \text{if } f\left(Ant_i^{t}\right) > f\left(AntLion_j^{t}\right) \qquad (19)

Step-6 Elitism

Ant_i^{t} = \frac{Q_A^{t} + Q_E^{t}}{2} \qquad (20)

Step-7 Lévy Flight Operator

Levy(\alpha) = \left[ \frac{\Gamma(1 + \alpha) \times \sin(0.5\pi\alpha)}{\Gamma\left(\frac{1+\alpha}{2}\right) \times \alpha \times 2^{0.5(\alpha - 1)}} \right]^{1/\alpha} \qquad (21)

A_i^{t} = m_i + Levy \times (n_i - m_i) \qquad (22)
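A compact sketch of the stochastic walk of Eqs. (13)-(14) and the Lévy operator of Eqs. (21)-(22) is given below; the Lévy step is drawn with the common Mantegna-style sampling, and the parameter values are illustrative assumptions.

```python
import math
import numpy as np

rng = np.random.default_rng(1)

def levy_sigma(alpha: float) -> float:
    """Coefficient of Eq. (21)."""
    num = math.gamma(1 + alpha) * math.sin(0.5 * math.pi * alpha)
    den = math.gamma((1 + alpha) / 2) * alpha * 2 ** (0.5 * (alpha - 1))
    return (num / den) ** (1 / alpha)

def levy_step(alpha: float, size) -> np.ndarray:
    """Mantegna-style draw, one common way to realise the Lévy operator."""
    u = rng.normal(0.0, levy_sigma(alpha), size)
    v = rng.normal(0.0, 1.0, size)
    return u / np.abs(v) ** (1 / alpha)

def cumulative_walk(iters: int) -> np.ndarray:
    """Eqs. (13)-(14): cumulative sum of +/-1 steps for one ant."""
    steps = np.where(rng.random(iters) > 0.5, 1, -1)
    return np.concatenate(([0], np.cumsum(steps)))

# Eq. (22)-style reposition of one ant inside bounds [m, n] (clipped here so
# the illustrative result stays feasible).
m, n = np.array([-1.0, -1.0]), np.array([1.0, 1.0])
ant = np.clip(m + levy_step(alpha=1.5, size=2) * (n - m), m, n)
print(cumulative_walk(10), ant)
```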
4 Multi-objective Optimization Strategies The multi-objective optimization function is given below:

\min \left[ J_1(a, b), J_2(a, b), \ldots, J_s(a, b) \right]

where s is the number of objectives that are amalgamated for the multi-objective problem. To implement this, initially a random population is generated, and with the help
of which the non-dominated sorting Pareto Front Solutions (PFS) are obtained and stored in an archive; leader selection is then implemented to identify the best PFS, and finally a fuzzy decision-making tool is used to obtain the compromise solution.
4.1 Non-dominated Sorting Non-dominated sorting helps to identify the PFS for multi-objective optimization problems. Consider two solutions from the PFS, say S1 and S2, and check their dominance over each other. A vector a_1 dominates a_2 when the conditions below are satisfied:

\forall x = 1, 2, \ldots, s: \; S_x(a_1) \le S_x(a_2) \quad \text{and} \quad \exists y = 1, 2, \ldots, s: \; S_y(a_1) < S_y(a_2) \qquad (23)

Solutions that are not dominated by any other solution in the search vicinity are known as Pareto optimal, and together they form the Pareto Front solutions.
4.2 Archive Selection To improve the distribution of solutions, there must be a limit for the archive and the distribution can be measured using niching approach which investigates the region of each solution for which there are two mechanisms are used. Firstly, select the ant lions from the least populated neighborhood which is represented as probability of choosing a solution in the archive: Hi =
c Mi
(24)
where c is a constant greater than one and $M_i$ is the number of solutions in the region of the ith solution. Second, when the archive is full, the solutions with highly populated neighborhoods are removed to accommodate better solutions. The probability of removing a solution from the archive is:

$H_i = \frac{M_i}{c}$  (25)
4.3 Fuzzy Approach

The best compromise solution is extracted from the PFS using a fuzzy decision-making tool; this fuzzy approach yields the final operating solution.
$\mu_i^j = \begin{cases} 1, & S_i^j \le \min(S_i) \\ \dfrac{\max(S_i) - S_i^j}{\max(S_i) - \min(S_i)}, & \min(S_i) \le S_i^j \le \max(S_i) \\ 0, & S_i^j \ge \max(S_i) \end{cases}$  (26)
The membership μ is normalized for the jth Pareto solution using Eq. (27) [9]:

$\mu_{norm}^j = \frac{\sum_{i=1}^{s} W_i \mu_i^j}{\sum_{j=1}^{N_{sup}} \sum_{i=1}^{s} W_i \mu_i^j}; \quad \text{where } W_i \ge 0; \ \sum_{i=1}^{s} W_i = 1$  (27)
5 Results and Analysis

In this section, the proposed algorithm is validated on three benchmark test functions (the Sphere and Rastrigin functions for single-objective optimization and the Schaffer 2 test function for multi-objective optimization) and on the IEEE 30-bus electrical system.
5.1 Example-1: Benchmark Test Functions

Sphere Function

$f(x) = \sum_{i=1}^{d} x_i^2$  (28)
Rastrigin Function

$f(x) = 10d + \sum_{i=1}^{d} \left[x_i^2 - 10\cos(2\pi x_i)\right]$  (29)
The benchmark test functions are evaluated using the proposed method, and the convergence characteristics of the Sphere and Rastrigin functions are shown in Figs. 1 and 2, respectively. These figures show that the proposed method converges to a better optimal value. Furthermore, to demonstrate the potential of the proposed MOAALO technique, the Schaffer 2 multi-objective benchmark problem is considered:

Minimize $g_1(x) = x^2; \quad g_2(x) = (x-2)^2$  (30)
The Pareto front solutions selected for each weighted-sum combination are listed in Table 1.
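For reference, the three benchmark objectives of Eqs. (28)-(30) and the weighted-sum scalarization behind the (W1, W2) rows of Table 1 can be written compactly as below (a sketch only; the optimizer itself is not shown):

```python
import math

def sphere(x):
    # Eq. (28)
    return sum(xi ** 2 for xi in x)

def rastrigin(x):
    # Eq. (29)
    return 10 * len(x) + sum(xi ** 2 - 10 * math.cos(2 * math.pi * xi) for xi in x)

def schaffer2(x):
    # Eq. (30): the two objectives of the Schaffer 2 problem
    return x ** 2, (x - 2) ** 2

def weighted_sum(x, w1, w2):
    # scalarized objective used for each (W1, W2) combination in Table 1
    g1, g2 = schaffer2(x)
    return w1 * g1 + w2 * g2
```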
Fig. 1 Comparison of different methods for sphere function
Fig. 2 Comparison of different methods for Rastrigin function
5.2 Example-2: Electrical Test System

The IEEE 30-bus electrical system is tested using the proposed method. The effect of the practical constraints is considered, and the optimal power flow problem is solved for the three objective functions under the following four cases:

Case A: Without ramp rate (RR) and prohibited operating zone (POZ) limits
Case B: With RR limits
Table 1 Results for Schaffer 2 test function

Set No | W1  | W2  | Existing MALO Func.1 | Existing MALO Func.2 | Proposed MAALO Func.1 | Proposed MAALO Func.2
1      | 0.9 | 0.1 | 0.005322             | 3.713522             | 0.032498              | 3.311412
2      | 0.8 | 0.2 | 0.005322             | 3.713522             | 0.032511              | 3.311278
3      | 0.7 | 0.3 | 0.005322             | 3.713522             | 0.104931              | 2.809208
4      | 0.6 | 0.4 | 0.250619             | 2.248144             | 0.167596              | 2.530056
5      | 0.5 | 0.5 | 0.250619             | 2.248144             | 0.167596              | 2.530056
6      | 0.4 | 0.6 | 0.43681              | 1.793146             | 0.691433              | 1.365335
7      | 0.3 | 0.7 | 0.80089              | 1.221192             | 0.982613              | 1.017539
8      | 0.2 | 0.8 | 0.919412             | 1.083973             | 0.982613              | 1.017539
9      | 0.1 | 0.9 | 0.919412             | 1.083973             | 0.982613              | 1.017539
Case C: With POZ limits
Case D: With both practical constraints (RR and POZ).

The optimal power flow results for the considered objectives are tabulated in Table 2 for all four cases above. It can be seen that the emission and the total power loss are optimized effectively by the proposed algorithm. From Table 2 it can also be observed that, as the practical constraints are applied, the values of the considered objective functions increase. It can further be concluded from Table 3 that, under the practical constraints, the generators are rescheduled: for the minimization of total power loss and emission, all generators operate at their maximum limit except the generator at the slack bus, while for the minimization of the generation fuel cost, all generators operate at their minimum limit except the generator at the slack bus. For further analysis, both constraints, i.e. the ramp-rate limit and the POZ limit, are considered. The objectives were then optimized simultaneously for the three amalgamations mentioned below:
1. Fuel cost and Emission.
2. Fuel cost and Loss.
3. Emission and Total power loss.
The Pareto fronts are generated, the best Pareto front is identified, and fuzzy decision making is applied to select the compromise solution for each amalgamation; the selected Pareto front solutions are given in Table 4.
Table 2 Results of practical constraints for electrical systems (rows: PG1, PG2, PG5, PG8, PG11 and PG13 in MW, total loss in MW, emission in ton/h and generation fuel cost in $/h; columns: Cases A to D for each of the fuel cost, emission and loss objectives). Bold entries denote the value of the objective function being optimized.
Table 3 Operation of generators w.r.t. the ramp-rate limit and POZ limit (rows: PG1, PG2, PG5, PG8, PG11 and PG13; columns: Cases A to D for each of the fuel cost, emission and loss objectives; entries are coded as RR1/RR2 and P1-P4).
* RR1—near-by up ramp rate, RR2—near-by down ramp rate, P1—less than POZ lower limit, P2—more than POZ upper limit, P3—equal to POZ lower limit, P4—equal to POZ upper limit
Table 4 Three amalgamations for without and with practical constraints (rows: weightage sets SNO 1-9 with W1 from 0.9 down to 0.1 and W2 from 0.1 up to 0.9; columns: generation fuel cost in $/h, emission in ton/h and total power loss in MW for Amalgamation-1 (fuel cost and emission), Amalgamation-2 (fuel cost and loss) and Amalgamation-3 (emission and total power loss), each reported without and with practical constraints).
6 Conclusion

To simultaneously solve multiple objectives, namely minimization of cost, emission and transmission power loss, this paper proposed a novel AALO algorithm. The proposed algorithm was tested on the Rastrigin and Sphere test functions, and the results show that it performs better than the methods reported in the existing literature. Furthermore, the Multi-objective Ameliorated Ant Lion Optimization Algorithm was tested on the Schaffer 2 test function, where it obtained better Pareto front solutions than the existing Multi-objective Ant Lion Optimization Algorithm. The practical constraints were considered for the analysis of the IEEE 30-bus electrical system, and the complete analysis of the system is described in this paper.
References

1. Mallala B, Dwivedi D (2022) Salp swarm algorithm for solving optimal power flow problem with thyristor-controlled series capacitor. J Electron Sci Technol 20(2):111–119. https://doi.org/10.1016/j.jnlest.2022.100156
2. Shaikh MS, Raj S, Babu R, Kumar S, Sagrolikar K (2023) A hybrid moth–flame algorithm with particle swarm optimization with application in power transmission and distribution. Decis Anal J 6:100182. https://doi.org/10.1016/j.dajour.2023.100182
3. Ahmadipour M, Murtadha Othman M, Salam Z, Alrifaey M, Mohammed Ridha H, Veerasamy V (2023) Optimal load shedding scheme using grasshopper optimization algorithm for islanded power system with distributed energy resources. Ain Shams Eng J 14(1). https://doi.org/10.1016/j.asej.2022.101835
4. Fatehi M, Toloei A, Niaki STA, Zio E (2023) An advanced teaching-learning-based algorithm to solve unconstrained optimization problems. Intell Syst Appl 17. https://doi.org/10.1016/j.iswa.2022.200163
5. Tukkee AS, bin A. Wahab NI, binti Mailah NF (2023) Optimal sizing of autonomous hybrid microgrids with economic analysis using Grey Wolf optimizer technique. In: e-Prime, advances in electrical engineering, electronics and energy, p 100123, Feb 2023. https://doi.org/10.1016/j.prime.2023.100123
6. Balasubbareddy M, Dwivedi D, Murthy GVK, Kumar KS (2023) Optimal power flow solution with current injection model of generalized interline power flow controller using ameliorated ant lion optimization. Int J Electr Comput Eng 13(1):1060–1077. https://doi.org/10.11591/ijece.v13i1
7. Paul K, Sinha P, Bouteraa Y, Skruch P, Mobayen S (2023) A novel improved manta ray foraging optimization approach for mitigating power system congestion in transmission network. IEEE Access 11:10288–10307. https://doi.org/10.1109/ACCESS.2023.3240678
8. Reddy MB, Obulesh YP, Raju SS (2012) Particle swarm optimization based optimal power flow for volt-var control. 7(1). [Online]. Available: www.arpnjournals.com
9. Singh DK, Srivastava S, Khanna RK (2020) Optimal placement of IPFC for solving optimal power flow problems using Hybrid Sine-Cosine algorithm. 19(4):3064–3080. https://doi.org/10.17051/ilkonline.2020.04.764681
10. Ali MH, El-Rifaie AM, Youssef AAF, Tulsky VN, Tolba MA (2023) Techno-economic strategy for the load dispatch and power flow in power grids using peafowl optimization algorithm. Energies (Basel) 16(2):846. https://doi.org/10.3390/en16020846
11. SVS College of Engineering, Institute of Electrical and Electronics Engineers Madras Section, and Institute of Electrical and Electronics Engineers. Proceedings of 2019 third IEEE international conference on electrical, computer and communication technologies
12. Liu Y, Ćetenović D, Li H, Gryazina E, Terzija V (2022) An optimized multi-objective reactive power dispatch strategy based on improved genetic algorithm for wind power integrated systems. Int J Electr Power Energ Syst 136. https://doi.org/10.1016/j.ijepes.2021.107764
13. Lian L (2022) Reactive power optimization based on adaptive multi-objective optimization artificial immune algorithm. Ain Shams Eng J 13(5). https://doi.org/10.1016/j.asej.2021.101677
14. Balasubbareddy M, Sivanagaraju S, Suresh CV (2015) Multi-objective optimization in the presence of practical constraints using non-dominated sorting hybrid cuckoo search algorithm. Eng Sci Technol Int J 18(4):603–615. https://doi.org/10.1016/j.jestch.2015.04.005
15. Balasubbareddya M, Sivanagarajub S, Venkata Sureshc C, Naresh Babud AV, Srilathaa D (2017) A non-dominated sorting hybrid cuckoo search algorithm for multi-objective optimization in the presence of FACTS devices. Russ Electr Eng 88(1):44–53. https://doi.org/10.3103/S1068371217010059
16. Balasubbareddy M (2017) A solution to the multi-objective optimization problem with FACTS devices using NSHCSA including practical constraints
17. Balasubbareddy M (2016) Multi-objective optimization in the presence of ramp-rate limits using non-dominated sorting hybrid fruit fly algorithm. Ain Shams Eng J 7(2):895–905. https://doi.org/10.1016/j.asej.2016.01.005
18. Guo Y, Zhu X, Deng J, Li S, Li H (2022) Multi-objective planning for voltage sag compensation of sparse distribution networks with unified power quality conditioner using improved NSGA-III optimization. Energ Rep 8:8–17. https://doi.org/10.1016/j.egyr.2022.08.120
19. Nazir MS, Almasoudi FM, Abdalla AN, Zhu C, Alatawi KSS (2023) Multi-objective optimal dispatching of combined cooling, heating and power using hybrid gravitational search algorithm and random forest regression: towards the microgrid orientation. Energ Rep 9:1926–1936. https://doi.org/10.1016/j.egyr.2023.01.012
20. Mallala B, Papana VP, Sangu R, Palle K, Chinthalacheruvu VKR (2022) Multi-objective optimal power flow solution using a non-dominated sorting hybrid fruit fly-based artificial bee colony. Energies (Basel) 15(11). https://doi.org/10.3390/en15114063
21. Lv D, Xiong G, Fu X, Wu Y, Xu S, Chen H (2022) Optimal power flow with stochastic solar power using clustering-based multi-objective differential evolution. Energies (Basel) 15(24). https://doi.org/10.3390/en15249489
22. Reddy MB, Obulesh YP, Sivanagaraju S, Suresh CV (2016) Mathematical modelling and analysis of generalised interline power flow controller: an effect of converter location. J Exp Theor Artif Intell 28(4):655–671. https://doi.org/10.1080/0952813X.2015.1042529
23. Reddy MB, Obulesh YP, Raju SS, Suresh V (2014) Optimal power flow in the presence of generalized interline power flow controller. [Online]. Available: www.ijrte.org
24. Reddy MB, Obulesh YP, Raju SS (1999) Analysis and simulation of series facts devices to minimize transmission loss and generation cost
An Implementation of Lightweight Cryptographic Algorithm for IOT Healthcare System Ahina George , Divya James , and K. S. Lakshmi
Abstract IoT connects billions of heterogeneous devices, known as Things, using various communication technologies and protocols to enable end users worldwide to access a range of smart applications. The majority of the cryptographic algorithms used today were developed for desktop and server contexts and are therefore not suitable for small devices, so lightweight encryption algorithms are considered. In the proposed system, readings from real-time IoT traffic are taken using sensors and sent to a Raspberry Pi, where they are encrypted by four lightweight cryptographic algorithms: SPECK, SIMON, CLEFIA and PRESENT. The same four algorithms are used to decrypt the encrypted values after they are retrieved from the hardware module via an MQTT broker, thus enabling data security in an IoT healthcare application. A comparative analysis is made of the four lightweight cryptographic algorithms, and their performance characteristics are analyzed based on factors such as time and throughput. The results show that the average time taken for encryption/decryption by the SPECK algorithm was the least, followed by SIMON, CLEFIA and PRESENT. Based on the throughput analysis, the PRESENT algorithm has the least throughput, followed by CLEFIA, SIMON and SPECK.

Keywords Block Cipher · Encryption · Decryption · Lightweight cryptography · Internet of Things · CLEFIA · Raspberry Pi · Security
A. George (B) · D. James · K. S. Lakshmi Department of IT, Rajagiri School of Engineering and Technology, Kakkanad, Kerala 683104, India e-mail: [email protected] D. James e-mail: [email protected] K. S. Lakshmi e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_17
1 Introduction

Cryptography is the process of encrypting and decrypting data in order to restrict unauthorized access. The majority of the cryptographic algorithms used today were developed for desktop and server contexts and are therefore not suitable for small devices, so lightweight encryption algorithms (LWC) are considered [17]. The creation of new cryptographic algorithms optimized for such constrained devices has been the subject of extensive research over the past decade, and these cryptographic methods are frequently referred to as "lightweight" algorithms [17]. The parameters for hardware implementations, where encryption is built into the device, include the amount of physical space needed for the circuit implementing the algorithm, the power required, and the time needed to obtain the circuit's output [2]. Lightweight cryptography solves many of the problems of traditional cryptography on devices with constrained physical size, computational capability, limited memory, and power consumption, and it helps secure networks of smart things due to its efficiency and reduced footprint [4]. In the proposed system, the readings from the IoT traffic are taken using sensors and sent to the Raspberry Pi. Encryption is done by four lightweight cryptographic algorithms: SPECK, SIMON, CLEFIA and PRESENT [6]. The encrypted values are retrieved from the hardware module via an MQTT broker, decrypted using the same four algorithms, and the results are then displayed on a web application [14]. In the upcoming sections, Sect. 2 presents the methodology, covering the materials used in the proposed system (Sect. 2.1) and the proposed architecture (Sect. 2.2), which includes the Light Crypto-Care layered architecture (Sect. 2.2.1), its implementation environment (Sect. 2.2.2) and its development technologies (Sect. 2.2.3). Section 3 describes the experimental setup and results, including the performance evaluation of the LWC algorithms (Sect. 3.1). Section 4 presents the discussion, and Sect. 5 concludes the paper.
2 Methodology

2.1 Materials

Figure 1 shows the Raspberry Pi Zero W, the smallest Raspberry Pi with integrated Wi-Fi and Bluetooth; it is super-compact and super-thin. Its dimensions are 65 mm long, 30 mm wide and 5 mm thick, and it is 40% quicker than the original Raspberry Pi. The Raspberry Pi Zero W is well suited to embedded Internet of Things (IoT) projects because it has wireless LAN and Bluetooth [11]. The Pi Zero W uses a tiny jack and a bare 40-pin GPIO header to remain as versatile and small as possible. Figure 2 shows the pulse oximeter sensor. The MAX30100 is a sensor with an integrated pulse oximeter and heart rate monitor. It incorporates two LEDs, a photodetector, improved optics, and low-noise analogue signal processing to detect pulse oximetry and heart rate signals [9]. A key feature of the MAX30100 is that it reads the absorbance levels of both LEDs and stores them in a readable buffer accessible via I2C. Figure 3 shows the single-wire digital temperature sensor, the DS18B20. For use with an Arduino, the OneWire and DallasTemperature libraries are necessary. One sensor or several sensors can be used on the same data line, since each sensor is uniquely identified by an address [8]. Additionally, there is no need for an external power source because the sensor can be powered directly from the data line [12]. The Sunrom BP sensor is depicted in Fig. 4; it measures the user's blood pressure and heart rate, stores the data, and emails it to the administrator. A typical reading consists of three values separated by a comma and a space: the systolic, diastolic and pulse values [5].

Fig. 1 Raspberry PI ZERO W
Fig. 2 Pulse oximeter sensor: MAX30100
Fig. 3 Temperature sensor: DS18B20
Fig. 4 Sunrom BP sensor
CLEFIA provides 128-bit blocks with a choice of 128, 192 and 256-bit keys in rounds of 18, 22 and 26 respectively (P. Saravanan). The most compact version requires 2488GE (encryption only) for a 128-bit key, so it offers high performance and high resistance to various attacks at a relatively high cost. CLEFIA’s high resistance to security attacks is achieved through its double entanglement and diffusion properties [19]. Conversely, it requires more memory and limits its use in very small applications. CLEFIA is based on the Feistel structure [15, 16]. PRESENT is based on the Substitution-Permutation network and uses 64-bit blocks in two major variants: 80-bit and 128-bit keys with GE 1570 and 1886 requirements [18]. The minimum GE requirements specified for the PRESENT version are approx. 1000GE (encryption only) requiring 2520–3010 GE for proper security. This is a hardware efficient algorithm that uses a 4-bit S-box (replacement layer—replace 8 S-boxes with 1 S-box), but the software (permutation layer) requires large cycles that require an improved version of this algorithm [7]. SIMON is a lightweight block cipher suite developed by the NSA to provide ciphers with optimal hardware performance. The SIMON circuit is a classic Feistel circuit that operates in two n-bit halves in each round, so the total round block size is 2n bits. The rest of this article will use n to represent half the size of the cipher block, i.e. the size of the left and right branches, respectively.
Speck is an ARX ("add, rotate, XOR") construction: the non-linearity comes from modular addition, while XOR and rotation provide the linear mixing. Modular addition is a more natural choice for software performance than the bitwise AND used in Simon [3]. Since the majority of the cryptographic methods currently in use were created for desktop and server environments and are not appropriate for small devices, lightweight encryption techniques are taken into consideration, and we present a model called Light Crypto-Care that addresses these problems. This model is discussed in the next section.
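As a rough illustration of the ARX structure just described, the sketch below shows one round of a Speck-style cipher in Python; the key schedule and the full parameter set of the cipher are omitted, and the 32-bit word size with rotation amounts 8 and 3 follows the common Speck convention rather than the exact configuration used in the experiments:

```python
# Minimal sketch of the Speck ARX round structure (illustrative only).
N_BITS = 32
MASK = (1 << N_BITS) - 1
ALPHA, BETA = 8, 3

def rotr(v, r):
    return ((v >> r) | (v << (N_BITS - r))) & MASK

def rotl(v, r):
    return ((v << r) | (v >> (N_BITS - r))) & MASK

def speck_round(x, y, k):
    # add-rotate-xor: modular addition supplies the non-linearity,
    # rotation and XOR supply the linear mixing
    x = (rotr(x, ALPHA) + y) & MASK
    x ^= k
    y = rotl(y, BETA) ^ x
    return x, y

def speck_encrypt_block(x, y, round_keys):
    # apply the round function once per round key
    for k in round_keys:
        x, y = speck_round(x, y, k)
    return x, y
```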
2.2 Proposed Architecture Ubiquitous, sometimes known as pervasive, refers to something that is “existing everywhere” [19]. Unlike desktop computing, ubiquitous computing may occur in any location, device, and format. This paradigm is the product of exponentially increasing computer technology to integrate computing into the environment. Lightweight cryptographic techniques are used to create the proposed system. The proposed system is named as Light Crypto-Care which is used for IoT healthcare applications. The Layered Light Crypto-Care architecture is given below.
2.2.1 Light Crypto-Care Layered Architecture
Figure 5 shows the Light Crypto-Care Layered Architecture. It consists of four layers namely, Light Crypto-Care Application Layer, Light Crypto-Care Data Processing Layer, Light Crypto-Care Networking Layer and Light Crypto-Care Sensing Layer.
Light Crypto-Care Sensing Layer In Light Crypto-Care Sensing Layer, we have trusted IoT devices that are connected to the Raspberry Pi. The sensors that are used are the Temperature sensor which measures the temperature values, the Pulse oximeter sensor that reads the oxygen saturation levels and the blood pressure sensor that reads the blood pressure values. The readings from the IoT traffic is taken using sensors and are sent to the Raspberry Pi.
Light Crypto-Care Networking Layer The authentication of IoT devices is handled by the Light Crypto-Care Networking Layer. To retrieve the values from the Raspberry Pi and verify the data’s authenticity, an MQTT broker is employed.
Fig. 5 Light crypto-care layered architecture
Light Crypto-Care Data Processing Layer The data is examined and pre-processed in the data processing layer. The Light Crypto-Care Data Processing Layer handles the real-time analysis and the storage of data in the database.
Light Crypto-Care Application Layer The Light Crypto-Care Application Layer contains the smart real time dashboard. The Application layer defines all the applications in which IoT has deployed and serves as an interface by displaying values to the user and facilitating user interaction.
2.2.2 Implementation Environment of Light Crypto-Care
Figure 6 shows the implementation environment of Light Crypto-Care. In Light Crypto-Care, trusted IoT devices are connected to the Raspberry Pi. The sensors used are the temperature sensor, which measures the temperature values, the pulse oximeter sensor, which reads the oxygen saturation levels, and the blood pressure sensor, which reads the blood pressure values. The readings from the IoT traffic are taken using the sensors and sent to the Raspberry Pi. The values are then encrypted using four lightweight cryptographic algorithms: SPECK, SIMON, CLEFIA and PRESENT. The encrypted values are then retrieved from the hardware module via an MQTT broker and are decrypted using the same four lightweight cryptographic algorithms. The decrypted values are then published to the web application when it subscribes to the MQTT broker, and the data values are stored in the database.

Fig. 6 Implementation environment of light crypto-care
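A minimal sketch of the publish/subscribe path described above, using the paho-mqtt client; the broker address, the topic name and the encrypt/decrypt helpers are illustrative placeholders, not the exact implementation of Light Crypto-Care:

```python
import json
import paho.mqtt.client as mqtt

BROKER = "broker.example.local"      # hypothetical MQTT broker address
TOPIC = "lightcryptocare/vitals"     # hypothetical topic name

def publish_reading(encrypt, reading):
    """Raspberry Pi side: encrypt a sensor reading and publish it."""
    client = mqtt.Client()
    client.connect(BROKER, 1883)
    payload = encrypt(json.dumps(reading).encode())   # any of the four LWC ciphers
    client.publish(TOPIC, payload)
    client.disconnect()

def run_subscriber(decrypt, store):
    """Web-application side: receive, decrypt and store the values."""
    def on_message(client, userdata, msg):
        store(json.loads(decrypt(msg.payload).decode()))

    client = mqtt.Client()
    client.on_message = on_message
    client.connect(BROKER, 1883)
    client.subscribe(TOPIC)
    client.loop_forever()
```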
2.2.3 Development Technologies for Light Crypto-Care
Table 1 shows the development technologies for the proposed Light Crypto-Care architecture. The Raspberry Pi Zero W serves as the basis for the Light Crypto-Care hardware module, and sensors such as the Sunrom blood pressure sensor, the DS18B20 temperature sensor, and the MAX30100 oxygen sensor are attached to it.

Table 1 Development technologies for the Light Crypto-Care architecture

Components            | Descriptions
Hardware              | Raspberry PI Zero W
RAM                   | 2 GB
Hard disk             | 250 GB or higher
Processor             | i3 or higher
Operating system      | Windows 7 or later
Library and framework | Django (Python web framework)
Sensors               | Temperature sensor (DS18B20), Oxymeter sensor (MAX30100), Sunrom blood pressure sensor
IDE                   | Python IDE
DBMS                  | SQLite3
Programming language  | Python 3, HTML, CSS, JS, C
3 Experimental Setup and Results

Figure 7 shows the hardware module, which consists of the Raspberry Pi, the BP sensor, the temperature sensor and the pulse oximeter sensor. The Raspberry Pi reads the pressure value from the blood pressure sensor, the temperature value from the temperature sensor, and the oxygen saturation level from the pulse oximeter sensor. The readings from the sensors are sent to the Raspberry Pi, where the values are encrypted using the four lightweight cryptographic algorithms SPECK, SIMON, CLEFIA and PRESENT, and the encrypted values are fetched from the hardware through an MQTT broker. Figure 8 shows the real-time sensor values displayed on the Raspberry Pi: the sensors connect to the Raspberry Pi and all sensor values are read in real time, encrypted by the four lightweight cryptographic algorithms, and published to an MQTT broker, from which the encrypted values are received when the web application subscribes to them. Figure 9 shows the encryption of the real-time sensor values on the Raspberry Pi. The values encrypted by the lightweight cryptographic algorithms are published to the MQTT broker, received by the web application through its subscription, and inserted into the database. Figure 10 shows the decryption of the encrypted values: they are decrypted using the lightweight cryptographic algorithms and the decrypted values are stored in the database. When the user logs in to the webpage, the values are displayed and can be plotted. The results show that the average time taken for encryption/decryption by the SPECK algorithm was the least, followed by SIMON, CLEFIA and PRESENT. Based on the throughput analysis, the PRESENT algorithm has the least throughput, followed by CLEFIA, SIMON and SPECK.

Fig. 7 Implementation environment of light crypto-care
Fig. 8 Real time sensor values displayed in Raspberry PI
Fig. 9 Encryption of real time sensor values on Raspberry PI
3.1 Performance Evaluation

Table 2 shows the performance evaluation of the lightweight encryption algorithms CLEFIA, PRESENT, SIMON and SPECK. Performance characteristics such as the average time for encryption/decryption and the throughput were analyzed. Figure 11 shows the average time analysis of the lightweight algorithms. The average time for the encryption/decryption process is calculated for all the algorithms as follows:
Fig. 10 Decryption of real time sensor values on Anaconda Prompt

Table 2 Performance evaluation

LWC     | Average time (ms)     | Throughput (records/ms)
SPECK   | 0.04323106288909912   | 23,131.515469913506
SIMON   | 0.09637074947357177   | 10,376.592539360036
CLEFIA  | 0.29179857969284857   | 3427.0214784891755
PRESENT | 0.8663344812393189    | 1154.2885821299278

Fig. 11 Average time analysis of lightweight algorithms
Fig. 12 Throughput analysis of lightweight algorithms
$\text{Average time for encryption/decryption} = \frac{\text{total time taken to encrypt/decrypt all the records}}{\text{total number of records}}$

Figure 12 shows the throughput analysis of the lightweight algorithms. The throughput is calculated for all the algorithms as follows:

$\text{Throughput} = \frac{\text{total number of records}}{\text{total time taken for encryption of all the records}}$
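Both metrics follow directly from the recorded timings; a small sketch (the records and the cipher's encrypt function are placeholders, not the exact benchmarking harness used here):

```python
import time

def timing_metrics(encrypt, records):
    """Return (average time in ms per record, throughput in records/ms)."""
    start = time.perf_counter()
    for record in records:
        encrypt(record)
    total_ms = (time.perf_counter() - start) * 1000.0
    average_time = total_ms / len(records)   # ms per record
    throughput = len(records) / total_ms     # records per ms
    return average_time, throughput
```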
4 Discussions Lightweight cryptography helps secure networks of smart things due to its efficiency and footprint reduction. In the proposed system, Light Crypto-Care, the readings from the real time IoT traffic is taken using sensors and are sent to the Raspberry Pi. Encryption is done by four lightweight cryptographic algorithms. The Lightweight cryptographic algorithms considered are SPECK, SIMON, CLEFIA and PRESENT. The four lightweight cryptographic algorithms are used to decrypt the encrypted values after they are retrieved from the hardware module via an MQTT broker thus enabling data security in IoT Healthcare application. The results are then displayed on a web application. Results shows the comparison made on the four different lightweight encryption techniques and their performance characteristics is analyzed based on various factors such as time and throughput.
5 Conclusions and Future Work

IT security of IoT devices is one of the most prominent cyber security challenges facing the modern technology ecosystem. The majority of the cryptographic algorithms used today were developed for desktop and server contexts and are therefore not suitable for small devices, so lightweight encryption algorithms are considered. The creation of new cryptographic algorithms optimized for constrained devices has been the subject of extensive research over the past decade; these cryptographic methods are frequently referred to as "lightweight" algorithms. The lightweight cryptographic algorithms SPECK, SIMON, CLEFIA and PRESENT are used in the Light Crypto-Care system to secure IoT healthcare systems. The availability of hardware resources and problems with network connections were some of the challenges encountered. The Light Crypto-Care architecture also compares the lightweight encryption techniques and analyses their performance characteristics based on average time and throughput. The results show that the average time taken for encryption/decryption by the SPECK algorithm was the least, followed by SIMON, CLEFIA and PRESENT. Based on the throughput analysis, the PRESENT algorithm has the least throughput, followed by CLEFIA, SIMON and SPECK. The future scope of this work is to implement encryption and decryption of values with hardware- and software-efficient lightweight cryptographic algorithms.

Acknowledgements I would like to thank Divya James, Asst. Professor in the Dept. of IT at Rajagiri School of Engineering and Technology, who guided the conduct of this experiment.
References 1. Aakash D, Shanthi P (2016) Lightweight security algorithm for wireless node connected with IoT. Indian J Sci Technol 9:1–8 2. Ahmadi S, Delavar M, Mohajeri J, Aref MR (2014, September) Security analysis of CLEFIA128. In: 2014 11th international ISC conference on information security and cryptology. IEEE, pp 84–88 3. Ali SS, Mukhopadhyay D (2013, August) Improved differential fault analysis of CLEFIA. In: 2013 workshop on fault diagnosis and tolerance in cryptography. IEEE, pp 60–70 4. Bansod G, Raval N, Pisharoty N (2014) Implementation of a new lightweight encryption design for embedded security. IEEE Trans Inform Forensics Sec 10(1):142–151 5. Choi MG (2022) Use of serious games for the assessment of mild cognitive impairment in the elderly. Appl Comput Sci 18(2) 6. Katagi M, Moriai S (2008) Lightweight cryptography for the internet of things. Sony Corporation, pp 7–10 7. Luo H, Chen W, Ming X, Wu Y (2021) General differential fault attack on PRESENT and GIFT cipher with nibble. IEEE Access 9:37697–37706 8. Morris JC, Heyman A, Mohs RC, Hughes JP, van Belle G, Fillenbaum GDME, Clark C et al (1989) The consortium to establish a registry for Alzheimer’s disease (CERAD): I. Clinical and neuropsychological assessment of Alzheimer’s disease. Neurology
9. Mozaffari-Kermani M, Azarderakhsh R (2012) Efficient fault diagnosis schemes for reliable lightweight cryptographic ISO/IEC standard CLEFIA benchmarked on ASIC and FPGA. IEEE Trans Ind Electron 60(12):5925–5932 10. O’caoimh R, Sweeney C, Hynes H, McGlade C, Cornally N, Daly E, Molloy W et al (2015) COLLaboration on AGEing-COLLAGE: Ireland’s three star reference site for the European Innovation Partnership on Active and Healthy Ageing (EIP on AHA). Eur Geriatr Med 5(6):505–511 11. Pyrgas L, Kitsos P (2019, August) A very compact architecture of CLEFIA block cipher for secure IoT systems. In: 2019 22nd Euromicro conference on digital system design (DSD). IEEE, pp 624–627 12. Sadkhan SB, Salman AO (2018, March) A survey on lightweight-cryptography status and future challenges. In: 2018 international conference on advance of sustainable engineering and its application (ICASEA). IEEE, pp 105–108 13. Saravanan P, Rani SS, Rekha SS, Jatana HS (2019, July) An efficient ASIC implementation of CLEFIA encryption/decryption algorithm with novel S-box architectures. In: 2019 IEEE 1st international conference on energy, systems and information processing (ICESIP). IEEE, pp 1–6 14. Shah P, Arora M, Adhvaryu K (2020, October) Lightweight cryptography algorithms in IoT—a study. In: 2020 fourth international conference on I-SMAC (IoT in social, mobile, analytics and cloud) (I-SMAC). IEEE, pp 332–336 15. Shirai T, Shibutani K, Akishita T, Moriai S, Iwata T (2007, March) The 128-bit blockcipher CLEFIA. In: International workshop on fast software encryption. Springer, Berlin, Heidelberg, pp 181–195 16. Suh GH, Ju YS, Yeon BK, Shah A (2004) A longitudinal study of Alzheimer’s disease: rates of cognitive and functional decline. Int J Geriatr Psychiatry 19(9):817–824 17. Thakor VA, Razzaque MA, Khandaker MR (2021) Lightweight cryptography algorithms for resource-constrained IoT devices: a review, comparison and research opportunities. IEEE Access 9:28177–28193 18. Thorat CG, Inamdar VS (2018) Implementation of new hybrid lightweight cryptosystem. Appl Comput Inform 19. Tong T, Chignell M, Tierney MC, Lee J (2016) A serious game for clinical assessment of cognitive status: validation study. JMIR Serious Games 4(1):e5006 20. World Health Organization (2017) Global action plan on the public health response to dementia 2017–2025
Home Appliances Automation Using IPv6 Transmission Over BLE Lalit Kumar and Pradeep Kumar
Abstract There are various solutions for data transfer among the growing number of Internet of Things (IoT) devices; Bluetooth, Wi-Fi, RF, IR and optical links are some of them. For home automation and other short-range data communication, Bluetooth is a good option, since it has evolved into BLE (Bluetooth Low Energy) as an efficient, low-power solution for data transfer. This work uses IPv6 over BLE for data transfer, as proposed by the Internet Engineering Task Force (IETF) in RFC 7668. BLE is widely used in IoT applications with Raspberry Pi and mobile connectivity. In this paper, we have created a connection between a mobile phone and a Raspberry Pi for controlling a home automation system. BlueZ is used for data transfer and for controlling the application through the mobile phone.

Keywords BLE · Automation · L2CAP · IPv6 · GATT
1 Introduction

This paper presents an idea for creating a BLE connection and transferring data from a mobile phone to a Raspberry Pi and from the Raspberry Pi to the mobile phone. Along with this, we hope to encourage further research into developing more efficient connections. BLE can be utilized for additional applications such as patient monitoring and environmental monitoring. Industries use various IoT applications for machine monitoring, store-environment maintenance and process control, using PTS (Product Traceability System), continuous monitoring of MCPs (Machine Critical Parameters) and CTQ (Critical to Quality) controls. The motivation for this work is to find a low-power-consumption method for industrial and home automation. This paper uses the capability of BLE for data transfer, in comparison to classic Bluetooth, and controls appliances using the Raspberry Pi pins. RFC 7668, Section 3, "Specification of IPv6 over BLE", declares that "BLE does not currently support the formation of multi-hop networks at the link layer", but we utilize this as an advantage, since we require only a limited range.
L. Kumar (B) · P. Kumar J.C.Bose University of Science and Technology, YMCA, Faridabad, Haryana, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_18
For applications such as home appliance control, no other person or Bluetooth device outside this limited range can interrupt the link or capture the data, which suits a Bluetooth-based appliance control system built with a Raspberry Pi and a mobile app. BLE is a WPAN (Wireless Personal Area Network) technology with reduced power consumption [1]. It is intended for low-power applications and uses 2 MHz channels with GFSK (Gaussian Frequency Shift Keying) modulation. GATT (Generic Attribute Profile) is used by the majority of low-energy application profiles; in GATT, attributes, which are short pieces of data, are exchanged over the low-energy link. IPSP (Internet Protocol Support Profile) is used for internet connectivity. BLE devices are discovered through a procedure based on broadcasting advertising packets. GATT uses UUIDs (Universally Unique Identifiers) for its services, characteristics and descriptor information [2]. It also provides Security Manager (SM) services with AES (Advanced Encryption Standard) encryption. Bluetooth 4.0 works in the 2.4 GHz ISM band, i.e. the same band as classic Bluetooth, but it is not compatible with classic Bluetooth. Its operation is event triggered, so a low-power device remains in sleep mode until needed and can operate for one to five years on a single coin-cell battery; it uses roughly 1–5% of the power of classic Bluetooth and can also be used in health monitoring applications. BLE has the limitation of not supporting the formation of multi-hop networks at the link layer. The Internet Engineering Task Force (IETF) is an open standards organization that develops voluntary Internet standards and helps to reduce network complexity. In this paper, we have used the following IETF documents: (a) RFC 7668: IPv6 over BLE [4]; (b) RFC 4919: link-layer connection over 6LoWPAN (Low-power Wireless Personal Area Network); (c) RFC 6775: bringing the Internet Protocol to the smallest devices; (d) RFC 6282: compression format for IPv6 datagrams over IEEE 802.15.4-based networks [3]; (e) RFC 4861: Router Solicitation message format.
these use service discovery to initiate the connection between the peripheral and the router. We can connect with using link layer connection, L2CAP and router discovery behavior. For transferring packet between two routers, one router consider as master and other used as a slave. One 6LN cannot connect with 6LN or other 6LBR without the help of its central 6LBR. Then 6LBR is connected with the internet or IoT devices. Each Node has its service universally unique identifier (UUID) with advertising data (AD) type field and IP support service UUID also defined in its advertising data. Massage fragmentation is required due to its limited transfer capabilities. 6LNs are able to communicate without 6LBR by giving IPv6 address to each node. 6LN is working as GATT server role and 6LBR as a GATT client role both use IP support service (IPSS) to discover a node. L2CAP is used for fragmentation and reassembly of massage because data transfer packet limitation.
2 Work Flow See Fig. 1.
Fig. 1 Work flow chart of proposed model
3 Proposed Algorithm

(A) Address Setup: Generate the IPv6 address assigned to the BLE network based on the 48-bit Bluetooth device address. We use a private Bluetooth device address for security purposes.
(B) Multicast a Router Solicitation message to the 6LBR with the Source Link-Layer Address Option (SLLAO): the node sends its IP source address together with the source link-layer address option and repeats the solicitation three times at 10 s intervals (or as specified by the application). This message is multicast.
(C) The 6LBR receives the message and unicasts a Router Advertisement directly to the link-local address of the 6LN. A check can also be implemented to ensure that the 6LBR has received the request for the Router Advertisement.
(D) The 6LN registers with the 6LBR by sending a Neighbor Solicitation message with the ARO (Address Registration Option), and the context table is updated.
(E) Send the Router Advertisement with the 6CO: it uses the 64-bit Extended Unique Identifier (EUI-64) with LOWPAN_IPHC for IPv6 and LOWPAN_NHC for UDP. Some fields are elided while sending data: when sending data from the 6LN to the 6LBR, some of the fields are elided, and all the fields are restored when sending data from the 6LBR to the 6LN.
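Step (A) can be illustrated with the usual expansion of a 48-bit device address to an EUI-64 based link-local address; the sketch below is only one way of forming the address, and the device address used is hypothetical:

```python
import ipaddress

def link_local_from_bd_addr(bd_addr: str) -> ipaddress.IPv6Address:
    """Derive an IPv6 link-local address from a 48-bit Bluetooth device
    address by inserting FFFE in the middle of the address and flipping
    the universal/local bit (EUI-64 expansion), then prefixing fe80::/64."""
    octets = bytes(int(b, 16) for b in bd_addr.split(":"))
    eui64 = bytes([octets[0] ^ 0x02]) + octets[1:3] + b"\xff\xfe" + octets[3:6]
    return ipaddress.IPv6Address(b"\xfe\x80" + b"\x00" * 6 + eui64)

print(link_local_from_bd_addr("B8:27:EB:12:34:56"))  # hypothetical Pi address
# -> fe80::ba27:ebff:fe12:3456
```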
4 Initial Setup and Platform

We have used the Ubuntu operating system from a bootable pen drive, following these steps:

Step 1: Install the Ubuntu OS on a bootable pen drive
Step 2: Insert it in a USB port and press F12 for booting
Step 3: Select USB storage for starting Ubuntu
Step 4: Select Boot Ubuntu-1604-32-bit, as our system is 32 bit
Step 5: Connect the Raspberry Pi to a USB port
Step 6: Select the Wired Ethernet connection
Step 7: Open the connection settings as shown in Fig. 2
Step 8: Select IPv6 and then select the "Link local only" option (Fig. 3)
Step 9: Open a terminal and log in to the Raspberry Pi using the ssh command
Step 10: When asked for the password, use raspberry as the password and log in to the system.
Fig. 2 Connection setting with all options
Fig. 3 Link local option for login Raspberry Pi with USB cable
5 Experimental Setup for Connecting the Raspberry Pi BLE with the Mobile nRF Connect App

Run the "Advertisement Program" (Figs. 4, 5, 6, 7, 8 and 9).
Fig. 4 Advertisement of BLE device to connect with other device
Fig. 5 BLE device connected with the mobile app
Fig. 6 Program for application is in running condition
Fig. 7 Showing application is running by using mobile app
Fig. 8 Showing advertising program screen photo
Fig. 9 Application program
6 Experimental Results

We have completed our prototype with the following results. While sending 00 from the mobile app, all LEDs are in the OFF state (Figs. 10, 11 and 12). While sending 01, only one LED is in the ON state (Figs. 13, 14 and 15). While sending 02 or 04, again only one LED is in the ON state, just as when sending 01. While sending 08, only one LED is in the ON state, as shown in the results (Figs. 16, 17 and 18). While sending 0F, all LEDs are in the ON state (Figs. 19, 20 and 21).
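The mapping from the received byte to the four relays can be sketched as follows (a simplified illustration using the RPi.GPIO library; the pin numbers are hypothetical and the BLE receive path is not shown):

```python
import RPi.GPIO as GPIO

RELAY_PINS = [17, 27, 22, 23]   # hypothetical BCM pins for relays 1-4

GPIO.setmode(GPIO.BCM)
for pin in RELAY_PINS:
    GPIO.setup(pin, GPIO.OUT, initial=GPIO.LOW)

def apply_command(value):
    """Drive each relay from one bit of the received byte:
    0x00 -> all OFF, 0x01 -> relay 1 ON, 0x08 -> relay 4 ON, 0x0F -> all ON."""
    for bit, pin in enumerate(RELAY_PINS):
        GPIO.output(pin, GPIO.HIGH if value & (1 << bit) else GPIO.LOW)

apply_command(0x0F)   # example: switch all four relays ON
```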
Fig. 10 Sending 00 using mobile nRF connect app
Fig. 11 All LED status showing in OFF condition in our Ubuntu screen
Fig. 12 No LED is in glow condition showing that all relays are in off condition
Fig. 13 Sending 01 using mobile nRF connect app
Fig. 14 Showing screening on Ubuntu only one LED1 is in ON condition
Fig. 15 Only first LED is in glow condition showing that only one relay is in ON condition
Fig. 16 Sending 08 using mobile nRF connect app
Fig. 17 Screening in Ubuntu showing only LED4 is in ON condition
Fig. 18 Showing only LED4 is in ON state which indicate that only relay 4 is in ON condition
Fig. 19 Sending 0f using mobile nRF connect app
Fig. 20 Screening in Ubuntu indicating all LEDs are in On condition
Fig. 21 Showing all LEDs are in ON condition and it indicating all relays are in ON condition
7 Conclusion

The relay operation takes very little time to perform, so it can be used to control other home appliances and in industrial control applications, and its coverage area is sufficient for home automation. We have implemented low-power Bluetooth operation in a home automation prototype, and our aim is to carry it into future research in the field of IoT, where it can be used in many applications, including medical and research applications. This approach improves power consumption, as fewer packets need to be transmitted after fragmentation. It is a promising direction for connecting devices in the field of IoT.
References

1. https://en.wikipedia.org/wiki/Bluetooth_Low_Energy. Accessed: 24/03/2019
2. https://www.bluetooth.com/specifications/gatt/generic-attributes-overview. Accessed: 24/03/2019
3. Hui J, Thubert P (2011) RFC 6282: Compression format for IPv6 datagrams over IEEE 802.15.4-based networks, September 2011. https://doi.org/10.17487/RFC6282
4. Nieminen J, Savolainen T, Isomaki M, Patil B, Shelby Z, Gomez C (2015) RFC 7668: IPv6 over BLUETOOTH(R) Low Energy. RFC Editor, USA. https://doi.org/10.17487/RFC7668
IoT Based Collision Avoidance System with the Case Study Using IR Sensor for Vehicles Manasvi , Neha Garg , and Siddhant Thapliyal
Abstract This paper studies a vehicle collision avoidance mechanism that handles obstacles appearing in the vehicle's path while it moves forward. The obstacle-avoiding car is an autonomous and intelligent device that senses an obstacle in its path, changes its direction of motion, then resumes its motion and navigates in an unpredictable environment. The study is carried out using an Arduino Uno, an ATmega328P microcontroller board that acts as the brain of the obstacle avoidance car. Ultrasonic sensors are used to sense the obstacles coming in its path, and a motor driver provides the drive current to all four motors used in this practical implementation.

Keywords Collision avoidance · IR sensor · Vehicles · IoT
1 Introduction A collision avoidance system, also known as a pre-crash system, forward collision warning system, or collision mitigation system, is an advanced driver assistance system designed to prevent or reduce the severity of an accident. A forward collision warning system, in its most basic form, measures a car’s speed, the speed of the automobile in front of it, and the distance between the two to inform the driver if the cars are becoming too close and maybe even avert an accident. A range of technology and sensors, including radar, laser, and cameras, are used to identify an imminent catastrophe. GPS sensors can detect stationary hazards such as approaching stop signs using a position database. These gadgets may also be capable of detecting pedestrians. Autonomous emergency braking (AEB) is a common system that is required in some countries, such as the EU, while agreements between automakers and safety officials to eventually standardise crash avoidance systems, such as in the US, as well as research projects involving some manufacturer-specific devices, are other Manasvi (B) · N. Garg · S. Thapliyal Department of Computer Science and Engineering, Graphic Era Deemed to be University, Dehradun 248002, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_19
examples of collision avoidance systems. This paper proposes building an intelligent and autonomous obstacle-avoiding robotic vehicle that uses ultrasonic sensors for smooth movement. The obstacle-avoiding car is one of the notable results of mobile robotics. It is an autonomous robot that senses obstacles in its path while moving forward and changes its direction of motion by finding a clear path, observing in its left and right directions. Obstacle detection is the primary requirement of this autonomous robotic car. It obtains information about its surroundings through the ultrasonic sensor mounted on its top, and a servo motor is used for turning towards the left or right direction. To control the turning and driving algorithm of the obstacle avoidance car, an Arduino Uno controller, which acts as the brain of the car, has been implemented.
1.1 Types of Collision Avoidance Assist Systems

Automatic braking system: When an oncoming impediment is detected, AEB technology quickly engages the vehicle's braking system. While most AEB technologies will utilise the brakes until the vehicle comes to a complete stop, certain systems may just use a modest amount of braking power to give the driver more time to respond (Fig. 1).

Adaptive cruise control: ACC is a mechanism that assists automobiles in maintaining a safe following distance while driving at the speed limit. Instead of requiring the driver to adjust the speed of the vehicle, this technology does so automatically.

Electronic Stability Control: When it detects a rapid loss of control, such as when you turn too quickly, ESC instantly assists in stabilising your vehicle. It is turned on when you start your automobile and will beep if it detects you losing control.
Fig. 1 Types of collision avoidance assist systems
Park Assist: It is a self-parking assistance system that helps drivers park more accurately by utilising guidance system technology that outperforms ultrasonic and other camera-based alternatives. In numerous ways, the parking guidance system (PGS) is customer-focused: in addition to directing vehicles to available parking spaces, it enhances each stage of the driving process with automated procedures that build trust in the drivers. Using a radar sensor situated in front of the car, the Intelligent Forward Collision Warning system determines the distance to the vehicle ahead as well as their relative speed, so the device can then analyse the situation in front of the vehicle.

Collision Avoidance Technology's Evolution and Benefits: Collision avoidance technology protects drivers and cars by monitoring the road for potential hazards. Mobileye, a prominent producer of factory-installed and aftermarket collision avoidance systems, scans the road ahead with a single camera-based sensor. When the system detects an impending accident, dangerous following distance, lane departure, or speed limit violation, it delivers visual and audio alarms, providing the driver essential time to react and correct.
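The distance and relative-speed check performed by a forward collision warning system can be expressed as a simple time-to-collision test; the sketch below is illustrative only, with an assumed warning threshold rather than any manufacturer's actual calibration:

```python
def time_to_collision(gap_m, own_speed_mps, lead_speed_mps):
    """Seconds until impact if neither vehicle changes speed; None if the
    gap is not closing."""
    closing_speed = own_speed_mps - lead_speed_mps
    return gap_m / closing_speed if closing_speed > 0 else None

def forward_collision_warning(gap_m, own_speed_mps, lead_speed_mps, threshold_s=2.5):
    # alert the driver when the time to collision drops below the threshold
    ttc = time_to_collision(gap_m, own_speed_mps, lead_speed_mps)
    return ttc is not None and ttc < threshold_s

print(forward_collision_warning(20.0, 25.0, 15.0))  # 2.0 s to collision -> True (warn)
```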
2 Literature Review For ensuring our research authenticity we had gone through various research articles published by researchers in reputed journals and conferences among them Philip et al. [6] proposed in the paper that in India, the use of vehicles for transportation has dramatically increased during the past several decades. Increased use of vehicles for transportation leads to traffic jams and more accidents on the road. Accidents are usually caused by two things: not obeying traffic laws and irregularities on the road. When drivers make mistakes, vehicles collide, and accidents happen because of weather-related issues. Despite several recent advancements in technology inventions for automobile safety, accidents continue to rise daily. Therefore, a proper strategy for detecting and avoiding vehicle collisions is crucial. Therefore, this literature review has an influence on the research of the various methods for and requirements for vehicle collision detection and collision avoidance systems. In another study of Lee et al. [5] presented that The path planning and tracking system of an autonomous vehicle is designed to avoid collisions on slick surfaces. The fifth-order spline is used in route planning to produce a path that may provide the greatest amount of lateral acceleration while taking tire-road friction into account. Model predictive control (MPC) is used to track the produced route, and the combined brushed tyre model and extended bicycle model reflect the nonlinearity caused by the tyre and low friction surface. To prevent the vehicle from becoming unstable on slick roads and optimise lateral movement at the same time, a novel form of yaw rate limitation that takes side-slip angles into account is used inside the controller. Hamedi et al. [3] presented that One of the services on the Internet of Ships that is envisioned is cooperative collision avoidance amongst inland waterway ships. Such a service seeks to enhance ship trajectories while promoting safe navigation. However, in order to
deploy it, precise and rapid ship placement prediction with in-the-moment responses is required to avoid collisions. Advanced Machine Learning (ML) techniques are typically used in this situation to forecast the locations of the ships. The processing of the data must often take place in a centralised setting, such as a cloud data centre run by a third party. Due to the fact that linked ships can access sensitive information and that the location data of ships is not accessible to this third party, these schemes are not appropriate for the collision avoidance service. Zhang et al. [17] discussed an active collision avoidance system’s segmented trajectory planning technique. The collision avoidance trajectory is broken up into three sections: lane change, overtaking, and returning to the original lane. This is done by taking into account the longitudinal and lateral movement of the obstacle vehicle, as well as the ego vehicle and obstacle outer contour restrictions. According to the relative speed, distance, and outside contour of the two vehicles, a longitudinal and lateral safety distance model is used to determine the lane-beginning change’s and ending points. The ideal trajectory can be chosen based on the system objective function, the lane-change trajectory cluster, the vehicle states, the dynamic constraints, and the vehicle body kinematics constraints. This allows the vehicle to continuously monitor the distance to the obstacle vehicle and ensure that it can avoid colliding with it safely and smoothly. The efficacy and viability of the suggested trajectory planning technique for the active collision avoidance are shown by simulation and experiment findings. Kim et al. [4] discussed that “Great efforts have been made in recent years to build unmanned navigation systems for Autonomous Surface Vehicles, but significant advancements are still needed for ASVs working around obstacles. In this case, this research proposes an Obstacle Collision Avoidance Guidance system that uses obstacle detection with a two-dimensional (2D) LiDAR sensor. The OCAS was implemented in a physical model of a Catamaran-type ASV with a 2D LiDAR sensor after the OCAG algorithm was validated in a numerical manoeuvring simulation. When the ASV sails along a specified global course, the LiDAR sensor identifies obstacles, and collision-avoidance actions are performed based on several criteria such as the vehicle’s velocity and orientation, as well as the detected distance and direction of the objects”. Tornese et al. [12] proposed “Real-time simulators are valuable tools for the development and validation of complex systems such as naval ships, aircraft, and land vehicles, which require increasingly stringent testing and integration over time as these systems get more sophisticated. Hardware-in-the-loop simulation (HILS) is a well-established approach for quickly and cost-effectively validating both the hardware and software components of a defined architecture. also discussed the design and execution of a hardware-in-the-loop simulator created to evaluate a marine autonomous collision avoidance system (CAS), as well as some significant test scenarios used to examine the model’s behaviour”. The HILS system is made up of a programmable logic controller (PLC) that is coupled to two personal computers, one of which hosts the Linux system that runs the ROS Path Planner and the other which simulates the present situation and the ship model.
3 Motivation

Collision avoidance technologies have the potential to cut fatal and injury crashes by 30% and 40%, respectively. According to research based on actual collision data, combining this technology with automated braking systems might prevent a very high proportion of rear-end collisions. According to the Insurance Institute for Highway Safety, vehicles equipped with collision avoidance systems are less likely to be involved in crashes that result in injuries. This safety technology has been shown to reduce fatalities, major injuries, and car accidents. Collision warning and avoidance systems are modern automotive safety features that can assist drivers in avoiding rear-end collisions. Their goal is to give the driver enough time to avoid an accident while not annoying the driver with warnings that are deemed too early or unnecessary.
4 Methodology

The proposed obstacle avoidance car gives the vehicle intelligence and self-guidance. Its operation involves three steps. The first step is obstacle detection, which is done with the help of an ultrasonic sensor mounted on a servo motor that gives the car an idea of its surroundings. The second step involves finding an alternate path so the car can continue its motion. The third step involves changing the direction of motion by scanning the left and right directions with the servo motor and choosing the path with no obstacles; this decision loop is sketched below. An infrared sensor is a type of electronic module that detects the physical appearance of its surroundings by emitting and/or detecting infrared light. IR sensors can also detect motion and determine the amount of heat radiated by an object. The IR transmitter sends out an IR signal, and when that signal encounters an obstacle in its course, the reflected IR signal is received by the receiver. The block diagram of the obstacle avoidance car using Arduino (Fig. 3) mainly consists of three main components:
Arduino Uno
Ultrasonic sensor
Motor driver (L293D)
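The sketch below illustrates the scanning-and-steering decision loop only. It is written in Python purely for illustration, since the actual build runs Embedded C firmware on the Arduino Uno with the AFMotor, NewPing and Servo libraries; the sensor and motor functions, the servo angles, and the safe-distance threshold are all hypothetical stand-ins, not values taken from the paper.

```python
# Illustrative Python sketch of the obstacle-avoidance decision loop only.
# All hardware calls below are hypothetical stubs for the real sensor/driver code.

SAFE_DISTANCE_CM = 25  # assumed threshold; the paper does not state a value

def read_distance_cm() -> float:
    """Stub for the forward ultrasonic reading (NewPing's ping_cm on real hardware)."""
    return 100.0

def look(angle_deg: int) -> float:
    """Stub: point the servo-mounted sensor at angle_deg and return the distance."""
    return 100.0

def drive_forward(): print("forward")
def stop(): print("stop")
def turn_left(): print("turn left")
def turn_right(): print("turn right")

def step():
    """One iteration of the loop described in the methodology."""
    if read_distance_cm() > SAFE_DISTANCE_CM:
        drive_forward()                 # path ahead is clear
    else:
        stop()                          # obstacle detected ahead
        left = look(angle_deg=170)      # scan to the left
        right = look(angle_deg=10)      # scan to the right
        (turn_left if left >= right else turn_right)()  # steer toward the clearer side

step()
```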
4.1 Arduino Uno

The Arduino Uno is a prototyping board based on the ATmega328P microcontroller. It is an open-source electronics prototyping platform that can work with a variety of sensors and actuators. It is used to manage all operations and assign tasks to each device (Figs. 2 and 3; Tables 1 and 2).
Fig. 2 Circuit diagram of obstacle avoidance car using Arduino
Fig. 3 Block diagram of obstacle avoidance car using Arduino
4.2 Ultrasonic Sensors It is a sensor for ultrasonic range finding. It is a non-contact distance measurement technology that can measure distances ranging from 2 cm to 4 m. It is mostly used to identify obstacles.
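For reference, an ultrasonic ranger of this kind converts the echo round-trip time into a distance using the speed of sound. The helper below shows that standard conversion; it is a generic illustration and not code from the testbed.

```python
def echo_time_to_distance_cm(echo_duration_us: float,
                             speed_of_sound_ms: float = 343.0) -> float:
    """Convert an ultrasonic echo round-trip time (microseconds) to distance in cm.

    distance = (round-trip time * speed of sound) / 2, because the pulse
    travels to the obstacle and back.
    """
    seconds = echo_duration_us / 1_000_000
    return (seconds * speed_of_sound_ms * 100) / 2


# Example: a 1160 us echo corresponds to roughly 20 cm.
print(round(echo_time_to_distance_cm(1160), 1))
```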
Table 1 Hardware requirements of testbed
OS name: Microsoft Windows 7 Ultimate
Version: 6.1.7600 Service Pack 1 Build 7600
System type: x64-based PC
Processor: Intel(R) Pentium(R) CPU B940 @ 2.00 GHz, 2000 MHz, 2 Core(s), 2 Logical Processor(s)
SMBIOS version: 2.7
Installed physical memory (RAM): 6.00 GB

Table 2 Software specification
Platform: Windows 7
Processor: Intel(R) Pentium(R) CPU B940 @ 2.00 GHz, 2.00 GHz
Random access memory (RAM) size: 6.00 GB (5.85 GB usable)
Development environment: Integrated Development Environment (IDE) with Arduino
Programming environment: Embedded C
Libraries used: AFMotor, NewPing, Servomotor
Sensors used: Ultrasonic sensors
Kit used: Arduino Kit
4.3 Motor Driver It can supply bidirectional current to four motors. In this actual application, motors were employed to control the left, right, forward, and reverse directions.
5 Result and Discussion

The outcome is that the obstacle avoidance car using Arduino detects obstructions in its way with the ultrasonic sensor while moving ahead and adjusts its direction of motion by scanning its left and right directions with the servo motor. Although this paper only presents a prototype of the collision avoidance system, the same idea can be applied in various industries. In heavy-duty industries, for example, most collision avoidance solutions are built on an effective tracking system that monitors the movements of employees and pieces of equipment to guarantee they do not collide. Tags, worn by employees and affixed to machinery and vehicles, can stream information into a control centre, allowing a human operator to view the relative location of everything. These types of collision avoidance systems also contain alarms and notifications that activate when two tags move too near to one another, preventing collisions from occurring (Fig. 4).
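At its core, the tag-based monitoring described above reduces to checking the pairwise distances between streamed tag positions and raising an alert below a threshold. The sketch below is a minimal illustration of that check; the tag identifiers, positions and the 5 m threshold are assumptions for the example, not values from any particular deployment.

```python
from itertools import combinations
from math import hypot

def proximity_alerts(tag_positions: dict[str, tuple[float, float]],
                     min_separation_m: float = 5.0) -> list[tuple[str, str, float]]:
    """Return (tag_a, tag_b, distance) for every pair of tags closer than the threshold."""
    alerts = []
    for (a, pa), (b, pb) in combinations(tag_positions.items(), 2):
        d = hypot(pa[0] - pb[0], pa[1] - pb[1])   # planar Euclidean distance
        if d < min_separation_m:
            alerts.append((a, b, d))
    return alerts

# Example: worker W1 is about 3.6 m from forklift F1, so one alert is raised.
print(proximity_alerts({"W1": (0.0, 0.0), "F1": (3.0, 2.0), "W2": (20.0, 5.0)}))
```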
Fig. 4 Obstacle avoidance car using Arduino
6 Conclusion and Future Work

The paper focuses on developing an obstacle avoidance system for vehicles in which an autonomous robot detects the obstacles in its path and navigates according to the actions set for it. This paper is based on a prototype model of the object detection and collision avoidance mechanism. With advancements in robotics, it can be extended further: it can be used in cars to avoid accidents and give passengers a smooth and safe drive; it can be used in household appliances such as automatic vacuum cleaners and automatic lawn mowers; and it can be used in mines and factories as a mining vehicle that relies on obstacle detection.
References 1. Chakraborty A, Jindal M, Khosravi MR, Singh P, Shankar A, Diwakar M (2021) A secure iot-based cloud platform selection using entropy distance approach and fuzzy set theory. Wirel Commun Mobile Comput 2021:1–11 2. Chauhan H, Kumar V, Pundir S, Pilli ES (2013) A comparative study of classification techniques for intrusion detection. In: 2013 international symposium on computational and business intelligence. IEEE, pp 40–43 3. Hammedi W, Brik B, Senouci SM (2022) Toward optimal MEC-based collision avoidance system for cooperative inland vessels: a federated deep learning approach. IEEE Trans Intell Transp Syst 1–13
4. Kim JS, Lee DH, Kim DW, Park H, Paik KJ, Kim S (2022) A numerical and experimental study on the obstacle collision avoidance system using a 2d lidar sensor for an autonomous surface vehicle. Ocean Eng 257:111508 5. Lee H, Choi S (2022) Development of collision avoidance system in slippery road conditions. IEEE Trans Intell Transp Syst 23(10):19544–19556 6. Philip C, Vanitha DV, Keerthi K (2022) Vehicle detection and collision avoidance system. In: 2022 8th international conference on advanced computing and communication systems (ICACCS), vol 1, pp 971–973 7. Philip C, Vanitha DV, Keerthi K (2022) Vehicle detection and collision avoidance system. In: 2022 8th international conference on advanced computing and communication systems (ICACCS), vol 1. IEEE, pp 971–973 8. Sharma S, Ghanshala KK, Mohan S (2019) Blockchain-based internet of vehicles (IoV): an efficient secure ad hoc vehicular networking architecture. In: 2019 IEEE 2nd 5G world forum (5GWF), pp 452–457. https://doi.org/10.1109/5GWF.2019.8911664 9. Thapliyal S, Nautiyal P (2021) A mechanism of sentimental analysis on YouTube comments. Elementary Educ Online 20(2):2391–2397. https://doi.org/10.17051/ilkonline.2021.02.254, https://ilkogretim-online.org/?mno=111555 10. Thapliyal S, Wazid M, Singh DP (2023) Blockchain-driven smart healthcare system: challenges, technologies and future research. In: Choudrie J, Mahalle P, Perumal T, Joshi A (eds) ICT with intelligent applications. Springer Nature, Singapore, pp 97–110 11. Thapliyal S, Wazid M, Singh DP, Das AK, Alhomoud A, Alharbi AR, Kumar H (2012) ACMSH: an efficient access control and key establishment mechanism for sustainable smart healthcare. Sustainability 14(8). https://doi.org/10.3390/su14084661 12. Tornese R, Polimeno E, Pascarelli C, Buccoliero S, Carlino L, Sansebastiano E, Sebastiani L (2022) Hardware-in-the-loop testing of a maritime autonomous collision avoidance system. In: 2022 30th mediterranean conference on control and automation (MED), pp 514–519 13. Wazid M, Das AK, Odelu V, Kumar N, Conti M, Jo M (2018) Design of secure user authenticated key management protocol for generic iot networks. IEEE Internet Things J 5(1):269–282 14. Wazid M, Das AK, Shetty S, Rodrigues JJ, Guizani M (2022) AISCM-FH: Ai-enabled secure communication mechanism in fog computing-based healthcare. IEEE Trans Inform Forensics Sec 18:319–334 15. Wazid M, Singh J, Das AK, Shetty S, Khan MK, Rodrigues JJ (2022) ASCP-IOMT: Aienabled lightweight secure communication protocol for internet of medical things. IEEE Access 10:57990–58004 16. Wazid M, Thapliyal S, Singh DP, Das AK, Shetty S (2022) Design and testbed experiments of user authentication and key establishment mechanism for smart healthcare cyber physical systems. IEEE Trans Netw Sci Eng. https://doi.org/10.1109/TNSE.2022.3163201 17. Zhang H, Liu C, Zhao W (2022) Segmented trajectory planning strategy for active collision avoidance system. Green Energ Intell Transp 1(1):100002
Airline Ticket Price Forecasting Using Time Series Model A. Selvi, B. Sinegalatha, S. Trinaya, and K. K. Varshaa
Abstract Sales forecasting is the process of predicting future values based on time. In this project, we collected a dataset from Kaggle on airline tickets sold each day. Initially, we performed descriptive and exploratory analysis on the dataset. As it is time series data, we checked for stationarity for better model prediction. At first the dataset was not stationary, so we applied different transformation techniques such as logarithms, differencing and shifting. We then checked for stationarity using the ADF (augmented Dickey–Fuller) test and the KPSS test and found that the dataset became stationary. We worked with two models, ARIMA and SARIMAX, and found that SARIMAX forecasts the data better than ARIMA, so we finally used SARIMAX to make the predictions. For the front end, we created a login form for user authorization and a page for the user to upload a dataset, choose a period (i.e. daily, monthly or yearly) and set the forecast duration. Based on the user requirements, the final forecasting graph predicted by the model is visualized as the output. The front end and back end are connected using the Python Flask web framework. Power BI is used to visualize the data using various graphs and charts.

Keywords SARIMAX · Time series · Airline ticket price · Kaggle · Stationarity · Python Flask · Power BI · HTML · CSS
A. Selvi (B) · B. Sinegalatha · S. Trinaya · K. K. Varshaa
Department of Computer Science and Engineering, M. Kumarasamy College of Engineering, Thalavapalayam, Karur, Tamilnadu 639113, India
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_20

1 Introduction

The airline industry is among the most sophisticated in terms of pricing strategies. The cost of a ticket may fluctuate multiple times a day because of imbalances between the number of seats available and the demand from passengers. Airlines utilize sophisticated tools to manage pricing; however, passengers are becoming more savvy and using online resources to compare prices. The main goal for airlines is to increase revenue and maximize profit, which is done through dynamic pricing based on customer behavior,
Fig. 1 Value proposition
competitor prices and internal/external factors. However, dynamic pricing is challenging due to the many factors that influence it. On the side of customers, most studies focus on forecasting best ticket purchase time, while fewer reviews have attempted to forecast original ticket fare. The paper presents a short literature survey of previous reviews related to dynamic pricing in airways (Fig. 1).
2 Related Work

Etzioni et al. [1] used 12,000 rows of ticket fare data collected over 41 days for one airline. In that paper, the model is built using a combination of machine learning algorithms such as reinforcement learning, rule learning (Ripper) and Q-learning. The algorithm produces a saving of $198,074, which represents an average saving of 23.8% for 341 passengers, and the overall performance gives an average of 61.8% of the possible savings. The limitations of this work are the limited dataset and the lower performance, since the machine learning algorithms used do not produce appropriate results for time series data. Groves and Gini [2] present data collected as daily price quotes from a major travel search website between February 22, 2011, and June 10, 2011. The data was gathered for a three-month period, starting 60 days before the departure date. The dataset was divided into a training set (48 observations), a calibration set (20 observations), and a test set (41 observations). The study used ridge regression, decision trees, and PLS regression to analyze the data. The results showed an average cost saving of 7.25%, which represents 69% of the optimal savings. Domínguez-Menchero et al. [3] help customers decide the best time to purchase an airline ticket to save money. The study analysed air fares for routes from Madrid to four destinations, London, New York, Frankfurt and Paris, over the course of two months. The research found that the consumer has an 18-day window before departure to buy a ticket without incurring a significant financial penalty.
The paper proposes non-parametric isotonic regression techniques to forecast the results; its main disadvantage is that no evaluation of the results is described. Chen et al. [4] use more than three months of data, covering 110 days for five international routes. The collected dataset contains features such as the prices of current itineraries before the target day, the prices of itineraries on the same day of the week, and past itinerary prices. The paper proposes ensemble techniques, namely Learn++ and NSE in modified form, to build a model for predicting the price. The performance of this paper is a mean absolute percentage error (MAPE) of 10.7, compared with 15.41 for PA and 12.58 for KNN. The disadvantages of this paper are that it is not able to predict the fare for an individual flight and that it does not consider multi-stop flights. In Tziridis et al. [5] the authors tackled the challenge of predicting airfare prices by applying various features to machine learning models. They compared the performance of each model and evaluated the impact of the feature set on the accuracy of the prediction. The study used a novel dataset of 1814 airfare prices for flights from Thessaloniki to Stuttgart and found that the machine learning models achieved an accuracy of 88% for a specific set of flight features. The best performing models were the bagging regression tree and the random forest regression tree, with 85.91% and 87.42% accuracy. Vu et al. [6] describe the challenge faced by air passengers trying to find the best time to purchase airfares while airlines aim to maximize revenue by adjusting prices. The authors propose a new model that can help passengers predict price trends using publicly available data, even without access to key information from airlines. The suggested model, called the Stacked Prediction Model, uses a combination of random forest and multilayer perceptron, and demonstrated improved performance over the multilayer perceptron and random forest alone, with an R2 measurement 4.4% and 7.7% better, respectively. The study used 51,000 flight records of seven routes from three domestic airlines. Janssen et al. [7] propose four statistical regression models for airline ticket pricing and evaluate their accuracy. The dataset used includes 126,412 records for 2271 flights, which were collected 60 days prior to departure for a single route. The study proposes the use of the linear quantile mixed regression model to forecast ticket prices, which is effective for shorter time frames but less efficient for longer ones. This approach could be useful for future air travelers in making purchase decisions. Mumbower et al. [8] estimate flight-level prices using data from seat map displays and online prices. The study adopts an instrumental variable approach to correct for errors in prices and finds a set of valid instruments. The model used in the study is a linear regression algorithm, resulting in a price elasticity of 1.97. The limitations of the research include the lack of a performance evaluation and the missing price and demand data for over 25% of the observations. Pouyanfar et al. [9] focus on the prediction of airfare prices. The authors selected a set of features that are believed to affect air ticket prices and applied these features. The study also investigates the impact of the feature set on the prediction accuracy.
The results showed that the machine learning models were able to achieve an accuracy of nearly 88% for a particular set of flight features. Janssen et al. [10] discuss the various factors that affect the fare of an airline ticket, such as flight distance, time of purchase, fuel price, etc. The study suggests a new approach using two public data sources: the Airline Origin and Destination Survey (DB1B) and the Air Carrier Statistics database (T-100). The framework employs machine learning algorithms to forecast the average quarterly ticket price for origin-destination pairs, referred to as market segments. The analysis demonstrated a high level of prediction accuracy, with an adjusted R squared score of 0.869 on the test data.
3 Background of the Work

In this project, we have analyzed airline ticket sales data collected from Kaggle. The dataset consists of two columns: date of sale and number of tickets sold. We performed various data analysis techniques and transformed the data to make it stationary, as it is time series data. After testing for stationarity using the ADF and KPSS tests, we compared two models, ARIMA and SARIMAX, and found that SARIMAX performed better for the given data. We used SARIMAX to make predictions and connected the front end, which is a login form and a page for uploading data and choosing the forecasting period and duration, to the back end using Python Flask. Finally, the predicted results were visualized using Power BI with different graphs and charts.
4 Proposed Work

In this paper, two different models were used to predict air ticket prices, the ARIMA model and the SARIMAX model. The ARIMA model is used to predict the variation of air ticket prices by treating the time series data as a random sequence and considering the influence of external factors. The SARIMAX model, on the other hand, is used to capture the seasonality in the time series data and produce better results. The experimental results showed that the SARIMAX algorithm produced better results than the ARIMA algorithm.
4.1 Important Packages The libraries Pandas, Numpy, Matplotlib, and Statsmodels are used in this project. Pandas is used to import and perform operations on the dataset using dataframes.
Numpy is utilized for operations on multi-dimensional arrays and applied for transformation techniques. Matplotlib is used for visualizing graphs. Finally, Statsmodels is used for developing time series models.
4.2 Dataset Collection

The data collected was the number of airline tickets sold on each day and was obtained from Kaggle. The dataset consists of two columns: date of sale and the number of tickets sold. The data was loaded into the system using the pandas read_csv method, and it was found that there were no missing values in the dataset after using the isna method. The dataset has 144 records with two columns. The date-of-sale column, which was initially of the "object" data type, was converted to the datetime data type and then set as the index of the data. The augmented Dickey–Fuller (ADF) test and the KPSS test were used to determine whether the time series was stationary, but both tests showed that the time series was not stationary. These steps are sketched below.
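A minimal sketch of the loading and stationarity checks just described, using pandas and statsmodels; the file name and the column names (`Month`, `Passengers`) are assumptions, since the exact Kaggle file layout is not stated in the paper:

```python
import pandas as pd
from statsmodels.tsa.stattools import adfuller, kpss

# Load the Kaggle airline dataset; file and column names are assumed here.
df = pd.read_csv("airline_tickets.csv")
print(df.isna().sum())                     # confirm there are no missing values

df["Month"] = pd.to_datetime(df["Month"])  # "object" -> datetime
df = df.set_index("Month")
series = df["Passengers"]

# Augmented Dickey-Fuller test: null hypothesis = series is non-stationary.
adf_stat, adf_p, *_ = adfuller(series)
# KPSS test: null hypothesis = series is stationary.
kpss_stat, kpss_p, *_ = kpss(series, regression="c", nlags="auto")

print(f"ADF p-value:  {adf_p:.4f}")        # large p => cannot reject non-stationarity
print(f"KPSS p-value: {kpss_p:.4f}")       # small p => reject stationarity
```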
4.3 Building Model

4.3.1 ARIMA Model
The ARIMA model, which stands for "autoregressive integrated moving average", is a statistical and econometric tool used for analyzing and modeling events that occur over a period of time. The purpose of ARIMA is to understand past trends or predict future trends in a series of data. In this case, the data was first transformed by taking the square root, which was then confirmed to be stationary through the ADF and KPSS tests. Finally, the ARIMA model was fitted to the stationary data (Fig. 2). The ARIMA model's predictions did not match the original data accurately, so we altered the method of transforming the series to make it stationary and then used ARIMA again. We performed second-order differencing on the data and found it to be stationary via the ADF and KPSS tests. ARIMA was then fitted to this data, resulting in improved predictions compared to the original data. We applied the same transformation technique to SARIMAX for further analysis (Fig. 3).
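Continuing from the loading sketch above, the two transformations tried here (square root, then second-order differencing) and the ARIMA fit on the differenced series could look as follows; the (p, d, q) order is an assumption, since the paper does not report the exact configuration:

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.stattools import adfuller

# `series` is the ticket-sales column from the previous sketch.
sqrt_series = np.sqrt(series)                 # first attempt: square-root transform
diff2_series = series.diff().diff().dropna()  # second attempt: second-order differencing

print("ADF p-value after 2nd-order differencing:", adfuller(diff2_series)[1])

# Fit ARIMA on the differenced (stationary) series; the order is illustrative only.
arima_fit = ARIMA(diff2_series, order=(2, 0, 2)).fit()
arima_pred = arima_fit.fittedvalues           # in-sample predictions for the comparison plots
```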
4.3.2 SARIMAX Model
ARIMA and SARIMA are time series forecasting algorithms that use past values to predict future values. ARIMA considers autoregressive and moving average components, while SARIMA also includes seasonal patterns. The dataset obtained through differencing was also tested with the SARIMA model.
Fig. 2 Performance analysis of ARIMA model (SQRT)
Fig. 3 Performance analysis of ARIMA model (second order difference)
The SARIMA model with differencing yields improved predictions compared to the original data and outperforms the ARIMA model. As a result, SARIMAX was used for the final forecasting in the SALES_FORECASTING web app. We drew line plots of the original data and the predicted data for both the SARIMAX and ARIMA models, and from those graphs we analysed the performance of both models. In addition, we computed the MSE (mean squared error) to find the error rate of each model (Figs. 4 and 5).
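A sketch of the SARIMAX fit and the MSE comparison described above, continuing from the previous snippet; the orders and the seasonal period of 12 are assumptions, since the paper does not list the configuration used:

```python
from sklearn.metrics import mean_squared_error
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Fit SARIMAX on the same second-order-differenced series; orders are illustrative.
sarimax_fit = SARIMAX(diff2_series, order=(1, 0, 1),
                      seasonal_order=(1, 0, 1, 12)).fit(disp=False)
sarimax_pred = sarimax_fit.fittedvalues

# Compare the two models on in-sample predictions via mean squared error.
print("ARIMA   MSE:", mean_squared_error(diff2_series, arima_pred))
print("SARIMAX MSE:", mean_squared_error(diff2_series, sarimax_pred))
```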
Fig. 4 Performance analysis of SARIMAX model (second order difference)
Fig. 5 Visualization using Power BI
5 Flask Python Web Framework

Flask is a Python-based web application framework created by Armin Ronacher and the Pocoo team. It uses the Werkzeug WSGI toolkit and the Jinja2 template engine, both of which are Pocoo projects. The SARIMAX model was integrated with the front-end web pages using Flask, allowing users to upload a dataset and receive forecasting results based on their specifications.
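A minimal sketch of how such a Flask endpoint could tie the upload form to a SARIMAX forecast; the route name, form field names, model orders and response format are illustrative assumptions, not the authors' exact code:

```python
import pandas as pd
from flask import Flask, request, jsonify
from statsmodels.tsa.statespace.sarimax import SARIMAX

app = Flask(__name__)

@app.route("/forecast", methods=["POST"])
def forecast():
    # Uploaded CSV (date column first) and the requested horizon come from the web form.
    df = pd.read_csv(request.files["dataset"], parse_dates=[0], index_col=0)
    steps = int(request.form.get("steps", 12))

    # Orders are illustrative; the paper does not report the exact configuration.
    model = SARIMAX(df.iloc[:, 0], order=(1, 1, 1),
                    seasonal_order=(1, 1, 1, 12)).fit(disp=False)
    prediction = model.forecast(steps=steps)
    return jsonify({str(ts): round(float(v), 2) for ts, v in prediction.items()})

if __name__ == "__main__":
    app.run(debug=True)
```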
6 Experimental Results

The error rate (ERR) is computed from the confusion-matrix counts as

ERR = (FP + FN) / (TP + TN + FN + FP)
Fig. 6 Error rate
Table 1 Error rate analysis with different algorithms
Algorithm                             Error rate
ARIMA with SQRT                       0.75
ARIMA with second order difference    0.5
SARIMAX                               0.4
The proposed SARIMAX algorithm exhibits a lower error rate in predicting airline ticket prices compared to the existing algorithm, as indicated by the above graph (Fig. 6; Table 1).
6.1 Accuracy

Accuracy (ACC) is calculated as the ratio of the number of accurate predictions to the total number of test data. It is equivalent to 1 minus the error rate (ERR). A perfect accuracy score is 1.0, and the lowest possible score is 0.0 (Table 2).

ACC = (TP + TN) / (TP + TN + FN + FP) × 100
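A small worked example of the two formulas, using assumed confusion counts purely for illustration, which also shows that the accuracy (as a fraction) equals 1 − ERR:

```python
def error_rate(tp: int, tn: int, fp: int, fn: int) -> float:
    """ERR = (FP + FN) / (TP + TN + FN + FP)."""
    return (fp + fn) / (tp + tn + fn + fp)

def accuracy_pct(tp: int, tn: int, fp: int, fn: int) -> float:
    """ACC = (TP + TN) / (TP + TN + FN + FP) * 100."""
    return (tp + tn) / (tp + tn + fn + fp) * 100

# Assumed counts for illustration only.
tp, tn, fp, fn = 30, 50, 12, 8
print(error_rate(tp, tn, fp, fn))    # 0.2
print(accuracy_pct(tp, tn, fp, fn))  # 80.0, i.e. (1 - ERR) * 100
```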
The proposed SARIMAX algorithm achieves a higher accuracy rate compared to the existing algorithm, as shown by the above graph (Fig. 7).
Table 2 Accuracy score analysis with different algorithms
Algorithm                             Accuracy (%)
ARIMA with SQRT                       50
ARIMA with second order difference    65
SARIMAX                               80
Fig. 7 Accuracy chart comparing ARIMA with SQRT, ARIMA with second order difference, and SARIMAX
7 Conclusion To summarize, this paper presents a comprehensive review of ticket prediction and demand forecasting models in the airline industry. It discusses the concept of dynamic pricing, which involves adjusting ticket prices based on various factors. The paper also proposes a ticket price forecasting model using the SARIMAX algorithm, which has been experimentally shown to have high prediction accuracy. The developed system has been implemented to aid in air ticket purchasing decisions. Further research is needed to improve the forecast accuracy through the acquisition of more data and optimization of algorithms.
References
1. Dingli A, Mercieca L, Spina R, Galea M (2015) Event detection using social sensors. In: 2nd international conference on information and communication technologies for disaster management (ICT-DM). Rennes, France, pp 35–41. https://doi.org/10.1109/ICT-DM.2015.7402054
2. Porshnev A, Redkin I, Shevchenko A (2013) Machine learning in prediction of stock market indicators based on historical data and data from twitter sentiment analysis. In: IEEE 13th international conference on data mining workshops. Dallas, TX, USA, pp 440–444. https://doi.org/10.1109/ICDMW.2013.111
3. Krizhevsky A, Sutskever I, Hinton G (2017) Imagenet classification with deep convolutional neural networks. Commun ACM 60(6):84–90
4. Lantseva A, Mukhina K, Nikishova A, Ivanov S, Knyazkov K (2015) Data-driven modeling of airlines pricing. Procedia Comput Sci 66:267–276. ISSN 1877-0509
5. Nurwidyantoro A (2013) Event detection in social media: a survey, pp 1–5. https://doi.org/10.1109/ICTSS.2013.6588106
6. An B, Chen H, Park N, Subrahmanian (2017) Data-driven frequency-based airline profit maximization. ACM Trans Intell Syst Technol 8(4 July) Article No: 61:1–28
7. An B, Chen H, Park N, Subrahmanian VS (2017) MAP: Frequency-based maximization of airline profits based on an ensemble forecasting approach. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining (KDD '16). ACM, New York, NY, USA, pp 421–430
8. Zhang C, Zhou G, Yuan Q, Zhuang H, Yu Z, Kaplan L, Wang S, Han J (2016) GeoBurst: real-time local event detection in geo-tagged tweet streams. In: Proceedings of the 39th international ACM SIGIR conference on research and development in information retrieval (SIGIR'16). ACM, New York, NY, USA, pp 513–522
9. Chawla B (2017) Airfare analysis and prediction using data mining and machine learning. Int J Eng Sci Invent (IJESI) 6(11):10–17
10. Wen C-H, Chen P-H (2017) Passenger booking timing for low-cost airlines: a continuous logit approach. J Air Transp Manage 64:91–99
11. Pandiaraja P, Muthumanickam K, Palani Kumar R (2023) A graph-based model for discovering host-based hook attacks: smart technologies in data science and communication. In: Lecture notes in networks and systems, vol 558. Springer, Singapore, pp 1–13
12. Escobari D, Rupp N, Meskey J (2018) An analysis of dynamic price discrimination in airlines. Southern Econ J. https://doi.org/10.1002/soej.12309; Domínguez-Menchero J, Santo R, Javier T-M, Emilio (2014) Optimal purchase timing in the airline market. J Air Transp Manage 40:137–143
13. Pandey SK, Vanithamani S, Shahare P, Ahmad SS, Thilagamani S, Hassan MM, Amoatey ET (2022) Machine learning-based data analytics for IoT-enabled industry automation. Wireless Commun Mobile Comput, Article ID 8794749, 12 pp
14. Constantinides E, Dierckx RHJ (2014) Airline price discrimination: a practice of yield management or customer profiling. In: 43rd EMAC conference anonymous paradigm shifts and interactions. Valencia, Spain
15. Karthik K, Nachammai M, Nivetha Gandhi G, Priyadharshini V, Shobika R (2023) Study of land cover classification from hyperspectral images using deep learning algorithm. In: Smys S, Lafata P, Palanisamy R, Kamel KA (eds) Computer networks and inventive communication technologies. Lecture notes on data engineering and communications technologies, vol 141. Springer, Singapore
16. Yuan H, Xu W, Yang C (2014) A user behavior-based ticket sales prediction using data mining tools: an empirical study in an OTA company. In: 2014 11th international conference on service systems and service management (ICSSSM). Beijing, pp 1
17. Pradeep D, Bhuvaneswari A, Nandhini M, Roshini Begum A, Swetha N (2023) Survey on attendance system using face recognition, pervasive computing and social networking. In: Lecture notes in networks and systems, vol 475. Springer, Singapore
18. Dedhia M, Jadhav A, Jagdale R, Palkar B (2018) Optimizing airline ticket purchase timing. Int J Recent Innov Trends Comput Commun (IJRITCC) 6(4):296–298
19. Chamundeeswari G, Srinivasan S, Prasanna Bharathi S, Priya P, Rajendra Kannammal G, Rajendran S (2022) Optimal deep convolutional neural network based crop classification model on multispectral remote sensing images. Microproc Microsyst 94:104626
20. Akilandeswari V, Kumar A, Thilagamani S, Subedha V, Kalpana V, Kaur K, Asenso E (2022) Minimum latency-secure key transmission for cloud-based internet of vehicles using reinforcement learning. Comput Intell Neurosci
21. Pan B, Yuan D, Sun W, Liang C, Li D (2018) A novel LSTM-based daily airline demand forecasting method using vertical and horizontal time series. In: Lecture notes in computer science, vol 11154. Springer, Cham, pp 168–173
22. Padmini Devi B, Aruna SK, Sindhanaiselvan K (2021) Performance analysis of deterministic finite automata and turing machine using JFLAP tool. J Circ Syst Comput 30(6):2150105–2150116
23. Yang H-T, Liu X (2018) Predictive simulation of airline passenger volume based on three models: data science. In: ICPCSEE, communications in computer and information science, vol 902. Springer, Singapore, pp 350–358
24. Murugesan M, Thilagamani S (2021) Bayesian feed forward neural network-based efficient anomaly detection from surveillance videos. Intell Automation Soft Comput 34(1):389–405
25. Becker H, Iter D, Naaman M, Gravano L (2013) Identifying content for planned events across social media sites, WSDM ‘12. In: Proceedings of the fifth ACM international conference on web search and data mining, pp 533–54 26. Domínguez-Menchero JS, Rivera J, Torres-Manzanera E (2014) Optimal purchase timing in the airline market. J Air Transp Manag 40:137–143 27. Elman JL (1990) Finding structure in time. Cognitive Sci 14(2):179–211. ISSN 0364-0213 28. Selvarathi C, Kumar KH, Pradeep (2023) Journal on delivery management platform. In: Choudrie J, Mahalle P, Perumal T, Joshi A (eds) IOT with smart systems. Smart innovation, systems and technologies, vol 312. Springer, Singapore 29. Tziridis K, Kalampokas T, Papakostas GA, Diamantaras KI (2017) Airfare prices prediction using machine learning techniques. In: 2017 25th European signal processing conference (EUSIPCO), Kos, Greece, pp 1036–1039. https://doi.org/10.23919/EUSIPCO.2017.8081365 30. Shankar A, Sumathi K, Pandiaraja P, Stephan T, Cheng X (2022) Wireless multimedia sensor network QoS bottleneck alert mechanism based on fuzzy logic. J Circ Syst Comput 31(11) 31. Li L, Chu KH (2017) Prediction of real estate price variation based on economic parameters. In: International conference on applied system innovation (ICASI). Sapporo, Japan, pp 87–90. https://doi.org/10.1109/ICASI.2017.7988353 32. Li J, Granados N, Netessine S (2014) Are consumers strategic? Structural estimation from the air-travel industry. Manag Sci 60(9):2114–2137 33. Saravanan S, Abirami T, Pandiaraja P (2018) Improve efficient keywords searching data retrieval process in cloud server. In: 2018 international conference on intelligent computing and communication for smart world (I2C2SW). Erode, India, pp 219–223 34. Dong X, Mavroeidis D, Calabrese F et al (2015) Multiscale event detection in social media. Data Min Knowl Disc 29:1374–1405 35. Alamelu V, Thilagamani S (2022) Lion based butterfly optimization with improved YOLO-v4 for heart disease prediction using IoMT. Inf Technol Control 51(4):692–703 36. Chen Y, Cao J, Feng S, Tan Y (2015) An ensemble learning based approach for building airfare forecast service. In: IEEE international conference on big data (big data). Santa Clara, CA, USA, pp 964–969. https://doi.org/10.1109/BigData.2015.7363846 37. Pandiaraja P, Boopesh KB, Deepthi T, Laksmi Priya M, Noodhana R (2022) An analysis of document summarization for educational data classification using NLP with machine learning techniques. In: Applied computational technologies. ICCET 2022. Smart innovation, systems and technologies, vol 303. Springer, Singapore, pp 127–143 38. Panagiotou N, Katakis IM, Gunopulos D (2016) Detecting events in online social networks. In: Definitions, trends and challenges, solving large scale learning tasks 39. Li Q, Nourbakhsh A, Shah S, Liu X (2017) Real-time novel event detection from social media. In: IEEE 33rd international conference on data engineering (ICDE), pp 1129–1139 40. Herrera Quispe J, Juarez R (2015) Prediction of tourist traffic to Peru by using sentiment analysis in Twitter social network. https://doi.org/10.1109/CLEI.2015.7360051 41. Santana E, Mastelini S, Jr S (2017) Deep regressor stacking for air ticket prices prediction. In: Anais do XIII Simpósio Brasileiro de Sistemas de Informação, Lavras, pp 25–31 42. Pandiaraja P, Aishwarya S, Indubala SV, Neethiga S, Sanjana K (2022) An analysis of Ecommerce identification using sentimental analysis: a survey. In: Applied computational technologies. ICCET 2022. 
Smart innovation, systems and technologies, vol 303. Springer, Singapore, pp 742–754 43. Janssen T, Dijkstra T, Abbas S, van Riel AC (2014) A linear quantile mixed regression model for prediction of airline ticket prices. Radboud University 44. Liu T, Cao J, Tan Y, Xiao Q-W (2017) ACER: an adaptive context-aware ensemble regression model for airfare price prediction. In: International conference on progress in informatics and computing (PIC), pp 312–317 45. Wohlfarth T, Clemencon S, Roueff F, Casellato X (2011) A data-mining approach to travel price forecasting. In: 10th international conference on machine learning and applications and workshops. Honolulu, HI, pp 84–89
46. Sakaki T, Okazaki M, Matsuo Y (2010) Earthquake shakes Twitter users: real-time event detection by social sensors. In: Proceedings of the 19th international conference on World wide web (WWW '10). ACM, NY, USA, pp 851–860
47. Szopinski T, Nowacki R (2015) The influence of purchase date and flight duration over the dispersion of airline ticket prices. Contemporary Econ 9:353–366. https://doi.org/10.5709/ce.1897-9254.174
48. Vu VH, Minh QT, Phung PH (2018) An airfare prediction model for developing markets. In: International conference on information networking (ICOIN), pp 765–770
49. Walther M, Kaisser M (2013) Geo-spatial event detection in the twitter stream. In: Advances in information retrieval: 35th European conference on IR research, ECIR 2013. Moscow, Russia, 24–27 Mar 2013. Proceedings, vol 35. Springer Berlin Heidelberg, pp 356–367
50. Manoj Krishna S, Sharitha G, Madhu Ganesh P, Ajith Kumar GV, Karthika G (2022) Flight ticket price prediction using regression model. IJRASET 2321–9653
AI Powered Authentication for Smart Home Security—A Survey P. Priya, B. Gopinath, M. Mohamed Ashif, and H. S. Yadeshwaran
Abstract Visitor management is an issue facing the modern world, and addressing it can help identify and prevent countless fraud, privacy, and other problems. One of the safest solutions, much more so than CCTV and walk-through-gate techniques, is a visitor management system that uses facial recognition. The major decision that must be made in the project is whether the system cost is reasonable given the scope of the project. Domestic and industrial use are two examples of how the scale of activities and the security needs vary by location. Currently, visitor management systems are largely utilized by businesses, institutions, and schools, but thanks to remarkable improvements, they may soon be employed at toll booths, airports, and train stations as well. The use of visitor management systems is almost universal among companies with sizable facilities and is expanding at a steady rate. By automatically recognizing outsiders, a facial recognition visitors' management system (FRVMS) is recommended to increase home security. Centralized systems allow for better process management and monitoring. Because this system does not require any additional devices, development costs are also kept low. Face recognition makes use of the computer's built-in web camera. The detected facial features are compared with the family of faces recorded in the monitoring system's database; if a member is found, the security check is cleared, and if an outsider is found, an alarm notification is shown to the user. If the visitor is wearing a face mask, the system prompts them to remove it so that their face can be seen. Convolutional neural networks, a deep learning methodology, may be used to develop the framework, test it in live scenarios, and research different facial recognition methods.

Keywords CCTV cameras · Visitor management · Convolutional neural network · Mask detection · Face detection
P. Priya (B) · B. Gopinath · M. Mohamed Ashif · H. S. Yadeshwaran Department of Computer Science and Engineering, M. Kumarasamy College of Engineering, Thalavapalayam, Karur, Tamilnadu 6391131, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_21
1 Introduction

Locks have existed for countless years; they have likely existed in some form for as long as there have been treasures that humans have wished to keep secure. Numerous different types of locks are seen every day [1–5]. Some locks require little more than a key or a string of digits to unlock; others are incredibly sophisticated locks that can only be opened with unique electronic key cards or fingerprints. In order to strengthen security, modern locks use a wide variety of mechanical and electronic technologies [2, 3]. In traditional technology, many doors have mechanical locks with a limited number of key options [4–6]. A standard security system requires one of the following to gain entry: a key, a security password, an RFID card, or an ID card. The shortcomings of these security measures include the possibility that they could be misplaced or seized by unauthorized persons [7, 8]. In order to increase security, a better mechanism must be developed. People may misplace their PINs, keys, cards, and so on, but if facial recognition is employed for the door operating system, there is hope for improved security [9, 10]. The characteristics of a person's face, which include the eyes, nose, and other distinctive features, may reveal a variety of emotions. Biometrics fall into two types: physiological traits (facial features, fingerprints, hand and finger shapes, palms, iris, ears, and voice) and behavioural traits (gait, signature and keystroke dynamics). Face detection, feature extraction, face recognition, and door operation are the system's four steps [11–13]. Currently, security issues arise in every aspect of daily life, and they have to be resolved using streamlined technology. The most important characteristic of any home security system is identifying the people who enter or leave the house. Instead of monitoring that through passwords or PINs, unique faces can be used, as the face is a biometric characteristic.
2 Related Work

Tsai et al. [14] implemented a framework for the smart home that includes the vision sensor network (VSN) and server functions. To execute vision-based analysis, a whole system-level algorithm was created, and the performance of the suggested system was assessed using a variety of test sequences. The suggested system can be broken down into image processing tasks including object tracking, foreground segmentation, and object labeling. One of the key steps in intelligent vision processing is foreground segmentation: it extracts the binary foreground mask before object labeling is performed using the mask data, which may be utilized in a variety of applications. Chambino et al. [15] implement a face analysis system that may use photos from both the visible and infrared spectral bands to identify changes in facial expression and pose and to detect presentation attacks. The study employs images from the visible spectrum as well as the near infrared (NIR), short wave infrared (SWIR), and long wave infrared (LWIR) bands.
The older fusion and subspace techniques as well as the more contemporary deep neural networks are categorized according to their fundamental operating principles. Waseem et al. [16] proposed a two-tier recognition system based on a hierarchical classification network. ResNet101, a deep learning architecture, is used in the first tier for recognition, and FaceNet in the second tier for authorization. The use of FaceNet alone resulted in a 72.43% accuracy rate, while incorporating the proposed hierarchical classification framework using FaceNet embeddings led to an accuracy improvement of 14.93%. They also contrasted the suggested method with previously published ones that use fake photographs from the Internet. The findings are consistent with previous research employing hierarchical network topology to enhance recognition performance. Li et al. [17] propose a real-time face identification system for 2D images that enhances 2D face recognition technology, lowers hardware requirements and costs, and increases the adoption of face recognition systems. A network structure based on depthwise separable convolutions and high feature multiplexing in multitask networks can reduce the need for multitask model computations and lower the total cost of network computations. Experimental results indicate that the UFaceNet model outperforms other models in computational complexity, parameter count, efficiency, and ease of use. Lee et al. [18] examined several deep learning methods; in particular, R-CNN, Fast R-CNN, and Faster R-CNN are three variants that extend CNN for object detection. R-CNN has a good object identification rate but is difficult to run in real time, so it is used to recognize objects across a large number of classes. Although the object identification rate of YOLO is lower than that of R-CNN, real-time detection is possible, so YOLO can be used for fast detection over a small number of classes. The research introduced object detection algorithms in image processing and, in particular, compared and analyzed YOLO, a real-time object detection algorithm. Jiang et al. [19], to address the issue of detecting properly worn masks, recommended the Properly Wearing Masked Face Detection Dataset (PWMFD), which contains 9205 images of masked individuals categorized into three groups. They also proposed the Squeeze-and-Excitation (SE)-YOLOv3 mask detector for its balanced efficacy and efficiency, and added GIoU loss to enhance the stability of bounding box regression, since it more accurately reflects the geometric difference between predicted and ground-truth boxes. The current practice of posting security guards to enforce mask-wearing in public places is ineffective and puts them at risk of exposure to the virus. Renuka et al. [20] implemented a new technology that claims to bridge the gap between the real and digital worlds. To protect M2M machines from various potential attacks, an essential service called machine authentication must be included in M2M networks. The suggested method enables any pair of entities in an M2M network to establish a secure session key and authenticate each other for the purpose of transferring private data. Using BAN logic, the authors demonstrated the proposed scheme's security and its ability to fend off various hostile attempts.
Shah et al. [21] review the latest developments in user authentication for various usage situations, such as personal gadgets, Internet services, and smart environments. However, the quality of the behavioral biometrics collected during registration may differ from that of the authentication data, which could lead to poor performance when enrollment and authentication occur under different circumstances; this issue is commonly referred to as domain adaptation, and deep learning is one potential strategy for addressing it. Alsawwaf et al. [22], utilizing industry standards, expanded their trials and obtained benchmark findings against other state-of-the-art methodologies, among other things by increasing the number of datasets utilized and adding more benchmarking methods. The suggested method has numerous benefits, including maintaining privacy by employing a unique identifier for each individual instead of keeping images, and computations may be completed more quickly because numbers rather than images are compared.
3 Face Recognition Technologies

In a literal sense, the phrase "multi-view face recognition" is appropriate only in situations where a subject (or scene) is simultaneously captured by several cameras and an algorithm jointly utilizes the captured images or videos [12, 13]. For the sake of clarity, we will refer to the video sequences recorded by synchronized cameras as multi-view videos and to the single-view videos, which are recorded while the subject changes pose, as monocular video sequences. Multi-view surveillance videos are becoming more and more prevalent due to the widespread use of camera networks [23, 24]. The vast majority of multi-view video face recognition algorithms now in use, however, only utilize single-view recordings. For both of the face photographs in this procedure, it is also necessary to estimate the poses and lighting conditions. The holistic matching algorithm, which bases its matching measure on the ranking of look-up results, was also developed using the "generic reference set" concept [25, 26]. There are also works that implicitly account for pose fluctuations without explicitly calculating the pose.
3.1 Appearance-Based Methods Appearance-based techniques for re-identifying people fall into one of two categories: single-shot techniques or multiple-shot techniques, depending on how many photos are utilized to build the human representation. While multiple-shot approaches employ a collection of photos to calculate the feature vectors to represent the person, single-shot methods just require a single image to do so. Multiple-shot approaches are more computationally intensive than single-shot methods even if they offer more information [27, 28]. Simple matching is then performed to assess how similar the
photos are after feature extraction [29, 30]. Deep learning is increasingly being used in human re-identification tasks in addition to these two categories since it combines feature extraction and classification tasks into a single integrated framework [31, 32].
3.2 Single-Shot Methods With a direct approach, the goal is to create a consistent and reliable feature representation of the human under varied circumstances. A learning-based method looks for matching algorithms using training data that minimize distances between comparable classes while maximizing distances between classes that are different [33, 34]. However, their suggested approach was only effective provided the human photos from various cameras were taken from perspectives that were comparable [35, 36]. It suggested an appearance model that had been built using kernel estimation. They employed path-length and the Color Rank Feature in their research.
3.3 Multi-shot Methods

To address the issue of posture variation, the human image may be split into 10 horizontal stripes, from which the median hue, saturation, and lightness (HSL) of the color are derived. In order to lessen the dimensionality of these characteristics, linear discriminant analysis (LDA) was employed [13, 39]. For low-resolution videos, like those in the majority of surveillance systems, this strategy is unreliable, which motivated a new spatiotemporal segmentation method that produces a spatiotemporal graph from 10 consecutive human frames [37, 38, 40].
3.4 Rgbd-Based Approaches A novel idea that utilizes depth is presented. Depth data is color-independent and maintains its invariance for a longer duration compared to RGB information, even under challenging conditions such as lighting changes and alterations in clothing. Additionally, people may be more easily distinguished from the background with the depth value of each pixel, which significantly lessens the impact of the background [41, 42]. The distance between samples is calculated using Point Cloud Matching
(PCM). Two features, the depth voxel covariance descriptor and the Eigen-depth feature, are proposed to describe body shape [43]. Some Re-ID approaches merge RGB and depth information from Kinect to derive a more discriminative feature representation [44, 45]. The second kind of approach is based on geometrical characteristics: Re-ID is accomplished using the outlined technique, which involves matching body forms as full point clouds distorted to a typical position [46, 47]. For Re-ID, they use the anthropometric measurement approach. In order to calculate geometric properties like limb lengths and ratios, they employ the 3D position of body joints that the skeleton tracker provides [48].
4 Proposed Methodologies

The days of using keys, PIN codes, cards, and fingerprints for authorized access are passing; the proposed system moves door security into the facial recognition age. In this project, a model for smart door security is put forward that uses face recognition technology, combining a CNN with the Grassmann algorithm, to verify visitors. Accordingly, this study developed the face recognition system's approach for detecting and capturing an unknown person's face. The suggested framework is displayed in Fig. 1.
Fig. 1 Proposed system
4.1 Grassmann Algorithm

The Grassmann algorithm is a highly efficient method for solving linear programming problems: it can quickly find the optimal solution for problems with hundreds or even thousands of variables, it can be used to solve a wide range of optimization problems, including problems with multiple objectives and constraints, and it is guaranteed to find the optimal solution for linear programming problems, provided that a feasible solution exists. We divide up the cropped faces using a Grassmann algorithm that was modeled after a video face matching technique. In the suggested method, sampling and describing a registration manifold is the crucial stage. The first step in the approach is to preprocess the input picture (a detected face) to a standardized size and convert it to grayscale in order to compute its features. We then define a set of affine parameters for geometric normalization, based on a pair of face coordinates; an illustrative sketch of this preprocessing step is given below.
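The sketch below illustrates only the preprocessing described here (grayscale conversion, affine normalization from a pair of face coordinates, and resizing to a standard size), assuming OpenCV is used and that the two eye coordinates come from some face/landmark detector; the output size and the use of the eyes as the coordinate pair are assumptions, not the authors' settings.

```python
import cv2
import numpy as np

def normalize_face(image_bgr: np.ndarray,
                   left_eye: tuple[float, float],
                   right_eye: tuple[float, float],
                   out_size: int = 96) -> np.ndarray:
    """Grayscale the detected face and align it so the eyes lie on a horizontal line."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)

    # Rotation angle and centre derived from the pair of face (eye) coordinates.
    dx, dy = right_eye[0] - left_eye[0], right_eye[1] - left_eye[1]
    angle = np.degrees(np.arctan2(dy, dx))
    center = ((left_eye[0] + right_eye[0]) / 2, (left_eye[1] + right_eye[1]) / 2)

    # Affine (rotation) normalization followed by resizing to a standardized size.
    rot = cv2.getRotationMatrix2D(center, angle, scale=1.0)
    aligned = cv2.warpAffine(gray, rot, (gray.shape[1], gray.shape[0]))
    return cv2.resize(aligned, (out_size, out_size))
```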
4.2 Convolutional Neural Network Algorithm A feed-forward network that can extract topological properties from input images is known as a convolutional neural network (CNN). The network first extracts features from the original image, which are then classified by a classifier. CNNs can handle simple geometric transformations such as translation, scaling, rotation, and compression. They achieve this by combining local receptive fields, common weights, and spatial or temporal sub-sampling to provide some level of shift, scale, and distortion invariance. Each link is given a trainable weight, but the weights are shared by all units of a feature map. Weight sharing approach is a feature used in all CNN layers that enables lowering the number of trainable parameters.
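As an illustration of the layer structure described above (convolutions with shared weights and local receptive fields, spatial sub-sampling, and a classifier on the extracted features), the following is a minimal Keras sketch for classifying registered household members; the input shape, layer sizes and `num_members` are assumptions rather than the authors' architecture.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_face_cnn(num_members: int, input_shape=(96, 96, 1)) -> tf.keras.Model:
    """Small CNN: shared-weight convolutions + pooling (sub-sampling) + classifier."""
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu", padding="same"),  # local receptive fields
        layers.MaxPooling2D(),                                    # spatial sub-sampling
        layers.Conv2D(64, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(num_members, activation="softmax"),          # one class per member
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_face_cnn(num_members=5)
model.summary()
```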
4.3 Limitations Limited applicability: The landmark algorithm is most effective when applied to datasets with a low intrinsic dimensionality. It may not be suitable for highdimensional datasets, as the landmarks may not capture the important features of the data. Sensitivity to landmark selection: The effectiveness of the landmark algorithm depends on the quality of the landmarks selected. If the landmarks are not representative of the dataset, then the algorithm may not be effective.
4.4 Future Works Despite recent advancements, face recognition technology still faces challenges in recognizing faces accurately under various conditions. Future work could focus on developing algorithms that are more robust to changes in lighting, pose, and other environmental factors, as well as addressing issues related to bias and discrimination.
5 Conclusion In conclusion, visitor authentication using face recognition technology is a reliable and convenient way to enhance security in various settings. The technology involves capturing and processing facial images to extract unique features that are used for comparison with those stored in a database. Face recognition is contactless and fast, making it an efficient way to authenticate visitors. However, it is important to note that face recognition technology is not fool proof, and errors can occur due to variations in lighting, pose, and other environmental factors. Therefore, it is important to implement additional security measures such as password verification or secondary authentication methods to ensure maximum security. Overall, the use of face recognition technology can significantly enhance visitor authentication and contribute to the overall security of a facility.
References
1. Ye M, Lan X, Wang Z, Yuen PC (2019) Bi-directional center-constrained top-ranking for visible thermal person re-identification. IEEE Trans Inf Forensics Secur 15:407–419
2. Tsai T-H, Huang C-C, Chang C-H, Hussain MA (2020) Design of wireless vision sensor network for smart home. IEEE Access 8:60455–60467. https://doi.org/10.1109/ACCESS.2020.2982438
3. Muhammad K, Ahmad J, Mehmood I, Rho S, Baik SW (2018) Convolutional neural networks based fire detection in surveillance videos. IEEE Access 6:18174–18183. https://doi.org/10.1109/ACCESS.2018.2812835
4. Bhatti MT, Khan MG, Aslam M, Fiaz MJ (2021) Weapon detection in real-time CCTV videos using deep learning. IEEE Access 9:34366–34382. https://doi.org/10.1109/ACCESS.2021.3059170
5. Alamelu V, Thilagamani S (2022) Lion based butterfly optimization with improved YOLO-v4 for heart disease prediction using IoMT. Inf Technol Control 51(4):692–703
6. Renuka KM, Kumari S, Zhao D, Li L (2022) Design of a secure password-based authentication scheme for M2M networks in IoT enabled cyber-physical systems. IEEE Access 7:51014–51027. https://doi.org/10.1109/ACCESS.2019.2908499
7. Pandey SK, Vanithamani S, Shahare P, Ahmad SS, Thilagamani S, Hassan MM, Amoatey ET (2022) Machine learning-based data analytics for IoT-enabled industry automation. Wireless Commun Mobile Comput, Article ID 8794749, 12 pp
8. Mohammed Ali A, Kadhim Farhan A (2020) A novel improvement with an effective expansion to enhance the MD5 hash function for verification of a secure E-document. IEEE Access 8:80290–80304. https://doi.org/10.1109/ACCESS.2989050
9. Mun H-J, Hong S, Shin J (2018) A novel secure and efficient hash function with extra padding against rainbow table attacks. Cluster Comput 21(1):1161–1173. https://doi.org/10.1007/s10586-017-0886-4
10. Akilandeswari V, Kumar A, Thilagamani S, Subedha V, Kalpana V, Kaur K, Asenso E (2022) Minimum latency-secure key transmission for cloud-based internet of vehicles using reinforcement learning. Comput Intell Neurosci
11. Li H, Hu J, Yu J, Yu N, Wu Q (2021) UFaceNet: research on multitask face recognition algorithm based on CNN. Algorithms 14(9):268. https://doi.org/10.3390/a14090268
12. Murugesan M, Thilagamani S (2021) Bayesian feed forward neural network-based efficient anomaly detection from surveillance videos. Intell Automation Soft Comput 34(1):389–405
13. Kim K, Lee B, Kim JW (2017) Feasibility of deep learning algorithms for binary classification problems. J Intell Inf Syst 23(1):95–108. https://doi.org/10.13088/jiis.2017.23.1.095
14. Tsai T-H, Tsai TH, Huang CC, Chang CH, Hussain MA (2020) Design of wireless vision sensor network for smart home. IEEE Access 8:60455–60467
15. Chambino LL, Silva JS, Bernardino A (2020) Multispectral facial recognition: a review. IEEE Access 8:207871–207883
16. Waseem M, Khowaja SA, Ayyasamy RK, Bashir F (2020) Face recognition for smart door lock system using hierarchical network. In: International conference on computational intelligence (ICCI). IEEE
17. Li H, Hu J, Yu J, Yu N, Wu Q (2021) UFaceNet: research on multi-task face recognition algorithm based on CNN. Algorithms 14(9):268
18. Lee M, Mun H-J (2020) Comparison analysis and case study for deep learning-based object detection algorithm. Int J Adv Sci Converg 2(4):7–16
19. Jiang X, Gao T, Zhu Z, Zhao Y (2021) Real-time face mask detection method based on YOLOv3. Electronics 10(7):837
20. Renuka KM, Kumari S, Zhao D, Li L (2019) Design of a secure password-based authentication scheme for M2M networks in IoT enabled cyber-physical systems. IEEE Access 7:51014–51027
21. Shah SW, Kanhere SS (2019) Recent trends in user authentication–a survey. IEEE Access 7:112505–112519
22. Alsawwaf M, Chaczko Z, Kulbacki M, Sarathy N (2022) In your face: person identification through ratios and distances between facial features. Vietnam J Comput Sci 9(2):187–202
23. Pradeep D, Bhuvaneswari A, Nandhini M, Roshini Begum A, Swetha N (2023) Survey on attendance system using face recognition, pervasive computing and social networking. In: Lecture notes in networks and systems, vol 475. Springer, Singapore
24. Pandiaraja P, Aishwarya S, Indubala SV, Neethiga S, Sanjana K (2022) An analysis of E-commerce identification using sentimental analysis: a survey. In: Applied computational technologies. ICCET 2022. Smart innovation, systems and technologies, vol 303. Springer, Singapore, pp 742–754
25. Priya P, Girubalini S, Lakshmi Prabha BG, Pranitha B, Srigayathri M (2023) A survey on privacy preserving voting scheme based on blockchain technology. In: Choudrie J, Mahalle P, Perumal T, Joshi A (eds) IOT with smart systems. Smart innovation, systems and technologies, vol 312. Springer, Singapore
26. Lee M-H, Kang J-Y, Lim S-J (2020) Design of YOLO-based removable system for pet monitoring. J Korea Inst Inf Commun Eng 24(1):22–27. https://doi.org/10.6109/jkiice.2020.24.1.22
27. Sathana V, Mathumathi M, Makanyadevi K (2022) Prediction of material property using optimized augmented graph-attention layer in GNN. Mater Today: Proc 69(3)
28. Lee KM, Song H, Kim JW, Lin CH (2018) Balanced performance for efficient small object detection YOLOv3-tiny. In: Proceeding Korean society broadcast engineering conference, Anseong, South Korea. The Korean Institute of Broadcast and Media Engineers
236
P. Priya et al.
29. . Shenvi DR, Shet K (2021) CNN based COVID-19 prevention system. ˙In: Proceeding ınternational conference artificial ıntelligent smartsystem (ICAIS), pp 873–878. 117-118.https://doi. org/10.1109/ICAIS50930.2021.9396004 30. .Han S (2020) Age estimation from face images based on deep learning. ˙In: Proceedding ınternational conference computer data science (CDS). Stanford, CA, USA, pp 288–292. https:/ /doi.org/10.1109/CDS49703.2020.00063 31. .Shankar A, Sumathi K, Pandiaraja P, Stephan T, Cheng X (2022) Wireless multimedia sensor network qos bottleneck alert mechanism based on fuzzy logic. J Circ Syst Comput 31(11) 32. . Waseem M, Khowaja SA, Ayyasamy RK, Bashir F (2020) Face recognition for smart door lock system using hierarchical network. ˙In: Proceeding ınternational conference computer ıntelligent (ICCI). Seri Iskandar,Malaysia, pp 51–56. https://doi.org/10.1109/ICCI51257.2020.9247836 33. Mun H-J, Han K-H (2020) Design for access control system based on voice recognition for infectious disease prevention. J Korea Converg Soc 11(7):19–24. https://doi.org/10.15207/ JKCS.2020.11.7.019 34. . Saravanan S, Abirami T, Pandiaraja P (2018) Improve efficient keywords searching data retrieval process in cloud server. In: International conference on ıntelligent computing and communication for smart world (I2C2SW). Erode, India, pp 219–223 35. Zhou J, Su B, Wu Y (2018) Easy identification from better constraints: multi-shot person reidentification from reference constraints. ˙In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5373–5381 36. Li S, Bak S, Carr P, Wang X (2018) Diversity regularized spatiotemporal attention for videobased person re-identification. ˙In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 369–378 37. Zhu X, Jing X, You X, Zhang X, Zhang T (2018) Video-based person re-identification by simultaneously learning intra-video and inter-video distance metrics. IEEE Trans Image Proc 27(11):5683–5695 38. . Hou R, Chang H, Ma B, Shan S, Chen X (2020) Temporal complementary learning for video person re-identification. ˙In: European conference on computer vision. Springer, pp 388–405 39. .Pandiaraja P, Muthumanickam K, Palani Kumar R (2023) A graph-based model for discovering host-based hook attacks. Smart technologies in data science and communication. In: Lecture notes in networks and systems, vol 558. Springer, Singapore, pp 1–13 40. Suh Y, Wang J, Tang S, Mei T, Mu Lee K (2018) Part-aligned bilinear representations for person re-identification. Proc Euro Conf Comput Vis 201:402–419 41. . Liu C-T, Wu C-W, Wang Y-C F, Chien S-Y (2019) Spatially and temporally efficient non-local attention network for video-based person re-identification, arXiv preprint arXiv:1908.01683 42. .Pandiaraja P, Boopesh KB, Deepthi T, Laksmi Priya M, Noodhana R (2022) An analysis of document summarization for educational data classification using NLP with machine learning techniques. Applied computational technologies. In: ICCET 2022. Smart ınnovation, systems and technologies, vol 303. Springer, Singapore, pp 127–143 43. Chen D, Li H, Xiao T, Yi S, Wang X (2018) Video person reidentification with competitive snippet-similarity aggregation and coattentive snippet embedding. ˙In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1169–1178 44. .Ye M, Lan X, Wang Z, Yuen PC (2019), Bi-directional centerconstrained top-ranking for visible thermal person re-identification. IEEE Trans Inf Forensics Secur. 
15:407–419 45. Ye M, Shen J, Lin G, Xiang T, Shao L, Hoi SC (2021) Deep learning for person re-identification: a survey and outlook. IEEE Trans Pattern Anal Mach Intell. https://doi.org/10.1109/TPAMI. 2021.3054775,inpress 46. .Wu Y, Lin Y, Dong X, Yan Y, Ouyang W, Yang Y (2018) Exploit the unknown gradually: oneshot video-based person re-identification by stepwise learning. ˙In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 5177–5186 47. . Fu Y, Wang X, Wei Y, Huang T (2019) Spatial-temporal attention for large-scale video-based person re-identification. ˙In: Proceedings of the association for the advancement of artificial ıntelligence
AI Powered Authentication for Smart Home Security—A Survey
237
48. .Li S, Bak S, Carr P, Wang X (2018) Diversity regularized spatiotemporal attention for videobased person re-identification. ˙In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 369–378 49. Fang W, Wang L, Ren P (2019) Tinier-YOLO: a real-time object detection method for constrained environments. IEEE Access 8:1935–1944
WSRR: Weighted Rank-Relevance Sampling for Dense Text Retrieval Kailash Hambarde and Hugo Proença
Abstract As in many other domains based on the contrastive learning paradigm, negative sampling is seen as a particularly sensitive problem when training dense text retrieval models. It is widely accepted that existing techniques often suffer from uninformative or false negatives, which reduces the computational effectiveness of the learning phase and can even reduce the probability of convergence of the whole process. Given these limitations, in this paper we present a new approach for dense text retrieval (termed WRRS: Weighted Rank-Relevance Sampling) that addresses the shortcomings of current negative sampling strategies. WRRS assigns probabilities to negative samples based on their relevance scores and ranks, which consistently leads to improvements in retrieval performance. Under this perspective, WRRS offers a solution to the uninformative or false negatives produced by traditional negative sampling techniques, which we see as a valuable contribution to the field. Our empirical evaluation was carried out against the AR2 baseline on two well-known datasets (NQ and MS Doc), pointing to consistent improvements over SOTA performance. Keywords Dense text retrieval · Relevance score · Negative sampling
1 Introduction
The field of information retrieval has undergone significant advances with the growing amount of textual data, making the task of retrieving relevant information from large text sources a crucial problem. In this context, queries and documents are typically represented by low-dimensional vectors, from which similarity metrics are used to assess the relevance of a document with respect to a query [1–5]. Despite the widespread use of these methods, the main challenge in training dense text retrieval models lies in selecting appropriate negatives from a large pool of documents during negative sampling [1]. K. Hambarde (B) · H. Proença IT: Instituto de Telecomunicações, University of Beira Interior, Covilha, Portugal e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_22
The most commonly used negative sampling methods, such as random negative sampling [1, 6] and top-k hard negative sampling [7, 8], have obvious limitations: random negative sampling tends to select uninformative negatives, while top-k sampling may include false negatives [4, 7]. To address these limitations, this paper proposes a novel approach, termed WRRS: Weighted Rank-Relevance Sampling, for the dense text retrieval task. When compared to the SOTA, the key point is that WRRS assigns a probability to each negative sample based on its relevance score and rank, and subsequently selects negative samples based on the joint probability of both factors, which enables better sets of negative instances to be selected, allowing for consistent improvements in the training phase of dense text retrieval methods.
Preliminary The dense text retrieval (DTR) task consists of retrieving the k most relevant documents from a large candidate pool, given a query q. Owing to its efficiency, the dual-encoder architecture is the most widely used in DTR, consisting of a query encoder E_q and a document encoder E_d. The two encoders map the query and the document into k-dimensional dense vectors, h_q and h_d, respectively. The semantic relevance score between q and d can be computed as follows:

s(q, d) = h_q · h_d.    (1)
Recently, pre-trained language models (PLMs) have also been adopted as dual encoders in DTR, with the representation of the [CLS] tokens used as the dense vectors [9]. Overall, according to a contrastive learning paradigm, the objective of the dense text retrieval task is to maximize the semantic relevance between the query q and the most relevant documents D+, while minimizing the semantic relevance between q and any other irrelevant documents, D− = D \ D+. As in many other contrastive learning-based applications, training these models is seen as a highly sensitive task, with negative sampling commonly used to speed up the training process and increase the probability of convergence. This step involves either randomly sampling negatives or selecting the top-k hard negatives ranked by BM25 or by the dense retrieval model itself [1, 4]. The optimization objective can be formulated as follows:

θ* = arg min_θ Σ_q Σ_{d+ ∈ D+} Σ_{d− ∈ D−} exp(−L(s(q, d+), s(q, d−))),    (2)

where L(·) is the loss function.
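As an illustration of the dual-encoder scoring in Eq. (1) and of a contrastive objective over one positive and several sampled negatives, the following is a minimal PyTorch-style sketch. The loss L(·) is left generic in Eq. (2), so the softmax/cross-entropy form used below is only one common instantiation and not necessarily the one used here; the encoders E_q and E_d are assumed to be given (e.g., BERT-based) and only their output vectors appear in the sketch.

```python
import torch
import torch.nn.functional as F

def relevance_scores(h_q, h_docs):
    """Eq. (1): dot-product relevance between one query vector and document vectors.

    h_q    : (dim,) dense query embedding produced by E_q.
    h_docs : (n, dim) dense document embeddings produced by E_d.
    """
    return h_docs @ h_q  # shape (n,)

def contrastive_loss(h_q, h_pos, h_negs):
    """One common form of the contrastive objective: push s(q, d+) above every s(q, d-)."""
    scores = torch.cat([relevance_scores(h_q, h_pos.unsqueeze(0)),   # positive score first
                        relevance_scores(h_q, h_negs)])              # then the negatives
    target = torch.zeros(1, dtype=torch.long)                        # positive sits at index 0
    return F.cross_entropy(scores.unsqueeze(0), target)

# Toy usage with random 8-dimensional embeddings.
h_q, h_pos, h_negs = torch.randn(8), torch.randn(8), torch.randn(4, 8)
print(contrastive_loss(h_q, h_pos, h_negs))
```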
The remainder of this paper is organized as follows: Sect. 2 overviews the commonly used negative sampling methods in dense text retrieval, including random negative sampling, top-k hard negative sampling and others. Section 3 presents the proposed WRRS approach in detail. Section 4 reports our experiments, comparing WRRS performance to baseline negative sampling methods. Finally, Sect. 5 summarizes the contributions of this paper and highlights the significance of the proposed WRRS approach for dense text retrieval.
2 Related Work
Recent years have seen significant advancements in dense retrieval methods for text retrieval tasks [10–13]. Unlike traditional sparse retrieval methods such as TF-IDF and BM25, dense retrieval methods convert queries and documents into low-dimensional dense vectors, which are then compared using vector distance metrics (e.g., cosine similarity) for retrieval. To learn an effective dense retrieval model, it is critical to sample high-quality negatives that are paired with the given query and positive samples. Early works in the field [1, 14] mostly utilized in-batch random negatives and hard negatives sampled by BM25. Later, a series of studies [4, 7] found that using the top-k ranked examples as hard negatives, selected by the dense retriever itself, is more effective in improving the performance of the retriever. Some methods [7, 15] adopt a dynamic sampling strategy that actively re-selects the top-k hard negatives at fixed intervals during training. However, these top-k negative sampling strategies can easily lead to the selection of higher-ranking false negatives for training. To address this issue, previous works have incorporated techniques such as knowledge distillation [4, 16, 17], pre-training [18, 19], and other denoising techniques [20, 21] to alleviate the problem. Despite their effectiveness, these methods often rely on complicated training strategies or additional complementary models.
Negative Sampling In the field of dense text retrieval, the selection of negative instances is commonly seen as playing a crucial role in appropriately training this kind of model. According to a classical contrastive learning paradigm, negative samples are used to help the model discriminate between relevant and irrelevant passages. Three types of negatives are typically considered in [1]: (1) The first regards random negatives, which are any random passages from the corpus. The authors highlight the drawback of using random negatives, as they may not be semantically or contextually relevant to the question, leading to uninformative negatives; (2) The second type consists of BM25 negatives, which are top passages returned by the BM25 retrieval algorithm that match the majority of question tokens but do not contain the answer. The authors find that BM25 negatives may also result in irrelevant or nonsensical passages being selected as negatives; Finally, (3) the third type is gold negatives, which are positive passages paired with other questions that appear in the training set. The authors mention that gold negatives may not generalize well to new questions or represent real-world scenarios, and may also not be diverse enough to cover all possible incorrect answers. To address the issue of constructing uninformative negatives, Ref. [7] proposes a novel method called "Approximate Nearest Neighbor Noise Contrastive Estimation" (ANCE). This method samples negatives globally from the corpus and uses an asynchronously updated Approximate Nearest Neighbor (ANN) index to retrieve the top documents via the Dense Retrieval (DR) model. The results show the importance of constructing negatives globally to improve learning convergence. Ref. [22] proposes a multi-stage training approach that improves negative contrast in neural passage retrieval. The approach consists of six stages, starting with random sampling of negatives from the corpus and ending with the selection
of negatives based on the outputs of a fine-tuned neural retrieval model. The authors suggest that this multi-stage approach allows for the selection of negatives that are more representative of the true negatives, leading to improved learning convergence and performance. Ref. [23] proposes a new approach to negative sampling called SimANS for training dense retrieval models. The authors observe that ambiguous negatives, which are negatives ranked near the positives, are more informative and less likely to be false negatives. Therefore, SimANS incorporates a new sampling probability distribution that samples more ambiguous negatives. The experiments show that SimANS outperforms other negative sampling methods and provides a promising solution to the problem of negative sampling in dense retrieval models. Two training algorithms for Dense Retrieval (DR) models, named Stable Training Algorithm for dense Retrieval (STAR) and query-side training Algorithm for Directly Optimizing Ranking performance (ADORE), are proposed in [15]. STAR aims to improve the stability of DR training by introducing random negatives, while ADORE replaces the commonly used static hard negative sampling method with a dynamic one to directly optimize the ranking performance. Ref. [24] focuses on training sparse representation learning-based neural retrievers using hard-negative mining and Pre-trained Language Model initialization. This work is based on SPLADE, a sparse expansion-based retriever, and aims to improve its performance and efficiency in both in-domain and zero-shot settings. The results showed that the use of hard-negative mining and Pre-trained Language Model initialization led to state-of-the-art results and demonstrated the effectiveness of these techniques for sparse representation learning-based neural retrievers. Ref. [25] proposes a new negative sampling strategy, Cross-Batch Negative Sampling (CBNS), for the training of two-tower recommender system models. This strategy takes advantage of the encoded item embeddings from recent mini-batches to improve the model training, effectively reducing the memory and computational costs associated with large batch size training. The results of both theoretical analysis and empirical evaluations demonstrate the effectiveness and efficiency of CBNS in comparison to existing strategies. In this context, the Weighted Rank Relevance Sampling (WRRS) algorithm, as described in Algorithm 1, presents a new approach to negative sampling in dense retrieval that addresses the limitations of previous methods. The WRRS algorithm first builds an approximate nearest neighbor (ANN) index using the dense retrieval model, retrieves the top-k ranked negatives for each query with their relevance scores, computes the relevance scores of each query and its positive documents, sorts the negatives for each query based on relevance scores, and generates probabilities for each negative sample based on its rank and relevance score. The algorithm then uses these probabilities to sample negatives for each instance in the batch during optimization of the dense retrieval model. This approach aims to generate a more diverse set of negatives and reduce the likelihood of sampling higher-ranking false negatives during training.
3 Proposed Approach
It should be stressed that "WRRS—Weighted Rank-Relevance Sampling" can be seen as an extension of [23], assigning a probability to each negative sample based not only on its relevance score, but also on its position (rank) in the retrieved list. The probability π is obtained using two methods. The first uses only the rank of the negative sample:

π = (k − rank(D−)) / (k(k + 1)/2),    (3)

where k is the number of retrieved negatives, D− is the negative sample, and rank(D−) is its rank in the retrieved list. The second is a weighted method that combines the relevance score and the rank:

π = α · f(|s(q, D−) − s(q, d+) − b|) + (1 − α) · f(r(q, D−)),    (4)
where α is a weighting factor between 0 and 1. If α is 0, the relevance score is not considered and the probability is based solely on the rank; if α is 1, the rank is not considered and the probability is based solely on the relevance score. For values of α between 0 and 1, both the rank and the relevance score are considered, with higher values of α giving more weight to the relevance score and lower values giving more weight to the rank. Here, s(q, d+) is the average relevance score of the positive documents, b is a bias term, and r(q, D−) is the rank of the negative document D−. The negative samples are then selected based on the calculated probability π for each sample.
Algorithm 1 Weighted Rank Relevance Sampling
Input: Queries and their positive documents (q, D+), document pool D, pre-learned dense retrieval model M
Output: New training data (q, D+, D−)
1: Build the ANN index on D using M.
2: Retrieve the top-k ranked negatives D− for each query, with their relevance scores s(q, D−), from D.
3: Compute the relevance scores of each query and its positive documents s(q, D+).
4: Sort the negatives D− for each query based on their relevance scores s(q, D−) in descending order.
5: Generate the probability of each negative sample based on its rank and relevance score using Eq. 4.
6: Construct new training data (q, D+, D−).
7: while M has not converged do
8:   Sample a batch from (q, D+, D−).
9:   Sample negatives for each instance in the batch according to π.
10:  Optimize the parameters of M using the batch and the sampled negatives.
11: end while
WRRS is a method for obtaining negatives for a given mini-batch. The method consists of three key steps, as described below.
Step 1: Selection of Top-k Ranked Negatives [4, 7]. This step is similar to previous methods and selects the top-k ranked negatives from the candidate pool (D \ D+) using an approximate nearest neighbor (ANN) search tool such as FAISS [26].
Step 2: Computation of Sampling Probabilities. In this step, the sampling probabilities for all the top-k ranked negatives are computed using Eq. 4.
Step 3: Sampling of Negatives. The negatives are sampled according to their computed sampling probabilities. The overall algorithm is presented in Algorithm 1.
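To make Steps 2 and 3 concrete, the following is a minimal Python sketch of the WRRS probability computation and sampling, assuming the top-k negatives and their relevance scores have already been retrieved (e.g., with FAISS). The transformation f (a negative exponential of the score gap), the bias b, and the value of α are illustrative placeholders rather than values taken from the paper.

```python
import numpy as np

def wrrs_probabilities(neg_scores, pos_score, alpha=0.5, bias=0.0):
    """Compute WRRS sampling probabilities for the k top-ranked negatives.

    neg_scores : relevance scores s(q, d-), sorted in descending order so that
                 index i corresponds to rank i + 1.
    pos_score  : average relevance score s(q, d+) of the positive documents.
    alpha      : weight balancing the relevance-score term against the rank term.
    bias       : the bias term b of Eq. 4.
    """
    k = len(neg_scores)
    ranks = np.arange(1, k + 1)

    # Rank-based term (Eq. 3): higher-ranked negatives receive more mass.
    rank_term = (k - ranks) / (k * (k + 1) / 2.0)

    # Relevance-based term: negatives whose score is close to the positives
    # ("ambiguous" negatives) receive more mass.  The negative exponential used
    # as f here is only one possible choice of transformation.
    score_gap = np.abs(np.asarray(neg_scores) - pos_score - bias)
    score_term = np.exp(-score_gap)

    # Weighted combination (Eq. 4), normalised into a probability distribution.
    pi = alpha * score_term + (1.0 - alpha) * rank_term
    return pi / pi.sum()

def sample_negatives(neg_ids, probs, n_samples, seed=0):
    """Step 3: draw negatives for one training instance according to pi."""
    rng = np.random.default_rng(seed)
    return rng.choice(neg_ids, size=n_samples, replace=False, p=probs)

# Example: five retrieved negatives with toy relevance scores.
probs = wrrs_probabilities([0.92, 0.85, 0.60, 0.41, 0.30], pos_score=0.95, alpha=0.7)
print(sample_negatives(["d1", "d2", "d3", "d4", "d5"], probs, n_samples=2))
```

Under this sketch, α = 1 recovers purely score-based sampling and α = 0 purely rank-based sampling, mirroring the behaviour described above.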
4 Experiments and Results
To evaluate the effectiveness of WRRS, experiments were conducted on two public text retrieval datasets: MSMARCO Document Ranking (MS Doc) [27] and NQ [28]. The statistics of these datasets are presented in Table 1. Detailed information about the baselines and implementations can be found in [23]. The MS Doc dataset consists of 3,213,835 documents, with 367,013 instances in the training set and 5193 instances in the development set. The NQ dataset consists of 21,015,324 documents, with 58,880 instances in the training set, 8757 instances in the development set, and 3610 instances in the test set. As a baseline, we used the AR2 [8] retrieval method. The performance of BM25, AR2, and WRRS was evaluated on the MS Doc development set, as shown in Table 2, measured using two metrics, MRR@10 (Mean Reciprocal Rank at 10) and R@100 (Recall at 100). As seen in Table 2, both AR2 and WRRS outperform BM25 on both MRR@10 and R@100, with scores of 0.418 and 0.425 for MRR@10, and 0.914 and 0.917 for R@100, respectively.
Table 1 Statistics of the retrieval datasets

Dataset    Training    Dev      Test     Documents
MS Doc     367,013     5,193    –        3,213,835
NQ         58,880      8,757    3,610    21,015,324
Table 2 Comparison between the mean reciprocal rank (MRR) values for WRRS and the baselines considered, on the MS Doc development set

Method    MRR@10    R@100
BM25      0.279     0.807
AR2       0.418     0.914
WRRS      0.425     0.917
Table 3 Comparing retrieval performance on the NQ test set using BM25, AR2, and WRRS

Method    R@5     R@20    R@100
BM25      –       59.1    73.7
AR2       77.9    86.2    90.1
WRRS      78.0    85.7    90.3
Fig. 1 Retrieval performance and training latency with regard to different k-sampled negative values on the NQ dataset
BM25, on the other hand, has scores of 0.279 for MRR@10 and 0.807 for R@100, indicating that it performs relatively poorly compared to AR2 and WRRS. Table 3 provides the performance of BM25, AR2, and WRRS on the NQ test set. As shown in Table 3, AR2 and WRRS perform similarly and outperform BM25 on all three metrics. AR2 has an R@5 score of 77.9, an R@20 score of 86.2, and an R@100 score of 90.1. WRRS has an R@5 score of 78.0, an R@20 score of 85.7, and an R@100 score of 90.3. BM25, on the other hand, has an R@20 score of 59.1 and an R@100 score of 73.7. This suggests that WRRS performs better than AR2 and BM25 in retrieving relevant items.
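For reference, a minimal sketch of how the two reported metrics could be computed from ranked result lists is given below; the ranked lists and relevance labels are hypothetical inputs, not the actual evaluation code behind the tables.

```python
def mrr_at_k(ranked_lists, relevant_sets, k=10):
    """Mean Reciprocal Rank: average of 1/rank of the first relevant hit in the top-k."""
    total = 0.0
    for ranked, relevant in zip(ranked_lists, relevant_sets):
        for rank, doc_id in enumerate(ranked[:k], start=1):
            if doc_id in relevant:
                total += 1.0 / rank
                break
    return total / len(ranked_lists)

def recall_at_k(ranked_lists, relevant_sets, k=100):
    """Recall@k: fraction of the relevant documents found in the top-k results."""
    total = 0.0
    for ranked, relevant in zip(ranked_lists, relevant_sets):
        if relevant:
            total += len(set(ranked[:k]) & relevant) / len(relevant)
    return total / len(ranked_lists)

# Toy usage with two queries.
runs = [["d3", "d7", "d1"], ["d2", "d9", "d4"]]
gold = [{"d7"}, {"d4", "d8"}]
print(mrr_at_k(runs, gold, k=10), recall_at_k(runs, gold, k=100))
```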
Ablation Studies In our ablation experiments, two algorithms (AR2 baseline and WRRS) were tested and evaluated on NQ based on their recall at 5 (R@5) and latency. The results of this evaluation are presented in Fig. 1. R@5 is a commonly used metric in the evaluation of information retrieval systems, and it represents the fraction of relevant items among the first five retrieved items. In this study, the WRRS algorithm consistently outperforms the AR2 algorithm in terms of R@5 recall, with higher scores for each rank in the test cases. This suggests that the WRRS algorithm is more effective at retrieving relevant items in an information retrieval system. However, the latency of the WRRS algorithm is higher than that of the AR2 algorithm. Latency refers to the amount of time it takes for a system to respond to a request, and lower latency is generally preferable in information retrieval systems.
5 Conclusions
In this paper, we proposed a new strategy (WRRS) for selecting negative samples to be used in the training phase of contrastive learning-based dense text retrieval methods. The proposed strategy was empirically compared to two baseline retrieval methods, BM25 and AR2, on two public text retrieval datasets: MSMARCO Document Ranking and NQ. The obtained results show that WRRS consistently outperformed BM25 on both the MRR@10 and R@100 metrics on the MS Doc development set and on the R@5, R@20, and R@100 metrics on the NQ test set. The performance of AR2 and WRRS was found to be similar, with WRRS providing slightly better results. These findings suggest that WRRS is a promising approach for text retrieval that can be useful for a broad range of applications. However, WRRS accuracy remains below that reported in [23]. Further research involves determining the changes that should be applied to the described WRRS in order to extend its capabilities to significantly different domains.
Acknowledgements The author would like to thank the AddPath—Adaptative Designed Clinical Pathways Project (CENTRO-01-0247-FEDER-072640 LISBOA-01-0247-FEDER-072640). This work is funded by FCT/MCTES through national funds and co-funded by EU funds under the project UIDB/50008/2020.
References 1. Karpukhin V, Oğuz B, Min S, Lewis P, Wu L, Edunov S, Chen D, Yih W-T (2020) Dense passage retrieval for open-domain question answering. arXiv preprint arXiv:2004.04906 2. Reimers N, Gurevych I (2019) Sentence-BERT: sentence embeddings using Siamese BERT-networks. arXiv preprint arXiv:1908.10084
3. Brickley D, Burgess M, Noy N (2019) Google dataset search: building a search engine for datasets in an open web ecosystem. In: The World Wide Web conference, pp 1365–1375 4. Qu Y, Ding Y, Liu J, Liu K, Ren R, Zhao WX, Dong D, Wu H, Wang H (2020) RocketQA: an optimized training approach to dense passage retrieval for open-domain question answering. arXiv preprint arXiv:2010.08191 5. Izacard G, Grave E (2020) Leveraging passage retrieval with generative models for open domain question answering. arXiv preprint arXiv:2007.01282 6. Luan Y, Eisenstein J, Toutanova K, Collins M (2021) Sparse, dense, and attentional representations for text retrieval. Trans Assoc Comput Linguistics 9:329–345 7. Xiong L, Xiong C, Li Y, Tang K-F, Liu J, Bennett P, Ahmed J, Overwijk A (2020) Approximate nearest neighbor negative contrastive learning for dense text retrieval. arXiv preprint arXiv:2007.00808 8. Zhang H, Gong Y, Shen Y, Lv J, Duan N, Chen W (2021) Adversarial retriever-ranker for dense text retrieval. arXiv preprint arXiv:2110.03611 9. Devlin J, Chang M-W, Lee K, Toutanova K (2018) BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 10. Zhan J, Mao J, Liu Y, Zhang M, Ma S (2020) RepBERT: contextualized text embeddings for first-stage retrieval. arXiv preprint arXiv:2006.15498 11. Hong W, Zhang Z, Wang J, Zhao H (2022) Sentence-aware contrastive learning for open-domain passage retrieval. In: Proceedings of the 60th annual meeting of the association for computational linguistics, vol 1: Long Papers, pp 1062–1074 12. Ram O, Shachaf G, Levy O, Berant J, Globerson A (2021) Learning to retrieve passages without supervision. arXiv preprint arXiv:2112.07708 13. Zhou K, Zhang B, Zhao WX, Wen J-R (2022) Debiased contrastive learning of unsupervised sentence representations. arXiv preprint arXiv:2205.00656 14. Min S, Michael J, Hajishirzi H, Zettlemoyer L (2020) AmbigQA: answering ambiguous open-domain questions. arXiv preprint arXiv:2004.10645 15. Zhan J, Mao J, Liu Y, Guo J, Zhang M, Ma S (2021) Optimizing dense retrieval model training with hard negatives. In: Proceedings of the 44th international ACM SIGIR conference on research and development in information retrieval, pp 1503–1512 16. Ren R, Qu Y, Liu J, Zhao WX, She Q, Wu H, Wang H, Wen J-R (2021) RocketQAv2: a joint training method for dense passage retrieval and passage re-ranking. arXiv preprint arXiv:2110.07367 17. Lu Y, Liu Y, Liu J, Shi Y, Huang Z, Sun SFY, Tian H, et al (2022) ERNIE-Search: bridging cross-encoder with dual-encoder via self on-the-fly distillation for dense passage retrieval. arXiv preprint arXiv:2205.09153 18. Zhou J, Li X, Shang L, Luo L, Zhan K, Hu E, Zhang X, et al (2022) Hyperlink-induced pre-training for passage retrieval in open-domain question answering. arXiv preprint arXiv:2203.06942 19. Xu C, Guo D, Duan N, McAuley J (2022) LaPraDoR: unsupervised pretrained dense retriever for zero-shot text retrieval. arXiv preprint arXiv:2203.06169 20. Mao K, Dou Z, Qian H (2022) Curriculum contrastive context denoising for few-shot conversational dense retrieval. In: Proceedings of the 45th international ACM SIGIR conference on research and development in information retrieval, pp 176–186 21. Hofstätter S, Lin S-C, Yang J-H, Lin J, Hanbury A (2021) Efficiently teaching an effective dense retriever with balanced topic aware sampling.
In: Proceedings of the 44th international ACM SIGIR conference on research and development in information retrieval, pp 113–122 22. Lu J, Abrego GH, Ma J, Ni J, Yang Y (2021) Multi-stage training with improved negative contrast for neural passage retrieval. In: Proceedings of the 2021 conference on empirical methods in natural language processing, pp 6091–6103 23. Zhou K, Gong Y, Liu X, Zhao WX, Shen Y, Dong A, Lu J, et al (2022) SimANS: simple ambiguous negatives sampling for dense text retrieval. arXiv preprint arXiv:2210.11773 24. Formal T, Lassance C, Piwowarski B, Clinchant S (2022) From distillation to hard negative sampling: making sparse neural IR models more effective. In: Proceedings of the 45th international ACM SIGIR conference on research and development in information retrieval, pp 2353–2359
25. Wang J, Zhu J, He X (2021) Cross-batch negative sampling for training two-tower recommenders. In: Proceedings of the 44th international ACM SIGIR conference on research and development in information retrieval, pp 1632–1636 26. Johnson J, Douze M, Jégou H (2019) Billion-scale similarity search with GPUs. IEEE Trans Big Data 7(3):535–547 27. Nguyen T, Rosenberg M, Song X, Gao J, Tiwary S, Majumder R, Deng L (2016) MS MARCO: a human generated machine reading comprehension dataset. Choice 2640:660 28. Kwiatkowski T, Palomaki J, Redfield O, Collins M, Parikh A, Alberti C, Epstein D, et al (2019) Natural questions: a benchmark for question answering research. Trans Assoc Comput Linguistics 7:453–466
Tweet Based Sentiment Analysis for Stock Price Prediction K. Abinanda Vrishnaa and N. Sabiyath Fatima
Abstract Economic, social and political variables have a large influence on the stock market. Stock markets can be influenced and driven by a variety of variables, both internal and external, and stock prices change from moment to moment due to variations in supply and demand. Various data mining methods are routinely used to address this challenge, and the problem of sentiment analysis for rising and falling stock prices is addressed here by analyzing a number of Deep Learning (DL) techniques. Older price prediction methods could estimate prices in real time but lacked the information and corrections needed to predict price deviations. In this work, sentiment analysis is used to predict market prices. Long Short-Term Memory (LSTM) and Bidirectional Encoder Representations from Transformers (BERT) are two models fused together to form the proposed Chalk Enhanced Algorithm (CEA) model, which increases the effectiveness of analyzing stock market ups and downs. CEA is an ensemble model of LSTM and BERT; it mitigates overfitting and requires less training time. A comparative analysis of the LSTM, BERT and proposed CEA models is carried out using four performance metrics (accuracy, precision, recall, and F1-score), and the proposed CEA model proves to be the best model for gauging stock sentiment. The higher accuracy and precision of the proposed model show that this novel method is more effective than the existing ones. This allows both new and existing investors to invest with confidence and take advantage of new opportunities. Keywords LSTM · BERT · Sentiment analysis · Stock price
K. Abinanda Vrishnaa · N. Sabiyath Fatima (B) Department of Computer Science and Engineering, B.S.A. Crescent Institute of Science and Technology, Chennai, Tamil Nadu, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_23
1 Introduction
Over the past decade, social media, blogs, and analytical sites have made huge amounts of user-generated content available. People can freely express their ideas in these settings, hence the data can be trusted. Thus, corporations can examine conventional publicity initiatives by analysing user-generated material and assess client satisfaction and brand recognition after new advertising efforts. Examining client feedback about a product can assist business leaders in identifying issues and recommending new, inspiring frameworks, encouraging open creativity. In recent years, numerous researchers have used automated sentiment analysis, also known as mood analysis, to analyse writers' views and feelings. Sentiment analysis usually extracts authors' opinions and labels them with polarities (positive, negative, neutral). Emotion assessment identifies the moods expressed in the text; given the number of classes and their complexity, this process is usually harder than sentiment analysis. Lexical [1] and knowledge-based techniques have addressed such tasks, with the latter presenting better results, and large-scale deep learning models have been studied recently [2]. Cryptocurrency price prediction also considers the fear and greed index, Bitcoin's (BTC) historical value, and social media mood; these variables feed DL models that predict the Bitcoin price. The proposed technique predicts prices better using these features, which gives existing investors more options and lets new investors learn from the data. The main disadvantage of existing systems is that they employ the Q-Learning model, which maintains a look-up table (called a q-map) holding the value associated with each state-action pair. In real-world situations where there are many potential outcomes, this method of maintaining a state-action-value table is impractical. Moreover, for the purpose of predicting stock prices, the current techniques ignore other attributes and social media sites. The advantage of the proposed system is that it makes it possible to examine the trajectory of the general public's opinion of a company through time and to use economic methods to identify any correlations between the company's stock market valuation and public opinion. Since Twitter allows us to download streams of geo-tagged tweets for specific regions, businesses may also estimate how well their product is being received in the market and which areas of the market are responding positively or negatively.
2 Related Works
Mardjo et al. [3] proposed HyVADRF with GWO, in which VADER generated the polarity scores and a supervised random forest classifier categorised the sentiments. Given Twitter's immensity, over 3.6 million tweets were collected, and various dataset sizes were analysed for model learning. Finally, GWO parameter tuning improved classifier performance: (1) HyVADRF had higher accuracy, precision, recall and F1-score, and (2) HyVADRF was consistent.
Otabek et al. [4] collected all tweets mentioning Bitcoin from a certain time period and, based on the following characteristics, classified them into four groups: Twitter groups, remarks, likes, and retweets. Predictions were made using a Q-Learning model trained and tested on the four hidden tweet groups described above; an optimal policy is therefore required to gain reasonable prediction accuracy. By receiving positive or negative rewards based on the result of the prediction, the model can improve an initially non-optimal strategy. Mittal et al. [2] collected approximately 7.5 million tweets and used Long Short-Term Memory (LSTM), Recurrent Neural Network (RNN), and Polynomial Regression to provide results on tweet sentiment. The RNN used by Pant et al. [5] classified Bitcoin tweets as positive or negative and combined these rates with Bitcoin's historical price. The RNN model predicts future prices using the ratio of positive and negative tweets and previous price data; the positive and negative tweet sentiment classification accuracies are 81.39 and 77.62. Serafini et al. [6] examined ARIMAX and RNN for Bitcoin time-series forecasting, combining statistical and deep-learning approaches with network attitudes to predict Bitcoin prices. The authors concluded from economic and crowdsourced data that mood is the most important factor in determining Bitcoin market equity. Finally, the ARIMAX and Recurrent Neural Network (RNN) models for Bitcoin time-series forecasts are compared. A set of considerations for employing LSTMs to create price expectations was shared by Ye et al. [7], where a Gated Recurrent Unit (GRU) was used in an ensemble strategy combined with LSTM, combining the benefits of LSTM and GRU. Not only sentiment analysis but also emotion analysis is considered: emotions are broken down into happiness, sadness, surprise, anger, and fear, while positive, negative, and neutral polarities are also used. Thanekar et al. [1] found that sentiment analysis of tweets including "bitcoin" or "btc" helped artificial intelligence (AI) models predict Bitcoin value volatility better than models that did not employ sentiment analysis. These models were trained using an autoregressive integrated moving average model and an LSTM network, and the tweets are analysed for sentiment. Sentiment research and machine learning algorithms are used to evaluate the relationship between Bitcoin movement and the sentiment in the collected tweets; positive Bitcoin social media messages are expected to attract more investors, raising Bitcoin's price. Gurrib et al. [8] used linear discriminant analysis (LDA) and sentiment analysis of tweets about bitcoin to forecast market behaviour for the next day, resulting in a prediction accuracy of 0.828. S. M. Raju et al. [9] used tweets to anticipate Bitcoin prices, employing long short-term memory RNNs (LSTM), and LSTM was more accurate. LSTM was used to test the time-series model's bitcoin price prediction against traditional sentiment-and-price analysis with ARIMA. The ARIMA model's RMSE is 209.263, whereas the LSTM's is 198.448 for single features and 197.515 for multiple features. Focusing on Bitcoin, Colianni et al. [10] examined how sentiment in tweets can be used to influence investment decisions. The authors used supervised machine learning
Fig. 1 Architecture diagram
methods to achieve over 90% accuracy on both an hourly and a daily basis. Bitcoin (BTC) is also the currency that these authors investigate.
3 Proposed Methodology
3.1 System Architecture
Sentiment analysis is the act of categorizing opinions expressed in a given piece of text, with the goal of determining whether a user's attitude toward a specific topic or news item is positive, negative, or neutral. Approaches are divided into lexicon-based methods and machine learning-based methods. Pre-processing is used to clean the data, and the proposed CEA model is then applied, with the end outcome labelled as positive, negative or neutral (Fig. 1).
3.2 Module Description
3.2.1 Data Collection
Data collection involves understanding initial observations of the data to identify useful subsets and form hypotheses about the hidden information. 5000 tweets about Microsoft were extracted using the Twitter API and filtered using keywords to represent the public's emotions about Microsoft over a period of time (Fig. 2).
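As an illustration only, the snippet below sketches how tweets mentioning a company could be pulled and keyword-filtered with the Tweepy client for the Twitter API; the bearer token, query string, and keyword list are placeholders, and the exact collection pipeline used for the 5000 Microsoft tweets is not specified in this paper.

```python
import tweepy

BEARER_TOKEN = "YOUR_BEARER_TOKEN"         # placeholder credential
KEYWORDS = ("msft", "microsoft", "stock")  # illustrative filter terms

client = tweepy.Client(bearer_token=BEARER_TOKEN)

# Pull recent English tweets mentioning Microsoft, excluding retweets.
response = client.search_recent_tweets(
    query="Microsoft lang:en -is:retweet",
    max_results=100,
    tweet_fields=["created_at"],
)

# Keep only tweets whose text contains at least one keyword of interest.
tweets = [t.text for t in (response.data or [])
          if any(k in t.text.lower() for k in KEYWORDS)]
print(f"collected {len(tweets)} tweets")
```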
Fig. 2 Sample tweet for sentiment analysis
3.2.2 Pre-processing
The data pre-processing stage involves selecting records, tables and attributes, and cleaning the data for the modeling tools. Tweets are preprocessed to represent the correct emotions using three stages of filtering: tokenization, stopword removal and regex matching. These preprocessing techniques are applied to remove noise and enhance text featurization.
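A minimal sketch of the three filtering stages described above is given below, using NLTK for tokenization and stopword removal and a regular expression for cleaning; the exact cleaning rules used in the paper are not specified, so the regex here is only indicative.

```python
import re
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

nltk.download("punkt", quiet=True)
nltk.download("stopwords", quiet=True)
STOPWORDS = set(stopwords.words("english"))

def preprocess(tweet: str) -> list[str]:
    # Regex matching: strip URLs, user mentions, and non-alphabetic characters.
    text = re.sub(r"http\S+|@\w+|[^a-zA-Z\s]", " ", tweet.lower())
    # Tokenization.
    tokens = word_tokenize(text)
    # Stopword removal (also dropping single characters left over from cleaning).
    return [t for t in tokens if t not in STOPWORDS and len(t) > 1]

print(preprocess("Loving the new $MSFT earnings! https://example.com"))
```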
3.3 Data Processing: Data Training
Investors use autoregressive and moving average models to forecast stock trends, but DL techniques require a large ground-truth dataset to train the classification models. To validate the accuracy of the optional sentiment labels provided by StockTwits users, random sampling is conducted. LSTM, BERT, and the proposed CEA are used to train the models and assign sentiment to tweets for which no sentiment labels are provided.
3.4 Model Training
The classifier is used to predict the emotions of non-human-annotated tweets, while forecasting is used to display time-series predictions. Negation detection measures are used to differentiate between "good" and "not good", as illustrated in the sketch below. Figure 3 shows the proposed work's dataflow.
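One simple, commonly used way to realize such negation handling is to tag every token that follows a negation cue until the next punctuation mark; the sketch below is a generic illustration, not the specific measure used in this work.

```python
import re

NEGATION_CUES = {"not", "no", "never", "n't", "cannot"}

def mark_negation(tokens):
    """Append a _NEG suffix to tokens that fall inside a negation scope."""
    negated, out = False, []
    for tok in tokens:
        if tok.lower() in NEGATION_CUES:
            negated = True
            out.append(tok)
        elif re.match(r"[.!?,;]", tok):
            negated = False            # punctuation closes the negation scope
            out.append(tok)
        else:
            out.append(tok + "_NEG" if negated else tok)
    return out

print(mark_negation(["the", "stock", "is", "not", "good", "today", "."]))
# ['the', 'stock', 'is', 'not', 'good_NEG', 'today_NEG', '.']
```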
4 Implementation
The Efficient Market Hypothesis (EMH) states that the market is unbeatable, making predicting the uptrend or downtrend a challenging task. This research combines existing techniques to create a more accurate and flexible prediction model. The system is implemented as follows.
Step 1: Split the records obtained from the Twitter API and the Finance API into a training set and a test set.
Fig. 3 Dataflow diagram
Step 2: Build a sequential model using CEA.
Step 3: Compile the CEA model with a batch size of 6, the chosen activation function, and accuracy as the metric, training for 100 epochs.
Step 4: Apply the final dense layers using Softmax as the output function, together with LeakyReLU, in Python.
Step 5: Fit the model on the training set and track the area under the curve for the three sentiment classes on the validation set.
Step 6: Evaluate the model against test datasets collected from the Twitter and Finance APIs.
Step 7: Create graphs relating training and validation/test accuracy, with performance metrics for accuracy, precision, recall, and F1-score.
Step 8: Create an accuracy plot by contrasting the actual and predicted outcomes.
Step 9: Assess the accuracy of stock forecasting by comparing it to existing algorithms.
4.1 CEA Algorithm
The Chalk Enhanced Algorithm (CEA) is the algorithm proposed here for sentiment classification. The model's operation is broken down into steps in Algorithm 1. The input parameters are the unit size, the activation code, and the batch size. When analysing the area curves, this approach takes into consideration the accuracy, F-score, precision, and recall. A refined, fine-tuned version of the proposed algorithm is given in Algorithm 2.
Algorithm 1: Chalk Enhanced Algorithm (CEA)
Input parameters: unit size, activation code, batch size
1 start
2 tune the model using tuned_lstm(units, act_func, batch): model = Sequential(); model.add(CEA(unit, input_shape = in_dim, activation = act_func))
3 add the BERT model using model.add(Bert(out_dim))
4 compare error and loss and validate: model.compile(loss = "mse", optimizer = "adam")
5 predict the best model: model.fit(train_X, y_train, epochs = 100, batch_size = batch, validation_data = (test_X, y_test)); y_pred = model.predict(test_X)
6 return accuracy (model, acc_scores(y_test, y_pred))
7 end
In this proposed algorithm, the model is trained iteratively for 100 epochs using the LeakyReLU activation function to compare and contrast the error, loss, accuracy and validation results; the final accuracy is attained through this model.
Algorithm 2: Fine Tuning of CEA
Pseudocode of the training phase with fine-tuning of CEA for sentiment detection
Input: Document D_training, Label L
Output: CEA Model
1 Load dataset();
2 DataAugmentation();
3 SplitData();
4 Load Model();
5 for I = 1 to length(D_training)
6   X = D[I]
7   Y = L[I]
8   for every epoch in epochNumber do
9     for every batch in batchSize do y = model(attributes); loss = crossEntropy(Y, y)
10    optimizations(loss); accuracy(loss);
11    bestaccuracy = max(bestaccuracy, accuracy);
Fig. 4 Working of CEA algorithm
12   return
13   Result = tokenization(x)
14   Result = CEA model(result)
15   Result = RELU(H*Result[0] + C)
16   Result = softmax(result)
17 next
18 save model (CEA Model)
The fine-tuning of CEA is carried out through Algorithm 2. The batch size and activation code are defined to find the best attainable accuracy. The tokenization and LeakyReLU functions help to determine the finest model accurately (Fig. 4). The working flow of the CEA algorithm determines the stock market sentiment for a company: the company details are loaded from the internet, the name of the company is fetched and checked for validation, the tweets related to that company are collected from Twitter, and the sentiment is determined as positive, negative or neutral.
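To ground the pseudocode above, the following is a minimal Keras sketch of how an LSTM branch over token sequences and precomputed BERT sentence embeddings could be combined and trained with the batch size of 6, LeakyReLU activation, softmax output, and 100 epochs mentioned earlier. The layer sizes, the use of precomputed BERT [CLS] vectors as a second input, and the fusion by concatenation are assumptions for illustration, since the paper does not fully specify these details.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model

MAX_LEN, VOCAB, BERT_DIM, N_CLASSES = 50, 20000, 768, 3

# Branch 1: LSTM over tokenized tweet sequences.
seq_in = layers.Input(shape=(MAX_LEN,), name="tokens")
x = layers.Embedding(VOCAB, 128)(seq_in)
x = layers.LSTM(64)(x)

# Branch 2: precomputed BERT sentence embeddings (e.g., [CLS] vectors).
bert_in = layers.Input(shape=(BERT_DIM,), name="bert_embedding")
y = layers.Dense(64)(bert_in)
y = layers.LeakyReLU()(y)

# Fuse both branches and classify into positive / neutral / negative.
z = layers.Concatenate()([x, y])
z = layers.Dense(32)(z)
z = layers.LeakyReLU()(z)
out = layers.Dense(N_CLASSES, activation="softmax")(z)

model = Model(inputs=[seq_in, bert_in], outputs=out)
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])

# Dummy arrays stand in for the real preprocessed tweet data.
X_tok = np.random.randint(0, VOCAB, size=(120, MAX_LEN))
X_bert = np.random.rand(120, BERT_DIM).astype("float32")
y_lbl = tf.keras.utils.to_categorical(np.random.randint(0, N_CLASSES, 120), N_CLASSES)
model.fit([X_tok, X_bert], y_lbl, epochs=100, batch_size=6,
          validation_split=0.2, verbose=0)
```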
5 Result and Analysis
This section covers the approaches used to train the Twitter sentiment classifiers. Using Word2vec features, the LSTM classifier is trained and tested with roughly 97% precision, while the BERT-based classifier reaches roughly 96%. Despite the comparable findings, Word2vec-based models label non-human-annotated tweets well for large datasets, owing to the semantic stability of the word representations, and the models agree on the text sentiment in 96–97% of cases. The proposed CEA model reaches 98% sentiment accuracy and classifies tweet emotion accurately. Table 1 shows the sentiment categorization accuracy, precision, F-measure, and recall.
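As a hedged illustration of how Word2vec features could be produced for the classifiers discussed here, the snippet below trains a small Gensim Word2Vec model on preprocessed tweets and averages word vectors into a tweet-level feature; the vector size, window, and averaging scheme are illustrative choices, not settings reported in the paper.

```python
import numpy as np
from gensim.models import Word2Vec

# Toy corpus of preprocessed (tokenized) tweets.
tweets = [["stock", "prices", "are", "bullish"],
          ["market", "looks", "bearish", "today"],
          ["great", "earnings", "report"]]

w2v = Word2Vec(sentences=tweets, vector_size=100, window=5, min_count=1, epochs=20)

def tweet_vector(tokens, model):
    """Average the word vectors of in-vocabulary tokens into one tweet embedding."""
    vecs = [model.wv[t] for t in tokens if t in model.wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(model.vector_size)

features = np.vstack([tweet_vector(t, w2v) for t in tweets])
print(features.shape)  # (3, 100)
```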
Figure 5 shows the accuracy, precision, recall, and F1-score graphs for the existing and proposed algorithms. Figure 6 shows the result of text sentiment classification for the sample text "Stock prices are bullish"; the final output is categorized as positive 0.4967, neutral 0.2897, and negative 0.041. Figure 7 shows the ROC for the positive sentiment classification (score = 0.4967), Fig. 8 shows the ROC for the neutral sentiment classification (score = 0.2897), and Fig. 9 shows the ROC for the negative sentiment classification (score = 0.041). The sentiment of the above statement is therefore rated as positive.
Table 1 Results for sentiment analysis for Word2vec
Deep learning models    Accuracy    Precision    Recall    F-measure
LSTM                    0.97        0.95         0.945     0.95
Bert                    0.96        0.97         0.96      0.96
Proposed CEA            0.98        0.97         0.975     0.965
Fig. 5 Plot for the accuracy, precision, recall and F1-Measure for existing and proposed algorithms (CEA)
Fig. 6 Sample text sentiment classification result
Fig. 7 ROC for positive sentiment
Fig. 8 ROC for neutral sentiment
Fig. 9 ROC for negative sentiment
6 Conclusion and Future Work
This work found a substantial correlation between a company's stock price and Twitter sentiment. A Twitter sentiment analyzer was developed that classifies tweets as positive, negative, or neutral; the stock price tends to rise when Twitter users view a company favourably. The results support the hypotheses and suggest directions for future investigation. This paper uses word2vec features to assess the stock market from tweets and analyses three DL models (LSTM, BERT, and the proposed CEA) using precision, accuracy, recall, and F1-score. The proposed technique surpasses the current algorithms with 98% accuracy. In this paper, the sentiment was assessed purely using Twitter data, which may be skewed because not all stock traders share their thoughts on Twitter; additionally, news data might be included for a more comprehensive gathering of public opinion. Future plans include manually annotating over 10,000 tweets and training the classifiers on them, since model performance tends to improve with larger training datasets.
References 1. Thanekar A, Shelar S, Thakare A, Yadav V (2019) Bitcoin movement prediction using sentimental analysis of Twitter feeds. Int J Comput Sci Eng 7(2):148–152 2. Mittal A, Dhiman V, Singh A, Prakash C (2019) Short-term bitcoin price fluctuation prediction using social media and web search data. In: Proceeding 12th international conference contemporary computing (IC), pp 1–6 3. Mardjo A, Choksuchat C, et al (2022) HyVADRF: hybrid VADER–random forest and GWO for bitcoin tweet sentiment analysis. IEEE Access 10:101889–101897. https://doi.org/10.1109/ACCESS.2022.3209662 4. Otabek S, Choi J (2022) Twitter attribute classification with Q-learning on Bitcoin price prediction. IEEE Access 10:96136–96148. https://doi.org/10.1109/ACCESS.2022.3205129 5. Pant DR, Neupane P, Poudel A, Pokhrel AK, Lama BK (2018) Recurrent neural network based bitcoin price prediction by Twitter sentiment analysis. In: Proceeding IEEE 3rd international conference computer communication security (ICCCS), pp 128–132 6. Serafini G, Yi P, Zhang Q, Brambilla M, Wang J, Hu Y, et al (2020) Sentiment-driven price prediction of the bitcoin based on statistical and deep learning approaches. In: Proceeding international joint conference neural networks (IJCNN), pp 1–8 7. Ye Z, Wu Y, Chen H, Pan Y, Jiang Q (2022) A stacking ensemble deep learning model for bitcoin price prediction using Twitter comments on bitcoin. Mathematics 10(8):1307 8. Gurrib I, Kamalov F, Smail L (2021) Bitcoin price forecasting: linear discriminant analysis with sentiment evaluation. In: Proceeding ArabWIC 7th annual international conference Arab women computing, in conjunction with 2nd forum women in research, Sharjah (UAE), pp 148–152 9. Raju SM, Tarif AM (2020) Real-time prediction of BITCOIN price using machine learning techniques and public sentiment analysis. arXiv:2006.14473 10. Colianni S, Rosales S, Signorotti M (2015) Algorithmic trading of cryptocurrency based on Twitter sentiment analysis, pp 1–5
A Study on the Stock Market Trend Predictions Rosemol Thomas, Hiren Joshi, and Hardik Joshi
Abstract This paper reflects on recent studies in the discipline of stock market forecasting. It compares the different methods implemented to find the direction of stock market movement, and it highlights the analysis of news as a factor in the correctness of the predictions. Sentiment analysis and prediction algorithms can be blended together to form a hybrid technique that gives the best results. Not only the news factor but also the technical and fundamental factors involved are studied. Various works in the literature are studied and inspected to learn about the recent developments that have taken place in the area of stock market prediction. The literature has considered data from various stock markets of the world, and the various metrics used to evaluate the algorithms are taken into consideration. The study is intended to find what needs to be added to future research so as to achieve better accuracy and correctness in the prediction of the future direction of the stock exchange. Keywords Stock market · Predictions · Sentiment analysis
1 Introduction
The future is concealed. If the future can be anticipated, then many wise decisions can be taken. Investors take risks when funding their well-earned savings in stocks, so accurate stock trend prediction will help shareholders take the right decision in choosing the shares to invest in. Stock movement can be predicted in two ways: (a) fundamental analysis and (b) technical analysis.
R. Thomas · H. Joshi (B) · H. Joshi Gujarat University, Ahmedabad, Gujarat, India e-mail: [email protected] H. Joshi e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_24
Fundamental analysis is the way of predicting the company's future based on the financial reports of the business; the annual report and several accounting ratios help to predict the growth or the decline of the business. Technical analysis predicts the future solely based on the past trend of the market. As is always said, the past repeats itself, and so do the patterns in the stock exchange. The data is mapped on a graph to analyze several patterns, and the patterns in turn help to recognize what will happen next. Sentiment analysis of the stock exchange can be combined with technical study to give better results. Various news articles and events occurring throughout the world have their effect on the stock market, as it revolves around the psychology of the investors, and investors' psychology is driven by various moods. NLP can be used for analyzing news articles and various social sites to better understand this psychology.
2 Background
2.1 Research Mainly Based on Technical Analysis
Much of the research that has taken place in this field of prediction focuses on technical analysis based on various parameters of the past. Srivinay et al. [1] used a deep neural network (DNN) along with the prediction rule ensembles (PRE) technique to develop a hybrid stock model. The hybrid model works better than a single DNN or ANN. The paper considered the stocks of Kotak, ICICI, Axis and Yes Bank and concluded that there is a 5–7% improvement in the RMSE score. Stock prediction is more accurate if it deals with the non-linear data, and observations show that non-linear data can be better handled by a hybrid model. Candlestick pattern identification along with different technical indicators is viewed as a way to improve the prediction [1]. Adil and Mhamed [2] use a Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN) on asset prices to forecast future prices. The tests confirmed that the model is strong enough to trace the trend of asset prices. The greater the number of epochs taken to train the model, the better the prediction will be, and not only the number of epochs but also the length of the data affects the accuracy of price prediction [2]. Nabipour et al. [3] gathered data from the Tehran stock market for four different sectors over ten years. The study focuses on nine machine learning models and also includes two deep learning algorithms. The approach implements the models on two forms of data: it considers the continuous data, and in the second technique it preprocesses the data to convert it into binary form; LSTM and RNN perform the best on both types of input [3]. Vijh et al. [4] use the historical stock data of five companies to predict the next-day closing price of the stock. As the stock prices are not linear, new variables are generated using the available values. The output of the tests reveals
that the Artificial Neural Network (ANN) technique proves more efficient than Random Forest (RF), an ensemble machine learning technique. The paper concludes that if the news factor is considered as a parameter along with other financial parameters, the results will be enhanced [4]. Balaji et al. [5] designed fourteen distinct deep learning models; Directional Accuracy (DA) is better achieved by GRU-based models for a shorter forecast horizon and by ELM-based models for a longer forecast horizon. The metrics used to measure the test results show that deep learning models generate accurate stock price forecasts [5]. Sezer and Ozbayoglu [6] presented an algorithmic model, CNN-TA, based on image processing properties. Apache Spark, Keras and TensorFlow are used for big data analysis and image creation. The time-series data is converted into 2-D images using technical indicators, and the images can be identified and classified to represent whether to buy, sell or hold. The paper concludes that if the CNN structural parameters are improved, the algorithm may perform better [6]. Hiransha et al. [7] used four kinds of deep learning architectures, i.e. MLP, RNN, LSTM and CNN, as well as an ARIMA model. Datasets of Maruti (automobile), Axis Bank (banking) and HCL Tech (IT sector) from the NSE are considered, and the predictions are made on the basis of the closing price of the market. Observations show that CNN performs better than the other models, and as ARIMA does not identify the non-linear pattern in the data, the neural networks perform better compared to ARIMA [7]. Patel et al. [8] proposed a two-stage fusion approach in which the output of SVR from the first stage is combined in different combinations with ANN, SVR and RF to predict the value of the stock. It has been observed that the two-stage hybrid model works better than the single-stage prediction model. They suggest that a semi-supervised system which includes news that affects the company's market value will pave the way for more accurate predictions [8]. Patel and Yalamalle [9] propose a feedforward multilayer perceptron (MLP) neural network technique which gives satisfactory output when used to train and build the network [9] (Table 1).
2.2 Research Mainly Based on Sentiment Analysis

Some of the research focuses on sentiment analysis, combined with technical analysis, for predicting the future stock value, which leads to much better results. Khalil and Pipa [15] presented a model that analyses text to derive a sentiment index and uses it to build an LSTM-based AI model. It uses a Naive Bayes classifier for preprocessing the news analytics, and at each stage the performance of the model is checked using metrics. A limitation of the model is that the timing of the news is coordinated with the opening and closing times of the stock market, so the remaining news of the day is not examined. Moreover, time zones vary across the globe, and many organizations have branches all over the world [15].
Table 1 Table comparing various research papers

References | Algorithm | Results | Data sources | Metrics | Future scope
Zhang et al. [10] | MTL-LS-SVR and EMTL-LS-SVR model | EMTL-LS-SVR | Chinese stock market index | Root mean squared error (RMSE) and mean absolute error (MAE) | Application of DL
Boonmatham and Meesad [11] | SVM, MLP, LSTM, GRU | GRU and LSTM | Corporate news articles and stock closing prices | Accuracy, precision, recall and F1-score | Reduce processing time of LSTM and GRU
Zhou et al. [12] | LSTM, CNN and GAN | GAN-FD perform better | Chinese stock market | Root mean squared relative error (RMSRE); direction prediction accuracy (DPA) | Integrate predictive models under multiscale conditions
Lachiheb and Gouider [13] | Hierarchical Deep Neural Network | DNN | TUNINDEX, the Tunisian stock exchange main index | Mean squared error (MSE) | Financial news, reports and moods
Patel et al. [14] | ANN, SVM, random forest and Naive-Bayes | Random forest | India-based stock exchange, i.e. two stock price indices Nifty 50 and BSE Sensex | Precision and recall | Consider processing macro-economic variables
Chandorkar and Newrekar [16] designed a hybrid model which uses news as a factor along with the previous stock closing price. Various text preprocessing methods are used, including normalization, lemmatization, stemming, tokenization and bag of words. LSTM, a recurrent neural network that operates on long sequences of data, is efficient at training the model on events that happened far back in the time series, and it results in good accuracy and a lower error rate [16]. Wu et al. [17] presented a deep learning prediction model to forecast stock dimensional valence-arousal (SDVA). They named the model the hierarchical title-keyword-based attentional hybrid network (HAHTKN). The proposed model outperformed other models, with the results evaluated using the MSE and RMSE metrics. Future work can include word embedding models and sentiment corpora to
better understand stock market news and thereby improve the accuracy of the predictions [17]. Seong and Nam [18] predict stock price movement by applying K-means clustering and a multiple kernel learning technique based on the information of the selected company and its related group. The Support Vector Machine (SVM) proves to be better than the Artificial Neural Network (ANN), K-Nearest Neighbors (kNN) and Naïve Bayes (NB) for predicting stock movements with textual data. Multiple Kernel Learning (MKL) is an ensemble technique that uses various data simultaneously by combining multiple kernels [18]. Joshi et al. [19] developed three different classification models that classify news according to whether it has a good or bad effect on the stock value. The daily stock prices and news articles of the company are considered. The results show that RF performed better than SVM, and SVM better than Naïve Bayes. For companies where financial news is not available, Twitter data can be considered in the future [19]. Bhardwaj et al. [20] fetch live Sensex and Nifty data and extract news using Python and its libraries to carry out sentiment analysis, and they conclude that better results are possible with more advanced functions [20].
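As an illustration of the news-classification setups surveyed above, the following hedged sketch (not reproduced from any of the cited works) compares Naïve Bayes, SVM and Random Forest on TF-IDF features of a few toy headlines; the headlines and their labels are invented placeholders.

```python
# Hedged sketch: classify headlines by their assumed effect on the stock price.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score

headlines = [
    "shares surge after record quarterly results",
    "regulator imposes heavy fine on the bank",
    "company announces dividend and share buyback",
    "profit warning drags the stock lower",
]
labels = [1, 0, 1, 0]                      # 1 = positive effect, 0 = negative (toy labels)

X = TfidfVectorizer().fit_transform(headlines)
for clf in (MultinomialNB(), LinearSVC(), RandomForestClassifier(n_estimators=100)):
    clf.fit(X, labels)                     # on real data, fit on a training split only
    print(type(clf).__name__, f1_score(labels, clf.predict(X)))
```

On a real corpus the comparison would be made on held-out data, with the predicted sentiment merged into the price-based feature set.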
3 Conclusion

Research and analysis have been going on in this field for a long period of time, but the stock market is so volatile that the algorithms and techniques fail to give the accuracy needed by investors. The parameters influencing stock values need to be analyzed more accurately, and more articles and other factors presented as news need to be taken into consideration for better prediction results. Hybrid models need to be developed for more precision in stock trend prediction.
References 1. Srivinay Manujakshi BC, Kabadi MG, Naik N (2022) A hybrid stock price prediction model based on PRE and deep neural network. Data 7. https://doi.org/10.3390/data7050051 2. Adil M, Mhamed H (2020) Stock market prediction using LSTM recurrent neural network. Procedia Comput Sci 170:1168–1173. https://doi.org/10.1016/j.procs.2020.03.049 3. Nabipour M, Nayyeri P, Jabani H, Shahab S, Mosavi A (2020) Predicting stock market trends using machine learning and deep learning algorithms via continuous and binary data; A Comparative Analysis. IEEE Access 8:150199–150212. https://doi.org/10.1109/ACCESS. 2020.3015966 4. Vijh M, Chandola D, Tikkiwal VA, Kumar A (2020) Stock closing price prediction using machine learning techniques. Procedia Comput Sci 167:599–606. https://doi.org/10.1016/j. procs.2020.03.326
5. Balaji AJ, Ram DSH, Nair BB (2018) Applicability of deep learning models for stock price forecasting an empirical study on BANKEX data. Procedia Comput Sci 143:947–953. https:// doi.org/10.1016/j.procs.2018.10.340 6. Sezer OB, Ozbayoglu AM (2018) Algorithmic financial trading with deep convolutional neural networks: time series to image conversion approach. Appl Soft Comput J 70:525–538. https:// doi.org/10.1016/j.asoc.2018.04.024 7. Hiransha MEAG, Menon VK, KP S (2018) NSE stock market prediction using deep-learning models. Procedia Comput Sci 132:1351–1362. https://doi.org/10.1016/j.procs.2018.05.050 8. Patel J, Shah S, Thakkar P, Kotecha K (2014) Predicting stock market index using fusion of machine learning techniques. Exp Syst Appl 9. Patel MB, Yalamalle SR (2014) Stock price prediction using artificial neural network. Int J Innov Res Sci Eng Technol 3:13755–13762 10. Zhang H, Wu Q, Li F-Y, Li H (2022) Multitask learning based on least squares support vector regression for stock forecast. Axioms 11:292. https://doi.org/10.3390/axioms11060292 11. Boonmatham S, Meesad P (2020) Stock price analysis with natural language processing and machine learning. In: ACM international conference proceeding series, pp 2–7 12. Zhou X, Pan Z, Hu G, Tang S, Zhao C (2018) Stock market prediction on high-frequency data using generative adversarial nets. Math Probl Eng. https://doi.org/10.1155/2018/4907423 13. Lachiheb O, Gouider MS (2018) A hierarchical deep neural network design for stock returns prediction. Procedia Comput Sci 126:264–272. https://doi.org/10.1016/j.procs.2018.07.260 14. Patel J, Shah S, Thakkar P, Kotecha K (2015) Predicting stock and stock price index movement using trend deterministic data preparation and machine learning techniques. Expert Syst Appl 42:259–268. https://doi.org/10.1016/j.eswa.2014.07.040 15. Khalil F, Pipa G (2022) Is deep-learning and natural language processing transcending the financial forecasting? Investigation through lens of news analytic process. Comput Econ 60:147–171. https://doi.org/10.1007/s10614-021-10145-2 16. Chandorkar M, Newrekar S (2021) Stock market forecasting using natural language processing and long short term memory. Int J Eng Res Technol 10:96–99 17. Wu J-L, Huang M-T, Yang C-S, Kai-Hsuan L (2021) Sentiment analysis of stock markets using a novel dimensional valence – arousal approach. Soft Comput 25:4433–4450. https://doi.org/ 10.1007/s00500-020-05454-x 18. Seong N, Nam K (2020) Predicting stock movements based on financial news with segmentation. Expert Syst Appl 113988. https://doi.org/10.1016/j.eswa.2020.113988 19. Joshi K, HN B, Rao J (2016) Stock trend prediction using news sentiment analysis. Int J Comput Sci Inf Technol 8:67–76.https://doi.org/10.5121/ijcsit.2016.8306 20. Bhardwaj A, Narayan Y, Dutta M (2015) Sentiment analysis for Indian stock market prediction using sensex and nifty. Procedia - Procedia Comput Sci 70:85–91. https://doi.org/10.1016/j. procs.2015.10.043
Optimization of Partial Products in Modified Booth Multiplier Sharwari Bhosale, Ketan J. Raut , Minal Deshmukh , Abhijit V. Chitre , and Vaibhav Pavnaskar
Abstract The Modified Booth Multiplier often finds prominent use in applications that require high speed multiplications. In comparison to conventional multiplication methods, the Modified Booth Encoding (MBE) algorithm halves the number of partial products. This work is concerned with further reducing the number of partial products. The typical MBE multiplier encodes every 3-bit sequence regardless of its repetition count. This results in recurrent encoding and the creation of incomplete products for the same sequence. The proposed technique, Optimized Booth Multiplier (OBM), prevents numerous encodings of the same sequence, hence reducing the total number of partial products. The modified booth method reduces partial products by 50%, but the proposed technique reduces partial products by an average of 81.25%. This paper describes a novel technique for a 16 × 16 multiplication that decreases the number of partial products to five or less, hence improving the design’s overall speed and power consumption. Using Xilinx ISE Design Suite 14.7 and ISIM, the design for 16 × 16 multiplication is synthesized and simulated. Keywords Modified booth encoding multiplier · Partial product · Synthesis · Optimized booth multiplier
S. Bhosale (B) · K. J. Raut · M. Deshmukh · A. V. Chitre Department of Electronics and Telecommunication Engineering, Vishwakarma Institute of Information Technology, Pune, Maharashtra, India e-mail: [email protected] K. J. Raut e-mail: [email protected] M. Deshmukh e-mail: [email protected] A. V. Chitre e-mail: [email protected] V. Pavnaskar Microchip, Cork, Ireland © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_25
1 Introduction

Binary information or data must undergo considerable processing in digital systems. Arithmetic and logical procedures, Multiply and Accumulate (MAC), and a number of Digital Signal Processing (DSP) activities using convolution require rapid multiplication. Reducing the total time needed for multiplication dramatically improves the efficiency of such operations. Booth's algorithm is capable of multiplying both signed and unsigned binary values. The Modified Booth Algorithm yields fewer partial products, hence lowering the total time required for multiplication [1]. Booth's algorithm produces a partial product for each encoded bit group and combines these partial products to produce the final product. Booth's algorithm (radix-2) encodes groups of two bits, while modified Booth's algorithm (radix-4) encodes groups of three bits. Although other high-speed architectures for radix-8 [2] and radix-16 [3] MBE have been developed, the radix-4 Booth technique is more efficient for the creation of partial products [4]. Wallace Tree Adders and Carry Save Adders (CSA) can be utilized to shorten the time required for partial product addition [5]. Reducing the number of partial products has a significant impact on the overall time required to perform multiplication. Numerous methods for high-speed Booth multipliers have been presented; these multipliers are implemented via parallel design [6], pipelined architecture [7–9], or by creating partial products of fixed length [10]. The array multiplier is less complex but slower, while the Wallace tree multiplier offers the greatest speed at the expense of high energy consumption, complexity, and area [11]. Consequently, the modified Booth multiplier offers a reasonable compromise between the aforementioned characteristics, as it possesses high speed, low power consumption, and moderate complexity and area. The structure of the paper is as follows: Sect. 2 explains the standard Booth's algorithm, Sect. 3 explains the Modified Booth's algorithm, Sect. 4 describes the proposed Optimized Booth Multiplier (OBM) technique, including the algorithm to reduce partial products along with a suitable example of 16 × 16 multiplication, Sect. 5 compares MBE and OBM using the synthesis report and simulation results, illustrating the reduction in the number of clock cycles required to obtain the result with the proposed technique, and Sect. 6 concludes the paper.
2 Booth's Algorithm

In the conventional multiplication technique [11], to multiply n-bit numbers, 'n' partial products are generated and added after shifting. Booth's algorithm (radix-2) analyses 2 bits of the multiplier (Q) at a time and decides which step should be performed. The multiplier (Q) is appended with a 0 bit (Q−1) to the right of its LSB; Q0 and Q−1 are the 2 multiplier bits that are compared against the 4 possible combinations shown in Table 1.
Table 1 Encoding of radix-2 Booth multiplier [12]

Q0 Q−1 | Recoded bit | Operation
00, 11 | 0 | Arithmetic right shift by 1
01 | +1 | Shift after addition of multiplicand and product
10 | −1 | Subtract multiplicand from product (or add 2's complement of M) and shift
Fig. 1 Booth’s algorithm flow-chart
As shown in Fig. 1, an arithmetic shift operation takes place on a register consisting of P, Q, and Q-1. Hence, during shifting, the MSB of P is retained, i.e., the sign of the product is retained, and the LSB of Q is shifted to Q-1. For negative numbers, the product generated is in the 2’s complement form. Recoding 2 bits of multiplier at a time reduces the overall execution time as any calculations for combinations 00 and 11 are replaced by a simple shift operation.
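The recoding of Table 1 can be summarised in a few lines of code. The following is a small illustrative sketch in Python (an assumption made purely for exposition; the paper's own design is written in Verilog), in which each (Q0, Q−1) pair selects whether the shifted multiplicand is added, subtracted or skipped.

```python
# Hedged sketch of radix-2 Booth multiplication: the recoded digit for each
# (Q_i, Q_{i-1}) pair is 0, +1 or -1, exactly as listed in Table 1.
def to_signed(value, bits):
    # Interpret an unsigned bit pattern as a two's-complement number.
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def booth_radix2(m, q, bits=16):
    m = to_signed(m, bits)
    product, prev = 0, 0                      # prev is the appended Q_-1 bit
    for i in range(bits):
        cur = (q >> i) & 1
        digit = prev - cur                    # 00/11 -> 0, 01 -> +1, 10 -> -1
        product += (digit * m) << i           # add/subtract the shifted multiplicand
        prev = cur
    return product

# Example of Fig. 2: M = 1101 (-3) and Q = 0111 (7) give -21.
assert booth_radix2(0b1101, 0b0111, bits=4) == -21
```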
3 Modified Booth’s Algorithm Two’s complement representation of multiplier Q is expressed as shown in Eq. 1 [3].
Q = -Q_{n-1} 2^{n-1} + \sum_{i=0}^{n-2} Q_i 2^i    (1)

Equation 1 can also be written as

Q = \sum_{i=0}^{n/2-1} (-2Q_{2i+1} + Q_{2i} + Q_{2i-1}) 2^{2i}    (2)

The partial products can be calculated as given by Eq. 3 [13].

PP_n = -2Q_{n+1} + Q_n + Q_{n-1}    (3)
Modified Booth's algorithm (radix-4) analyses a 3-bit group of the multiplier (Q) at a time. These groups overlap, i.e., the MSB of the previous group is the LSB of the next group. When the number of multiplier bits is odd, the sign bit is extended to the left of the MSB. Figure 2 displays the recoding of the multiplier and the formation of partial products for multiplier (Q) = 0111 and multiplicand (M) = 1101. Analyzing the pairs from right to left, each 3-bit pair is encoded as shown in Table 2, and the generation of partial products after recoding is demonstrated in Table 3. These results can also be derived by combining two recoded bits of the radix-2 Booth recoding, as shown in Fig. 2.

Fig. 2 Radix-4 modified Booth multiplier recoding and partial product generation
Table 2 Encoding of radix-4 modified Booth's algorithm

Multiplier bits (Q1 Q0 Q−1) | Booth recoding (Q1 Q0) | Modified Booth encoding
0 0 0 | 0, 0 | 0
0 0 1 | 0, +1 | +1
0 1 0 | +1, −1 | +1
0 1 1 | +1, 0 | +2
1 0 0 | −1, 0 | −2
1 0 1 | −1, +1 | −1
1 1 0 | 0, −1 | −1
1 1 1 | 0, 0 | 0
Table 3 Recoded bit representation of radix-4 modified Booth's algorithm

Operation | Recoded bits (Neg Two One) | Partial product PP (sign extended)
0 | 0 0 0 | PP = 0
+1 M | 0 0 1 | PP = M (sign extended)
+2 M | 0 1 0 | PP = {M, 0} (sign extended M with 0 appended)
−2 M | 1 1 0 | PP = {−M, 0} (2's complement of M with sign extension and 0 appended)
−1 M | 1 0 1 | PP = −M (2's complement of M with extension of sign)
−1 and + 2 are the recoded bits of pairs 0 and 1, respectively, for Q = 0111. Figure 2 shows that radix-2 and radix-4 provide the same result using their respective recoding. Each partial product is then added after left shifting it by 2 * i, where i is the position of the bit pair or bit pair number. In the example stated above, the computation of products using conventional methods would generate four partial products, whereas the significant reduction in their number by the factor of 2 can be witnessed with the implementation of MBE. Therefore, the above example demonstrates a 50% reduction in partial products on application of the modified booth’s algorithm.
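The grouping and recoding just described can be checked with a short script. The sketch below is an illustrative Python rendering (an assumption; the paper's implementation is in Verilog) of the radix-4 recoding of Table 2 and the partial-product weighting of Eq. (3).

```python
# Hedged sketch of radix-4 MBE: overlapping 3-bit groups (Q_{2i+1}, Q_{2i}, Q_{2i-1})
# are recoded to digits in {0, +/-1, +/-2}; digit i weights one partial product by 2^(2i).
def to_signed(value, bits):
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def mbe_digits(q, bits=16):
    q_ext = q << 1                            # append Q_-1 = 0 on the right
    digits = []
    for i in range(bits // 2):
        group = (q_ext >> (2 * i)) & 0b111    # bits Q_{2i+1} Q_{2i} Q_{2i-1}
        b2, b1, b0 = (group >> 2) & 1, (group >> 1) & 1, group & 1
        digits.append(-2 * b2 + b1 + b0)      # PP_n = -2*Q_{n+1} + Q_n + Q_{n-1}, Eq. (3)
    return digits

def mbe_multiply(m, q, bits=16):
    m = to_signed(m, bits)
    return sum((d * m) << (2 * i) for i, d in enumerate(mbe_digits(q, bits)))

# The Fig. 2 example: Q = 0111 recodes to -1 (pair 0) and +2 (pair 1), so Q = -1 + 2*4 = 7.
assert mbe_digits(0b0111, bits=4) == [-1, +2]
assert mbe_multiply(0b1101, 0b0111, bits=4) == -3 * 7
```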
4 Optimized Booth's Multiplier (OBM)

Booth's algorithm of radix 2^m produces N/m partial products for an N × N bit multiplication. This paper focuses on minimizing the overall number of partial products by eliminating the repetitive recoding of identical 3-bit pairs and of pairs having the same recoding. (000, 111), (001, 010) and (101, 110) are recoded as 0, +1, and −1, respectively, as shown in Table 2. Hence, these pairs will have the same
partial products. The overall contribution of such pairs can therefore be acquired from a single partial product, added after shifting as required. For 16 × 16 multiplication, the modified Booth algorithm generates 8 partial products (PP), whereas the proposed technique generates a maximum of 5 (worst case) and a minimum of 1 (best case) partial product. Here, the worst case indicates no repetition among the pairs, and the best case indicates that all pairs have the same recoding, i.e., maximum repetition. The worst-case scenario (5 PP) provides a 68.75% reduction, and the best-case scenario (1 PP) provides a 93.75% reduction in the number of PP. The modified Booth algorithm provides a 50% reduction in partial products, whereas the proposed technique provides an 81.25% (3 PP) reduction on average.
4.1 Proposed Algorithm to Reduce Partial Products

For 16 × 16 multiplication according to the modified Booth's algorithm, the multiplier bits are grouped into 8 pairs of 3 bits each, with an overlap of one bit. Thus, a register bank of 8 registers, each 11 bits wide, is designed as shown in Fig. 3. The last 3 bits of each register store the recoded value of the pair, and the remaining bits are set high to indicate the position of the pair, i.e., where the pair is present in the multiplier. Hence, eight registers are used to record the value and position of each pair. The 8 pairs of multiplier bits are encoded using MBE as shown in Table 2, each encoded operation is denoted using 3 bits as shown in Table 3 [10], and this recoded output is stored in the last 3 bits of each register. The following algorithm is then applied to eliminate pairs having repetitive recoding.

Algorithm 1. Partial product reduction algorithm
1: for i ← 0 to 7 do
2:   Regi[i] ← high            // current pair position
3:   for j ← i+1 to 7 do
4:     if Regi[10:8] == Regj[10:8] then
5:       Regi[j] ← high        // similar pair position
6:       eliminate Regj
7:     end
8:   end
9: end
After the elimination of redundant pairs, for the remaining valid pairs according to the recoding (Table 3), a partial product is generated for each pair as explained in Sect. 3. The partial product of each pair is shifted and added every time a 1 is discovered in the first 8 bits of the register. A partial product is left-shifted by 2 * k places, where k is the position at which 1 is present in the register.
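As a software illustration of this procedure (an assumption for exposition only; the actual design is an FSM described in Verilog), the sketch below recodes the eight pairs, merges pairs that share a recoded value, and accumulates the product with one partial product per distinct recoding, shifted by 2*k for every marked position k.

```python
# Hedged model of the OBM idea: one partial product per unique recoded digit,
# reused (shifted and added) at every pair position that carries that digit.
def to_signed(value, bits):
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def obm_multiply(m, q, bits=16):
    m = to_signed(m, bits)
    q_ext = q << 1                                   # append Q_-1 = 0
    positions = {}                                   # recoded digit -> list of pair positions
    for k in range(bits // 2):
        group = (q_ext >> (2 * k)) & 0b111
        digit = -2 * ((group >> 2) & 1) + ((group >> 1) & 1) + (group & 1)
        positions.setdefault(digit, []).append(k)    # repeated digits share one entry
    product = 0
    for digit, ks in positions.items():
        pp = digit * m                               # single partial product for this digit
        for k in ks:
            product += pp << (2 * k)                 # left shift by 2*k and accumulate
    return product, len(positions)                   # the 0 digit contributes PP = 0

# 16 x 16 example of Sect. 4.2: M = -2214, Q = -24543 -> only 4 distinct recodings.
prod, n_pp = obm_multiply(0b1111011101011010, 0b1010000000100001)
assert prod == (-2214) * (-24543) and n_pp == 4
```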
Fig. 3 OBM example a Multiplier pairs. b Insertion of recoded bits in register banks
4.2 16 × 16 Multiplication Using OBM

Consider a 16 × 16 multiplication where the multiplicand (M) = (1111 0111 0101 1010)2 = (−2214)10 and the multiplier (Q) = (1010 0000 0010 0001)2 = (−24,543)10. Therefore, −M = 0000 1000 1010 0110 by taking the 2's complement. Figure 3b depicts an 11-bit-wide register bank for each of the 8 registers. As illustrated in Fig. 3a, we form 8 pairs of 3 bits by inserting a zero (Q−1) to the right side of the multiplier. Pair 0, i.e., 010, is encoded as +1 using Table 2. According to Table 3, this +1 is recoded as 001 and stored at bit positions 10, 9, and 8, respectively. Similarly, the recoded output of each pair is stored in the respective register as shown in Fig. 3b. On application of the previously mentioned algorithm for elimination of repetitive pairs, a modified register bank is obtained as shown in Fig. 4. Here, the 1's indicate the presence of a pair at the respective positions. Highlighted registers are eliminated, i.e., these registers are excluded from further calculations. As the recoding of 010 (pair 0) is +1, referring to Table 3 we obtain a partial product (PP) = 1111 1111 1111 1111 1111 0111 0101 1010. The product is accumulated as P = P + (PP ≪ 2*k), where k is the bit position at which a 1 is present. In register 1, a 1 is present at bit positions 0 and 3; the same process is repeated for each remaining pair in order to obtain the product. For the above explained example, conventional multiplication would
Fig. 4 OBM example—final register bank
generate 16 partial products and a modified Booth multiplier would generate 8 partial products, whereas the proposed method generates only 4 partial products.

P = P + (PP ≪ (2*0)) + (PP ≪ (2*3))    (4)

P = P + PP + (PP ≪ 6)    (5)
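A quick numeric check of Eqs. (4)–(5), written here as plain Python arithmetic rather than the paper's RTL, confirms that shifting the single +1 partial product by 2*0 and 2*3 places and adding is equivalent to multiplying it by (1 + 2^6); the contributions of the remaining registers are accumulated in the same shift-and-add fashion.

```python
# The +1 recoding's partial product is the sign-extended multiplicand M itself.
pp = -2214
assert (pp << 0) + (pp << 6) == pp + (pp << 6) == pp * (1 + 2 ** 6)
```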
5 Results and Discussion 5.1 FPGA Device Utilization Figure 5 depicts the device utilization summary for the optimized Booth’s multiplier, designed using FSM modelling in Verilog. The multiplier was designed and synthesized using Xilinx ISE Design Suite 14.7 for the Spartan3e xc3s500e device. Figure 6a and b depicts the comparative power analysis between the optimized and conventional Booth’s multiplier respectively. The Xilinx XPower Analyzer tool
Fig. 5 Optimized booth’s multiplier device utilization summary
Fig. 6 Power analysis results. a Optimized booth’s multiplier. b Conventional booth’s multiplier
is used to calculate the power. The dynamic power is calculated with respect to the above mentioned example. A significant amount of power reduction is thus seen on implementation of the proposed Optimized MBE multiplier.
5.2 Simulation Results Figures 7 and 8 depicts the output waveforms for the optimized Booth’s multiplier and conventional Booth’s multiplier respectively. Multiplier and multiplicand are the same as used in the above explained example, i.e., M = −2214, Q = −24,543. It can be observed that the output of an optimized Booth’s multiplier is achieved in 4 clock cycles, whereas a conventional Booth’s multiplier generates output after 8 clock cycles. Here, each clock cycle computes a product after recoding and addition of each partial product.
Fig. 7 Optimized booth’s multiplier simulation result
Fig. 8 Conventional booth’s multiplier simulation result
6 Conclusion

This paper presents an Optimized Booth Multiplier (OBM) approach to perform fast multiplications. The proposed approach is implemented in Verilog utilizing FSM modelling. OBM minimizes the number of partial products, which is a crucial factor in determining the algorithm's efficiency. Conventional Booth multiplication requires eight clock cycles to perform a 16 × 16 bit multiplication, whereas the OBM approach can perform the same operation in four clock cycles, indicating greater efficiency in computation time. Consequently, there is a significant saving in dynamic power consumption. Compared to the conventional Booth multiplier, the proposed technique is more efficient, with superior speed performance and a significant improvement in the critical factor of power consumption.
References 1. Swee KLS, Hiung LH (2012) Performance comparison review of Radix-based multiplier designs. In: 4th international conference on intelligent and advanced systems (ICIAS2012), pp 854–859
2. Jiang H, Han J, Qiao F, Lombardi F (2016) Approximate Radix-8 booth multipliers for lowpower and high-performance operation. IEEE Trans Comput 65(8):2638–2644 3. Aparna PR, Thomas N (2012) Design and implementation of a high performance multiplier using HDL. In: International conference on computing, communication and applications, pp 1–5 4. Dharani BV, Joseph SM, Kumar S, Nandan D (2020) Booth multiplier: the systematic study. In: 3rd international conference on communications and cyber physical engineering, pp 943–956 5. Manjunath V, Harikiran K, Manikanta S, Sivanantham, Sivasankaran K (2015) Design and implementation of 16×16 modified booth multiplier. In: International conference on green engineering and technologies (IC-GET), pp 1–5 6. Yeh W-C, Jen C-W (2000) High-speed Booth encoded parallel multiplier design. IEEE Trans Comput 49(7):692–701 7. Chow, H-C, Wey I-C (2002) A 3.3V 1GHz high speed pipelined Booth multiplier. In: IEEE international symposium on circuits and systems (ISCAS), pp I–I 8. Kim S, Cho K (2010) Design of high-speed modified booth multipliers operating at GHz ranges. Int J Electric Comput Eng 4(1):1–4 9. Aguirre-Hernandez M, Linares-Aranda M (2008) Energy-efficient high-speed CMOS pipelined multiplier. In: 5th international conference on electrical engineering, computing science and automatic control, pp 460–464 10. Kuang S, Wang J, Guo C (2009) Modified Booth multipliers with a regular partial product array. IEEE Trans Circ Syst II Exp Briefs 56(5):404–408 11. Nair S, Saraf A (2015) A review paper on comparison of multipliers based on performance parameters. In: Proceedings on international conference on advances in science and technology, ICAST 2014, vol 3, pp 6–9 12. Chandel D, Kumawat G, Lahoty P, Chandrodaya VV, Sharma S (2013) Booth multiplier: ease of multiplication. Int J Emerg Technol Adv Eng 3(3):118–122 13. Bano N (2012) VLSI design of low power booth multiplier. Int J Sci Eng Res 3(2):2–4
Comparative Study on Text-to-Image Synthesis Using GANs Pemmadi Leela Venkat , Veerni Venkata Sasidhar , Athota Naga Sandeep , Anudeep Peddi , M. V. P. Chandra Sekhara Rao , and Lakshmikanth Paleti
Abstract Text-to-Image conversion is an active research area where synthetic images are generated based on a text description, its semantics, and captions. This application involves both Natural Language Processing (NLP) and image processing: the text description is processed, text features are extracted, and an image corresponding to the text description is generated from those features. Generative Adversarial Nets have shown significant results in this area. GANs use two models: a generator model, which uses random noise to generate an image, and a discriminator model, whose objective is to classify the data points. Both models are combined and compete with each other using an adversarial loss function. Many GAN models have been proposed for the Text-to-Image conversion application in past years, addressing several problems in this domain. In this paper, we review three recent GAN models, Cycle-Consistent Inverse GAN (CI-GAN), Dynamic Aspect-aware GAN (DAE-GAN), and Cycle GAN with BERT, and compare their Inception Scores (IS) on the CUB (200) dataset. Keywords GANs · Natural language processing · Image processing · Image generation · Generator · Discriminator · BERT
1 Introduction Generating images from the text impacts understanding the natural language and image processing. GANs [1] have shown significant results in this task. Although many GANs have been proposed for the past few years, some problems still need to be solved in the domain. In this review paper, we reviewed three state-of-art GAN models, the first model is Cycle-Consistent Inverse GAN (CI-GAN) [2] a model not only generates the images but also manipulates the image based on the text description and the second model is Dynamic Aspect-aware GAN (DAE-GAN) P. L. Venkat (B) · V. V. Sasidhar · A. N. Sandeep · A. Peddi · M. V. P. Chandra Sekhara Rao · L. Paleti Department of CSBS, R.V.R & J.C College of Engineering, Guntur, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_26
[3]. It is a novel architecture where images are generated based on the aspects of the text description; it introduces a new module called the Aspect-Aware Dynamic Re-drawer, inspired by human aspect learning, which refines the image using word-level sentence embeddings for global enhancement and aspect-level embeddings for local enhancement of the image. The third model is Cycle GAN with BERT [4], which aims to generate realistic images from the text description; it mainly uses AttnGAN [5] for mapping text features to images and BERT [6] for text feature extraction. In the following section, we give a gentle introduction to these models and their working methods, discuss their architectures, and compare their Inception Scores on the CUB (200) dataset.
2 Literature Review

2.1 Cycle-Consistent Inverse GAN for Text-to-Image Synthesis

Introduction: Cycle-Consistent Inverse GAN (CI-GAN) [2] is a novel model which performs not only the task of Text-to-Image generation but also the manipulation of images based on text. CI-GAN uses a new algorithm to perform these tasks. The authors obtain the latent code representation of images using an enhanced GAN inversion algorithm with a cycle-consistency training method, which inverts the original images to the GAN latent space representation. The authors then discover the latent code semantics, which allows them to yield higher-quality images that correlate with the textual descriptions.

Working Method: The training of CI-GAN takes place in three stages and uses a decoupled training method. Firstly, StyleGAN [7] is trained using the noise vector z to obtain the latent space representation w of an image; text is not included in the input during the training of StyleGAN. In the second stage, an encoder and a GAN inversion model are trained in a cycle-consistent way to get the corresponding latent code w of a target image by inverting it. In the third stage, a text encoder and a latent space alignment model are trained to learn the mapping of text features t to the latent space w. For the generation of images from text, random samples w' are used as the initial latent code of the target image, and the text features t from the text encoder are passed to the latent space alignment model, where the initial w' is optimized to match the text features t. For the manipulation of an image based on text, the inverted latent code w' of the original image is used as the initial w'. Using the optimized w', StyleGAN generates the corresponding image.

Architecture: The CI-GAN architecture is shown in Fig. 1. CI-GAN can perform two different tasks: image generation and image manipulation using text. CI-GAN trains the GAN to learn the latent space representation of an image from random noise without text input. This representation has been used for the initial latent code for
the image generation task. The image encoder uses the inverse GAN algorithm to get the inverted latent code representation of the original image, and this inverted code is passed as the initial latent code for the image manipulation task. Next, the latent space alignment module receives the initial code and the text features from the text encoder; the module maps the text features to the latent code of the image and optimizes the code to match the text features. The optimized latent code, passed to StyleGAN, generates the image with matching text features.

Datasets and Results: Recipe1M and CUB are the datasets, and Kullback-Leibler (KL) divergence is used as the distance metric. For the CUB (200) dataset, CI-GAN has outperformed Text-Conditioned StyleGAN in Inception Score and FID (refer to Table 2 [2]), and it has also outperformed R2GAN, StackGAN++, and CookGAN on the Recipe1M dataset. Table 1 [2] shows the Inception Score and FID values of CI-GAN on these datasets.
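The third-stage alignment described above can be pictured with a schematic PyTorch sketch. Everything below is an illustrative assumption rather than the authors' code: `generator` and `latent_align` are hypothetical stand-in modules for StyleGAN and the latent space alignment model, and the loss is a simple cosine-similarity objective.

```python
# Hedged sketch: optimise a latent code w so that its aligned projection
# matches the text feature t, then decode the optimised w into an image.
import torch
import torch.nn as nn

latent_dim, text_dim = 512, 256
generator = nn.Sequential(nn.Linear(latent_dim, 3 * 64 * 64), nn.Tanh())  # stand-in generator
latent_align = nn.Linear(latent_dim, text_dim)                            # stand-in alignment model

def optimise_latent(w_init, t, steps=200, lr=0.01):
    w = w_init.clone().requires_grad_(True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = 1 - torch.cosine_similarity(latent_align(w), t, dim=-1).mean()
        loss.backward()
        opt.step()
    return w.detach()

w0 = torch.randn(1, latent_dim)   # random w' for generation; an inverted code for manipulation
t = torch.randn(1, text_dim)      # placeholder text feature from a text encoder
image = generator(optimise_latent(w0, t)).view(1, 3, 64, 64)
```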
Fig. 1 Architecture of CI-GAN
Table 1 CI-GAN IS and FID scores on Recipe1M and CUB datasets

Dataset name | Inception score (IS) | FID
Recipe1M | 5.97 ± 0.11 | 9.12
CUB (200) | 5.72 ± 0.12 | 9.78

Table 2 Comparison of different methods for the CUB dataset

Methods | Inception score (IS) | FID
R2GAN [8] | 4.54 ± 0.07 | –
StackGAN++ [9] | 5.03 ± 0.09 | –
CookGAN [10] | 5.41 ± 0.11 | –
CI-GAN [2] | 5.97 ± 0.11 | 9.12
2.2 Dynamic Aspect-Aware GAN for Text-to-Image Synthesis

Introduction: DAE-GAN [3] focuses on generating an image by concentrating on the aspect information contained in the text description. DAE-GAN also introduces two novel modules for image refinement, the Aspect-aware Local Refinement (ALR) and Attended Global Refinement (AGR) modules, which are combined into the Aspect-aware Dynamic Re-drawer (ADR).

Working Method: The text is encoded at multiple levels, such as the aspect, word, and sentence level; then a low-resolution image is generated using the sentence embedding. Next, using the ADR, the word-level and aspect-level embeddings are applied for image refinement. For global enhancement of the generated image, Attended Global Refinement (AGR) uses the word-level text embedding, while the image is refined from a local perspective using the aspect-level embedding, which is done dynamically by Aspect-aware Local Refinement (ALR). A matching loss function is used to keep the semantics of the text and the image consistent. The Aspect-aware Dynamic Re-drawer is the main component of DAE-GAN, where the aspect information features are embedded into the generated image; by applying the AGR and ALR modules simultaneously, the image can be refined at both the global and local levels.

Architecture: The DAE-GAN architecture is shown in Fig. 2. The semantic representation of the text is the first part of the model architecture. From the given text description, aspects are extracted, i.e., elements of the picture such as "the white" and "yellow legs". The given text is also passed to a text encoder, where sentence-level and word-level text semantics are extracted. The sentence-level text embeddings are used for the initial image generation task. The generated initial image is passed to the Aspect-Aware Dynamic Re-Drawer, where the two other modules come into play: the Attended Global Refinement module uses the word-level text embedding for global enhancement of the image, and the result is passed to the next module, Aspect-aware Local Refinement, which uses the aspect-level text embedding for local refinement of the image and generates a realistic image that includes the aspect-level details.

Datasets and Results: CUB (200) and COCO are the datasets, and Kullback-Leibler (KL) divergence is used as the distance metric. Table 3 [3] shows the Inception Score and FID values of DAE-GAN on these datasets. DAE-GAN has outperformed several GANs in this area, including AttnGAN [5], MirrorGAN [11], DM-GAN [12], and RiFe-GAN [13].
Fig. 2 Architecture of DAE-GAN
Table 3 DAE-GAN IS and FID scores on COCO and CUB datasets

Dataset name | Inception score (IS) | FID
COCO | 35.08 ± 1.16 | 28.12
CUB (200) | 4.42 ± 0.04 | 15.19
2.3 Cycle Text-to-Image GAN with BERT

Introduction: CycleGAN with BERT [4] uses a new cyclic architecture with AttnGAN and BERT as the main components. AttnGAN [5] is used to map text features to image features, the pre-trained language model BERT is used to obtain the text feature embeddings, and a cyclic architecture based on an inverse function that maps the image back to its caption is used for a better understanding of the description features.

Working Method: First, AttnGAN is trained to generate the sentence and word vectors by passing the captions through an LSTM. Using a fully connected layer and StackGAN conditioning augmentation, a mean and variance are derived from the sentence embedding. This mean and variance are used to construct a normal distribution from which sentence-embedding samples are drawn and then fed to the GAN; this process increases smoothness and acts as regularization. The sentence-embedding samples passed to the GAN are concatenated with Gaussian noise. Following the StackGAN [14] architecture, 64x, 128x and 256x generators are stacked together, and an attention module feeds the word embeddings and the image to generators two and three, respectively. Every generator has its own discriminator, which takes the original image and the sentence embedding as input. The final 256x image is passed to the image encoder to generate the feature map of the image. The text features from the text encoder and the image features from the image encoder are used to form the
Deep Attentional Multimodal Similarity Model (DAMSM), which is trained for stability using an attention loss. The CycleGAN [15] architecture, which uses AttnGAN as its central component, is used to recover the text with STREAM by conditioning on the embedded captions and image features. This allows a better image-to-caption transition, and the BERT model is added for word embedding.

Architecture: The CycleGAN with BERT architecture is shown in Fig. 3. AttnGAN is trained by passing the text description through the LSTM model, from which word-level and sentence-level text embeddings are extracted. AttnGAN consists of a stack of GANs: the first GAN receives the sentence-level embedding, while the second and third receive the word-level embedding. The image generated by the first GAN is passed to the following GANs. In the end, the final image is given to the image encoder, which produces the image embedding; this is given to the text encoder, where the BERT model is introduced for word-level text embedding. This cyclic process continues until the generated image is optimized according to the text description.

Datasets and Results: The model uses the CUB 200 dataset, which consists of 200 different kinds of birds with 11,788 images and their text descriptions. Table 4 [4] shows the Inception Score and Mean Opinion Score of the model. CycleGAN with BERT has outperformed AttnGAN in Inception Score (refer to Table 5 [4]) (Table 6).
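To make the text-encoding step concrete, the following hedged sketch shows how caption embeddings could be obtained from a pre-trained BERT model (using the Hugging Face transformers library) and smoothed with a StackGAN-style conditioning augmentation; it is an illustrative reconstruction under those assumptions, not the authors' released code.

```python
# Hedged sketch: BERT caption features plus conditioning augmentation.
import torch
import torch.nn as nn
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")

class ConditioningAugmentation(nn.Module):
    def __init__(self, in_dim=768, cond_dim=128):
        super().__init__()
        self.fc = nn.Linear(in_dim, cond_dim * 2)     # predicts a mean and a log-variance
    def forward(self, sent_emb):
        mu, logvar = self.fc(sent_emb).chunk(2, dim=-1)
        return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterised sample

caption = "a small bird with yellow legs and a white belly"
tokens = tokenizer(caption, return_tensors="pt")
with torch.no_grad():
    word_emb = bert(**tokens).last_hidden_state       # word-level features for the attention module
sent_emb = word_emb.mean(dim=1)                       # crude sentence-level feature
cond = ConditioningAugmentation()(sent_emb)           # concatenated with noise for the first generator
```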
Fig. 3 Architecture of CycleGAN with BERT
Table 4 IS and MOS scores on the CUB dataset

Dataset name | Inception score (IS) | Mean opinion score
CUB (200) | 5.92 | 3.9

Table 5 AttnGAN and CycleGAN with BERT IS scores on the CUB dataset

Methods | Inception score (IS)
AttnGAN | 3.92
CycleGAN with BERT | 5.92
Table 6 Characteristics comparison of CI-GAN, DAE-GAN and CycleGAN with BERT

Characteristic | CI-GAN | DAE-GAN | CycleGAN with BERT
Main components | Inversion GAN, latent space alignment model, and StyleGAN | Aspect-aware local refinement, attended global refinement and aspect-aware dynamic re-drawer | AttnGAN, CycleGAN, and BERT
Application | Image generation and image manipulation using text | Text aspect based image generation | Realistic image generation using text description
Loss function | InfoNCE | Conditional loss and unconditional loss | AttnLoss, binary cross entropy and cycle consistency loss
Data sets | Recipe1M and CUB(200) | CUB(200) and COCO | CUB(200)
Distance metrics | Kullback-Leibler divergence | Kullback-Leibler divergence | Kullback-Leibler divergence
Text encoder | LSTM based text encoder | LSTM based text encoder | BERT based text encoder
3 Conclusion

Although the models are evaluated on different datasets with different metrics, all three models have used the CUB (200) dataset and share a common evaluation metric. The Inception Score (IS) is a common ad hoc metric for generative models; the Inception models are pre-trained and fine-tuned for the datasets. The Inception Score computes the Kullback-Leibler divergence between the conditional class distribution and the marginal class distribution, and a high IS indicates that the model generates realistic images with good diversity across classes. Comparing the IS scores of CI-GAN, DAE-GAN and CycleGAN with BERT on the CUB (200) dataset (Table 7), CycleGAN with BERT shows a better Inception Score (5.92) than the other two models. CycleGAN with BERT is constructed mainly on AttnGAN, and the iterative cyclic architecture ensures that the image is smoothed and matches the text features. This architecture, together with the BERT model, which gives better text semantic embedding features, has enabled CycleGAN with BERT to obtain a higher Inception Score than the other models.
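For reference, the Inception Score computation summarised above can be written in a few lines; the sketch below assumes `probs` is an (N, classes) array of class probabilities produced by a pre-trained Inception model for N generated images (in practice the score is usually averaged over several splits).

```python
# Hedged sketch: IS = exp( mean_x KL( p(y|x) || p(y) ) ).
import numpy as np

def inception_score(probs, eps=1e-12):
    p_y = probs.mean(axis=0, keepdims=True)                       # marginal distribution p(y)
    kl = (probs * (np.log(probs + eps) - np.log(p_y + eps))).sum(axis=1)
    return float(np.exp(kl.mean()))

probs = np.random.dirichlet(np.ones(200), size=1000)              # placeholder predictions
print(inception_score(probs))
```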
Table 7 IS comparison of CI-GAN, DAE-GAN and CycleGAN with BERT

Model | Inception score (IS)
Cycle Consistent Inverse GAN | 5.72
Dynamic Aspect-aware GAN | 4.42
CycleGAN with BERT | 5.92
References 1. Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial networks. Commun ACM 63:139–144. https://doi.org/10. 48550/arxiv.1406.2661 2. Wang H, Lin G, Hoi SCH, Miao C (2021) Cycle-consistent inverse GAN for text-to-image synthesis. In: MM 2021 - proceedings of the 29th ACM international conference on multimedia, pp 630–638. https://doi.org/10.48550/arxiv.2108.01361 3. Ruan S, Zhang Y, Zhang K, Fan Y, Tang F, Liu Q, Chen E (2021) DAE-GAN: dynamic aspectaware GAN for text-to-image synthesis. In: Proceedings of the IEEE international conference on computer vision, pp 13940–13949. https://doi.org/10.48550/arxiv.2108.12141 4. Tsue T, Sen S, Li J (2020) Cycle text-to-image GAN with BERT. CoRR abs/2003.12137. https:/ /doi.org/10.48550/arxiv.2003.12137 5. Xu T, Zhang P, Huang Q, Zhang H, Gan Z, Huang X, He X (2017) AttnGAN: fine-grained text to image generation with attentional generative adversarial networks. In: Proceedings of the IEEE computer society conference on computer vision and pattern recognition, pp 1316–1324. https://doi.org/10.48550/arxiv.1711.10485 6. Devlin J, Chang M-W, Lee K, Google KT, Language AI (2019) BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 conference of the North, pp 4171–4186. https://doi.org/10.18653/V1/N19-1423 7. Karras T, Laine S, Aila T (2018) A style-based generator architecture for generative adversarial networks. IEEE Trans Pattern Anal Mach Intell 43:4217–4228. https://doi.org/10.48550/arxiv. 1812.04948 8. Zhu B, Ngo C-W, Chen J, Hao Y (2019) R2GAN: cross-modal recipe retrieval with generative adversarial network. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 11477–11486 9. Zhang H, Xu T, Li H, Zhang S, Wang X, Huang X, Metaxas DN (2017) StackGAN++: realistic image synthesis with stacked generative adversarial networks. IEEE Trans Pattern Anal Mach Intell 41:1947–1962. https://doi.org/10.48550/arxiv.1710.10916 10. Zhu B, Ngo C-W (2020) CookGAN: causality based text-to-image synthesis. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 5519–5527 11. Qiao T, Zhang J, Xu D, Tao D (2019) MirrorGAN: learning text-to-image generation by redescription. In: Proceedings of the IEEE computer society conference on computer vision and pattern recognition, pp 1505–1514. https://doi.org/10.48550/arxiv.1903.05854 12. Zhu M, Pan P, Chen W, Yang Y (2019) DM-GAN: dynamic memory generative adversarial networks for text-to-image synthesis. In: Proceedings of the IEEE computer society conference on computer vision and pattern recognition, pp 5795–5803. https://doi.org/10.48550/arxiv. 1904.01310 13. Cheng J, Wu F, Tian Y, Wang L, Tao D (2020) RiFeGAN: rich feature generation for textto-image synthesis from prior knowledge. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 10911–10920 14. Zhang H, Xu T, Li H, Zhang S, Wang X, Huang X, Metaxas D (2016) StackGAN: text to photorealistic image synthesis with stacked generative adversarial networks. CoRR abs/1612.03242, pp 5908–5916. https://doi.org/10.48550/arxiv.1612.03242
15. Zhu JY, Park T, Isola P, Efros AA (2017) Unpaired image-to-image translation using cycleconsistent adversarial networks. In: Proceedings of the IEEE international conference on computer vision 2017-October, pp 2242–2251. https://doi.org/10.48550/arxiv.1703.10593
Online Hate Speech Identification Using Fine-tuned ALBERT Sneha Chinivar, M. S. Roopa, J. S. Arunalatha, and K. R. Venugopal
Abstract The increased use of social media platforms has escalated the outspread of online hate speech. Hate speech can take numerous forms, such as political, racial, religious, LGBTQ+, gender-based, nationality-based and disability-based; It can overlap and intersect with various forms of oppression and discrimination and can lead to severe and harmful impacts on society. It is crucial to combat online hate speech and create a more inclusive and secure online environment. Several approaches have already been investigated, such as n-grams, Convolutional Neural Networks (CNN), Recurrent units, Gated Recurrent Units (GRU), and even their combinations to recognize text-based online hate speech. Even though they were able to give satisfactory results, their contextual understanding of the text needed to be stronger as it is a bit of a complex concept. The main reasons these approaches fail are the non-availability of larger datasets to take full benefit of the model’s architecture and their lack of context understanding capability. In this work, we have explored the usage of one of the transformer-based architectures named A Lite Bidirectional Encoder Representations from Transformers (ALBERT) to address the problem of contextual understanding and recognize online textual hate speech efficiently with limited resources and a smaller dataset. Even in the untrained form, ALBERT provides a better understanding of context than any recurrent unit. Thus the ALBERT, pre-trained on a large corpus, gave the flexibility of using a smaller dataset to finetune them to a particular downstream task of hate speech identification as it already has a notion of language. ALBERT has already proved its contextual understanding capability with various benchmarked datasets, viz., SQuAD and GLUE. Thus, finetuning them to our downstream task allows utilizing this to our benefit. With this approach, we achieved better results than the state-of-art in terms of all four metrics: Precision (6.16%), Recall (6.59%), F1-Score (6.41%), and Accuracy (10.9%). Keywords Fine-tuned ALBERT · Online hate speech · Social media S. Chinivar (B) · J. S. Arunalatha · K. R. Venugopal Department of CSE UVCE, Bangalore University, Bengaluru, India e-mail: [email protected] M. S. Roopa Department of CSE, Dayananda Sagar College of Engineering, Bengaluru, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_27
1 Introduction Social media is one of the innovative outcomes of the technological progress of the 21st century. It is a unique platform that allows people to connect, share ideas, perform sales or marketing, put forward their opinions, and many more. Unfortunately, it is also a platform often filled with negativity, and one such negativity is spreading hate speech [1]. Hate speech refers to a kind of speech, gesture, conduct, writing, or display that can stir up violence and intolerance against an individual or a group [2]. When the Hate Speech is distributed through the internet, including social media, websites, forums, and other online platforms, it is referred to as Online Hate Speech. Figure 1 depicts the perceived reasons for the online hate speech received by gender according to the survey conducted in New Zealand [3]. Online hate speech keeps increasing with online platforms like social media [4]. Several factors of online platforms contribute to the escalation of this harmful activity, such as the anonymity they offer, lack of
Fig. 1 Gender-based perceived reasons for experiencing online hate speech according to the survey conducted in New Zealand [3]
Fig. 2 Online hate and harassment impact in the U.S. 2020 [5]
accountability on online platforms, the algorithms and recommendation systems these platforms use to serve users content similar to what they engaged with previously (which can lead to the spreading of hate and the formation of echo chambers), and the capability of these platforms to serve as a tool to spread fake information and propaganda, to name a few. According to a January 2020 survey of internet users in the United States, hate speech usually threatens the peace of society; its other consequences are depicted in Fig. 2 [5]. In this work, we have explored the efficiency of a Fine-tuned A Lite Bidirectional Encoder Representations from Transformers (ALBERT) [6], a variation of BERT, to recognise online hate speech and create a hate-free online environment. Although ALBERT is a lighter BERT, it has a good performance record over many Natural Language Processing (NLP) benchmarks. Its architecture is similar to BERT but with fewer parameters; it is heavily trained on large datasets and uses a self-supervised learning technique to learn language patterns. ALBERT base version 2 is an extension of ALBERT base version 1, where the pretraining is incorporated with one more task, named sentence order, along with the Masked Language Modelling (MLM) and Next Sentence Prediction (NSP) tasks. Its representation learning has been improved by incorporating multi-head attention and layer-wise cross attention. In addition, version 2 of ALBERT is integrated with an expanded vocabulary and more layers and hidden units [6].
Therefore, in this work, a Fine-tuned ALBERT base version 2 has been used to efficiently identify textual content containing hate speech. The main contributions of this work are as follows.
1. To introduce a new Fine-tuned ALBERT model for hate speech detection on the Multi-domain Hate speech Corpus (MHC) benchmarked dataset [7].
2. To improve the performance of Fine-tuned ALBERT for hate speech detection in terms of Precision, Recall, F1-Score and Accuracy with a smaller dataset.
The following section outlines the literature survey; Sect. 3 describes the methodology, Sects. 4 and 5 discuss the experimental details and results, respectively, and finally Sect. 6 presents the conclusions and future directions.
2 Literature Survey This section reviews the recent articles and works of literature that discuss online Hate Speech. Antonio Guterres, United Nations Secretary-General of 2021, states “Social media provides a global megaphone for hate” and “Hatred is a danger to everyone— and so fighting it must be a job for everyone”. Hate speech is discriminatory or derogatory towards an individual or group. It can be conveyed through various forms, viz., images, gestures, memes, cartoons, etc. [8]. Studies show that hate speech is more prevalent among younger adults in comparison to older age group people. Similarly, compared to females, males are more frequently indulged in or exposed to this harmful activity [3, 9, 10]. Driscoll [11] reveals the outcome of a 2019 study by HateLab of Cardiff University, where they found that an increase in hate speech on social media platforms can increase the physical world crime against minorities. In other words, there is an absolute correlation between online hate speech and physical world crime, especially against minorities [12]. The result of this study was backed by a similar study conducted in Germany [13] and a New York University study on discriminating Tweets [14]. Varied Machine Learning, Deep Learning, and Transfer Learning techniques have been used till now to identify online Hate speech [15–17]. In contrast, Bahador [18] introduced an innovative intensity scale to identify and classify recognized hate speech into its appropriate intensity category. Its primary purpose is to understand the hate speech evolution before it causes any real-world harm. Asiri et al. [19] proposed a model named “Enhanced Seagull Optimization with Natural Language Processing based Hate Speech Detection and Classification (ESGONLP-HSC). This model identified and classified the social media website’s hate speech by performing feature extraction using the GloVE technique and classification using an Attention-based Bi-directional Long Short-Term Memory (ABLSTM) model. A bi-directional LSTM with Deep CNN and Hierarchical Attention has been used by Khan et al., [20] to detect Twitter hate speech. Even though these techniques gave
good results, they did not completely understand the context of the text. So, to capture the accurate contextual meaning of the text, it is necessary to move to the transformer architecture, which is the core foundation of the ALBERT architecture used in this work. Even though ALBERT's model size and computational cost are lighter than those of BERT, it retains the capability of learning representations. As stated by Lan et al. [6], more parameters do not necessarily mean better performance. Thus, by fine-tuning ALBERT, we show how efficiently hate speech can be identified while remaining economical in terms of both hardware and time requirements.
3 Methodology

This section discusses the approach adopted to tackle the problem of efficient recognition of online hate speech. The architecture diagram of the proposed approach is depicted in Fig. 3.

Dataset: The dataset used in this work consists of Tweets retrieved from Twitter by Ghosh et al. [7] using the standard Twitter API, with various domain-specific keywords and their combinations. The dataset consists of 10,242 tweets, of which 5125 tweets belong to each category of hate and non-hate. For our experiment, we used a randomly sampled 50% of the data, so that the training, testing, and validation sets, which are divided in the proportion of 70%, 20%, and 10%, respectively, contain 3592, 1014, and 514 tweets.

Fine-tuned ALBERT: The textual input data is given to the fine-tuned ALBERT model, which internally consists of several important components, viz., the Input Layer, Embedding Layer, Encoder Layer, Pooling Layer, and Classifier Layer. The raw textual input is encoded into a sequence of token IDs by the tokenizer and given as input to the model. The tokenizer used is the same as that used to pre-train the ALBERT model, i.e., the SentencePiece tokenizer, so that the embedding vectors of the token IDs correspond to the vectors of ALBERT's pre-training phase. The representations obtained are given to the Embedding Layer to further
Fig. 3 Architecture of the proposed approach
map these numerical representations to high-dimensional vectors. The embeddings output by the previous layer are given as input to the ALBERT Transformer, which consists of 12 encoders. Each encoder of the Transformer internally consists of two sub-layers, i.e., the multi-headed self-attention layer and the feed-forward layer. The multi-headed self-attention layer generates an embedding for each token by looking at the other tokens of the sequence; this is repeated eight times, and the eight generated representations are concatenated and forwarded to the feed-forward network, which is a simple neural network. The Transformer layer thus outputs the token sequence representation by capturing the relationships between tokens. The output embedding of the Transformer Layer is given to the Pooling Layer to aggregate the embeddings into fixed-length vectors. Finally, the vectors obtained from the Pooling Layer are passed to the Classifier Layer, a simple neural network, to identify which category the input text belongs to, i.e., whether the given text is hate speech or not. The Encoder Layer and the Classifier Layer of the fine-tuned ALBERT are jointly trained for the particular downstream task, i.e., hate speech identification, during the fine-tuning process on the training data. Thus, ALBERT learns task-specific representations, resulting in better performance of the model on the hate speech recognition task even with a smaller dataset. The model's optimal hyperparameters, viz., the learning rate, number of epochs, batch size, etc., are selected using the validation dataset, and the model is finally evaluated on the test dataset to assess its performance.
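The pipeline described above maps naturally onto the Hugging Face transformers library; the snippet below is a minimal illustrative sketch of that setup (the tweets and labels shown are placeholders), using the pre-trained ALBERT base v2 encoder with a randomly initialised classification head.

```python
# Hedged sketch: tokenise tweets with the ALBERT SentencePiece tokenizer and
# run them through the ALBERT encoder plus a two-class classification head.
import torch
from transformers import AlbertTokenizerFast, AlbertForSequenceClassification

tokenizer = AlbertTokenizerFast.from_pretrained("albert-base-v2")
model = AlbertForSequenceClassification.from_pretrained("albert-base-v2", num_labels=2)

tweets = ["an example hateful tweet", "an example harmless tweet"]   # placeholder inputs
labels = torch.tensor([1, 0])                                        # 1 = hate, 0 = non-hate
batch = tokenizer(tweets, padding=True, truncation=True, max_length=128, return_tensors="pt")

outputs = model(**batch, labels=labels)       # pooled representation -> classifier layer
print(outputs.loss.item(), outputs.logits.argmax(dim=-1))
```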
4 Experiment
This section discusses the experimental details of the proposed approach. The work was implemented on a system with 32 GB RAM and a Quadro M4000 GPU. Python 3.9.13 was used for the code, and the implementation mainly relies on the PyTorch and scikit-learn libraries.
4.1 Training Details of ALBERT Base Version 2
The hyperparameters of the ALBERT transformer for the downstream task of identifying online hate speech on the benchmark dataset of Ghosh et al. [7] are listed in Table 1. With this setup, training of the model completed in less than an hour.
Table 1 ALBERT base version 2 hyperparameters for hate speech detection

Hyperparameter                     Selected value
Number of layers                   1
Number of neurons in each layer    128
Batch size                         16
Epoch                              10
Learning rate                      2e−5 (0.00002)
Dropout                            0.5
adam_epsilon                       1e−8 (0.00000001)
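As an illustrative, non-authoritative sketch of how the Table 1 values could be wired up with PyTorch and the Hugging Face transformers library (the specific classes, the classifier_dropout_prob field, and the use of AdamW are assumptions of this sketch rather than details reported by the authors):

# Hedged sketch: applying the Table 1 hyperparameters (lr 2e-5, adam_epsilon 1e-8,
# dropout 0.5, batch size 16, 10 epochs) to an ALBERT classifier.
import torch
from transformers import AlbertConfig, AlbertForSequenceClassification

config = AlbertConfig.from_pretrained(
    "albert-base-v2",
    num_labels=2,                 # hate vs. non-hate
    classifier_dropout_prob=0.5,  # dropout value from Table 1
)
model = AlbertForSequenceClassification.from_pretrained("albert-base-v2", config=config)

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5, eps=1e-8)

batch_size = 16   # used when building the DataLoader (not shown)
num_epochs = 10   # number of passes over the training set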
4.2 Fine-Tuned ALBERT
ALBERT is a transformer-based neural language model for Natural Language Processing (NLP) tasks. It is a lighter version of BERT with a similar architecture but far fewer parameters, which makes the model efficient, effective, and scalable for most NLP tasks. It has been pre-trained on BookCorpus and English Wikipedia. The important modifications made to BERT by Lan et al. [6] to create ALBERT are as follows.
1. Cross-layer parameter sharing: ALBERT uses the same parameter set for every layer of the model. This reduces the number of trainable parameters and makes the model more computationally efficient.
2. Factorized embedding parameterization: ALBERT splits the embedding matrix into two smaller matrices. This further reduces the number of parameters to be learned, enabling large models to be trained on modest hardware while still performing efficiently on downstream tasks.
3. Inter-sentence coherence loss: to encourage the model to learn sentence coherence, ALBERT adds a loss term during pre-training called the inter-sentence coherence loss, which helps the model better understand the relationships between the sentences of a document.
These modifications make ALBERT lighter and computationally efficient while maintaining its effectiveness on various NLP tasks. Fine-tuning a transformer-based language model means further training the pre-trained model on a task-specific dataset. This process helps the model adopt the specific characteristics of the target domain, such as vocabulary and syntax, learn task-specific patterns, and thereby improve its performance on the particular downstream task. While fine-tuning a pre-trained ALBERT model, the ALBERT transformer layer is initialized with weights from the pre-trained model. The remaining
Table 2 Evaluation results

             Precision   Recall    F1-score   Accuracy
Train        1.0         0.9955    0.9977     0.9526
Validation   0.9714      0.9260    0.9482     0.9494
Test         0.9508      0.9546    0.9527     0.9977
layer, the classification layer, is initialized randomly, and both are then trained together for the specific task on task-specific data. By taking advantage of the pre-trained representations learned by the ALBERT transformer, fine-tuning enables the model to grasp task-specific features and deliver much better performance than a model trained from scratch, cost-effectively and with a relatively small dataset [21].
5 Results and Discussion
This section discusses the results of our experiment in detail. Fine-tuned ALBERT drastically reduces the time, resources, and raw data required because ALBERT's pre-trained weights are reused in this work. The model's performance is measured using four metrics. Accuracy gives the model's overall proportion of correct predictions on unseen test data; Precision is the ratio of true positives to all predicted positives; Recall is the ratio of correctly recognized positives to all actual positives; and F1-score is the harmonic mean of Precision and Recall. The results presented in Table 2 clearly show the effectiveness of pre-trained ALBERT base version 2. Figure 4 depicts the rise of the Precision, Recall, F1-score, and Accuracy metrics on the training data against the trainer (global) step, a unit used during training that counts the number of processed batches. The training Precision of 1.0 indicates no false positive predictions, and the remaining training metrics also reach high scores. During validation, the model's performance is evaluated on data it has not seen during training. Figure 5 shows the validation Precision, Recall, F1-score, and Accuracy against the trainer/global step, which are reasonably good; this indicates that the model does not overfit the training data and generalizes well to new data. Figure 6 depicts the training and validation loss of the model against the trainer/global step. Figure 6a shows how the model gradually improves by reducing the loss as it is exposed to more training data. The loss function measures the difference between the predicted value of the model and the
Fig. 4 Evaluation metric graphs of training data
Fig. 5 Evaluation metric graphs of validation data
Fig. 6 Training and validation dataset's loss graph

Table 3 Comparison of results of fine-tuned ALBERT with the state of the art

                     Precision   Recall    F1-score   Accuracy
SEHC [7]             0.8892      0.8887    0.8886     0.8887
Fine-tuned ALBERT    0.9508      0.9546    0.9527     0.9977
actual value; the cross-entropy loss function is used in this work to measure this difference. In Fig. 6b the curve becomes steep in the middle because the model is still learning and adjusting to the training data, and it keeps improving as it is exposed to more data. Table 3 compares the fine-tuned ALBERT with the state of the art of Ghosh et al. [7] on all four performance metrics. The pre-trained ALBERT model thus gave better results, with a better contextual understanding of the text, and showed that it can perform well even with a smaller dataset.
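For reference, the four metrics can be computed directly from the predicted and true labels, for example with scikit-learn, which is among the libraries used in this work; the label arrays below are made up for illustration:

# Hedged example: Accuracy, Precision, Recall, and F1-score for a binary
# hate/non-hate classification with scikit-learn.
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # made-up ground-truth labels (1 = hate)
y_pred = [1, 0, 1, 0, 0, 0, 1, 1]   # made-up model predictions

print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall   :", recall_score(y_true, y_pred))
print("F1-score :", f1_score(y_true, y_pred))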
6 Conclusions
While social media is increasingly used for many good purposes, it is simultaneously becoming an easily accessible platform for spreading hate. Recognizing hate speech correctly and creating a hostility-free online environment has therefore become imperative, and the work presented here is a step toward that goal. Even though the fine-tuned transformer-based ALBERT model performed exceptionally well with a smaller dataset, a few aspects, such as aggressive comments being misclassified as hate speech, still require deeper contextual understanding. We compared the results with the state of the art and found that the fine-tuned ALBERT performed efficiently in all respects, including performance metrics, resource utilization, and training time. This experimentation can further be extended to other transformer-based architectures, viz. BERT, RoBERTa, XLNet, etc., fine-
tuned with varied hyperparameters to see how they impact the results of the model in addressing online hate speech. Using an ensemble of different fine-tuned transformer-based models is another interesting direction that can be pursued to handle the problem of offensive online content. It should also be noted that although most hate speech is text-based, there has recently been a rise in hate spread via images, memes, and videos; designing mechanisms to combat such visual forms of hate speech therefore becomes vital.
References 1. Chinivar S, Roopa MS, Arunalatha JS, Venugopal KR (2022) Online offensive behaviour in social media: detection approaches, comprehensive review and future directions. Entertain Comput: 100544 2. United Nations. Hate speech is rising around the world. https://www.un.org/en/hate-speech 3. Pacheco E, Melhuish N (2018) Online hate speech: a survey on personal experiences and exposure among adult New Zealanders. SSRN Electron J 4. Laub Z (2019) Hate speech on social media: global comparisons. https://www.un.org/en/hatespeech 5. Petrosyan A (2022) Consequences of online hate and harassment according to internet users in the United States as of January 2020, July 2022. https://www.statista.com/statistics/971876/ societal-impact-of-online-hate-harassment-usa/ 6. Lan Z, Chen M, Goodman S, Gimpel K, Sharma P, Soricut R (2019) Albert: A lite Bert for self-supervised learning of language representations. arXiv:1909.11942 7. Ghosh S, Ekbal A, Bhattacharyya P, Saha T, Kumar A, Srivastava S (2022) SEHC: a benchmark setup to identify online hate speech in English. IEEE Trans Comput Soc Syst 8. United Nations. Understanding hate speech. https://www.un.org/en/hate-speech/ understanding-hate-speech/what-is-hate-speech 9. S. R. Department (2021) Exposure to hate speech on social media in Finland 2020, by age group, July 2021. https://www.statista.com/statistics/913185/people-seeing-hate-speech-online-byage-group/ 10. Harriman N, Shortland N, Su M, Cote T, Testa MA, Savoia E (2020) Youth exposure to hate in the online space: an exploratory analysis. Int J Environ Res Publ Health 17(22):8531 11. O’DriScoll A (2022) 25+ online hate crime statistics and facts, Dec 2022. https://www. comparitech.com/blog/information-security/online-hate-crime-statistics// 12. C. University (2019) Increase in online hate speech leads to more crimes against minorities, Oct 2019. https://phys.org/news/2019-10-online-speech-crimes-minorities.html// 13. Müller K, Schwarz C (2021) Fanning the flames of hate: social media and hate crime. J Euro Econ Assoc 19(4):2131–2167 14. N. University (2019) Hate speech on twitter predicts frequency of real-life hate crimes, June 2019. https://www.nyu.edu/about/news-publications/news/2019/june/hate-speech-ontwitter-predicts-frequency-of-real-life-hate-crim.html// 15. William P, Gade R, Chaudhari R, Pawar A, Jawale M (2022) Machine learning based automatic hate speech recognition system. In: 2022 international conference on sustainable computing and data communication systems (ICSCDS). IEEE, pp 315–318 16. Omar A, Mahmoud TM, Abd-El-Hafeez T (2020) Comparative performance of machine learning and deep learning algorithms for Arabic hate speech detection in OSNS. In: Proceedings of the international conference on artificial intelligence and computer vision (AICV2020). Springer, pp 247–257 17. Yuan L, Wang T, Ferraro G, Suominen H, Rizoiu M-A (2019) Transfer learning for hate speech detection in social media. arXiv:1906.03829
18. Bahador B (2020) Classifying and identifying the intensity of hate speech, Nov 2020. https://items.ssrc.org/disinformation-democracy-and-conflict-prevention/classifyingand-identifying-the-intensity-of-hate-speech/// 19. Asiri Y, Halawani HT, Alghamdi HM, Abdalaha Hamza SH, Abdel-Khalek S, Mansour RF (2022) Enhanced seagull optimization with natural language processing based hate speech detection and classification. Appl Sci 12(16):8000 20. Khan S, Fazil M, Sejwal VK, Alshara MA, Alotaibi RM, Kamal A, Baig AR (2022) Bichat: Bilstm with deep CNN and hierarchical attention for hate speech detection. J King Saud UnivComput Inf Sci 34(7):4335–4344 21. Hugging Face. Hugging face’s ALBERT Base v2.https://huggingface.co/albert-base-v2
A Real Time Design Pattern Using MapReduce Strategy Visit for Distributed Service Oriented Strategy
K. Selvamani, S. Kanimozhi, and H. Riasudheen
Abstract Recent developments in programming models such as MapReduce have introduced many techniques that evolve and exploit design patterns. MapReduce is a computing methodology, popularized by Google, Hadoop, and others, for processing data that resides on thousands of computers. Design patterns ease developers' work, since they are tools for solving complicated problems in a reusable and general way in less time. Using these techniques in the domain of distributed Service Oriented Architecture (SOA) is simpler, faster, more accurate, more flexible, and more adaptive than older systems. Owing to this flexibility and the power of design patterns in composition, it is easy to realize and understand the effects of loose coupling between component interactions in the designed architecture. Hence, this paper proposes a real-time design pattern that combines a MapReduce-based strategy-visit pattern with factory-based design patterns for Distributed Service Oriented Architecture (DSOA). In addition, it merges a component-based programming model to introduce dynamic service injection into SOA and to develop adaptive, scalable distributed systems. With the proposed model, the performance of and interactions between end users are highly improved, and the system achieves the expected real-time distributed environment for Service Oriented Architecture.
Keywords Design patterns · Distributed computing · MapReduce · Service oriented architecture
K. Selvamani (B) · H. Riasudheen CEG Campus, Anna University, Chennai 600025, India e-mail: [email protected]; [email protected] S. Kanimozhi Panimalar Engineering College Chennai City Campus, Chennai 600029, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_28
1 Introduction
Developments in design patterns have proven to be one of the best ways to build high-quality software. A pattern is a named abstraction of a concrete form that represents a recurring solution to a particular problem. In real-world settings, however, finding a suitable solution and designing a pattern to form the architecture can become a tedious process. Hence, this paper proposes a two-fold approach that combines two or more design patterns into a simple methodology for Distributed Service Oriented Architecture; such combinations are called composite design patterns. These compositions of design patterns provide dynamic architectures or platforms for robust problems. Recent developments in MapReduce-based data processing offer better solutions to this problem, so the proposed combination of design patterns can be adopted to choose an appropriate service in minimum time. MapReduce is a well-proven technique for analyzing terabytes of data within seconds. In this work, the MapReduce concept is applied to improve Distributed Service Oriented Architecture by scaling it further and providing outputs in minimum time. In addition, a new composite of design patterns for dynamic, adaptive distributed computing is proposed; existing works have not applied MapReduce to the composition of web services in distributed systems.
Nowadays, most developers are interested in developing composite web services. Composition of web services comes into play when a complex request needs to be handled by multiple servers, and it is especially common in REST (Representational State Transfer) based service oriented architecture. The three main technologies involved in developing web services are SOAP, WSDL, and UDDI. Among these, WSDL plays the major role in developing SOA, with features such as Service Contract (SC) and Service Lease (SL); most services are exposed by the server through WSDL itself, which acts as a contract between the client and the server. Since REST-based architecture works over HTTP, it needs a different paradigm to exchange information, which is done through the GET, POST, PUT, and DELETE methods.
In this paper, a new design pattern is proposed that merges the MapReduce strategy, visitor, and factory-based design patterns. Since the visitor pattern ties the servers into a tightly coupled environment in terms of parallel processing, this method overcomes that by combining the visitor with the MapReduce strategy. Also, the component-based programming model emphasizes dynamic injection of components at runtime without disturbing existing systems in terms of code or performance. When a huge combination of web services is involved in solving a complex request, a sequential processing flow is needed, which is time consuming, and the client faces the additional challenge of finding the appropriate services on a server. This paper therefore aims at a dynamically changeable and scalable SOA in distributed systems. The
proposed technique, which uses the MapReduce strategy-visit design pattern, addresses all the above-mentioned issues in an efficient way.
2 Related Works
Many works pertain to design patterns; the proposals of various researchers are summarized and discussed here. This survey presents works that deal with different design patterns, as well as publications reporting the reusability of design patterns when the same problem recurs. Demian et al. [1] presented their view of web service composition architectures and the techniques used to generate new services, which establishes the feasibility of design patterns in SOA. Selvamani et al. [2, 3] proposed a flexible web services environment with a high range of security and QoS mechanisms to address security concerns in web services; they also discussed the quality of service requests and responses in web services within a short time period. Aspectual Feature Modules, proposed by Apel et al. [4], focused on developing web services in a robust manner. Prasad Vasireddy et al. [5, 6] discussed applying an autonomic design pattern, an amalgamation of the chain-of-responsibility and visitor patterns, that can be used to analyze or design self-adaptive systems; they harvested this pattern and applied it to unstructured peer-to-peer networks and web services environments. Moreover, Beugnard et al. [7, 8] proposed an adaptive strategy design pattern for analyzing and designing self-adaptive systems; it makes the significant components usually involved in a self-adaptive system explicit, studies their interactions, and shows how the components participate in the adaptation process in order to characterize their properties. A review of dynamic web service composition techniques was presented by Antony et al. [1]. Finally, Batory et al. [9] explained step-wise refinement, a powerful paradigm for developing a complex program from a simple program by adding features incrementally, and presented the AHEAD (Algebraic Hierarchical Equations for Application Design) model, which shows how step-wise refinement scales to synthesize multiple programs and multiple non-code representations. It is therefore necessary to propose a model that best suits Distributed Service Oriented Architecture using design patterns, and this paper proposes such a technique.
3 Proposed Composite Patterns
The proposed technique handles the client request on the server side, since the client should not be overloaded with finding the right server to handle its request in a distributed environment. For example, if case-based reasoning is used, the client's complexity in finding a server would be O(n). Hence, the client simply sends its request to the main server for further processing. The combination of the MapReduce strategy, visitor, and factory patterns provides a suitable solution for distributed SOA. When the client makes a request, it is allowed, with certain credentials, to reach the server directly. The server then uses the MapReduce programming model to split the composite web service request into chunks of independent operations (the Map and Reduce functions) and calls the appropriate servers using the strategy and factory patterns. After the request has been split into individual chunks, the MapReduce strategy chooses the server to process each chunk. From the targeted servers, the request travels via the visitor pattern to serve composite requests, if needed. After the response from each server is received, the responses are reduced to a single response that is sent back to the client. In this way, the technique handles complex requests more efficiently. If, instead, the client were allowed to choose the server, it would need to check and retrieve all the WSDL servers, loading every server even when it is not the right one to connect to; this increases the complexity of finding a server rather than accessing the service. Narrowing down the complex request in this way also makes it easy to add a new service without disturbing the existing system. Hence, this approach provides web service composition, dynamic service invocation, and inclusion of new service operations as a component-based model.
Figure 1 shows the class diagram for the MapReduce strategy-visit design pattern. The intent of the design pattern is to reduce time complexity and handle complex requests in distributed SOA efficiently, so that the design pattern for DSOA achieves good performance and server efficiency. The class diagram has various components, namely the client, the MapReduce server, the strategy, the visitor, Servers 1-5, and Refined Server 1, which are explained below. The client issues the request and calls the MapReduce server to process it. The client API may be a browser or a third-party API call for the services it needs; it is responsible for triggering the request for the required service. The MapReduce server's role is to receive any request from the client or a third party and process it based on the user's credentials. It also checks whether further processing is necessary and, if needed, splits the request into independent chunks of individual requests. Moreover, it collects the responses from each server and applies the MapReduce function to merge the
Fig. 1 Class diagram for map reduce strategic visit design pattern
results. It plays a vital role in processing, in parallel, the logic that varies according to requirements. The strategic server mapper uses the strategy pattern to choose the appropriate server for each individual request and a factory-based design pattern to create the WSDL server instance used to access it. This gives the dynamic flexibility of adding servers to improve the performance of DSOA and keeps the code loosely coupled. Each server provides various services and uses the visitor/visitable approach to serve inter-composed service requests from other servers or from the client. If MapReduce could not be applied, it would be hard for the server to serve such complex requests.
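As a purely illustrative sketch of how these roles fit together (not the authors' Java/Spring implementation; every class, function, and service name below is invented), the request is first mapped into chunks, a strategy backed by a factory picks a server for each chunk, and the individual responses are reduced into a single reply:

# Illustrative sketch of the MapReduce strategy-visit idea:
# map (split request) -> strategy/factory (pick server) -> reduce (merge responses).
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Chunk:
    service: str    # which service this piece of the composite request needs
    payload: str


class Server:
    """Stand-in for a WSDL/REST server that could also visit other servers."""
    def __init__(self, name: str):
        self.name = name

    def handle(self, chunk: Chunk) -> str:
        return f"{self.name} handled '{chunk.payload}'"


class ServerFactory:
    """Factory: creates (and caches) server instances chosen by the strategy."""
    _registry: Dict[str, Server] = {}

    @classmethod
    def get(cls, name: str) -> Server:
        return cls._registry.setdefault(name, Server(name))


def request_mapper(composite_request: List[Chunk]) -> List[Chunk]:
    # Map step: in a real system this would analyse the request; here it is a pass-through.
    return list(composite_request)


def strategy(chunk: Chunk) -> Server:
    # Strategy step: pick a server per service name (this routing table is invented).
    routing = {"billing": "Server1", "catalog": "Server2", "orders": "Server3"}
    return ServerFactory.get(routing.get(chunk.service, "Server1"))


def request_reducer(responses: List[str]) -> str:
    # Reduce step: merge individual responses into a single reply for the client.
    return "; ".join(responses)


composite = [Chunk("billing", "invoice 42"), Chunk("catalog", "item 7")]
responses = [strategy(c).handle(c) for c in request_mapper(composite)]
print(request_reducer(responses))

In the proposed architecture the same flow is realized with WSDL/REST servers, and new servers can be registered with the factory without touching client code, which is the loose coupling highlighted above.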
Fig. 2 Sequence diagram for map reduce strategic visit design
Fig. 3 Before applying pattern
The sequence diagram for the proposed DSOA using design patterns is shown in Fig. 2. It involves the following steps:
Step 1: The client sends the requested service parameters to the MapReduce server.
Step 2: The MapReduce server splits the request and sends the individual requests to the servers.
Step 3: The servers take help from other servers to serve the complex request.
Step 4: Each server returns its response to the MapReduce server, which provides a single combined response to the client.
This technique was simulated in a real-time environment, and sample code is given in Sect. 4.
4 Implementation

Sample Code (pseudocode sketch)

MapReducer
RequestMapper() {
    // Analyse the request
    // Create the key-value pairs of individual chunks of the request
}
ServerMapper() {
    // Get the request from the request mapper and map it to the appropriate server
    // Create the server object
}
StrategicHandler() {
    // Acts as an interface implemented by all servers
    // Provides loose coupling and enables flexibility in adding servers
}
RequestReducer() {
    // Receive the responses for all the individual chunks created by the request mapper
    // Combine them to form the response
}
Visitor() {
    // Each server implements a visitor to serve complex requests
    // Handles the inter-dependency between the servers
}
4.1 Results
The proposed technique was simulated with three servers plus the MapReduce server, implemented in Java with the Spring Framework, and JProfiler was used for profiling the response time. The simulated complex request is a task combining requests served by all of the configured servers. When the system handled the request without the design pattern, it took 8600 ms on average, as shown in Fig. 3. After implementing the proposed design pattern for request handling, the same complex request was analysed with JProfiler, which shows that the pattern is more efficient, as shown in Fig. 4.
Fig. 4 After applying pattern
5 Conclusions
The proposed design pattern effectively handles complex requests composed of multiple web services in distributed SOA systems. In addition, it can scale in both the distributed and the component/feature-based model by dynamically injecting services. The proposed technique therefore proves more efficient at handling complex and dynamic requests than the existing system, and the two-fold approach shows considerable improvement in the performance of handling various complex requests. This work gives a direction for applying the MapReduce model in SOA and opens up future enhancements for further improvement.
References 1. Demian Antony D, Ananthanarayana VS, Salian S (2011) A review of dynamic web service composition techniques. Commun Comput Inf Sci 133:85–97 2. Chakaravarthi S, Selvamani K (2014) A flexible web services environment with high range of security and QoS mechanisms. J Theor Appl Inf Technol 61(1):60–66 3. Indumathi CP, Selvamani K (2015) Test cases prioritization using open dependency structure algorithm. Proc Comput Sci 48:250–255 4. Apel S, Leich T, Saake G (2008) Aspectual feature modules. IEEE Trans Softw Eng 34(2):162– 180 5. Vishnuvardhan M, Ramesh T (2012) An aspectual feature module based adaptive design pattern for autonomic computing systems. In: Intelligent information and database systems—4th Asian conference, ACIIDS 2012, proceedings, part III, Springer, lecture notes in computer science, vol 7198/2012, pp 130–140 6. Kuhlemann M, Rosenmuller M, Apel S, Leich T (2007) On the duality of aspect oriented and feature-oriented design patterns. In: Proceedings of the 6th workshop on Aspects, components, and patterns for infrastructure software, ACM, ACP4IS’07, New York, NY, USA 7. Ramirez AJ, Cheng BHC (2010) Design patterns for developing dynamically adaptive systems. In: Proceedings of the 2010 ICSE workshop on software engineering for adaptive and selfmanaging systems, ACM, SEAMS’10, New York, NY, USA, pp 49–58 8. Aubert O, Beugnard A (2001) Adaptive strategy design pattern. d’Informatique des Telecommunications, ENST Bretagne, France 9. Batory D, Sarvela JN, Rauschmayer A (2004) Scaling step-wise refinement. IEEE Trans Softw Eng 30(6):187–197
Deep Learning Based Gesture Recognition System Using EMG Spectrogram Data
Praahas Amin and Airani Mohammad Khan
Abstract Myoelectric control systems are used for prosthetic control and for human-machine interfaces. Gestures can be used to provide input commands to these systems, and the commands can be uniquely identified from patterns in the myoelectric signal. A spectrogram is a visual representation of the spectrum of frequencies of a signal as it varies with time. Deep learning methods such as Convolutional Neural Networks can be employed to recognize patterns in the signal. In this study a 2-Dimensional Convolutional Neural Network is proposed to recognize 5 distinct gestures from distinct patterns in the spectrogram of the myoelectric signal. Data was acquired from 10 different participants, and the study was conducted on the data of 2 randomly chosen participants. The dataset for each participant consists of 360 samples with 72 samples for each gesture. The model showed a training accuracy in the range of 84-86% and a validation accuracy in the range of 71-72%. The results are cross-validated using fivefold cross validation.
Keywords Neural networks · Gesture recognition · Myoelectric control system
1 Introduction and Background Work
1.1 Myoelectric Control Systems
Myoelectric control systems (MCS) use muscle signal patterns to operate prosthetic limbs, orthotics, and assistive technology. This paper introduces an MCS for gesture recognition using deep learning. MCS were created to help people with limb loss or paralysis use assistive technology and prosthetics, and they have evolved from single-channel devices to wearable multi-channel systems that give users more control
P. Amin (B) · A. M. Khan Department of Electronics, Mangalore University, Mangalore, KA 574199, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_29
over their devices. An MCS controls prosthetics and assistive devices using muscle-generated electromyographic signals: muscle contractions generate these signals, and the MCS transforms them into control signals for an artificial limb or assistive device. An MCS comprises sensors, signal-processing algorithms, and actuators. The sensors detect muscle-generated EMG (electromyography) signals, which are captured and converted into actuator control commands via signal processing, for example using deep learning methods. Electric motors or hydraulic or pneumatic actuators then move the device.
1.2 Machine Learning and Deep Learning in Myoelectric Control Systems
Research on MCS has increasingly used machine learning methods. Machine learning has been utilized to improve the control signals derived from EMG sensors and to construct more complex control algorithms. Pattern recognition can be performed using machine learning in MCS: machine learning algorithms discover patterns in muscle-generated EMG data and translate them into control commands by recognizing distinct gestures. These algorithms can provide users with more natural and intuitive control over their devices by recognizing and understanding their actions rather than just detecting motions. Machine learning also enables adaptive control algorithms in MCS: such algorithms learn the user's motions and control patterns, enabling more accurate and reliable control, and may be trained to recognize particular actions, allowing devices to be controlled with simple gestures. Machine learning in MCS can thus improve control signal precision and reliability and make device control more intuitive and natural. However, machine learning in this application faces challenges, including the need for large amounts of training data and the risk of overfitting, which makes the algorithms too specialized and costs them their generalizability. In [1], machine learning was used to evaluate the classification performance of Multi-Layer Perceptron, K-Nearest Neighbors, Decision Tree, Linear Discriminant Analysis, Quadratic Discriminant Analysis, and Naïve Bayes classifiers on default Myo armband patterns. The authors achieved over 90% single-movement accuracy using the Multi-Layer Perceptron and over 80% accuracy with the rest of the models, and they recommend expanding the feature vector for better accuracy; signal Mean Absolute Value, RMS value, Integrated EMG, and Variance were examined as features. In [2], a transient-state surface EMG gesture detection system employing the Discrete Wavelet Transform and a Gaussian-kernel SVM classifier is described; using 750 ms windows, the study achieved 87.5% accuracy in classifying 5 gestures. In [3], an SVM classification model achieving a gesture classification accuracy of 89% was proposed; the authors suggest employing an absolute value function and a low-pass Butterworth filter to decrease noise in the raw data. In the context of MCS, Artificial Neural Networks (ANNs) have been utilized to categorize gestures from the EMG signals generated by the muscles. The fundamental concept is to teach the ANN the link between the EMG data and the related gestures. Once the
ANN has been trained, it may be utilized to categorize new motions based on EMG inputs. Multiple forms of ANNs, including Multi-Layer Perceptron networks, Convolutional Neural Networks (CNN), and Recurrent Neural Networks (RNN), have been utilized for this job. The type of ANN used is determined by the specific needs of the application, such as the type and quantity of available data, the complexity of the gestures to be identified, and the processing resources available. One of the key benefits of employing ANNs for gesture categorization is that they can handle vast volumes of data and can learn intricate correlations between the EMG signals and the movements. This enables gesture categorization to attain high levels of accuracy, even when the motions are highly varied or difficult to distinguish from one another. The application of ANNs for the classification of gestures in MCS is therefore a very promising field of study and development: ANNs have the ability to deliver precise and reliable classification of gestures, enabling users to operate their devices using gestures and motions that are intuitive and natural. Deep learning algorithms are a form of machine learning modelled after the structure and function of the human brain. They have been utilized to increase the accuracy and reliability of the control signals supplied by the EMG sensors and to construct more complex control algorithms. Accurate categorization of gestures based on the muscle-generated EMG signals is one of the primary obstacles in the development of MCS. Deep neural networks have been used to analyze muscle-generated EMG signals and translate them into control commands, and it has been demonstrated that these algorithms give users more precise and dependable control over their devices, since they can identify and understand the purpose behind the user's motions, as opposed to only recognizing individual movements. Using deep learning in MCS thus has the potential to significantly enhance the precision and dependability of the control signals and give users more intuitive and natural device control. However, there are obstacles connected with the application of deep learning in this field, such as the requirement for huge volumes of training data, the risk of overfitting, and the computational cost of these techniques. The precise classification of muscle-generated EMG signals as gestures is critical to the development of MCS, and both traditional machine learning techniques and ANNs have been employed to solve this problem. For gesture classification in MCS, traditional machine learning techniques such as decision trees, random forests, support vector machines, and k-nearest neighbors have been utilized. In certain instances, these algorithms have been shown to produce excellent performance, particularly when the gestures are well defined and the relationships between the EMG signals and the movements are clear. In the context of MCS, however, standard machine learning techniques have some drawbacks: they may fail to manage vast volumes of data and may be incapable of learning the complicated correlations between EMG signals and motions, which can reduce the accuracy of gesture categorization, especially when the motions are highly varied or difficult to distinguish from one another. On the other hand, ANNs have demonstrated higher performance in the categorization of gestures in MCS.
ANNs are machine learning algorithms fashioned after the structure and function of the human brain; they have been demonstrated to
be very successful for pattern identification and classification tasks. In the context of MCS, ANNs have been utilized to recognize muscle-generated EMG data as gestures: the network is trained to learn the link between the EMG data and the corresponding gestures and, once trained, is used to categorize new motions based on EMG inputs. One of the primary advantages of ANNs for gesture categorization is that they can process vast volumes of data and capture intricate correlations between EMG signals and movements, making it feasible to classify movements with a high degree of precision even when the motions are highly varied or difficult to distinguish from one another. ANNs have been shown to perform better than traditional machine learning methods in the categorization of gestures in MCS; they handle enormous quantities of data, learn complicated correlations between EMG signals and gestures, and achieve high levels of accuracy, which makes them a highly promising method for this job. Convolutional Neural Networks are frequently employed in image and video recognition applications and have proven successful in capturing the spatial correlations between various visual components. In the framework of MCS, 1-Dimensional Convolutional Neural Networks (1D-CNN) have been employed to categorize muscle-generated EMG data as gestures. A 1D-CNN works by applying a series of filters to the input EMG signal data in order to extract local characteristics from the signals; a sequence of layers is then applied to these characteristics in order to learn progressively more complex representations of the input data, and the final output layer of the 1D-CNN is used for the categorization of gestures. The primary benefit of employing 1D-CNNs for gesture categorization in MCS is their ability to capture the temporal correlations between the EMG data and the related motions. This is significant because muscle-generated EMG signals are time-dependent and contain information about the user's motions and gestures; by analyzing the signals in a manner that considers their temporal correlations, 1D-CNNs are able to classify gestures with greater precision than typical machine learning algorithms that do not take temporal information into consideration. Furthermore, they are computationally economical compared to other forms of deep learning networks and are capable of handling vast volumes of data, which enables training on huge datasets and is crucial for boosting the accuracy of gesture categorization. In conclusion, the use of 1D-CNNs has been demonstrated to be an efficient method for the categorization of gestures in MCS: they capture the temporal correlations between EMG data and gestures and are computationally efficient, making them a promising technique for this job. A 1D-CNN model for identifying structural degradation from vibration data was introduced in [4, 5]. The recorded acceleration values, which fell between 5 and 50 Hz for random excitations, were utilized as the data; acceleration measurements were collected with an accelerometer at a rate of 200 Hz. This raw acceleration data was analyzed by the algorithm, which then calculated a one-point rating that represented the state of the damage.
The research presented in [6, 7] investigated the effectiveness of conventional machine learning methods used in the recognition of surface EMG signals, as well as the consequences of dimensionality reduction on the accuracy of
classification. It has been established that reducing the data's dimensionality does not result in a substantial loss of precision. The research published in [8] tested the classification performance of the proposed 1-Dimensional Convolutional Neural Network model on surface EMG and found it to be about 82% accurate. The datasets utilized in this work were collected as described in [9-11] and consist of surface EMG data obtained using a Myo armband. A 1D-CNN model was utilized in [12] for real-time detection of motor failure and status monitoring of the motor; electric current served as the data source, and the model was about 97% accurate in spotting faults. A deep learning model for identifying bearing faults is proposed in [13]; raw vibration data of the bearings was gathered, and an accuracy of 97% was reported during fault identification. In [14], a 1-D CNN model was utilized to discover switch problems in a modular multilevel converter, using overlapping time periods; a tenfold cross validation was conducted to confirm the validity of the findings, and the model was shown to be about 99% accurate. In [15], a feature-extraction technique using a Convolutional Neural Network was developed that was capable of effective feature extraction; combining CNN-derived features with traditional features improved the accuracy of SVM, LDA, and KNN by 4.35%, 3.62%, and 4.75%, respectively, and the authors also recommend dimensionality reduction through CNN.
1.3 Spectrogram of EMG for Gesture Classification
An MCS categorizes motions using muscle-generated EMG data. EMG signals are time-series signals that reveal muscle activation patterns and can therefore differentiate movements. The spectrum of the EMG signal can be used to classify gestures. A spectrogram shows the frequency content of a signal over time; it is obtained by applying the Short-Time Fourier Transform (STFT) to successive windows of the signal. The STFT converts time-domain data to the frequency domain, where different frequency components correspond to different muscle activation patterns. Because the spectrogram indicates how the frequency content fluctuates over time, and because different muscle activation patterns are connected to different movements, this information can differentiate specific gestures. In MCS, a classifier such as a Support Vector Machine or a neural network classifies gestures using the spectrogram of the EMG signal: the classifier uses the spectrogram to identify muscle
activity patterns and predict movements. The EMG spectrogram is not the only representation for gesture classification; others include time-domain, frequency-domain, and mixed representations. However, the spectrogram is popular because it visually depicts the signal's frequency content over time. In conclusion, the spectrogram of the EMG signal provides an effective representation for classifying gestures in MCS: it provides essential information for gesture detection and control by converting the time-domain signal to the frequency domain and showing the variations in frequency content over time, and combining it with machine learning methods has been shown to be an efficient technique for classifying motions in MCS.
2 Methodology
The experimental steps are detailed in this section. The setup consists of a Myo armband by Thalmic Labs and a computer equipped with software designed to automate the data acquisition process. The armband is connected to the computer via a Bluetooth dongle. When data was acquired over multiple sessions, the participant wore the armband at the same location on the forearm. The Myo armband is worn by the participant as shown in Fig. 1.
2.1 Data Acquisition, Pre-processing and Segmentation
The data acquisition steps are described in [6-8]. Bluetooth connects the participant's forearm-worn Myo armband to the computer, and the data collection software prompts the participant to relax or flex. The participant holds a gesture for 5 s and relaxes for 3 s, so each gesture is captured in an 8-s window, as shown in Fig. 3. This software simplifies data collection, and after a session the data is available as CSV files. Each participant performs 5 hand movements, including rest; Fig. 2 depicts the 5 gestures considered in the study. Each participant provided 72 examples of each gesture, with a hold of 5 s and a rest of 3 s, and six cycles were recorded in each session.
Fig. 1 Myo Armband worn by a participant during data acquisition
Each session consisted of 6 repetitions per gesture to limit muscle fatigue, i.e., 6 examples of each gesture per session, giving 72 examples per gesture over 12 sessions. Thus, each participant has 72 examples of each gesture, totaling 360 samples. During pre-processing, the gesture windows are identified, and the data of each gesture window is organized row-wise and labelled. The armband samples the signal every 5 ms, so a gesture window contains 1600 samples over its 8-s duration. The data is then labelled and separated into training and validation data. The spectrogram of each gesture window is computed, and the spectrogram image is stored in an appropriately labelled folder of the training or validation set. A sample spectrogram image and the corresponding gesture window are shown in Fig. 4. Data was acquired from 10 participants, and the data of 5 randomly chosen participants were used for this study.
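As an illustration of this pre-processing step, the sketch below computes a spectrogram image from a single 1600-sample (8 s at 200 Hz) gesture window; the use of SciPy and Matplotlib, the STFT window length, and the file name are assumptions of the sketch, not details reported in the paper.

# Hedged sketch: turning a 1600-sample (8 s at 200 Hz) EMG gesture window
# into a spectrogram image for the training/validation folders.
import numpy as np
from scipy import signal
import matplotlib.pyplot as plt

fs = 200                               # 5 ms per sample -> 200 Hz
emg_window = np.random.randn(1600)     # placeholder for one gesture window

# STFT-based spectrogram; nperseg/noverlap are illustrative choices
f, t, Sxx = signal.spectrogram(emg_window, fs=fs, nperseg=64, noverlap=32)

plt.pcolormesh(t, f, 10 * np.log10(Sxx + 1e-12))  # power in dB over time and frequency
plt.axis("off")                                    # keep only the image content
plt.savefig("spectrogram_sample.png", bbox_inches="tight", pad_inches=0)
plt.close()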
Fig. 2 Gestures and corresponding labels
Fig. 3 Gesture window (Windows 1-3, each an alternating cycle of 3 s rest and 5 s contraction)
Fig. 4 EMG signal and corresponding spectrogram
2.2 Classification
The 2-Dimensional Convolutional Neural Network model used to identify hand gestures from the spectrogram of the EMG signal is shown in Fig. 5. The sequence of convolution and pooling layers performs feature extraction and dimensionality reduction on the input spectrogram image, and dropout layers are added to reduce overfitting. Finally, the extracted features are flattened and fed as the input feature vector to the fully connected network, which classifies the gestures; a SoftMax activation function is used at the output. Table 1 describes the model.
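The layer sequence of Table 1 can be expressed roughly as the following Keras sketch. This is an illustrative reconstruction rather than the authors' code: the framework (TensorFlow/Keras), the input size of 288 × 432 × 3 (chosen so that a 3 × 3 convolution yields the 286 × 430 × 32 output shape reported in Table 1), the dropout rate, and the hidden-layer activations are assumptions.

# Hedged sketch of the 2-D CNN described in Sect. 2.2 / Table 1.
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(288, 432, 3)),           # assumed spectrogram image size
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(128, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Dropout(0.25),                       # dropout rate assumed
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Dropout(0.25),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Dropout(0.25),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Dropout(0.25),
    layers.Flatten(),                           # 7 x 11 x 32 = 2464 features, as in Table 1
    layers.Dense(64, activation="relu"),
    layers.Dense(32, activation="relu"),
    layers.Dense(16, activation="relu"),
    layers.Dense(5, activation="softmax"),      # 5 gesture classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()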
3 Results
The model is evaluated using the accuracy metric. The accuracy curves for the training and validation phases of the 2-D CNN deep learning model for classifying gestures from the spectrogram are shown in Fig. 6. The accuracy of the 2-D CNN classifier for 2 randomly selected participants after K-fold cross validation with K = 5 is shown in Table 2. The 2-D CNN model used for categorizing gestures from the spectrogram achieved a training accuracy between 84 and 86% and a validation accuracy between 71 and 72%. These results can be compared with those observed in [8], where a consistent accuracy of 82% was obtained with a 1-D CNN model that classified the gestures directly from the time-domain EMG signal window.
4 Conclusion and Future Scope of Work The proposed 2-D Convolution Neural Network model for identification of gestures from spectrogram of EMG signals classifies gestures with an accuracy of 84–86% during the training phase and with an accuracy of 71–72% during the validation phase. The model performed consistently across all the participants’ data. As future work the proposed model could be used for implementation of human machine interfaces for applications such as Sign-Language to Speech conversion, Prosthetic Limb Control or Brain Computer interfaces.
Fig. 5 2-D CNN model for gesture classification from EMG spectrogram
Table 1 CNN model description

Layer No.   Layer         Output shape             Parameter
1           Conv2D        (None, 286, 430, 32)     896
            Max_Pooling   (None, 143, 215, 32)     0
2           Conv2D        (None, 141, 213, 128)    36,992
            Max_Pooling   (None, 70, 106, 128)     0
3           Dropout       (None, 70, 106, 128)     0
4           Conv2D        (None, 68, 104, 64)      73,792
            Max_Pooling   (None, 34, 52, 64)       0
5           Dropout       (None, 34, 52, 64)       0
6           Conv2D        (None, 32, 50, 32)       18,464
            Max_Pooling   (None, 16, 25, 32)       0
7           Dropout       (None, 16, 25, 32)       0
8           Conv2D        (None, 14, 23, 32)       9,248
            Max_Pooling   (None, 7, 11, 32)        0
9           Dropout       (None, 7, 11, 32)        0
10          Flatten       (None, 2464)             0
            Dense         (None, 64)               9,728
11          Dense         (None, 32)               131,328
12          Dense         (None, 16)               32,896
13          Dense         (None, 5)                8,256
Total Trainable Parameters                         299,845
Table 2 Training and validation accuracy observed for 2-D CNN model

Iteration          Train accuracy      Validation accuracy   Train accuracy      Validation accuracy
                   participant 1 (%)   participant 1 (%)     participant 2 (%)   participant 2 (%)
1                  87                  71                    85                  73
2                  80                  67                    82                  70
3                  88                  68                    83                  71
4                  86                  80                    87                  68
5                  88                  68                    82                  79
Average accuracy   86                  71                    84                  72
Fig. 6 Accuracy of 2-D CNN model
References 1. Freitas MLB, Mendes JJA, Campos DP, Stevan SL (2019) Hand gestures classification using multichannel SEMG armband. In: Proceedings of the 26th Brazilian congress on biomedical engineering. In: Costa-Felix R, Machado JC, Alvarenga AV (eds) IFMBE proceedings. Singapore, pp 239–246 2. Yanez AJ, Unapanta L, Benalcázar ME (2019) Short-term hand gesture recognition using electromyography in the transient state, support vector machines, and discrete wavelet transform. In: Proceedings of the 2019 IEEE Latin American conference on computational intelligence (LA-CCI). IEEE, Guayaquil, Ecuador, pp 1–6 3. Chen W, Zhang Z (2019) Hand gesture recognition using SEMG signals based on support vector machine. In: Proceedings of the IEEE 8th joint international information technology and artificial intelligence conference, Chongqing, China, IEEE, pp 230–234 4. Abdeljaber O, Avci O, Kiranyaz S, Gabbouj M, Inman DJ (2017) Real-time vibration-based structural damage detection using one-dimensional convolutional neural networks. J Sound Vibr 388:154–170 5. Abdeljaber O, Avci O, Kiranyaz MS, Boashash B, Sodano H, Inman DJ (2018) 1-D CNNs for structural damage detection: verification on a structural health monitoring benchmark data. Neurocomputing 275:1308–1317 6. Amin P, Khan AM, Bhat AR, Rao G (2020) Feature extraction and classification of gestures from myo-electric data using a neural network classifier. In: Evolution in computational intelligence. advances in intelligent systems and computing, vol 1176. Springer 7. Amin P, Khan AM (2021) A study on the effect of dimensionality reduction on classification accuracy of myoelectric control systems. In: Advances in VLSI, signal processing, power electronics, IoT, communication and embedded systems. Lecture notes in electrical engineering, vol.752. Springer 8. Amin P, Khan AM (2021) A study on the application of one dimension convolutional neural network for classification of gestures from surface electromyography data. In: Proceedings of the 2021 IEEE international conference on distributed computing, VLSI, electrical circuits and robotics (DISCOVER). Springer, pp 7–11 9. Amin P, Khan AM (2021) Raw surface electromyography dataset from myo arm band. In: Mendeley data, V1. https://doi.org/10.17632/d4y7fm3g79.1 10. Amin P, Khan AM (2022) EMG data set for machine learning and deep learning problems. In: Mendeley data, V1. https://doi.org/10.17632/3r6hynp5xs.1 11. Amin P, Khan AM (2022) Spectrogram of surface EMG data obtained from myo armband. In: Mendeley data, V2. https://doi.org/10.17632/xz38kw7m3d.2
12. Eren L (2017) Bearing fault detection by one-dimensional convolutional neural networks. In: Mathematical problems in engineering, Hindawi 13. Eren L, Ince T, Kiranyaz S (2019) A generic intelligent bearing fault diagnosis system using compact adaptive 1D CNN classifier. J Signal Process Syst 91:179–189 14. Kiranyaz S, Gastli A, Ben-Brahim L, Alemadi N, Gabbouj M (2019) Real-time fault detection and identification for MMC using 1D convolutional neural networks. IEEE Trans Industr Electron 66(11):8760–8771 15. Chen H, Zhang Y, Li G, Fang Y, Liu H (2020) Surface electromyography feature extraction via convolutional neural network. Int J Mach Learn Cybern 11(1):185–196
Optimization in Fuzzy Clustering: A Review
Kanika Bhalla and Anjana Gosain
Abstract Fuzzy clustering effectively handles data that cannot be crisply separated by using fuzzy partitioning. Popularly known as soft clustering, fuzzy clustering is based on the membership degree of each data point. Various fuzzy clustering algorithms have been proposed in the literature; these algorithms work well in lower dimensions but are unable to find the global optimum in higher dimensions. This problem has been addressed by hybridizing fuzzy clustering algorithms with various optimization algorithms. In this paper, we review four fuzzy clustering algorithms, FCM, KFCM, IFCM, and KIFCM, that have been optimized by hybridization with the metaheuristic algorithms PSO, GA, FA, ACO, and ABC to further improve their performance.
Keywords Fuzzy clustering · Optimization · Metaheuristic algorithms
1 Introduction
Clustering is an unsupervised data-mining technique in which elements within a group are highly similar to each other and different from the elements of all other groups [1, 2]. Clustering can be categorized into soft (fuzzy) clustering and hard clustering. In hard clustering, the membership degree of each data element is either zero or one, whereas in soft (fuzzy) clustering, elements share some fraction of membership in more than one group [3]. Fuzzy clustering [4] makes use of fuzzy sets, whose data elements have membership degrees. Fuzzy C-Means (FCM), one of the most popular fuzzy clustering algorithms, developed by Bezdek [5], gives good results for outlier-free data but fails to discriminate between ordinary data elements and outliers or noise. To overcome this problem, different versions of
FCM were proposed in the literature, such as PCM, PFCM, and NC [6, 7]. Though these resolve the problem of noise, they are unable to give good results when clusters are of different sizes [8]. In 2003, Zhang and Chen proposed the Kernelized Fuzzy C-Means algorithm (KFCM), which works well with clusters of unequal size [6, 9]. In 2011, Chaira proposed IFCM, which uses a hesitation degree along with the membership function and efficiently forms clusters from non-spherically separable data [10]. In 2013, Lin proposed KIFCM, which refined the kernel function by proposing a radial basis kernel function to improve the accuracy of IFCM [11, 12]. The above algorithms work well in lower dimensions but fail in higher dimensions as they are unable to find the global optimum. To solve this problem, researchers proposed hybridizing various global optimization algorithms with fuzzy clustering to further optimize the objective function [8, 10, 13-19]. They viewed the complete FCM clustering task as an optimization problem and tried to solve it by hybridization with different metaheuristic algorithms such as the genetic algorithm (GA), particle swarm optimization (PSO), the fish swarm algorithm (FSA), artificial bee colony (ABC), and ant colony optimization (ACO). PSO is an efficient heuristic algorithm that works effectively for global optimization problems [20]. GA is a variant of evolutionary algorithms widely used in artificial intelligence; this approach mitigates the initialization problem of clustering algorithms [21]. ACO is a nature-inspired heuristic technique implemented by many researchers, as it is able to generate improved cluster centers [22]. FA is a multi-modal, nature-inspired heuristic algorithm based on the behaviour of fireflies; it optimizes the c-means algorithm by correlating the objective function with the distance used in the algorithm [23, 24]. The ABC algorithm is inspired by the food-searching capability of honey bees and finds the optimal cluster centers [25]. In this paper we consider four fuzzy clustering algorithms, FCM, IFCM, KFCM, and KIFCM, which have been hybridized with the optimization techniques PSO, FSA, GA, ACO, FA, and ABC. We have reviewed forty-nine papers in total: in twenty-nine papers FCM is hybridized with PSO, FSA, GA, ACO, FA, or ABC; in eight papers KFCM is hybridized with PSO, GA, the differential evolution algorithm, or FA; in another eight papers IFCM is hybridized with GA, ABC, FA, or PSO; and in the remaining four papers KIFCM is hybridized with GA, PSO, or ABC. The paper is organized as follows: Sect. 2 briefly discusses the various techniques of fuzzy clustering, Sect. 3 reviews the work proposed by different researchers using fuzzy clustering with optimization based on different parameters, Sect. 4 presents the bibliographic analysis, and Sect. 5 discusses the conclusion and future scope of the work.
2 Fuzzy Clustering Algorithms
This section briefly discusses the four fuzzy clustering algorithms considered in this paper, namely FCM, KFCM, IFCM and KIFCM.
2.1 The Fuzzy C-Means Clustering Algorithm (FCM) [9]
FCM is considered to be one of the most popular fuzzy clustering algorithms. It assumes that the number of clusters ‘c’ is already known for the given dataset and minimizes the objective function J_FCM, given as:

$$J_{FCM} = \sum_{k=1}^{n}\sum_{i=1}^{c} u_{ik}^{m}\, d_{ik}^{2} \qquad (1)$$

where u_ik is the membership degree of data element s_k in cluster i, d_ik is the distance between s_k and the centroid v_i, and m > 1 is the fuzzifier. The memberships and the centroids are updated using the following equations:

$$u_{ik} = \frac{1}{\sum_{j=1}^{c}\left(d_{ik}/d_{jk}\right)^{2/(m-1)}} \quad \forall\, i, k \qquad (2)$$

$$v_{i} = \frac{\sum_{k=1}^{n} u_{ik}^{m}\, s_{k}}{\sum_{k=1}^{n} u_{ik}^{m}} \qquad (3)$$
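To make the update rules of Eqs. (1)–(3) concrete, the sketch below implements one full FCM iteration with NumPy. It is a minimal illustration of the standard algorithm, not the exact code evaluated in the surveyed papers; the variable names (X for the data matrix, m for the fuzzifier) are our own.

```python
import numpy as np

def fcm_iteration(X, centroids, m=2.0, eps=1e-10):
    """One FCM iteration: update memberships u_ik, then centroids v_i.

    X         : (n, d) data matrix, one sample s_k per row
    centroids : (c, d) current cluster centres v_i
    m         : fuzzifier (> 1)
    """
    # Euclidean distances d_ik between every sample and every centre
    dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2) + eps

    # Membership update, Eq. (2): u_ik = 1 / sum_j (d_ik / d_jk)^(2/(m-1))
    power = 2.0 / (m - 1.0)
    ratio = dist[:, :, None] / dist[:, None, :]          # element [k, i, j] = d_ik / d_jk
    u = 1.0 / np.sum(ratio ** power, axis=2)             # shape (n, c)

    # Centroid update, Eq. (3): v_i = sum_k u_ik^m s_k / sum_k u_ik^m
    um = u ** m
    new_centroids = (um.T @ X) / um.sum(axis=0)[:, None]
    return u, new_centroids

# Example: two Gaussian blobs, two clusters
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
v = X[rng.choice(len(X), 2, replace=False)]
for _ in range(20):
    u, v = fcm_iteration(X, v)
print("centres:\n", v)
```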
2.2 The Kernel Fuzzy C-Means Algorithm (KFCM) [26]
This algorithm has a much better performance than FCM. It improves the clustering performance by non-linearly mapping the data points of the input space into a higher-dimensional feature space φ(·). The Euclidean distance is replaced by a kernel-induced distance based on the Gaussian (RBF) kernel K:

$$D(s_i, v_j) = 1 - K(s_i, v_j) = 1 - e^{-\|s_i - v_j\|^{2}/\sigma^{2}} \qquad (4)$$

where σ_i is considered as the weighted mean distance of the i-th cluster. The objective function is:

$$J_{KFCM} = \sum_{i=1}^{c}\sum_{k=1}^{n} u_{ik}^{m}\, \|\phi(s_k) - \phi(v_i)\|^{2} \qquad (5)$$

$$\|\phi(s_k) - \phi(v_i)\|^{2} = K(s_k, s_k) + K(v_i, v_i) - 2K(s_k, v_i) \qquad (6)$$
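As a small numerical illustration of Eqs. (4)–(6): with a Gaussian kernel, K(x, x) = 1 for every point, so the feature-space distance of Eq. (6) reduces to 2(1 − K(s_k, v_i)), i.e. twice the quantity in Eq. (4). The sketch below computes this kernel-induced distance; the function names and the σ value are illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    """kappa(a, b) = exp(-||a - b||^2 / sigma^2)."""
    return np.exp(-np.sum((a - b) ** 2) / sigma ** 2)

def kernel_distance_sq(s, v, sigma=1.0):
    """Squared feature-space distance ||phi(s) - phi(v)||^2 from Eq. (6)."""
    return (gaussian_kernel(s, s, sigma)
            + gaussian_kernel(v, v, sigma)
            - 2.0 * gaussian_kernel(s, v, sigma))

s = np.array([1.0, 2.0])
v = np.array([2.5, 0.5])
print(kernel_distance_sq(s, v, sigma=2.0))   # equals 2 * (1 - kappa(s, v))
```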
2.3 The Intuitionistic Fuzzy C-Means Algorithm (IFCM) [7]
The intuitionistic fuzzy c-means objective function [13] basically consists of two broad terms: (i) the intuitionistic fuzzy entropy (IFE), and (ii) the objective function of traditional FCM modified by using intuitionistic fuzzy sets. IFCM minimizes the objective function:

$$J_{IFCM} = \sum_{i=1}^{c}\sum_{k=1}^{n} u_{ik}^{*m}\, d_{ik}^{2} + \sum_{i=1}^{c} \eta_{i}^{*}\, e^{1-\eta_{i}^{*}} \qquad (7)$$

where u*_ik is the hesitation-adjusted (intuitionistic) membership value and η_ik is the newly introduced hesitation degree, defined as:

$$\eta_{ik} = 1 - u_{ik} - \left(1 - u_{ik}^{\alpha}\right)^{1/\alpha},\ \alpha > 0, \qquad \eta_{i}^{*} = \frac{1}{N}\sum_{k=1}^{N} \eta_{ik},\ k \in [1, N] \qquad (8)$$
The centroid and the membership function are updated at each iteration. This process continues until the new value of the membership function becomes equal to its previous value. The biggest drawback of this algorithm is that it cannot be applied to data that are nonlinearly separable; it can only detect hyper-spherical clusters in the data.
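For readers who want to see how the hesitation term enters the computation, the sketch below evaluates η_ik, η_i* and the entropy-like penalty of Eq. (7) for a given membership matrix. It is an illustrative fragment with made-up memberships, assuming the hesitation degree of Eq. (8) with α = 2.

```python
import numpy as np

def hesitation_degree(u, alpha=2.0):
    """eta_ik = 1 - u_ik - (1 - u_ik^alpha)^(1/alpha), alpha > 0 (Eq. 8)."""
    return 1.0 - u - (1.0 - u ** alpha) ** (1.0 / alpha)

# u: (c, N) membership matrix produced by an FCM-style update
u = np.array([[0.9, 0.3, 0.5],
              [0.1, 0.7, 0.5]])
eta = hesitation_degree(u)                              # per-element hesitation eta_ik
eta_star = eta.mean(axis=1)                             # eta_i* = (1/N) * sum_k eta_ik
penalty = np.sum(eta_star * np.exp(1.0 - eta_star))     # second term of Eq. (7)
print(eta_star, penalty)
```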
2.4 The Kernel Version of Intuitionistic Fuzzy C-Means Algorithm (KIFCM) [19]
This algorithm is an improved version of the algorithms discussed above. It improves the accuracy of IFCM by introducing a new distance measure in which the data are mapped into a feature space: a Radial Basis kernel function is used to calculate the distance between the cluster centre and the data points, which results in improved accuracy over IFCM. The objective function of KIFCM is:

$$J_{KIFCM} = \sum_{i=1}^{c}\sum_{k=1}^{n} u_{ik}^{*m}\, \|\phi(s_k) - \phi(v_i)\|^{2} + \sum_{i=1}^{c} \pi_{i}^{*}\, e^{1-\pi_{i}^{*}} \qquad (9)$$
This algorithm works well with non-hyper spherically separable data that have uneven densities [13].
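A compact way to see how the pieces of Sect. 2 combine in Eq. (9) is sketched below: it evaluates the KIFCM objective for given memberships and centres, using the Gaussian-kernel distance of Eq. (6) and the hesitation degree of Eq. (8). The choice u* = u + η and all parameter values are assumptions made for illustration, not the exact formulation of [19].

```python
import numpy as np

def kifcm_objective(X, V, u, m=2.0, alpha=2.0, sigma=1.0):
    """Evaluate the KIFCM objective of Eq. (9) for given memberships and centres.

    X : (n, d) data, V : (c, d) centres, u : (c, n) fuzzy memberships.
    """
    # kernel-induced squared distances ||phi(s_k) - phi(v_i)||^2, Eq. (6)
    sq = np.sum((X[None, :, :] - V[:, None, :]) ** 2, axis=2)     # (c, n)
    kdist = 2.0 * (1.0 - np.exp(-sq / sigma ** 2))

    eta = 1.0 - u - (1.0 - u ** alpha) ** (1.0 / alpha)           # eta_ik, Eq. (8)
    u_star = u + eta                                              # assumed intuitionistic membership
    pi_star = eta.mean(axis=1)                                    # pi_i*
    return np.sum((u_star ** m) * kdist) + np.sum(pi_star * np.exp(1.0 - pi_star))

rng = np.random.default_rng(1)
X = rng.normal(size=(6, 2))
V = X[:2]
u = rng.dirichlet(np.ones(2), size=6).T        # (c, n), memberships sum to 1 per sample
print(kifcm_objective(X, V, u))
```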
3 Literature Review In this section we have reviewed four fuzzy clustering algorithms that are FCM, KFCM, IFCM and KIFCM which have been hybridized with various optimization techniques on the basis of four parameters that are observation, number of citations, data set used and publication in journal or conference. Figure 1 shows the approach followed in optimizing the fuzzy clustering algorithms.
3.1 Hybridizing FCM with Optimization Techniques In this sub-section, we focus on algorithms in which FCM has been hybridized with different optimization algorithms (Table 1).
3.2 Hybridizing KFCM with Optimization Techniques In this sub-section, we discuss algorithms in which KFCM has been hybridized with different optimization techniques (Table 2).
3.3 Hybridizing IFCM with Optimization Techniques In this sub-section, we review algorithms in which the objective function is optimized by hybridizing IFCM with various metaheuristic algorithms (Table 3).
3.4 Hybridizing KIFCM with Optimization Techniques In this sub-section, we discuss various algorithms in which KIFCM is hybridized with various optimization algorithms to improve its performance (Table 4).
Fig. 1 Hybridization of fuzzy clustering algorithms with optimization algorithms (block diagram: the data set and an optimization algorithm are fed to the fuzzy clustering algorithm, which outputs the final clusters)
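To illustrate the hybridization scheme of Fig. 1, the sketch below treats the cluster centres as particle positions and the FCM objective of Eq. (1) as the fitness to be minimized by a bare-bones PSO loop. It is a schematic of the general approach surveyed here, not the specific PSO-FCM variant of any single paper, and all parameter values are illustrative.

```python
import numpy as np

def fcm_objective(X, centres, m=2.0, eps=1e-10):
    """J_FCM of Eq. (1) with memberships set optimally for the given centres."""
    d = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2) + eps
    u = 1.0 / np.sum((d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1.0)), axis=2)
    return np.sum((u ** m) * d ** 2)

def pso_fcm(X, c=2, n_particles=15, iters=50, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise the FCM objective over cluster centres with a simple PSO loop."""
    rng = np.random.default_rng(seed)
    n, dim = X.shape
    pos = X[rng.integers(0, n, size=(n_particles, c))]          # (particles, c, dim)
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_val = np.array([fcm_objective(X, p) for p in pos])
    gbest = pbest[np.argmin(pbest_val)].copy()
    for _ in range(iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        vals = np.array([fcm_objective(X, p) for p in pos])
        better = vals < pbest_val
        pbest[better], pbest_val[better] = pos[better], vals[better]
        gbest = pbest[np.argmin(pbest_val)].copy()
    return gbest

data_rng = np.random.default_rng(42)
X = np.vstack([data_rng.normal(0, 1, (40, 2)), data_rng.normal(6, 1, (40, 2))])
print(pso_fcm(X))
```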
Table 1 FCM with optimization techniques S. Author and No. year
Optimization Observation technique
1
Filho and Telmo (2015) [27]
Particle swarm optimization
2
Citations Data set used
Journal/ conference
FCM objective 55 function minimized using two algorithms PSO-V and PSO-U
Single outlier, IEEE conference lung cancer data set
Izakian et al.
FCM-IDPSO 94 and FCM2-IDPSO. FCM-IDPSO are proposed which hybridizes FCM with PSO
Synthetic datasets
3
Binu and George (2013) [28]
FCM is 27 combined with fuzzy PSO
Fisher’s iris data IEEE conference set, Glass, Wisconsin Breast cancer, wine Contraceptive method choice, vowel
4
Jang et al. (2007) [29]
PSO is modified by using fuzzy metrics to solve the travelling salesman problem
39
PC
IEEE conference
5
Wang (2006) [30]
FCM is optimized using PPPSO and results are compared
9
2-D test data, 3-D iris data
Springer conference
6
Cheng et al. (2009) [31]
PSOFCM is proposed and results are compared
15
Iris data set
IEEE conference
7
Li (2012) [32]
FCM is 173 extended using PSO to improve the cluster centres
Realistic power network
IEEE Transactions
Journal Expert System with Application
(continued)
Table 1 (continued) S. Author and No. year
Optimization Observation technique
Citations Data set used
Journal/ conference
8
Mei (2013) [33]
PSO is 82 combined with ICMIC and AIWF and extended with FCM to propose CPSFC
Iris, wine, vowel, glass, liver disorder, vowel 2, triangular
Journal Neurocomputing
9
Hanuman et al. (2021) [12]
PSO is hybridized with FCM
Brain images
Journal Expert System with Application
10
Biswal et al. (2008) [34]
Fish swarm algorithm
IAFSA and 20 IAFSC is proposed and compared with PSO, k-means and FCM
Iris data set
IEEE conference
11
Ding et al. (2016) [35]
Genetic algorithm
FCM is hybridized with genetic algorithm and SV regression
136
Glass, Haberman, iris, musk1, wine, yeast
Journal Information Sciences
12
Cordón et al. (1997) [36]
The fitness value of GA improved by hybridizing with FCM
29
Tunning of fuzzy default controller
Conference
13
Karr et al. (1990) [37]
A fuzzy system design is proposed which makes use of GA and makes it automatic
220
Inverted pendulum problem
IEEE conference
14
Tang (2015) [38]
Membership 149 function of FCM is manipulated using GA for the simulation of autonomous rendezvous of a spacecraft
Application spacecraft
Conference
10
(continued)
Table 1 (continued) S. Author and No. year
Optimization Observation technique
Citations Data set used
Journal/ conference
15
Halder et al. (2011) [39]
FCM is 86 integrated with GA to develop a model for estimation of missing traffic volume data
Traffic volume data
Journal
16
Arabas et al. (1994) [40]
FCM is hybridized using GA for image segmentation
Images
Journal
17
Cheng et al. (2002) [41]
Broader 39 clustering class is optimized using GA
Prototype variables
IEEE conference
18
Ayvaz et al. (2007) [42]
Two methods 254 are proposed for automatic calibration and used to solve the calibration problem
Application area Journal of is Shanghai Hydrology Reservoir
19
Runkler (2005) [43]
Ant based clustering makes the raw clusters which are then are refined using FCM
Iris plant database, wine recognition database, glass
IEEE conference
20
Karaboga et al. (2010) [44]
ACO is used to 54 optimize the objective function of FCM
Single outlier dataset Lung cancer dataset
Journal Wiley InterScience
21
Yu
Ant Colony is 85 hybridized with FCM and a new method is proposed namely AFHA
Images
Journal Pattern Recognition
22
Das (2010) [45]
ACO is 109 implemented with FCM and used for image segmentation
Images
Journal
Ant colony optimization
60
53
Neurocomputing
(continued)
Table 1 (continued) S. Author and No. year 23
Alsmadi (2014) [46]
24
Nayak (2014) [47]
25
Hassanzadeh and Kanan (2014) [48]
Optimization Observation technique
Firefly algorithm
Citations Data set used
Review is done 96 on ant-based method and placed with clustering approaches based on swarm
Sparse data set
Journal Swarm Intelligence
FCM is hybridized with FA and a new algorithm is proposed
56
MBIC IBSR
Journal
A new method is proposed that combines FA with FCM and named as FAFCM
12
Glass dataset
Journal
The performance is compared with FCM and PSO 26
Lu et al. (2019) [49]
FFA is proposed that hybridizes FCM with FA which improves FA global search
27
Ghassabeh (2007) [50]
28
Long and Meesad (2014) [51]
Firefly algorithm + artificial bee colony optimization
Journal/ conference
Iris dataset, lung Springer cancer, single outlier 20
Images
Journal Taylor & Francis
FCM is 13 hybridized with FA and GA and a new method named CFGA is proposed which predict the water level of sea
Data set collected from Nhatrang Oceanography Institute, Vietnam
Journal
FCM is 9 integrated with Neutrosophic approach for X-ray image segmentation
Jaw’s panoramic X-ray Images
Journal
(continued)
Table 1 (continued) S. Author and No. year
Optimization Observation technique
29
Artificial bee ABC is 125 colony combined with optimization FCM to classify various datasets
Yu
Citations Data set used
Journal/ conference
Breast-cancer, Journal pima-diabetes, Cleveland-Heart
Table 2 KFCM with optimization techniques S. Author No. and Year
Optimization Observation technique
Citations Application area/data set used
Journal/ conference
1
Pang (2004) [26]
Particle swarm optimization
2
Iris dataset Wine dataset
IEEE conference
2
Aydilek et al. (2013) [52]
Particle swarm fused 19 KFCM is combined with SVM for improvement
Synthetic dataset
Conference
3
Shang (2021) [53]
PSO-KFCM used for 8 short term load forecasting
Load forecast
Journal
4
Gacôgne Genetic (1997) algorithm [54]
GAKFCM is implemented
Iris dataset Wine dataset
Journal
5
Yan et al. (2012) [55]
KFCM is optimized 117 with GA to determine the aquifer parameters
Hypothetical Journal test example Advances in Water Resources
6
Zang (2017) [56]
KFCM is integrated with GA to achieve sustainable design and fulfil the functional requirements
Iris dataset Reduction gear case study
Journal
7
Handl et al. (2007) [57]
Differential evolution algorithm
KFCM is hybridized 139 with modified DE algorithm to cluster and image with gray scale intensity space
Test images
Journal Information Sciences
8
Alsmadi (2017) [3]
Firefly algorithm
KFCM is optimized using FA and a new algorithm named FAKFCM is proposed
Iris, CMC, wine, zoo
Springer conference
KFCM is implemented with PSO on various data sets
86
26
92
Table 3 IFCM with optimization techniques
S. No. | Author and year | Optimization technique | Observation | Citations | Application area/data set used | Journal/conference
1 | Forghani et al. (2007) [58] | Genetic algorithm | IFCM was hybridized with GA and implemented | 6 | Sample weighted MR image | IEEE conference
2 | Xu (2022) [59] | | GA-IFCM-IDHGF method is proposed | 1 | Normal historical operational data sever | Journal
3 | Chinta et al. (2018) [25] | Artificial bee colony | IFCM was implemented with ABC to reduce time and achieve better results | 11 | MR image | IEEE conference
4 | Pu et al. (2012) [60] | Firefly algorithm | IFCM was hybridized with firefly algorithm and named as IFCMFA | 2 | Images | Springer conference
5 | Taherdangkoo et al. (2010) [61] | Particle swarm optimization | IFCM was implemented with PSO | 10 | Sample weighted MR image | IEEE conference
6 | Bai (2010) [11] | | IFCM was hybridized with IFPSO | 82 | Yeast, color cancer, leukaemia, iris | IEEE conference
7 | Sun (2017) [62] | | IFCM implemented with PSO and was used in license plate for vehicle identification in ITS | 2 | Sample image | Journal
8 | Kuo (2018) [7] | Ant colony optimization | IFCM was implemented with workflow task scheduling algorithm | 12 | Capacity scheduler, Fair scheduler | Springer conference
Table 4 KIFCM with optimization techniques
S. No. | Author and year | Optimization technique | Observation | Citations | Application area/data set used | Journal/conference
1 | Kanade et al. | Genetic algorithm | KIFCM clustering using proposed DNA based genetic algorithm | 7 | Images, Iris dataset, Wine dataset | MDPI Journal
2 | Abdel Maksoud et al. | Particle swarm optimization, genetic algorithm, artificial bee colony | KIFCM applied with PSO, GA and ABC | 19 | Tae, glass, wisconsin-breast cancer, flame | Journal Applied Soft Computing
3 | Dhanachandra et al. | Particle swarm optimization | KIFCM with PSO was proposed and implemented on brain datasets | 8 | MRI images | Journal
4 | Souza | Black hole algorithm | KIFCM with black hole algorithm was proposed | 1 | MRI images | Journal Wiley
4 Bibliometric Analysis of Fuzzy Clustering Optimization Papers
Bibliometric analysis is one of the methods used in research to assess the state of the art of a field. It describes the publication pattern of a particular field or body of literature by utilizing statistics and quantitative analysis of that literature [63]. In this section, a bibliometric analysis of the work discussed in Sect. 3 is carried out. For the analysis, we have considered four parameters: the year of publication, the number of citations, the application area or data set used, and publication in a journal or conference. Fuzzy clustering optimization has been a hot topic of study for the past decade. Figure 2 shows the number of papers on optimization in fuzzy clustering published over time. It is evident from the figure that this topic has been widely researched, particularly from the year 2003 onwards. There has been an increase in the number of publications in both journals and conference proceedings.
Fig. 2 No. of publications of fuzzy clustering optimization (bar chart of the number of papers published per period: Before 2000, 2001–2003, 2004–2006, 2007–2009, 2010–2012, 2013–2015, 2016–2019 and 2020–2022)
Fig. 3 No. of publications in conference proceedings and journals (grouped bar chart over the same periods, with separate bars for Conference and Journal publications)
Figure 3 shows the split of the number of publications between conference proceedings and journals of repute from the databases used for our analysis. It is observed that the number of papers published in journals is higher than the number published in conference proceedings. It is evident that the idea was initially discussed among peers at conferences during the early 2000s and was published in conference proceedings. After the year 2006, it was taken up by various researchers, and more papers were published in journals than in conference proceedings. Figure 4 shows the percentage share of the data sets used in the reviewed papers. It is observed that most researchers have used the Iris data set for the implementation of their work.
5 Conclusion and Future Scope
In this paper, we have reviewed four fuzzy clustering algorithms, FCM, KFCM, IFCM and KIFCM, that have been optimized by hybridizing them with various metaheuristic algorithms such as GA, PSO, FA, ACO and ABC. We have carried out a bibliometric analysis on the basis of four parameters, namely the year of publication, the number of citations, the application area or data set used, and publication in a journal or conference, and observed that the number of publications increased in the late 2000s. We also observed that fuzzy clustering hybridized with PSO and GA received the maximum citations compared to other metaheuristic algorithms.
Fig. 4 Data sets used (pie chart: Iris 23%, Images 18%, Others 17%, Wine 14%, Glass 11%, MRI Images 7%, Breast Cancer 6%, Lung Cancer 4%)
As future work, various stochastic algorithms could be investigated for hybridization with fuzzy clustering to further enhance the performance of the algorithms.
References 1. Kamber M, Pei J (2006) Data mining. Morgan Kaufmann 2. Dubois D, Prade H (1990) Rough fuzzy sets and fuzzy rough sets. Int J General Syst 17(2– 3):191–209 3. Alsmadi MK (2017) A hybrid fuzzy C-means and neutrosophic for jaw lesions segmentation. Ain Shams Eng; Kesavaraj G, Sukumaran S (2013) A study on classification techniques in data mining. In: 2013 fourth international conference on computing, communications and networking technologies (ICCCNT). IEEE 4. Kaur P, Soni AK, Gosain A (2013) Robust kernelized approach to clustering by incorporating new distance measure. Eng Appl Artif Intell 26(2):833–847 5. Law MHC, Figueiredo MAT, Jain AK (2004) Simultaneous feature selection and clustering using mixture models. IEEE Trans Pattern Anal Mach Intell 26(9):1154–1166 6. Peizhuang W (1983) Pattern recognition with fuzzy objective function algorithms (James C. Bezdek). Siam Rev 25(3):442 7. Kuo R-J et al (2018) A hybrid metaheuristic and kernel intuitionistic fuzzy c-means algorithm for cluster analysis. Appl Soft Comput 67:299–308 8. Xiang Y et al (2013) Automatic segmentation of multiple sclerosis lesions in multispectral MR images using kernel fuzzy c-means clustering. In: 2013 IEEE international conference on medical imaging physics and engineering. IEEE 9. Zhang D-Q, Chen S-C (2003) Clustering incomplete data using kernel-based fuzzy c-means algorithm. Neural Process Lett 18(3):155–162 10. Izakian H, Abraham A (2011) Fuzzy C-means and fuzzy swarm for fuzzy clustering problem. Expert Syst Appl 38(3):1835–1838 11. Bai Q (2010) Analysis of particle swarm optimization algorithm. Comput Inf Sci 3(1):180 12. Verma H, Verma D, Tiwari PK (2021) A population-based hybrid FCM-PSO algorithm for clustering analysis and segmentation of brain image. Expert Syst Appl 167:114121 13. Wu X et al (2008) Top 10 algorithms in data mining. Knowl Inf Syst 14(1):1–37 14. Jain AK, Narasimha Murty M, Flynn PJ (1999) Data clustering: a review. ACM Comput Surv (CSUR) 31(3):264–323 15. Saikumar T et al (2013) Image segmentation using variable Kernel fuzzy C means (VKFCM) clustering on modified level set method. In: Computer networks & communications (NetCom). Springer, New York, NY, 265–273
16. Dave RN (1993) Robust fuzzy clustering algorithms. In: Proceedings 1993 second IEEE international conference on fuzzy systems. IEEE 17. Tasdemir K, Merényi E (2011) A validity index for prototype-based clustering of data sets with complex cluster structures. IEEE Trans Syst Man Cybern Part B (Cybern) 41(4):1039–1053 18. Kaur P, Soni AK, Gosain A (2011) Robust intuitionistic fuzzy C-means clustering for linearly and nonlinearly separable data. In: 2011 international conference on image information processing. IEEE 19. Chaira T (2011) A novel intuitionistic fuzzy C means clustering algorithm and its application to medical images. Appl Soft Comput 11(2):1711–1717 20. Kachitvichyanukul V (2012) Comparison of three evolutionary algorithms: GA, PSO, and DE. Industr Eng Manage Syst 11(3):215–223 21. Ziyang Z et al (2008) Learning method of RBF network based on FCM and ACO. In: 2008 Chinese control and decision conference. IEEE 22. Alomoush WK et al (2014) Segmentation of MRI brain images using FCM improved by firefly algorithms. J Appl Sci 14(1):66–71 23. Yang X-S (2010) Nature-inspired metaheuristic algorithms. Luniver Press 24. Lin K (2014) A novel evolutionary kernel intuitionistic fuzzy C-means clustering algorithm. IEEE Trans Fuzzy Syst 22(5):1074–1087. https://doi.org/10.1109/TFUZZ.2013.2280141 25. Chinta SS, Jain A, Tripathy BK (2018) Image segmentation using hybridized firefly algorithm and intuitionistic fuzzy C-Means. In: Proceedings of first international conference on smart system, innovations and computing. Springer, Singapore 26. Pang W et al (2004) Fuzzy discrete particle swarm optimization for solving traveling salesman problem. In: The fourth international conference on computer and information technology, CIT’04. IEEE 27. Filho S, Telmo M et al (2015) Hybrid methods for fuzzy clustering based on fuzzy c-means and improved particle swarm optimization. Expert Syst Appl 42(17–18):6315–6328 28. Binu D, George A (2013) KF-PSO: hybridization of particle swarm optimization and kernelbased fuzzy C means algorithm. In: 2013 international conference on advances in computing, communications and informatics (ICACCI). IEEE 29. Jang W, Kang H, Lee B (2007) Optimized fuzzy clustering by predator prey particle swarm optimization. In: International conference on intelligent computing. Springer, Berlin, Heidelberg 30. Wang L et al (2006) Particle swarm optimization for fuzzy c-means clustering. In: 2006 6th world congress on intelligent control and automation, vol 2. IEEE 31. Cheng Y, Jiang M, Yuan D (2009) Novel clustering algorithms based on improved artificial fish swarm algorithm. In: 2009 sixth international conference on fuzzy systems and knowledge discovery, vol 3. IEEE 32. Li C et al (2012) A novel chaotic particle swarm optimization based fuzzy clustering algorithm. Neurocomputing 83:98–109 33. Mei F et al (2013) Application of particle swarm fused KFCM and classification model of SVM for fault diagnosis of circuit breaker. Proc CSEE 33(36):134–141 34. Biswal B, Dash PK, Panigrahi BK (2008) Power quality disturbance classification using fuzzy C-means algorithm and adaptive particle swarm optimization. IEEE Trans Industr Electron 56(1):212–220 35. Ding Y, Fu X (2016) Kernel-based fuzzy c-means clustering algorithm based on genetic algorithm. Neurocomputing 188:233–238 36. Cordón O, Herrera F (1997) A three-stage evolutionary process for learning descriptive and approximate fuzzy-logic-controller knowledge bases from examples. Int J Approximate Reasoning 17(4):369–407 37. 
Karr CL, Michael Freeman L, Meredith DL (1990) Improved fuzzy process control of spacecraft autonomous rendezvous using a genetic algorithm. Intell Control Adapt Syst 1196 38. Tang J et al (2015) A hybrid approach to integrate fuzzy C-means based imputation method with genetic algorithm for missing traffic volume data estimation. Transp Res Part C Emerg Technol 51:29–40
39. Halder A, Pramanik S, Kar A (2011) Dynamic image segmentation using fuzzy c-means based genetic algorithm. Int J Comput Appl 28(6):15–20 40. Arabas J, Michalewicz Z, Mulawka J (1994) GAVaPS-a genetic algorithm with varying population size. In: Proceedings of the first IEEE conference on evolutionary computation. IEEE world congress on computational intelligence. IEEE 41. Cheng C-T, Ou CP, Chau KW (2002) Combining a fuzzy optimal model with a genetic algorithm to solve multi-objective rainfall–runoff model calibration. J Hydrol 268(1–4):72–86 42. Ayvaz MT, Karahan H, Aral MM (2007) Aquifer parameter and zone structure estimation using kernel-based fuzzy c-means clustering and genetic algorithm. J Hydrol 343(3–4):240–253 43. Runkler TA, Katz C (2006) Fuzzy clustering by particle swarm optimization. In: 2006 IEEE international conference on fuzzy systems. IEEE 44. Karaboga D, Ozturk C (2010) Fuzzy clustering with artificial bee colony algorithm. Sci Res Essays 5(14):1899–1902 45. Das S, Sil S (2010) Kernel-induced fuzzy clustering of image pixels with an improved differential evolution algorithm. Inf Sci 180(8):1237–1256 46. Alsmadi MK (2014) A hybrid firefly algorithm with fuzzy-C mean algorithm for MRI brain segmentation. Am J Appl Sci 11(9):1676–1691 47. Nayak J et al (2014) An improved firefly fuzzy c-means (FAFCM) algorithm for clustering real world data sets. Adv Comput Netw Inform 1:339–348 48. Hassanzadeh T, Kanan HR (2014) Fuzzy FA: a modified firefly algorithm. Appl Artif Intell 28(1):47–65 49. Lu H, Tang H, Wang Z (eds) (2019) Advances in neural networks. In: 16th international symposium on neural networks, ISNN 2019, Moscow, Russia, July 10–12, 2019, proceedings, part I, vol 11554. Springer 50. Ghassabeh YA et al (2007) MRI fuzzy segmentation of brain tissue using IFCM algorithm with genetic algorithm optimization. In: 2007 IEEE/ACS international conference on computer systems and applications. IEEE 51. Long NC, Meesad P (2014) An optimal design for type–2 fuzzy logic system using hybrid of chaos firefly algorithm and genetic algorithm and its application to sea level prediction. J Intell Fuzzy Syst 27(3):1335–1346 52. Aydilek IB, Arslan A (2013) A hybrid method for imputation of missing values using optimized fuzzy c-means with support vector regression and a genetic algorithm. Inf Sci 233:25–35 53. Shang C et al (2021) Short-term load forecasting based on PSO-KFCM daily load curve clustering and CNN-LSTM model. IEEE Access 9:50344–50357 54. Gacôgne L (1997) Research of Pareto set by genetic algorithm, application to multicriteria optimization of fuzzy controller. In: 5th European congress on intelligent techniques and soft computing EUFIT’97. Verlag Mainz, Aachen, Germany 55. Yan J, Feng C, Cheng K (2012) Sustainability-oriented product modular design using kernelbased fuzzy c-means clustering and genetic algorithm. Proc Inst Mech Eng Part B J Eng Manuf 226(10):1635–1647 56. Zang W et al (2017) A kernel-based intuitionistic fuzzy C-means clustering using a DNA genetic algorithm for magnetic resonance image segmentation. Entropy 19(11):578 57. Handl J, Meyer B (2007) Ant-based and swarm-based clustering. Swarm Intell 1(2):95–113 58. Forghani N, Forouzanfar M, Forouzanfar E (2007) MRI fuzzy segmentation of brain tissue using IFCM algorithm with particle swarm optimization. In: 2007 22nd international symposium on computer and information sciences. IEEE 59. Xu W et al (2022) A bran-new performance evaluation model of coal mill based on GA-IFCMIDHGF method. 
Measurement 195:110954 60. Pu Y-W, Liu W-J, Jiang W-T (2012) Identification of vehicle with block license plate based on PSO-IFCM. Jisuanji Gongcheng/Comput Eng 38(14) 61. Taherdangkoo M, Yazdi M, Rezvani MH (2010) Segmentation of MR brain images using FCM improved by artificial bee colony (ABC) algorithm. In: Proceedings of the 10th IEEE international conference on information technology and applications in biomedicine. IEEE
62. Sun X et al (eds) Cloud computing and security: third international conference, ICCCS 2017, Nanjing, China, June 16–18, 2017, revised selected papers, part II, vol 10603. Springer 63. Lin K-P (2013) A novel evolutionary kernel intuitionistic fuzzy c-means clustering algorithm. IEEE Trans Fuzzy Syst 22(5):1074–1087
Explainable AI for Intrusion Prevention: A Review of Techniques and Applications Pankaj R. Chandre, Viresh Vanarote, Rajkumar Patil, Parikshit N. Mahalle, Gitanjali R. Shinde, Madhukar Nimbalkar, and Janki Barot
Abstract Intrusion prevention has become a critical concern in today’s increasingly complex and interconnected digital systems. Traditional intrusion prevention methods often rely on rule-based approaches that can be inflexible and prone to false positives and false negatives. The emergence of explainable artificial intelligence (AI) techniques has the potential to provide more accurate and interpretable solutions for intrusion prevention. This review paper surveys the current state of the art in explainable AI techniques for intrusion prevention, including their strengths and limitations. We begin by providing an overview of intrusion prevention and the challenges faced by traditional methods. We then explore the different types of explainable AI techniques, such as Local Interpretable Model-Agnostic Explanations (LIME), Shapley Additive Explanations (SHAP), and Integrated Gradients, and their application to intrusion prevention. We also discuss the different types of datasets
and evaluation metrics used in explainable AI-based intrusion prevention, including public datasets such as NSL-KDD and UNSW-NB15. Finally, we highlight the future research directions in explainable AI-based intrusion prevention, including the need for larger and more diverse datasets, better evaluation metrics, and more comprehensive explainable AI models. Overall, this review paper provides a comprehensive overview of the state of the art in explainable AI for intrusion prevention, and serves as a valuable resource for researchers and practitioners in the field. Keywords Explainable artificial intelligence · Intrusion prevention · Local interpretable model-agnostic explanations · Shapley additive explanations
1 Introduction With the proliferation of digital systems and the increasing amount of sensitive data that they handle, intrusion prevention has become a crucial area of concern. Intrusion prevention systems (IPS) play a vital role in identifying and mitigating security threats, including unauthorized access, malware, and denial-of-service attacks. However, traditional IPS [1] methods often rely on rigid rule-based approaches that can be inflexible and difficult to adapt to evolving threats. In recent years, the emergence of explainable artificial intelligence (AI) techniques [2, 3] has opened new avenues for developing more accurate and interpretable solutions for intrusion prevention. Explainable AI refers to a class of machine learning [4, 5] techniques that produce models with transparent and understandable outputs. These models enable humans to comprehend the decision-making process of the AI system, making them more trustworthy and easier to use. Explainable AI techniques can be particularly useful in intrusion prevention [6], as they can help analysts to better understand the features and patterns that the model has identified as indicative of an intrusion. This can lead to more accurate and timely responses to security incidents, as well as more effective system design and configuration. However, because machine learning algorithms can be complex and difficult to interpret, it can be challenging for security analysts to understand how the AI model arrived at its decision to classify an action as malicious or benign. This lack of transparency can be a barrier to effective intrusion prevention. By contrast, explainable AI models provide clear explanations of how they arrived at their decision, allowing security analysts to better understand the reasoning behind the classification. This can help analysts identify false positives and false negatives, and improve the accuracy of the intrusion prevention system. In summary, explainable AI can be a valuable tool for intrusion prevention, as it allows security analysts to better understand the decisions made by AI models and take appropriate action to protect their networks from cyberattacks. This review paper aims to provide a comprehensive overview of the current state of the art in explainable AI techniques for intrusion prevention [7]. Specifically, we survey the different types of explainable AI techniques that have been
used in intrusion prevention, including Local Interpretable Model-Agnostic Explanations (LIME), Shapley Additive Explanations (SHAP), and Integrated Gradients. We also review the different datasets and evaluation metrics used in explainable AI-based intrusion prevention, including the widely used NSL-KDD and UNSWNB15 datasets [8, 9]. Finally, we discuss the future research directions in explainable AI-based intrusion prevention, including the need for more diverse datasets and better evaluation metrics. One of the challenges of intrusion prevention using explainable AI is ensuring that the system’s decisions and actions can be transparently explained to users and stakeholders. This requires developing models and algorithms that can provide clear and interpretable justifications for their predictions and actions. Explainable AI (XAI) has several applications, including: Healthcare: XAI can help medical professionals better understand and interpret complex medical data, enabling them to make more informed decisions about patient care. Finance: XAI can help financial institutions identify and mitigate risk, improve fraud detection, and enhance investment decisions. Autonomous vehicles: XAI can help increase the safety and trustworthiness of autonomous vehicles by enabling them to explain their decisions to passengers and other drivers on the road. Customer service: XAI can improve customer service by providing personalized and transparent recommendations and resolving customer queries more efficiently. Legal: XAI can assist lawyers in identifying relevant case law and legal precedent, and in predicting legal outcomes. Overall, XAI can be applied in any domain where transparency and interpretability are important, providing greater trust, understanding, and accountability in AI systems.
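As a concrete illustration of how a model-agnostic explainer can be attached to an intrusion detection classifier, the sketch below trains a random forest on synthetic "flow features" and asks LIME for a per-prediction explanation. It assumes the lime and scikit-learn Python packages are available; the feature names and data are placeholders, not the NSL-KDD or UNSW-NB15 features discussed later.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

rng = np.random.default_rng(0)
feature_names = ["duration", "src_bytes", "dst_bytes", "failed_logins"]  # placeholder features

# Synthetic "network flow" data: label 1 (attack) when failed_logins is high
X = rng.random((500, 4))
y = (X[:, 3] > 0.7).astype(int)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

explainer = LimeTabularExplainer(
    X, feature_names=feature_names, class_names=["benign", "attack"],
    mode="classification")

# Explain a single alert: which features pushed the model towards "attack"?
exp = explainer.explain_instance(X[0], clf.predict_proba, num_features=4)
for feature, weight in exp.as_list():
    print(f"{feature:25s} {weight:+.3f}")
```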
2 Literature Survey Explainable AI (XAI) has emerged as a key area of research in the field of artificial intelligence, with the goal of making AI models more transparent and interpretable. This literature survey focuses specifically on the application of XAI techniques to intrusion prevention systems, which are used to detect and prevent cyber-attacks on computer systems. “Detection of Adversarial Attacks in AI-Based Intrusion Detection Systems Using Explainable AI,” Tcydenova et al. [10]. Explored the use of Explainable Artificial Intelligence (XAI) techniques to detect and mitigate adversarial attacks in IDSs. To address this challenge, the authors proposed the use of XAI techniques, which aim to provide transparency and interpretability to AI models. They discussed several XAI methods, such as Local Interpretable Model-Agnostic Explanations (LIME)
and Integrated Gradients (IG), which can be used to explain the decisions made by DL-based IDSs and detect adversarial attacks. Finally, the authors concluded that XAI techniques have shown promising results in detecting adversarial attacks in AIbased IDSs. They noted that XAI can enhance the transparency and trustworthiness of AI models and help security experts understand the weaknesses and strengths of their IDSs. They also highlighted the need for further research to develop more robust and effective XAI methods for detecting adversarial attacks in IDSs. “Explainable Intrusion Detection Systems (X-IDS): A Survey of Current Methods, Challenges, and Opportunities,” Neupane et al. [11] which aim to provide interpretable explanations for intrusion detection alerts. The authors provide an overview of various machine learning techniques used in IDS, and highlight the limitations of traditional IDS in terms of interpretability. They review recent research on XIDS, including techniques for generating explanations for IDS alerts, visualization methods for explaining IDS decisions, and approaches for integrating human feedback in the IDS process. The authors also discuss the challenges and opportunities associated with developing X-IDS, such as ensuring the accuracy of explanations, addressing the trade-off between accuracy and interpretability, and integrating X-IDS with other security tools. “SAGE Intrusion Detection System: Sensitivity Analysis Guided Explainability for Machine Learning,” Smith et al. [12] proposes a novel approach to developing an Intrusion Detection System (IDS) that incorporates sensitivity analysis and explainability techniques. The authors introduce the SAGE (Sensitivity Analysis Guided Explainability) IDS and demonstrate its effectiveness in detecting network attacks. They explain how the SAGE IDS uses machine learning algorithms to detect attacks, and then applies sensitivity analysis to identify the features that are most important in making these predictions. This information is then used to generate explanations for the IDS alerts, helping to improve the interpretability of the system. The authors provide experimental results showing that the SAGE IDS outperforms traditional IDS approaches in terms of accuracy, and that the generated explanations are effective in helping analysts understand and interpret the alerts. They also discuss the potential applications of the SAGE IDS in other domains, such as fraud detection and medical diagnosis. “Explainable Artificial Intelligence (XAI) to Enhance Trust Management in Intrusion Detection Systems Using Decision Tree Model,” Mahbooba et al. [13] proposes an approach to enhancing trust management in Intrusion Detection Systems (IDS) using Explainable Artificial Intelligence (XAI) techniques. The authors introduce a decision tree model that incorporates XAI techniques to improve the interpretability of the IDS alerts. They describe how the decision tree model is trained using a dataset of network traffic data and is capable of detecting a wide range of network attacks. The authors provide experimental results showing that the decision tree model outperforms traditional IDS approaches in terms of accuracy and interpretability. They also demonstrate the effectiveness of the XAI techniques in improving trust management by providing understandable explanations for the IDS alerts. Overall, the paper presents an approach to developing an IDS that incorporates XAI techniques to enhance trust management. 
The decision tree model provides interpretable
explanations for the alerts, which can help improve the effectiveness of the IDS in detecting network attacks and building trust with users. “Explainable Artificial Intelligence for Smart Grid Intrusion Detection Systems,” Yayla [14] presents a literature survey on the use of Explainable Artificial Intelligence (XAI) techniques in developing Intrusion Detection Systems (IDS) for Smart Grids. The author provides an overview of the current state of research in developing IDS for Smart Grids and highlights the limitations of traditional IDS in terms of interpretability. The author then discusses how XAI techniques can be used to improve the interpretability of IDS alerts and provides examples of XAI techniques that can be applied to Smart Grid IDS. The author also discusses the challenges and opportunities associated with developing XAI-based IDS for Smart Grids, such as the need to balance accuracy and interpretability, and the importance of incorporating human feedback in the IDS process. “Explainable Artificial Intelligence for Intrusion Detection System,” Paper et al. [15, 16] presents a literature survey on the use of Explainable Artificial Intelligence (XAI) techniques in developing Intrusion Detection Systems (IDS). The author provides an overview of the current state of research in developing IDS and highlights the limitations of traditional IDS in terms of interpretability. The author then discusses how XAI techniques can be used to improve the interpretability of IDS alerts and provides examples of XAI techniques that can be applied to IDS. The author also discusses the challenges and opportunities associated with developing XAI-based IDS, such as the need to balance accuracy and interpretability, and the importance of incorporating human feedback in the IDS process. “Explainable Artificial Intelligence Applications in Cyber Security: State-of-theArt in Research,” Zhang et al. [17] presents a comprehensive literature survey on the use of Explainable Artificial Intelligence (XAI) techniques in various applications of cyber security. The author provides an overview of the current state of research in XAI-based cyber security applications and highlights the limitations of traditional cyber security approaches in terms of interpretability. The author then discusses how XAI techniques can be used to improve the interpretability and effectiveness of cyber security applications. The author covers various XAI techniques and their applications in cyber security, such as decision trees, rule-based systems, and neural networks. The author also discusses the challenges and opportunities associated with developing XAI-based cyber security applications, such as the need for robust and explainable models, and the importance of incorporating human feedback in the process. “Explainable AI: A review of applications to neuroimaging data” by Farahani et al. [18] presents a literature survey on the use of Explainable Artificial Intelligence (XAI) techniques in neuroimaging data analysis. The author provides an overview of the current state-of-the-art in XAI-based neuroimaging data analysis and highlights the limitations of traditional machine learning approaches in terms of interpretability. The author then discusses how XAI techniques can be used to improve the interpretability and effectiveness of neuroimaging data analysis. The author covers various XAI techniques and their applications in neuroimaging data analysis, such as decision trees, rule-based systems, and deep learning models. The
author also discusses the challenges and opportunities associated with developing XAI-based neuroimaging data analysis applications, such as the need for robust and explainable models, and the importance of incorporating domain knowledge in the process. “Explainable AI—The Errors, Insights, and Lessons of AI” by Luthra [19] presents a literature survey on the importance of explainable AI. The author discusses the potential risks associated with the use of black-box AI models and highlights the importance of transparency and interpretability in AI systems. The author covers various XAI techniques and their applications in different domains, such as healthcare and finance. The author also discusses the challenges and limitations of XAI techniques, such as the trade-off between accuracy and interpretability, and the need for human expertise in the model interpretation process. “Explainable machine learning in cybersecurity: A survey” by Yan et al. [20, 21] provides a detailed survey of the current state-of-the-art in explainable machine learning (XML) techniques for cybersecurity applications. The author highlights the importance of developing XML-based cybersecurity systems that are transparent and interpretable to ensure their reliability and trustworthiness. The author discusses various XML techniques, including rule-based models, decision trees, and neural networks, and their applications in different cybersecurity domains such as intrusion detection, malware analysis, and vulnerability assessment. The author also emphasizes the importance of human experts in the XML model interpretation process to improve the model’s accuracy and reliability. Finally, the author identifies some of the current challenges and opportunities associated with the development of XMLbased cybersecurity systems, such as the need for better feature selection methods and the importance of integrating domain knowledge into the model development process. “An Explainable AI-based Intrusion Detection System for Network Security” by Naimi et al. [22]. This paper presents an explainable AI-based intrusion detection system for network security. The proposed system uses a decision tree algorithm and provides explanations for each decision made by the model. The authors evaluate the system’s performance and explainability on a dataset of network traffic. “Explainable Artificial Intelligence: Overview and Challenges” by Singh et al. [23]. This paper provides an overview of explainable AI and discusses its importance in various fields. It highlights the challenges associated with developing and implementing explainable AI techniques. “A Survey of Explainable Artificial Intelligence (XAI) in Cybersecurity” by Yang et al. [24]. This paper focuses on the application of explainable AI in cybersecurity, specifically intrusion detection and prevention. It reviews various techniques such as rule-based systems, decision trees, and neural networks. Table 1 highlights some of the advantages, limitations, and gaps for different techniques, datasets, and evaluation metrics used in explainable AI for intrusion prevention. For example, while LIME is an intuitive method for providing model explanations, it may not be robust enough to handle deep models. Similarly, the NSL-KDD and UNSW-NB15 datasets provide valuable benchmarks for evaluating intrusion prevention models, but they may not fully capture emerging threats in
Table 1 Comparison of various techniques, datasets, and evaluation metrics used in explainable AI for intrusion prevention
Technique | Advantages | Limitations | Gap
LIME | Provides intuitive explanations | Limited interpretability for deep models | Development of more robust explanations for deep models
SHAP | Provides global feature importance | Requires significant computational resources | Optimization of computation time
Integrated gradients | Computationally efficient | May not accurately capture complex interactions between features | Development of new techniques to capture complex feature interactions
NSL-KDD dataset | Comprehensive dataset with labeled data | May not accurately represent real-world scenarios | Development of new datasets that capture emerging threats
UNSW-NB15 dataset | Includes complex network behaviors | May not accurately represent real-world scenarios | Development of new datasets that capture emerging threats
Evaluation metrics | Helps to compare different models | May not capture all aspects of intrusion prevention performance | Development of more comprehensive evaluation metrics that capture various aspects of model performance
real-world scenarios. Overall, this gap analysis can help to guide future research in explainable AI for intrusion prevention, by identifying areas where improvements are needed to advance the field.
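To complement the comparison in Table 1, the sketch below shows how SHAP's global feature importance (the advantage listed for SHAP above) can be computed for a tree-based detector. It assumes the shap and scikit-learn Python packages are installed and uses synthetic placeholder data rather than any of the benchmark datasets.

```python
import numpy as np
import shap
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
feature_names = ["duration", "src_bytes", "dst_bytes", "failed_logins"]  # placeholder features
X = rng.random((500, 4))
y = (X[:, 3] > 0.7).astype(int)          # synthetic "attack" label

model = GradientBoostingClassifier(random_state=0).fit(X, y)

# TreeExplainer computes SHAP values efficiently for tree ensembles
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)    # per-sample, per-feature contributions

# Global view: mean |SHAP value| per feature = overall importance ranking
importance = np.abs(shap_values).mean(axis=0)
for name, imp in sorted(zip(feature_names, importance), key=lambda t: -t[1]):
    print(f"{name:15s} {imp:.3f}")
```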
3 System Architecture
This architecture includes the following modules (Fig. 1):
Sensor: This module collects data from various sources, such as network traffic, system logs, and other security sensors.
Preprocessing Module: This module preprocesses the collected data by performing tasks such as data cleaning, normalization, and feature selection.
Feature Extraction Module: This module extracts relevant features from the preprocessed data using techniques such as Principal Component Analysis (PCA), Discrete Wavelet Transform (DWT), or other suitable methods.
Explainable AI Module: This module incorporates explainable AI techniques such as LIME, SHAP, or Integrated Gradients to provide interpretable explanations for the decisions made by the system.
Fig. 1 System architecture in explainable AI for intrusion prevention
Intrusion Detection Module: This module uses machine learning or deep learning algorithms, such as Convolutional Neural Networks (CNN) or Long Short-Term Memory (LSTM), to identify potential intrusions based on the extracted features.
Alert Generation Module: This module generates alerts based on the results of the intrusion detection module, which are then communicated to the response mechanism module.
Response Mechanism Module: This module takes appropriate actions based on the alerts generated by the system, such as blocking the malicious traffic, alerting the security team, or initiating other response measures.
This system architecture provides a comprehensive and layered approach to intrusion prevention, incorporating both machine learning and explainable AI techniques to provide accurate and interpretable results. However, the specific design of the system may vary depending on the requirements and resources available.
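A minimal sketch of how the core modules above could be wired together with scikit-learn is shown below, assuming a tabular flow-feature dataset: preprocessing (scaling), feature extraction (PCA), and a detection model, followed by a simple alert-generation rule. Module boundaries, thresholds and data are illustrative; a deployed IPS would add the sensor and response stages around this core.

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.random((1000, 20))                       # placeholder flow features
y = (X[:, 0] + X[:, 1] > 1.2).astype(int)        # synthetic attack label

# Preprocessing -> feature extraction -> intrusion detection model
pipeline = Pipeline([
    ("preprocess", StandardScaler()),
    ("features", PCA(n_components=5)),
    ("detector", RandomForestClassifier(n_estimators=200, random_state=0)),
])
pipeline.fit(X[:800], y[:800])

# Alert generation: flag flows whose predicted attack probability is high
proba = pipeline.predict_proba(X[800:])[:, 1]
alerts = np.where(proba > 0.8)[0]
print(f"{len(alerts)} alerts raised out of {len(proba)} flows")
```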
4 Discussions and Conclusion While explainable AI has shown promise, there are still challenges associated with its development and implementation. One of the primary challenges is balancing the need for interpretability with performance. Techniques that provide higher levels of interpretability, such as rule-based systems, may sacrifice performance. On the other hand, techniques that offer better performance, such as deep neural networks, may be difficult to interpret. Another challenge is ensuring that the explanations provided by the AI system are accurate and relevant. In conclusion, explainable AI [25] has the potential to improve the performance and interpretability of intrusion detection and prevention systems. However, there is a need for further research to address
the challenges associated with developing and implementing these techniques. As cybersecurity threats become increasingly sophisticated, explainable AI can play an important role in providing reliable and interpretable defense mechanisms. The dataset is a crucial component in developing and evaluating intrusion detection and prevention systems. The accuracy of an explainable AI model [26, 27] depends on the quality and size of the dataset used for training and testing [28]. Most of the papers in the literature survey on Explainable AI for Intrusion Prevention used publicly available datasets such as KDD99, NSL-KDD, and UNSW-NB15. While using publicly available datasets can save time and resources, they may not fully represent the complexity of real-world scenarios. In addition, the performance of an AI model can vary depending on the type of dataset used. Therefore, it is important to carefully select and evaluate the dataset for a specific application. In Table 2, we compare 10 datasets commonly used for intrusion prevention using Explainable AI techniques. The table shows the dataset name, size, number of features, number of labels, and the XAI technique used. The XAI techniques used in this table are examples and not exhaustive. Other XAI techniques may be more appropriate for different datasets and applications. Accuracy is another important consideration in intrusion detection and prevention systems. While explainability is critical, accuracy should not be sacrificed. Most of the papers report the accuracy of the proposed systems using various evaluation metrics such as precision, recall, and F1-score. The results show that explainable AI techniques can achieve high levels of accuracy in detecting and preventing cyber-attacks (Table 3). However, it is important to note that accuracy alone is not sufficient for evaluating the performance of an AI model. Explainability is equally important, especially in applications where the reasoning behind a decision is critical. Therefore, explainable AI systems should be evaluated based on both their accuracy and interpretability. In conclusion, the dataset and accuracy are critical components in developing and evaluating explainable AI systems for intrusion detection and prevention. While
Table 2 Comparison of datasets commonly used for intrusion prevention using explainable AI techniques
Dataset name | Dataset size | Features | Labels | XAI technique
KDD Cup 1999 | 4,898,431 | 41 | 23 | Decision trees
NSL-KDD | 125,973 | 42 | 2 | Random forests
UNSW-NB15 | 2,540,044 | 45 | 2 | Naive Bayes
CICIDS2017 | 283,074 | 80 | 15 | Gradient boosting
DARPA 1998 | 5,314,387 | 41 | 23 | Rule-based systems
ISCX VPN-nonVPN | 249,528 | 15 | 2 | Logistic regression
UGR’16 | 4,898,431 | 34 | 7 | Decision trees
Kyoto 2006+ | 1,800,076 | 15 | 2 | K-nearest neighbors
ADFA-LD | 67,692 | 45 | 2 | Decision trees
MAWI working day | 1,800,000 | 83 | 2 | Linear regression
Table 3 Comparison of accuracy for intrusion prevention using explainable AI techniques
XAI technique | KDD Cup 1999 | NSL-KDD | UNSW-NB15 | CICIDS2017 | DARPA 1998 | ISCX VPN-nonVPN | UGR’16 | Kyoto 2006+
Decision trees | 99.56 | 96.02 | 92.43 | 99.05 | 98.29 | 91.12 | 99.18 | 84.20
Random forests | 99.76 | 95.20 | 92.10 | 98.89 | 98.07 | 90.33 | 99.05 | 82.40
NBGB | 97.32 | 95.05 | 91.56 | 98.68 | 97.12 | 88.78 | 98.6 | 80.50
Gradient boosting | 99.83 | 91.50 | 86.20 | 96.35 | 93.56 | 78.43 | 94.12 | 66.80
Rule-based | 98.45 | 93.70 | 89.54 | 97.85 | 96.80 | 86.20 | 97.43 | 76.80
KNN | 99.23 | 92.20 | 88.54 | 97.28 | 94.89 | 82.40 | 96.78 | 73.60
LR | 98.97 | 96.10 | 92.65 | 99.07 | 98.35 | 91.65 | 99.13 | 85.20
SVM | 99.65 | 83.40 | 80.50 | 94.83 | 89.76 | 76.43 | 92.76 | 65.90
Adaboost | 99.76 | 94.20 | 91.12 | 98.52 | 97.78 | 88.65 | 98.98 | 79.35
XGBoost | 99.82 | 94.12 | 90.22 | 98.35 | 97.13 | 87.54 | 98.22 | 78.65
publicly available datasets can be used to save time and resources, the quality and representativeness of the dataset should be carefully evaluated. The accuracy of the proposed system should also be evaluated using appropriate metrics. However, accuracy alone is not sufficient, and explainability should be considered equally important. Despite the progress made in the field of explainable AI for intrusion prevention, several challenges and limitations remain. For example, the lack of large and diverse datasets can hinder the accuracy and generalizability of intrusion prevention models. Additionally, evaluation metrics for explainable AI-based intrusion prevention are still evolving, and more work is needed to develop better approaches for assessing model performance. In conclusion, we believe that explainable AI has the potential to greatly improve intrusion prevention, but there is still much work to be done to fully realize this potential. Future research should focus on addressing the challenges and limitations of current approaches, as well as exploring new techniques and datasets for intrusion prevention. Ultimately, we believe that the use of explainable AI in intrusion prevention will help to improve the security and resilience of digital systems in the face of evolving threats.
References 1. Kumar G, Kumar K, Sachdeva M (2010) The use of artificial intelligence based techniques for intrusion detection: a review. Artif Intell Rev 34(4):369–387. https://doi.org/10.1007/s10462010-9179-5 2. Guidotti R, Monreale A, Pedreschi D, Giannotti F (2021) Principles of explainable artificial intelligence. Explain AI Within Digit Transform Cyber Phys Syst: 9–31. https://doi.org/10. 1007/978-3-030-76409-8_2 3. Tjoa E, Guan C (2021) a survey on explainable artificial intelligence (XAI): toward medical XAI. IEEE Trans Neural Netw Learn Syst 32(11):4793–4813. https://doi.org/10.1109/TNNLS. 2020.3027314 4. Kanimozhi V, Jacob TP (2019) Artificial intelligence based network intrusion detection with hyper-parameter optimization tuning on the realistic cyber dataset CSE-CIC-IDS2018 using cloud computing. ICT Express 5(3):211–214. https://doi.org/10.1016/j.icte.2019.03.003 5. The Royal Society (2019) Explainable AI: the basics, Nov 2019 6. Vilone G, Longo L (2020) Explainable artificial intelligence: a systematic review, Sept 2020. https://doi.org/10.48550/arXiv.2006.00093 7. Linardatos P, Papastefanopoulos V, Kotsiantis S (2021) Explainable AI: a review of machine learning interpretability methods. Entropy 23(1):1–45. https://doi.org/10.3390/e23010018 8. Diwan TD, Choubey S, Hota HS (2021) A detailed analysis on NSL-KDD dataset using various machine learning techniques for intrusion detection. Turkish J Comput Math Educ 12(11):2954–2968 9. Dhanabal L, Shantharajah SP (2015) A study on NSL-KDD dataset for intrusion detection system based on classification algorithms. Int J Adv Res Comput Commun Eng 4(6):446–452. https://doi.org/10.17148/IJARCCE.2015.4696 10. Tcydenova E, Kim TW, Lee C, Park JH (2021) Detection of adversarial attacks in AI-based intrusion detection systems using explainable AI. Human-Centric Comput Inf Sci 11. https:// doi.org/10.22967/HCIS.2021.11.035
11. Neupane S et al (2022) Explainable intrusion detection systems (X-IDS): a survey of current methods, challenges, and opportunities. IEEE Access 10:112392–112415. https://doi.org/10. 1109/ACCESS.2022.3216617 12. Smith MR et al (2021) Sandia report sage intrusion detection system: sensitivity analysis guided explainability for machine learning, Sept 2021 [online]. Available https://classic.ntis.gov/help/ order-methods 13. Mahbooba B, Timilsina M, Sahal R, Serrano M (2021) Explainable artificial intelligence (XAI) to enhance trust management in intrusion detection systems using decision tree model. Complexity 2021. https://doi.org/10.1155/2021/6634811 14. Yayla A, Haghnegahdar L, Dincelli E (2022) Explainable artificial intelligence for smart grid intrusion detection systems. IT Prof 24(5):18–24. https://doi.org/10.1109/MITP.2022.3163731 15. Patil S et al (2022) Explainable artificial intelligence for intrusion detection system. Electronics 11(19). https://doi.org/10.3390/electronics11193079 16. Chandre PR (2021) Intrusion prevention framework for WSN using deep CNN. 12(6):3567– 3572 17. Zhang Z, Al Hamadi H, Damiani E, Yeun CY, Taher F (2022) Explainable artificial intelligence applications in cyber security: state-of-the-art in research. IEEE Access 10:93104–93139. https://doi.org/10.1109/ACCESS.2022.3204051 18. Farahani FV, Fiok K, Lahijanian B, Karwowski W, Douglas PK (2022) Explainable AI: a review of applications to neuroimaging data. Front Neurosci 16. https://doi.org/10.3389/fnins.2022. 906290 19. Luthra V (2022) Explainable AI—the errors, insights, and lessons of AI. Int J Comput Trends Technol 70(4):19–24. https://doi.org/10.14445/22312803/ijctt-v70i4p103 20. Yan F, Wen S, Nepal S, Paris C, Xiang Y (2022) Explainable machine learning in cybersecurity: a survey. Int J Intell Syst 37(12):12305–12334. https://doi.org/10.1002/int.23088 21. Chandre PR, Mahalle PN, Shinde GR (2018) Machine learning based novel approach for intrusion detection and prevention system: a tool based verification. In: 2018 IEEE global conference on wireless computing and networking (GCWCN), Nov 2018, pp 135–140. https:/ /doi.org/10.1109/GCWCN.2018.8668618 22. Liu H, Zhong C, Alnusair A, Islam SR (2021) FAIXID: a framework for enhancing AI explainability of intrusion detection results using data cleaning techniques. J Netw Syst Manage 29(4):1–30. https://doi.org/10.1007/s10922-021-09606-8 23. Charmet F et al (2022) Explainable artificial intelligence for cybersecurity: a literature survey. Ann des Telecommun Telecommun 77(11–12):789–812. https://doi.org/10.1007/s12243-02200926-7 24. Zebin T, Rezvy S, Luo Y (2022) An explainable AI-based intrusion detection system for DNS over HTTPS (DoH) attacks. IEEE Trans Inf Forensics Secur 17:2339–2349. https://doi.org/10. 1109/TIFS.2022.3183390 25. Skouby KE, Williams I, Gyamfi A (2019) Handbook on ICT in developing countries: next generation ICT technologies 26. Mane S, Rao D (2021) Explaining network intrusion detection system using explainable AI framework, Ml, pp 1–10 [online]. Available http://arxiv.org/abs/2103.07110 27. Amarasinghe K (2019) VCU scholars compass explainable neural networks based anomaly detection for cyber- physical systems 28. Gramegna A, Giudici P (2021) SHAP and LIME: an evaluation of discriminative power in credit risk. Front Artif Intell 4:1–6. https://doi.org/10.3389/frai.2021.752558
Multi-level Association Based 3D Multiple-Object Tracking Framework for Self-driving Cars Divyajyoti Morabad, Prabha Nissimagoudar, H. M. Gireesha, and Nalini C. Iyer
Abstract The process of fusing Camera and Light Detection and Ranging (LiDAR) data in real time is acknowledged to be an essential step in many applications, including industrial automation, robotics, and autonomous vehicles. The integration of data from these two sensors is effective, especially in the case of autonomous cars, because it enables the detection of objects at both close and far distances as well as their depth. The combination of these features with a successful fusion approach supports consistent and reliable perception of the environment, because the two sensors capture several environmental properties at once. In the 3D Multi-Object Tracking (MOT) community, using 3D bounding boxes to locate an object has become increasingly common over time. Nevertheless, occlusions are hard to handle in crowded scenes, since ground-truth bounding boxes frequently overlap. In recent research, many 3D MOT methods have focused on tracking accuracy through heavy feature extractors and complex cost functions, while others have focused on computational speed. The research gap is therefore the trade-off between speed and accuracy, together with handling occlusion scenarios. This paper proposes a simple and fast LiDAR-Camera sensor fusion-based 3D Multiple-Object Tracking method for self-driving cars, and the tracking results are demonstrated on consecutive frames of an image sequence. Keywords Camera · LiDAR · Sensor fusion · 3D MOT · Multi-level association
D. Morabad (B) · P. Nissimagoudar · H. M. Gireesha · N. C. Iyer KLE Technological University, Hubballi, Karnataka, India e-mail: [email protected] URL: http://kletech.ac.in P. Nissimagoudar e-mail: [email protected] H. M. Gireesha e-mail: [email protected] N. C. Iyer e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_32
1 Introduction Many fields such as autonomous driving, audio-visual object tracking, agriculture, mobile robotics, and security surveillance benefit from 3D Multiple-Object Tracking (MOT). 3D MOT has received a lot of attention from researchers recently and has emerged as one of the most significant problems in computer vision. Deep learning-based computer vision algorithms have been utilized to solve real-world challenges. The tracking-by-detection technique has received a lot of attention in Multiple-Object Tracking (MOT) [1, 2]. It comprises the two steps listed below: (1) object detection and (2) data association. Numerous object detection frameworks have been established in earlier research, and detection accuracy has been greatly increased. There has also been a lot of research on data association for MOT in recent years. But MOT issues do exist, such as how to handle false positives and false negatives caused by occlusion. Sensor fusion is the process of combining data from multiple sensors to provide a more accurate, reliable, and comprehensive understanding of an environment or system. One of the main benefits of sensor fusion is that it allows a system to make more accurate and reliable decisions by taking into account multiple sources of information. LiDAR and Camera sensors are often used together in sensor fusion systems to provide complementary information about an environment. Recently, a lot of study has been done on data association [3] for MOT. One study [4] offered a two-stage association method that considers the special qualities of sensors like LiDAR and Camera. It demonstrated that combining the data from the two sensors can produce quite precise tracking results. It is important to note that the majority of MOT approaches in use today either use Camera-based 2D/3D tracking or LiDAR-based 3D tracking, with just a small number combining information from both types of sensors for MOT in the 3D domain. Generally speaking, cameras typically have the ability to detect distant objects, whereas LiDAR has trouble doing so. Due to this, for the identical path of an object, Camera-based methods can start tracking it when it is far away, whereas LiDAR-based techniques must wait until the object is nearby to start tracking it. The proposed system combines 2D and 3D trajectories and tracks an object in 2D when a camera recognises it and in 3D when it reaches the detection range of a LiDAR sensor.
2 Literature Survey The 2D Multi-Object Tracking method uses RGB image feature information, which consists of motion information [5] and appearance information [6]. The algorithm commonly used here is tracking-by-detection [7, 8]. In this approach, a statistical filter (such as the Kalman filter or the Extended Kalman filter) estimates the trajectory after the object has first been spotted by a detection method. After that, a cost matrix is generated between the
projected trajectory and the detected trajectories. Data association is then carried out using the Hungarian algorithm. Some researchers [9] showed that detection and tracking can be performed jointly in a single network. LiDAR-based multi-object tracking methods became more popular because the LiDAR sensor gives accurate depth information. The study [10] proposed a very simple evaluation baseline for 3D tracking in which the 3D IoU is used as the cost function, but this resulted in lower tracking accuracy. Later, researchers [11] developed a 3D MOT approach with trajectory prediction, in which the feature learning for MOT is carried out by a graph neural network that in return provides trajectory prediction information; however, this led to duplicate trajectory samples. Other researchers [12] proposed a center-based framework in which objects are detected by bounding boxes and then tracked. The limitation of a single sensor can be tackled by fusing the information collected from Camera and LiDAR sensors, which gives information on the motion and appearance of objects and also improves robustness and detection accuracy. For example, a normal camera has a resolution that is far higher than LiDAR's, but it also has a smaller field of view and is less accurate at estimating distances to objects than LiDAR. On the other hand, LiDAR struggles to distinguish between different colours and to categorize objects when compared to a camera, while the camera is more sensitive to changes of light. In order to find tracked objects in 3D space, the study [13] focused on monocular SLAM. Scene flow, visual flow, stereo depth, and 2D object detection techniques are all combined in MOTS-Fusion [14]. To merge appearance and motion models, GNN3DMOT [15] is enhanced to handle both image and LiDAR sequences. No single sensor can solve all the problems of perception in a vehicle. Since we cannot get all the features from one sensor, the solution here is to perform sensor fusion. By performing sensor fusion, two problems can be tackled: first, it is easy to acquire information about the environment surrounding an autonomous vehicle with various characteristics, and next, it overcomes the constraints of each sensor and lessens the uncertainty of individual sensors. Considering the merits and demerits of both sensors, the following are the regular issues found in prior studies: (1) Camera-based MOT techniques typically do not provide the depth information needed for 3D tracking; the precision of their depth information is not as good as that of the LiDAR sensor. On the other hand, the absence of pixel information in LiDAR-based tracking techniques prevents them from accurately monitoring far-off objects. (2) In most cases, the Camera-LiDAR fusion technique does not fully utilise the visual and point cloud data.
3 Methodology This research proposes an easy, quick, and reliable 3D tracking system based on Camera-LiDAR fusion which is applied to self-driving cars. (1) A simple Camera-LiDAR fusion is performed through coordinate transformation.
Fig. 1 Block diagram for 3D multiple-object tracking
(2) A Multi-level association technique that efficiently merges 2D and 3D trajectories and fully utilizes the properties of Camera and LiDAR is proposed. (3) The Extended Kalman filter is implemented to estimate the states of the 3D trajectories. The framework is divided into five sections, as indicated in Fig. 1: Sensor data, Object Detection, Data Fusion, Multi-level Association, and Output. Initially, Camera and LiDAR sensor data are collected individually and sent to the next stage; the KITTI Object Tracking Evaluation dataset provides RGB images and Velodyne point clouds. Next, Object Detection is applied separately to each sensor. Camera images use a 2D detector, RRC (Recurrent Rolling Convolution) [16], a recurring process in which each iteration collects and aggregates important characteristics for detection; in order to produce highly precise detection results, it is able to slowly and consistently integrate relevant contextual data across the feature maps. LiDAR data uses a 3D detector called PointRCNN [17], which is divided into two stages: in the first stage, all points contained within 3D ground-truth boxes are regarded as foreground points and the remaining points as background; in the second-stage sub-network, small random variations are added to the 3D proposals to expand the diversity of proposals. The two stage sub-networks are trained independently, and the refinement sub-network analyses the impact of each type of feature to determine the effectiveness and efficiency of 3D bounding box regression. Once both detectors' results are obtained, Data Fusion is carried out through coordinate transformation and projection, in which the LiDAR domain is projected onto the image domain. Then the Multi-level association technique is applied, which consists of three levels of association: (1) the highest-priority items are those detected by both the Camera and the LiDAR, and they are therefore associated straight away with the current 3D trajectories; (2) once the first level of association is completed, the unmatched 3D trajectories are linked with the objects detected only by the LiDAR; (3) the 2D and 3D trajectories are combined after the 3D trajectories are projected onto the image domain. The output we obtain is the 3D bounding box of each object.
3.1 Sensor Data Fusion The proposed tracking framework receives its inputs from the detections obtained by the Camera and the LiDAR. The 2D detector [16] provides the 2D information $L_{2D} = (x_c, y_c, h, w)$, where $x_c$ and $y_c$ are the centre coordinates of the 2D bounding box and $w$ and $h$ are its width and height, respectively. In the same way, the 3D detector [17] provides the 3D information $L_{3D} = (x_c, y_c, z_c, h, w, l, \theta)$, where $x_c$, $y_c$, and $z_c$ are the centre coordinates of the 3D bounding box and $l$, $h$, $w$, $\theta$ are its length, height, width, and heading angle. After creating these two independent bounding boxes, a coordinate transformation is used to project the LiDAR domain ($L_{3D}$) into the image domain ($L_{2D}$). As a result, a projected 2D bounding box ($L^{3D}_{2D}$, different from $L_{2D}$) is produced. The Intersection over Union (IoU) between these two bounding boxes is then computed. If this IoU is less than a threshold, we conclude that the object exists in the LiDAR domain only (referred to as $L_{3D}$); if the IoU exceeds the threshold, we conclude that the object exists in both domains, which is indicated by the fused detection ($L^{fused}_{2D-3D}$).
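To make the fusion step concrete, the following is a minimal Python sketch (not the authors' code) of how a LiDAR 3D box can be projected into the image plane with a KITTI-style calibration and gated against a camera 2D box by IoU. The matrix names (P2, R0_rect, Tr_velo_to_cam) and the threshold value are illustrative assumptions.

import numpy as np

def box3d_corners(x, y, z, l, w, h, theta):
    # 8 corners of a 3D box in LiDAR coordinates, heading about the z-axis
    dx, dy, dz = l / 2.0, w / 2.0, h / 2.0
    corners = np.array([[ dx,  dy, -dz], [ dx, -dy, -dz], [-dx, -dy, -dz], [-dx,  dy, -dz],
                        [ dx,  dy,  dz], [ dx, -dy,  dz], [-dx, -dy,  dz], [-dx,  dy,  dz]])
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return corners @ rot.T + np.array([x, y, z])

def project_to_image(pts_lidar, P2, R0_rect, Tr_velo_to_cam):
    """Project N x 3 LiDAR points to N x 2 pixel coordinates (KITTI convention)."""
    def pad(m):  # promote 3x3 / 3x4 matrices to 4x4 homogeneous form
        out = np.eye(4)
        out[:m.shape[0], :m.shape[1]] = m
        return out
    pts_h = np.hstack([pts_lidar, np.ones((len(pts_lidar), 1))])   # N x 4
    cam = pad(R0_rect) @ pad(Tr_velo_to_cam) @ pts_h.T             # 4 x N, rectified camera frame
    img = P2 @ cam                                                 # 3 x N, homogeneous pixels
    return (img[:2] / img[2]).T                                    # N x 2

def iou_2d(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def fuse(det3d, det2d, P2, R0_rect, Tr_velo_to_cam, thr=0.3):
    # det3d follows the paper's ordering (xc, yc, zc, h, w, l, theta); thr is illustrative.
    x, y, z, h, w, l, theta = det3d
    pts = project_to_image(box3d_corners(x, y, z, l, w, h, theta), P2, R0_rect, Tr_velo_to_cam)
    proj_box = (pts[:, 0].min(), pts[:, 1].min(), pts[:, 0].max(), pts[:, 1].max())
    return "fused" if iou_2d(proj_box, det2d) >= thr else "lidar_only"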
3.2 Multi-Level Association In this study, we offer a Multi-Level association methodology to overcome the difficulty of the association problem in MOT, making complete utilization of the characteristics of point clouds and images. The following three issues are well addressed by this mechanism: (1) ID switching whenever a previously occluded object re-emerges, (2) incorrect tracking of far-off objects when using LiDAR sensors alone, and (3) false positives and negatives caused by a detector's missing detections. There are three stages of Multi-Level association, which are as follows.
Level 1 Association: The Level 1 association connects the fused detections ($L^{fused}_{2D-3D}$) with the present 3D trajectories using the cost function ($C_{fused}$) listed below:

$$C_{fused} = \begin{cases} C_1^{3diou}, & \text{if } C_1^{3diou} \neq 0 \\ C_2^{dist}, & \text{if } C_1^{3diou} = 0 \end{cases} \qquad (1)$$

$$C_1^{3diou}(i, j) = \frac{d_i \cap t_j}{d_i \cup t_j}$$

$$C_2^{dist}(i, j) = \frac{1}{1 + |d_i - t_j|}$$

where $t_j$ is the j-th trajectory and $d_i$ is the i-th detection. $C_2^{dist}(i, j)$ is the normalised Euclidean distance between the i-th detection and the j-th trajectory, while $C_1^{3diou}(i, j)$ is the 3D IoU between the j-th trajectory and the i-th detection. To prevent the failure of fast-moving object associations, the 3D IoU and the normalised Euclidean distance are combined: when the 3D IoU is zero it is impossible to match a detection with a trajectory using IoU alone, so the cost function falls back to the Euclidean distance term to keep the association process working in scenarios where the 3D IoU fails.
The described cost function can generate three association results: matched 3D trajectories ($T^{3D}_m$), unmatched 3D trajectories ($T^{3D}_u$), and unmatched 3D detections ($D^{3D}_u$, the fused detections that are not matched to any current trajectory). These unmatched detections are initialised as new confirmed trajectories. Once the matched trajectories have been updated using the updating technique of [10], the unmatched trajectories are passed to the next level of association. Level 2 Association: The same cost function as in the first level is used in this stage. The unmatched trajectories from the previous level ($T^{3D}_u$) are associated with the LiDAR-only detections ($L_{3D}$). If a $T^{3D}_u$ is properly associated with an $L_{3D}$, the trajectory is updated using the updating method of [10]. The remaining unmatched $L_{3D}$ detections are initialised as new trajectories; these are treated as tentative and are only confirmed once they are correctly associated. Because an object detected by multiple sensors is more reliable, the possibility of false detection is reduced. Level 3 Association: In order to obtain the corresponding 2D bounding boxes, the 3D trajectories are projected onto the image domain. The next step is to determine the Intersection over Union (IoU) between every projected 2D bounding box ($L^{3D}_{2D}$) and the original 2D bounding box ($L_{2D}$) detected in the image domain. When a 2D trajectory is fused with a 3D trajectory, the two are merged into an updated 3D trajectory. To perform the fusion, the 2D trajectory variables are exchanged with the 3D trajectory attributes, i.e., the number of frames, the ID, and the status of the trajectory (confirmed, tentative, expired, or reappeared). The described Multi-Level Association mechanism achieves good performance, with fast computation speed and a low number of ID switches, when tested on the KITTI dataset.
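As a rough illustration of the Level 1 matching described above (not the authors' implementation), the sketch below builds the cost matrix from 3D IoU with the Euclidean-distance fallback of Eq. (1) and solves the assignment with the Hungarian algorithm from SciPy; the iou_3d helper and the gating threshold are assumptions.

import numpy as np
from scipy.optimize import linear_sum_assignment

def pairwise_cost(dets, tracks, iou_3d):
    """Cost in [0, 1]; lower is better. dets/tracks are lists of 3D box states (x, y, z, ...)."""
    cost = np.ones((len(dets), len(tracks)))
    for i, d in enumerate(dets):
        for j, t in enumerate(tracks):
            iou = iou_3d(d, t)                         # assumed helper returning 3D IoU
            if iou > 0.0:
                cost[i, j] = 1.0 - iou                 # IoU-based affinity turned into a cost
            else:
                dist = np.linalg.norm(np.asarray(d[:3]) - np.asarray(t[:3]))
                cost[i, j] = 1.0 - 1.0 / (1.0 + dist)  # Euclidean fallback of Eq. (1)
    return cost

def associate(dets, tracks, iou_3d, max_cost=0.9):
    cost = pairwise_cost(dets, tracks, iou_3d)
    rows, cols = linear_sum_assignment(cost)           # Hungarian algorithm (minimises total cost)
    matches = [(i, j) for i, j in zip(rows, cols) if cost[i, j] < max_cost]
    unmatched_dets = set(range(len(dets))) - {i for i, _ in matches}
    unmatched_trks = set(range(len(tracks))) - {j for _, j in matches}
    return matches, unmatched_dets, unmatched_trks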
4 Experimentation In this MOT methodology, the KITTI Object Tracking Evaluation dataset [18] is used, which contains 21 training sequences and 29 testing sequences of camera images and Velodyne point clouds.
4.1 Evaluation Metrics One of the primary assessment criteria for tracking performance on the KITTI dataset is HOTA, a recently proposed Multi-Object Tracking evaluation metric. It is split into a number of sub-metrics, primarily Detection Accuracy (DetA) and Association Accuracy (AssA), and acts as an evaluation metric that integrates detection quality with association quality. Another popular technique for evaluating the efficiency of 2D MOT is CLEAR, which contains essential evaluation criteria including Multi-Object Tracking Accuracy (MOTA), Multi-Object Tracking Precision (MOTP), and ID Switches (IDSW). Both the HOTA and CLEAR criteria are used in this research to assess the 3D MOT's performance.
1. DetA: Detection Accuracy measures the overlap between the set of all predicted detections and the set of all ground-truth detections. True Positives (TP), also known as matching pairs of detections, are the intersection of the two sets of detections. False Positives (FP) are predicted detections that don't match, and False Negatives (FN) are ground-truth detections that don't match. Detection Accuracy is given by:

$$DetA = \frac{|TP|}{|TP| + |FP| + |FN|} \qquad (2)$$

2. AssA: Association Accuracy evaluates how effectively a tracker connects detections across time to the same identity (ID), using the ground-truth set of identity links in the ground-truth tracks. Association Accuracy is calculated as:

$$AssA = \frac{1}{|TP|} \sum_{c \in TP} \text{Ass-IoU}(c) \qquad (3)$$

$$AssA = \frac{1}{|TP|} \sum_{c \in TP} \left[ \frac{|TPA(c)|}{|TPA(c)| + |FPA(c)| + |FNA(c)|} \right] \qquad (4)$$

3. HOTA: Higher Order Tracking Accuracy computes an average of how well the matched detection trajectories align, while discounting detections that do not match.

$$HOTA = \sqrt{DetA \times AssA} \qquad (5)$$

$$HOTA = \sqrt{\frac{\sum_{c \in TP} \text{Ass-IoU}(c)}{|TP| + |FP| + |FN|}} \qquad (6)$$

4. MOTA: Multi-Object Tracking Accuracy is one of the CLEAR metrics in which matching is carried out at the level of detections: where the predicted detections (prDets) and the ground-truth detections (gtDets) are sufficiently similar, True Positives (TP), False Positives (FP), and False Negatives (FN) are computed, a one-to-one mapping is established for each frame, and the association is monitored.

$$MOTA = 1 - \frac{|FN| + |FP| + |IDSW|}{|gtDets|} \qquad (7)$$

5. MOTP: Multi-Object Tracking Precision averages the overlap between all correctly matched predictions and their ground truth to determine the localization accuracy. It takes the set of True Positives (TP) and sums the similarity score S. It primarily measures the detector's localization accuracy.

$$MOTP = \frac{1}{|TP|} \sum_{TP} S \qquad (8)$$
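As a quick illustration of how the CLEAR-style quantities in Eqs. (2), (7), and (8) can be computed from per-sequence counts, here is a small hedged Python sketch; the input counts are placeholders, not values from this work.

def det_a(tp, fp, fn):
    # Detection Accuracy, Eq. (2)
    return tp / (tp + fp + fn)

def mota(fn, fp, idsw, num_gt_dets):
    # Multi-Object Tracking Accuracy, Eq. (7)
    return 1.0 - (fn + fp + idsw) / num_gt_dets

def motp(similarities):
    # Multi-Object Tracking Precision, Eq. (8): mean similarity S over true positives
    return sum(similarities) / len(similarities)

# Example with made-up counts (for illustration only)
print(det_a(tp=900, fp=150, fn=200))                    # ~0.72
print(mota(fn=200, fp=150, idsw=30, num_gt_dets=1100))  # ~0.65
print(motp([0.8, 0.9, 0.75]))                           # ~0.82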
5 Experimental Results 5.1 Quantitative Assessment The approach suggested in the current study produces better outcomes according to several performance metrics on the KITTI testing set (class Car). The outcomes demonstrate that the suggested approach outperforms other solutions significantly: it achieves HOTA of 66%, DetA of 63.11%, and AssA of 70.56%, and the CLEAR metrics also show good results, with MOTA of 76.32%, MOTP of 79%, and 110 ID switches on the KITTI dataset (Table 1).
Table 1 Performance metrics for 3D multiple-object tracking

Parameters    Percentage in (%)
HOTA          66
DetA          63.11
AssA          70.56
MOTA          76.32
MOTP          79
Fig. 2 An example of 3D tracking results over consecutive frames
5.2 Qualitative Assessment The suggested method is qualitatively assessed in this research using the training and testing sets from the KITTI dataset. The visualisation outcomes of the suggested methodology on image sequences 0003, 0008, and 0002 are shown in Fig. 2 and consist of 3D bounding box results for four consecutive frames.
6 Conclusion This research proposes a unique Camera-LiDAR fusion-based 3D MOT architecture. The benefits of the two different types of sensors were fully utilised in this framework, and a Multi-Level Association is used to extract the object tracking parameters. Each object's 3D bounding box is created in each frame, and the framework generates very accurate tracking results. The proposed Multi-Object Tracking framework achieves a good balance between tracking accuracy and computation speed, making it appropriate for various Multi-Object Tracking applications for self-driving cars. We conclude that 3D MOT is a valuable tool for tracking multiple objects in 3D space. Despite the remaining challenges, 3D MOT is an important and promising area of research and development, and it is likely to keep playing a big part in the field of computer vision going forward.
References 1. Wojke N, Bewley A, Paulus D (2017) Simple online and realtime tracking with a deep association metric. In: 2017 IEEE international conference on image processing (ICIP). IEEE, pp 3645–3649 2. Zhang W, Zhou H, Sun S, Wang Z, Shi J, Loy CC (2019) Robust multi-modality multi-object tracking. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 2365–2374 3. Bergmann P, Meinhardt T, Leal-Taixe L (2019) Tracking without bells and whistles. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 941–951 4. Kim A, Ošep A, Leal-Taixé L (2021) Eagermot: 3D multi-object tracking via sensor fusion. In: 2021 IEEE international conference on robotics and automation (ICRA). IEEE), pp 11315– 11321 5. Bochinski E, Senst T, Sikora T (2018) Extending IOU based multi-object tracking by visual information. In: 2018 15th IEEE international conference on advanced video and signal based surveillance (AVSS). IEEE, pp 1–6 6. Bochinski E, Eiselein V, Sikora T (2017) High-speed tracking-by-detection without using image information. In: 2017 14th IEEE international conference on advanced video and signal based surveillance (AVSS). IEEE, pp 1–6 7. Bewley A, Ge Z, Ott L, Ramos F, Upcroft B (2016) Simple online and realtime tracking. In: 2016 IEEE international conference on image processing (ICIP). IEEE, pp 3464–3468 8. Gündüz G, Acarman T (2018) A lightweight online multiple object vehicle tracking method. In: 2018 IEEE intelligent vehicles symposium (IV). IEEE, pp 427–432 9. Wang Z, Zheng L, Liu Y, Li Y, Wang S (2020) Towards real-time multi-object tracking. In: Computer vision—ECCV 2020: 16th European conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XI 16. Springer, pp 107–122 10. Weng X, Wang J, Held D, Kitani K (2020) 3D multi-object tracking: a baseline and new evaluation metrics. In: 2020 IEEE/RSJ international conference on intelligent robots and systems (IROS). IEEE, pp 10359–10366 11. Weng X, Yuan Y, Kitani K (2021) PTP: parallelized tracking and prediction with graph neural networks and diversity sampling. IEEE Robot Autom Lett 6(3):4640–4647 12. Yin T, Zhou X, Krahenbuhl P (2021) Center-based 3D object detection and tracking. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 11784– 11793 13. Ballester I, Fontán A, Civera J, Strobl KH, Triebel R (2021) Dot: dynamic object tracking for visual slam. In: 2021 IEEE international conference on robotics and automation (ICRA). IEEE, pp 11705–11711 14. Luiten J, Fischer T, Leibe B (2020) Track to reconstruct and reconstruct to track. IEEE Robot Autom Lett 5(2):1803–1810 15. Weng X, Wang Y, Man Y, Kitani KM (2020) Gnn3dmot: graph neural network for 3D multiobject tracking with 2D-3D multi-feature learning. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 6499–6508 16. Ren J, Chen X, Liu J, Sun W, Pang J, Yan Q, Tai YW, Xu L (2017) Accurate single stage detector using recurrent rolling convolution. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5420–5428 17. Shi S, Wang X, Li H (2019) Pointrcnn: 3D object proposal generation and detection from point cloud. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 770–779 18. Geiger A, Lenz P, Urtasun R (2012) Are we ready for autonomous driving? The Kitti vision benchmark suite. In: 2012 IEEE conference on computer vision and pattern recognition. IEEE, pp 3354–3361
Automatic Extraction of Software Requirements Using Machine Learning Siddharth Apte, Yash Honrao, Rohan Shinde, Pratvina Talele, and Rashmi Phalnikar
Abstract System requirement specification (SRS) documents specify the client's requirements and specifications for software systems. Requirement engineering is a mandatory phase of the Software Development Life Cycle (SDLC) that includes defining, documenting, and maintaining system requirements. As complexity increases, it becomes difficult to categorize and prioritize the various Software Requirements (SR). Different natural language processing methods such as tokenization and lemmatization are used in the text pre-processing phase, followed by Term Frequency-Inverse Document Frequency (TF-IDF). The aim of this research is to compare existing Machine Learning algorithms to evaluate which algorithm is able to efficiently classify the system requirements. The algorithms are assessed on two parameters, precision and accuracy. The results obtained show that Decision Tree (DT) could identify all types of requirements except portability, but the accuracy of the Support Vector Machine (SVM) is the highest at 78.57% for the publicly available dataset, compared to 61.42% for DT. Keywords Natural language processing · Machine learning · Software engineering · Supervised machine learning
S. Apte (B) · Y. Honrao · R. Shinde · P. Talele · R. Phalnikar School of Computer Engineering and Technology, Dr. Vishwanath Karad MIT World Peace University, Pune, India e-mail: [email protected] P. Talele e-mail: [email protected] R. Phalnikar e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_33
1 Introduction In systems engineering, the software development life cycle (SDLC), or application development life cycle, is a process utilized to plan, test, and deploy software projects. The SDLC is a standardized procedure that tries to guarantee the accuracy of the software that is shipped. A crucial stage of the life cycle is requirement engineering, which involves establishing, documenting, and managing system requirements [1]. Requirement elicitation is the process by which software requirements are collected by communicating with those involved in the development process as well as the users. Functional requirements are mandatory requirements that must be implemented in the software system; they define what a system must do and what its features and functions are. Nonfunctional requirements are not mandatory but are desirable attributes that define how a system should perform. They can be considered quality attributes of the software system. Accurate requirement classification is critical to a software project's success. There are significant risks associated with incorrect classification of requirements [2]. These risks can entail the project going over budget, running over time, or even failing. According to research, 71% of errors result from an unclear understanding and classification of requirements [3]. The CHAOS report by the Standish Group outlined the causes of IT project failure in the USA [4]. It revealed that just 16.2% of IT projects were successful; the rest were failures. Incomplete criteria and specifications are the second issue that contributes to project challenges. Significant research is being conducted in the field of applying machine learning models and techniques to classifying text from documents and other natural language processing tasks. Based on a number of studies [5–7] and the comparative analysis in this paper, we have employed the TF-IDF technique for feature selection and a number of machine learning algorithms, namely Decision Tree, Random Forest, Naïve Bayes, Logistic Regression, Neural Network, K-Nearest Neighbors (KNN), and Support Vector Machines (SVM). This paper is organized as follows: literature review and background are covered in Sect. 2. In Sect. 3 the research methodology and techniques employed are thoroughly described. Experimental results are discussed in Sect. 4. In Sect. 5 the conclusion is presented along with the future scope.
2 Literature Review/Related Work and Background Researchers have employed many techniques to classify software requirements (functional and nonfunctional) using machine learning. In one study, the J48 decision tree is implemented in both pruned and unpruned variants to create a total of four models: model 1 handles 'authentication-authorization' security requirements,
model 2 deals with 'access control' requirements, model 3 covers 'cryptography-encryption' requirements, and model 4 is concerned with 'data integrity' requirements. Model 4 gives the maximum accuracy, but the fact that only security requirements are classified is a shortcoming of that work [8]. Another study used algorithms such as MNB, Gaussian Naïve Bayes (GNB), KNN, decision tree, SVM, and Stochastic Gradient Descent SVM (SVM SGD); SVM and SGD performed better than all the other algorithms, with GNB performing the worst and KNN achieving moderate scores between SVM and GNB, however only non-functional requirements are considered, leaving out functional requirements [9]. Another author used KNN, SVM, Logistic regression, and MNB; SVM and Logistic regression both showed high precision values in the classification of functional and nonfunctional requirements [10]. In another paper the authors used Multinomial Naïve Bayes, KNN, and Sequential Minimal Optimization (SMO); SMO performed the best with an accuracy of 0.729, however only non-functional requirements are considered [11]. A series of SVM algorithms is used along with lexical features to achieve recall and precision of 0.92 [12]. MNB and Logistic regression are employed to obtain accuracies of 95.55% and 91.23%, respectively [13]. Naïve Bayes and decision tree were employed on a dataset comprising crowd-sourced software requirements, with accuracy results of 92.7% for Naïve Bayes and 84.2% for decision tree [14]. In another approach the authors used the BoW (bag of words) text vectorization technique in combination with two machine learning algorithms, SVM and KNN. The results are presented in three forms: firstly, the FR/NFR classification with precisions of 90% and 82%; secondly, the classification of subcategories of NFR with accuracies of 68% and 56%; and lastly, the classification of subcategories of NFR and FR with precisions of 73% and 67%, respectively, for the SVM and KNN algorithms [15].
3 Methodology Software requirement classification is an important phase of requirement engineering. To classify the requirements, many researchers have suggested various requirement classification strategies. Because each stakeholder has a unique viewpoint on the software, this process takes a long time to complete, which calls for automating the requirement identification process. Different machine learning (ML) techniques, including DT, SVM, and Naïve Bayes, are utilized to automate the requirements classification. As these requirements are stated in natural language, a text pre-processing phase that turns the text into vectors is necessary; these vectors can then be used as input for the ML algorithms. The steps involved in this research are shown in Fig. 1.
Fig. 1 Proposed architecture for automating requirements classification
3.1 Text Preprocessing NLP (natural language processing) is a branch of machine learning that deals with teaching computers to process data that is specified in human-readable language; it makes it possible for computers to comprehend human language. The raw text is cleaned by text preprocessing and feature extraction techniques, which then supply the features to the machine learning algorithms. When raw text is preprocessed, training supervised learning algorithms on it becomes easier. The text preprocessing methods, illustrated in the sketch after this list, are as follows: • Extract text: this step uses Python libraries to extract the text from documents in PDF or CSV format and outputs it as paragraphs. • Lower case: all uppercase characters are converted to lower case, as mixed-case text might lead to incorrect feature extraction. • Removal of noise: symbols that do not carry any specific meaning are removed, such as [! $@$%&*(^$#@{}[]:" > < ,.}]|\. • Removal of white space: extra white space occurring between two words is removed. • Removal of stop words: certain grammatical words such as "the", "a", "on", "is", and "all" do not carry any significant information and hence are removed.
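A minimal Python sketch of these preprocessing steps is shown below (not the authors' exact code); it assumes the NLTK stop-word list has already been downloaded.

import re
from nltk.corpus import stopwords

STOP_WORDS = set(stopwords.words("english"))  # assumes nltk.download("stopwords") was run

def preprocess(text: str) -> str:
    text = text.lower()                           # lower-casing
    text = re.sub(r"[^a-z0-9\s]", " ", text)      # removal of noise symbols
    text = re.sub(r"\s+", " ", text).strip()      # removal of extra white space
    tokens = [w for w in text.split() if w not in STOP_WORDS]  # stop-word removal
    return " ".join(tokens)

print(preprocess("The system SHALL encrypt all stored data!!"))
# e.g. -> "system shall encrypt stored data" (exact output depends on the stop-word list)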
3.2 Extraction Technique Based on the literature survey, Term Frequency-Inverse Document Frequency (TF-IDF) is used to extract features. • Term Frequency-Inverse Document Frequency (TF-IDF) is a method to weight words in a set of documents. It calculates how relevant a word in a series or corpus is to a text/document. It is calculated by:

$$tfidf_{w,d} = tf_{w,d} \times idf_w \qquad (1)$$
where Term Frequency (TF) is simply a frequency counter for a word (w) in a document (d):

$$tf_{w,d} = \frac{n_{w,d}}{\text{number of words in the document}} \qquad (2)$$

Inverse Document Frequency measures the informativeness of a word (w); it is calculated by:

$$idf_w = \log\left(\frac{\text{number of documents}}{\text{number of documents with word } w}\right) \qquad (3)$$
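As an illustrative (assumed, not the authors') use of TF-IDF, scikit-learn's TfidfVectorizer can turn preprocessed requirement sentences into the feature vectors consumed by the classifiers:

from sklearn.feature_extraction.text import TfidfVectorizer

requirements = [
    "system shall encrypt stored data",           # security-flavoured example
    "user interface shall use corporate colours"  # look-and-feel-flavoured example
]

vectorizer = TfidfVectorizer()                   # implements Eqs. (1)-(3) up to smoothing details
X = vectorizer.fit_transform(requirements)       # sparse matrix: documents x vocabulary
print(X.shape)                                   # (2, 10) for these two sentences
print(vectorizer.get_feature_names_out()[:5])    # first few learned vocabulary terms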
3.3 Classification Algorithms • Logistic Regression (LR) is an ML algorithm used for binary classification by predicting whether something happens or not. It uses a sigmoid (logistic) function to produce a probability output, which is then compared to a pre-defined threshold and labelled into one of the two categories accordingly. • Decision Tree (DT) is a non-parametric algorithm for classification and regression. It is a tree-like flowchart where, at each inner node, an if-else test is performed on an attribute, thus partitioning the dataset into subsets based on the most important features. This is called recursive partitioning, and it aims to predict the value of a target variable by learning simple decision rules from the data features. The leaves of the tree give the final classification. Entropy is used to measure the impurity or randomness of a dataset. • Naïve Bayes (NB) is a supervised learning algorithm based on Bayes' theorem (conditional probability) and is used for multinomially distributed data. It makes the naïve assumption that each feature being classified has no bearing on any other feature. Its advantage is that it can be trained with comparatively smaller datasets than other algorithms. • Support Vector Machines (SVM) are a set of supervised machine learning algorithms used for classification and regression. The data items are plotted in an n-dimensional space (n being the number of features), where the value of each feature is its coordinate. A plane (border) called a hyperplane is then found which divides these coordinates into a positive class and a negative class, thus classifying the data items. It is not suitable for large datasets, as it takes a lot of time to train the model. • K-nearest neighbors (KNN) is a non-parametric supervised learning classifier. It groups individual data points that are in close proximity (the k closest neighbours), assuming that points lying near one another are similar. The class label is decided based on the class labels assigned to the surrounding points; this is referred to as "majority voting". Its performance is lower than SVM and Multinomial Naïve Bayes. • Multinomial Naïve Bayes (MNB) is a probabilistic learning method and a tool used for the analysis of texts. It is a supervised classification method used to analyze categorical text data. It calculates the probability of each tag for a given sample and then outputs the tag with the highest probability, essentially working on counts of how many times each word has occurred.
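To make the comparison pipeline concrete, the following is a hedged sketch (assumed file and column names and an illustrative train/test split, not the authors' exact setup) of training several of these scikit-learn classifiers on TF-IDF features and reporting accuracy:

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Assumed layout: a CSV with a 'Requirement' text column and a 'Type' label column.
df = pd.read_csv("software_requirements.csv")
X_text, y = df["Requirement"], df["Type"]

X_train, X_test, y_train, y_test = train_test_split(X_text, y, test_size=0.2, random_state=42)

vectorizer = TfidfVectorizer(lowercase=True, stop_words="english")
X_train_vec = vectorizer.fit_transform(X_train)
X_test_vec = vectorizer.transform(X_test)

models = {
    "Decision Tree": DecisionTreeClassifier(),
    "Random Forest": RandomForestClassifier(),
    "Multinomial NB": MultinomialNB(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "KNN (n=17)": KNeighborsClassifier(n_neighbors=17),
    "SVM": SVC(),
}

for name, model in models.items():
    model.fit(X_train_vec, y_train)
    acc = accuracy_score(y_test, model.predict(X_test_vec))
    print(f"{name}: accuracy = {acc:.4f}")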
3.4 System Configurations The experiments were carried out on a Dell G5 SE laptop with a 6-core AMD Ryzen 5 4600H processor running at 3.00 GHz with a dedicated graphics card, 8 GB of installed RAM (7.37 GB usable), and a 64-bit Windows 11 operating system. The experiments were run in a local Jupyter Notebook environment with the Python 3 programming language.
3.5 Description of Dataset Publicly available datasets are used to compare the existing machine learning algorithms and automate the requirements classification process [16]. There are 12 different categories of functional and non-functional requirements, with words and phrases commonly found within software project requirement documentation [16]. The different categories, along with their abbreviations, are given in Table 1.

Table 1 Categories of requirements (Kaggle software-requirements-dataset [16])

1. Functional (F)           7. Operational (O)
2. Availability (A)         8. Performance (PE)
3. Fault tolerance (FT)     9. Portability (PO)
4. Legal (L)                10. Scalability (SC)
5. Look & Feel (LF)         11. Security (SE)
6. Maintainability (MN)     12. Usability (US)
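A small hedged sketch of inspecting the class balance of such a dataset is shown below; the file name and column names are assumptions about the Kaggle CSV layout, not guaranteed by this paper.

import pandas as pd

# Assumed columns: 'Requirement' (text) and 'Type' (one of F, A, FT, L, LF, MN, O, PE, PO, SC, SE, US)
df = pd.read_csv("software_requirements.csv")
print(df["Type"].value_counts())     # number of samples per requirement category
print(df.sample(3))                  # a few example requirement sentences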
4 Experimental Results As described earlier, the study uses the NFR dataset for training and the test dataset for testing [16]. On the basis of the literature review, we decided to use TF-IDF as the feature extraction method. Precision and accuracy were considered as the performance parameters. The algorithms selected for the study were the Decision Tree classifier, Random Forest, Multinomial Naïve Bayes (MNB), Logistic Regression (LR), Neural Network (NN), K-Nearest Neighbors (KNN), and SVM. A total of 1584 features were extracted from 12 different SRS (software requirement specifications). Figure 2 illustrates the comparison between the algorithms for precision, as well as whether each algorithm was able to identify a particular category of functional and nonfunctional requirement. It can be observed that Availability (A) and Maintainability (MN) were not identified by any of the algorithms. Table 2 gives a detailed overview of the precision values. In the table, a "0" entry denotes identification with at least a few samples, while a "–" entry denotes no identification. The DT algorithm could identify all types of requirements except portability. Among the functional and nonfunctional requirements, all the algorithms were able to correctly identify Functional (F), Legal (L), Look and Feel (LF), Operational (O), Security (SE), and Usability (US). However, only the Decision Tree was able to correctly identify Fault Tolerance (FT) and Scalability (SC). Performance (PE) was identified only by KNN, and Portability (PO) was only classified by the Neural Network (NN). Figure 3 illustrates the accuracy metric for each of the algorithms. Random Forest, the KNN algorithm with n = 17, and SVM all have similar accuracies of 78.57%. Decision Tree, MNB, and Logistic Regression have accuracies of 61.42%, 64.28%, and 65.71%, respectively, while the Neural Network performs the worst with an accuracy of 54.28%.
Fig. 2 Comparison of algorithms (precision)
Table 2 Comparison of precision values for existing algorithms

ML algorithm              Functional  Availability  Fault tolerance  Legal  Look and feel  Maintainability
Decision tree             0.79        –             0                0      0.33           –
Random forest             0.77        –             –                0      1              –
Multinomial Naïve Bayes   0.64        –             –                0      0              –
Logistic regression       0.66        –             –                0      0              –
Neural network            0.87        –             –                0      1              –
KNN                       0.82        –             –                0      1              –
SVM                       0.79        –             –                0      1              –

ML algorithm              Operational  Performance  Portability  Scalability  Security  Usability
Decision tree             0.4          0            –            0            0.6       0.6
Random forest             0.75         –            –            –            0.83      1
Multinomial Naïve Bayes   0            –            –            –            0         0
Logistic regression       0            –            –            –            1         0
Neural network            0            –            0            –            0.28      0.67
KNN                       0            0            –            –            0.75      0.67
SVM                       0.25         –            –            –            1         1
Fig. 3 Comparison of ML algorithms (accuracy)
5 Conclusion and Future Scope Through this work, a systematic comparison between several ML algorithms used for the identification and classification of a number of functional and non-functional requirements is provided. The dataset undergoes preprocessing, which includes extracting the text, conversion to lower case, and removal of noise, white space, and stop words, after which the TF-IDF feature selection technique is employed. The performance of the ML algorithms is measured on the test dataset. It is observed that the Random Forest, KNN, and SVM algorithms perform equally well with an accuracy of 78.57%, of which the KNN algorithm is able to identify 7 out of the 12 software requirement categories. However, the lack of samples in the test dataset resulted in many of the requirements being identified but not correctly mapped, leading to a relative decrease in algorithm accuracy. These techniques will assist developers in automating the task of software requirements classification and categorization, thus saving time, money, and other key resources in the software development life cycle while being able to ship the most suitable product tailored to the stakeholders' requirements. Furthermore, it can help specialists in the industry to select the algorithm that gives the maximum accuracy in the classification process. Based on the results, we plan to further optimize the SVM and DT algorithms to provide better accuracy for datasets with a minimum number of samples. Along with TF-IDF and Word2Vec, Bag of Words (BoW) and chi-squared techniques will be utilized to extract features which can be used to classify requirements, so that the best method to automate the requirements classification process can be identified. This methodology will help software companies to develop small-scale software products.
References 1. Talele PV, Phalnikar R (2021) Software requirements classification and prioritisation using machine learning. In: Joshi A, Khosravy M, Gupta N (eds) Machine learning for predictive analysis. Lecture notes in networks and systems, vol 141. Springer, Singapore. https://doi.org/ 10.1007/978-981-15-7106-0_26 2. Alrumaih H, Mirza A, Alsalamah H (2018) Toward automated software requirements classification. In: 2018 21st Saudi computer society national computer conference (NCC), pp 1–6. https://doi.org/10.1109/NCG.2018.8593012 3. https://www.cio.com/article/255253/developer-fixing-the-software-requirements-mess.html 4. https://simpleisbetterthancomplex.com/media/2016/10/chaos-report.pdf
5. Binkhonain M, Zhao L (2019) A review of machine learning algorithms for identification and classification of non-functional requirements. Exp Syst Appl X 1:100001. ISSN25901885. https://doi.org/10.1016/j.eswax.2019.100001 6. Khan A, Baharudin B, Lee LH, Khan K (2010) A review of machine learning algorithms for text-documents classification. J Adv Inform Technol 1(1):4–20. https://doi.org/10.4304/jait.1. 1.4-20 7. Talele P, Phalnikar R (2021) Classification and prioritisation of software requirements using machine learning—a systematic review. In: 2021 11th international conference on cloud computing, data science & engineering (confluence), pp 912–918. https://doi.org/10.1109/Con fluence51648.2021.9377190 8. Jindal R, Malhotra R, Jain A (2016) Automated classification of security requirements. Presented at the international conference on advances in computing, communications and informatics (ICACCI), pp 2027–2033 9. Haque MA, Abdur Rahman M, Siddik MS (2019) Non-functional requirements classification with feature extraction and machine learning: an empirical study. In: 2019 1st international conference on advances in science, engineering and robotics technology (ICASERT), pp 1–5. https://doi.org/10.1109/ICASERT.2019.8934499 10. Dias Canedo E, Cordeiro Mendes B (2020) Software requirements classification using machine learning algorithms. Entropy 22(9):1057. https://doi.org/10.3390/e22091057 11. Slankas J, Williams L (2013) Automated extraction of non-functional requirements in available documentation. In: 2013 1st international workshop on natural language analysis in software engineering (NaturaLiSE). IEEE, pp 9–16 12. Kurtanovíc Z, Maalej W (2017) Automatically classifying functional and non-functional requirements using supervised machine learning. In: 2017 IEEE 25th international requirements engineering conference (RE). IEEE, pp 490–495 13. Althanoon AAA, Younis YS (2021) Supporting classification of software requirements system using intelligent technologies algorithms. Technium 3(11):32–39 14. Taj S, Arain Q, Memon I, Zubedi A (2019) To apply data mining for classification of crowd sourced software requirements. In: Proceedings of the 2019 8th international conference on software and information engineering (ICSIE ’19). Association for Computing Machinery, New York, NY, USA, pp 42–46. https://doi.org/10.1145/3328833.3328837 15. Quba GY, Al Qaisi H, Althunibat A, AlZu’bi S (2021) Software requirements classification using machine learning algorithm’s. In: 2021 international conference on information technology (ICIT), pp 685–690. https://doi.org/10.1109/ICIT52682.2021.9491688 16. https://www.kaggle.com/datasets/iamsouvik/softwarerequirementsdataset
Real-Time Pedestrian Detection Sagar Kumbari and Basawaraj
Abstract Real-time object detection has become increasingly important. In this exploration we study real-time pedestrian detection on the Raspberry Pi 3B+. The Raspberry Pi, a popular edge device for various applications, is utilized to examine the real-time inference of pedestrian detection models. We primarily pick models based on model size, computational time, model flexibility, and model reliability under various scenarios. EfficientDet was chosen for its fewer parameters, state-of-the-art precision, and scalability. YOLOv7 is the present state of the art in real-time object detection, and YOLOv7-tiny is an optimised, smaller version of YOLOv7. Following that, we select TensorFlow-Lite models that are specifically optimised for edge deployment. Optimizations such as quantization and pruning significantly contribute to the increased efficiency of TensorFlow-Lite models, which achieve 5-6 FPS on the Raspberry Pi. While YOLOv7-tiny achieved higher accuracy than the TensorFlow-Lite models, it did so at a slower rate of inference. EfficientDet, which was originally trained in TensorFlow, is more computationally expensive than the previous models but has higher accuracy. Keywords Deep learning · Quantization · Pruning · Computation cost · TensorFlow-Lite · Edge AI · FLOPs · Number of parameters · Convolutional neural networks
S. Kumbari (B) · Basawaraj KLE Technological University, Hubballi, Karnataka, India e-mail: [email protected] URL: http://kletech.ac.in Basawaraj e-mail: [email protected] URL: http://kletech.ac.in © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_34
1 Introduction Real-time pedestrian detection is a critical task in computer vision, as it involves detecting the presence of pedestrians in video streams or images in real time. The challenges of this task include high variability in the appearance, pose, and motion of pedestrians, as well as the presence of occlusions, clutter, and other objects in the scene. To address these challenges, there are several approaches to real-time pedestrian detection, including Haar cascades, HOG (histogram of oriented gradients), and deep learning models. Haar cascades are machine learning models that can be trained to detect specific patterns in images, but they are not highly accurate. HOG, on the other hand, uses features such as edge information and texture to train a classifier. Convolutional Neural Networks (CNNs) have achieved the best results in pedestrian detection and can learn to detect pedestrians from large amounts of labelled data. Deep learning models have shown excellent results by incorporating CNNs. One of the major challenges in using deep learning for pedestrian detection is obtaining sufficient amounts of high-quality training data. This can be mitigated by using data augmentation techniques such as cropping, resizing, and rotating the images to increase the number of training examples. Despite the excellent results achieved by deep learning models, they can be computationally intensive. This can be particularly challenging in real-time pedestrian detection systems, where processing speed is a critical factor. To address this, researchers have developed lightweight models. Additionally, some approaches are exploring the use of transfer learning and pre-training on large datasets to reduce the amount of computational resources required to train the model from scratch. Deep learning models have shown promising results in various applications, especially in image recognition, natural language processing, and predictive analytics. Edge devices, being positioned at the edge of a network, are capable of performing these tasks without relying on a central server. This makes them ideal for applications that require low latency or real-time processing, as data does not have to be transmitted to a central location for processing. However, deploying deep learning models on edge devices poses several challenges, such as limited computational resources, storage capacity, and power availability. The Raspberry Pi is a compact, low-cost single-board computer that has become increasingly popular for a wide range of applications, including as an edge device for running deep learning models. The Raspberry Pi offers several advantages as an edge device, including its low cost, ease of use, and small form factor. In addition, it has a large community of developers and users who have created a wealth of tutorials and resources that can help get started with using the Raspberry Pi for deep learning. TensorFlow, PyTorch, and Keras are three well-known machine learning frameworks that can be used to run deep learning models on the Raspberry Pi. These frameworks make it simpler for developers to begin using deep learning on the Raspberry Pi by offering high-level APIs for creating and refining deep learning models. The Raspberry Pi may not be appropriate for running big or complicated deep learning models due to its limited computational capabilities when compared to more powerful devices.
To get around this, you might need to optimize your deep learning models for the Raspberry Pi’s restricted resources by using methods like model pruning and quantization.
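As a hedged sketch of how a converted detection model might be run on the Raspberry Pi (not the exact setup used in this work; the model file name and output tensor layout are assumptions that depend on the chosen model):

import numpy as np
from tflite_runtime.interpreter import Interpreter  # lightweight runtime for the Raspberry Pi

interpreter = Interpreter(model_path="pedestrian_detector.tflite")  # assumed file name
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

height, width = input_details[0]["shape"][1:3]
frame = np.zeros((1, height, width, 3), dtype=input_details[0]["dtype"])  # placeholder camera frame

interpreter.set_tensor(input_details[0]["index"], frame)
interpreter.invoke()

# For many SSD-style TFLite detectors the outputs are boxes, classes, scores, and count,
# but the exact layout depends on the exported model.
for detail in output_details:
    print(detail["name"], interpreter.get_tensor(detail["index"]).shape)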
1.1 Literature Survey At the dawn of object detection, primarily image-processing-based techniques were in practice, such as the Support Vector Machine (SVM), the Histogram of Oriented Gradients (HOG), the Viola-Jones object detection framework based on Haar features, and the Scale-Invariant Feature Transform (SIFT). The Viola-Jones object detection framework was presented in Paul Viola and Michael Jones' publication [1]; this method finds objects by relying on Haar features to find edges and patterns. SVM [2] and other supervised machine learning techniques promised more precise results than the earlier object detection methods. The performance of these models has been compared in [3–5], which shows that these traditional approaches are not a good choice for object detection because of their meagre performance. New methods then emerged based on neural networks, where features were handpicked and played a crucial role in performance. The first CNN-based model [6] presented an innovative approach to object detection by means of region proposals. The Region-based Convolutional Neural Network comprised a three-step pipeline to classify the object in region proposals, which involved running a CNN over each region proposal and made it slow and inefficient. An improved version of R-CNN [7], which ran the CNN over the full image, reduced computation time and boosted accuracy. Mask R-CNN [8], with results parallel to Faster R-CNN, had the advantage of producing more exact object information. As a result of the desire for models with low computational costs, different approaches [3, 9, 10] were evaluated. Following this, one-shot object detection methods such as YOLO, which detects objects in one single scan, became popular. Further growth in the models led to RetinaNet [5], which improved the accuracy and robustness of one-stage object detectors and outperformed YOLO in terms of detection accuracy thanks to a more complex backbone network. Success in single-stage object detection inspired the development of more related algorithms [3]. The end-to-end trained SSD network picks the class with the highest score and then draws the bounding box. Early neural networks fell short of expectations due to their low accuracy and efficiency, and their inability to learn features on their own contributed to their poor performance in complex circumstances. Deep neural networks solved the problem of manually selecting features and made the training of the networks easier [11]. DenseNet was an early deep neural network used for object detection, known for its dense connectivity pattern and feed-forward connections between layers. The unique design of DenseNet enabled it to learn more intricate features while reducing the issue of vanishing gradients, and it had the benefit of being efficient in terms of parameter count. Deep neural networks have been successful in delivering excellent accuracy and are resistant to object deformations as well as changes in object scale and aspect ratio. Although the dense connectivity pattern increased the accuracy of deep neural networks, it also made them more computationally costly. In order to effectively identify objects in an image, the YOLO (You Only Look Once) object detection framework needs just one forward pass of the neural network. Several YOLO iterations are enhanced versions of the earlier methods.
YOLO is better equipped for processing images in real time. One of the enhancements made by YOLOv3 [9] over YOLOv2 was Darknet-53, a 53-layer CNN that employs anchor boxes with three distinct scales and three different aspect ratios. In the midst of all of this, object detection models were moving in two different directions: one intended to provide high accuracy, while the other aimed to fit real-time scenarios by providing rapid inference. To support real-time applications, Google developed MobileNet [12], a lightweight network designed specifically for mobile devices and embedded vision applications with limited processing power; what matters most is how effectively MobileNet uses parameters for computation. The algorithms in [5, 12] were developed using YOLO or SSD to generate predictions with the MobileNet architecture as a backbone. Both MobileNet-YOLO and MobileNet-SSD were designed to have fewer parameters and lower computing costs: MobileNet is employed as the feature extractor and YOLO or SSD as the prediction head, and the combination of the two significantly improves precision and inference speed. The model in [13] is effective in terms of memory use and processing and was likewise created particularly for mobile devices; it mainly uses point-wise group convolution and depth-wise separable convolution to decrease computing costs while increasing accuracy. The shift from classical models to neural networks, and then to deep neural networks, considerably increased accuracy while lowering computation costs and parameter requirements. The models in [9, 10] demonstrated that inference speed can be significantly increased while maintaining a comparable level of accuracy. Optimised versions of existing models, known as TensorFlow-Lite models, are the best choices for deployment in settings where resources are limited. Currently, the two main optimizations of TensorFlow-Lite models are quantization and weight pruning.
2 Implementation Details In order to select the most suitable models to test on the Raspberry Pi, the following parameters were considered: (1) a small number of parameters, (2) a small model size, (3) low computation cost, and (4) faster inference with comparable accuracy. The following models were selected on this basis. EfficientDet: EfficientDet by Tan et al. [14] is a popular object detection model among [3, 5, 8, 15] that has better accuracy and a smaller number of FLOPs. BiFPN and a cutting-edge compound scaling technique are the two key architectural advancements in EfficientDet. Compared to [15–17], BiFPN provides an efficient method for aggregating multi-scale features from the top down. PANet [17] had the best feature fusion technique compared to [15, 16] and provided superior accuracy; however, PANet had a high computational cost and required more parameters. BiFPN was developed using two optimization techniques: first, deleting input nodes with only one input edge and no feature fusion, since they contribute less to the feature network; and second, adding an additional edge from the original input to the output node if their levels are equal, in order to reduce costs.
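Returning to the selection criteria above, the parameter count and an approximate in-memory size of a candidate network can be checked with a few lines of TensorFlow; the sketch below uses MobileNetV2 from keras.applications purely as a stand-in model, not one of the detectors evaluated here.

import tensorflow as tf

# Stand-in backbone; any tf.keras model of interest could be inspected the same way.
model = tf.keras.applications.MobileNetV2(weights=None)

print("Trainable parameters:", model.count_params())
# Rough float32 footprint: 4 bytes per parameter.
print("Approx. float32 size (MB):", model.count_params() * 4 / 1e6)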
Fig. 1 Bi-FPN architecture and results compared to others
Compound scaling is a method applied to baseline models to scale them to different scenarios. The compound scaling applied in EfficientDet was mainly inspired by EfficientNet [18] and scales the backbone, the prediction/box network and the BiFPN. EfficientDet-D0, the baseline and smallest model in the EfficientDet family, has an AP of 34.6 with 3.9M parameters and 2.9B FLOPs, while YOLOv3 [9] needs 71B FLOPs to reach 33 AP. EfficientDet-D4 has 49.7 AP with 21M parameters and 55B FLOPs, whereas the Detectron2 Mask R-CNN X152 [8] needs considerably more parameters and FLOPs for a comparable AP. EfficientDet-D7 reaches 53.7 AP with 52M parameters and 325B FLOPs. Regarding individual contributions, BiFPN achieves 44.39 AP with a 0.88x parameter ratio and a 0.68x FLOPs ratio, while the other feature fusion techniques [15-17] reach 42 to 44 AP but with 1x to 1.24x parameter and FLOPs ratios. The weighted BiFPN therefore achieves the best accuracy with fewer parameters and FLOPs, and as EfficientDet is scaled up, the parameters and FLOPs increase and the accuracy improves (Fig. 1). YOLOv7 tiny: YOLOv7 mainly focussed on optimised modules and optimization techniques that may increase training cost without raising inference cost, thereby increasing object detection accuracy. Its developers designed a novel trainable bag-of-freebies method to dramatically increase real-time object detection accuracy without raising inference cost. The main issues the authors identified in evolving object detection models were (1) how original modules are replaced with the re-parametrized modules mentioned in [19-21], and (2) how a dynamic label-assignment approach handles assignment to the various output layers. The following qualities are typically necessary for an object detector to be considered state-of-the-art and real-time: (1) a quicker and more robust network architecture, (2) more efficient feature integration techniques, (3) more precise detection
techniques, (4) a stronger loss function, (5) a more effective way of label assignment, and (6) a better training strategy. To address these issues the authors created a novel trainable bag-of-freebies approach: a more effective training method, a more robust loss function, better label assignment, and so forth. E-ELAN and a new model scaling method are the two key modifications introduced in the YOLOv7 network. The three cardinal operations that E-ELAN employs to maintain the original gradient path while enhancing the network's learning capacity are expand, shuffle and merge; E-ELAN helps the grouped computational blocks learn more varied properties while keeping the original ELAN structure. YOLOv7's proposed model scaling strategy preserves the qualities and optimal structure of the model's initial design. The trainable bag of freebies in YOLOv7 comprises planned re-parametrized convolution and coarse-to-fine loss assignment (coarse for the auxiliary head, fine for the lead head), with the goal of enhancing training to increase detection accuracy. YOLOv7 tiny is a basic model optimised for edge GPUs, edge AI and deep learning workloads, that is, for lighter-weight ML on computing or edge devices. The number of parameters in a real-time object detection model is crucial: YOLOv7 has 36.9 million parameters, whereas YOLOv7 tiny has only 6.2 million, making it suitable for real-time use. In contrast to YOLOv7, which needs 104.7G FLOPs, YOLOv7 tiny needs only 13.8G FLOPs. YOLOv7 reached 161 frames per second on a Tesla V100, whereas YOLOv7 tiny accomplished 286. Compared to YOLOv4 there is a 50% decrease in parameters, a 15% decrease in FLOPs and nearly 1% better accuracy, and the model can be trained much faster on small datasets without any pre-trained weights. When compared with other object detection algorithms, YOLOv7 performs well; on the well-known COCO dataset it achieves an average precision of 37.2% at an IoU threshold of 0.5. YOLOv7 obtains higher detection accuracy than YOLOv7 tiny: YOLOv7's AP on the test set is 69.7%, whereas YOLOv7-tiny reaches 56.7%. The method suggested in YOLOv7 reduces parameters by about 40% and computation by about 50% while detecting objects more accurately and with better inference speed. YOLOv7 tiny, as an optimised version of YOLOv7, is therefore an excellent option for real-time object recognition on edge devices. TensorFlow-Lite: TensorFlow-Lite models are specially prepared to work in resource-restricted settings, lowering the model size while delivering comparable results; they are more compact, effective and space-efficient than the original models, which makes them well suited to deployment on edge devices. Quantization reduces the precision of the values used to describe a model's parameters, which are 32-bit floating point numbers by default; quantizing computations reduces their size and number and so enables faster inference. TensorFlow-Lite is adaptable to different circumstances because it provides a range of quantization strategies: post-training quantization reduces the model size by about 50%, and quantization-aware training by about 75% (Figs. 2 and 3).
Fig. 2 TensorFlow-Lite converter
Fig. 3 Quantization types
Pruning entails removing model parameters that have only a minor impact on predictions. Pruned models have the same runtime latency and disc size, but they can be compressed more effectively. The TensorFlow-Lite EfficientDet model is approximately half the size of the original EfficientDet model, making it an excellent candidate for real-time pedestrian detection. The TensorFlow-Lite versions used on the Raspberry Pi are EfficientDet-Lite and MobileNet v2; they outperform all other object detection models in terms of inference speed, with comparably lower accuracy.
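As a hedged illustration of the post-training quantization discussed above, the snippet below converts a saved TensorFlow model into a TensorFlow-Lite model with default optimizations. The SavedModel path and output file name are hypothetical, and the representative-dataset calibration needed for full integer quantization is omitted.

```python
import tensorflow as tf

# Load a trained detector from a SavedModel directory (path is hypothetical).
converter = tf.lite.TFLiteConverter.from_saved_model("exported_models/efficientdet_lite")

# Post-training quantization: default optimizations quantize the weights,
# roughly halving the model size as described above.
converter.optimizations = [tf.lite.Optimize.DEFAULT]

tflite_model = converter.convert()
with open("efficientdet_lite_quant.tflite", "wb") as f:
    f.write(tflite_model)
```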
3 Results and Discussions The Raspberry Pi 3 B+ is the edge device on which the above-mentioned deep learning models were tested, performing pedestrian detection inference on video from a USB camera. Python is a high-level language that runs on top of the machine, which makes the inference slightly slower. The inference was conducted on the Raspberry Pi 3 B+ for each model under four conditions:
• Inference in normal light conditions.
• Inference in low-light conditions.
• Inference with a pedestrian within 2 m.
• The frame rate at which the video is received.
In the aforementioned conditions, a USB camera collects the real-time video feed needed for the inference. The findings are displayed on a screen, and the probability assigned to the pedestrian in the video is recorded under each condition. The output video is shown in the screenshots below; each model takes frames from the video, runs inference on every frame, and reliably identifies pedestrians in the video stream with a certain probability (Fig. 4). Table 1 shows the results of each model for low-light accuracy, high-light accuracy, accuracy within 2 m, and the FPS at which the video is processed (Figs. 5 and 6).
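A minimal sketch of the kind of inference loop run on the Raspberry Pi is shown below, assuming the tflite-runtime and OpenCV packages are installed. The model file name is hypothetical and the output parsing depends on how the detector was exported, so this is illustrative rather than the authors' actual script.

```python
import cv2
import numpy as np
from tflite_runtime.interpreter import Interpreter

interpreter = Interpreter(model_path="efficientdet_lite_quant.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
_, height, width, _ = input_details[0]["shape"]

cap = cv2.VideoCapture(0)  # USB camera
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    resized = cv2.resize(frame, (width, height))
    input_data = np.expand_dims(resized, axis=0)
    if input_details[0]["dtype"] == np.float32:
        input_data = input_data.astype(np.float32) / 255.0
    interpreter.set_tensor(input_details[0]["index"], input_data)
    interpreter.invoke()
    # Output layout depends on the exported model; only the first tensor is read here.
    detections = interpreter.get_tensor(output_details[0]["index"])
    print(detections.shape)
cap.release()
```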
Fig. 4 Results of the models
Table 1 Detection probability under mentioned conditions
Model              Low-light accuracy   High-light accuracy   FPS      Accuracy within 2 m
EfficientDet-Lite  55-60                72-80                 2-2.5    82-85
YOLOv7-tiny        55-65                90-95                 1-2      95
EfficientDet       60-65                80-90                 0.5      90
MobileNet-V2       65-70                80-83                 4.5-5    83
Fig. 5 Pedestrian detection on test images
Fig. 6 Images of Pedestrian detection
4 Conclusion Real-time pedestrian detection has been a prominent focus of computer vision research. Recently, TensorFlow-Lite models and models optimized for specific hardware have become more reliable and efficient in real time. EfficientDet and MobileNet are originally trained in TensorFlow, but to meet the resource constraints of a real-time environment they are converted to TensorFlow-Lite models, which perform better on the Raspberry Pi by providing good real-time results and impressive frames per second. TensorFlow-Lite models are Edge AI models because they employ optimization techniques such as quantization and pruning. Compared to these two variants, YOLOv7-tiny produces equivalent results on the Raspberry Pi with reasonable frames per second. Although quantization, pruning and other optimization approaches help lower the model size and the time required for inference, they affect the model's accuracy. Accuracy and efficiency are always a trade-off in real-time object detection models: optimized models have lower accuracy, while the original models provide higher precision at the expense of computation. Upcoming object detection models ought to be more accurate while remaining capable of functioning in real time.
References 1. Viola P, Jones M (2001) Rapid object detection using a boosted cascade of simple features. In: Proceedings of the 2001 IEEE computer society conference on computer vision and pattern recognition. CVPR 2001, vol 1. IEEE, pp I–I (2001) 2. Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20:273–297 3. Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu CY, Berg AC (2016) SSD: single shot multibox detector. In: Computer vision–ECCV 2016: 14th European conference, Amsterdam, The Netherlands, 11–14 Oct 2016, Proceedings, Part I 14, pp. 21–37. Springer 4. Redmon J, Farhadi A (2017) Yolo9000: better, faster, stronger. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7263–7271 5. Lin TY, Goyal P, Girshick R, He K, Dollár P (2017) Focal loss for dense object detection. In: Proceedings of the IEEE international conference on computer vision, pp 2980–2988 6. Girshick R, Donahue J, Darrell T, Malik J (2014) Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 580–587 7. Girshick R (2015) Fast r-CNN. In: Proceedings of the IEEE international conference on computer vision, pp 1440–1448 8. He K, Gkioxari G, Dollár P, Girshick R (2017) Mask r-CNN. In: Proceedings of the IEEE international conference on computer vision, pp 2961–2969 9. Redmon J, Farhadi A (2018) Yolov3: an incremental improvement. arXiv:1804.02767 10. Wang CY, Bochkovskiy A, Liao HYM (2022) Yolov7: trainable bag-of-freebies sets new stateof-the-art for real-time object detectors. arXiv:2207.02696 11. Huang G, Liu Z, Van Der Maaten L, Weinberger KQ (2017) Densely connected convolutional networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4700–4708 12. Howard AG, Zhu M, Chen B, Kalenichenko D, Wang W, Weyand T, Andreetto M, Adam H (2017) Mobilenets: efficient convolutional neural networks for mobile vision applications. arXiv:1704.04861 13. Zhang X, Zhou X, Lin M, Sun J (2018) Shufflenet: an extremely efficient convolutional neural network for mobile devices. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 6848–6856 14. Tan M, Pang R, Le QV (2020) Efficientdet: scalable and efficient object detection. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 10781–10790 15. Lin TY, Dollár P, Girshick R, He K, Hariharan B, Belongie S (2017) Feature pyramid networks for object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2117–2125 16. Ghiasi G, Lin TY, Le QV (2019) NAS-FPN: learning scalable feature pyramid architecture for object detection. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 7036–7045 17. Liu S, Qi L, Qin H, Shi J, Jia J (2018) Path aggregation network for instance segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 8759–8768 18. Tan M, Le Q (2019) Efficientnet: rethinking model scaling for convolutional neural networks. In: International conference on machine learning. PMLR, pp 6105–6114 19. Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z (2016) Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2818–2826 20. 
Huang G, Li Y, Pleiss G, Liu Z, Hopcroft JE, Weinberger KQ (2017) Snapshot ensembles: train 1, get m for free. arXiv:1704.00109 21. Tarvainen A, Valpola H (2017) Mean teachers are better role models: weight-averaged consistency targets improve semi-supervised deep learning results. Adv Neural Inf Process Syst 30
Android Based Mobile Application for Hockey Tournament Management System K. Selvamani , H. Riasudheen, S. Kanimozhi, and S. Murugavalli
Abstract In today's world, technologies have advanced in mobile applications. Mobile applications are growing more rapidly than any other kind of application and are marching towards the end user with a rich and fast access experience. This paper aims at developing a smart mobile application named Hockey Tournament Management System (HTMS) for India's national game, field hockey, which provides information about the game for sports people and also supports conducting tournaments through the mobile application. In this mobile application, users will be able to learn the complete rules and history of hockey. The proposed application lets users find various details of hockey tournaments, such as information on all the matches and where the tournaments are being conducted. The user will receive notifications and timely updates about hockey tournaments conducted anywhere in the world. The Tournament Application Administrator (TAA) maintains the entire information about the hockey tournaments, and end users can register and see this information with a valid payment token. The developed Hockey Tournament Management System mobile app reduces a lot of time and effort. Keywords Technology · Mobile applications · Management system · Tournament
K. Selvamani (B) · H. Riasudheen Anna University, CEG Campus, Chennai 600025, India e-mail: [email protected]
S. Kanimozhi · S. Murugavalli Panimalar Engineering College, Chennai 600029, India e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_35
1 Introduction A mobile application is a piece of software designed to run on a mobile device. Almost all mobile devices come with several mobile applications from the Play Store or App Store that are downloaded and installed on them. Most mobile manufacturers market their phones at a high price based on the speed
of their devices. Devices sold at a lower price have less memory space, and because of that some pre-installed apps may be uninstalled to increase storage. Devices that do not contain pre-installed apps obtain them from well-known platforms called app stores, such as the Google Play Store, which most end users prefer for installing the necessary apps on their Android devices. Public demand and the developer tools available for mobile apps have increased market demand, and as a result mobile users have grown tremendously throughout the world. Moreover, the availability of affordable smartphones and of low-cost, high-speed internet from ISPs has attracted young people to develop their own mobile apps according to their knowledge and needs. The proliferation of smartphones has brought a big revolution in mobile application development among Information Technology (IT) professionals. Since the Software Development Kit is available as open source, developers create their own applications with it and are provided with a suitable place, the Google Play Store, to publish their creations. Mobile devices mainly depend on a battery to run features such as location detection and cameras, and app developers face constraints such as a wide array of screen sizes and hardware specifications because of intense competition in mobile software. This rapid development in mobile applications has induced IT-sector people to develop mobile apps for sports such as cricket, football and hockey. The main constraints considered in mobile UI design are context, screen, input and mobility; the user's focus is on interaction with the mobile device, and the interface entails both software and hardware components. Hence, this research paper aims to develop an intelligent mobile app system for sports people related to hockey. Hockey is India's national game, but it has not received much attention or priority among the young minds of this society. Hence, there is a need to develop an Android-based mobile application for a Hockey Tournament Management System dedicated to hockey; there is no such application in the Google Play Store, and the applications that are available cover only international field hockey news, training skills and match schedules. In the proposed mobile application, the user gains knowledge of the game of hockey. The application includes important information such as where the game was first played, its evolution, the rules and regulations of hockey, training details, field details and equipment information, and it satisfies all the needs and requirements of a hockey tournament management system. The key idea behind the proposed system is that the user can easily view documents on current rules and regulations, international tournament fixtures, academy and coach details at the national level, state-level hockey information in PDF format and the hockey federations' official web sites. The application provides secured login to the match organizer every time, along with secured registration backed by a secured online payment. Match
organizers can access the application in a secured way, logging in each time using a One Time Password (OTP). The match organizer can then maintain the tournament details, such as updating tournament information, match updates, the team list and the player list, and can access the application only until the tournament end date. The application also offers automatic match scheduling [1]: once the team details are entered, it generates the fixture automatically with dates and days using the scheduling algorithm, and finally generates match reports automatically. Users can easily register in the application and obtain interesting information about the game of hockey. The user can view tournament details such as where tournaments are conducted, how many teams participate, how many days the matches will be played and the player details of each participating team, along with live text commentary of matches and daily match updates, and can locate the national hockey turf stadiums in India via Google Maps. This application is very useful for organizing and managing hockey tournaments [2] in a technology-driven manner; it reduces manual work and paperwork and helps maintain records systematically.
2 Related Work There are various papers related to different sports tournaments; a few of them have been surveyed in this literature review in order to obtain proper clarity on this hockey tournament system. Confidentiality, Integrity and Authentication (CIA) are applied in all sorts of applications to provide and ensure the security of a system, and the secured One-Time Password mechanism is applied to mobile applications when developing policies, procedures, plans and algorithms. Authentication, the first line of defense in security systems, is a primary target of information security, especially in the Internet of Things. The Internet of Things (IoT) lets people connect easily, and this ease of connectivity leads to more vulnerabilities and threats to the authentication, integrity and confidentiality of data on mobile devices. Traditional methods such as constant passwords and keys make systems more vulnerable than those that use dynamic passwords and keys. In recent years, many attacks exploiting authentication techniques on computer devices have been ransomware attacks: attackers first use password cracking techniques to guess the victim's password and, after obtaining it, inject a Trojan horse to encrypt the victim's information. If a user uses dynamically changing passwords, however, they are very hard to break or guess. A new secure one-time password algorithm for mobile applications was developed and presented by Elganzoury et al. [3]; in their proposal, one-time password generators fed with fixed data are shown to be predictable. It has also been suggested that children and young people
need to develop their sports activity to maintain good physical fitness and to develop their social and communication skills. Januario et al. [4] applied tournament rules for league matches (round-robin), play-offs, and grouping with play-offs. The three phases of hosting a sports tournament, namely pre-event, event and post-event, primarily aim to register as many participants as possible. Such a system needs a registration step to join the tournament, which can be publicised through social media such as Facebook and Instagram; hence, an organizer needs a lot of resources to manage the registration process. Once registration is done, the organizer creates the fixtures needed to schedule matches for the various registered teams. This type of development therefore allows building a system that lets the organizer manage the tournament [5]. Single round-robin schedules for sports competitions are usually built by the canonical method [6], which may entrap local search procedures because of certain structural properties. Studies of the most commonly used neighborhood structures in local search heuristics for single round-robin scheduling characterize the conditions under which this entrapment happens [7, 8]; de Werra et al. derived a characterization that relates the connectivity of the analysed neighborhood to the riffle shuffle of playing cards. Razamin et al. [9] proposed a priority-based, multi-objective evolutionary algorithm for fixture determination in sports such as football. Their multi-objective approach produces a range of different fixtures for the identified problem, each varying the trade-offs between objectives to a different degree, giving the organizing authority the ability to explore different options and choose the one that best suits its requirements; their experimental results show that the multi-objective approach can evolve solutions that strictly dominate the existing fixture. Soong presented a mechanism to construct tournament schedules under various criteria with the necessary regulations and policies; the results assist coaches and sports event management through efficient schedules that minimize dissatisfaction. In addition, a new authentication mechanism for online payment using mobile OTP was proposed to provide much greater security [10, 11]. Similarly, developing a sporting schedule is often difficult because of the many conflicting factors to be considered, such as travel time, cost, fairness between teams, stadium availability and the allocation of games. Hence, as presented by Craig et al. [12], sports scheduling needs to be explored further using a repair function that allows exploring the infeasible search space without the risk of the algorithm becoming stuck in the infeasible region.
3 Proposed Work The proposed system architecture for the hockey tournament management system is shown in Fig. 1. The architecture has various modules that are integrated to obtain an effective hockey tournament management system for the mobile application. In this architecture, tournament registration is done by the user, and online payment is made via a secured payment gateway on the mobile device using a checksum method together with OTP-based verification. Once registration is successful, the match organizer is provided with login credentials for secured login, verified through Firebase e-mail authentication. The match organizer can then use the application to maintain tournament details such as match updates, the team list and the player list. The system also provides an automatic match scheduling feature: once the match organizer enters the team details, the app generates the fixture automatically with the appropriate date and time slot using the scheduling algorithm. Tournament and user details are stored in Firebase, and the data is retrieved whenever required. Users can easily register in this mobile application and obtain information about the game of hockey, which helps them learn the details of various matches early. Every match is updated promptly in the mobile application, and match results are sent to users through push notifications.
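For illustration, a time-based one-time password (in the style of RFC 6238) can be generated and checked as sketched below. This is a generic example of the OTP idea, not the exact checksum/OTP mechanism of the payment gateway or Firebase used in the proposed app; the shared secret shown is made up.

```python
import base64
import hashlib
import hmac
import struct
import time

def totp(secret_b32: str, interval: int = 30, digits: int = 6) -> str:
    """Generate a time-based one-time password from a base32 shared secret."""
    key = base64.b32decode(secret_b32, casefold=True)
    counter = int(time.time()) // interval
    msg = struct.pack(">Q", counter)
    digest = hmac.new(key, msg, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % (10 ** digits)).zfill(digits)

def verify(secret_b32: str, submitted: str) -> bool:
    """Check the code the organizer typed in against the current OTP."""
    return hmac.compare_digest(totp(secret_b32), submitted)

# Example with a made-up secret shared between the server and the organizer's device.
secret = "JBSWY3DPEHPK3PXP"
print(totp(secret))
```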
Fig. 1 The overall architecture of hockey tournament management system
3.1 Overall Flow Diagram This section describes the complete workflow of the proposed mobile application, which has two roles, namely (1) match organizer and (2) end user. Figure 2 depicts the process flow diagram for the match organizer. The match organizer first registers his or her details in the application with a secured online payment and obtains the authentication and authorization needed for conducting the tournament. The match organizer then logs in to the account and adds all the information about the tournaments, match details, team and player information, the automatically generated match fixtures and live text commentary. All of this information is stored in and retrieved from the Firebase real-time database and is then placed in the users' accounts; users receive notifications for daily match updates and match reminders from the application. Figure 3 shows the process flow diagram for user registration and login. End users first register their details and obtain an account in the application. They then log in to view the tournaments, match details, team and player information, daily match updates delivered via automatic push notifications to their mobiles, and match results, which they can share on social media. Users are also able to view the tournament venues throughout the nation on Google Maps.
Fig. 2 The process flow diagram for match organizer
Fig. 3 The process flow diagram for user registration and logging
3.2 Round Robin Algorithm Algorithm 1 generates the automatic fixture schedule using the round-robin scheduling algorithm: it takes the number of participating teams and produces the match fixtures in a league-based system. Algorithm 1. Round Robin Algorithm Input: Number of teams Output: Automatically scheduled fixtures
1. Get the number of teams
2. totalRounds = (teams - 1) * 2
3. matchesPerRound = teams / 2
4. String[][] rounds = new String[totalRounds][matchesPerRound]
5. for round = 0; round < totalRounds; round++ do
6.   for match = 0; match < matchesPerRound; match++ do
7.     home = (round + match) % (teams - 1)
8.     away = (teams - 1 - match + round) % (teams - 1)
9.     if match == 0 then
10.      away = teams - 1
11.    end if
12.    rounds[round][match] = (home + 1) + " vs " + (away + 1)
13.  end for
14. end for
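As an illustration, Algorithm 1 can also be written as a short, runnable function, as sketched below. Team names and the pairing format are illustrative; the sketch produces a single round robin (a double round robin, as in Algorithm 1, would simply repeat the rounds).

```python
def round_robin_fixtures(teams):
    """Generate a single round-robin schedule using the circle method.

    teams: list of team names (an odd count gets a 'BYE' team added).
    Returns a list of rounds, each a list of (home, away) pairs.
    """
    teams = list(teams)
    if len(teams) % 2 == 1:
        teams.append("BYE")
    n = len(teams)
    rounds = []
    for r in range(n - 1):
        matches = []
        for m in range(n // 2):
            home = (r + m) % (n - 1)
            away = (n - 1 - m + r) % (n - 1)
            if m == 0:          # one team stays fixed in the circle method
                away = n - 1
            matches.append((teams[home], teams[away]))
        rounds.append(matches)
    return rounds

# Example usage with four hypothetical teams.
for i, rnd in enumerate(round_robin_fixtures(["Team A", "Team B", "Team C", "Team D"]), 1):
    print(f"Round {i}: {rnd}")
```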
Fig. 4 Organizers login and registration page
4 Result Screen Figure 4 shows the result screens of the implemented mobile application for the match organizer's login and registration page. Organizers register in the application with an online payment through a secured payment gateway. Once the registration is successfully completed, the system authorizes the organizer with user credentials for accessing the account. Figure 5 shows the result screens of the match organizer's account, where organizers can update profiles and enter all the tournament information, such as tournament details, the team list, player details, fixtures, daily match updates and push notifications, live match updates and results. Figure 6 shows the result screens for player details and the static and dynamic map views in the user's account, in which the user can view the match venues and field hockey turf stadiums available in India on Google Maps.
Fig. 5 Organizers tournament details and update page
Fig. 6 Player details with static and dynamic map stadium locations and venue
5 Conclusion This research achieves all the objectives of the proposed work, providing complete information about hockey tournaments around the world with easy and fast access, including the hockey rules and regulations. In addition, all-India turf stadium locations can be viewed on Google Maps, and the national and international field hockey federations' web links and news are provided. The proposed work offers secured organizer and user registration via One Time
Password authentication handled in a secure way. The developed hockey mobile application generates match fixtures automatically by implementing a round-robin algorithm, which minimizes time as well as manual work. The implemented system has a good impact on hockey communities, as it lets them manage tournaments easily. Further improvements of this work could apply Artificial Intelligence to calculate the ball position, circle penetration, the number of penalty corners, etc.
References 1. Shivaramkrishna S, Jain K, Nagda H, Chitroda D, Jain N (2015) Study on tournament management system. IEEE J Sel Areas Commun 2(5), Issue 7 2. Tournament Scheduling. https://nrich.maths.org/1443 3. Elganzoury HS, Abdelhafez AA, Hegazy AA (2018) A new secure one-time password algorithm for mobile application. In: The proceeding of 35th national radio science conference (NRSC), pp 249–257. IEEE 4. Januario T, Urrutia S, de Werra D (2016) Sports scheduling search space connectivity: a riffle shuffle driven approach. J Discrete Appl Math 211:113–120 5. Tumiran DF, Amin IM (2017) Sports tournament management system. In: The UTM computing proceedings innovation in computing technology and applications, vol 2, issue 2 6. Lee YS, Kim NH, Lim H, Jo H, Lee HJ (2010) Online banking authentication system using mobile-OTP with QR-code. In: The 5th international conference on computer sciences and convergence information technology, pp 644–648. IEEE 7. Hung JC, Chien MK, Yen NY (2011) Intelligent optimization scheduling algorithm for professional sports games. In: The proceeding of fourth international conference on Ubi-media computing, pp 285–289. IEEE 8. Goossens D, Spieksma FCR (2011) Sports scheduling with generalized breaks. In: 2011 IEEE Symposium on the computational intelligence in scheduling (SCIS), Paris, vol 11, pp 54–57 9. Razamin R, Soong CJ, Haslinda I (2012) A sports tournament scheduling problem: exploiting constraint-based algorithm and neighborhood search for the solution. In: The Proceedings of the world congress on engineering, vol 1 10. Xin Y, La-yuan L (2007) Function extension of NS2 network simulator and its implementation. J Wuhan Univ Technol 43(13):127–129 11. Ye XJ, Wu GX (2002) Analysis and improvement of OTP authentication technology. Comput Eng 26:27–29 12. Craig S, While L, Barone L (2009) Scheduling for the national hockey league using a multiobjective evolutionary algorithm. In: The proceeding of Australasian joint conference on artificial intelligence, pp 381–390. Springer 13. Maierhofer R, Pistell D, Permison J (2005) Sports club creation, management, and operation system and methods therefor. Google Patents, 27 Jan 2005, US Patent App, 10/700, 564
Hybrid Intrusion Detection System Using Autoencoders and Snort Yudhir Gala , Nisha Vanjari , Dharm Doshi, and Inshiya Radhanpurwala
Abstract Intrusion Detection Systems (IDS) are increasingly crucial in the modern digital environment due to security risks and cyberattacks. Cyberattacks are deliberate attempts to damage computers, steal data, or disrupt operations, and these systems analyze network traffic to identify security threats and detect such attacks. An IDS detects cyberattacks via signature-based and anomaly-based detection: signature-based detection uses predetermined patterns or signatures to identify known threats, while anomaly-based detection analyses aberrations in network activity. This study proposes a hybrid intrusion detection system that uses signature-based detection to find known threats and anomaly-based detection to uncover new ones; hybrid strategies try to maximize the benefits of several techniques while minimizing their downsides. In the following study, an anomaly IDS is built by preprocessing the CICIDS2017 dataset and reducing its dimensions using autoencoders, then training classification models using Random Forest and Light GBM. In a second solution, an anomaly IDS uses an autoencoder trained only on regular packets to detect anomalies by calculating an anomaly threshold. The paper also introduces the proposed Hybrid-IDS design combining Snort3, Redis and the Elastic Stack. Finally, the Random Forest and Light GBM classification models and the deep stacked autoencoder for anomaly detection are evaluated. Keywords Hybrid intrusion detection · Snort · Autoencoder · Network defense
Y. Gala (B) · N. Vanjari · D. Doshi · I. Radhanpurwala K. J. Somaiya Institute of Engineering and Information Technology, Mumbai, India e-mail: [email protected] N. Vanjari e-mail: [email protected] D. Doshi e-mail: [email protected] I. Radhanpurwala e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_36
1 Introduction The Internet has changed how we work, live and interact; connectivity, information, e-commerce, education, social contact and more make it a vital instrument that has transformed society. Today's digital age also sees more cyberattacks: as more people, corporations and governments rely on technology, cybercriminals have more potential to do harm, and cyberattacks have increased significantly, damaging individuals, corporations and governments alike. Vigilance is required to avoid these attacks. Intrusion Detection Systems help prevent cyberattacks by detecting abnormal network traffic and notifying administrators of potential security issues. IDS detection methods are signature-based and anomaly-based. This work proposes a Hybrid-IDS that combines signature-based and anomaly-based detection algorithms to overcome their individual shortcomings. Hybrid IDSs provide many advantages over single-technique systems, including enhanced precision, lower false-positive rates and the ability to identify more threats; they respond better to changing environments and security issues and can improve intrusion detection and computer network security by combining the best of both methods. The suggested system relies on an autoencoder, a neural network used for unsupervised learning that is trained to reconstruct its inputs and thereby learn a compact description of the data. It consists of an encoder that converts the input data to a lower-dimensional bottleneck or latent representation and a decoder that maps the bottleneck representation back to the input data; dimensionality reduction, anomaly detection and generative modelling are some of its known applications. The paper is organized as follows: Sect. 2 reviews the network intrusion detection literature, Sect. 3 describes the anomaly IDS methods, Sect. 4 discusses the system design and tools, Sect. 5 analyses each model's performance, and Sect. 6 concludes the research.
2 Literature Review Yang et al. [1] presented Griffin, a real-time network intrusion detection system using autoencoders in SDNs. In this system, flow-level statistical properties were extracted from collected packets and mapped to sub-instances of reduced feature scale using hierarchical clustering; an autoencoder ensemble was trained on these sub-instances and the RMSE of reconstruction was then used to classify intrusions. The system was trained using Mirai active-wiretap data. The publications [2-4] examine how different autoencoders can be used for network intrusion detection. Gharib et al. [4] suggested AutoIDS, a semi-supervised deep learning algorithm for accurately detecting unusual traffic; AutoIDS uses two Auto-Encoders (AEs) in cascade to detect communication network irregularities, deciding on incoming network flows in two phases with a different detector in each. The model was exhaustively tested on the NSL-KDD dataset. Mebawondu et al. [5] examined the Naive Bayes and C4.5 decision tree algorithms for network intrusion detection, using the Gain Ratio algorithm to select relevant
features to maximise the IDS performance; Naive Bayes reached an accuracy of 74.76% and the C4.5 decision tree 89.27% on UNSW-NB15. The research in [6] presented a lightweight deep neural network (LNN) for NIDS that fully extracts data features while reducing computing cost by expanding and compressing feature maps, and additionally employed an inverse residual structure and a channel shuffle operation to improve training; on UNSW-NB15 the suggested model produced an accuracy of 97.54%, an F1-score of 97.23%, a precision of 98.47% and a recall of 96.02%. Chen et al. [7] detected intrusions with deep learning, examining back-propagation neural network technology and proposing improved solutions to specific BP-NN difficulties; tests show that BP-NN-based intrusion detection reduces false alarms and improves accuracy. The study [8] proposed a model that uses LSTM to recognise binary and multiple classes on the NSL-KDD dataset with high accuracy and precision. Hou et al. [9] proposed a hierarchical LSTM model that can learn from several temporal hierarchy stages and handle complex data flow sequences. Amutha et al. [10] suggested a deep RNN based on the LSTM algorithm that achieved maximum accuracy with fewer iterations and less memory use than comparable techniques. Chiba et al. [11] proposed a Suricata-Isolation Forest hybrid network intrusion detection framework in which unsupervised learning requires fewer samples and the combined signature- and anomaly-based IDS guards against both internal and external network attacks. Vinayakumar et al. [12] suggested SHIA to automatically alert network managers of malicious attempts; the proposed deep neural network model outperforms machine learning classifiers in strict experimental testing on many benchmark IDS datasets. Wisanwanichthan et al. [13] proposed an SVM-Naive Bayes multilayer network-based IDS in which PCA reduces dimensionality and pre-processes two data groups to train both algorithms; it accurately identified rare attack classes such as U2R and R2L in the NSL-KDD dataset.
3 Methodology 3.1 Anomaly IDS Using Autoencoders for Dimensionality Reduction Dataset Collection: CICIDS2017. CICIDS2017 is a publicly available dataset of simulated network traffic that includes both benign and malicious flows; the normal and aberrant traffic patterns cover DoS, DDoS, reconnaissance and exploitation network attacks [14]. The CICIDS2017 dataset is widely used to evaluate IDS algorithms since it accurately represents real-world network traffic, and several studies have evaluated and compared IDS strategies on it. Data Preprocessing. Data preprocessing in the machine learning pipeline can dramatically affect model performance, so the CICIDS2017 data was preprocessed for modelling. Data cleaning removes or corrects faulty, incomplete or irrelevant records in the dataset, since data quality and consistency must be improved before modelling.
Fig. 1 Autoencoders architecture for dimensionality reduction
Handling infinite and NaN numbers is also necessary, as they can generate modelling issues and erroneous results; the infinity and NaN values were changed to 0. Label encoding converts categorical information into numerical values: categorical features that cannot be used directly in models are assigned a numerical value per category, enabling their use. The MinMax scaler then scaled the data between 0 and 1, which improves model performance, and finally the target labels in the dataset were label encoded to make them numerical. The dataset was divided into a training and a test set in an 80:20 ratio, with the training set used to train and the test set used to evaluate the model. These operations on the CICIDS2017 dataset improved data quality and consistency: the preprocessing procedures handled non-numeric features and missing or erroneous values and prepared the data for modelling. Dimensionality Reduction using Autoencoders. Autoencoders are neural networks that learn a compact representation of the input data, called the encoding, by encoding and decoding the data; they reduce dimensionality while keeping as much information as possible from the input [15]. The network learns a compact encoding of the incoming data and uses it to recreate the original data (decoding); the encoding is usually smaller than the input, with fewer features, giving a lower-dimensional representation. Training is unsupervised and minimises the reconstruction error, and the resulting compressed representation can be used for visualisation, anomaly detection and transfer learning. The suggested encoder has two layers with 64 and 32 neurons, the bottleneck layer has 10 neurons, and the decoder has two layers with 32 and 64 neurons (Fig. 1 and Table 1).
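A compact, hedged sketch of the preprocessing steps described above is given below; the CSV path and the 'Label' column name are assumptions about the CICIDS2017 files, and details may differ from the authors' exact code.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, MinMaxScaler

df = pd.read_csv("cicids2017_combined.csv")          # path is hypothetical
df.columns = df.columns.str.strip()

# Replace infinite and NaN values with 0, as described above.
df = df.replace([np.inf, -np.inf], np.nan).fillna(0)

# Label-encode the target classes and scale the features to [0, 1].
y = LabelEncoder().fit_transform(df["Label"])        # 'Label' column name assumed
X = df.drop(columns=["Label"]).select_dtypes(include=[np.number])
X = MinMaxScaler().fit_transform(X)

# 80:20 train/test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)
print(X_train.shape, X_test.shape)
```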
Table 1 Parameters used for training the autoencoder for dimensionality reduction
Criterion        Value
Epochs           10
Batch size       64
Optimiser        Adam
Loss function    Mean Absolute Error (MAE)
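Based on the architecture and the settings in Table 1 (encoder layers of 64 and 32 neurons, a 10-neuron bottleneck, decoder layers of 32 and 64 neurons, Adam optimiser, MAE loss, 10 epochs, batch size 64), a Keras sketch of the dimensionality-reduction autoencoder could look as follows. The activation functions are assumptions, and X_train/X_test refer to the preprocessed data from the earlier sketch.

```python
from tensorflow.keras import layers, models

n_features = X_train.shape[1]  # preprocessed CICIDS2017 features

inputs = layers.Input(shape=(n_features,))
x = layers.Dense(64, activation="relu")(inputs)
x = layers.Dense(32, activation="relu")(x)
bottleneck = layers.Dense(10, activation="relu", name="bottleneck")(x)
x = layers.Dense(32, activation="relu")(bottleneck)
x = layers.Dense(64, activation="relu")(x)
outputs = layers.Dense(n_features, activation="sigmoid")(x)

autoencoder = models.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mae")
autoencoder.fit(X_train, X_train, epochs=10, batch_size=64,
                validation_data=(X_test, X_test))

# Use the trained encoder to reduce the data to 10 features for the classifiers.
encoder = models.Model(inputs, bottleneck)
X_train_red = encoder.predict(X_train)
X_test_red = encoder.predict(X_test)
```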
Classification Using Random Forest. Random Forest is a supervised ensemble learning method in which many decision trees together improve prediction accuracy. The approach trains a forest of decision trees on randomly selected data samples, and the final prediction is formed from all the individual tree predictions by averaging or voting. Random Forest is a quick, versatile technique that handles both linear and non-linear relationships in data and resists overfitting, a major problem of single decision trees. It has several advantages for intrusion detection: it can handle missing values, which are common in network traffic data, and remains robust to overfitting. Random Forest is frequently employed in IDS research and has shown its effectiveness on the chosen CICIDS2017 dataset [16]. Classification Using Light Gradient Boosting Machine. LightGBM is a tree-based gradient boosting system that is effective and scalable for huge datasets, making it popular for classification and regression. It builds decision trees within a gradient boosting framework, using a histogram-based technique to speed up training and updating the model predictions with the gradient of the loss function [17], and it generally outperforms plain decision tree models. Leaf-wise tree growth and other features help it handle sparse data and control over-fitting. LightGBM is fast and can handle big datasets, making it well suited to intrusion detection systems; parallel and GPU processing can further speed up training, allowing an IDS to handle massive, real-time data.
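The two classifiers can then be trained on the 10-dimensional encoded features, for example as sketched below. The hyperparameters are library defaults, and the weighted averaging of precision, recall and F1 is an assumption about how the reported metrics were computed; X_train_red, X_test_red, y_train and y_test come from the earlier sketches.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from lightgbm import LGBMClassifier

for name, clf in [("Random Forest", RandomForestClassifier(n_estimators=100, n_jobs=-1)),
                  ("Light GBM", LGBMClassifier())]:
    clf.fit(X_train_red, y_train)
    pred = clf.predict(X_test_red)
    acc = accuracy_score(y_test, pred)
    prec, rec, f1, _ = precision_recall_fscore_support(
        y_test, pred, average="weighted", zero_division=0)
    print(f"{name}: acc={acc:.4f} precision={prec:.4f} recall={rec:.4f} f1={f1:.4f}")
```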
3.2 Anomaly IDS Using Autoencoders for Anomaly Detection Data Preprocessing. As stated above, preprocessing the data is an important step in the autoencoder-based anomaly detection process. The first step is cleaning the data to remove any irrelevant or missing values, since such data can negatively affect the performance of the autoencoder model. Next, infinity and NaN values are treated as missing values and converted to a suitable numerical value, commonly 0. For the target variable in our case, samples labelled “BENIGN” are converted to 1 and the remaining target classes are assigned 0. Finally, the MinMax scaler is used to scale the dataset's feature values between 0 and 1, a common normalization technique that helps improve the performance of the autoencoder (Fig. 2).
Fig. 2 Autoencoder architecture for anomaly detection
Table 2 Parameters used for training the autoencoder for anomaly detection
Criterion        Value
Epochs           20
Batch size       512
Optimiser        Adam
Loss function    Mean Squared Logarithmic Error (MSLE)
Training the Autoencoder. Autoencoders can also be used directly for anomaly detection. Our IDS identifies anomalies by training the autoencoder only on normal data, so that it learns a representation of regular traffic [18]. Because the autoencoder is trained on normal data, the reconstruction error indicates how “normal” an incoming instance is: if the reconstruction error is substantial, the data instance is considered anomalous. The autoencoder's encoding can also be used to visualise the data and find clusters of typical and abnormal instances. The autoencoder is trained on normal data with the Adam optimization algorithm, which adjusts the network weights during training, and the Mean Squared Logarithmic Error function is used to calculate the reconstruction error (Table 2). The suggested encoder has two layers with 64 and 32 neurons, each followed by a dropout layer; after two hidden layers with 16 neurons each, the decoder has two layers with 32 and 64 neurons, and a final dense layer with a sigmoid activation produces the reconstruction used to flag aberrant packets. The trained autoencoder is then tested on a validation set and its reconstruction error is assessed; anomalous samples have large reconstruction errors. After training, the autoencoder can find anomalies in new network traffic data: the new data is fed to the network, the reconstruction error is calculated, and samples with substantial reconstruction errors are flagged as anomalies. Because only normal data is required, such models can be trained in an unsupervised fashion, which is valuable when labelled attack data is scarce.
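A hedged sketch of this semi-supervised autoencoder, trained only on benign traffic with the settings of Table 2, is shown below. The dropout rates, the activations, the name benign_label (the encoded value of 'BENIGN') and the threshold rule (mean plus one standard deviation of the benign reconstruction error) are all assumptions.

```python
import numpy as np
from tensorflow.keras import layers, models

benign = X_train[y_train == benign_label]   # benign_label: encoded value of "BENIGN"

inp = layers.Input(shape=(benign.shape[1],))
x = layers.Dense(64, activation="relu")(inp)
x = layers.Dropout(0.1)(x)
x = layers.Dense(32, activation="relu")(x)
x = layers.Dropout(0.1)(x)
x = layers.Dense(16, activation="relu")(x)
x = layers.Dense(16, activation="relu")(x)
x = layers.Dense(32, activation="relu")(x)
x = layers.Dense(64, activation="relu")(x)
out = layers.Dense(benign.shape[1], activation="sigmoid")(x)

anomaly_ae = models.Model(inp, out)
anomaly_ae.compile(optimizer="adam", loss="msle")
anomaly_ae.fit(benign, benign, epochs=20, batch_size=512, validation_split=0.1)

# Anomaly threshold from the per-sample MSLE reconstruction error on benign traffic.
recon = anomaly_ae.predict(benign)
errors = np.mean(np.square(np.log1p(benign) - np.log1p(recon)), axis=1)
threshold = errors.mean() + errors.std()

def is_anomaly(batch):
    rec = anomaly_ae.predict(batch)
    err = np.mean(np.square(np.log1p(batch) - np.log1p(rec)), axis=1)
    return err > threshold
```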
3.3 Signature-Based IDS: Snort3 Snort3 is a network-based IDPS that analyses network traffic in real time and detects threats using customisable rules and signature-based detection. Snort3's modes include:
• Packet sniffer mode: Snort3 captures all network traffic but does not analyse or act on it; captured packets can be logged and examined.
• Logger mode: Snort3 logs all network traffic to a file without analysing or acting on it, which helps with troubleshooting and forensics.
• Intrusion Detection System (IDS) mode: Snort3 analyses network traffic and raises alerts when it identifies threats.
• Intrusion Prevention System (IPS) mode: Snort3 analyses network traffic and blocks threats, actively preventing intrusions.
Snort3 detects intrusions using flexible rules: administrators can add custom rules or use pre-defined ones [19]. Its real-time alerting and reporting lets administrators spot and address risks, track intrusions, examine alerts and create custom reports. However, the rule-based approach generates many false-positive warnings, requires technical knowledge to install and maintain, and produces large amounts of data that can be hard to manage and analyse.
4 Proposed System 4.1 CICFlowMeter CICFlowMeter is an open-source tool that extracts features from network flow data and computes performance indicators. Network flows, i.e. the data exchanged between two network endpoints, form its basic unit of analysis. The utility collects attributes such as IP addresses, packet sizes, inter-arrival times and port numbers from flow data [20], and from these it derives features such as flow rate, packet count and flow duration. CICFlowMeter supports intrusion detection and network performance monitoring and offers a comprehensive, scalable platform for network traffic analysis: it can analyse pcap files, live network interfaces and NetFlow data, produces high-quality flow-based representations of large amounts of network data, and is useful for real-time network anomaly detection (Fig. 3).
4.2 Redis Redis is a fast, free, in-memory key-value store that can be used as a database, cache or message broker. Redis keeps all data in memory, making it faster than
Fig. 3 Proposed system design
disk-based databases and ideal for real-time, low-latency applications. Redis supports transactions, Lua scripting, pub/sub messaging, time-to-live (TTL) keys and rich data structure manipulation commands, and its built-in replication and partitioning allow it to scale with growing applications [21]. It is a quick, flexible and scalable data store that works well for everything from caching to large high-traffic applications.
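In the proposed design, Snort3 alerts can be queued in Redis before being shipped to the Elastic Stack. The sketch below tails a Snort3 JSON alert log and pushes each alert onto a Redis list; the log path, the list name and the Redis location are assumptions and depend on the local Snort3 and Redis configuration.

```python
import json
import time
import redis

r = redis.Redis(host="localhost", port=6379, db=0)
ALERT_LOG = "/var/log/snort/alert_json.txt"   # path depends on the Snort3 configuration

with open(ALERT_LOG, "r") as fh:
    fh.seek(0, 2)                  # start at the end of the file, like `tail -f`
    while True:
        line = fh.readline()
        if not line:
            time.sleep(0.5)
            continue
        try:
            alert = json.loads(line)
        except json.JSONDecodeError:
            continue
        r.rpush("snort_alerts", json.dumps(alert))   # queue for downstream consumers
```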
4.3 Elasticsearch Elasticsearch is a real-time, open-source search and analytics engine built on Apache Lucene, a full-text search library, and exposed through a RESTful API. The engine uses a cluster architecture for horizontal scaling and better performance with massive data volumes [22]. Its features suit log analytics, business intelligence and full-text search, and faceted search and real-time analytics are also available for more complicated data processing.
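Alerts drained from the Redis queue could then be indexed into Elasticsearch; the sketch below uses the plain REST API (a POST to an index's _doc endpoint) via requests, with the index name and endpoint as assumptions. In the full design this shipping role is played by Logstash and Filebeat rather than custom code.

```python
import json
import redis
import requests

r = redis.Redis(host="localhost", port=6379, db=0)
ES_URL = "http://localhost:9200/ids-alerts/_doc"   # index name is hypothetical

while True:
    _, raw = r.blpop("snort_alerts")               # blocks until an alert is queued
    doc = json.loads(raw)
    resp = requests.post(ES_URL, json=doc, timeout=5)
    resp.raise_for_status()
```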
4.4 Logstash and Filebeat Logstash is an open-source data gathering, processing and enrichment pipeline whose modular architecture supports over 200 plugins for data input, filtering and output [23]. In logging pipelines, Logstash can parse, structure and ship log data to Elasticsearch for indexing and analysis. Filebeat, a lightweight log shipper, collects system, application and text logs from servers; it is simpler and more efficient than a full processing pipeline because it requires less processing power and memory. Filebeat can send logs directly to Elasticsearch or to Logstash for
enrichment before indexing. Logstash and Filebeat are popular components of the open-source Elastic Stack, which collects, processes, stores, and analyses data.
4.5 Kibana Kibana is the sophisticated open-source data visualisation and exploration tool of the Elastic Stack. Through its web interface, users can search, view and analyse large volumes of Elasticsearch data; they can create and share pie charts, bar charts, line graphs and heat maps and customise their appearance and interaction. Kibana excels at handling enormous amounts of data and providing real-time insights, and Elasticsearch's robust search and aggregation capabilities let users swiftly filter and aggregate data and display the results in many forms.
5 Results and Discussion After preprocessing, the dataset was split 80:20 into training and test data. The autoencoder for dimensionality reduction was trained on the preprocessed dataset with a batch size of 64 for 10 epochs using the Mean Absolute Error loss function. The loss graph after training shows that the dimensionality-reduction model reduced both training and validation loss over the epochs and converged well, and the dataset was reduced to 10 features. The Random Forest and Light GBM algorithms were then trained on the reduced dataset for classification. The Light GBM model gave 80.14% accuracy and 65.62% precision, while its recall and F1-score were 80.14% and 71.52% respectively. Light GBM trained faster than Random Forest, but it failed to classify the different attack classes successfully; the dataset's most frequent class, Benign-labelled packets, may have caused the model to overfit. On the test dataset, Random Forest achieved 99.65% accuracy, 99.64% precision, 99.65% recall and a 99.64% F1-score; it classified the different attack classes better than Light GBM but required longer to train. Autoencoder-based dimensionality reduction lowered model complexity and training resource consumption. To train the autoencoder for anomaly detection effectively, only Benign-labelled packets were used from the training dataset; the proposed autoencoder was trained for 20 epochs with a batch size of 512, and the Mean Squared Logarithmic Error (MSLE) loss function was used to calculate the threshold value for discerning normal and anomalous packets. During training, the validation loss declined over the epochs, though not as much as the training loss (Fig. 4). After training, the model was tested on the test dataset containing both normal and anomalous packets and gave an accuracy of 85.77%; the threshold value calculated during training came out to be 0.0018 (Table 3).
Fig. 4 Autoencoder for dimensionality reduction—loss graph during training (left) and autoencoder for anomaly detection—loss graph during training (right)
Table 3 Performance comparison of each model
Model            Accuracy     Precision    Recall       F1-score
Light GBM        0.801467     0.6562561    0.8014678    0.715227
Random Forest    0.9965238    0.9964771    0.9965238    0.9964866

Anomaly autoencoder: accuracy 0.8577609; threshold value achieved 0.0018010198
6 Conclusion To summarize, the proposed system adopts a hybrid approach to intrusion detection in order to enhance the effectiveness and performance of network IDSs. The CICIDS2017 dataset is used to evaluate the anomaly IDS algorithms: autoencoders lower the dataset's dimensionality to 10 features, and Random Forest and Light GBM are employed to classify the multiple attack classes. Although the Light GBM model trained faster than Random Forest, it could not classify the dataset's less frequent attack classes and may have overfitted to the most frequent class. As a semi-supervised approach to intrusion detection, an autoencoder trained only to reconstruct the samples labelled Benign was introduced, and a threshold value was calculated with which anomalous packets could be detected. As systematically labelled network datasets are becoming more and more difficult to obtain, such unsupervised and semi-supervised methods can make online learning models more prevalent and increase their adaptability to changing modern networks.
References 1. Yang L, Song Y, Gao S, Hu A, Xiao B (2022) Griffin: real-time network intrusion detection system via ensemble of autoencoder in SDN. IEEE Trans Netw Serv Manage 19(3):2269–2281. https://doi.org/10.1109/TNSM.2022.3175710 2. Deng H, Yang T (2021) Network intrusion detection based on sparse autoencoder and IGA-BP network. Wirel Commun Mob Comput, Article ID 9510858, 11p 3. Lu J, Meng H, Li W, Liu Y, Guo Y, Yang Y (2021) Network intrusion detection based on contractive sparse stacked denoising autoencoder. In: 2021 IEEE International symposium on broadband multimedia systems and broadcasting (BMSB), Chengdu, China, pp 1–6. https:// doi.org/10.1109/BMSB53066.2021.9547087 4. Gharib M, Mohammadi B, Dastgerdi SH, Sabokrou M (2019) AutoIDS: auto-encoder based method for intrusion detection system. ArXiv abs/1911.03306 5. Mebawondu OJ, Popoo.la OS, Ayogu II, Ugwu CC, Adetunmbi AO (2022) Network intrusion detection models based on Naives Bayes and C4.5 algorithms. In: 2022 IEEE Nigeria 4th international conference on disruptive technologies for sustainable development (NIGERCON), pp 1–5. https://doi.org/10.1109/NIGERCON54645.2022.9803086 6. Zhao R, Li Z, Xue Z, Ohtsuki T, Gui G (2021) A novel approach based on lightweight deep neural network for network intrusion detection. In: 2021 IEEE wireless communications and networking conference (WCNC), pp 1–6. https://doi.org/10.1109/WCNC49053.2021.9417568 7. Chen H, Liu Y, Zhao J, Liu X (2021) Research on intrusion detection based on BP neural network. In: 2021 IEEE international conference on consumer electronics and computer engineering (ICCECE), pp 79–82. https://doi.org/10.1109/ICCECE51280.2021.9342479.1 8. Laghrissi F, Douzi S, Douzi K et al (2021) Intrusion detection systems using long short-term memory (LSTM). J Big Data 8:65 9. Hou H et al (2020) Hierarchical long short-term memory network for cyberattack detection. IEEE Access 8:90907–90913. https://doi.org/10.1109/ACCESS.2020.2983953 10. Amutha S, Kavitha R, Srinivasan R, Kavitha M (2022) Secure network intrusion detection system using NID-RNN based deep learning. In: 2022 International conference on advances in computing, communication and applied informatics (ACCAI), pp 1–5. https://doi.org/10. 1109/ACCAI53970.2022.9752526 11. Chiba Z, Abghour N, Moussaid K, El Omri A, Rida M (2019) Newest collaborative and hybrid network intrusion detection framework based on suricata and isolation forest algorithm. In: Proceedings of the 4th international conference on smart city applications (SCA ’19). Association for Computing Machinery, New York, NY, USA, Article 77, pp 1–11. https://doi. org/10.1145/3368756.3369061 12. Vinayakumar R, Alazab M, Soman KP, Poornachandran P, Al-Nemrat A, Venkatraman S (2019) Deep learning approach for intelligent intrusion detection system. IEEE Access 7:41525– 41550. https://doi.org/10.1109/ACCESS.2019.2895334 13. Wisanwanichthan T, Thammawichai M (2021) A Double-layered hybrid approach for network intrusion detection system using combined Naive Bayes and SVM. IEEE Access 9:138432– 138450. https://doi.org/10.1109/ACCESS.2021.3118573 14. Panigrahi R, Borah S (2018) A detailed analysis of CICIDS2017 dataset for designing intrusion detection systems. Int J Eng Technol 7(3.24):479–482 15. Sakurada M, Yairi T (2014) Anomaly detection using autoencoders with nonlinear dimensionality reduction. In: Rahman A, Deng J, Li J (eds) Proceedings of the MLSDA 2014 2nd workshop on machine learning for sensory data analysis (MLSDA 2014), p 4–8. ACM, New York 16. 
Farnaaz N, Jabbar MA (2016) Random forest modeling for network intrusion detection system. Procedia Comput Sci 89:213–217 17. Tang C, Luktarhan N, Zhao Y (2020) An efficient intrusion detection method based on LightGBM and autoencoder. Symmetry 12(9):1458 18. Tien C-W, Huang T-Y, Chen P-C, Wang J-H (2021) Using autoencoders for anomaly detection and transfer learning in IoT. Computers 10:88
402
Y. Gala et al.
19. Kumar V, Prakash O (2012) Signature based intrusion detection system using SNORT. Int J Comput Appl Inf Technol 1(3):35–41 20. Zhou Q, Pezaros D (2019) Evaluation of machine learning classifiers for zero-day intrusion detection—an analysis on CIC-AWS-2018 dataset. arXiv7cx preprint arXiv:1905.03685 21. Chen S et al (2016) Towards scalable and reliable in-memory storage system: a case study with Redis. In: 2016 IEEE Trustcom/BigDataSE/ISPA. IEEE 22. Grammatikis PR et al (2020) An anomaly detection mechanism for IEC 60870-5-104. In: 9th International conference on modern circuits and systems technologies (MOCAST). IEEE 23. Sanjappa S, Ahmed M (2017) Analysis of logs by using logstash. In: Proceedings of the 5th international conference on frontiers in intelligent computing: theory and applications: FICTA 2016, vol 2, pp 579–585. Springer, Singapore
A Hybrid Image Edge Detection Approach for the Images of Tocklai Vegetative Tea Clone Series TV1, TV9, TV10 of Assam Jasmine Ara Begum and Ranjan Sarmah
Abstract Edge detection within the image segmentation process is a significant step in identifying the edges of images of the Tocklai Vegetative Tea Clone Series TV1, TV9 and TV10 of Assam. Edge detection is used to identify discontinuities in an image. Several edge detection techniques are used in image processing, and identifying the most suitable technique for this identification task is a major concern. The present work therefore proposes a hybrid edge detection approach that combines the Canny and Sobel edge detection techniques. To eradicate flaws in the segmented image and to obtain better information on the boundaries of the TV1, TV9 and TV10 images, the hybrid approach uses the dilation morphological operation. A comparison is finally made with popular edge detection techniques such as Sobel, Prewitt, Robert, Canny and Laplacian of Gaussian, based on the noise level parameters Mean Square Error (MSE), Peak Signal to Noise Ratio (PSNR) and Signal to Noise Ratio (SNR). The analysis concludes that the hybrid edge detection approach combined with the dilation morphological operation shows efficient outcomes and can be used extensively for detecting the edges of the Tocklai Vegetative tea clone series of Assam. Keywords Edge detection technique · Noise level parameters · Tocklai vegetative clone series · Sobel edge detection technique · Canny edge detection technique · Morphological operations
J. A. Begum · R. Sarmah (B) Assam Rajiv Gandhi University of Cooperative Management, Sivasagar, India e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_37
1 Introduction
With the growing demand for tea as a widely consumed beverage, Assam contributes substantially to its production; irrespective of weather and soil conditions, tea is cultivated on a large scale in the state. Tocklai, Assam, has built up a large collection of tea clones that are resistant to different climatic changes. With such a wide range of clones, however, an ordinary tea cultivator finds it cumbersome to identify a clone and select the best one to cultivate. With the proper imposition of computer technology, the identification of tea clones becomes considerably easier, and such a method can replace the need for an expert. Therefore, with the proper application of image processing, identification of tea clones can be achieved efficiently.
Identification and classification of images of the Tocklai Vegetative (TV) Tea Clone series can be achieved through the following image processing steps: acquisition of images of the TV tea clone series, pre-processing of the images, feature extraction and finally classification, as shown in Fig. 1. In image acquisition, the images of the tea clones required for identification are collected and then passed on for pre-processing. Pre-processing removes or avoids any distortion occurring in the image and performs de-noising, segmentation and image content enhancement. The features required for identification are generated as output of the pre-processing step and fed as input to the next step, feature extraction. The extracted features can then be used to classify and identify the respective tea clone series. For this, identification of edges is essential.
Many edge detection techniques are used for image segmentation during pre-processing. This paper aims at designing a hybrid edge detection approach that combines the Canny and Sobel edge detectors. The Canny edge detector is one of the most popular and standard edge detection methods [1] and identifies edges by searching for local maxima of the image gradient [2]. It has, however, been observed that the Canny technique produces many distorted edge images and tends to weaken edge information, whereas the Sobel operator smooths the image to a greater extent by minimizing distortion. Keeping the respective advantages of the two techniques, the hybrid approach combines them, and to further improve accuracy it uses the dilation morphological operation, whose objective is to reduce internal noise. The output obtained is then compared with the other edge detection techniques used in segmentation by analyzing the noise level, relevance of the image, processing time, etc. The noise level of the various edge detection techniques is measured using Mean Square Error (MSE), Peak Signal to Noise Ratio (PSNR) and Signal to Noise Ratio (SNR).
The paper is arranged as follows: Sect. 2 describes the sampling of tea leaves; Sect. 3, Image Pre-processing, describes converting the RGB image of a tea leaf to a gray scale image, image segmentation, edge detection techniques and morphological operations; Sect. 4 describes the methodology, the proposed algorithm and the parameters used to measure its efficiency; Sect. 5 describes the experiment; Sect. 6 discusses the results obtained; and Sect. 7 concludes the paper.
Fig. 1 Steps of image processing
2 Sampling
Tea leaves of the available Tocklai Vegetative (TV) series, namely TV1, TV9 and TV10, were selected. A sample size of 2105 images was used for the experiment, of which 769 images are of the TV1 clone series, 850 of TV9 and 486 of TV10. The leaf samples for the research were collected from Kamarbund Tea Estate, Gatonga, Jorhat, Assam. Since a leaf of a tea tree is more readily available than flowers, fruits or any other part of the tree, the leaf was chosen as the sample for the experiment. The leaf images were acquired using a 64 megapixel mobile camera. The samples were captured during daylight with a white sheet placed behind the sample leaf so that the background of the leaf is not visible; this was done to capture images with a standard background.
3 Image Pre-processing
Raw and unprocessed images are not considered appropriate for analysis. The raw images collected have to be converted into a processed format such as .jpeg, .jpg, .tiff or .png. In this research, the TV tea clone series images have a dimension of 3000 × 4000 pixels with a horizontal and vertical resolution of 72 dpi.
3.1 Converting RGB Image to Gray Scale Image
The color of a tea leaf is usually green. The color of the leaf, however, changes due to various factors such as nutrient content, climatic conditions, water content and season. The color feature of a tea leaf therefore cannot be used for the identification process, so the color information of the image is removed by converting it to gray scale.
3.2 Image Segmentation
The digital image formed after conversion of the gray scale image to a binary image requires a proper process of segregating the digital image into integral regions [3]. Segmentation is basically used to establish the boundaries of the objects. Discontinuity of the intensity level of pixels and similarity measurements are the two bases used for image segmentation [4]. The discontinuity in the intensity values of the image is identified using edge detection.
Edge Detectors. Using an edge detection technique, an inconsistency in the intensity of the pixel values of an image is identified as an edge. Edges are the partition between the object in the image and the background of the image. An edge detection technique aims at avoiding false edges and retaining the true edges [5].
Sobel Edge Detector. This is a gradient-based technique that uses two convolution masks (matrices), one for the x axis and one for the y axis [6]. The Sobel operator identifies as edges the pixels having the maximum gradient value [7]:
|G| = (Gx^2 + Gy^2)^(1/2)  (1)
where |G| is the gradient magnitude of a tea leaf image and Gx, Gy are its gradient components along the x and y axes.
Canny Edge Detector. This technique finds edges while suppressing distortion in the image, so the attributes of the edges of the tea leaf image are not affected.
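As an illustration, Eq. (1) and the two base detectors can be computed with OpenCV in Python. This is a minimal sketch; the file name, kernel size and Canny thresholds are illustrative assumptions, not the authors' settings.

```python
import cv2
import numpy as np

gray = cv2.imread("tea_leaf_gray.jpg", cv2.IMREAD_GRAYSCALE)

# Gradient components along the x and y axes (Sobel masks)
gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)

# Eq. (1): gradient magnitude |G| = (Gx^2 + Gy^2)^(1/2)
sobel_magnitude = np.sqrt(gx ** 2 + gy ** 2)

# Canny edge map for comparison (thresholds are illustrative)
canny_edges = cv2.Canny(gray, 50, 150)
```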
3.3 Morphological Operation
A morphological operation processes the pixels of an image without affecting their respective numerical values; each pixel value of the output is based on a comparison of the corresponding pixel value of the input with its neighbors. The operation uses a template, often referred to as a structuring element, which is placed at all possible locations in the image and compared with the corresponding neighborhood of pixels. During this process, the operation checks whether the structuring element "fits" within the neighborhood or whether it "hits" (intersects) the neighborhood. Morphological processing uses two key operators: dilation and erosion. The erosion operator removes pixels on the boundaries, whereas the dilation operator adds pixels on the boundaries.
4 Methodology
This paper concentrates on a hybrid edge detection approach in which two popular edge detection techniques, the Canny and Sobel edge detectors, are combined. These two techniques were chosen because both showed better Mean Square Error (MSE), Peak Signal to Noise Ratio (PSNR) and Signal to Noise Ratio (SNR) values than the other popular edge detection techniques: Prewitt, LoG and Robert.
4.1 Proposed Algorithm
The objective of the proposed method is to overcome the limitations of the traditional edge detection techniques.
• Step 1: The Canny edge detection operator is applied to the input image g1(x, y).
• Step 2: The output of Step 1 is stored as g2(x, y).
• Step 3: The Sobel edge detector is applied to the output of Step 2.
• Step 4: The output of Step 3 is stored as g3(x, y).
• Step 5: The results of Step 2 and Step 4 are combined to obtain Hybrid_image.
• Step 6: Dilation is applied to Hybrid_image to add pixels on the boundaries.
• Step 7: The output is analyzed with SNR, PSNR and MSE values.
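A minimal Python/OpenCV sketch of the steps above is given below. It follows the procedure as described (Canny output, Sobel applied to that output, combination of the two results, final dilation); the way the two edge maps are combined (bitwise OR), the thresholds, the kernel size and the file names are assumptions for illustration, not the authors' original implementation.

```python
import cv2
import numpy as np

def hybrid_canny_sobel(gray, canny_low=50, canny_high=150):
    # Steps 1-2: apply Canny to the input image g1 and store the result as g2
    g2 = cv2.Canny(gray, canny_low, canny_high)

    # Steps 3-4: apply the Sobel operator to g2 and store the gradient magnitude as g3
    gx = cv2.Sobel(g2, cv2.CV_64F, 1, 0, ksize=3)
    gy = cv2.Sobel(g2, cv2.CV_64F, 0, 1, ksize=3)
    g3 = cv2.convertScaleAbs(np.sqrt(gx ** 2 + gy ** 2))

    # Step 5: combine the Canny and Sobel results into the hybrid edge map
    hybrid_image = cv2.bitwise_or(g2, g3)

    # Step 6: dilate the hybrid edge map to add pixels on the boundaries
    kernel = np.ones((3, 3), np.uint8)
    return cv2.dilate(hybrid_image, kernel, iterations=1)

# Example usage on a gray scale tea leaf image (path is illustrative)
leaf = cv2.imread("tv1_leaf.jpg", cv2.IMREAD_GRAYSCALE)
edges = hybrid_canny_sobel(leaf)
cv2.imwrite("tv1_leaf_edges.jpg", edges)
```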
4.2 Parameters
Parameters are the factors used to measure the performance of an algorithm or method. In this paper the noise parameters Mean Square Error (MSE), Peak Signal to Noise Ratio (PSNR) and Signal to Noise Ratio (SNR) are used to compare the quality of the images and the time taken to detect the edges.
Mean Squared Error. Mean Squared Error (MSE) is often used to estimate the quality of an image [8]. MSE calculates the mean squared difference of pixels between the captured image and the image obtained by an edge detection technique. The higher the MSE value, the larger the difference between the two images, which indicates the identification of more edges in the image [9]. The MSE for two images i1(q, p) and i2(q, p) is represented as
MSE = (1/MN) Σ_{q=0}^{M} Σ_{p=0}^{N} [i1(q, p) − i2(q, p)]^2  (2)
where i1(q, p) is the original image and i2(q, p) is the image obtained with the edge detection technique.
Peak Signal to Noise Ratio (PSNR). PSNR is represented as
PSNR = 10 log10 (H^2 / MSE)  (3)
where H is the highest possible pixel value of the image, with H = 255 for an 8 bit unsigned integer. PSNR is measured on the decibel (dB) scale. The signal in PSNR is the original image and the noise is the error of the distorted image; a higher PSNR value indicates reduced noise.
Signal to Noise Ratio. Signal to Noise Ratio (SNR) gives the ratio between the useful and the unwanted information of an image [10]. A high value indicates a high amount of useful information and a low amount of unwanted background noise. SNR is measured in decibels (dB).
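The three noise parameters can be computed in Python/NumPy as follows. This is a minimal sketch: Eqs. (2) and (3) are implemented directly, while the SNR expression shown is one common definition (signal power over error power, in dB), which is an assumption since the paper does not state its exact formula.

```python
import numpy as np

def mse(original, detected):
    # Eq. (2): mean squared pixel difference between the two images
    diff = original.astype(np.float64) - detected.astype(np.float64)
    return np.mean(diff ** 2)

def psnr(original, detected, h=255.0):
    # Eq. (3): PSNR in dB, with H = 255 for 8-bit images
    return 10.0 * np.log10(h ** 2 / mse(original, detected))

def snr(original, detected):
    # Assumed SNR definition: ratio of signal power to error power, in dB
    signal = original.astype(np.float64)
    noise = signal - detected.astype(np.float64)
    return 10.0 * np.log10(np.sum(signal ** 2) / np.sum(noise ** 2))
```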
5 Experiment
In this section the results obtained by the Sobel X operator, Sobel Y operator, Laplacian of Gaussian, Prewitt operator, Robert Cross operator and Canny operator are compared with the hybrid edge detection approach combined with the dilation morphological operation. The comparison is done on the basis of the noise parameters MSE, PSNR and SNR. The images of TV1, TV9 and TV10 are considered for the comparison; all images are in .jpg format with a dimension of 3000 × 4000 pixels and a horizontal and vertical resolution of 72 dpi. Table 1 shows the experimental output of the various edge detection techniques applied to the three tea clone series. The MSE, PSNR and SNR results for the images TV1, TV9 and TV10 are shown in Tables 2, 3 and 4, respectively.
Table 1 Experimental output of the various edge detection techniques (Canny, Sobel, Laplacian, Robert Cross, Prewitt and Hybrid (Canny+Sobel) with Dilation) for the TV1, TV9 and TV10 tea clone series
6 Result and Discussion
The outputs obtained from the Python program that calculates MSE, PSNR and SNR for the different edge detection algorithms are shown in Tables 2, 3 and 4 for the three tea clone series TV1, TV9 and TV10. The MSE ranges from 19,000 to 24,000, the PSNR from 26 to 38 dB and the SNR from about −23 to −31 dB for the three tea clones. The experiment showed a high MSE for the hybrid Canny and Sobel edge detection technique with dilation in all three images. A low PSNR value indicates efficient edge detection, and the hybrid approach with dilation shows the lowest PSNR value compared to the other techniques. The SNR value for the hybrid approach with dilation is also high, which indicates that there is more useful information than unwanted data or noise (Figs. 2, 3 and 4).
7 Conclusions
In this paper, a hybrid edge detection approach that combines the Canny and Sobel edge detectors is discussed. The approach uses the morphological dilation operator to add pixels on the boundaries, as the hybrid output alone showed some distortion and loss of edges. The hybrid approach is compared with the other traditional edge detection techniques on the basis of the noise level parameters Mean Square Error (MSE), Peak Signal to Noise Ratio (PSNR) and Signal to Noise Ratio (SNR) for the images of the three tea clone series TV1, TV9 and TV10. Based on the results of the experiment and the discussion, the paper concludes that edges detected with the hybrid edge detection approach showed the best performance, followed by Canny, Laplacian, Sobel Y and Sobel X, Robert and Prewitt.
Table 2 MSE, PSNR and SNR results of the image TV 1
Technique                             MSE              PSNR (dB)      SNR (dB)
Hybrid (Canny+Sobel) with dilation    21,340.33        26.339859      −23.0967095
Canny                                 19,972.66        27.4936967     −24.2904282
LoG                                   19,529.91        27.53035374    −24.50636588
Sobel Y                               19,297.50        27.53226743    −24.67090995
Sobel X                               19,271.51        27.52985951    −24.6919156
Robert Cross                          19,001.32        38.02042295    −25.82128426
Prewitt                               19,037.43355801  27.55307391    −24.85278998
Table 3 MSE, PSNR and SNR results of the image TV 9
Technique                             MSE              PSNR (dB)      SNR (dB)
Hybrid (Canny+Sobel) with dilation    22,932.98        27.602342      −29.0684791
Canny                                 22,921.35        27.5324251     −30.6053892
LoG                                   22,588.05        27.52183253    −30.80470837
Sobel Y                               22,373.12        27.52275434    −31.05632868
Sobel X                               22,361.65        27.52577591    −31.08078322
Robert Cross                          22,287.95        37.04065006    −25.70690402
Prewitt                               22,117.51        27.52080976    −31.2714592
Table 4 MSE, PSNR and SNR results of the image TV 10
Technique                             MSE              PSNR (dB)      SNR (dB)
Hybrid (Canny+Sobel) with dilation    23,540.98        26.21351008    −23.04560934
Canny                                 22,659.57        27.5004489     −24.0403863
LoG                                   21,169.34        27.71438482    −24.13514880
Sobel Y                               20,953.14        27.66993289    −24.21345052
Sobel X                               20,953.49        27.66884088    −24.27097285
Robert Cross                          20,856.90        35.68491132    −28.98706049
Prewitt                               20,718.95705608  27.62750974    −30.39788650
Fig. 2 Comparison of PSNR (dB) for different edge detecting techniques of TV 1 Tea clone series
Fig. 3 Comparison of SNR (dB) for different edge detecting techniques of TV 1 Tea clone series
Fig. 4 Comparison of MSE for different edge detecting techniques of TV 1 Tea clone series
Acknowledgements We thank Kamarbund Tea Estate, Gatonga, Jorhat, Assam for their extensive support in collecting images of the TV1, TV9 and TV10 tea clone series. The authors also wish to acknowledge Dr. Diganta Deka, Assistant Advisory Officer (Scientist 'C'), Advisory Department, Upper Assam Advisory Centre, TRA, Dikom, for his extensive support in gathering knowledge about the various Assam tea clone series.
References
1. Shin MC, Goldgof DB, Bowyer KW, Nikiforou S (2001) Comparison of edge detection algorithms using a structure from motion task. IEEE Trans Syst Man Cybern Part B Cybern 31(4):589–601. https://doi.org/10.1109/3477.938262
2. Canny J (1986) A computational approach to edge detection. IEEE Trans Pattern Anal Mach Intell PAMI-8(6):679–698. https://doi.org/10.1109/TPAMI.1986.4767851
3. Vijayaraghavan S (1998) Digital image processing lecture note, pp 1–118
4. Deng J (2009) A novel similarity-based approach for image segmentation. In: Proceedings of 2009 international conference on image analysis and signal processing, IASP 2009, pp 36–39. https://doi.org/10.1109/IASP.2009.5054612
5. Khaire PA, Thakur NV (2012) A fuzzy set approach for edge detection. Int J Image Process 6(6):403–412
6. Othman Z, Haron H, Kadir MRA (2009) Comparison of Canny and Sobel edge detection in MRI images
7. Davies ER (2012) Edge detection. Comput Mach Vis, pp 111–148. https://doi.org/10.1016/b978-0-12-386908-1.00005-7
8. Sara U, Akter M, Uddin MS (2019) Image quality assessment through FSIM, SSIM, MSE and PSNR—a comparative study. J Comput Commun 07(03):8–18. https://doi.org/10.4236/jcc.2019.73002
9. Vidya P, Veni S, Narayanankutty KA (2009) Performance analysis of edge detection methods on hexagonal sampling grid. Int J Electron Eng Res 1(4):313–328
10. Kellman P, McVeigh ER (2005) Image reconstruction in SNR units: a general method for SNR measurement. Magn Reson Med 54(6):1439–1447. https://doi.org/10.1002/mrm.20713
Gesture Controlled Iterative Radio Controlled Car for Rehabilitation Nabamita Ghosh, Prithwiraj Roy, Dibyarup Mukherjee, Madhupi Karmakar, Suprava Patnaik, and Sarita Nanda
Abstract When children sustain serious muscular or bone injuries, treatment often involves confining the bone, and the resulting prolonged immobilization of the muscles causes muscular weakness. Physical therapy can be used to deal with muscular weakness; however, it can be quite monotonous, and it is difficult to engage children in such activities. To make this tedious task entertaining for children, we have devised a technology based radio controlled car. The car works on hand gesture data received from an accelerometer; these data are transmitted via a transceiver module to the car to control its motors. The exercises prescribed to the patients are used as the gestures that control the car's movement. The hand movements for these exercises were executed by randomly selected participants, and the angle mapping of the hand gestures was found to be accurate with respect to the expected outcome. The deviation of the angle values from the expected values was almost negligible, making this a robust and effective system to accompany standard therapy. Keywords Hand-gesture · Accelerometer · RF communication · Rehabilitation · Micro-controller
N. Ghosh · P. Roy · D. Mukherjee · M. Karmakar · S. Patnaik · S. Nanda (B) Kalinga Institute of Industrial Technology, Bhubaneswar, India e-mail: [email protected] N. Ghosh e-mail: [email protected] P. Roy e-mail: [email protected] D. Mukherjee e-mail: [email protected] M. Karmakar e-mail: [email protected] S. Patnaik e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_38
1 Introduction
With technological progress, gyrometers have found applications in various fields such as fall detection and balance. For example, Han-Wu and Chun-su designed a hand gesture based control interface for navigating a car-robot [1]. Praveen Singh designed a glove embedded with an accelerometer sensor whose gestures were used to control car movement [2]. Segal and his colleagues designed a gesture controller and a motorized car for rehabilitation purposes [3]. Engaging children in physical therapy and keeping their enthusiasm intact is a challenge, and the proposed system aids children in overcoming it. To make the exercises interactive, the exercise movements are synchronized with the movement of the RC car, which meets the requirement of exercising as well as recreation. The car is easy for children to use because it omits buttons or joysticks and instead uses gestures as the control. Its application can be extended to a wide range of the population, including people with disabilities.
The system comprises a radio controlled car and a wearable hand gesture reading module. The wearable module contains an Inertial Measurement Unit (MPU 6050) [4], a micro controller (Arduino UNO) and a transceiver module (nRF24L01) [5] used to transmit hand orientation and movement data to the transceiver module present in the RC car. The IMU is used to sense the direction and movements of the hand. When the data is received by the Arduino in the wearable device, the RF transceiver (acting as transmitter) sends the data to the car, where the RF transceiver (acting as receiver) takes it as input and passes it to the Arduino placed on the car. These signals are then decoded by the Arduino and forwarded to the motor driver [6] to rotate the motors [7] according to the gesture provided by the hand. The novel aspect of our work is that it maps data from the IMU unit on the arm and controls the car with the obtained gesture data; we were also able to remove the problem of back EMF in the car by soldering capacitors to the motors. While being cost effective, the system also aids the physical therapy of children recovering from injuries. In addition to the introduction, our paper consists of related work, followed by the proposed methodology, discussion and finally the conclusion.
2 System Modelling
The proposed system has two parts that are integrated into one system. The first part is the hand module that sends the hand movements to the micro controller, and the second part is the radio controlled car whose micro controller receives the hand movement data and moves the car accordingly. Table 1 shows the technical specifications of the system. The micro controller used here is the Arduino Uno, an open source micro controller board which can be programmed and reprogrammed according to our requirements and which can interact with the surroundings using various sensors. The system uses the MPU6050 to get the hand movement data [8]. This sensor module is an 8 pin, complete 6-axis motion tracking device: a small package combining a 3-axis gyroscope (which detects rotational velocity along the X, Y and Z axes) and a 3-axis accelerometer (which detects the angle of tilt or inclination along the X, Y and Z axes). It uses the inter-integrated circuit (I2C) bus interface to communicate with the micro controller. To establish communication between the hand module and the car, the nRF24L01 module [9] is used. It is a single chip radio transceiver for the 2.4–2.5 GHz ISM band capable of handling 125 channels, and it connects to a micro controller through an SPI (serial peripheral interface). The motor driver module used to control the motors of the car is the L293D driver module, which can drive four bidirectional DC motors with 8 bit speed selection, two stepper motors and two servo motors. The radio controlled car uses DC geared motors, also known as BO motors, with 300 rpm and an operating voltage of 3–12 V.
As mentioned above, there is an IMU (Inertial Measurement Unit) (component 1 of Fig. 1a) in the hand gesture reading module which accumulates data on hand rotation and position in raw format. Raw values range from −16,384 to 16,384. These values are further processed by a micro controller (in this case an Arduino UNO) (component 2 of Fig. 1a) to interpret them as angle values (0–180°); thus the position of the arm is estimated along the X, Y and Z axes. These angular values are transmitted through the transceiver module (component 3 of Fig. 1a) to the car. For ease of transmission and interpretation on the receiver side, the values (0–180°) are further simplified to the integer ranges 0–18 for the X-axis and 20–28 for the Y-axis, while Z-axis values are neglected for easy implementation of the idea. These integer values are transmitted through the transceiver (acting as transmitter) at 2.4 GHz to the transceiver (acting as receiver) of the RC car. The receiver (component 3 of Fig. 1b) sends the data to the receiver Arduino (component 4 of Fig. 1b) through an SPI connection (CE with Pin 7, CSN with Pin 8, SCK with Pin 13, MOSI with Pin 11, MISO with Pin 12). The receiver Arduino then sends the data to the motor driver Arduino (component 2 of Fig. 1b) through serial communication (using the TX (pin 1) and RX (pin 0) pins).
Table 1 Technical specifications of the proposed system
Part type            Description                Details
Motor                4 DC Geared Motors (BO)    200 rpm
Motor driver shield  L293D                      4.5–36 V
Power supply         Li-ion Battery             3 no of 3.7 V = 11.1 V
Micro-controller     Arduino UNO                5 V, 16 MHz
Transceiver          nRF24L01                   1.9–3.6 V, 2.4 GHz
IMU                  MPU6050                    5 V, 6 axis (3 axis accelerometer and 3 axis gyroscope)
Micro-controller     Arduino UNO                5 V, 16 MHz
Power supply         Battery                    9 V
Transceiver          nRF24L01                   3.3 V, 2.4 GHz
Fig. 1 1a MPU 6050 connected with the pins of the Arduino; 2a Arduino Uno board; 3a transceiver pins connected to the Arduino; 1b the 4 BO motors connected with the motor driver; 2b the L293D mounted on top of the Arduino; 3b the transceiver pins connected with the second Arduino; 4b the TX pin of this Arduino connected to the RX pin of the second Arduino to enable serial communication
The motor driver L293D shield is mounted on the driver Arduino. Driver code fed to this Arduino operates the motor driver to facilitate the rotation of the 4 DC geared motors (component 1 of Fig. 1b). Depending on the real time data received from the various hand movements, the code triggers different signals to produce forward, backward, left and right turns along with speed changes by operating the 4 motors. For demonstration purposes, Table 2 shows the exercise implementation with arm positions and the respective angles. When the shown movement is performed, the IMU senses the position of the arm. Figure 4 shows the real time data received from the IMU: there is a jump of almost 90° because the angular difference between the two arm positions is also 90°. This difference is transmitted to the car, resulting in forward or backward motion. The working car model of our system is shown in Fig. 2, and Fig. 3 shows the hand module with the MPU6050 and nRF24L01 mounted on it. The overall flowchart and working of the proposed system is given in Fig. 5.
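The mapping from raw IMU readings to coarse angle codes and drive commands can be sketched in Python as follows. This is an illustrative reconstruction of the logic described above (the actual firmware runs on the Arduinos); the linear raw-to-angle scaling and the bucket widths are assumptions, while the 45° switching threshold follows the description in Sect. 3.

```python
def raw_to_angle(raw):
    # Map a raw MPU6050 reading (-16,384..16,384) to an angle in degrees (0..180);
    # a simple linear mapping is assumed here.
    return (raw + 16384) * 180.0 / 32768.0

def encode_for_transmission(angle_x, angle_y):
    # Simplify the angles for the nRF24L01 link, as described in the text:
    # X-axis angles become integers 0..18, Y-axis angles become 20..28.
    x_code = min(int(angle_x // 10), 18)        # 0..18
    y_code = 20 + min(int(angle_y // 20), 8)    # 20..28 (bucket width is an assumption)
    return x_code, y_code

def drive_command(angle_x):
    # 45 degrees is the switching threshold described in Sect. 3:
    # below it the car moves forward, above it the car moves backward.
    return "forward" if angle_x < 45 else "backward"

print(drive_command(raw_to_angle(2000)))  # example raw reading
```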
3 Results and Discussions
Table 2 shows the hand movements that can be made with the hand module to move the gesture controlled car. For forward movement of the car, the hand has to be kept in the first position shown in Table 2; for backward movement, the lower hand has to be kept perpendicular to the elbow, as indicated by the second position in Table 2.
Table 2 Hand gestures used to navigate the gesture controlled car model (positions 1 and 2)
Fig. 2 Working car module of our prototype
Fig. 3 Hand module of our devised system to collect hand gesture data
Fig. 4 Plot of the hand movement angles with respect to time
Figure 6 plots the angles obtained from the MPU6050 with respect to time. Region 1 of Fig. 6 shows the position of the hand at an initial angle of 90°; as the hand is moved forward, the angle reduces and reaches the minimum value of 0° when the hand is parallel to the ground. This position is indicated by the trough in region 2 of Fig. 6. As the hand is moved up and crosses the 45° mark, backward movement of the car is triggered. Figure 7 shows the full cycle of forward and backward movement, where the car continues to move as long as we are in the shaded region. Similarly, Fig. 8 plots the angles obtained from the MPU6050 with respect to time. Region 1 of Fig. 8 shows the position of the hand at an initial angle of 0°; as the hand is raised, the angle increases and reaches the maximum value of 90° when the hand is perpendicular to the ground. This position is indicated by the peak in region 2 of Fig. 8. As the hand is moved down and crosses the 45° mark, forward movement of the car is triggered, as marked by the point between region 2 and region 3. Figure 9 shows the full cycle of forward and backward movement, where the car continues to move forward as long as we are in the shaded region.
Fig. 5 Flowchart of the proposed system
We have devised an interactive method to engage children suffering from muscular weakness in exercise. One of the problems faced during development was back EMF, which is generated by the motors and often interferes with the smooth working of the circuit. This problem was removed by using a three-capacitor filter: one capacitor is connected across the motor terminals and one between each terminal and the motor casing. A 0.1 µF (microfarad) capacitor across the terminals and two 0.047 µF capacitors between each terminal and the case were used. We have not yet been able to implement multiple exercises; with some advancements this will be possible in the near future. In the implementation of the car, two Arduinos have been used because the driver module and the receiver module both use pin 11 by default; using a single Arduino would reduce the cost to a great extent.
Fig. 6 Triggering of backward movement
Fig. 7 Full cycle car movement with non-shaded region showing backward movement
Fig. 8 Triggering of forward movement
Fig. 9 Full cycle car movement with shaded region showing forward movement
4 Conclusion
The gesture controlled car built using the MPU 6050, the transceiver module and the Arduino UNO provides rehabilitation and enjoyment at the same time. It enhances engagement at an affordable price compared to standard physiotherapy treatments and is therefore an effective tool that shows potential in rehabilitation. To make the model more effective, more hand gesture movements can be added to it.
The system developed is also able to counter the problem of back emf by a capacitive circuit. The data sets obtained from this system can be further taken to design machine learning models which can further give us various reports on improvement in muscle health, flexibility of muscles and required number of hours of exercises needed to improve the condition of the injured muscles.
5 Future Scope
The authors will focus on further improvements and functionalities that can be incorporated. This includes running trials on random samples of children initially, which can later be extended to adults. To implement this approach in exercise, the authors intend to conduct clinical trials and obtain the required ethical clearance for them. This will solve the current limitation of our prototype being usable by only a particular age group. The current prototype is based solely on hand movement, which can later be extended to other body parts, targeting a variety of other exercises. The data collected from each patient can further be analyzed and processed using algorithms to keep track of their improvement. This data can then be used to predict the recovery time of the patient and provide a personalized exercise plan.
References
1. Wu X-H, Su M-C, Wang P-C (2010) A hand-gesture-based control interface for a car-robot. In: 2010 IEEE/RSJ international conference on intelligent robots and systems. IEEE
2. Fulara N et al (2019) Hand gesture controlled vehicle. Hand, 4(11)
3. Segal AD et al (2020) iRebot: an interactive rehabilitation robot with gesture control. In: 2020 42nd Annual international conference of the IEEE Engineering in Medicine & Biology Society (EMBC). IEEE
4. Hsieh S-T, Lin C-L (2020) Fall detection algorithm based on MPU6050 and long-term short-term memory network. In: 2020 International automatic control conference (CACS). IEEE
5. Christ P et al (2011) Performance analysis of the nRF24L01 ultra-low-power transceiver in a multi-transmitter and multi-receiver scenario. In: SENSORS, 2011 IEEE. IEEE
6. Agung IGAPR, Huda S, Wijaya IWA (2014) Speed control for DC motor with pulse width modulation (PWM) method using infrared remote control based on ATmega16 micro controller. In: 2014 International conference on smart green technology in electrical and information systems (ICSGTEIS). IEEE
7. Verstraten T et al (2016) Energy consumption of geared DC motors in dynamic applications: comparing modeling approaches. IEEE Robot Autom Lett 1(1):524–530
8. Agarwal D, Rastogi A, Rustagi P, Nijhawan V (2021) Real time RF based gesture controlled robotic vehicle. MAIT (GGSIPU) Delhi, India
9. Yao-Lin Z et al (2011) Design of wireless multi-point temperature transmission system based on nRF24L01. In: 2011 International conference on business management and electronic information, vol 3. IEEE
Using Ensemble of Different Classifiers for Defect Prediction Ruchika Malhotra, Ankit Kumar, and Muskan Gupta
Abstract One of the most important characteristics of software is its quality. Software defects are more likely to occur as software designs become more complicated in response to rising demand. Testers increase software quality by repairing defects; consequently, the study of defects greatly raises the quality of software. Because of the increased number of defects brought on by software complexity, manual defect detection can become an extremely time-consuming procedure, which has encouraged researchers to create methods for the automatic detection of software defects. Due to under/over fitting issues, existing methods for predicting software defects often have low accuracy (Matloob in IEEE Access 9:98754–98771, 2021). Machine learning is widely used in the realm of software defect prediction, but according to the available research the defect prediction performance of a single ML model is not optimal. Hence, we suggest an ensemble learning method to resolve this issue, in which different machine learning algorithms are combined to produce an accurate defect prediction. Keywords Stacking · Ensemble learning
R. Malhotra · A. Kumar · M. Gupta (B) Delhi Technological University, Delhi, India e-mail: [email protected] R. Malhotra e-mail: [email protected] A. Kumar e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_39
1 Introduction
Within the realm of software engineering, the topic of software defect prediction has garnered a great deal of attention [1]. At the same time, it can raise software product quality, which saves development costs and boosts efficiency. Using metrics that are related to defects in the software, a defect prediction model is built and then used to predict which parts of the software are likely to contain such defects. The software industry is changing quickly as a result of rising demand and technology. Defects are unavoidable in software since, for the most part, humans develop it. Defects are generally characterised as undesirable or unacceptable variations in software programs, documents and data. If the product manager misinterprets the client's requirements, defects may exist in the requirements analysis phase, and as a result these defects will also persist throughout the system design phase [2]. Inexperienced coders may also be responsible for defects in the code. Software defects have a huge negative impact on software quality, and they can have serious repercussions in the aerospace and healthcare sectors. If a fault is discovered after distribution, the development team must rework some software modules, which drives up the cost of development. For well-known organisations, defects are nightmares: customer dissatisfaction damages their reputation and lowers their market share. As a result, one of the key areas of industry research now concentrates on software testing. The quantity of defects has grown to the point where manual testing methods take a lot of time and are ineffective due to rising software complexity. With the emergence of machine learning, automatic classification of defects has become a popular research topic. This paper provides an overview of the usage of ensemble learning approaches for software defect prediction; recent publications released since 2012 were taken into consideration for this study.
2 Literature Review
2.1 Software Defect Prediction Dataset
The software defect prediction dataset 'JM1' used in this paper is collected from Kaggle. The dataset has 10,885 instances with 21 attributes. By making the PROMISE data set available to the public, the aim is to facilitate the creation of software engineering predictive models that can be repeated, verified, debated and improved. The data is derived from McCabe and Halstead source code metric extractors. The attribute list is shown in Fig. 1. Maximum-minimum standardisation is used to improve the efficiency of model training; it transforms the initial data linearly so that every attribute value is mapped to the range [0, 1]. As the data was highly unbalanced, SMOTE (synthetic minority oversampling technique) is applied to solve the class imbalance problem.
Fig. 1 Figure showing attribute list of JM1 dataset
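A minimal Python sketch of the described preprocessing (80/20 split, maximum-minimum standardisation and SMOTE), assuming scikit-learn and imbalanced-learn are available; the file name and the label column name "defects" are assumptions, not taken from the paper.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from imblearn.over_sampling import SMOTE

# Load the JM1 dataset (file name and label column are assumed)
data = pd.read_csv("jm1.csv")
X, y = data.drop(columns=["defects"]), data["defects"]

# 80/20 split as used in Sect. 3
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

# Maximum-minimum standardisation: map every attribute to [0, 1]
scaler = MinMaxScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# SMOTE to balance the minority (defective) class in the training data
X_train, y_train = SMOTE(random_state=42).fit_resample(X_train, y_train)
```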
2.2 Support Vector Machine Support Vector Machines (SVM) is a popular machine learning algorithm used for classification tasks. It works by mapping each data point to a point in an n-dimensional space, where n represents the number of features in the data. The algorithm then identifies the hyper-plane that best separates the two classes in this space. This hyper-plane is the decision boundary used to classify new data points [2]. Support vectors are the data points closest to the separating hyper-plane, and they play an important role in defining the hyper-plane. SVM can be used for linear and non-linear classification tasks.
2.3 Naïve Bayes Naive Bayes is a classification algorithm that utilises Bayes’ Theorem, assuming that the predictors are independent of each other. It assumes that the presence of one feature in a class does not affect the presence of any other features [3].
2.4 KNN The k-nearest-neighbour (k-NN) algorithm is a data categorization technique that calculates the probability of a data point belonging to one group or another based on the groups of the data points closest to it. The algorithm is considered a “lazy learner” because it delays the creation of a model using the training set until a query of the data set is made [4]. According to the theory, a sample is categorised based on the group of the majority of its k closest samples in the feature space or most comparable samples. The distance between data points is usually calculated using the Euclidean distance measure.
2.5 Logistic Regression
The probability of an event occurring is predicted using logistic regression by fitting the data to a logistic curve or logistic function. It uses the nonlinear sigmoid function to determine the parameters that fit the model best: sigmoid(x) = 1/(1 + e^(−x)). If the sigmoid output is greater than 0.5 the instance is classified as 1, otherwise as 0.
2.6 Random Forest One of the supervised learning methods is random forest. Both regression and classification use it. To generate random samples, the dataset is utilised with the random forest technique, where a decision tree is created for each sample, and the predictions of each tree are combined to arrive at a final decision. Each anticipated outcome is given a vote, and the prediction that receives the most votes is chosen as the final one [5].
2.7 Ensemble Learning
The wisdom of the crowd serves as an analogy for ensemble learning. Imagine asking a difficult question to thousands of random people and then compiling their responses; this collective response is frequently superior to an expert's response. Similarly, you will often obtain better forecasts by aggregating the predictions of a number of predictors (classifiers or regressors) than by using the best individual predictor. That is the basic idea behind ensemble learning: diverse learners work together to address a specific issue, and their outcomes are then combined to compensate for individual errors and enhance the performance of the overall learning model [6]. Boosting, bagging, stacking and voting-based methods are some of the most commonly used ensemble methods, and ensemble learning algorithms are applied to many real-world problems. We use voting and stacking algorithms here [7].
2.8 Stacking
This ensemble technique is based on a straightforward concept: instead of aggregating the predictions of all the predictors in an ensemble with a trivial function (like hard voting), a model is trained to perform the aggregation. Stacking builds classification or regression models in two layers of estimators. The first layer comprises the baseline models, which produce predictions on the dataset. The second layer is the meta-classifier (or regressor), which takes the predictions of the baseline models as input and generates the final predictions. Here, we used four models, namely Support Vector Machine, Gaussian Naive Bayes, Decision Tree and k-nearest-neighbour, as first layer estimators, and a logistic regression model as the meta-classifier to build our stacking machine learning model [8]. The diagram for this is given in Fig. 2.
Fig. 2 Proposed stacking model
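A minimal scikit-learn sketch of the described stacking model (SVM, Gaussian Naive Bayes, decision tree and k-NN as first-layer estimators, logistic regression as the meta-classifier). Hyperparameters are illustrative, and X_train/y_train are assumed to be the preprocessed JM1 data from Sect. 2.1; this is not necessarily the authors' exact configuration.

```python
from sklearn.ensemble import StackingClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression

# First-layer (base) estimators, as described in Sect. 2.8
base_estimators = [
    ("svm", SVC(probability=True)),
    ("gnb", GaussianNB()),
    ("dt", DecisionTreeClassifier()),
    ("knn", KNeighborsClassifier(n_neighbors=5)),
]

# Second layer: logistic regression meta-classifier trained on the base predictions
stacking_model = StackingClassifier(
    estimators=base_estimators,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)

stacking_model.fit(X_train, y_train)   # X_train, y_train from the preprocessing step
y_pred = stacking_model.predict(X_test)
```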
3 Result and Observation We chose to use Python for our machine learning application due to its popularity and widespread use in the field. To make the dataset suitable for analysis, we utilised data preprocessing techniques. Specifically, we partitioned the dataset into 80% for training and 20% for testing. We evaluated all of the machine learning models using the following performance metrics to determine their effectiveness (Figs. 3, 4 and Table 1).
Precision = TP/(TP + FP)
Recall = TP/(TP + FN)
F1 score = (2 × precision × recall)/(precision + recall)
Accuracy = (TP + TN)/(TP + FN + TN + FP)
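These metrics can be computed directly from the predictions, for example with scikit-learn; this short sketch assumes y_test, y_pred and stacking_model come from the preceding sketches.

```python
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

print("Accuracy :", accuracy_score(y_test, y_pred))
print("Precision:", precision_score(y_test, y_pred))
print("Recall   :", recall_score(y_test, y_pred))
print("F1 score :", f1_score(y_test, y_pred))

# AUC uses predicted probabilities for the positive (defective) class
y_proba = stacking_model.predict_proba(X_test)[:, 1]
print("AUC      :", roc_auc_score(y_test, y_proba))
```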
Fig. 3 ROC AUC score
Fig. 4 Graph showing visualization of performance of different models on given dataset
Table 1 Result comparison
Model used       Accuracy  Precision  Recall  F1 score  AUC score
SVM              0.9926    0.9978     0.9935  0.9956    0.9993
Naive Bayes      0.9830    0.9929     0.9870  0.9900    0.9981
KNN              0.9917    0.9994     0.9908  0.9951    0.9969
LR               0.9843    0.9956     0.9860  0.9908    0.9878
Proposed model   0.9995    1.0        0.9994  0.99973   0.99997
4 Conclusion
This paper employs various machine learning techniques to implement ensemble learning. The highest scores are observed in the 'Proposed model' (stacking) row of Table 1, indicating that stacking can improve the accuracy, precision, recall, F1 score, AUC score and other evaluation metrics for defect prediction compared to traditional machine learning algorithms. The ensemble (stacking) model outperforms single machine learning models, as it is less likely to overfit, has better generalization ability and yields more accurate results. Nonetheless, certain aspects of the model still require refinement: the performance of defect prediction is affected by the parameters of the base learners, hence it is crucial to select appropriate parameters.
5 Future Scope The use of ensemble techniques for defect prediction has shown promising results in recent years. The future scope for using an ensemble of different classifiers for defect prediction includes: • Use of deep learning techniques: Deep learning techniques such as neural networks can be used to build ensemble classifiers. This can provide more accurate and reliable predictions for defect detection. • Integration with other software development practices: Ensemble techniques can be integrated with other software development practices such as code reviews and automated testing. This can help in identifying defects early in the development process and improve the overall quality of software. • Analysis of large-scale software systems: The use of ensemble classifiers can be extended to analyze large-scale software systems. This can help in identifying defects in complex software systems and improving the overall quality of software.
References
1. Matloob F et al (2021) Software defect prediction using ensemble learning: a systematic literature review. IEEE Access 9:98754–98771. https://doi.org/10.1109/ACCESS.2021.3095559
2. Assim M et al (2020) Software defects prediction using machine learning algorithms. In: 2020 International conference on data analytics for business and industry: way towards a sustainable economy (ICDABI), pp 1–6
3. Zhang H, Li D (2007) Naïve Bayes text classifier. In: 2007 IEEE international conference on granular computing (GRC 2007), Fremont, CA, USA, pp 708–708. https://doi.org/10.1109/GrC.2007.40
4. Taunk K, De S, Verma S, Swetapadma A (2019) A brief review of nearest neighbour algorithm for learning and classification. In: 2019 International conference on intelligent computing and control systems (ICCS), Madurai, India, pp 1255–1260. https://doi.org/10.1109/ICCS45141.2019.9065747
5. Ho TK (1995) Random decision forests. In: Proceedings of 3rd international conference on document analysis and recognition, Montreal, QC, Canada, vol 1, pp 278–282. https://doi.org/10.1109/ICDAR.1995.598994
6. Huang F, Xie G, Xiao R (2009) Research on ensemble learning. In: 2009 International conference on artificial intelligence and computational intelligence, Shanghai, China, pp 249–252. https://doi.org/10.1109/AICI.2009.235
7. Koziel S, Dabrowska AP (2019) Reliable data-driven modeling of high-frequency structures by means of nested kriging with enhanced design of experiments. J Eng Comput 36(7)
8. Pavlyshenko B (2018) Using stacking approaches for machine learning models. In: 2018 IEEE second international conference on data stream mining & processing (DSMP), Lviv, Ukraine, pp 255–258. https://doi.org/10.1109/DSMP.2018.8478522
Designing an ABM that Can Be Used to Predict the Impact of the Number Portability Regulation in Namibia Using Netlogo Henok Immanuel, Attlee Gamundani, and Edward Nepolo
Abstract It is without a doubt that public policies and regulations shape the behavior of economies, societies and nations with the aim of improving the lives of all (Media statement CRAN hosts public consultative meeting on sim registration conditions for immediate release, 2021 [1]). Preparation is therefore at the forefront of success and problem resolution; as long as the future remains blurry and unknown, humankind will rely mostly on past experiences to predict the outcomes of an action before implementation. Various scientific and non-scientific methods have been used to reduce or minimize negative results. Traditionally, the impact of public policies or regulations on citizens is only known after implementation. In the technological era, however, modern approaches such as agent-based modeling (ABM) have been adopted to improve the prediction process (Ramadania and Muda in Dissertation view project management view project, 2018 [2]; J Appl Econ Sci 13:1246–1259, 2018 [3]). Predicting impacts on social systems using ABMs has improved the way policies and regulations are designed and implemented. This paper focuses on designing an ABM using empirical data to predict the impact of the number portability (NP) regulation if implemented in Namibia (Ghalumyan in Open Sci J 3:1–28, 2018 [4]). The NP initiative refers to the ability of a subscriber to retain their preferred mobile or fixed number while changing service provider(s). ABM is being considered as a new analytical method for the social sciences, one that is quickly becoming popular and is being referred to as a new way of doing science and planning through predictions (Naruseb in Gov Gaz 6607:37–40, 2018 [5]). ABMs are created in programmable modeling environments such as Netlogo, which is used to simulate natural or social phenomena (NUMBER_ PORTABILITY law namibia [6]).
H. Immanuel (B) · A. Gamundani · E. Nepolo Namibia University of Science and Technology, Windhoek, Namibia e-mail: [email protected] A. Gamundani e-mail: [email protected] E. Nepolo e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_40
Keywords Agent-based modeling · Number portability regulation · Netlogo
1 Introduction
1.1 Number Portability Regulation in Namibia
The Namibian mobile telecommunication sector is dominated by the Mobile Telecommunication Company (MTC) and Telecom Namibia (TN Mobile), whose national coverage makes up about 90% of the subscriber market share; both companies are wholly or partially government owned, and the remaining service providers share approximately 10% of the mobile market. This has prompted the Namibian government, through the Communication Regulatory Authority of Namibia (CRAN), to introduce the number portability innovation as a means to stimulate competition and attract new entrants into the market [1]. Although the regulation and legal provisions are in place, the NP innovation, which gives subscribers the ability to switch from one telecommunication service provider to another without having to change their existing number and go through the struggle of informing all their contacts of a new number, is not yet implemented in Namibia [2]. The initiative has been deployed in many countries around the world because of its promising impact on market competition, improved service delivery and enhanced innovation, among other perceived benefits. In the Namibian context, NP means that a subscriber who has an MTC mobile number such as 08-11111111 can switch to the TN Mobile network without losing that number and start making use of TN services and products; in a nutshell, a number stops being associated with a specific network and becomes associated with an individual, which further eases the adoption of mobile money services and identification, among others. Namibia has to date not implemented NP, compelling subscribers of mobile telecommunication services to let go of their numbers when switching from one operator to another. This has created a scenario where the numbers are owned by the carriers, has hampered the growth of start-ups and competition in the market [3], and has delayed access to a variety of digital services amid the ever-growing demand for innovation and for an environment where end users have the choice of switching or not [4]. The objectives of the number portability regulation in Namibia, which was approved as amendments to the Communications Act 8 of 2009, are to set out the scope for the implementation of NP, to determine the date on which the regulation becomes enforceable, and to set out the process for implementation. The regulation requires that the number portability initiative be implemented within 12 months from the date the regulations were promulgated in parliament, that the porting process be completed within 24 hours, and that the donor provider not charge the porting subscriber or the recipient provider; the recipient provider may charge a fee to complete the porting, but the fee should not be prohibitive for customers [5, 6].
Research conducted on NP shows that operators view the effort required to keep existing customers as more worthwhile than that of attracting new ones, so it makes sense for service providers to focus on retaining customers and lessening the opportunities for switching [7]. According to [8], NP will enable end users to benefit from a number of services from any service provider of their choice [9].
Factors for successful implementation of number portability. Adopting multiple NP types enhances the uptake of the NP service. Some of the known and prominent types of number portability implementation, according to [10], are:
1. Location-based portability: the ability to use a number at a location of choice, or limited to a geographical location.
2. Operator-based portability: the ability to move a fixed or mobile number from one carrier to another.
3. Service-based portability: the ability to use a number across different technologies or solutions.
4. Convergence-based portability: the ability to keep a number when changing from fixed to mobile.
5. Total number portability: a combination of all the above as one solution.
However, it is worth noting that a strong regulatory environment and government commitment ease the effective implementation of any type of NP anywhere. In Europe, NP is recognized as a human right by law, such that member states are required to guarantee that all subscribers who wish to retain their numbers can do so, and it is the obligation of the operators to make sure of this [11]. Ghalumyan [12] highlights the following key points whose consideration prior to implementation increases the possibility of success when adopting NP:
• Pricing related to the use of NP should not be made disadvantageous so as to demotivate consumers.
• Regulators should not impose regulations or tariffs that would distort market competition.
• The number porting process should take the minimum time possible.
• Service disruption during the porting exercise should be minimal.
• Porting should be based on consent.
• Reimbursements should be put in place for any delays on the part of the operators.
• Commitment after switching should not be more than 24 months.
Agent-based modeling and Netlogo. Agent-based modelling involves designing a system of autonomous agents that interact based on a set of given rules [13]; it is used for studying complex systems whose behavior cannot easily be identified using classical mathematical approaches [14]. The process of building the ABM used in this paper starts with defining a concept model using knowledge of the research domain, outlining the types of agents and their rules, networks, spaces and rules of interaction, and defining the data to gather for each time interval, and subsequently coding the rules from the concept model into the Netlogo toolkit. The model is then run, comparisons are made based on the results, and calibration continues until the outputs make sense.
make sense. NetLogo [15] is a multi-agent programmable modelling environment that is used to create complex natural and social simulations.

This study explores the impact of the number portability policy and regulation on the Namibian telecommunications market. It creates an agent-based model to predict the effects of the policy and offers insights into changes in market competition, customer behavior, and service quality. The findings can contribute to the existing literature on the use of agent-based modeling in telecommunication research and inspire further research in this area, potentially leading to more effective policymaking for diverse purposes. The paper is divided into five sections, covering the introduction, the design of the ABM, the model itself, model validation, and the discussion/conclusion, which summarizes key findings and suggests areas for future research.
2 ABM Design for NP Regulation Prediction

Building the model. Agent-based modeling (ABM) can be used to model complex systems in various fields. The design and development of an ABM for the telecommunications industry were based on critical factors that influence consumers' decisions to switch service providers or stay with their current one. A questionnaire was developed and distributed to potential respondents, and 125 responses were collected to set the model's parameters.

Agents and the ABM environment. ABM models are made up of agents with their underlying properties and behavior as individuals and as a group. They can be used when there is a need for in-depth detail about a subject. The agents in an ABM can interact with each other and with their environment. Agents can be created so that they emulate a real-life scenario by matching reality on different characteristics and geographic distributions, under which they can produce results. The ABM referred to in this paper is one that can be used to predict the impact of the number portability public regulation in Namibia, using globally known factors that influence consumer switching patterns and parameters derived from the act of parliament that contains the regulatory conditions. Empirical data collected through a questionnaire is used to set the conditions and parameters on which the simulation is based. The model environment is made up of a grid of cells, which for the purpose of this paper we refer to as the operating market/space. The turtles in the environment represent the subscribers of mobile telecommunication services, and their color represents the service provider to which they are subscribed. For this paper, we assume that all agents with the same color have the same characteristics.

Factors that influence subscriber porting behavior. Five (5) commonly cited factors were adopted from [16–19], since they are recognized as some of the critical factors that influence consumers' decisions to switch between service providers, namely quality of service, price, innovation, network coverage, and marketing and branding
(brand reputation). These factors are incorporated into the model for prediction by weighting them according to which factor the subscriber sees as the most important reason to switch to another service provider or not. Below is a brief explanation of what the chosen factors mean in relation to this paper:

• Quality of service: the ability of a service provider to satisfy consumers of their services in an efficient manner.
• Price: the cost of products or services.
• Innovation: the introduction of new products and services, or changing from the traditional way of doing things to a more effective and efficient one.
• Network coverage: the availability of a service provider's services or products at various locations, which is convenient for consumers.
• Marketing and branding: directly linked to the effort a provider puts into creating awareness of its products and services, which translates into brand recognition and consumer loyalty.

Behavior influences the logic of decisions made [20]; this is mostly due to events and conditions set or experienced by an individual or a group of people. Random factors play an important role in agent-based models; most often decisions can be categorised as probabilistic because of the parameters and variables involved when the logic is invoked. Factors influencing agent behavior can be external and internal, as depicted in Fig. 1. In this paper, we focus on the internal and external factors that are inherent or applicable to the agent. The external factors are in most instances the same for, or applicable to, all agents in the same manner and are driven by the operational environment and the service providers. The internal factors are more agent specific, relating to preferences, needs and knowledge.

Fig. 1 Consumer choice
3 The Model

ABM in NetLogo. The model is executed in NetLogo, a platform designed to create ABMs. The model is run on the interface tab, as indicated in Fig. 2. The info tab provides more knowledge on how the model runs and includes a basic version of the methodology of the model. Lastly, the code tab contains the actual code of the model itself.

Model description. The "setup" button initializes a scenario simulation, the "goones" button runs the simulation once, and "go" runs it indefinitely. The "number of consumers" slider determines the total number of consumers in the simulation, and the "market share" determines the allocation of market share to each service provider. The five factors that affect a consumer's decision to switch providers are included as dropdown menus. The simulation includes three service providers, each with a distinct color, as well as a visualization component and a reporting component to display the results. The subscribers move randomly, representing various demographics.

• Each agent is linked to a service provider identified by a blue, orange, or green color.
• If a subscriber comes across more than 3 subscribers of a different color more than 4 times, it changes color to that of the majority color around it. This is used to capture the social influence of peers on switching (see the sketch below).

Weighting the influential factors. Quantifying which factor is more important or carries more impact than another is a difficult task; therefore, responses from the questionnaire that was sent out were used as a basis to weight the identified factors, ordered from the most influential to the least influential as indicated in Fig. 3.
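The model itself is written in NetLogo; as a rough illustration only, the Python-style sketch below mirrors the peer-influence switching rule described above. The class name and threshold constants are hypothetical and do not correspond to the actual NetLogo code.

```python
from collections import Counter

PEER_MAJORITY = 3       # "more than 3 subscribers of a different color"
ENCOUNTERS_NEEDED = 4   # "more than 4 times"

class Subscriber:
    """Illustrative stand-in for a NetLogo turtle; provider plays the role of color."""

    def __init__(self, provider):
        self.provider = provider
        self.exposure = Counter()   # majority encounters counted per rival provider

    def step(self, neighbours):
        """Apply the peer-influence rule for one tick of the simulation."""
        rivals = Counter(n.provider for n in neighbours if n.provider != self.provider)
        if not rivals:
            return
        provider, count = rivals.most_common(1)[0]
        if count > PEER_MAJORITY:
            self.exposure[provider] += 1
            if self.exposure[provider] > ENCOUNTERS_NEEDED:
                self.provider = provider   # switch (port) to the majority provider
                self.exposure.clear()
```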
Fig. 2 Number portability ABM
Fig. 3 Factors that affect subscriber switching behavior
4 Model Validation

The agent-based model developed in this paper was validated and tested using cross-validation, sensitivity analysis and comparison of the model output to real data, following the framework established by [21] and used to validate many other models, such as the agent-based model for the spread of COVID-19 by [21].

Cross-validation. The model output is compared to the output of a previously validated run of the model, to give an indication of the consistency of the model's output. For example, the output when using more subscribers should be the same as when using fewer subscribers. This shows the ability of the model to scale to millions of users who are not possible or easy to reach, which enables the model to create a national prediction. For example, the model is run multiple times with different configurations, such as with 100 and 200 subscribers respectively, while equal constant values are set for the market share of the two service providers and all the factors that attract or retain subscribers are configured equally; the outputs are then expected to be the same, as shown in Figs. 4 and 5.

Sensitivity analysis. Sensitivity analysis, as described by Hunter and Kelleher [22], is used to determine how responsive a model is when parameters are changed. The analysis involves identifying parameters
Fig. 4 100 subscribers
Fig. 5 200 Subscribers
to investigate and determining the range of values to use. In this study, the sensitivity analysis focused on changes in the agents based on parameters applied to the service provider. A total of 125 subscribers were created in the model, emulating the respondents to the questionnaire, and changes were demonstrated by the turtles changing color. This approach provides insights into how the model behaves in response to different conditions and can be used to improve the model (Fig. 6).

Comparison to real data. After completing the cross-validation and sensitivity analysis, we conclude the validation by comparing the model data to real data. A questionnaire was developed and 125 randomly selected participants completed it. The questionnaire was built using Google Forms, and the analysis was done using the JMP statistical analysis tool. The final dataset forms part of the empirical data against which the model output was compared. Responses from the questionnaire indicate that 80.8% of the respondents would try number portability (Fig. 7); this is comparable to the simulation results in Fig. 8, which show many subscribers switching to the third operator.
Fig. 6 Sensitivity analysis based on price for product
Fig. 7 Subscriber willingness to try NP
Fig. 8 Simulation output
5 Discussion and Conclusions

This study finds that it is indeed possible to develop ABMs that predict the potential impact of public policy and regulation in order to minimize negative impacts, guide policymaking and improve resource allocation and prioritization. The outcomes of this study indicate that, should the NP regulation be fully implemented, it would positively impact the telecommunication sector. One of its immediate outcomes would be increased competition, which improves service quality and innovation and ultimately reduces the cost of services, enabling equitable access for all and bridging the digital divide. It was observed that when the model is scaled up, more computational power is required to run the simulation efficiently. The model can be improved by adding more variables to the environment and adding more characteristics and phenomena to the agents to capture diverse situations, such as the impact of demography, gender, age, and income levels. Combined with other modeling and simulation toolkits, such as the Namibia Agent-Based Modelling framework (NAMOD) and the South African Agent-Based Modeling framework (SAMOD), such models can be used to predict the impact of public policy and regulations and can help to provide a greater picture of their impact in future. Policies evolve and will continue to do so;
new public policies and regulations will emerge, and some will be amended to fit a specific purpose. Hence, as the world evolves, it is necessary and important to increase the adoption of agent-based modeling to improve the efficiency and effectiveness of the policy- and regulation-making process.
References

1. Communications Regulatory Authority of Namibia (CRAN) (2021, October 22) Media statement: SIM registration conditions (22 October 2021). Retrieved from https://www.cran.na/yglilidy/2021/10/MEDIA-STATEMENT-SIM-Registration-conditions-22-October-2021.pdf
2. Ramadania R, Muda I (2018) Experimental study of mobile number portability – could it be a potential breakthrough in Indonesia telecommunication market?
3. Ramadania WI, Muda I (2018) Experimental study of mobile number portability – could it be a potential breakthrough in Indonesia telecommunication market? J Appl Econ Sci 13:1246–1259
4. Ghalumyan A (2018) Institutional stakeholder perceptions on mobile number portability implementations in Georgia and Belarus. Open Sci J 3:1–28. https://doi.org/10.23954/osj.v3i4.1705
5. Naruseb A (2018) Government gazette republic of Namibia. Gov Gaz 6607:37–40
6. Legal Assistance Centre (2023) Namibia Communications Act 2009 (Act No. 8 of 2009) as amended by Communications Act Amendment Act, 2019 (Act No. 5 of 2019). Retrieved from https://www.lac.org.na/laws/2019/7005.pdf
7. Sharma S (2019) A study on mobile number portability in Panjab
8. Vermeulen J (2015) How mobile number porting works. https://mybroadband.co.za/news/cellular/118972-how-mobile-number-porting-works.html
9. Prasad MS (2020) Birds' eye view on mobile number portability
10. iConectiv (2023) iConectiv TruNumber Portability Clearinghouse white paper. Retrieved from https://iconectiv.com/sites/default/files/2021-07/iconectiv%20TruNumber%20PortabilityClearinghouse%20WhitePaper.pdf
11. Bohler S, Dewenter R, Haucap J (2005) Mobile number portability in Europe
12. Ghalumyan A (2018) Institutional stakeholder perceptions on mobile number portability implementations in Georgia and Belarus
13. Gilbert N (2019) Agent-based models. SAGE Publications
14. Vahdati A (2019) Agents.jl: agent-based modeling framework in Julia. J Open Source Softw 4:1611. https://doi.org/10.21105/joss.01611
15. Doursat R (2005) Introduction to NetLogo
16. Kaur G, Sambyal R (2016) Exploring predictive switching factors for mobile number portability. Vikalpa 41:74–95. https://doi.org/10.1177/0256090916631638
17. Khaliq et al (2017) Exploring switching factors for mobile number portability: a survey. Int J Adv Appl Sci 4:29–36. https://doi.org/10.21833/ijaas.2017.08.005
18. Nimako SG, Ntim BA, Mensah AF (2014) Effect of mobile number portability adoption on consumer switching intention. Int J Mark Stud 6. https://doi.org/10.5539/ijms.v6n2p117
19. Boateng KA, Owusu OO (2013) Mobile number portability: on the switching trends among subscribers within the telecommunication industry in a Ghanaian city
20. DeAngelis DL, Diaz SG (2019) Decision-making in agent-based modeling: a current review and future prospectus. Front Ecol Evol 6
21. Hunter E, Kelleher JD (2022) Validating and testing an agent-based model for the spread of COVID-19 in Ireland. Algorithms 15. https://doi.org/10.3390/a15080270
22. Hunter E, Kelleher JD (2022) Validating and testing an agent-based model for the spread of COVID-19 in Ireland. Algorithms 15:8–22. https://doi.org/10.3390/a15080270
Performance Analysis of Deep Neural Network for Intrusion Detection Systems Harshit Jha, Maulik Khanna, Himanshu Jhawar, and Rajni Jindal
Abstract An intrusion detection system (IDS) is an important security system that is used to protect advanced communication networks from dangerous threats. These kinds of systems were strategically created to recognize specific rule violations, patterns and signatures. The consistent use of machine learning and deep learning algorithms has provided many strong alternatives in the field of network intrusion detection, allowing anomalous and normal behavioral patterns to be distinguished. In this paper, we present a comparative analysis of our proposed deep learning model against various ML classifiers: Random Forest, Naive Bayes, Gradient Boosting, Support Vector Machine, Decision Tree, and Logistic Regression. We used accuracy, precision and recall as evaluation metrics for our models. We ran our model on several datasets (CICIDS2018, CICIDS2017, UNSW-NB15, NSL-KDD, KDD99) to verify that it not only identifies particular attacks but performs well on all types of attacks across datasets. We also draw attention to the lack of datasets representing the current modern world.

Keywords IDS · Security threats · DL model · CICIDS2018 · CICIDS2017 · UNSW-NB15 · NSL-KDD · KDD99 · ML classifiers · DOS attack · Probe attack · U2R attack · DDOS attack · DNN model · Random forest · Support vector machine · Gradient boosting
1 Introduction

"As of January 2022, there were more than 5 billion active internet users in the world, amounting to almost 60% of the world's population," according to a study by World Internet Statistics [1]. An interesting report from Cisco states that there will be more than 30.9 billion internet-connected gadgets by 2025 [2].
It shows that the growth rate of the technology world is extremely rapid, while advances in new attack methods and hacking techniques are growing just as rapidly. Intruders are evolving and can beat our firewalls and their variants. The Intrusion Detection System was created in order to tackle the problems posed by some of the commonly used traditional approaches. By classifying data into various groups, the Intrusion Detection System tracks network traffic and closely monitors activities to help us distinguish between intrusive activities and normal network behavior. Various ML classifiers help us effectively manage and classify different types of attacks; machine learning is described as an improvement in the computer's learning process using its own experience with the attacks rather than being explicitly programmed. Host-based intrusion detection systems are used to prevent attacks and are installed on terminals, while network-based intrusion detection systems are placed at the network entry point to check all outgoing and incoming traffic and protect the system from any harmful packets that could hamper the whole system. With our lives connected to the internet 24/7, it becomes necessary to secure the systems which hold our personal data, which is our motivation behind this paper. The following points summarize our contributions in this paper:

• Comparing our proposed deep learning model with various ML classifiers on several assessment metrics.
• Previous work on intrusion detection systems using various other approaches has been studied and summarized.
• In comparison with previous works, the proposed work has better accuracy on average (average of the accuracy obtained on different datasets) in attack classification.
• The results state that there is a need for a better and improved dataset.
1.1 Intrusion Detection System (IDS)

IDSs are security solutions, just like firewalls, access control schemes, antivirus software etc., and are developed to improve the security of communication systems. An IDS is a passive and intelligent monitoring tool that recognises possible threats and sends out alerts, allowing incident responders or analysts in a security operations center to look into and address the potential problem. The endpoint or network is not actually protected by the IDS itself. In fact, a firewall serves the purpose of the defense mechanism: it analyzes network packet metadata and either permits or disallows traffic in accordance with established criteria, so that some communication or protocol types are blocked from crossing this barrier. One way of classifying intrusion detection systems is according to the monitoring platform. Another way is to classify them according to the techniques used for detecting suspicious activities. Classification according to monitoring platform involves [3]:

• IDS based on Network
• IDS based on Host.
Classification according to the techniques deployed involves [3]:

• IDS based on Signature
• IDS based on Anomaly
• Hybrid IDS.
2 Related Work

There has been considerable research effort on intrusion detection and prevention systems. A variety of methods and techniques have been studied and analyzed and their results evaluated. As it turns out, deep learning (DL) and machine learning (ML) have been the most successful techniques for designing and developing IDSs.

Wang et al. [4] suggested an SVM-based framework for IDS and used the NSL-KDD (2009) dataset to test their approach. They asserted that their strategy, which has a 99.92% efficiency rate, is better than previous methods, but they omitted the statistics of the employed datasets and the quantity of training and testing samples. Additionally, SVM performance suffers when larger amounts of data are present, making it a poor option for analyzing massive amounts of network traffic for intrusion detection.

A hybrid model of KPCA and SVM with GA was used by Kuang et al. [5] to detect intrusions, and they achieved a 96% detection rate. For the system's verification, they employed the KDD CUP99 dataset; however, this dataset has some drawbacks. Redundancy is one such instance, where the classifier becomes biased toward records that appear more frequently. Because only the top principal components are chosen from the principal space, KPCA is constrained by the potential to miss significant features. SVM-based models perform poorly when large amounts of data need to be handled; hence they are not useful in applications such as monitoring the bandwidth of a network.

Zargari and Janarthanan [6] implemented a feature-selection algorithm and trained it on the UNSW-NB15 (2015) dataset to select optimal features. The authors used the Kappa statistic to assess the model. The RF classifier performed best overall: the first feature subset achieved an accuracy of 0.7566 and a kappa score of 0.6891, while the second subset produced an accuracy of 0.8161 and a kappa score of 0.7639.

Boukhamla et al. [7] started by reducing the dimensionality of features using PCA and pre-processing stages, including eliminating infinite, missing or redundant values, and also removed all unnecessary fields, e.g. source IP address, timestamp, flow ID etc. The CICIDS2017 dataset was tested using the C4.5, KNN and NB classifiers. The outcome indicated that various classifiers performed well on some attacks but failed miserably on other types of attacks.

Lin et al. [8] worked on a model to identify anomalies by using an LSTM and an attention mechanism to improve performance during network training. They used the CIC-IDS 2018
dataset to train the proposed model and achieved an accuracy of 0.9622, with a detection rate of 0.15 and a recall of 0.96.

An RF-based model for an intrusion detection system was created by Farnaaz and Jabbar [9]. They evaluated the performance of their model on the NSL-KDD dataset, and the findings showed a detection rate of 99.67%, higher than that of J48. One major drawback of the RF algorithm is that it can be too sluggish for real-time prediction due to the presence of several trees.

Elbasiony et al. [10] proposed an intrusion detection model based on weighted k-means and RF and confirmed it using the KDD99 dataset. The method correctly predicted results 0.983 of the time. The RF's slowness, caused by the growth of many trees, makes it unsuitable for handling actual traffic, and the KDD99 dataset also exhibits a few of the aforementioned shortcomings.

Zong et al. [11] proposed an IDS based on an RF classifier. The authors used the information gain (IG) method to select the attributes required for classification. Categorical features of the UNSW-NB15 dataset were transformed before training proceeded. The researchers used the false alarm rate (FAR) and accuracy (AC) to evaluate the model, achieving a FAR of 0.1578 and an AC of 0.8578.
2.1 Viewpoint

While these studies are important for understanding accuracy and design, none of them touched upon the limitations of the available datasets. Currently available datasets lack certain features of recent real-life network traffic, due to which most present anomaly-detection IDSs are not suitable for production environments. In this paper we address the need for newer datasets that are adaptable to the dynamic nature of the world of cybersecurity, are balanced, and are inclusive of the different widely encountered cyber-attacks.
3 Datasets

The datasets used in our paper are listed in Table 1. We have handpicked the most commonly used datasets, which represent data collection from different decades. The motivation behind using multiple datasets was to determine which collection most accurately represents today's networks.
4 Methodology

As mentioned previously, various traditional machine learning techniques such as Random Forests, SVMs, etc. perform fairly well but have their own limitations, such as linearity, scalability and generalization over a wide spectrum of data.
Table 1 Datasets

| Dataset name | Author | Src | Ft | Attack categories | CT | Dataset type | Data collection method |
|---|---|---|---|---|---|---|---|
| KDD99 (1999) | Stolfo et al. [12] | DARPA | 41 | DOS, U2R, R2L, Probe | Binary | Simulated | Synthetic |
| NSL-KDD (2009) | Tavallaee et al. [13] | DARPA | 42 | DOS, U2R, R2L, Probe | Binary | Simulated | Synthetic |
| UNSW-NB15 (2015) | Nour Moustafa et al. [14] | UNSW | 49 | Generic, Fuzzers, Exploits, DoS, Backdoor, Shellcode, Worms, Reconnaissance | Binary | Real-World | Network Sensors |
| CICIDS-2017 (2017) | Iman S. et al. [15] | CIC | 80 | Brute Force, Heartbleed, Bot, DoS, DDoS, Infiltration, Web... (Total: 12) | Multi-Class | Real-World | Network Sensors |
| CSE-CICIDS-2018 (2018) | Iman S. et al. [15] | CSE-CIC | 79 | Brute Force, Heartbleed, Bot, DoS, DDoS, Infiltration, Web... (Total: 12) | Multi-Class | Simulated | Synthetic |

CT = Classification Type; Ft = Features; Src = Source; CIC = Canadian Institute for Cybersecurity; DARPA = Defense Advanced Research Projects Agency; CSE = Communications Security Establishment
Deep Neural Networks (DNNs), on the other hand, are made of multiple layers of neurons which are stacked together in increasing order of complexity and abstraction. Each hidden layer applies a non-linear transformation to the input that is fed to it. This leads to the creation of a statistical model with many parameters, which in turn generates the right output. Different models can be created by altering the number of hidden layers and, further, the number of neurons in each layer.
4.1 Proposed Architecture

Figure 1 describes the proposed architecture of the DNN. The input layer consists of a number of neurons equal to the number of features in the dataset on which the model is being trained and tested. The model has a fully connected architecture, where every neuron in the current layer is connected to all the neurons in the next layer, and no neuron is connected to neurons of the same layer. Back-propagation [16] is used as the learning mechanism for the model. The proposed architecture also contains a dropout layer after every dense layer, which increases the reliability of the model.
Fig. 1 Proposed architecture
The model consists of 4 dense hidden layers along with one input layer and one output layer. The first dense hidden layer consists of 1024 neurons, followed by the second layer with 768 neurons, the third layer with 512 neurons and the fourth layer with 128 neurons, as described in Table 2. The output from the fourth hidden layer serves as input to the output layer neurons for a multi-class classification of the type of attack, based on the dataset being used to train the model. The count of neurons is consistently reduced in each hidden layer. This is done to increase the speed of computation and generate better results. In contrast to the traditional activation functions like the sigmoid and tanh functions, we have used the non-linear activation function ReLU [17] for each hidden layer, which is mathematically defined in Eq. (1). Sigmoid and tanh activation functions are widely used in DNN models; however, they are prone to the vanishing gradient problem [21]. ReLU is immune to this problem and offers a good alternative as an activation function. Softmax activation is added between the final hidden layer and the output layer for multi-class classification, since the attack classes are mutually exclusive.

Table 2 Hidden layers structure information

| Layers | Number of neurons |
|---|---|
| Dense layer 1 | 1024 |
| Dropout layer 1 (1%) | – |
| Dense layer 2 | 768 |
| Dropout layer 2 (1%) | – |
| Dense layer 3 | 512 |
| Dropout layer 3 (1%) | – |
| Dense layer 4 | 128 |
| Dropout layer 4 (1%) | – |
Eq. (2) denotes the mathematical representation of the softmax function, where $\sigma$ represents the softmax value and $y_i$ denotes the $i$th value. Dropout layers (of 1% each) are used after every dense layer to make the model more robust by unplugging neurons randomly, thus preventing the model from over-fitting the training set. Cross-entropy [18], defined in Eq. (3), is used as the loss function for the proposed sequential Deep Neural Network (DNN) model for multi-class classification. Adam [19] has been used as the optimizer for faster computation.

$$\mathrm{ReLU}(x) = \max(0, x) \tag{1}$$

$$\sigma(y_i) = \frac{e^{y_i}}{\sum_{j=1}^{n} e^{y_j}} \tag{2}$$

$$\mathrm{CE} = -\sum_{c=1}^{M} y_{o,c}\,\log(p_{o,c}) \tag{3}$$
The proposed model will be compared against Logistic Regression (LR) and machine learning techniques such as Random Forest (RF), Naive Bayes (NB), Gradient Boosting (GB), Support Vector Machine (SVM) and Decision Tree (DT) in the next section.
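As a concrete illustration of the architecture described above, the Keras sketch below builds the proposed DNN (four dense hidden layers of 1024, 768, 512 and 128 neurons, ReLU activations, 1% dropout after each dense layer, a softmax output, cross-entropy loss and the Adam optimizer). The helper name and the example feature/class counts are illustrative assumptions, not taken from the paper.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_dnn(n_features, n_classes):
    """Sketch of the proposed DNN; layer sizes follow Table 2."""
    model = keras.Sequential([
        layers.Input(shape=(n_features,)),
        layers.Dense(1024, activation="relu"),
        layers.Dropout(0.01),
        layers.Dense(768, activation="relu"),
        layers.Dropout(0.01),
        layers.Dense(512, activation="relu"),
        layers.Dropout(0.01),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.01),
        layers.Dense(n_classes, activation="softmax"),  # mutually exclusive attack classes
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",      # cross-entropy loss of Eq. (3)
                  metrics=["accuracy"])
    return model

# Illustrative call: 42 input features and 5 attack classes (an NSL-KDD-like setup)
model = build_dnn(n_features=42, n_classes=5)
model.summary()
```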
5 Experimentation and Results

5.1 Metrics

Evaluating the created classification model is considered one of the most critical steps. This is performed using various evaluation metrics. We have used three evaluation metrics, namely accuracy, precision and recall, as specified by Eqs. (4), (5) and (6) respectively. We perform multi-class classification on five different datasets, and the weighted average is taken over the metric scores of the different classes to represent a total metric score. These metric scores are then compared to evaluate the performance of our proposed DNN technique against traditional machine learning algorithms.

Accuracy: It simply tells how often the model classifies the data correctly. Even though it is not always reliable for judging performance, it is one of the most used parameters.
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN} \tag{4}$$

Precision: When the model predicts Yes, how often is the sample actually Yes. It is the number of samples the model classified correctly for a class over the number of samples the model predicted as belonging to that class:

$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{5}$$

Recall: When a sample is actually Yes, how often is it classified as Yes. It is the ratio of the number of correctly classified positive samples to the total number of positive samples that were passed:

$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{6}$$
TP = True Positive, TN = True Negative, FP = False Positive, FN = False Negative
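Since the paper reports class-weighted averages of these metrics, a minimal sketch of how they could be computed with scikit-learn is shown below; y_true and y_pred are placeholder arrays standing in for the test labels and model predictions.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Placeholder labels and predictions (class indices); replace with real test data.
y_true = [0, 1, 2, 2, 1, 0, 3]
y_pred = [0, 1, 2, 1, 1, 0, 3]

accuracy = accuracy_score(y_true, y_pred)                                          # Eq. (4)
precision = precision_score(y_true, y_pred, average="weighted", zero_division=0)   # Eq. (5)
recall = recall_score(y_true, y_pred, average="weighted", zero_division=0)         # Eq. (6)

print(f"Accuracy={accuracy:.4f}  Precision={precision:.4f}  Recall={recall:.4f}")
```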
5.2 Findings

The various ML techniques and the proposed deep learning algorithm are tested on different databases (KDD99, NSL-KDD, CICIDS-2017, CSE-CICIDS-2018, UNSW-NB15). We can use these results to determine the best-performing technique and deduce which database is best for training and generating industrial-level models for intrusion detection.

KDD99. The Deep Neural Network performs the best, with 99.03% accuracy. KDD99 is a relatively simple database with an unequal distribution of attacks, over 90% of which are DOS attacks. KDD99 does not represent current traffic, and hence models trained on KDD99 are not up to industry standards for use in real life.

NSL-KDD. NSL-KDD, being a version of KDD99, shows similar performance across the various algorithms. NSL-KDD is outdated and, like its parent KDD99, should not be used for training models intended for practical applications. Any IDS based on NSL-KDD may not survive in the current world, especially if targeted by a capable attacker.

UNSW-NB15. All algorithms except Naive Bayes perform very well, even though it is a relatively newer database, due to its lower-dimensional data covering a smaller number of attacks than both versions of CICIDS (Table 3).

CICIDS-2017. One of the newer datasets shows a considerable drop in performance due to its large coverage of threats which are up to date with today's standards.
Table 3 Results (values given as Ac / Pr / Re; all values are percentages)

| Model | KDD99 | NSL-KDD | UNSW-NB15 | CICIDS-2017 | CICIDS-2018 |
|---|---|---|---|---|---|
| LR | 97.21 / 95.57 / 91.11 | 97.21 / 92.16 / 88.32 | 98.44 / 98.71 / 95.31 | 95.11 / 94.19 / 89.72 | 92.48 / 93.77 / 89.83 |
| RF | 99.01 / 98.37 / 92.87 | 99.62 / 98.91 / 94.31 | 99.08 / 97.99 / 97.12 | 96.88 / 97.21 / 92.21 | 94.12 / 94.79 / 92.18 |
| GB | 97.54 / 94.50 / 90.56 | 98.54 / 96.17 / 89.89 | 98.76 / 97.41 / 96.56 | 97.62 / 97.17 / 94.91 | 93.23 / 95.98 / 94.23 |
| NB | 88.22 / 47.12 / 67.34 | 85.92 / 55.32 / 55.72 | 76.78 / 86.13 / 85.53 | 73.56 / 53.33 / 99.02 | 70.67 / 55.78 / 48.12 |
| SVM | 87.68 / 70.33 / 65.48 | 98.52 / 92.39 / 84.34 | 90.11 / 85.45 / 79.43 | 70.32 / 61.86 / 3.21 | 62.69 / 50.93 / 12.66 |
| DT | 93.42 / 91.25 / 88.72 | 97.72 / 92.42 / 89.92 | 91.43 / 93.84 / 87.39 | 95.72 / 98.32 / 87.82 | 91.36 / 95.83 / 87.62 |
| DNN | 99.03 / 98.21 / 94.45 | 99.57 / 98.66 / 96.98 | 99.12 / 97.68 / 95.33 | 98.45 / 98.11 / 97.91 | 97.53 / 98.41 / 95.84 |

LR = Logistic Regression; RF = Random Forest; GB = Gradient Boosting; NB = Naive Bayes; SVM = Support Vector Machine; DT = Decision Tree; DNN = Deep Neural Network; Ac = Accuracy; Pr = Precision; Re = Recall
CSE-CICIDS-2018. An updated version of CICIDS-2017 which shows promising results; the considerable drop in performance shows how techniques developed for datasets like NSL-KDD will not give similar performance when moving on to newer databases.

Our proposed model stands out against the traditional algorithms widely used in the IDS industry. It can be inferred that the proposed deep learning algorithm performs the best on average across all datasets (average accuracy: 98.74%) compared to the next-best-performing algorithm, Random Forest, with an average of 97.7%. The difference is especially visible when we compare performance on the newer CICIDS datasets. This relatively simple approach gives strong outcomes, and developing more complicated neural networks at this stage seems like overkill. This underlines the need to develop newer datasets where more complex approaches can shine and contribute towards making better IDSs. Among currently existing systems, developing new approaches for NSL-KDD/KDD99 is not required because these databases are highly outdated; [20, 21] and [22] report 99% accuracy, which can be achieved by Random Forest alone when tested on NSL-KDD/KDD99. To fully utilize the capabilities of the advanced deep learning techniques mentioned in [20, 21] and [22], we need better datasets. Table 1 shows that CICIDS is able to capture many of the outlying security threats compared to other widely used datasets. CSE-CICIDS-2018 is currently the best publicly available dataset and should be used for further research to develop more robust IDSs until a new dataset emerges that accurately represents newer conditions.
6 Conclusion and Future Work

Our paper proposes a robust deep learning solution for intrusion detection systems that outperforms traditional machine learning techniques. While providing a relatively better approach than classic machine learning techniques, the paper also highlights a lack
of modern datasets. Based on our research, CICIDS-2018 is the best publicly available dataset, on which our proposed solution generates 97.53% accuracy, consistent with industry requirements. However, CICIDS-2018 is still outdated, considering the highly dynamic nature of cyber-attacks. Special-purpose datasets are required, not only to include novel and zero-day attacks but also to cover specialized networks and architectures (such as Internet of Things, Tor Browser, SCADA etc.). Also, detection of cyber-attacks using ML and DL approaches is just one part of the bigger solution. A rapidly changing world of cyber-security needs adaptable datasets and smart data collection techniques. The paper identifies the need for datasets that can quickly extend and integrate with other datasets in order to have a greater impact; datasets would therefore be able to adapt to ongoing network changes. The process of generating or developing newer datasets should be given importance, and there should be a focus on creating a standard platform for dataset generation. This would reduce the extensive problems institutions face when making datasets from scratch. Sharing datasets is sometimes restricted by the owning organization due to privacy issues and the data contained in them, which leads to limited research in the dataset field. Due to the high and variable number of features required to create a successful model, it can be bothersome to simulate real-life scenarios. Furthermore, the IDS pipeline might be used to drive dataset generation, eliminating the need to create a new dataset for each modification that is made. To do this, anomaly-based IDSs may be trained to recognise novel and zero-day attacks using cutting-edge ML approaches.
References

1. https://www.statista.com/statistics/273018/number-of-internet-users-worldwide/
2. https://www.statista.com/statistics/1101442/iot-number-of-connected-devices-worldwide/
3. Yadav R, Phalguni P, Saraswat S (2020) Comparative study of datasets used in cyber security intrusion detection. Int J Sci Res Comput Sci Eng Inf Technol 302–312. https://doi.org/10.32628/CSEIT2063103
4. Wang H, Gu J, Wang S (2017) An effective intrusion detection framework based on SVM with feature augmentation. Knowl-Based Syst 136:130–139. https://doi.org/10.1016/j.knosys.2017.09.014
5. Kuang F, Xu W, Zhang S (2014) A novel hybrid KPCA and SVM with GA model for intrusion detection. Appl Soft Comput 18:178–184. https://doi.org/10.1016/j.asoc.2014.01.028
6. Janarthanan T, Zargari S (2017) IEEE 26th international symposium on industrial electronics (ISIE). IEEE 2017:1881–1886
7. Zine Boukhamla Akram (2021) CICIDS2017 dataset: performance improvements and validation as a robust intrusion detection system testbed. Int J Inf Comput Secur 16:20–32. https://doi.org/10.1504/IJICS.2021.10039325
8. Lin P, Ye K, Xu CZ (2019) Dynamic network anomaly detection system by using deep learning techniques. Cloud Comput CLOUD 2019, LNCS 11513:161–176
9. Farnaaz N, Jabbar MA (2016) Random forest modeling for network intrusion detection system. Proc Comput Sci 89:213–217. https://doi.org/10.1016/j.procs.2016.06.047
10. Elbasiony RM, Sallam EA, Eltobely TE, Fahmy MM (2013) A hybrid network intrusion detection framework based on random forests and weighted k-means. Ain Shams Eng J 4(4):753–762. https://doi.org/10.1016/j.asej.2013.01.003
11. Zong W, Chow Y-W, Susilo W (2018) A two-stage classifier approach for network intrusion detection. In: International conference on information security practice and experience. Springer, Berlin, pp 329–340
12. Stolfo SJ, Fan W, Lee W, Prodromidis A, Chan PK (2000) Cost-based modeling for fraud and intrusion detection: results from the JAM project. Discex 02:1130
13. Tavallaee M, Bagheri E, Lu W, Ghorbani AA (2009) A detailed analysis of the KDD CUP 99 data set. In: Proceedings of the second IEEE international conference on computational intelligence for security and defense applications (CISDA'09), IEEE Press, Piscataway, NJ, USA, pp 53–58. http://portal.acm.org/citation.cfm?id=1736481.1736489
14. Moustafa N, Slay J (2015) Military communications and information systems conference (MilCIS), Canberra, ACT, Australia, 2015:1–6. https://doi.org/10.1109/MilCIS.2015.7348942
15. Sharafaldin I, Lashkari AH, Ghorbani AA (2018) Toward generating a new intrusion detection dataset and intrusion traffic characterization. In: 4th international conference on information systems security and privacy (ICISSP), Portugal, Jan 2018
16. Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323(6088):533–536
17. Dubey SR, Singh SK, Chaudhuri BB (2022) Activation functions in deep learning: a comprehensive survey and benchmark. Neurocomputing
18. Srivastava Y, Murali V, Dubey SR (2019) A performance evaluation of loss functions for deep face recognition. In: National conference on computer vision, pattern recognition, image processing, and graphics. Springer, Berlin, pp 322–332
19. Ruder S (2016) An overview of gradient descent optimization algorithms. arXiv preprint arXiv:1609.04747
20. Jia Y, Wang M, Wang Y (2019) Network intrusion detection algorithm based on deep neural network. IET Inf Secur 13:48–53. https://doi.org/10.1049/iet-ifs.2018.5258
21. Qazi Emad-ul-Haq, Imran Muhammad, Haider Noman, Shoaib Muhammad, Razzak Imran (2022) An intelligent and efficient network intrusion detection system using deep learning. Comput Electr Eng 99:107764. https://doi.org/10.1016/j.compeleceng.2022.107764
22. Kim J, Shin N, Jo S, Kim S (2017) Method of intrusion detection using deep neural network. pp 313–316. https://doi.org/10.1109/BIGCOMP.2017.7881684
Classification of Choroidal Neovascularization (CNV) from Optical Coherence Tomography (OCT) Images Using Efficient Fine-Tuned ResNet and DenseNet Deep Learning Models Megha Goriya, Zeel Amrutiya, Ayush Ghadiya, Jalpesh Vasa, and Bimal Patel
Abstract Age-related macular degeneration (AMD) is a macular degenerative disease that is the primary cause of blindness globally. It is often accompanied by choroidal neovascularization (CNV), the abnormal development of blood vessels, which is a typical complication of AMD. Deep neural network-based predictors can be used to diagnose retinal illness via machine learning. When it comes to medical difficulties, people still have to rely on the doctor's explanation. Machine learning-based methods have not been highly trusted in the medical industry due to the lack of explainability of neural networks, but there is potential for these methods to be useful in diagnosing retinal illnesses. This study focuses on the classification of choroidal neovascularization (CNV) from optical coherence tomography (OCT) images using efficient fine-tuned ResNet and DenseNet deep learning models. The goal is to accurately identify the presence of CNV in OCT images, which can aid in early diagnosis and treatment of this condition. The models are fine-tuned to improve their performance, and the results are compared to determine the most effective model for CNV classification. The study aims to contribute to the development of more accurate and efficient diagnostic tools for CNV using deep learning technology.
Keywords Age-related macular degeneration (AMD) · Choroidal neovascularization (CNV) · Optical coherence tomography (OCT) · Deep learning · Transfer learning · ResNet 50 · DenseNet
1 Introduction

The lens in the human eye focuses light onto the retina, which converts it into neuronal impulses. The macula, situated in the central part of the retina, is responsible for high-resolution color vision in bright light conditions, analyzing visual data for identification and sending it to the brain through the optic nerve. A variety of diseases, including age-related macular degeneration, can have an impact on macular health. Age-related macular degeneration (AMD) is a major factor in vision loss and permanent blindness [1]. According to the World Health Organization's report on the worldwide eye disease study, AMD has rendered 14 million individuals blind or severely impaired. There are two stages of AMD, early and late, with the latter causing visual impairment. AMD is quite frequent among the elderly; in one large community-based study of adults 75 years and older, 30% had early AMD symptoms and 7% had late AMD symptoms. Late AMD is characterized by two key traits, with choroidal neovascularization (CNV) being the most common complication. AMD is considered neovascular if it has CNV, where new blood vessels grow from the choroid to the outer retina, often leading to vision loss due to various complications such as hemorrhage and fluid accumulation [2]. Optical coherence tomography (OCT) [3] employs low-coherence light sources to produce high-resolution cross-sectional images of the optic nerve and retina [4]. This diagnostic tool aids in identifying ailments that cause optic nerve enlargement or atrophy, while also allowing for the observation of macular changes and the display of retinal layers in 3D. OCT is commonly used to detect diabetic macular edema (DME). OCT images have high resolution and are used to evaluate deep learning (DL) algorithms due to their ability to analyze tissues at a microscopic level [5, 6]. In recent years, the majority of research has focused on improving the efficiency of DL algorithms for classifying choroidal neovascularization in OCT images. Researchers have developed innovative deep learning (DL) models by utilizing techniques such as transfer learning, feature learning, and fine-tuning of pre-trained Convolutional Neural Network (CNN) architectures [7]. To detect CNV in OCT images, a range of CNN architectures have been employed, including ResNet50 and DenseNet [5, 8, 9]. Transfer learning is a widely used DL technique that employs a pre-trained model as a starting point for a new task, reducing the time, computational resources, and expertise required for developing neural network models [6, 7]. Researchers can fine-tune the pre-trained models by customizing them to their specific classification task, resulting in enhanced accuracy and efficiency.
The work described in this study is structured into several sections that cover different aspects of the research. The literature review of relevant studies is presented and discussed in Sect. 2. The architectures of the pre-trained models are described in Sect. 3, while the methodology is described in Sect. 4. Section 5 presents the findings and results of the experiment, including the evaluation of the proposed method along with the dataset used. Finally, Sect. 6 summarizes the main conclusions of the study and suggests areas for further research, discussing the implications of the findings and providing suggestions for future work.
2 Related Work Deep learning algorithms [10], specifically Convolutional Neural Networks (CNNs), have become a widely-used tool for identifying different diseases, including those in optical coherence tomography (OCT) image processing [11]. While traditional handcrafted methods have also been utilized, training a CNN model for greater generalization is a time-consuming and data-dependent process. This is where transfer learning comes into play, leveraging the knowledge of trained architectures to improve performance. Transfer learning involves using a pre-trained CNN as a starting point and fine-tuning it for a new task. However, the effectiveness of transfer learning depends on the relationship between the initial and target problems, as the CNN’s feature learning must be relevant to the new task. Therefore, choosing an appropriate pretrained model and adapting it to a new task is essential to achieving high performance in deep learning-based medical image analysis. In this study, Ji and Huang [2] employed the same dataset as a previous study to classify CNV, DRUSEN, DME, and NORMAL. Various popular models including DenseNet121, Inception-v3, and ResNet50 were used for image classification, with each model removing a varying number of blocks from the end of its architecture to improve computational efficiency and diagnostic accuracy [12]. Results showed that ResNet50 achieved the highest accuracy of 99.80%, followed by DenseNet121 with the same accuracy, and Inception-v3 with a maximum accuracy of 99.70%. In their article [13] Yao-Mei Chen and Wei-Tai Huang took pre-trained networks and fine-tuned them on the same data [12] for four classes: CNV, DME, DRUSEN, and NORMAL. The VGG19 had an accuracy of 0.9948, Resnet101 had an accuracy of 0.9928, and Resnet50 had an accuracy of 0.9917. In a study conducted by Kushwaha and Rastogi [14] using the same dataset as in article [12], they expanded the classification task to include four classes—CNV, DME, drusen, and normal. The authors used a five-layer model with hyper parameters of image size 64 × 64, 30 epochs with dropout, and batch normalization. The model achieved an accuracy of 97.92%, which is a good result. It is important to note that the classification task was more complex due to the increased number of classes, and the model still performed well. The results demonstrate the potential of deep learning models to accurately classify retinal images into multiple categories, which could ultimately aid in the diagnosis and treatment of various eye diseases.
Barua and Chan [15] conducted a study to evaluate the accuracy and robustness of their proposed model on two open-source OCT image datasets. Their model utilized a suggested architecture, which incorporated five deep feature generators, including DarkNet53, MobileNetV2, DarkNet19, Efficient-Net b0, and DenseNet201 CNNs, for generating features in the first dataset (DB1). For the second dataset (DB2), the top five CNNs used were DarkNet19, MobileNetV2, ResNet101, DarkNet53. The study combined the features produced by these networks and employed incremental response filtering (IRF) to identify the most useful feature vector. The proposed model achieved an impressive accuracy of 97.30% and 100% on the DB1 and DB2 datasets, respectively. These findings indicate that the proposed model can effectively identify and diagnose retinal diseases from OCT images. However, it is crucial to note that the study was conducted on specific datasets, and the model’s performance may differ when applied to different datasets or real-world scenarios. Mohan et al. [16] proposed the MIDNet-18 model in their study, which demonstrated superior classification performance compared to the RESNET-50, MOBILENET, and DENSENET models for NORMAL, CNV, DME, and DRUSEN retinal images. The MIDNet-18 model was well-trained and achieved a high accuracy rating of over 96.69% for multi-class classification. Asif et al. proposed a ResNet50-based model fine-tuned on an OCT image dataset with expert grading for DRUSEN, CNV, DME, and NORMAL [17]. The proposed model, fine-tuned on a publicly available OCT image dataset, achieved an impressive overall classification accuracy of 99.48%, with only five misclassifications out of 968 test samples. The model’s performance was evaluated using the ROC curve, demonstrating the effectiveness of transfer learning and fine-tuning pre-trained models in accurately classifying OCT images of various retinal conditions, including DRUSEN, CNV, and DME. The study emphasizes the potential of deep learning models in assisting in the diagnosis and treatment of retinal diseases. Raen et al. [18] in their study proposed a modification to the ResNet50 architecture by adding three layers: Conv, Batch Normalization, and Activation Relu. They tested their model on a dataset of 5998 OCT images, including 1500 CNV images, 1500 DME images, 1500 Drusen images, and 1498 normal eye condition images. The model achieved a high level of test accuracy, which was measured at 99.81%. The proposed architecture’s performance was assessed using various metrics, including precision, recall, F1-score, and the area under the receiver operating characteristic (ROC) curve. The study demonstrated that the proposed architecture can effectively classify OCT images and could potentially be used in clinical settings for the accurate diagnosis of retinal diseases.
3 Architectures of Pre-trained Models ResNet and DenseNet are deep neural network architectures pre-trained on large image datasets. ResNet employs residual connections to allow for training of deeper networks, while DenseNet utilizes densely connected layers to maximize information flow between layers. Both architectures have been widely used for image classification and computer vision tasks.
3.1 ResNet-50 ResNet-50 is a popular deep neural network used for computer vision tasks, such as image segmentation and object detection, that was developed by Kaiming He and his colleagues to overcome the issue of vanishing gradients that impeded the training of deep neural networks. In the 2015 ImageNet challenge, ResNet outperformed other deep neural network architectures, achieving an error rate of 3.57%. The deeper the network, the more complex functions it can represent, but the gradient signal diminishes rapidly, hindering the training of large neural networks. This is due to the weight matrix multiplying the gradient each time it back propagates from the top layer to the bottom layer, making gradient descent slow. Skip Connection. A “shortcut” or “skip connection” in ResNet design enables the gradient to be directly backpropagated to earlier layers as shown in Fig. 1. The ResNet architecture employs two types of blocks to determine whether the input and output dimensions are equal or different. These blocks can be stacked to create a deep neural network. The top image shows the primary pathway through the network, while the bottom image includes a shortcut to the main route. Identity Block. The identity block is a key component of ResNet used when input and output activations have the same dimension. It preserves information flow, allowing deep networks to be trained without the vanishing gradient issue. The block involves convolutional layers, batch normalization, and activation, with the output added to
Fig. 1 Skip connection in ResNet
Fig. 2 Five stages of ResNet50 model
the input to produce the final output. By stacking multiple identity blocks, ResNet can achieve exceptional accuracy in computer vision tasks. Convolutional Block. In a ResNet, when the dimensions of the input and output activations are different, a different type of block can be used to account for this difference. This type of block is different from the identity block in that it includes a convolutional layer (CONV2D) in the shortcut path. This additional layer is used to adjust the dimensions of the shortcut connection to match the dimensions of the main path, allowing the two paths to be added together. This type of block is used in ResNets to enable the construction of very deep neural networks while avoiding the vanishing gradient problem that can occur in traditional deep networks. The ResNet-50 model consists of five stages as shown in Fig. 2, each including a convolutional block and an identity block, with each block having three convolutional layers. This model contains around 23 million trainable parameters.
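To make the identity block concrete, a minimal Keras-style sketch is given below: three convolution/batch-normalization stages whose output is added back to the unchanged input via the skip connection. The function name and filter arguments are illustrative, not taken from the paper.

```python
from tensorflow.keras import layers

def identity_block(x, filters):
    """Sketch of a ResNet identity block with a skip connection."""
    f1, f2, f3 = filters
    shortcut = x                                   # skip connection (dimensions unchanged)
    x = layers.Conv2D(f1, 1, padding="valid")(x)   # 1x1 convolution
    x = layers.BatchNormalization()(x)
    x = layers.Activation("relu")(x)
    x = layers.Conv2D(f2, 3, padding="same")(x)    # 3x3 convolution
    x = layers.BatchNormalization()(x)
    x = layers.Activation("relu")(x)
    x = layers.Conv2D(f3, 1, padding="valid")(x)   # 1x1 convolution
    x = layers.BatchNormalization()(x)
    x = layers.Add()([x, shortcut])                # add the unchanged input back in
    return layers.Activation("relu")(x)
```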
3.2 DenseNet

In a traditional feed-forward convolutional neural network (CNN), the output feature map of each convolutional layer is only passed on to the subsequent layer, with the first layer receiving the input. This means that an "L"-layer network has L direct connections, each connecting one layer to the next, as shown in Fig. 3.
Fig. 3 Working of DenseNet
create a highly connected network. In a DenseNet, each layer is directly connected to every subsequent layer, resulting in L(L + 1)/2 direct connections in an L-layer network. This connectivity structure allows information to flow more easily throughout the network, making DenseNets highly effective in learning from input data.

Connectivity. DenseNets use the concatenated output feature maps of all preceding layers as inputs for each subsequent layer, unlike traditional CNNs that only use the previous layer's output. This design creates dense connections between layers, promoting feature reuse and removing duplicate mappings, resulting in fewer parameters than standard CNNs. This connectivity helps to preserve information and mitigate the vanishing gradient problem. In conclusion, DenseNets are highly connected networks that promote feature reuse, preserve information, and require fewer parameters than standard CNNs, achieved by concatenating the feature maps of preceding layers.

DenseBlocks. DenseNets' concatenation method is not viable when feature map sizes differ. Downsampling layers play a significant role in CNNs by reducing dimensionality for faster computation. DenseNets solve this problem by dividing the network into DenseBlocks, within which feature map sizes remain constant while the number of filters changes between them. Transition layers, placed between the blocks, cut the number of channels in half.

GrowthRate. In a DenseNet, the features can be thought of as the network's overall state. Within each dense block, the number of feature maps increases because each layer contributes "K" new feature maps to the global state. The growth rate parameter "K" determines how much information each layer adds to the network.

BottleNeck Layers. Even though each layer only generates K output feature maps, there might be a large number of inputs, especially for the higher layers. To increase computing efficiency and speed, a 1 × 1 convolution layer can be added as a bottleneck layer before each 3 × 3 convolution.
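The dense connectivity, growth rate and bottleneck ideas above can be sketched as follows; this is an illustrative Keras-style dense block, not the exact DenseNet variant used in the study.

```python
from tensorflow.keras import layers

def dense_block(x, num_layers, growth_rate):
    """Sketch of a dense block: each layer sees the concatenation of all previous
    feature maps and contributes growth_rate (K) new feature maps."""
    for _ in range(num_layers):
        y = layers.BatchNormalization()(x)
        y = layers.Activation("relu")(y)
        y = layers.Conv2D(4 * growth_rate, 1, padding="same")(y)   # 1x1 bottleneck layer
        y = layers.BatchNormalization()(y)
        y = layers.Activation("relu")(y)
        y = layers.Conv2D(growth_rate, 3, padding="same")(y)       # K new feature maps
        x = layers.Concatenate()([x, y])                           # dense connectivity
    return x
```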
4 Methodology The research method involved data collection, image augmentation, transfer learning using CNN-based models, performance comparison, and classification of OCT images into CNV and Normal. The following steps were followed to achieve the research objectives.
4.1 Data Pre-processing Data pre-processing is crucial in deep learning; it involves cleaning, transforming, and normalizing data to make it suitable for analysis by a neural network. The goal is to reduce noise, standardize data formats, and ensure that the data is representative of the population being analyzed. Data Augmentation In machine learning, data augmentation adds extra, synthetically varied samples to a dataset to improve algorithm performance. In this study the Keras image data generator class was used to augment the images. This class performs real-time image augmentation, generating new images during model training to make the model more accurate and robust. The advantage of using Keras is that it simplifies the image augmentation process and integrates directly with the deep learning pipeline.
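A minimal sketch of the Keras image data generator mentioned above is shown below. The transformation ranges, image size and directory layout ("OCT/train" with one subfolder per class) are assumptions for illustration, not the authors' exact settings.

```python
# Real-time augmentation with the Keras ImageDataGenerator (illustrative settings).
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,        # normalize pixel intensities
    rotation_range=10,        # small random rotations
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.1,
    horizontal_flip=True,
)

# Generates augmented batches on the fly during model training.
train_generator = train_datagen.flow_from_directory(
    "OCT/train",              # hypothetical folder with CNV/ and NORMAL/ subfolders
    target_size=(224, 224),
    batch_size=32,
    class_mode="binary",
)
```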
4.2 Transfer Learning Transfer learning is a technique where a pre-trained neural network model can be used as a starting point to create a new model. The pre-trained model has already learned features from a large dataset, and these features can be used as a foundation for a new model that can then be trained on a smaller dataset for a specific task. This technique is particularly useful for deep learning because it allows researchers to avoid the time and resources required to train large models from scratch, and can often lead to improved performance on the new task. Deep Learning architecture for Transfer learning is shown in Fig. 4.
Fig. 4 DL architecture for transfer learning
4.3 Pre-trained Weights Using the weights of a previously trained network as the initial weight values for a new experiment, instead of starting from scratch, is known as using a pre-trained network. Pre-trained models are compared against one another on the ImageNet classification benchmark, which provides a standard assessment of image-classification quality, and ImageNet weights are therefore used in many transfer learning models.
4.4 Activation Function An activation function is a mathematical function applied to the weighted sum of a neuron's inputs, and its output is passed as input to the next layer of the network. The sigmoid function, σ(x) = 1/(1 + e^(−x)), is one such activation function; when it is used as the activation of a neuron, the output of the neuron is always a value between 0 and 1.
4.5 Optimizers In deep learning, an optimizer is the method used to adjust the parameters of a neural network, such as its weights and learning rate, with the aim of reducing the total loss and improving accuracy. One such optimizer is the Adaptive Moment Estimation (Adam) algorithm, an extension of gradient-descent optimization. Adam has in turn been extended with the AdaMax algorithm: AdaMax replaces the L2 norm of the gradients used in Adam with the L-infinity norm, which can lead to better generalization of the optimization process. In addition, AdaMax provides better control over the learning rate and requires less memory, making it an efficient optimizer for deep learning.
4.6 Model Training with Fine-Tuning A total of 64,004 images belonging to two classes were used for training in the proposed methodology. The DenseNet and ResNet models were utilized in the study. The pre-processed images were first augmented and then fed into the models. Both models used ImageNet weights, the sigmoid activation function, and the Adamax optimizer. Since they had previously been trained on the ImageNet database,
Fig. 5 Proposed methodology
all of the top layers with dense blocks were frozen. Some of the final layers were then unfrozen and trained on the dataset. To classify the dataset containing two classes (CNV and Normal), a dense layer with a softmax activation function was added at the output. The overall architecture for evaluating transfer learning, along with the detailed steps, is shown in Fig. 5. The next section describes the experimental setup, and the results of both the DenseNet and ResNet transfer-learning models are reported there.
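The following is a minimal sketch of the training setup described above in TensorFlow/Keras: ImageNet weights, a frozen base, a small trainable head, a sigmoid output for the binary CNV/Normal problem, and the Adamax optimizer. The choice of DenseNet121, the number of unfrozen layers and the learning rates are illustrative assumptions, not the authors' exact configuration.

```python
# Sketch of transfer learning with frozen ImageNet weights and fine-tuning.
from tensorflow.keras.applications import DenseNet121
from tensorflow.keras import layers, models, optimizers

base = DenseNet121(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False                       # freeze the pre-trained layers

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(1, activation="sigmoid"),   # binary output: CNV vs. Normal
])
model.compile(optimizer=optimizers.Adamax(learning_rate=1e-3),
              loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(train_generator, epochs=...)     # initial training of the new head

# Fine-tuning: unfreeze only the last few layers of the base network.
base.trainable = True
for layer in base.layers[:-20]:
    layer.trainable = False
model.compile(optimizer=optimizers.Adamax(learning_rate=1e-4),
              loss="binary_crossentropy", metrics=["accuracy"])
```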
5 Experimental Evaluation We used a 411-NGN-Geforce system with an Intel Core i7-8700K CPU @ 3.70 GHz and a 64-bit operating system for fast processing, with 32 GB of RAM, a 2 TB hard drive, and a single NVIDIA GP100 accelerator graphics card.
5.1 Dataset Description In paper [12] the authors utilized a CNN model to diagnose eye diseases from retinal optical coherence tomography (OCT) images. They trained and tested the model on a dataset of over 83,000 images, including CNV, DME, Drusen, and normal eye condition images, with 968 images in the testing set. The images were preprocessed and labeled based on patient disease and ID. The study aimed to reduce the time-consuming manual interpretation of OCT images using machine learning. The present study focuses on the classification of two classes, CNV and normal, and the dataset is available for further research. This research demonstrates the potential of using machine learning to diagnose retinal illnesses by leveraging large-scale datasets and deep learning algorithms. With the ability to accurately classify OCT images, medical professionals can save time and make more informed decisions. The availability of the dataset opens up the possibility of developing more advanced and accurate models in the future, which can assist in the early detection and treatment of retinal diseases.
5.2 Metrics for Model Evaluation The accuracy and loss metrics have been used to evaluate the goodness of the model, with accuracy defined in Eq. (1):
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$
For the DenseNet architecture we obtained 99.6% accuracy, with the corresponding validation loss and accuracy depicted in Figs. 6 and 7. Similarly, for the fine-tuned ResNet50 architecture we obtained 99.65% accuracy, with the corresponding curves depicted in Figs. 8 and 9.
6 Conclusion In summary, the study revealed that deep learning models can significantly improve the accuracy of classifying Choroidal Neovascularization (CNV) from optical coherence tomography (OCT) images. The fine-tuned ResNet and DenseNet models exhibited high accuracy, showcasing the potential of transfer learning techniques.
Fig. 6 Validation loss for DenseNet
Fig. 7 Validation accuracy for DenseNet
Fig. 8 Validation loss for ResNet50
This has promising implications for the timely diagnosis and treatment of ocular diseases, as precise CNV classification is crucial for effective treatment planning. Additionally, these models have the potential to serve as a screening tool to quickly identify patients who may require further testing or treatment. However, there is
Fig. 9 Validation accuracy for ResNet50
still a need for further research and improvement. Although the models achieved high accuracy in this study, they were evaluated on a relatively small and homogenous dataset. A more extensive and diverse dataset may uncover new challenges and limitations that need to be addressed. Furthermore, the fine-tuning process of the pretrained models can be further refined to achieve even higher levels of accuracy. With continuous research and development, deep learning models can make a substantial impact in the field of ophthalmology, leading to improved patient outcomes.
References 1. Serener A, Serte S (2019) Dry and wet age-related macular degeneration classification using OCT images and deep learning. In:2019 Scientific meeting on electrical-electronics & biomedical engineering and computer science (EBBT), pp 1–4. https://doi.org/10.1109/EBBT.2019. 8741768 2. Ji Q, Huang J, He W, Sun Y (2019) Optimized deep convolutional neural networks for identification of macular diseases from optical coherence tomography images. Algorithms 12(3):51. https://doi.org/10.3390/a12030051 3. Upadhyay PK, Rastogi S, Kumar KV (2022) Coherent convolution neural network based retinal disease detection using optical coherence tomographic images. J King Saud Univ Comput Inform Sci 34(10):9688–9695. https://doi.org/10.1016/j.jksuci.2021.12.002 4. Nugroho KA (2018) A comparison of handcrafted and deep neural network feature extraction for classifying optical coherence tomography (OCT) images. In: 2018 2nd International conference on informatics and computational sciences (ICICoS), pp 1–6. https://doi.org/10. 1109/ICICOS.2018.8621687 5. Mbunge E, Simelane S, Fashoto SG, Akinnuwesi B, Metfula AS (2021) Application of deep learning and machine learning models to detect COVID-19 face masks—a review. Sustain Oper Comput 2:235–245. https://doi.org/10.1016/j.susoc.2021.08.001 6. Vasa J, Thakkar A (2022) Deep learning: differential privacy preservation in the era of big data. J Comput Inform Syst 1–24. https://doi.org/10.1080/08874417.2022.2089775 7. Sarker L, Islam M, Hannan T, Ahmed Z (2021) COVID-DenseNet: a deep learning architecture to detect COVID-19 from chest radiology images 8. Too EC, Yujian L, Njuki S, Yingchun L (2019) A comparative study of fine-tuning deep learning models for plant disease identification. Comput Electron Agric 161:272–279. https://doi.org/ 10.1016/j.compag.2018.03.032
9. Reddy N, Rattani A, Derakhshani R (2018) Comparison of deep learning models for biometricbased mobile user authentication. In: 2018 IEEE 9th International conference on biometrics theory, applications and systems (BTAS), Redondo Beach, CA, USA, Oct 2018, pp 1–6. https:/ /doi.org/10.1109/BTAS.2018.8698586 10. Shrestha A, Mahmood A (2019) Review of deep learning algorithms and architectures. IEEE Access 7:53040–53065. https://doi.org/10.1109/ACCESS.2019.2912200 11. Alqudah AM (2020) AOCT-NET: a convolutional network automated classification of multiclass retinal diseases using spectral-domain optical coherence tomography images. Med Biol Eng Comput 58(1):41–53. https://doi.org/10.1007/s11517-019-02066-y 12. Naik A, Pavana BS, Sooda K (2022) Retinal disease classification from retinal-OCT images using deep learning methods. In: Machine intelligence and data science applications, vol 132, Aug 22. Springer, pp 95–104. [Online]. Available: https://doi.org/10.1007/978-981-19-234 7-0_8 13. Chen Y-M, Huang W-T, Ho W-H, Tsai J-T (2021) Classification of age-related macular degeneration using convolutional-neural-network-based transfer learning. BMC Bioinf 22(S5):99. https://doi.org/10.1186/s12859-021-04001-1 14. Kushwaha AK, Rastogi S (2022) Solution to OCT diagnosis using simple baseline CNN models and hyperparameter tuning. In: International conference on innovative computing and communications, Singapore, pp 353–366 15. Barua PD et al (2021) Multilevel deep feature generation framework for automated detection of retinal abnormalities using OCT images. Entropy 23(12):1651. https://doi.org/10.3390/e23 121651 16. Mohan R, Ganapathy K, Arunmozhi R (2022) Comparison of proposed DCNN model with standard CNN architectures for retinal diseases classification. J Popul Ther Clin Pharmacol 29(3). https://doi.org/10.47750/jptcp.2022.945 17. Asif S, Amjad K, Qurrat-ul-Ain (2022) Deep residual network for diagnosis of retinal diseases using optical coherence tomography images. Interdisc Sci Comput Life Sci 14(4):906–916. https://doi.org/10.1007/s12539-022-00533-z 18. Raen R, Islam MM, Islam R (2022) Diagnosis of retinal diseases by classifying lesions in retinal layers using a modified ResNet architecture. In:2022 International conference on advancement in electrical and electronic engineering (ICAEEE), pp 1–6. https://doi.org/10.1109/ICAEEE 54957.2022.9836427
Early Heart Disease Prediction Using Support Vector Machine T. Bala Krishna, Neelapala Vimala, Pakala Vinay, Nagudu Siddhardha, and Padilam Murali Manohar
Abstract Heart disease is a significant issue for world health, and early detection and prevention can significantly enhance outcomes. In this work, we investigated the potential of the Kardia Mobile device and Support Vector Machines (SVMs) for early heart disease prediction. An electrocardiogram (ECG) can be taken anytime, anywhere with the Kardia Mobile, a medical device that can identify early heart disease symptoms. The SVM method uses the ECG data as input to forecast the prevalence of heart disease in a population. The findings showed that, using ECG data from the Kardia Mobile device, the SVM algorithm was able to precisely forecast the risk of heart disease in individuals. The algorithm was able to recognise those who were at high risk of developing heart disease by taking into account a number of variables, such as heart rate variability, ECG morphological features, and demographic data. Keywords Support vector machine (SVM) · Kardia mobile · Electrocardiogram (ECG)
1 Introduction The electrocardiogram (ECG) is a key diagnostic tool when determining the cause of cardiovascular disease, the leading cause of mortality worldwide. Diagnosing cardiac disease is more difficult than deciding whether or not the ECG is normal. Conventional ECG signal analysis and classification is labor- and time-intensive, since it depends on the judgement of skilled medical professionals. Thus, there is a rising demand for computer-based, fully automated ECG analysis [1]. Feature extraction is challenging in ECG analysis [1]. Signal processing is used by current classification methods to extract medical features. Systems combine the features and compare them with features taken from other cardiac diseases to identify
the ECG. Certain features, however, are challenging to extract because of noise interference. It is impossible to create a system that can extract all the necessary properties because the ECG signals of various cardiovascular diseases have diverse qualities. Because of this, each diagnostic system has a low accuracy and limited scalability [2].
2 Literature Review Early heart disease prediction is a crucial component of cardiovascular health since it enables people to take preventative action to stop the progression of the disease. An electrocardiogram (ECG) instrument called the Kardia mobile can be used to identify heart disorders including atrial fibrillation (AFib). The device, which measures the electrical signals produced by the heart and can yield an ECG readout in about 30 s, is linked to the fingertips [3]. For classification and regression issues, Support Vector Machines (SVM) is a machine learning technique that has been applied in a variety of disciplines. By utilising a variety of physiological and demographic factors as input features, SVM has been employed in the healthcare industry to predict heart disease [2, 4]. A few studies have been published in the literature that collected ECG data using the Kardia mobile device and then utilised SVM algorithms to forecast cardiac illness. In one study, the Kardia mobile device’s ECG signals were gathered and pre-processed to extract parameters like heart rate, QRS complex, and ST segment. The SVM classifier used these features as input to predict heart disease. The study discovered that the SVM classifier was capable of predicting heart disease with great accuracy, with an overall accuracy of 92%. The results of the literature review suggest that combining the Kardia mobile device with SVM algorithms can be a successful way to predict cardiac disease early on. However, additional study is required to verify the findings and create more reliable and universal models for heart disease prediction [5].
3 Existing Method 3.1 Stochastic Gradient Descent SGD aims to minimise a cost function by iteratively modifying the model’s parameters [6]. In order to estimate the gradient of the cost function with respect to the model parameters, the algorithm randomly selects a subset (or “batch”) of the training data at each iteration. The cost function’s value is then decreased by updating the model parameters in the direction of the negative gradient [7].
The “stochastic” component of SGD refers to the fact that the method modifies the parameters using a single data batch at a time as opposed to the complete dataset [8]. This increases the algorithm’s computing effectiveness and gives it the ability to handle big datasets [9]. The updates become noisy as a result since each batch only delivers a noisy estimate of the true gradient [10].
3.2 KNN The distance between each new data point and every other point in the training dataset is calculated as part of the KNN algorithm’s operation. The next step is to choose the k-nearest neighbours based on their proximity to the newly added data item. The majority class of the k-nearest neighbours is then used to forecast the class of the new data point [11]. The average value of a new data point’s k-nearest neighbours is utilised as the projected value when using KNN for regression tasks.
3.3 Naïve Bayes Given the observed values of the characteristics, the Naive Bayes method determines the probability of each class label. This is accomplished by presuming that, given the class label, the features are conditionally independent of one another. This presumption makes the computations easier and increases the computational efficiency of the method. The procedure determines the likelihood of each class label given the observed values of the characteristics in order to classify a new data point using Naive Bayes. The predicted class for the new data point is then chosen as the class label with the highest probability [12].
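To make the three baseline methods of Sects. 3.1-3.3 concrete, the following is a minimal sketch comparing them with scikit-learn. The feature matrix here is synthetic and the feature names (heart rate, PR interval, QRS duration, etc.) are assumptions for illustration; the actual study uses features extracted from Kardia Mobile ECG recordings.

```python
# Sketch of the baseline classifiers (SGD, KNN, Naive Bayes) on synthetic data.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import SGDClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))                  # 4 hypothetical ECG-derived features
y = (X[:, 0] + 0.5 * X[:, 2] > 0).astype(int)  # synthetic binary labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "SGD": SGDClassifier(max_iter=1000),        # per-batch gradient updates
    "KNN": KNeighborsClassifier(n_neighbors=5), # majority vote of 5 nearest neighbours
    "NaiveBayes": GaussianNB(),                 # conditional-independence assumption
}
for name, clf in models.items():
    clf.fit(X_tr, y_tr)
    print(name, "accuracy:", clf.score(X_te, y_te))
```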
4 Proposed Method 4.1 SVM The Support Vector Machine (SVM) is a popular machine learning approach used for both classification and regression tasks. SVM is a supervised learning method that finds the optimal hyperplane dividing the different classes in the dataset. The hyperplane is chosen so as to maximise the margin, i.e. the distance between the hyperplane and the nearest data points of each class. SVM can be applied to both linear and non-linear datasets [13].
The following are some essential SVM concepts:
Hyperplane: in a two-dimensional dataset, a hyperplane is a line that divides the two classes; in higher dimensions it is a surface that separates the classes.
Support vectors: the data points that lie closest to the hyperplane and determine its position. The SVM algorithm uses these support vectors to maximise the margin between the two classes.
Kernel trick: non-linear datasets can be converted into linearly separable datasets. The SVM algorithm applies a non-linear transformation to the original dataset to produce a new feature space in which the data points can be separated by a hyperplane [6].
Regularisation parameter: controls the trade-off between maximising the margin and minimising the classification error. A lower value produces a broader margin but more classification errors, whereas a higher value produces a narrower margin but fewer classification errors.
Accuracy: the accuracy of each method is calculated and compared, and the best one is selected:
Accuracy = (TP + TN)/(TP + TN + FP + FN)
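A minimal sketch of the proposed SVM classifier with scikit-learn is shown below, reusing the train/test split from the earlier baseline sketch. The RBF kernel and the value of the regularisation parameter C are illustrative assumptions; feature scaling is added because SVMs are sensitive to feature ranges.

```python
# Sketch of an RBF-kernel SVM on ECG-derived features (illustrative settings).
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.metrics import accuracy_score

# X_tr, X_te, y_tr, y_te: the features and labels prepared in the previous sketch
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
svm.fit(X_tr, y_tr)
pred = svm.predict(X_te)
print("SVM accuracy:", accuracy_score(y_te, pred))
```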
5 Data Collection 5.1 How to Record the ECG Signals Setting up the device and your hands Your phone and the Kardia gadget should both be placed on the table in front of you. Make sure the Kardia gadget is 6 inches or less from the microphone on your smartphone (usually at the bottom portion that you would normally speak in to, iPads normally have the microphone near the top). You can place your wrists and forearms on the table in front of you while placing your fingernails on the Kardia EKG electrodes as illustrated in Fig. 1. During recording Be sure to maintain your stillness while the mobile gadget records your ECG [1]. Noise in the tracing will result from movement during recordings. Don’t squeeze or press down too hard on the electrodes; instead, keep your touch soft.
Fig. 1 Record EKG
Fig. 2 Value extraction
5.2 Value Extraction We are extracting the Heart Rate, P, PR interval, and QRS complex for our ECG signals with reference to this wave [1] (Fig. 2).
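The sketch below illustrates one possible way to derive a heart-rate estimate from a raw single-lead ECG trace using SciPy peak detection. The sampling rate, file name and thresholds are assumptions for illustration; the PR-interval and QRS measurements used in the paper would require proper wave delineation beyond this simple R-peak detection.

```python
# Sketch: estimate heart rate from a raw ECG trace via R-peak detection.
import numpy as np
from scipy.signal import find_peaks

fs = 300                              # assumed single-lead ECG sampling rate, Hz
ecg = np.loadtxt("ecg_sample.csv")    # hypothetical 30 s recording, one sample per line

# R-peaks: tall maxima separated by at least 0.4 s (refractory period)
peaks, _ = find_peaks(ecg, height=np.percentile(ecg, 95), distance=int(0.4 * fs))

rr_intervals = np.diff(peaks) / fs            # seconds between successive beats
heart_rate = 60.0 / rr_intervals.mean()       # beats per minute
print("Estimated heart rate: %.1f bpm" % heart_rate)
```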
6 Generating Dataset See Fig. 3.
Fig. 3 Data set
Fig. 4 Accuracy evaluation
7 Performance Evolution See Fig. 4.
8 User Interface See Fig. 5.
Fig. 5 Web page for user interaction
9 Results See Figs. 6 and 7.
Fig. 6 Output sample
Fig. 7 Output sample
10 Conclusion Using Support Vector Machines (SVMs) and the Kardia Mobile device to predict early heart disease is, in our opinion, a significant advancement in the field of cardiovascular health. Heart disease can be prevented and its effects significantly reduced if the risk of heart disease can be precisely predicted from ECG data gathered by the Kardia Mobile. The findings of this study demonstrated that the SVM algorithm outperformed conventional statistical techniques and was able to predict the risk of heart disease accurately, making it a useful tool for medical professionals and for people looking to take charge of their heart health. The pairing of the Kardia Mobile device and SVMs gives people an easy and accessible way to monitor their heart health and find early indicators of heart disease. By enabling people to take ECGs whenever and wherever they are, and to forecast their risk of heart disease, the Kardia Mobile and SVMs can aid early detection and prevention, resulting in better outcomes and an improved quality of life. In short, this study emphasises the potential of SVMs and the Kardia Mobile device to transform the field of cardiovascular health and to have a significant effect on public health [14].
11 Future Scope This project predicts people with cardiovascular disease by extracting, from a dataset of patients' medical histories (chest pain, sugar level, blood pressure, etc.), the history that leads to fatal heart disease [3]. It is also helpful for the ongoing health monitoring of patients' conditions.
References 1. Bhandary A, Rajendra Acharya U, Adeli H (2019) Congestive heart failure automated detection employing dual tree complex wavelet transform and statistical features collected from two seconds’ worth of ECG signals. Comput Methods Programs Biomed 173:27–35. https://doi. org/10.1016/j.cmpb 2. Kaushik S, Panigrahi BK, Kumar P (2019) An intelligent method for predicting heart illness based on heart rate variability. J Ambient Intell Humanized Comput 10(8):3209–3219. https:/ /doi.org/10.1007/s12652-019-01400-5 3. Chua ECP, Lim CS (2018) A review of machine learning methods for heart disease prediction. J Healthc Eng 1–16. https://doi.org/10.1155/2018/1724195 4. Li J, Li Q (2020) A brand-new approach based on SVM and PCA for the diagnosis of coronary artery disease. J Med Syst 44(4):1–9. https://doi.org/10.1007/s10916-020-01529-5 5. Singh R, Kumar P, Panigrahi BK (2018) Employing machine learning techniques to identify heart problems early. J Med Syst 42(7):1–9. https://doi.org/10.1007/s10916-018-0981-3 6. Abinaya R, Aditya Y et al (2020) Automated classification of oral squamous cell carcinoma stages detection using deep learning techniques. Eur J Mol Clin Med 7(4):1111–1119 7. Indira DNVSLS, Abinaya R et al (2021) Secured personal health records using pattern based verification and 2-way polynomial protocol in cloud infrastructure. Int J Ad Hoc Ubiquitous Comput 40(3):86–93. ISSN: 1743-8233 8. Abinaya R, Sigappi AN (2018) A keystroke dynamics based biometric for person authentication. Int J Pure Appl Math 118(5):769–783 9. Abinaya R, Sigappi AN (2020) A novel biometric approach for facial image recognition using deep learning. Int J Adv Trends Comput Sci Eng 9(5):8874–8879 10. Abinaya R, Indira DNVSLS et al (2021) Acoustic based scene event identification using deep learning CNN. Turk J Comput Math Educ 12(5):1398–1305. ISSN: 1398-1405 11. Abinaya R, Sigappi AN. Biometric identification of genuine user/imposter by keystrokes dynamic database. Int J ChemTech Res 1429–1443. ISSN: 0974-4304 12. Abinaya R, Sigappi AN (2018) Machine learning based biometric for person authentication by using BeiHang keystroke dynamic dataset. Int J Sci Res Comput Sci Appl Manag Stud 7(3):1–13. ISSN: 2319-1953 13. Abinaya R, Sigappi AN (2018) LSTM based fusion biometric for person identification using audio and keystroke dynamics information. Int J Sci Res Comput Sci Appl Manag Stud 7(6):1– 12. ISSN: 2319-1953 14. Abinaya R, Sigappi AN (2018) Biometric authentication of age and gender prediction by using GREYC keystroke dynamics dataset. J Sci Res Sci Eng Technol (IJSRSET) 5(5): 289–296. ISSN: 2394-4099 15. Abinaya R, Sigappi AN (2018) Deep learning based soft biometric identification of age and gender using grey keystroke dynamics. J Adv Res Dyn Control Syst Des 10(8):1429–1443. ISSN: 1943-025X 16. Abinaya R, Sigappi AN (2017) Identification of genuine user/imposter by keystrokes dynamic dataset using machine learning techniques. Int J Adv Res Trends Eng Technol (IJARTET) 5(4):903–913. ISSN: 2394-3777
An Analysis of Task-Based Algorithms for Cloud Computing Murli Patel, Abhinav Sharma, Priyank Vaidya, and Nishant Doshi
Abstract This review paper provides a comprehensive analysis of the most widely used task scheduling algorithms for load balancing in cloud computing environments. Task scheduling is a major form of scheduling in which the scheduler must allocate processes to resources; in fact, most load balancing schemes in cloud computing are based on it. To reduce overall execution time, task scheduling algorithms are implemented either as part of an application or across the whole application environment. Task-based algorithms help obtain the best result according to the requirements, which makes the machine faster. The paper aims to evaluate the performance of these algorithms in terms of key metrics such as resource utilization, task completion time, and scalability. Keywords Task scheduling · Load balancing · Cloud computing · Min–min · Max–min · Genetic · Round robin
1 Introduction Load balancing is a critical aspect of modern computing systems, particularly in distributed and cloud computing environments, as it ensures efficient utilization of resources. Task scheduling algorithms play a crucial role in load balancing by deciding which task should be assigned to which processor and which should be executed first. Many task scheduling algorithms have been proposed and implemented, each with its own strengths and limitations [1]. This review paper aims to provide an overview of the most widely used task scheduling algorithms for load balancing and to critically evaluate their performance and that of their derived versions.
The process of allocating load to the available resources so as to prevent resource overload is known as load balancing. More effective load balancing methods increase user satisfaction. In cloud computing there are several different types of load, including CPU load, network load, and memory capacity issues. The challenges faced in load balancing include protection of data, regularity and maintenance, cloud computing capabilities, and adjustment of the burden on servers [2]. The benefits include scalability, virtualization, low infrastructure cost, and increased storage [3]. Some of the main objectives of load balancing algorithms are:
• Improving cost effectiveness.
• Enhancing scalability while maintaining flexibility.
• Priority.
Load balancing algorithms are categorized on the basis of two factors:
• Static algorithms: these need information about all the resources of the system and are suitable for homogeneous processes.
• Dynamic algorithms: these work at run-time, work well for heterogeneous systems, and are used with unpredictable loads.
2 Algorithm Based on Task Scheduling Task scheduling decides which process goes first for execution; this is decided by the scheduler. The distinct set of rules through which the processes are executed is referred to as a task scheduling algorithm. The scheduler optimises and assigns tasks, placing processes in the job-ready queue [4, 5]. Task scheduling is done to utilize resources efficiently by allocating specific processes to specific resources, which automatically improves quality of service and performance. Some common algorithms, min–min, max–min, genetic and round robin, are discussed below [6].
2.1 Min Min Algorithm
2.1.1 Introduction
The simple min–min algorithm executes first the task that has the minimum completion time among all given tasks, and it reschedules after every task execution. A majority of non-scalable tasks are executed by this algorithm, and it serves tasks on a first-come, first-served basis [7, 8]. As per Fig. 1, each task is allocated to the resource giving the minimum completion time, so short, immediate tasks are handled well by this algorithm [7, 9]; a small illustrative sketch in Python follows Fig. 1.
Fig. 1 Flowchart of min min algorithm [8]
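The following is a minimal Python sketch of min–min scheduling, assuming an execution-time matrix etc[i][j] giving the time of task i on resource j; the matrix values are illustrative. Max–min (Sect. 2.2) is the symmetric variant: instead of the task with the overall minimum completion time, it picks the task whose minimum completion time is largest.

```python
# Sketch of min-min scheduling over an execution-time matrix (illustrative data).
def min_min(etc):
    n_tasks, n_res = len(etc), len(etc[0])
    ready = [0.0] * n_res                  # time at which each resource becomes free
    unscheduled = set(range(n_tasks))
    schedule = {}
    while unscheduled:
        best = None                        # (completion_time, task, resource)
        for t in unscheduled:
            for r in range(n_res):
                ct = ready[r] + etc[t][r]  # completion time of task t on resource r
                if best is None or ct < best[0]:
                    best = (ct, t, r)
        ct, t, r = best                    # task with the overall minimum completion time
        schedule[t] = r
        ready[r] = ct
        unscheduled.remove(t)
    return schedule, max(ready)            # assignment and resulting makespan

etc = [[4, 6], [3, 8], [9, 2], [5, 5]]     # 4 tasks, 2 resources (illustrative)
print(min_min(etc))
```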
2.1.2 Advantages
• Produces a better makespan when most tasks have small execution times.
• Works well for homogeneous data.
2.1.3 Disadvantages
• Load imbalance and starvation of tasks with large service times.
• It may not perform well on other kinds of data sets.
2.1.4 Analysis
The min–min algorithm is applicable to applications with large-scale data sets and can be used where the frequency of incoming tasks is very high, since it always selects the task with the minimum execution time first. Practical applications of this algorithm include grid computing and mobile networks.
2.2 Max Min Algorithm
2.2.1 Introduction
The max–min algorithm is a simple extension of the min–min algorithm: for the given scheduled tasks it prioritises the task with the maximum execution time first in the queue. It is used to lower the makespan for important tasks and to increase resource utilization. Figure 2 summarises the max–min process in terms of the completion time of each task [10].
Fig. 2 Flowchart of max min algorithm [11]
2.2.2 Advantages
• Throughput is very high when the workload and resources are shared equally.
• It is an advanced version of the min–min algorithm and is easily implemented via priority scheduling.
2.2.3 Disadvantages
• In problems with heavy branching, where many options must be explored (for example game playing such as chess), the algorithm becomes quite slow.
• Executing sub-tasks within sub-tasks recursively becomes a tedious task for this algorithm.
2.2.4 Analysis
The max–min algorithm on its own is at a disadvantage for a given process; it must be integrated as part of a larger executable algorithm. It is applied with rescheduling, in which continuous priority switching is performed in the given applications. This algorithm is mainly used in communication networks and data center management.
2.3 Genetic Algorithm
2.3.1 Introduction
A genetic algorithm optimises over the given sample space and searches it automatically using probabilistic methods. It is also used for global search optimization: by simulating the behaviour of the sample space, it evolves candidate solutions over successive iterations. As per Fig. 3, the generic approach starts from an initial population of candidate schedules and repeatedly produces a new population of processed task assignments [12, 13]. A small illustrative sketch follows.
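The sketch below shows one way a genetic algorithm can evolve task-to-resource assignments, reusing the execution-time matrix format from the min–min sketch. Population size, mutation rate, number of generations and the makespan-based fitness are illustrative assumptions, not a prescription from the reviewed papers.

```python
# Sketch of a genetic algorithm for task-to-resource assignment (illustrative).
import random

def makespan(assign, etc, n_res):
    load = [0.0] * n_res
    for task, res in enumerate(assign):
        load[res] += etc[task][res]
    return max(load)                       # fitness: lower makespan is better

def genetic_schedule(etc, pop_size=30, generations=100, mutation=0.1):
    n_tasks, n_res = len(etc), len(etc[0])
    pop = [[random.randrange(n_res) for _ in range(n_tasks)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda ind: makespan(ind, etc, n_res))
        survivors = pop[: pop_size // 2]                   # selection
        children = []
        while len(children) < pop_size - len(survivors):
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, n_tasks)             # one-point crossover
            child = a[:cut] + b[cut:]
            if random.random() < mutation:                 # random mutation
                child[random.randrange(n_tasks)] = random.randrange(n_res)
            children.append(child)
        pop = survivors + children
    best = min(pop, key=lambda ind: makespan(ind, etc, n_res))
    return best, makespan(best, etc, n_res)

etc = [[4, 6], [3, 8], [9, 2], [5, 5]]
print(genetic_schedule(etc))
```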
2.3.2 Advantages
• Supports multi-objective optimization.
• Can handle a large number of variables and constraints.
• Works well on mixed discrete/continuous problems.
Fig. 3 Flowchart of genetic algorithm [14]
2.3.3 Disadvantages
• They can be difficult to debug without prior knowledge.
• They can be computationally expensive.
• The algorithm can get stuck in a local optimum instead of finding the global optimum.
2.3.4 Analysis
The genetic algorithm is mainly applied in applications with predicted outputs, i.e. where probabilistic search is useful, and where the fastest response time for the given processes is needed; it also provides fair allocation of tasks. It is widely used for multi-objective problems and design optimization.
2.4 Round Robin Algorithm
2.4.1 Introduction
The round robin algorithm runs in a cyclic manner: the process that arrives first is scheduled first. If a process finishes within its time slice it is terminated; otherwise the next process in the queue takes its turn, so processes are executed first-scheduled-first, second-scheduled-second, and so on with respect to their remaining work. As per Fig. 4, the algorithm is based on a time quantum (STQ), which in total must be greater than the remaining time for the processes to complete [15, 16]. A small scheduling sketch in Python follows Fig. 4.
Fig. 4 Flowchart of round robin algorithm [17]
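The following is a minimal Python sketch of round-robin scheduling with a fixed time quantum; the process list and quantum value are illustrative assumptions.

```python
# Sketch of round-robin scheduling with a fixed time quantum (illustrative data).
from collections import deque

def round_robin(processes, quantum):
    queue = deque(processes)              # FIFO ready queue of (name, burst_time)
    clock, finish = 0.0, {}
    while queue:
        name, remaining = queue.popleft()
        run = min(quantum, remaining)     # run for one quantum or until the process ends
        clock += run
        remaining -= run
        if remaining > 0:
            queue.append((name, remaining))   # back to the end of the queue
        else:
            finish[name] = clock              # record completion time
    return finish

print(round_robin([("P1", 5), ("P2", 3), ("P3", 8)], quantum=2))
```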
2.4.2 Advantages
• Its performance depends heavily on the chosen time quantum.
• Priorities cannot be set for the processes.
• Provides a fair allocation of processing time to each task, improving load balancing.
2.4.3 Disadvantages
• Round robin scheduling does not give special priority to more important tasks.
• It can lead to inefficiencies if some tasks require more time than others, since rescheduling occurs after every interval.
2.4.4 Analysis
Round robin scheduling is well suited to distributed networks. For load balancing it should be integrated with processes in which information sharing is done at regular intervals of time, which gives fast response times and fair allocation of processes. It is used where response time plays a major role, such as web server hosting and real-time systems.
3 Parameter for Evaluating Task Scheduling Algorithm
Tables 1 and 2 give a comparative analysis of the task-based scheduling algorithms [2].
Table 1 Parameter descriptions for task-based scheduling algorithms
Parameter: Description
State of algorithm: Whether the algorithm is static or dynamic in nature
Throughput: The number of operations completed in unit time; higher throughput means higher efficiency
Scalability: Whether the algorithm can handle an increasing number of requests while maintaining the same performance
Response time: The time at which a task is first allocated to the CPU by the scheduler
Migration: The movement of a process from one server to another, mainly seen in virtual machine context switching
Fault tolerance: The ability of the algorithm to continue working after a fault or error occurs, i.e. whether the algorithm remains operational after any error
Table 2 Parameter-based overview of the task-based scheduling algorithms
Algorithm: State of algorithm | Throughput | Scalability | Response time | Migration | Fault tolerance
Min–min algorithm [18]: Static | High | Low | Moderate | Low | Very low
Max–min algorithm [18]: Static | High | Low | Moderate | Low | Low
Genetic algorithm [18]: Dynamic | High | Moderate | Moderate | High | High
Round robin algorithm [18]: Dynamic | Depends on quantum | High | Comparatively high | High | High
4 Research Gaps In this section, we highlight several research gaps based on our findings:
• Overall, task scheduling papers lack studies on security and data protection for cloud-based applications whose workflows involve real-time input and output of data flowing simultaneously, and on algorithms with high energy consumption; awareness and the formation of hybrid algorithms would help ensure minimal environmental impact.
• The min–min and max–min algorithms both have limited scalability in cloud-based environments, so data sets must be allocated from resources to tasks in a way that lets the algorithm scale.
• Multi-objective optimization: the genetic algorithm is one such algorithm, but most algorithms focus on a single objective; algorithms should therefore combine multiple objectives.
• The round robin algorithm allocates tasks in a cyclic manner but fails to produce good results with heterogeneous resources; along with fairness of allocation, deadlines should be assigned to processes or tasks for better performance.
• To overcome these gaps, an algorithm can be combined with other algorithms to increase performance; such combinations are also known as hybrid algorithms.
• The parameters that characterise an algorithm can be changed according to the state of the process, which will enhance the algorithm's efficiency.
5 Conclusion The task scheduling algorithms discussed here help reallocate processes according to their respective sets of rules. The min–min and max–min algorithms can be applied to large sets of instructions so that execution speed increases drastically [8]. Because the genetic algorithm can be implemented as a whole, independent execution of a set of instructions is possible [12]. Round robin, on the other hand, is a popular algorithm used in many operating systems [19]. It is essential for any software to design its architecture so that it abides by the parameters it algorithmically needs. Task scheduling is a major aspect of scheduling in which the scheduler allocates processes; indeed, most load balancing schemes in cloud computing are based on it. From the above analysis, we conclude that the given algorithms can be modified by combining them with other algorithms, but the practical advantage goes to the max–min algorithm in the static case and the genetic algorithm in the dynamic case, in accordance with the work proposed here. Task-based algorithms will therefore help obtain the best outcome according to their criteria, which speeds up the machine.
References 1. Gawali MB, Shinde SK (2018) Task scheduling and resource allocation in cloud computing using a heuristic approach. J Cloud Comput 7(1). https://doi.org/10.1186/s13677-018-0105-8 2. Shahid MA, Islam N, Alam MM, Su’Ud MM, Musa S (2020) A comprehensive study of load balancing approaches in the cloud computing environment and a novel fault tolerance approach. IEEE Access 8:130500–130526. https://doi.org/10.1109/ACCESS.2020.3009184 3. Ray S (2012) Execution analysis of load balancing algorithms in cloud computing environment. Int J Cloud Comput Serv Archit 2(5):1–13. https://doi.org/10.5121/ijccsa.2012.2501 4. Jurgelevicius A, Sakalauskas L, Marcinkevicius V (2021) Application of a task stalling buffer in distributed hybrid cloud computing. Elektron Elektrotech 27(6):57–65. https://doi.org/10. 5755/j02.eie.28679 5. Nair B, Bhanu SMS (2022) Task scheduling in fog node within the tactical cloud. Def Sci J 72(1):49–55. https://doi.org/10.14429/DSJ.72.17039 6. Kaur A, Maini R (2018) Different task scheduling algorithms in cloud computing. Int J Latest Trends Eng Technol 9(3). https://doi.org/10.21172/1.93.37 7. Patel G, Mehta R, Bhoi U (2015) Enhanced load balanced min-min algorithm for static meta task scheduling in cloud computing. Procedia Comput Sci 57:545–553. https://doi.org/10.1016/ j.procs.2015.07.385 8. Haladu M, Samual J (2016) Optimizing task scheduling and resource allocation in cloud data center, using enhanced min-min algorithm. IOSR J Comput Eng 18(04):18–25. https://doi.org/ 10.9790/0661-1804061825 9. Chen H, Wang F, Helian N, Akanmu G (2013) User-priority guided min-min scheduling algorithm for load balancing in cloud computing. In: 2013 National conference on parallel computing technologies, PARCOMPTECH 2013. https://doi.org/10.1109/ParCompTech.2013. 6621389 10. Elzeki OM, Reshad MZ, Elsoud MA (2012) Improved max-min algorithm in cloud computing general terms
11. Al-Haboobi ASA (2017) Improving max-min scheduling algorithm for reducing the makespan of workflow execution in the cloud. Int J Comput Appl 177(3):5–9. https://doi.org/10.5120/ijc a2017915684 12. Sawant S (2011) A genetic algorithm scheduling approach for virtual machine resources in a cloud computing environment. San Jose State University, San Jose, CA, USA. https://doi.org/ 10.31979/etd.et9e-73fz 13. Hamad SA, Omara FA (2016) Genetic-based task scheduling algorithm in cloud computing environment [Online]. Available: www.ijacsa.thesai.org 14. Ibrahim S, Mashinchi MR, Dastanpour A, Mashinchi R. Used genetic algorithm for support artificial neural network in intrusion detection system using genetic algorithm to supporting artificial neural network for intrusion detection system [Online]. Available: https://www.resear chgate.net/publication/261028418 15. Abdulkarim A, Boukari S, Muhammed I, Abubakar FA (2018) An improved round robin load balancing algorithm in cloud computing using average burst time. Int J Sci Eng Res 9(3) [Online]. Available: http://www.ijser.org 16. Anusha SK, Bindu Madhuri NR, Nagamani NP. Load balancing in cloud computing using round robin algorithm [Online]. Available: www.ijert.org 17. Mody S, Mirkar S (2019) Smart round robin CPU scheduling algorithm for operating systems. In: 4th International conference on electrical, electronics, communication, computer technologies and optimization techniques, ICEECCOT 2019, Dec 2019, pp 309–316. https://doi.org/ 10.1109/ICEECCOT46775.2019.9114602 18. Patil A, Thankachan B (2020) Review on a comparative study of various task scheduling algorithm in cloud computing environment 19. Abed MM, Younis MF (2019) Developing load balancing for IoT—cloud computing based on advanced firefly and weighted round robin algorithms. Baghdad Sci J 16(1):130–139. https:// doi.org/10.21123/bsj.2019.16.1.0130
Performance Analysis of SOC Estimation Approaches for Lithium-Ion Batteries R. Shanmugasundaram, C. Ganesh, B. Adhavan, M. Mohamed Iqbal, B. Gunapriya, and P. Tamilselvi
Abstract The state of charge (SOC) is an important parameter in electric vehicles; it must be estimated accurately from the battery model parameters and is used to determine the amount of charge left in the battery for future use. The model parameters of the battery vary with changes in temperature, charging/discharging rates, aging and environmental conditions. Existing SOC estimation approaches such as coulomb counting, electrochemical impedance spectroscopy (EIS) and observer-based techniques are suitable only for offline applications, so SOC estimation by these methods is inaccurate because the real-time variation of the battery model parameters is not considered. In this paper, an adaptive SOC estimation approach employing real-time battery parameter identification and an online updating observer is proposed. The accuracy and performance of the proposed approach are validated through simulation and compared with the existing approaches. Keywords SOC co-estimation · Battery model parameters · Li-ion battery · Updating observer · Linear Voc-SOC curve
1 Introduction The demand for safe and highly reliable electric vehicles is increasing day by day due to rising fuel prices and global environmental concerns about reducing greenhouse gas emissions. This has led to the development of high power- and energy-density Li-ion batteries. In order to improve the driving range of EVs, the reliability and cost efficiency of the battery must be improved by a proper energy and battery management system [1]. It is therefore necessary to develop algorithms for accurate SOC estimation and effective decision making to achieve better battery and energy management in electric vehicles (EVs). The limitations of the existing SOC estimation approaches [2–19] reported in the literature are: (i) the coulomb counting approach suffers from an unknown initial SOC and error accumulation; (ii) the OCV measurement approach requires a long battery rest time to obtain the OCV; (iii) the electro-chemical approach is a steady-state technique based on small-signal analysis. Some of the popular observer techniques, such as the Kalman filter [10, 17] and the sliding-mode observer [5], are used for offline identification of the battery model parameters [20–27] and take the battery's changing dynamics into account. However, the fixed-parameter battery model [1, 20–27] obtained by these techniques differs from the battery model obtained at varying SOCs and environmental conditions [1]. It is essential to develop an accurate battery model that represents the battery characteristics for accurate SOC estimation. Therefore, an online adaptive model [2–10] is required to identify, and update online, the parameters that change with temperature, SOC, charging/discharging rates, and aging. This paper proposes an online adaptive technique for accurate model parameter identification, representing the static and dynamic battery behavior, together with an adaptive algorithm to accurately estimate SOC. An online adaptive parameter estimation technique [7, 28, 29] using moving-window least squares and an SOC co-estimation approach is suggested to identify the battery parameters using piecewise linearization of the VOC-SOC function, and an observer is designed based on these updated parameters for accurate SOC estimation. The proposed online adaptive approach not only improves precise SOC estimation for the Li-ion batteries of electric vehicles but also avoids the need to compensate for uncertainties due to temperature and aging. The proposed SOC co-estimation method can be employed to control the speed and torque of battery-powered BLDC-motor-propelled electric vehicles [30–33].
2 Battery Model and SOC Co-estimation Algorithm The RC equivalent circuit [7] shown in Fig. 1 is the battery model used for SOC estimation. The capacitor (Q) represents the charging/discharging cycles and R0 represents the small resistance of the electrolyte. Changes in SOC, aging and environmental conditions can considerably affect the resistance (R0). The parallel RC
circuit represents the relaxation effect of the battery, i.e. the slow convergence of the battery terminal voltage following charging/discharging cycles. The number of parallel RC circuits is decided as a compromise between the accuracy and the complexity of the battery model; generally, the battery model of an EV requires one RC group to represent the relaxation effect. The Voc-SOC curve is non-linear and can affect the performance of estimators. To avoid this problem, the Voc-SOC curve is split into several linear sections, and each linear section, as shown in Fig. 2, is represented by the equation Voc = f(SOC) = b0 + b1 · SOC
(1)
The best lines to fit the linear regions are found by a curve-fitting technique such as least-squares error (LSE) [1, 7]. The simple RC equivalent circuit consisting of a single RC group is used to model the battery behavior, which simplifies the identification and extraction of the model parameters. The battery model is represented by the following state-space equations:
Fig. 1 RC equivalent circuit—battery model
Fig. 2 VOC-SOC curve (Voc in V versus SOC) divided into 8 linear segments
$$\dot{x} = Ax + Bu \tag{2}$$
$$y = Cx + Du + b_0 \tag{3}$$
where
$$x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} SOC \\ V_{RC} \end{bmatrix},\quad A = \begin{bmatrix} 0 & 0 \\ 0 & -\dfrac{1}{RC} \end{bmatrix},\quad B = \begin{bmatrix} \dfrac{1}{Q_R} \\ \dfrac{1}{C} \end{bmatrix},\quad C = \begin{bmatrix} b_1 & 1 \end{bmatrix},\quad u = I_L,\quad D = R_0$$
Substituting these matrices, Eqs. (2) and (3) can be written as
$$\begin{bmatrix} \dot{SOC} \\ \dot{V}_{RC} \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & -\dfrac{1}{RC} \end{bmatrix}\begin{bmatrix} SOC \\ V_{RC} \end{bmatrix} + \begin{bmatrix} \dfrac{1}{Q_R} \\ \dfrac{1}{C} \end{bmatrix} I_L \tag{4}$$
$$V_T = \begin{bmatrix} b_1 & 1 \end{bmatrix}\begin{bmatrix} SOC \\ V_{RC} \end{bmatrix} + R_0 I_L + b_0 \tag{5}$$
In the above Eqs. (4) and (5), IL and VT are the terminal current and voltage, and SOC and VRC are chosen as state variables. For a battery with nominal capacity QR, SOC estimation requires the estimation of the parameters (b0, b1, R0, R, C, VRC, and SOC) by state estimation and parameter identification techniques. The moving-window least-squares (LS) technique [1, 7] is employed to estimate the battery model parameters; the window length is chosen as the number of past steps that reveal the dynamics of the battery. The parameters of the battery model are extracted from the coefficients of its discrete transfer function [7]. Figure 3 shows the block diagram of the battery parameter/SOC co-estimation algorithm.
Assuming that the estimated parameters of the battery are b̂0, b̂1, R̂0, R̂ and Ĉ, the optimal observer is designed using the following equations and the linear quadratic technique.
Fig. 3 Block diagram of SOC/battery parameters co-estimation algorithm
$$\dot{\hat{x}} = A\hat{x} + Bu + L\,(y - \hat{y}) \tag{6}$$
$$\hat{y} = C\hat{x} + Du + b_0 \tag{7}$$
where the observer gain vector is $L^T = \begin{bmatrix} L_1 & L_2 \end{bmatrix}$, calculated from the updated parameters as $L^T = R^{-1} C P$, where the matrix $P$ is determined from the following LQ Riccati equation [7]:
$$A P + P^{T} A - P C^{T} R^{-1} C P = -Q \tag{8}$$
where R is a positive definite matrix and Q is an arbitrary positive semi-definite matrix.
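To illustrate the structure of the model and observer above, the following is a minimal Python sketch that simulates the single-RC battery model of Eqs. (4)-(5) and a simple state observer of Eqs. (6)-(7). The parameter values, the charging current and the constant observer gain L are illustrative assumptions only; the paper's approach computes L online from the LQ Riccati equation using continuously re-identified parameters.

```python
# Sketch: single-RC battery model (Eqs. 4-5) with a fixed-gain observer (Eq. 6).
import numpy as np

# Assumed parameters for one linear Voc-SOC segment: Voc = b0 + b1*SOC
Q_R = 2.3 * 3600          # cell capacity [A s]
R0, R, C = 0.01, 0.015, 2500.0
b0, b1 = 3.4, 0.7
dt, steps = 1.0, 3600     # 1 s steps, 1 h of charging

def f(x, i_l):            # state derivatives of Eq. (4); x = [SOC, V_RC]
    return np.array([i_l / Q_R, -x[1] / (R * C) + i_l / C])

def output(x, i_l):       # terminal voltage of Eq. (5)
    return b0 + b1 * x[0] + x[1] + R0 * i_l

x_true = np.array([0.20, 0.0])   # "real" battery, SOC = 20 %
x_hat = np.array([0.60, 0.0])    # observer starts from a wrong initial SOC
L = np.array([0.05, 0.01])       # assumed constant gain (paper: LQ-designed, updated online)

for _ in range(steps):
    i_l = 1.0                                    # constant 1 A charging current
    y, y_hat = output(x_true, i_l), output(x_hat, i_l)
    x_true = x_true + dt * f(x_true, i_l)        # plant, Euler step
    x_hat = x_hat + dt * (f(x_hat, i_l) + L * (y - y_hat))   # observer correction, Eq. (6)

print("true SOC = %.3f, estimated SOC = %.3f" % (x_true[0], x_hat[0]))
```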
3 Simulation Results The Matlab-Simulink diagram developed to implement the SOC co-estimation algorithm is shown in Fig. 4. The Voc-SOC relationship data is obtained from Fig. 2 and is presented in the form of a look-up table in the Simulink diagram. The parameters of the battery, R0, R and C, are changed according to the actual battery behavior, and the LS parameter identification algorithm is applied to estimate these parameters online. The results shown in Fig. 5 indicate that the proposed algorithm is able to estimate these parameters accurately; there is a slight estimation error while changing from one linear Voc-SOC segment to another. The estimated parameters and the SOC and VRC values of the previous step are applied to the discrete LQ observer for SOC estimation, and the battery model parameters are updated continuously with the new parameters. The
Fig. 4 Matlab-Simulink diagram for SOC estimation
Fig. 5 Identification of battery parameters by proposed approach
performance of the algorithms is evaluated in terms of the computation time and the following error norms:
$$\|e\|_2 = \sqrt{\frac{1}{K}\sum_{k=1}^{K} e^2(k)} \tag{9}$$
$$\|e\|_\infty = \max_k |e(k)| \tag{10}$$
The computation time is the time taken by the estimation algorithm to bring the estimated SOC within 5% of reference SOC.
4 Performance Comparison of SOC Estimation Algorithms The performance of the popular algorithms such as “sliding mode observer” [5] and “extended Kalman filter” (EKF) [10, 17] are compared with SOC co-estimation algorithm with updating observer. In EKF, the noise effect is reduced by designing gain of
Fig. 6 Input current (A) and voltage (V) profiles versus time (s)
the observer based on the linearized model. In the sliding mode observer, uncertainties are compensated and optimal performance is achieved by adding a sliding-mode gain. Both methods use a fixed battery model obtained by offline identification of the battery parameters; as a result, these SOC estimation algorithms cannot provide accurate SOC at different charging/discharging rates and SOCs, owing to the change in battery parameters. The input voltage and current profiles with different charging/discharging rates shown in Fig. 6 are applied to the algorithms to evaluate their performance. Figure 7 shows a bar chart of the errors and convergence times of the different algorithms, and Fig. 8 shows their estimated SOC values. The results indicate that the SOC values estimated by the extended Kalman filter are closer to those of the proposed SOC estimation approach, while the sliding-mode observer fluctuates around the EKF and produces a larger error. Moreover, the convergence time of SOC co-estimation is less than that of the existing approaches, even though some time is required for updating the parameters. Hence, the results indicate that SOC co-estimation performs better than the other, non-updating algorithms. The SOC values estimated by SOC co-estimation closely follow the SOC reference values, confirming the need for online parameter identification and an updating observer for accurate SOC estimation.
Fig. 7 Errors (a, b) and convergence time (c) of different algorithms
Fig. 8 Estimated SOC values of different algorithms (SOC versus time (s); curves: SOC co-estimation, EKF, reference, sliding mode)
5 Conclusion SOC estimation by the popular EKF and sliding mode algorithms, which use fixed battery model parameters and a non-updating observer, is found to be inaccurate because the parameters change at different SOCs, charging/discharging rates, temperatures and battery ages. The RC-equivalent-circuit battery model is used to represent the static and dynamic battery behavior. In the proposed approach, the piecewise-linear Voc-SOC curve is used to identify the battery parameters continuously, the model parameters are updated, and these parameters are applied to the updating observer for accurate SOC estimation. The proposed SOC co-estimation procedure is validated through simulation results and compared with the existing SOC estimation approaches. The results show that the identified parameters follow the changes in the reference parameters and that the SOC is accurately estimated by the updating observer. Moreover, the performance of the SOC co-estimation algorithm, evaluated in terms of error norms and convergence time, is found to be better than that of the EKF and sliding mode algorithms. However, the performance of the SOC co-estimation algorithm still has to be validated experimentally.
References 1. Shanmugasundaram R, Ganesh C, Tamilselvi P, Ravichandran CS, Mayurappriyan PS (2023) Battery management system in EV applications: review, challenges and opportunities. In: Fong S, Dey N, Joshi A (eds) ICT Analysis and applications, vol 517. Lecture notes in networks and systems. Springer, Singapore, pp 511–519 2. Zhang W, Wang L, Wang L, Liao C, Zhang Y (2022) Joint state-of-charge and state-of-availablepower estimation based on the online parameter identification of lithium-ion battery model. IEEE Trans Ind Electron 69(4):3677–3688 3. Yang J, Xia B, Shang Y, Huang W, Mi CC (2017) Adaptive state-of-charge estimation based on a split battery model for electric vehicle applications. IEEE Trans Veh Technol 66(12):10889– 10898 4. Chaoui H, Ibe-Ekeocha CC (2017) State of charge and state of health estimation for lithium batteries using recurrent neural networks. IEEE Trans Veh Technol 66(1):8773–8783 5. Chen X, Shen W, Dai M, Cao Z, Jin J, Kapoor A (2016) Robust adaptive sliding-mode observer using RBF neural network for lithium-ion battery state of charge estimation in electric vehicles. IEEE Trans Veh Technol 65(4):1936–1947 6. Zhang C, Wang LY, Li X, Chen W, Yin GG, Jiang J (2015) Robust and adaptive estimation of state of charge for lithium-ion batteries. IEEE Trans Ind Electron 62(8):4948–4957 7. Rahimi-Eichi H, Baronti F, Chow M-Y (2014) Online adaptive parameter identification and state-of-charge coestimation for lithium-polymer battery cells. IEEE Trans Ind Electron 61(4):2053–2061 8. Xiong R, He H, Sun F, Zhao K (2013) Evaluation on state of charge estimation of batteries with adaptive extended Kalman filter by experiment approach. IEEE Trans Veh Technol 62(1):108– 117 9. He H, Xiong R, Zhang X, Sun F, Fan J (2011) State-of-charge estimation of the lithium-ion battery using an adaptive extended Kalman filter based on an improved Thevenin model. IEEE Trans Veh Technol 60(4):1461–1469 10. Charkhgard M, Farrokhi M (2010) State-of-charge estimation for lithium-ion batteries using neural networks and EKF. IEEE Trans Ind Electron 57(12):4178–4187
11. Stroe D, Schaltz E (2020) Lithium-ion battery state-of-health estimation using the incremental capacity analysis technique. IEEE Trans Ind Appl 56(1):678–685 12. Tan X, Tan Y, Zhan D, Yu Z, Fan Y, Qiu J, Li J (2020) Real-time state-of-health estimation of lithium-ion batteries based on the equivalent internal resistance. IEEE Access 8:56811–56822 13. Poloei F, Akbari A, Liu Y-F (2019) A moving window least mean square approach to state of charge estimation for lithium ion batteries. In: Proceedings of 2019 1st global power, energy and communication conference (GPECOM), Nevsehir, Turkey, pp 398–402 14. Song Y, Liu W, Li H, Zhou Y, Huang Z, Jiang F (2017) Robust and accurate state-of-charge estimation for lithium-ion batteries using generalized extended state observer. In: Proceedings of the 2017 IEEE international conference on systems, man, and cybernetics (SMC), Banff, AB, Canada, pp 2146–2151 15. Gholizadeh M, Salmasi F (2014) Estimation of state of charge, unknown nonlinearities, and state of health of a lithium-ion battery based on a comprehensive unobservable model. IEEE Trans Ind Electron 61(3):1335–1344 16. Chen Z, Fu Y, Mi CC (2012) State of charge estimation of lithium-ion batteries in electric drive vehicles using extended Kalman filtering. IEEE Trans Veh Technol 62(3):1020–1030 17. Kim J, Cho BH (2011) State-of-charge estimation and state-of-health prediction of a Li-ion degraded battery based on an EKF combined with a per-unit system. IEEE Trans Veh Technol 60(9):4249–4260 18. Di Domenico D, Fiengo G, Stefanopoulou A (2008) Lithium-ion battery state of charge estimation with a Kalman filter based on an electrochemical model. In: Proceedings of IEEE international conference on control applications 2008, San Antonio, TX, USA, pp 702–707 19. Chiasson J, Vairamohan B (2005) Estimating the state of charge of a battery. IEEE Trans Control Syst Technol 13(3):465–470 20. Sun B, He X, Zhang W, Ruan H, Su X, Jiang J (2021) Study of parameters identification method of Li-ion battery model for EV power profile based on transient characteristics data. IEEE Trans Intell Transp Syst 22(1):661–672 21. Khattak AA, Khan AN, Safdar M, Basit A, Zaffar NA (2020) A hybrid electric circuit battery model capturing dynamic battery characteristics. In: Proceedings of 2020 IEEE Kansas power and energy conference (KPEC), Manhattan, KS, USA, pp 1–6 22. Chan HL (2000) A new battery model for use with battery energy storage systems and electric vehicles power systems. In: Conference proceedings of 2000 IEEE power engineering society winter meeting (Cat. No.00CH37077), vol 1, Singapore, pp 470–475 23. Xia B, Huang R, Lao Z, Zhang R, Lai Y, Zheng W (2018) Online parameter identification of lithium-ion batteries using a novel multiple forgetting factor recursive least square algorithm. Energies 11(11):1–19 24. Hariharan KS, Tagade P, Ramachandran S (2018) Mathematical modeling of lithium batteries: from electrochemical models to state estimator algorithms. Springer Nature, Switzerland 25. Zhang Ch, Li K, Mcloone S, Yang Zh (2014) Battery modelling methods for electric vehicles—a review. In: European control conference (ECC), Strasbourg, France, pp 2673–2678 26. He H, Xiong R, Guo H, Li S (2012) Comparison study on the battery models used for the energy management of batteries in electric vehicles. Energy Convers Manag 64:113–121 27. Plett GL (2004) High-performance battery-pack power estimation using a dynamic cell model. IEEE Trans Veh Technol 53(5):1586–1593 28. 
Shen P, Ouyang M, Lu L, Li J, Feng X (2018) The co-estimation of state of charge, state of health, and state of function for lithium-ion batteries in electric vehicles. IEEE Trans Veh Technol 67(1):92–103 29. Hu X, Yuan H, Zou C, Li Z, Zhang L (2018) Co-estimation of state of charge and state of health for lithium-ion batteries based on fractional-order calculus. IEEE Trans Veh Technol 67(11):10319–10329 30. Shanmugasundaram R, Ganesh C, Singaravelan A, Gunapriya B, Adhavan B (2022) Highperformance ANFIS-based controller for BLDC motor drive. In: Conference 2021, Smart innovation, systems and technologies, vol 243. Springer, Singapore, pp 435–449
31. Shanmugasundaram R, Ganesh C, Adhavan B, Singaravelan A, Gunapriya B (2022) Sensorless speed control of BLDC motor for EV applications. In: Sustainable communication networks and application. LNDECT, vol 93. Springer, Singapore, pp 359–370 32. Shanmugasundaram R, Ganesh C, Singaravelan A (2020) ANN-based controllers for improved performance of BLDC motor drives. LNEE 665:73–87 33. Shanmugasundram R, Zakariah KM, Yadaiah N (2014) Implementation and performance analysis of digital controllers for brushless DC motor drives. IEEE/ASME Trans Mechatron 19(1):213–224
An Analysis of Resource-Oriented Algorithms for Cloud Computing Abhinav Sharma, Priyank Vaidya, Murli Patel, and Nishant Doshi
Abstract This review paper provides the reader with an overview of some of the many resource scheduling algorithms. The paper also describes the characteristics of these algorithms and highlights their strengths and weaknesses. The main focus is on comparing and evaluating different resource scheduling algorithms so that one can incorporate them as required. The paper also discusses potential directions for future study in the field of resource scheduling in virtual environments. A conclusion is drawn upon careful analysis, comparison, and assessment of various algorithms and their applicability for use in practical applications. Keywords Resource scheduling · Load balancing · Cloud computing · Ant colony optimization · Honey bee-based load balancing
1 Introduction
Cloud computing mainly refers to the outsourcing of an individual's or organization's computing needs to cloud service providers, who have the means to provide high-quality computing services and facilities to consumers. Cloud computing helps reduce costly expenditures on IT infrastructure [1]. By effectively and correctly administering its services, its capabilities may be fully used. A key aspect of cloud computing as a whole is the virtual nature of the services [2]. In recent years, a huge demand for cloud-based services has been observed, which calls for efficient load-balancing algorithms. Load balancing mainly refers to the appropriate management and usage of many sorts of loads, such as network load along with memory and storage capacity constraints, in the cloud [3].
A. Sharma (B) · P. Vaidya · M. Patel · N. Doshi Department of Computer Science and Engineering, Pandit Deendayal Energy University, Raisan Village, Gandhinagar, Gujarat 382007, India e-mail: [email protected] N. Doshi e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_46
Load balancing is a technique for distributing the load among the several processors of a distributed system to speed up task response times and resource usage while avoiding situations in which some processors are working beyond capacity, idle, or underloaded at any given moment in the system [4]. When creating load-balancing algorithms, it is important to take into account factors such as load estimation and comparison, system stability and performance, node selection, and many others [5]. To deliver high-quality service from numerous servers, load balancing is used [6]. Load balancing can be categorized on the basis of:
• Assignment of processes to the nodes and distribution of workload
• The nodes' informational status [7].
Resource scheduling is one of the cornerstones of load balancing, along with task and VM scheduling. It is the efficient utilization of the various resources or loads that one usually deals with in cloud computing, and it is used to ensure the proper distribution of resources as required. Given the increased adoption of cloud computing as a medium for storing and processing data, an increased demand for resources has been observed. A few major goals which we try to achieve with load balancing are:
• Improving cost-effectiveness
• Enhancing scalability while maintaining flexibility
• Priority [8].
Therefore, resource scheduling or load balancing is necessary because:
• It helps improve user satisfaction
• It helps improve resource utilization
• It effectively cuts down on the time it takes to complete a task and the time a task has to wait
• It aids in improving efficiency
• It helps improve systems' ability to handle faults
• It makes systems more stable [3].
Several algorithms to achieve low-cost resource utilization and minimize the completion time have been devised. Resource scheduling is thus achieved with the help of several algorithms, some of which are as follows:
• BAT Algorithm
• Cuckoo Search Algorithm
• Ant Colony Optimization Algorithm
• Honey Bee Based Load Balancing Algorithm.
Beyond resource scheduling alone, some of these algorithms have multiple uses in load balancing, since they are also applied to task and VM scheduling.
2 BAT Algorithm
2.1 Introduction
It is a meta-heuristic swarm intelligence algorithm inspired by how bats behave while they search for food. We can get a brief overview of the overall functioning of the BAT Algorithm from the flowchart depicted in Fig. 1. The algorithm is simple to apply and has a reputation for producing reliable results [9]. It can be implemented in several software packages such as MATLAB, and it can be run through several iterations to produce the best results [10].
Fig. 1 Flowchart on the functioning of BAT algorithm [11]
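As a concrete companion to the loop in Fig. 1, the following is a minimal sketch of the canonical bat-algorithm update rules (frequency tuning, velocity and position updates, loudness and pulse-rate adaptation) applied to a generic cost function. The cost function, bounds, and parameter values are illustrative assumptions rather than settings taken from the cited works; in a cloud setting the cost would encode the scheduling objective, for example makespan or resource imbalance.

```python
import numpy as np

def bat_optimize(cost, dim=4, n_bats=20, iters=100,
                 fmin=0.0, fmax=2.0, alpha=0.9, gamma=0.9, seed=0):
    """Canonical bat-algorithm loop minimizing a generic cost function."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, (n_bats, dim))   # bat positions (candidate solutions)
    v = np.zeros((n_bats, dim))                 # velocities
    loud = np.ones(n_bats)                      # loudness A_i
    pulse0 = rng.uniform(0.1, 0.9, n_bats)      # initial pulse rates r_i^0
    pulse = pulse0.copy()
    fit = np.array([cost(xi) for xi in x])
    best_i = int(fit.argmin())
    best, best_f = x[best_i].copy(), fit[best_i]

    for t in range(1, iters + 1):
        for i in range(n_bats):
            freq = fmin + (fmax - fmin) * rng.random()       # frequency tuning
            v[i] += (x[i] - best) * freq                     # pull toward the global best
            cand = x[i] + v[i]
            if rng.random() > pulse[i]:                      # local random walk around best
                cand = best + 0.01 * loud.mean() * rng.normal(size=dim)
            f_cand = cost(cand)
            if f_cand <= fit[i] and rng.random() < loud[i]:
                x[i], fit[i] = cand, f_cand                  # accept the new solution
                loud[i] *= alpha                             # reduce loudness
                pulse[i] = pulse0[i] * (1 - np.exp(-gamma * t))  # raise pulse rate
            if f_cand <= best_f:
                best, best_f = cand.copy(), f_cand
    return best, best_f

# Toy usage: a sphere function stands in for a resource-scheduling cost model.
print(bat_optimize(lambda z: float(np.sum(z ** 2))))
```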
2.2 Advantages
• Faster convergence rate compared to other algorithms
• Because the algorithm maintains four indicators that are updated after each stage of the optimization process, it helps find the best possible solution for the given problem at hand.
2.3 Disadvantages
• Maintains its reputation for reliability only on smaller tasks
• The issue of lower accuracy and lower speed cannot always be avoided
• Because it tends to enter a local optimum, the algorithm can converge too quickly when looking for a feasible solution to a specific problem.
2.4 Analysis
The algorithm utilizes the concept of swarm intelligence evident in the behavior of various creatures, such as bats in the wild. Its characteristics are the reason it finds use in cloud computing for various purposes, ranging from improving QoS to energy management. With the help of this algorithm, the cloud's VM resource assignment system can be improved.
3 Cuckoo Search Algorithm
3.1 Introduction
It is an innovative, nature-inspired, meta-heuristic, multi-objective algorithm [12]. Based on the behavior of cuckoo birds in nature, the algorithm helps find the optimal solution for a given situation and thus helps handle resources efficiently. The flowchart given in Fig. 2 gives a clear idea of the overall functioning of the algorithm.
3.2 Advantages
• Smaller number of iterations
• Lower convergence rate
• Lower dependency on input data.
Fig. 2 Flowchart of cuckoo search algorithm [12]
3.3 Disadvantages
• The host bird determines whether a new solution is accepted
• Can easily fall into a local optimum.
3.4 Analysis
It is one of the best search algorithms inspired by the behavior of cuckoo birds. It is used to resolve job scheduling, networking, and other optimization-related problems. Given that it can find optimal solutions to various problems with ease, it has a lot of potential as a favored search algorithm in load balancing as a whole.
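A minimal sketch of cuckoo search with Lévy flights (in the spirit of [22]) is given below. The dimensionality, bounds, and abandonment fraction are illustrative assumptions; in a load-balancing setting the cost function would score a candidate resource assignment.

```python
import numpy as np
from math import gamma as gamma_fn

def levy_step(dim, rng, beta=1.5):
    """Lévy flight step drawn with Mantegna's algorithm."""
    sigma = (gamma_fn(1 + beta) * np.sin(np.pi * beta / 2) /
             (gamma_fn((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0, sigma, dim)
    v = rng.normal(0, 1, dim)
    return u / np.abs(v) ** (1 / beta)

def cuckoo_search(cost, dim=4, n_nests=15, iters=200, pa=0.25, seed=0):
    rng = np.random.default_rng(seed)
    nests = rng.uniform(-1.0, 1.0, (n_nests, dim))
    fit = np.array([cost(n) for n in nests])

    for _ in range(iters):
        best = nests[fit.argmin()].copy()
        for i in range(n_nests):
            # New solution generated by a Lévy flight around the current nest.
            cand = nests[i] + 0.01 * levy_step(dim, rng) * (nests[i] - best)
            j = rng.integers(n_nests)            # a random nest to compare against
            f_cand = cost(cand)
            if f_cand < fit[j]:
                nests[j], fit[j] = cand, f_cand
        # A fraction pa of the worst nests is abandoned and rebuilt (host discovery).
        k = max(1, int(pa * n_nests))
        worst = fit.argsort()[-k:]
        nests[worst] = rng.uniform(-1.0, 1.0, (k, dim))
        fit[worst] = [cost(n) for n in nests[worst]]
    best_i = fit.argmin()
    return nests[best_i], fit[best_i]

# Toy usage: the cost could encode makespan or resource imbalance.
print(cuckoo_search(lambda z: float(np.sum(z ** 2))))
```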
4 Ant Colony Optimization Algorithm
4.1 Introduction
The algorithm was developed by drawing inspiration from the foraging habits of ants. It is challenging to achieve load balancing in the cloud due to the diverse resource requests of users and the diversity of physical resources, which results in very low resource utilization [13]. Here, the best route is typically found using shared collaboration and feedback while simulating ant foraging, as can be understood with the help of Fig. 3, which explains the overall structure and flow of the algorithm; it finds uses in task scheduling as well. There are many approaches to load balancing using ant-based algorithms. In some approaches, pheromones are a symbol of a resource's processing power: resources having higher pheromone levels stand a better chance of receiving task assignments [14]. Pheromone levels are inversely proportional to the length of the path, and the ants select the best route as per the pheromone value. In comparison to a path with a low pheromone value, a path with a high pheromone value is shorter [15].
Fig. 3 Flowchart of ant colony optimization algorithm [16]
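The following is a minimal, generic sketch of this pheromone-proportional assignment idea applied to task-to-VM scheduling; it is not the specific scheme of [13-16]. The execution-time matrix, parameter values, and the use of makespan as the "path length" are illustrative assumptions.

```python
import numpy as np

def assign_tasks_aco(exec_time, n_ants=20, iters=50, alpha=1.0, beta=2.0,
                     rho=0.1, q=1.0, seed=0):
    """Ant-colony-style task-to-VM assignment over an exec_time[task, vm] matrix."""
    rng = np.random.default_rng(seed)
    n_tasks, n_vms = exec_time.shape
    tau = np.ones((n_tasks, n_vms))          # pheromone trails per (task, VM) pair
    eta = 1.0 / exec_time                    # heuristic desirability: prefer faster VMs
    best_assign, best_makespan = None, np.inf

    for _ in range(iters):
        tau *= (1.0 - rho)                   # pheromone evaporation once per iteration
        for _ in range(n_ants):
            load = np.zeros(n_vms)
            assign = np.empty(n_tasks, dtype=int)
            for t in range(n_tasks):
                p = (tau[t] ** alpha) * (eta[t] ** beta)
                vm = rng.choice(n_vms, p=p / p.sum())   # roulette-wheel selection
                assign[t] = vm
                load[vm] += exec_time[t, vm]
            makespan = load.max()
            # Deposit pheromone proportional to 1/makespan (shorter "path" gets more pheromone).
            tau[np.arange(n_tasks), assign] += q / makespan
            if makespan < best_makespan:
                best_assign, best_makespan = assign.copy(), makespan
    return best_assign, best_makespan

# Toy usage: 6 tasks on 3 VMs with hypothetical execution times (seconds).
times = np.array([[4, 2, 6], [3, 5, 2], [7, 4, 3],
                  [2, 6, 5], [5, 3, 4], [6, 2, 3]], dtype=float)
print(assign_tasks_aco(times))
```

Schedules with a shorter makespan deposit more pheromone, so the corresponding task-VM pairs become more attractive in later iterations, mirroring the inverse relation between pheromone level and path length described above.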
4.2 Advantages • Easy to implement • Can be merged with other algorithms with ease.
4.3 Disadvantages • The Ant Colony Optimization Algorithm is relatively more complex than other resource scheduling algorithms • It always takes a little bit more time to assign tasks.
4.4 Analysis The algorithm is quite frequently used for resource scheduling. It is applicable for the resolution of challenging optimization-related issues. We can try to combine the algorithm with other algorithms that take less time to assign tasks to get around the algorithm’s drawbacks.
5 Honey Bee-Based Load Balancing Algorithm
5.1 Introduction
It is a dynamic algorithm that draws inspiration from foraging bees in the wild. Each artificial bee in the algorithm symbolizes a potential agent and problem-solving alternative, and they cooperate to share knowledge and resolve challenging optimization issues [17]. Apart from resource scheduling, it finds use in both task and VM scheduling and works preemptively, as can be inferred from the flowchart given in Fig. 4. Though the diagram is drawn from a VM load-balancing perspective, it can be implemented for resource scheduling purposes as well. The algorithm takes multi-objective optimization into account when choosing the best VM for load balancing [18]. This method improves makespan and response time while successfully balancing independent non-preemptive tasks. Despite the high throughput of the algorithm, low-priority tasks frequently occur [19].
Fig. 4 Flowchart of the honey bee-based load balancing algorithm [20]
5.2 Advantages • It avoids underutilization and overutilization of VMs • In contrast to some other load-balancing algorithms, it has faster average response times.
5.3 Disadvantages
• The load migration process is not efficient
• There is a chance of losing essential data
• Increased frequency of objective function assessments
• Sluggish in sequential processing.
5.4 Analysis
The algorithm finds many uses in the domain of cloud computing, especially for resource scheduling, where it is used to assign resources and tasks to virtual machines. It has so far been applied to independent tasks; in the future, it can be expanded to include dependent tasks.
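A minimal sketch of the honey-bee balancing idea described above is shown below: tasks removed from overloaded VMs act like forager bees and are placed on underloaded VMs, with low-priority tasks migrated first. The thresholds, data layout, and single load criterion are simplifying assumptions; the surveyed algorithm considers multi-objective criteria when picking the destination VM.

```python
from dataclasses import dataclass, field

@dataclass
class VM:
    vm_id: int
    capacity: float
    tasks: list = field(default_factory=list)   # (task_id, load, priority) tuples

    @property
    def load(self) -> float:
        return sum(l for _, l, _ in self.tasks)

def honey_bee_balance(vms, high=0.8, low=0.4):
    """Tasks leaving overloaded VMs act as forager bees and are placed on
    underloaded VMs; low-priority tasks are migrated first."""
    overloaded = [v for v in vms if v.load / v.capacity > high]
    for src in overloaded:
        src.tasks.sort(key=lambda t: t[2])       # low priority first
        while src.load / src.capacity > high and src.tasks:
            underloaded = [v for v in vms if v.load / v.capacity < low]
            if not underloaded:
                break
            task = src.tasks.pop(0)
            # "Waggle dance": pick the least-loaded underloaded VM as the destination.
            dst = min(underloaded, key=lambda v: v.load / v.capacity)
            dst.tasks.append(task)
    return vms

# Toy usage with hypothetical VMs and (task_id, load, priority) tuples.
vms = [VM(0, 10, [(1, 5, 2), (2, 4, 1)]), VM(1, 10, [(3, 2, 3)]), VM(2, 10, [])]
for v in honey_bee_balance(vms):
    print(v.vm_id, v.load, [t[0] for t in v.tasks])
```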
6 Evaluation of Algorithms Based on Parameters
Based on the parameters described in Table 1, when we analyze the performance of the algorithms as shown in Table 2, we find that each algorithm has its own set of strengths and weaknesses. Some offer better throughput whereas others offer better response time. However, taking all parameters into consideration, we can say that the Ant Colony Optimization Algorithm can be the most advantageous of all, as it provides users with better throughput, has good fault-tolerance capabilities, handles migration scenarios quite well compared to the other algorithms, and is also highly scalable.

Table 1 Various parameters taken under consideration for resource scheduling algorithms

Parameter | Description
Throughput | Specifies the number of transactions or requests that are handled in a certain amount of time. It serves as a measure of how much work a system can complete in a specific length of time
Response time | The time it takes for a system to respond to a user request. Users demand fast response times from cloud computing services; hence it is a critical sign of user experience
Scalability | A system's ability to handle increasing workloads without compromising performance. Because systems must be able to handle variable and shifting workloads, it is an essential performance metric for cloud computing
Migration | Refers to a system's ability to be transferred from one location to another without losing functionality. For the cloud to maintain service continuity and disaster recovery, migration is essential
Fault tolerance | Describes a system's capacity to function even when one or more of its components suffer faults or failures. It is crucial because it guarantees that the system can continue to run even when resources malfunction or become unavailable

Table 2 Behavior of algorithms based on parameters mentioned in Table 1 [21]

Resource scheduling algorithm | State of algorithm | Throughput | Scalability | Response time | Migration | Fault tolerance
BAT algorithm [9] | Depends on implementation | Moderate | Moderate | Moderate | Low to moderate | Low to moderate
Cuckoo search algorithm [22] | Depends on implementation | High | High | Moderate to high | High | Moderate to high
Ant colony optimization algorithm [23] | Dynamic | High | High | High | Low | Moderate to high
Honey bee-based load balancing algorithm [24] | Dynamic | Low to moderate | Low | Low to moderate | Moderate | Low to moderate

7 Research Gap
Based on the analysis of the previous sections, in this section we have identified research gaps as follows:
• Dynamic and Adaptive Resource Allocation: Several resource scheduling algorithms in use today use static resource allocation methods that might not be the best fit for dynamic and unpredictable cloud environments. Algorithms that can quickly adapt to changes in workload demand and resource availability are the need of the hour.
• Task Dependencies: A lot of resource scheduling methods start by assuming, incorrectly, that tasks may be scheduled independently of one another. In practice, however, it is important to consider task dependencies that can occur while allocating
resources. To manage task dependencies and ensure that all tasks are completed effectively, more effective resource scheduling strategies are needed. • Fault Tolerance: Several existing resource scheduling methods may not be suitable for cloud systems that are prone to breakdowns and interruptions since they do not explicitly take fault tolerance into account. More fault-tolerant resource
scheduling algorithms are required to manage unexpected circumstances and guarantee resource availability in the face of failures and interruptions.
• Heterogeneous Environments: There may be a wide range of resource types and performance levels in cloud computing environments. The majority of resource scheduling techniques now in use assume homogeneity and may not be appropriate for diverse environments. There is a need for more effective resource scheduling techniques that can handle various settings and make the best use of the resources at hand.
• Application in Machine Learning: While machine learning has been incorporated into a variety of resource scheduling algorithms for load balancing in the cloud, it is still unclear how to apply machine learning methods to enhance the performance of these algorithms. Further research is needed to examine the potential benefits and constraints of applying machine learning in resource scheduling algorithms, in order to identify the most effective strategies and methods for optimizing load balancing in cloud settings.
8 Conclusion Given the rise of cloud computing as a preferred mode of data storage and computing, the need for better resource scheduling algorithms and better load-balancing techniques is evident, now more than ever. Resource scheduling is an important aspect of cloud computing. It ensures that resources in the cloud are utilized efficiently, and we do so by using various algorithms like the BAT Algorithm, the Honey Bee Based Load Balancing Algorithm, the Cuckoo Search Algorithm, and the Ant Colony Optimization Algorithm, to name a few. The algorithms help achieve the goal of efficient resource utilization by taking into consideration the present state of the environment where the algorithm(s) have to be executed. It is worth noting that swarm intelligence algorithms find a lot of use for this purpose, but often work best under well-defined conditions, as can be seen in the case of the Ant Colony Optimization Algorithm. With the help of these algorithms, we try to improve the overall resource utilization in the cloud and minimize the overall cost associated with cloud computing, while keeping in mind other factors like execution time, scalability, and more.
References 1. Ghafir SM, Alam A, Siddiqui F, Naaz S (2021) Virtual machine allocation policy for load balancing. J Phys Conf Ser 2070(1). https://doi.org/10.1088/1742-6596/2070/1/012129 2. Elmagzoub MA, Syed D, Shaikh A, Islam N, Alghamdi A, Rizwan S (2021) A survey of swarm intelligence based load balancing techniques in cloud computing environment. Electronics (Switzerland) 10(21), MDPI. https://doi.org/10.3390/electronics10212718 3. Abhinav Chand N, Hemanth Kumar A, Teja Marella S (2018) Cloud computing based on the load balancing algorithm. Int J Eng Technol 7(4.7):131. https://doi.org/10.14419/ijet.v7i4.7. 20528
4. Alam M, Khan ZA (2017) Issues and challenges of load balancing algorithm in cloud computing environment. Indian J Sci Technol 10:974–6846. https://doi.org/10.17485/ijst/2017/v10i25/ 105688 5. Chaudhary D, Singh R, Tech CM, Head S (2013) A new load balancing technique for virtual machine cloud computing environment 6. Chawla I (2018) Cloud computing environment: a review. Int J Comput Technol 17(2):7261– 7272. https://doi.org/10.24297/ijct.v17i2.7674 7. Mayur S, Chaudhary N (2019) Enhanced weighted round robin load balancing algorithm in cloud computing. Int J Innov Technol Exploring Eng 8(9S2):148–151. https://doi.org/10.35940/ ijitee.I1030.0789S219 8. Sidhu A, Kinger S (2005) Analysis of load balancing techniques in cloud computing. Int J Comput Technol 4(2):737–741. https://doi.org/10.24297/ijct.v4i2C2.4194 9. Yang X-S (2010) A new metaheuristic bat-inspired algorithm, pp 65–74. https://doi.org/10. 1007/978-3-642-12538-6_6 10. Sharma S, Kr. Luhach A, Sheik Abdhullah S (2016) An optimal load balancing technique for cloud computing environment using bat algorithm. Indian J Sci Technol 9(28). https://doi.org/ 10.17485/ijst/2016/v9i28/98384 11. Islam T, Islam ME, Ruhin MR (2018) An analysis of foraging and echolocation behavior of swarm intelligence algorithms in optimization: ACO, BCO and BA. Int J Intell Sci 08(01):1–27. https://doi.org/10.4236/ijis.2018.81001 12. Vijaya V, Pentapalli G, Kiran Varma R (2007) IJARCCE cuckoo search optimization and its applications: a review. Int J Adv Res Comput Commun Eng ISO 3297(11). https://doi.org/10. 17148/IJARCCE.2016.511119 13. Xu P, He G, Li Z, Zhang Z (2018) An efficient load balancing algorithm for virtual machine allocation based on ant colony optimization. Int J Distrib Sens Netw 14(12). https://doi.org/ 10.1177/1550147718793799 14. Rajab H, Kabalan K (2016) A dynamic load balancing algorithm for computational grid using ant colony optimization. Indian J Sci Technol 9(21). https://doi.org/10.17485/ijst/2016/v9i21/ 90840 15. Suryadevera S, Chourasia J, Rathore S, Jhummarwala A (2014) Load balancing in computational grids using ant colony optimization algorithm. Int J Comput Commun Technol 262–265. https://doi.org/10.47893/ijcct.2014.1255 16. Liu Z, Qiu X, Zhang N (2021) ACPEC: a resource management scheme based on ant colony algorithm for power edge computing. Secur Commun Netw 2021:1–9. https://doi.org/10.1155/ 2021/4868618 17. Soni A, Jain YK (2015) A bee colony based multi-objective load balancing technique for cloud computing environment 18. Piyush Gohel CK (2015) A novel honey bee inspired algorithm for dynamic load balancing in cloud environment. Int J Adv Res Electr Electron Instrum Eng 4(8):6995–7000. https://doi. org/10.15662/ijareeie.2015.0408025 19. Hybrid load balancing approach based on the integration of QoS and power consumption in cloud computing. Int J Adv Trends Comput Sci Eng 10(2):1079–1090. https://doi.org/10. 30534/ijatcse/2021/841022021 20. Panda S, Gupta T, Handa SS (2017) A survey on honey bee foraging behavior and its improvised load balancing technique. SJ Impact Factor: 6 887 [Online]. Available: www.ijraset.com2039 21. Shahid MA, Islam N, Alam MM, Su’Ud MM, Musa S (2020) A comprehensive study of load balancing approaches in the cloud computing environment and a novel fault tolerance approach. IEEE Access 8:130500–130526. https://doi.org/10.1109/ACCESS.2020.3009184 22. Yang X-S, Deb S (2010) Cuckoo search via Levy flights, Mar 2010 [Online]. Available: http:/ /arxiv.org/abs/1003.1594 23. 
Dorigo M, Birattari M, Stutzle T (2006) Ant colony optimization. IEEE Comput Intell Mag 1(4):28–39. https://doi.org/10.1109/MCI.2006.329691 24. Karaboga D (2010) Artificial bee colony algorithm. Scholarpedia 5(3):6915. https://doi.org/ 10.4249/scholarpedia.6915
Study of Architectural Designs for Underwater Wireless Sensors Network in View of Underwater Applications Pooja A. Shelar , Parikshit N. Mahalle , Gitanjali R. Shinde , and Janki Barot
Abstract In recent years, the development of underwater applications has received greater attention from researchers and marine industries. These applications include Underwater Territory Surveillance, Water Quality Monitoring, Tsunami Detection, Enemy Submarine Detection, Mine Detection, Underwater Internet Cable Maintenance, Oil Spill Detection, and many more. They have proven to be beneficial in different fields, from official government projects to business industries. The building block for underwater applications is the Underwater Wireless Sensor Network (UWSN). Researchers are attracted to UWSN because of the number of benefits achieved from different underwater applications. The development process of underwater applications using UWSN follows a path defined by researcher groups, in which the first step is designing the UWSN architecture. To develop a successful underwater application, it is very important to pre-design an appropriate underwater wired or wireless network architecture. In this paper, our main aim is to discuss three basic types of underwater architecture and also to discuss different application-specific UWSN architectures developed in the state of the art. Keywords Underwater wireless sensor network (UWSN) · UWSN architectures · Spatial coverage · Underwater applications
P. A. Shelar (B) Smt. Kashibai Navale College of Engineering, Savitribai Phule Pune University, Pune, India e-mail: [email protected] P. N. Mahalle · G. R. Shinde Vishwakarma Institute of Information Technology, Savitribai Phule Pune University, Pune, India J. Barot Silver Oak University, Ahmedabad, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_47
1 Introduction
Network architecture is a blueprint of the data communication links used for accomplishing an application-specific task. The unique features of UWSN make the design of underwater network architectures different from the design of terrestrial wireless sensor network (TWSN) architectures. These unique features are underwater node mobility, energy consumption, limited bandwidth, Doppler spread, propagation delay, the harsh underwater environment, and so on. Therefore, designing the architecture for a UWSN is a challenging job. In the state of the art, there is no UWSN architecture that can overcome all challenges of UWSN; rather, each architecture tries to mitigate one or two of them, according to the requirements of the underwater application. For example, in designing an architecture for a tsunami detection system, the researcher must focus on a UWSN architecture with minimum energy consumption and propagation delay, as it is a time-critical application. Similarly, for aquatic monitoring applications, the focus is mainly on node mobility and data bandwidth.
1.1 Challenges in Designing Network Architecture for UWSN
Node Mobility: The underwater nodes are mobile due to ocean currents; this mobility is about 2–3 knots [1]. The consequence of node mobility is a rapidly changing network topology with varying time gaps. The spatial variability [2] of underwater nodes affects the connectivity between two nodes, leading to unreliable data transmission and network unavailability. Therefore, designing an underwater architecture under a changing network topology is the biggest challenge.
Energy Consumption: The acoustic modems require more power for data transmission, mainly for two reasons. Firstly, the communication range between two acoustic nodes is small, and hence data transmission requires a number of hops to reach the destination; this hop-by-hop data transmission increases the overall network energy consumption. The second reason for the overall increase in energy consumption is the large proportion of active nodes in the network. Therefore, in [3], the authors reduced energy consumption by controlling the presence of active nodes. The management of each node's energy consumption is an important factor, which should be considered while designing the underwater architecture for any type of underwater application.
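To make the hop-count argument concrete, the toy sketch below estimates the energy spent delivering one packet over a multi-hop acoustic path with a first-order per-bit energy model; the per-bit costs are hypothetical, and the distance dependence of transmit power is deliberately ignored.

```python
def relay_energy(n_hops: int, bits: int, e_tx_per_bit: float, e_rx_per_bit: float) -> float:
    """Energy to deliver one packet over n hops: every hop transmits once and
    every intermediate (relay) node also receives once before forwarding."""
    return n_hops * bits * e_tx_per_bit + (n_hops - 1) * bits * e_rx_per_bit

# Hypothetical figures: a 2000-bit packet over 1 hop versus 5 short acoustic hops.
print(relay_energy(1, 2000, 2e-6, 5e-7))   # single hop
print(relay_energy(5, 2000, 2e-6, 5e-7))   # five hops: more energy drawn network-wide
```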
1.2 UWSN Architecture Designing Steps
The design of a UWSN architecture, for our understanding, can be divided into three basic stages, as shown in Fig. 1: (i) communication channel selection, (ii) node mobility, and (iii) spatial coverage. The choice made at each stage is directly related to the underwater application requirements.
Communication Channel Selection: The UWSN communication channels are acoustic, radio, and optical [4]. The acoustic medium is used for applications where propagation delay can be tolerated but reliable data delivery over long distances is important. Radio waves are mostly used to transmit data from sink nodes to offshore stations or to satellites [5]; therefore, radio waves are used in all underwater applications and hence in UWSN architecture design. The optical communication channel can be used in time-critical applications where propagation delays are not acceptable and high-speed, secure transmission of data is of paramount importance.
Node Mobility: The underwater sensor nodes can be of two types, stationary and mobile. Stationary nodes: The nodes which are fixed to the seabed are called stationary nodes. These nodes are used by underwater applications with limited or fixed monitoring areas. The topology remains fixed with stationary nodes, but this scenario is not practical. Mobile nodes: The nodes which drift with ocean currents are known as mobile nodes. They are used in underwater applications which require huge area coverage. The mobile nodes can also be autonomous underwater vehicles (AUVs) and remotely operated vehicles (ROVs) along with ordinary nodes. In a UWSN with mobile nodes, the topology changes with time.
Spatial Coverage: The UWSN is a wireless underwater network that is used to perform a task within specific boundaries. It incorporates acoustic or optical sensors and robotic vehicles that are trained to work cooperatively through a wireless communication medium. The surface node, which is generally a buoy, has the capability to retrieve the sensed data from the underwater nodes and transmit it to the onshore station using long-range radio waves.
Fig. 1 UWSN architecture designing steps
The aggregated data is used on a local basis or connected to a different network for some specific purpose. In the literature, two basic and benchmark UWSN architectures have been developed and are named as per their spatial coverage: the 2D and 3D UWSN architectures. However, the topology of UWSN is still an open research issue that needs further analysis and more investigation from the research community.
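Before turning to these two benchmark architectures, the three design stages above can be summarized as a small decision sketch. The requirement flags, thresholds, and the mapping below are illustrative assumptions that merely encode the guidance given in this section, not a prescriptive design rule.

```python
from dataclasses import dataclass

@dataclass
class AppRequirements:
    time_critical: bool          # are propagation delays unacceptable?
    long_range_links: bool       # must node-to-node links span long distances?
    wide_area_coverage: bool     # does the application need huge area coverage?
    water_column_sensing: bool   # must events in the water column be observed?

def design_uwsn(req: AppRequirements) -> dict:
    """Map application requirements onto the three design stages of Fig. 1."""
    # Stage (i): optical for time-critical, short-range links; acoustic otherwise.
    channel = "optical" if req.time_critical and not req.long_range_links else "acoustic"
    return {
        "underwater_channel": channel,
        "sink_to_shore": "radio/satellite",   # radio links connect the buoy to the shore
        "node_mobility": "mobile (AUV/ROV assisted)" if req.wide_area_coverage else "stationary",
        "spatial_coverage": "3D" if req.water_column_sensing else "2D",
    }

# Toy usage: a short-range, time-critical monitoring task over the water column.
print(design_uwsn(AppRequirements(True, False, False, True)))
```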
1.3 Two-Dimensional UWSN Architecture
A pictorial architecture for the 2D-UWSN is shown in Fig. 2. Among all underwater nodes, almost 90% of them are anchored to the seabed with the help of deep-sea anchors. These nodes are interconnected to each other and to one or more surface sink nodes through wireless acoustic communication links. The communication link between sensors and sink can be a direct link or a multihop path. The sink node is responsible for relaying data from the ocean bottom to the onshore station. This is possible as the sink is equipped with vertical and horizontal transceivers: the sink node sends commands to the bottom sensor nodes and in response collects the sensed data using its horizontal transceiver, and it then sends the aggregated data to the onshore station using its vertical transceiver. It is also equipped with radio-frequency or satellite transmitters to communicate with the onshore station. The simplest way of designing the architecture is for each node to send its aggregated data directly to the selected sink node, i.e. a one-dimensional UWSN. But this way consumes more energy, as in many cases the sink is far from the node. Also, one-to-one communication links are responsible for a reduction in network throughput because of increased signal interference. In the case of multihop paths, the data is forwarded through intermediate sensor nodes until it reaches the destination sink node located on the water surface, thus making the network energy-efficient and increasing network availability. But two-dimensional networks at the same time increase routing complexity.
1.4 Three-Dimensional UWSN
The three-dimensional UWSN is used for sensing or detecting underwater events that cannot be observed by two-dimensional UWSNs. Three-dimensional UWSNs cooperatively sample the three-dimensional oceanic environment. This is possible due to the vertical and horizontal deployment of nodes at different depths. These nodes, along with autonomous vehicles, work in coordination to sense application-specific underwater data. The nodes are kept at different depths using one of two methods. The first method is to attach each node to a buoy through a wire whose length can be adjusted to a different depth for each sensor node. The deployment of nodes in this method is easy and less time-consuming, but at the same time, a greater number of floating buoys may obstruct ships' routes.
Fig. 2 Two-dimensional UWSN architecture diagram
The second approach is to anchor each node to the ocean bottom and adjust the depth of each node by regulating the length of the rope connecting it to the anchor. An electronic engine residing on each node is used to adjust the rope length (Fig. 3).
Fig. 3 Three-dimensional UWSN architecture diagram
2 Literature Survey
Trees of Wheels (ToW) Architecture [6]: In underwater networks, a large number of sensor nodes are deployed by simply dropping them into the water, so the location coordinates of the nodes cannot be anticipated. Broadcasting location information to neighbouring nodes is the traditional method of figuring out the resulting random topology. The way the sensors are deployed decides the topology of the network and indirectly affects various network factors such as node power consumption, communication reliability, network performance, and the fault-tolerance capability of the network. Therefore, Safieh Khodadoustan et al. analyzed how to deploy sensors in the underwater environment and came up with an adaptive topology that combines a clustering method with a hierarchical structure, known as Trees-of-Wheels (ToW). The proposed architecture is hierarchical, scalable, and fault tolerant, and it can estimate end-to-end delays. Each node in this architecture is denoted as ToW (K, L, M), where K stands for the number of subnetworks a node is connected to, L stands for the level, and M is the amplitude. For example, if the node parameters are ToW (K, 1, 2), then it is a tree with two children which are connected to each other like a circular ring, and they also have a centre node to which they are connected. The unique features of this architecture are that each node has a different address and each centre node is connected to its father node through the nodes of other clusters, and similarly connected to its child nodes.
Tic-Tac-Toe-Arch [7]: The architecture aims to minimize the overall energy consumption of the network while providing reliable data transmission from source node to sink node. In this study, the authors proposed an algorithm based on a sleep and wake-up schedule that keeps redundant nodes in sleep mode. As a result, the majority of nodes are in the sleep state, reducing the overall active time of the network. The notation 'O' is used to denote active nodes whereas 'X' is used for sleeping nodes; with these notations, the side view of the UWSN architecture resembles 'X-O-X-O'. The algorithm increases network lifetime by scheduling the redundant nodes to stay in the sleeping state. The architecture design goals are to accomplish an efficient UWSN that can provide data reliability and virtual networking and that can manage a changing network topology. The connectivity between mobile source nodes and surface nodes is maintained through the virtual architecture. The proposed architecture formation is divided into three phases; after completing the network configuration, the nodes are ready to transmit data.
Phase I: To achieve data reliability, the network architects ensure a minimum of one neighbour for each sensor node. The neighbours are found using request and reply messages broadcast by a sensor node that is willing to connect to a neighbouring node coming within its range.
Phase II: The best neighbour is selected on the basis of two parameters. The first is the amount of time required by a neighbouring node to go out of the communication range of the sender node. This time period is named the 'time-to-disconnect'
(td) value. The td value varies with node mobility: the higher the mobility, the shorter the time-to-disconnect. The second important parameter is the amount of battery power remaining in the active nodes.
Phase III: The last phase is responsible for minimizing the energy consumption; all the neighbours except the selected one are scheduled as redundant nodes for the td period. Command messages are broadcast in this phase to inform all neighbours about the sleep-wakeup scheduling. Previous UWSN architectures, such as EDETA and Multipath Virtual Sink, do not take the concept of topological change into consideration. Therefore, the Tic-Tac-Toe architecture can be considered an energy-efficient architecture compared to previous architectures.
3D-Cluster-Based UWSN Architectures for Detecting Underwater Intruders [8]: The authors Md. Farhad Hossain, Musbiha Bintem, et al. [9] proposed a three-dimensional UWSN for intruder detection using electromagnetic (EM) waves as the underwater communication medium. The designed architecture is a three-dimensional cluster-based network in which the cluster heads can detect an intruder. The data collected from the sensor nodes and cluster heads is further transmitted to an onshore station with the intention of identifying the 3D location of the intruder. The data used by the onshore station for identifying the intruder are the received signal strength and the location points of the sensor nodes and cluster heads. The project was application specific, as it aimed to localize an intruder for an underwater surveillance network. The accuracy of the identified intruder location is verified using the normal mean square error and the square root of the mean square error. A sensor node can sense the presence of an attacker and then accordingly transmit its location coordinates to the cluster head using electromagnetic waves. The sensor nodes and cluster heads are programmed to switch to active mode after identifying an intruder and after receiving the location information and signal power. The crucial data is further relayed to the cluster heads of the immediately upper layer, and the topmost-layer cluster head relays the data to the floating sink node. The data reaches its final destination, from sink node to onshore station, where the location of the intruder is estimated. The proposed work is based on electromagnetic waves because, according to the authors' research, acoustic-wave-based UWSN projects fail to detect attacking nodes at the fastest possible speed. Most of the UWSNs developed worldwide are based on acoustic communication. However, the lower transmission speed of sound waves and the increasing number of autonomous underwater vehicles suggest that there is an urgent need for an alternative underwater communication medium such as optical or electromagnetic waves. This proposed UWSN architecture is used for the surveillance of water boundaries, mine detection, prevention of natural calamities, and oceanographic applications.
Energy-Efficient Adaptive Hierarchical and Robust Architecture [9]: The sensor nodes of the proposed architecture can perform cluster formation. A tree architecture is formed between the cluster heads in order to send the aggregated data from the other nodes to sink nodes through multiple hops. The architecture works on the EDETA routing protocol with more than one sink; the multi-sink architecture makes it more scalable and fault tolerant. The protocol works in two phases, the first being initialization and the second the normal operation phase. The cluster formation and the cluster-head selection
are done in the initialization phase, whereas the actual data transmission is done in the second phase. In order to reduce energy consumption, the network structure is broken after a certain period of time and the first phase starts again.
Multipath Virtual Sink Architecture [10]: The authors Seah and Tan developed an architecture for underwater networks that aims to provide energy efficiency and robustness. The architecture delivers data using a multipath routing scheme to overcome the long delays and harsh underwater channel conditions. It consists of clusters, with each cluster having one cluster head, and forms a mesh topology which is connected to a local sink node in order to increase the network lifetime. High-speed links such as optical fibre or wires are used to connect the local sink node to a surface buoy. The multipath virtual sink architecture is built around guaranteed data reception by one or more sink nodes. It is named multipath as it is designed to maintain 'n' routes to the neighbouring local sink nodes. The architecture is robust and reliable, as it ensures data delivery through the multipath routing and virtual sink strategies.
Inter Cluster and Intra Cluster Architecture [11]: Underwater nodes consume a huge amount of energy, as acoustic signals require higher power for transmission and higher propagation delays increase the data transmission time. The replacement of node batteries in a UWSN is also difficult, which makes energy a precious resource of UWSN. Therefore, the authors N. Goyal et al. proposed the Minimum Average Routing Path Clustering Protocol (MARPCP), fuzzy logic for cluster head (CL) and cluster size (CS) selection, and Multi-path Routing-LEACH methods. The basis of these three methods is that accurate selection of cluster heads, cluster size, and routing strategies will indirectly reduce the energy consumption. In most of the literature, the cluster head and cluster size are elected and selected using three parameters: a node's residual energy, its distance from other nodes, and node density. In this work, the researchers introduced two additional parameters for electing the CL and estimating the CS: 'Node Load' and 'Link Quality'. To achieve this, an intra- and inter-cluster communication strategy is proposed in this paper. MARPCP is used for intra-cluster transmission whereas the LEACH method is used for inter-cluster communication. The MARPCP algorithm divides the sensor network into clusters and gives the sink node the right to choose cluster members, and the cluster head is elected by the cluster members. The cluster head then transfers the aggregated information from the cluster members to the cluster sink. For this, the CH broadcasts a HELLO message on behalf of its cluster. The sink receives multiple requests, among which it selects those cluster heads which are at the nearest distance. The cluster heads use neighbouring clusters to transmit their data to the sink node via multiple hops. This type of communication is called inter-cluster communication, and the LEACH algorithm is used in the paper to achieve it. It is responsible for creating multiple paths between sender and receiver. The algorithm reduces the long range of data transmission to the sink and also reduces the energy consumption, therefore making the algorithm an
appropriate choice for UWSN. The performance of the inter- and intra-cluster communication was evaluated using simulations. The results showed that the proposed algorithm reduces energy consumption and end-to-end delays.
OA-UWSN Architecture [12]: The authors identified the problems of slow and unreliable data transmission in UWSNs with acoustic sensors. To overcome this drawback, the author Zhe Chen proposed a wireless network with a combination of acoustic and optical sensors. The combination of acoustic and optical communication media has become one of the main communication methods for recent underwater wireless networks. The researchers also designed a new MAC protocol with an acoustic and optical handshake, with the purpose of achieving communication reliability. The optical-acoustic UWSN includes four types of nodes: (i) fixed nodes, (ii) mobile nodes, (iii) sink nodes, and (iv) the control centre. The fixed nodes do the primary job of collecting and transmitting data and then shift to sleep mode for another time period. Mobile nodes are used for analysing the environment, which minimizes energy consumption and increases network life. The sinks are data-aggregating nodes which collect all the data from the mobile nodes and relay it to the control centres or onshore station. The communication between the sink and the control centre is done using a radio channel, whereas acoustic and optical channels are used for communication between mobile and sink nodes. The results are obtained using the OPNET++ simulation tool. The simulation results prove that the proposed protocol can perform reliable data transmission with minimum energy consumption compared to underwater acoustic networks.
Architectures for Monitoring Underwater Pipeline [13]: The seabed has a mesh of pipelines used for different purposes. For example, optical fibre cables are laid to connect the world through the Internet, and there are also underwater pipelines which are used to transfer oil, natural gas, or petroleum from Galiot to the outside world. The maintenance of underwater pipeline infrastructure is considered a most important asset for maintaining the financial growth of manufacturer and consumer countries. Therefore, understanding the depth of the problem, the researchers in [20] developed and compared various UWSN architectures that can be used for monitoring pipelines. These architectures are an underwater wired network, an underwater acoustic wireless sensor network, an underwater RF (radio frequency) wireless sensor network, and integrated wired and radio or acoustic wireless sensor networks. These network architectures are compared using three factors: physical security, connectivity of the network, and the power supply to the network. In this research work, the authors developed and analysed a variety of sensor network architectures for pipeline monitoring. The five underwater architectures discussed in this work have different degrees of reliability. In wired networks there is a possibility of a single point of failure, resulting in non-reliable data transmission. According to the researchers, by using long-range communication channels one can increase the reliability of underwater wireless networks, but an increase in wireless transmission range leads to increased energy consumption. In the integrated wired/wireless networks, the batteries are used only if there is a fault in the wires; thus, limitations of the power supply are not a problem for integrated networks.
Underwater Swarm Architecture [14]: The researchers presented a combined acoustic and optical approach for underwater swarm communication. The swarm is a wireless network of mobile nodes and is smart enough to adapt itself to different networking requirements depending on the arrangement of the nodes. Such networks are required when an application has two opposite requirements, for example, to transmit data at low or high speed over long or short distances. The origin of this work was a project studied on the VENUS vessel, which required an underwater communication network that functions beyond present underwater modems. To fulfil the project needs, a specially designed sensor was developed, known as SWAN-SAR (SWArm Networking-Short Acoustic Range). The nodes were special due to characteristics such as the low power required for data transmission and the reduced multipath effect due to the high modulation frequency. The combined use of acoustic and optical communication systems is the best choice for swarm architectures, as these two techniques are viewed as complementary to each other. The work states that the proposed system is highly flexible, as the acoustic-optical system can operate in combined, individual, or hybrid mode; the choice of mode depends on the swarm control system and is decided at run time.
Aqua-Net [15]: The UWSN has emerged as a backbone for marine applications. The proposed work aimed to provide a common platform, named Aqua-Net, that eases the process of underwater application development. It follows a layered structure and serves as a base for the design of underwater architectures and protocols. The framework also supports cross-layer design, as it includes a vertical layer which is accessible to all applications and protocols. Aqua-Net has been used to implement one MAC protocol and two routing protocols. The three features of Aqua-Net are as follows: (i) an expandable protocol stack, so it is easy to incorporate a new module into the Aqua-Net stack; (ii) it is easy to debug the system software; and (iii) it is easy to update the protocol stack. Aqua-Net also facilitates a cross-layer architecture, because sticking to a layered design is not sufficient to work with a dynamic UWSN. The cross-layer design is important because (i) battery consumption is affected by all layers, so it is important to involve all layers in reducing energy consumption; (ii) some parameters of a network protocol may require access to multiple layers; and (iii) a cross-layer approach is important for increasing overall system performance. The protocol layers are implemented in kernel or user space; the Aqua-Net system is run in user space to make the system user-friendly and flexible to use. Therefore, Aqua-Net is a suitable platform for underwater application development.
Cluster Based Energy-Efficient Architecture [16]: A UWSN has to face many challenges such as limited bandwidth, high propagation delays, minimum power, node mobility, and many more. Among all the above problems, the most crucial issues are node energy and reliability, as the underwater nodes cannot be charged and their batteries are difficult to replace. Therefore, to address this issue, the authors in [] proposed an energy-aware clustering architecture with reliable data transmission. The architecture works in a two-step procedure: the first step is to select the cluster heads, while the second step is searching for the shortest path between the head node and the sink by using the Euclidean distance formula. The cluster head selection is done by
selecting the nodes with the maximum amount of energy. To calculate energy efficiency, the researchers used two parameters, throughput and reliability [13]. There is also an assumption made that at least one node has the maximum energy level; if no node has maximum energy, then no cluster head will be selected. The experimental results showed that the proposed work can decrease the communication cost and increase the network availability to some extent.
UWSN Architecture for Event Detection [17]: UWSNs have proven to be a promising solution for exploring the underwater world, and nowadays, to encourage research on underwater environments, many sensor nodes are deployed on the seabed [1]. The issue tackled in this research work is identifying an event and accordingly performing an action in response, while being aware of two challenges of UWSN, namely the limited battery of sensor nodes and the location identification of nodes. UWSN applications like underwater pollution monitoring or marine fish tracking depend on monitoring the underwater environment. The underwater nodes generally sense and record sensory data depending on the application requirement. The gathered data is then routed to sink nodes in a reactive or proactive fashion [5]. Therefore, for efficient data sensing the network should be energy efficient and at the same time should determine the exact event location. For example, consider a UWSN application for monitoring marine shellfish, where water properties like temperature, salinity, and pH value are gathered and analysed to detect a pollution event. An immediate and proper response should be generated for event coverage. When there is no event, more than half of the sensor nodes do not need to route their data to the sink nodes; this approach is used to achieve an energy-efficient underwater wireless sensor network. The architectural design for event determination is actually a set of nodes deployed in a hierarchical fashion in a 3D Euclidean space, where the sink node is embedded with a global positioning system. The sink node is generally a buoy with radio and acoustic transceivers, whose purpose is to connect the underwater nodes and the base station. The sensor nodes in this architecture are static, mobile, or semi-mobile. Therefore, the cost of localising a node may be high due to the nodes drifting with the water flow. Thus, a representative of any physical node, known as a virtual node, is used as a cheaper option for accurate event detection. In accordance with the above discussion, this paper proposed a prediction model for event localization, which is embedded into the sink node.
Pipeline Architectures for Underwater Big Data Analytic [18]: Tsunami monitoring applications, marine object tracking, water quality monitoring systems, three-dimensional map generation of seabeds, and many more such applications generate an enormous amount of big data. There are various techniques used to collect and relay data to the sink node, such as wired or wireless underwater networks, or using an AUV to collect and relay data to a predefined location. Among all methods, wireless underwater communication has proven to be the best solution for collecting and relaying data to sink nodes, due to its cost and power efficiency with long-distance coverage. Thus, to address this problem, the authors proposed several pipeline architectures for analysing big data. The proposed architecture increases the speed of data processing and transmission.
They have also proposed five different types of underwater nodes
to construct this pipeline architecture. These nodes are named as per their functionalities: sensing nodes (s-nodes), sensing-reduction nodes (sr-nodes), processing-only nodes (p-nodes), super nodes, and gateway nodes (g-nodes). The paper proposed a set of four pipeline architectures by incorporating the five different network node types. There are two pipeline architectures with p-nodes, known as the homogeneous and heterogeneous pipeline architectures, and a third architecture is developed by using super nodes. The pipelining aims to minimize propagation delay, reduce energy consumption, and increase overall throughput.
2.1 Comparative Study
See Table 1.
3 Proposed Architecting Methodology
The comparative study and the ever-increasing number of underwater applications suggest the need for innovative underwater architectures. Studying the pros and cons of the underwater environment and of the different underwater architectures led us to propose a generic UWSN architecture. The word generic is used because it will serve as a common platform for multiple UWSN applications. The proposed architecture is named the Acoustic and Optical Sensor-based Clustered Underwater Wireless Architecture (AO-CUWA). As the name suggests, the architecture is designed with a combination of optical and acoustic sensors. The reason is the distance travelled by optical and acoustic waves: the coverage of the optical communication medium is 10–30 km, whereas acoustic coverage extends up to 100 km. In the state of the art, UWSN architectures can be designed using acoustic sensors alone, but this hampers data reliability and data security, which are better achieved in combination-based architectures. The basic function of all underwater applications is sensing data and forwarding it to the sink node or surface buoys using acoustic or optical communication media; the sink node then transfers the data to an offshore station over a radio link. In UWSNs, sensing accurate data, data reliability, data security, and data analysis are of prime importance. The energy of underwater nodes is another concern, as underwater sensor batteries cannot be replaced or recharged. Therefore, systematic data transmission with minimum energy consumption is one of the biggest challenges for UWSNs, and clustering is one of the best options for addressing it. AO-CUWA is therefore a cluster-based architecture in which only the optical sensors are clustered, since the optical sensors are the major players of AO-CUWA and require the most node energy to sense and relay data. The clustering strategy used by AO-CUWA is the OSNR-based Energy Efficient Clustering Algorithm (OECA)
Table 1 Comparative study of different UWSN architectures

UWSN architecture name | Energy consumption | Security (Y/N) | Application specific | Clustering (Y/N) | Spatial coverage
Trees of wheels (ToW) architecture [6] | Not considered | N | NS | Y (self-configuration algorithm, ranking algorithm) | 3D
Tic-Tac-Toe [7] | Less | N | Non-time-critical applications | N | 3D and 2D
3D cluster-based UWSN [8] | More | Y | Surveillance applications | Y (EM-based UWSN) | 3D
EDETA [9] | Less | N | Real-time testing of other MAC protocols | Y (LEACH) | 3D & 2D
Multipath virtual sink architecture [10] | More | N | Environmental monitoring and surveillance applications | N | 3D
Inter-cluster and intra-cluster architecture [11] | Less | N | NS | Y (MARPCP) | 2D
OA-UWSN architecture [13] | Less | N | Time-critical applications | N | 3D
Architectures for monitoring underwater pipeline | Less | N | Pipeline monitoring | N | 2D, 3D
Underwater swarm architecture | Less | N | Applications with two opposite requirements | N | 3D
Aqua-Net | Less | N | All applications | NA | NA
Cluster-based energy-efficient architecture | Less | N | Long-term applications | Y (clustering based on node energy) | NA
UWSN architecture for event detection | Not considered | N | Marine shellfish tracking, environment monitoring | N | 3D
Pipeline architectures for underwater big data analytic | Less | N | Tsunami monitoring applications, marine object tracking | N | 3D
Fig. 4 Proposed acoustic and optical sensor-based clustered underwater wireless architecture
[15]. The proposed architecture is highly secure, as both the acoustic and the optical communication channels are protected with post-quantum cryptographic algorithms. The cryptographic algorithms are designed with the discipline of minimum energy and memory consumption. To the best of our knowledge, very few architectures focus on underwater data security. Figure 4 gives an idea of the proposed architecture: the orange nodes denote optical sensors, whereas the blue nodes are acoustic sensors; the circles denote clusters of optical sensors; and the dotted lines denote the acoustic communication medium, whereas the orange lines indicate the optical communication medium (Fig. 4).
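To give a rough flavour of the clustering step, the sketch below picks a cluster head among optical nodes by scoring residual energy together with a measured OSNR value. The node record, the score, and the 0.6/0.4 weighting are illustrative assumptions only; they are not the OECA algorithm specified in [15].

```cpp
#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

// Illustrative optical node record; fields and weights are assumptions,
// not the actual OECA formulation from [15].
struct OpticalNode {
    std::string id;
    double residualEnergy;  // joules remaining in the battery
    double osnrDb;          // measured optical signal-to-noise ratio (dB)
};

// Score a node: favour high residual energy and high OSNR.
// The 0.6/0.4 weighting is arbitrary and only for illustration.
double score(const OpticalNode& n, double maxEnergy, double maxOsnr) {
    double e = (maxEnergy > 0) ? n.residualEnergy / maxEnergy : 0.0;
    double s = (maxOsnr > 0) ? n.osnrDb / maxOsnr : 0.0;
    return 0.6 * e + 0.4 * s;
}

// Pick the cluster head of one optical cluster.
const OpticalNode* selectClusterHead(const std::vector<OpticalNode>& cluster) {
    if (cluster.empty()) return nullptr;
    double maxE = 0.0, maxS = 0.0;
    for (const auto& n : cluster) {
        maxE = std::max(maxE, n.residualEnergy);
        maxS = std::max(maxS, n.osnrDb);
    }
    const OpticalNode* best = &cluster.front();
    for (const auto& n : cluster)
        if (score(n, maxE, maxS) > score(*best, maxE, maxS)) best = &n;
    return best;
}

int main() {
    std::vector<OpticalNode> cluster = {
        {"o1", 80.0, 18.0}, {"o2", 95.0, 12.0}, {"o3", 60.0, 22.0}};
    const OpticalNode* head = selectClusterHead(cluster);
    if (head) std::cout << "Cluster head: " << head->id << '\n';
}
```

In practice the head would be re-elected periodically so that the energy drain rotates across the cluster rather than exhausting a single node.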
4 Conclusion
In this work, we presented an overview of different types of UWSN architectures. The state-of-the-art review and the comparative study will ease the underwater application development process. Architecture design varies with application requirements: time-critical applications require high-speed data transmission, whereas surveillance applications require secure data transmission. The comparative study shows that the security of underwater data transmission is a less explored area. In future work, we aim to investigate the implementation issues of underwater architectures as well as evaluate different technologies for implementing them. These research directions are a step towards underwater application development.
References
1. Yang G et al (2018) Challenges and security issues in underwater wireless sensor networks. Elsevier
2. Akyildiz IF, Pompili D et al (2007) Challenges for efficient communication in underwater acoustic sensor networks. ACM
3. Detweiller C et al (2007) An underwater sensor network with dual communications, sensing, and mobility. IEEE, OCEANS
4. Detweiller C et al (2007) The state of the art in underwater acoustic telemetry. IEEE, OCEANS
5. Awan KM et al (2019) Underwater wireless sensor networks: a review of recent issues and challenges. Hindawi
6. Khodadoustan S et al (2013) Tree of wheels: a new hierarchical and scalable topology for underwater sensor networks. Wirel Inf Netw Bus Inf Syst
7. Ojha T et al (2013) Tic-Tac-Toe-Arch: a self-organising virtual architecture for underwater sensor networks. IET Wirel Sens Syst
8. Mangla J, Rakesh N et al (2018) Cluster-based energy-efficient communication in underwater wireless sensor networks. In: International conference on smart system, innovations and computing, smart innovation, systems and technologies. Springer
9. Climent S et al (2012) Underwater sensor networks: a new energy efficient and robust architecture. Sensors
10. Seah WKG et al. Multipath virtual sink architecture for underwater sensor networks. IET Wirel Sens Syst
11. van Kleunen W. Proteus II: design and evaluation of an integrated power efficient underwater sensor node. Int J Distrib Sens Netw
12. Mohamed N et al (2011) Sensor network architectures for monitoring underwater pipeline. Sensors
13. Ojha T et al (2013) Into the world of underwater swarm robotics: architecture, communication, applications and challenges. Sensors
14. Peng Z et al (2018) Aqua-Net: an underwater sensor network architecture—design and implementation. Wirel Sen Syst
15. Shelar PA, Mahalle PN, Shinde GR, Bhapkar HR, Tefera MA (2022) Performance-aware green algorithm for clustering of underwater wireless sensor network based on optical signal-to-noise ratio. Math Probl Eng
IoT Based Single Axis Sun Tracking Solar Panel D. Priyanka, M. Yasaswi, M. Akhila, M. Leela Rani, and K. Pradeep Kumar
Abstract Analysis and implementation of a solar tracking device to produce energy from renewable sources are the project's main goals; the device is intended to help meet the demand for electricity in rural areas. With the help of an Arduino Uno, LDR sensors, a motor, and a solar panel, the panel tracks the path of the sun to maximize energy collection. The energy generated by the solar panel is stored in a battery through the charge controller and supplied to the anticipated loads. After the data has been analyzed, a comparison between clear and cloudy days is made hour by hour. We also implement solar home automation in order to make efficient use of the solar energy stored in the battery and to automatically turn loads on and off based on the quantity of solar energy used. An LCD display is added to show the voltage generated and consumed, as well as the humidity and temperature of the day. Keywords Arduino · Solar · Stepper motor · LDR · Tracking
1 Introduction
Mankind has investigated more and more methods and tools for producing electrical energy from renewable energy sources. Renewable energy is energy produced from naturally replenishing resources. Of all the discovered sources of sustainable energy, solar energy is the most appropriate [1]. All living creatures receive energy, heat, and light from the sun. Solar power is a renewable resource that can be used at no cost. In addition, solar energy is pollution-free, good for the ecosystem, and abundant. Solar energy is the energy that the sun produces as solar rays. Solar panels capture solar radiation from the sun and convert it into electrical energy. Intensifying the light that strikes a solar panel is one way to increase its efficiency [2].
D. Priyanka (B) · M. Yasaswi · M. Akhila · M. L. Rani · K. P. Kumar Seshadri Rao Gudlavalleru Engineering College, Gudlavalleru, Andhra Pradesh, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_48
The best solar panel technology keeps the panels aligned with the sun's position to improve efficiency. Solar power has the benefit of being portable: it can be used anytime and anywhere small-scale power generation is required. Single-axis trackers are a type of technology that moves a solar panel along one axis to follow the sun's shifting position as the days and years go by [3]. The panel is adjusted to achieve the lowest angle of incidence (the angle at which the sun's rays hit the solar panel). The trackers tilt to follow the sun as it travels from east to west throughout each day in order to maximize energy production. Trackers can be passive or active: passive trackers use a compressed fluid that shifts from side to side when warmed by the sun, causing the tilt to change, while active trackers use gears and motors to move the solar panels. Trackers can also be kept in alignment by determining an ideal panel angle based on GPS data or, more simply, by rotating the panels anticlockwise [4].
2 System Description
The suggested system is a single-axis sunlight tracking system based on an adaptable and portable Arduino platform. The Arduino is connected to the stepper motor, which rotates the panel, and to two LDR sensors, which sense the light [5]. The Arduino is also connected to the load and to a display that shows the temperature, voltage, humidity, and charge of the battery. Using data from the two LDR sensors, a motor controller, and the stepper motor, the Arduino rotates the solar panel towards the lit position [6]. The voltage sensor is built using the voltage divider technique. The energy produced by the solar panel is stored in the battery and then supplied to the loads. A solar tracker carries out the tracking function by shifting or modifying the angle of the solar panels in response to the sun's position [7]; the movement is governed by the time of day. A single-axis solar tracking device uses a tilted PV panel mount and a single electric motor to move the panel approximately in relation to the sun's position [8]. The rotation axis may be oblique, vertical, or horizontal. Figure 1 shows a block diagram of the single-axis solar tracking device.
2.1 Arduino Uno
The Arduino Uno is an open-source microcontroller board based on the Microchip ATmega328P microcontroller [9]. The board has sets of digital and analogue input/output (I/O) pins that enable it to connect to various expansion boards (shields) and other electronics [10]. It provides 14 digital I/O pins, of which 6 can be used as PWM outputs, and 6 analogue input pins. It can be programmed using a type B USB connection and the Arduino IDE (Integrated Development Environment). It can be powered by a USB cable or an external 9 V battery, and it accepts voltages
Fig. 1 Solar tracking system block diagram
Fig. 2 Standard board of Arduino
between 7 and 20 V. It shares some similarities with the Arduino Mini and Leonardo (Fig. 2).
2.2 Solar Panel
A solar panel, also known as a "PV panel," is a piece of equipment that converts the energy contained in the particles that make up sunlight into electricity that can be used to power various electrical devices [11]. In addition to generating electricity for residential and business solar electric systems, solar panels can be used for a wide variety of other purposes, including remote power systems for cabins, telecommunications equipment, remote sensing, and more (Fig. 3).
Fig. 3 Solar cell
Fig. 4 Stepper motor
2.3 Stepper Motor
Stepper motors consist of several "toothed" electromagnets arranged as a stator around a central iron rotor [12]. An external motor circuit or a microcontroller powers the electromagnets. One electromagnet is energized so that it magnetically attracts the gear teeth and turns the motor shaft. When the gear teeth are aligned with the first electromagnet, they are slightly misaligned with the next one; in other words, when the first electromagnet is turned off and the next one is switched on, the gear rotates slightly to line up with the next electromagnet. The procedure is then repeated [13]. Each of these rotations is called a "step," and an integer number of steps makes up a complete rotation. As a result, the motor can be turned by a precise angle (Fig. 4).
2.4 LDR
The Light Dependent Resistor (LDR) is a special form of resistor that, as its name implies, changes resistance in response to light intensity [14]. It operates on the photoconductivity principle: its resistance falls as the light intensity rises. It is frequently used as a light sensor, in light meters, in automatic street lamps, and in locations where light
Fig. 5 LDR
sensitivity is required. Another name for it is a light sensor. LDRs typically come in 5, 8, 12, and 25 mm sizes (Fig. 5). To obtain the required output voltage signal, a 4.7 kΩ resistor is used together with the LDR as a voltage divider, with a regulated 5 V supply. The output voltage is determined from the following equation [15]:
Vout = [R2/(RLDR + R2)] Vcc, with R2 = 4.7 kΩ and Vcc = 5 V.
With RLDR(max) = 1 MΩ, the output voltage is about 0.02 V; with RLDR = 1 kΩ and R2 = 4.7 kΩ, the resulting output voltage is about 4.12 V.
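The divider arithmetic above can be checked directly on the board. The short Arduino sketch below, given only as an illustration, reads the divider output on analog pin A0 (an assumed wiring), converts the 10-bit ADC count back to volts, and estimates the LDR resistance from the same equation.

```cpp
// Minimal sketch (Arduino C++) for reading the LDR voltage divider.
// Assumes the divider output is wired to A0; R2 = 4.7 kΩ and Vcc = 5 V
// as in the example above.
const int   LDR_PIN = A0;
const float VCC     = 5.0;
const float R2_OHMS = 4700.0;

void setup() {
  Serial.begin(9600);
}

void loop() {
  int raw = analogRead(LDR_PIN);          // 0..1023 on a 10-bit ADC
  float vout = raw * VCC / 1023.0;        // convert the count back to volts
  // Vout = Vcc * R2 / (Rldr + R2)  =>  Rldr = R2 * (Vcc - Vout) / Vout
  float rldr = (vout > 0.01) ? R2_OHMS * (VCC - vout) / vout : -1.0;
  Serial.print("Vout = "); Serial.print(vout, 2);
  Serial.print(" V, Rldr ~ "); Serial.print(rldr, 0);
  Serial.println(" ohm");
  delay(1000);
}
```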
3 Implementation of Software
3.1 System Flowchart
Initialization starts with the header files, variables, inputs, and outputs. The data from the LDRs is then read [16], and the difference between the two LDR readings is compared with a sensitivity value. If the measured difference between the pair of sensors is greater than the sensitivity value, the direction pin on the Arduino is set HIGH and the motor turns clockwise (CW). Alternatively, if the Arduino's direction pin is LOW, the motor rotates in the counter-clockwise (CCW) direction (Fig. 6).
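A minimal Arduino sketch following this flowchart logic is shown below. The LDR pins, the sensitivity threshold, the stepper wiring, and the step size are assumptions for illustration rather than the exact values used in this project.

```cpp
#include <Stepper.h>

// Assumed wiring: LDRs on A0/A1, a 2048-step geared stepper on pins 8-11.
const int STEPS_PER_REV = 2048;
Stepper tracker(STEPS_PER_REV, 8, 10, 9, 11);

const int LDR_EAST = A0;
const int LDR_WEST = A1;
const int SENSITIVITY = 20;   // ignore differences smaller than this (ADC counts)
const int STEP_SIZE   = 10;   // steps moved per correction

void setup() {
  tracker.setSpeed(10);       // rpm
}

void loop() {
  int east = analogRead(LDR_EAST);
  int west = analogRead(LDR_WEST);
  int diff = east - west;

  if (abs(diff) > SENSITIVITY) {
    // Rotate towards the brighter LDR: CW for one sign, CCW for the other.
    tracker.step(diff > 0 ? STEP_SIZE : -STEP_SIZE);
  }
  delay(500);                 // settle before the next comparison
}
```

Using a signed step count lets a single call handle both the CW and CCW branches of the flowchart.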
Fig. 6 Flowchart
3.2 Compilation and Simulation
The entire system is simulated and compiled using the Arduino IDE software. Figure 7 displays the successful test result for the generated code.
4 Result and Analysis
The data from the two LDRs, after being converted to analogue voltages, is compared hour by hour. Not much changes throughout the day; to save energy, the system is turned off when there is no illumination. The comparison of the voltages on a cloudy day is shown in Table 1 and Graph 1, where a line chart depicts the voltages from the two LDRs. The comparison for a sunny day is shown in Table 2 and Graph 2, again as a line chart of the two LDR voltages (Figs. 8, 9, 10 and 11).
Fig. 7 Program compilation
Table 1 Cloudy day voltage comparison
Graph 1 Cloudy day voltage
Time | LDR1 | LDR2
8:00 | 3.76 | 3.89
9:00 | 3.97 | 4.23
10:00 | 4.25 | 4.53
11:00 | 4.58 | 4.82
12:00 | 4.87 | 4.76
13:00 | 4.70 | 4.65
14:00 | 4.70 | 4.56
15:00 | 4.33 | 4.37
Table 2 Sunny day voltage comparison
Graph 2 Sunny day voltage
Fig. 8 Panel at noon
Time | LDR1 | LDR2
8:00 | 1.56 | 1.62
9:00 | 1.83 | 1.86
10:00 | 2.41 | 2.53
11:00 | 2.66 | 2.73
12:00 | 2.90 | 3.10
1:00 | 3.20 | 3.20
2:00 | 2.57 | 2.80
3:00 | 2.24 | 2.55
Fig. 9 Panel in morning facing east
Fig. 10 LCD Display
Fig. 11 Panel in evening facing west
5 Conclusion
In the present work, a single-axis photovoltaic tracking system was designed and constructed on the Arduino platform. The sun's light intensity was measured using LDR light sensors with the assistance of solar cells. The stepper motor's torque was sufficient to move the panel; stepper motors are the best option for this endeavour because they are quiet and reasonably priced. This solar tracker is designed to be reliable, cost-efficient, and small enough to be used in remote areas. The concept is an advancement in green energy intended to benefit the populace [17]. By using a different solar panel, better sensors, and a different charge controller design, this system can be scaled up to power a complete house. Because they are considered sophisticated systems with moving parts, tracking solar systems are somewhat more costly than their stationary counterparts, and trackers need more upkeep than fixed systems. Practical applications include generating power in distant areas, serving as a home power backup system, and powering solar street lighting. A future extension is a dual-axis solar tracking system, which uses two axes instead of one.
References
1. Rosenblatt A, Aaron N (2014) Solar tracking system. Project Report for Swarthmore College Engineering Department
2. Otieno OR (2009) Solar tracker for solar panel. University of Nairobi
3. Floyd TL (2007) Electronic fundamentals: circuit, devices and application
4. Armstrong S, Hurley WG (2009) Investigating the effectiveness of maximum power point tracking for solar system. In: The IEEE conference on power electronics specialists
5. Damm J (1990) An active solar tracking system. Home Brew Mag 17
6. Monk S (2011) Programming Arduino: getting started with Sketches
7. Deepthi S, Ponni A, Ranjitha R, Dhanabal R (2018) Comparison of efficiencies of single-axis tracking system and dual-axis tracking system with fixed mount. Int J Eng Sci Innovative Technol (IJESIT) 2(2)
8. Beltran JA, Gonalez Rubio JLS, Garcia Beltran CD (2007) Design, manufacturing and performance test of a solar tracker made by embedded control. CERMA, Mexico
9. Gupta S, Sharma N (2016) A literature review of maximum power point tracking from a PV array with high efficiency. IJEDR 4(1)
10. Cooke D (2011) Single versus dual axis solar tracking. Alternate Energy eMagazine
11. Gupta S, Singh OV, Urooj S (2017) A review on single and multijunction solar cell with MPPT techniques. In: 3rd IEEE international conference on nanotechnology for instrumentation and measurement. GBU, India, 16–17 Nov 2017
12. Abu El-Sebah MI (2016) Simplified intelligent universal PID controller. Int J Eng Res 5(1):11–15
13. Lokhande K (2014) Automatic solar tracking system. Int J Core Eng Manage (IJCEM) 1
14. Saxena AK, Dutta V (1990) A versatile microprocessor-based controller for solar tracking. Proc IEEE 1105–1109
15. Safan YM, Shaaban S, Mohamed I (2016) Hybrid control of a solar tracking system using SUI-PID controller. 978-1-5090-6011-5, 2017 IEEE
16. The Sun's Position (2013) In photovoltaic education network, from http://pveducation.org/pvcdrom/properties-ofsunlight/suns-position
17. Ingole AN (2016) Arduino based solar tracking system. In: International conference on science and technology for sustainable development. Kuala Lumpur, Malaysia
Blockchain Protocols: Transforming the Web We Know Hitesh Jangid and Priyanka Meel
Abstract The concept of the Decentralized Internet, or Web3.0, has gained the audience's attention and shown exponential growth. The research community believes that Web3.0 is the next big step in the progression of the web we know now. The Bitcoin white paper made Blockchain famous among researchers and developers. Blockchain is the technology that provides decentralization, auditability, consistency, and anonymity. The introduction of blockchain protocols compatible with smart contracts, such as Ethereum, Near, Polygon, and Solana, makes it easier for developers to quickly build Decentralized Applications, i.e., Dapps. These Decentralized Applications create the Decentralized Internet, an alternative to web2.0. In this research paper, we extensively discuss the core Blockchain technology inspired by the Bitcoin white paper and four Blockchain protocols: Ethereum, Near, Solana, and Polygon. We distinguish these protocols based on the consensus mechanism, layer of blockchain, speed of transactions, smart contracts, etc. Keywords Blockchain · Decentralization · Ethereum · Near · Polygon · Solana · Web3.0
1 Introduction
As of 2022, more than 5 billion people use the Internet daily, reflecting their trust and faith in it, so we must ensure the Internet is safe and secure. Currently, a significant share of the population uses the second generation of the World Wide Web (www), also known as web2.0. Web2.0 has revolutionized the 21st century with its features. Still, it has drawbacks, including centralization, censorship, data piracy and breaches, and next to no data privacy, making web2.0 a weapon for tracking and manipulating users' minds and decision-making ability. Data is very valuable given the advanced technology available today.
H. Jangid (B) · P. Meel Department of Information Technology, Delhi Technological University, New Delhi 110042, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_49
Web3.0 is the third generation of the World Wide Web (www). Web3.0 uses Blockchain to achieve maximum decentralization, enabling more effective artificial intelligence and machine learning-based content. It is intelligent, personalised, and interoperable, and it allows a high degree of virtualization. In addition, web3.0 improves the security of users' data by eliminating the central authority and giving data ownership to the user. Web3.0 is the decentralized Internet, for which Blockchain provides decentralized cryptographic storage: the data is stored over the peers available on the network, known as nodes, instead of in a central data center. It uses various consensus protocols, such as proof of work (PoW) and proof of stake (PoS) [1], through which data is stored and verified by the peers on the network. This peer-to-peer communication is enabled by Blockchain, which also strengthens its security. The enhanced security provided by this technology has reached wider audiences through protocols such as Ethereum [2], Near [3], Solana [4], and Polygon [5], which allow us to create decentralized applications (dapps). Exploring these protocols gives developers the freedom to develop decentralized applications; we study the parameters of the different protocols, such as the consensus mechanism, TPS, gas fee, and ease of writing smart contracts.
2 Background
2.1 Web1.0
The period of the early Internet between 1991 and 2004 is known as web1.0. During this time the World Wide Web (www) was different from what we know today: websites were static only [6], and most users were consumers who only read the published content on the web.
Advantages
• It provides knowledge to users online; information is just one click away.
• It allows users to create their portfolio on the web and present it to the world.
Disadvantages
• Mostly read-only.
• Only admins have privileges to modify content on the website.
• A programming language is required to post any information on the web.
2.2 Web2.0
The next big evolution of the World Wide Web, the Internet we have been using from 2004 until today, falls mainly under the category of web2.0 [6], where a
user both reads published content and generates content on the web. Most websites are dynamic in nature, and Web2.0 is often called the social web. Social media, email, video streaming, multiplayer online gaming, etc., are examples of Web2.0 [7].
Advantages
• Read and write nature.
• It provides a platform for users to generate and share content, not only consume it.
• Developer-friendly tools make life easy for website developers.
• Improved person-to-person communication and created great opportunities for business.
• It provides interactivity on the web.
Disadvantages
• Most of web2.0 is centralized and controlled by tech giants like Google, Amazon, Meta, etc.
• Censorship of content on the web.
• Web services are costly.
• Collection, selling, and misuse of user data without the user's permission, e.g., Cambridge Analytica [8].
• Piracy of users' content.
• Admins can remove user content from their platform.
2.3 Web3.0
Web3.0 is the latest generation of the World Wide Web [9]. Web3.0 uses artificial intelligence and machine learning and achieves maximum decentralization through Blockchain. This enables people-to-people communication, and the user keeps ownership of their data or digital property and can sell it without the involvement of a central authority. Tim Berners-Lee is behind the core principles from which the current World Wide Web originated [6]. He backs the idea of transforming the web into a giant database on which we can perform complex and valid queries, and he coined the term "Semantic Web"; the capability of web2.0 combined with the semantic web will enable access to massive amounts of data. Netflix CEO Reed Hastings believes web3.0 will be a full-video web, as bandwidth will be 10 Mb all the time [7]. Experts define web3.0 differently according to their expertise; it can be characterized as follows: Intelligent: It is said that web3.0 will be intelligent and able to execute innovative applications built on artificial intelligence. It will make human-computer interaction on the web more intelligent, and the web will be able to understand different languages with proper semantic meaning; therefore, users can interact on web3.0 in their native language.
Personalized: In web3.0, Blockchain is the core technology that focuses on personalizing user data. The user owns his or her data while storing and sharing it with other peers.
Interoperable: Web3.0 promotes open-source projects, which leads to the sharing of knowledge and information that can be customized into software that runs on different devices. Web3.0 applications are interoperable across devices like computers, mobiles, TVs, microwaves, etc.
Virtualization: Web3.0 enables high bandwidth and high-resolution graphics, making it easier to create a 3D virtual life experience.
Blockchain is the core technology of web3.0, providing features such as decentralization, auditability, consistency, and anonymity.
3 Blockchain
Ever since the white paper "Bitcoin: A Peer-to-Peer Electronic Cash System" [10], explaining a decentralized digital currency, was introduced under the pseudonym Satoshi Nakamoto, the term Blockchain has been popular among the audience. That paper combines cryptography, proof of work [1], and decentralization in digital cash transfer, where users can transfer payment directly to another user without the involvement of any central financial authority. The concept behind the Bitcoin white paper is decentralization, which attracts experts from various fields; from it, the Decentralized Internet and Web3.0 became very eminent. The properties of immutability, security, proof of work, and decentralization have made the audience trust this technology. Blockchain is decentralized storage on a public network where connected devices act as nodes; a node can be full or partial. This storage is often called a digital ledger, and every node in the network keeps a copy of it. The ledger is a collection of valid transactions, and these transactions are stored in blocks. Every transaction is verified by the owner's digital signature, which ensures its safety and immutability. Blocks are broadcast to every node in the network, and only valid transactions are accepted. The first block in the Blockchain is the genesis block, and each subsequent block carries the previous block's hash; this way all the blocks form a chain, hence the name Blockchain. A block in the Blockchain consists of the following fields:
Version: The version number represents the current software information and its updates.
Previous Block Hash: The hash value of the previous block, to which the current block is connected. It gives the current block a reference to its parent block.
Nonce: A unique number used to generate the hash value of the block during the proof-of-work mechanism.
Merkle Root: Often called the transaction root; the block consists of a number of transactions, and the Merkle root is obtained by building a Merkle tree in which every leaf node is hashed using SHA-256. Every transaction is considered a leaf node, as given in Fig. 1.
Fig. 1 Structure of a block
Fig. 2 Blocks in blockchain
Timestamp: It represents the Unix time at which the current block is generated.
Difficulty Target: The proof-of-work target; the miner must spend computation power to solve a complex mathematical problem and generate a block hash that meets this target.
Block Hash: The hash of the current block generated by the miner (Fig. 2).
The first block in the Blockchain is called the genesis block [10]; its previous block hash is 0, and subsequent blocks carry the previous block's hash. A block holds a number of transactions, effectively hashed into the Merkle tree to generate the Merkle root. If you want to change the data in a particular block, you have to change all the blocks that come after it, because as soon as the data in a block is modified its hash changes, and so does the previous-block hash of every block after it; this is very difficult and next to impossible, hence the Blockchain is immutable. Ever since Blockchain got the attention of researchers around the world, they have been keen to use it in various fields like government work, medical information, finance, arts, etc. Blockchain technology provides security to user data, and it is decentralized so that no authority can edit or remove data, as the Blockchain is
immutable. Proofs of concept for the requirements of various fields are ready, giving rise to different Blockchain-based protocols like Ethereum, Near, Solana, Polygon, HyperLedger, Polkadot, Cardano, etc. These protocols provide a platform for developers to create decentralized applications (dapps). They use Blockchain as their core technology, which helps transform the centralized Internet into a decentralized one. This paper will do a comparative analysis of the Ethereum, Near, Solana, and Polygon protocols.
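To make the block fields and hash linking described above concrete, the toy C++ sketch below folds a transaction list into a Merkle-style root and chains blocks through the previous-block hash. It uses std::hash purely as a stand-in for SHA-256, so it only illustrates the linkage and is not a cryptographically meaningful implementation.

```cpp
#include <cstdint>
#include <functional>
#include <iostream>
#include <string>
#include <vector>

// Toy illustration of the block fields described above. std::hash stands in
// for SHA-256 purely to keep the example self-contained.
static std::string H(const std::string& s) {
    return std::to_string(std::hash<std::string>{}(s));
}

// Pairwise fold of transaction hashes into a Merkle-style root.
std::string merkleRoot(std::vector<std::string> leaves) {
    if (leaves.empty()) return H("");
    for (auto& l : leaves) l = H(l);
    while (leaves.size() > 1) {
        std::vector<std::string> next;
        for (size_t i = 0; i < leaves.size(); i += 2) {
            const std::string& a = leaves[i];
            const std::string& b = (i + 1 < leaves.size()) ? leaves[i + 1] : a;
            next.push_back(H(a + b));
        }
        leaves.swap(next);
    }
    return leaves.front();
}

struct Block {
    uint32_t version = 1;
    std::string prevHash;
    uint64_t nonce = 0;
    std::string merkle;
    uint64_t timestamp = 0;
    std::string hash() const {
        return H(std::to_string(version) + prevHash + std::to_string(nonce) +
                 merkle + std::to_string(timestamp));
    }
};

int main() {
    Block genesis;                    // previous hash of the genesis block is "0"
    genesis.prevHash = "0";
    genesis.merkle = merkleRoot({"tx1", "tx2"});

    Block next;                       // each block carries its parent's hash
    next.prevHash = genesis.hash();
    next.merkle = merkleRoot({"tx3", "tx4", "tx5"});

    std::cout << "genesis: " << genesis.hash() << "\nnext: " << next.hash() << '\n';
}
```

Changing any transaction in the genesis block would change its hash and therefore invalidate every later block, which is the immutability argument made above.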
4 Blockchain Protocols
A blockchain protocol is a set of rules and procedures that govern the operations and interactions between nodes in a blockchain network. It ensures that the network is secure, transparent, and decentralized, and it has key features such as consensus mechanisms, cryptographic algorithms, peer-to-peer networking, and smart contracts. Ethereum is one such protocol and the most popular Turing-complete Blockchain protocol. Vitalik Buterin explains the popular features of Ethereum in his white paper "A next-generation smart contract and decentralized application platform" [2], where he also highlights the scalability problem and possible solutions such as:
• Sharding
• Increasing the number of transactions
• Sidechains.
We have studied blockchain protocols that use these solutions: Near uses sharding, Solana uses proof of history to increase the number of transactions, and Polygon uses a sidechain, a layer-2 solution with Ethereum as the base layer.
4.1 Ethereum
Ethereum is a general-purpose, layer-one, Turing-complete blockchain protocol based on the white paper "A next-generation smart contract and decentralized application platform" introduced by Vitalik Buterin [2]; it uses a proof-of-work consensus mechanism. The idea behind the Ethereum protocol is to build the Decentralized Internet, which we often call web3.0. Ethereum provides the most important foundation layer, i.e., the blockchain, for building smart-contract-based decentralized applications. Smart contracts in Ethereum can be written in Ethereum's own high-level language, Solidity, which is compiled to bytecode that runs on the Ethereum Virtual Machine (EVM). Besides Solidity, it also supports Julia, Vyper, and Mutan (obsolete). The native cryptocurrency of Ethereum is Ether (ETH) [11]. In Ethereum, blocks are generated about every 13 s [11]; a shorter block time translates to a higher chance of duplicate blocks from different miners. Each
Account in Ethereum has a unique 20-byte address and contains four pieces of information [2]: (1) the nonce, (2) the current Ether balance, (3) the smart contract code, and (4) the storage (empty by default). The first two are common across all blockchains, and the last two are related to smart contracts. Accounts are of two types:
• Externally Owned Account: an account controlled by a private key; the type of account used by a human user.
• Smart Contract Account: an account controlled by smart contract code, similar to a bot.
Smart contract execution occurs on every node of the Ethereum network inside the Ethereum Virtual Machine (EVM) [2]. The EVM is one of the essential parts of the Ethereum platform: a virtual machine that allows the execution of smart contracts. Decentralized autonomous organisations (DAOs), which run independently and are decentralised without the need for a central authority, are also supported by the EVM. A transaction in Ethereum is not just sending Ether but can run specific functions on a smart contract. A transaction in Ethereum contains (1) the account address of the receiver, (2) the signature of the sender, (3) optional data for a smart contract, (4) the gas limit, and (5) the gas price (a small illustrative model of these fields follows the lists below).
Advantages of Ethereum
• The Ethereum protocol is Turing complete.
• It was the first smart-contract-compatible blockchain protocol that promoted decentralization in the true sense.
• Reduced computation power compared to Bitcoin.
• It supports interoperability and is open source.
Disadvantages of Ethereum
• Scalability issues.
• Low throughput.
• Very high gas fee.
• Poor user and developer experience.
• High computation power required due to the PoW consensus.
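As a small illustration of the transaction fields just listed, the struct below models them and computes the maximum fee a sender can be charged as gas limit × gas price. The field types and the example numbers are simplifying assumptions, not Ethereum's actual encoding.

```cpp
#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

// Illustrative model of the Ethereum transaction fields listed above.
// Types and units are simplified assumptions, not the actual RLP encoding.
struct EthTransaction {
    std::string receiver;          // 20-byte account address (hex string here)
    std::string senderSignature;   // signature of the sender
    std::vector<uint8_t> data;     // optional payload for a smart contract call
    uint64_t gasLimit;             // maximum gas the sender allows
    uint64_t gasPriceWei;          // price per unit of gas, in wei
};

// Upper bound on the fee: gas limit multiplied by gas price.
uint64_t maxFeeWei(const EthTransaction& tx) {
    return tx.gasLimit * tx.gasPriceWei;
}

int main() {
    // Example numbers only: a plain transfer with a 21000 gas limit at 30 gwei.
    EthTransaction tx{"0xabc...", "sig", {}, 21000, 30'000'000'000ULL};
    std::cout << "max fee (wei): " << maxFeeWei(tx) << '\n';
}
```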
4.2 Near
NEAR is a layer-one, sharded, proof-of-stake blockchain built for usability and scalability, based on the white paper "Nightshade: Near protocol sharding design" [3]. Near is a smart-contract-compatible blockchain that uses a delegated proof-of-stake consensus mechanism. Its efficiency has been improved using one of Vitalik Buterin's proposed scaling solutions, i.e., sharding, and token holders of the native
cryptocurrency of Near govern it. The Rainbow Bridge enables interoperability between Ethereum and Near: it is a trustless bridge that allows you to transfer assets like ERC-20 tokens and NFTs between Ethereum and NEAR and to interact with smart contracts and decentralized applications on both sides of the aisle. Zhang, Zhou, Zhen, and Zhang explain how the scalability of a blockchain can be improved using sharding in their research paper "Enhancing scalability of trusted blockchains through optimal sharding" [12]. The sharding architecture used by Near is called "Nightshade" [3], which creates a single sharded chain instead of multiple parallel sidechains. Transactions completed on each shard carry a snapshot of the block produced on the Near Blockchain. All shards work in parallel, and an assigned network of validator nodes maintains each shard. This design makes the NEAR blockchain about as fast as Ethereum 2.0's upcoming beacon chain; theoretically, the NEAR blockchain can process around 100,000 transactions per second [13]. Currently, the NEAR blockchain has only one shard; the decision to implement additional shards will be made by community vote if they are deemed necessary. At that point, the blockchain will automatically create, destroy, or merge shards based on network conditions (a toy sketch of shard assignment follows the lists below).
Advantages of Near
• Lower gas fee compared to Ethereum.
• Improved block rate due to sharding.
• Carbon-neutral blockchain [14].
• Developer-friendly environment.
• Proof-of-stake consensus.
Disadvantages of Near
• Centralized governance is a matter of concern, as 35% of tokens are held by Near insiders [13].
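For intuition about how sharding spreads accounts across validator groups, the toy function below assigns an account to one of N shards by hashing its identifier. This is a generic illustration of the idea only; it is not the Nightshade assignment used by NEAR [3].

```cpp
#include <cstddef>
#include <functional>
#include <iostream>
#include <string>

// Generic illustration of shard assignment: hash the account id and take it
// modulo the shard count. Not the actual Nightshade design from [3].
std::size_t shardFor(const std::string& accountId, std::size_t numShards) {
    return std::hash<std::string>{}(accountId) % numShards;
}

int main() {
    const std::size_t shards = 4;  // assumed shard count for the example
    for (const std::string id : {"alice.near", "bob.near", "carol.near"})
        std::cout << id << " -> shard " << shardFor(id, shards) << '\n';
}
```

Because each shard processes its slice of accounts in parallel, total throughput grows roughly with the number of shards, which is the scaling argument made above.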
4.3 Solana
Solana is a general-purpose, layer-one blockchain protocol based on Proof of History (PoH) together with a proof-of-stake consensus mechanism, inspired by the white paper "Solana: A new architecture for a high performance blockchain v0.8.13" [4]. The paper claims that Solana is fast and scalable, supports up to 710K transactions per second, and reduces block time to about 400 ms, significantly less than any other blockchain at the time. Recently, Pierro and Tonelli produced results regarding the scalability of the Solana blockchain in the research paper "Can Solana be the solution to the blockchain scalability problem?" [15], showing 2812 TPS, which is
Fig. 3 Proof of history in Solana
way faster than any other blockchain protocol. Solana's gas fee per transaction is next to negligible, and it supports stateless smart contract (called a program in the Solana environment) development in the Rust, C, and C++ programming languages. In Solana's Proof of Stake (PoS) [1], one lead validator at a time processes transactions and writes back to the network state; the other validators read the state, verify the transactions, and confirm them. To maximize efficiency, Solana splits the ledger into small sections, and the validators take turns being the leader about every 400 ms. The switching of validators follows a predetermined schedule that is randomly generated based on how much they have staked and on previous transaction data. To ensure that all validators trade off leadership at the right time and that a malicious validator cannot skip anyone else's turn to send transactions, Solana uses Proof of History (PoH) (Fig. 3).
time because no one can switch leadership without communicating and provides a mechanism for trusting the event order. It also helps with the finality Solana consensus. Advantages of Solana • Extremely high transaction speed and meager gas fees. • Environment friendly due to PoH and PoS. Disadvantages of Solana • Lack of Stability.
4.4 Polygon Polygon previously known as Matic is a layer-2 Blockchain protocol [16] inspired by“Matic White Paper” [5] using proof of stack as a consensus mechanism. Vispute, Abhishek, Siddhesh Patel, Yuvraj Patil, Shubham Wagh, and Mahesh Shirole explains the Sidechain and produced results of 29.98% TPS improvement over every Sidechain in the research paper “Scaling Blockchain by Autonomous Sidechains” [17]. It is the solution to Ethereum scalability issues. There are two leading scaling solutions: the Layer-2 Scaling solution and the Sidechain. Layer-2 scaling relies on the main layer’s security, i.e., Ethereum Blockchain. In contrast, Sidechain scaling depends on its security mode by having a separate consensus mechanism like a polygon POS chain. Apart from providing the scaling solution, Polygon creates an ecosystem that connects multiple scaling solutions from sidechains with different consensus mechanisms to layer-2 scaling options such as plasma, Optimistic rollups, etc. The architecture of Polygon consists of four layers: Ethereum layer, Security layer, Polygon Network layer, and Execution layer [18]. Out of which Ethereum and Security Layers are optional, and Polygon Network and Execution layer are mandatory. Polygon can uses the Ethereum layer as a base layer and takes advantage of the high security provided by Ethereum. This layer is nothing but a group of smart contracts on Ethereum and can be used for finality, checkpointing, Staking, dispute resolution, and message passing between Ethereum and polygon chains. Security Layer is another non-mandatory layer that can provide a “validator as service” function. It allows Polygon chains to check the validity of any polygon chains and Validator management. The polygon layer is the mandatory layer that provides power to Polygon, makes a sovereign blockchain network that serves its community, and maintains Transaction collation, Block production, local Consensus, Ordering, and data. The Execution Layer is responsible for implementing and executing transactions needed to perform polygon networks. It is responsible Execution Environment and smart contract execution.
Blockchain Protocols: Transforming the Web We Know
555
Advantages of Polygon • • • •
High transaction speed and low gas fee. Great developers support a quick and easy development environment. Smart contracts can move from Ethereum to Polygon. Ethereum Scalability Solution.
Disadvantages of Polygon • Ethereum Dependent.
4.5 Comparison of Different Blockchain Protocol Table 1 compares Blockchain protocols against different parameters, which implicitly distinguish each protocol from another. Ethereum, Near, Solana, and Polygon use PoW, PoS, PoH, and PoS Plasma-based sidechains as consensus mechanisms, respectively [2–5]. Polygon is a Level-2 Blockchain that uses Ethereum as the base layer, while the other three protocols are level-1 blockchains. Ethereum completes an average of 16 transactions per second(TPS), and every 14 s, it generates a Block in the Blockchain. Theoretically, it is possible to perform 7,50,000 transactions in Solana [4], and it takes an average of 40 msec to create a new block in the Blockchain. All Blockchain protocols support smart contracts; a self-executing program executes when conditions are fulfilled. The smart contract language in Ethereum and Polygon is Solidity. Near supports Rust and AssemblyScript [13]. Rust, C, and C++ are supported in Solana [?]. The nature of smart contracts in Ethereum, Near, and Polygon is stateful, while in Solana, it is stateless. The gas fee in Ethereum to perform the transaction is very high. Near and Solana provide relatively low gas fees. Near and Polygon have 100 validators in the network, and Solana has more than 1000 validators [19]. Ethereum has more than 200 million unique accounts [11], and Polygon has more than 147 million [20]. Solana and Near have grown exponentially since their launch, with more than 32 million and 14 million accounts [13, 19], respectively. Solana supports very high TPS; hence it consumes a high amount of power daily, followed by Polygon and Ethereum. Near power consumption is relatively low. Native Tokens of Ethereum, Near, Solana, and Polygon are ETH, NEAR, SOL, and MAT(Matic) respectively.
5 Conclusion Decentralized Internet or web3.0 provides the security of sensitive and personal user data. Smart contract-compatible blockchain protocols enable to creation parallel Decentralized Internet. This paper offers extensive details about such four protocols
556
H. Jangid and P. Meel
Table 1 Comparison of differenct blockchain protocols Protocol Parameter Ethereum Near Consensus mechanism Layer of blockchain TPS Block time (s) Smart contract languages Smart contract Gas fee/transaction Validators Native token Number of accounts Power consumption (day) Founder
Solana
Polygon
Proof of work
Proof of stack
Proof of history
L-1
L-1
L-1
PoS plasma based sidechain L-2
16 14 Solidity
750000 0.4 Rust, C, C++
65000 2 Solidity
Stateful Very high
100000 1 AssemblyScript, Rust Stateful Low
Stateless Very low
Stateful Moderate
None ETH 200M
100 NEAR 14M
1000+ SOL 32M
100 MAT 147M
14,400 kWh
205 kWh
1,10,160 kWh
65700 kWh
Vitalik Buterin
Illia polosukhin
Anatoly Yakavento
Sandeep Nailwal
Ethereum, Near, Solana, and Polygon. Each protocol has advantages and disadvantages, as Ethereum is very secure and Turing complete, but scalability and high gas fees are concerned. Near solve scalability and gas fee problems but is not entirely decentralized, Solana offers high scalability, nominal gas fee, and decentralization, but it is not stable. Polygon is a layer-2 blockchain that provides increased scalability, low gas fees, and is decentralized but depends on Ethereum as the base layer. We suggest a scalable blockchain protocol using sharding, High transaction speed with the help of Proof of History and Proof of Stack as a hybrid consensus mechanism, and having a separate layer of security. Such protocol is compatible with smart contracts written in a frequently used programming language like Javascript, Java, Python, etc. Such a protocol will accelerate the development of decentralized Applications in various fields at enormous speed.
References 1. Bashir I (2022) Blockchain consensus. In: Blockchain consensus. Apress, Berkeley, CA. https:// doi.org/10.1007/978-1-4842-8179-6_5 2. Vitalik B (2014) A next-generation smart contract and decentralized application platform. White Paper 3(37):2–1
Blockchain Protocols: Transforming the Web We Know
557
3. Skidanov A, Polosukhin I (2019) Nightshade: near protocol sharding design. 39. https:// nearprotocol.com/downloads/Nightshade.pdf 4. Anatoly Y (2018) Solana: a new architecture for a high performance blockchain v0. 8.13. Whitepaper 5. Jaynti K, Sandeep N, Anurag A (2021) Matic whitepaper. Tech. Rep. Sep, Polygon, Bengaluru, India 6. Shivalingaiah D, Naik U (2008) Comparative study of web 1.0, web 2.0 and web 3.0 7. Silva JM, Rahman ASMM, Saddik AE (2008) Web 3.0: a vision for bridging the gap between real and virtual. In: Proceedings of the 1st ACM international workshop on communicability design and evaluation in cultural and ecological multimedia system, pp 9–14 8. Kanakia H, Shenoy G, Shah J (2019) Cambridge analytica—a case study. Ind J Sci Technol 12(29):1–5 9. Lal M (2011) Web 3.0 in education and research. BVICAM’s Int J Inf Technol 3(2) 10. Satoshi N (2008) Bitcoin: a peer-to-peer electronic cash system. Decentralized Bus Rev:21260 11. Ethereum Scan (2022) Ethereum (ETH) blockchain explorer. Accessed 5 Nov 2022. https:// etherscan.io/ 12. Zhang P, Zhou M, Zhen J, Zhang J (2021) Enhancing scalability of trusted blockchains through optimal sharding. In: 2021 IEEE international conference on smart data services (SMDS). IEEE, pp 226–233 13. Near Explorer (2022) Explore the near blockchain. NEAR Explorer|Dashboard. Accessed 5 Nov 2022. https://explorer.near.org/ 14. Team NEAR (2022) The near blockchain is climate neutral NEAR protocol, 10 Oct 2022. https://near.org/blog/the-near-blockchain-is-climate-neutral/ 15. Pierro GA, Tonelli R (2022) Can Solana be the solution to the blockchain scalability problem? In: 2022 IEEE international conference on software analysis, evolution and reengineering (SANER). IEEE, pp 1219–1226 16. Gangwal A, Gangavalli HR, Thirupathi A (2022) A survey of layer-two blockchain protocols. J Netw Comput Appl: 103539 17. Vispute A, Patel S, Patil Y, Wagh S, Shirole M (2021) Scaling blockchain by autonomous sidechains. In: Proceeding of fifth international conference on microelectronics, computing and communication systems: MCCS 2020. Springer Singapore, pp 459–473 18. Polygon LightPaper (2021) Ethereum’s internet of blockchains. Polygon Technology, 27 Feb 2021. https://polygon.technology/lightpaper-polygon.pdf 19. Solana Explorer (2022) Solana Explorer. Accessed 5 Nov 2022. https://explorer.solana.com/ 20. Polygon Scan (2022) Polygon (MATIC) blockchain explorer. Accessed 5 Nov 2022. https:// polygonscan.com/ 21. Cai W, Wang Z, Ernst JB, Hong Z, Feng C, Leung VCM (2018) Decentralized applications: the blockchain-empowered software system. IEEE Access 6:53019–53033 22. Khan U, An ZY, Imran A (2020) A blockchain Ethereum technology-enabled digital content: development of trading and sharing economy data. IEEE Access 8:217045–217056
An Internet of Things-Based Smart Asthma Inhaler Integrated with Mobile Application P. Srivani , A. Durga Bhavani , and R. Shankar
Abstract The proposed work “Smart Asthma Inhaler” is based on a remote health monitoring system that analyse and continuously monitors the health of the asthma patient through Internet of Things (IoT) technology. The system also analyses the number of times the medicine is inhaled and the air quality in the patient’s environment. Whenever an asthma patient uses the inhaler, the sensors attached to the device sense and broadcasts the patient’s health information to the server through a Mobile app designed for further analysis in the Ubidots cloud platform. The data stored along with GPS location is shared through the app to the cloud. A web portal is designed to analyse the data, generate reports and suggest medications and notifications to Asthma patients. The data collected by the sensors can be analyzed to provide insights into the user’s asthma triggers and patterns, which can inform personalized treatment plans. Keywords Smart Asthma inhaler · Internet of Things · Ubidots
1 Introduction Asthma is a chronic lung disorder that affects the airways in the lungs. People with asthma experience recurring episodes of wheezing, breathlessness, chest tightness, and coughing. These symptoms are caused by inflammation and narrowing of the airways, making breathing difficult. It is important for people with asthma to work closely with their healthcare providers to develop a personalized treatment plan that meets their individual needs. The pervasiveness of asthma widely depends on geographical locations and socioeconomic classes which is seen more in children nowadays due to pollution, obesity, paracetamol, fast food etc. [1]. There are two types of asthma: Allergic(Extrinsic), which is triggered by an allergic reaction to environmental allergens such as pollen, dust mites, animal dander, or mould. Symptoms can be triggered by exposure to these allergens and may worsen P. Srivani (B) · A. D. Bhavani · R. Shankar BMS Institute of Technology and Management, Bangalore, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_50
559
560
P. Srivani et al.
during allergy season. The other type is Non-allergic(Intrinsic) which is caused by factors like stress, exercise, cold air, infections, smoke, or chemical irritants. Understanding the type of asthma, a person has, can help healthcare providers develop an effective treatment plan. Uncontrolled respiratory diseases such as asthma can put people at greater risk. Tracking the count of inhalation and environmental parameters such as pollen counts, air pollution, and temperature can help patients identify triggers for their asthma attacks and take steps to avoid them. This information identifies patterns in the patient’s symptoms and triggers, which can inform treatment plans and help to better manage the condition. The IoT is a network of physical objects or devices that are embedded with electronics, software, sensors, and network connectivity, which enables them to collect and exchange data. IoT initiatives in healthcare [2] are a revolution in patient care proactively with enhanced diagnosis, remote monitoring, handheld gadgets, monitoring and controlling with smart sensors. These IoT health monitoring devices should overcome challenges concerned with reliability, affordability, security, safety and adaptability for accurate results [3]. The cloud-based smart healthcare system that integrates with IoT has created a digital revolution in the healthcare industry for remote monitoring of patients. The asthma tracker monitors the patient using several sensors like a gas sensor, body temperature and humidity, pulse sensor, and heart rate [4]. However, the high price of the device may make it inaccessible to people who cannot afford it and may also require a certain level of proficiency to use it. The integration of IoT with other digital technology has become a pathway for the makers of smart inhalers to be racing to diagnose and treat asthma kind of chronic disease to monitor the patients remotely. Several sensors can be integrated to monitor and manage health conditions. The SAM [5] smart asthma monitoring provides localized and personal responses for better health conditions. The authors have collected patients’ travel plans and locations to analyse the air quality index to track asthma patients. Further to this, a pollen detector sensor aids in detecting the presence of pollens in the environment which is one of the elements to initiate asthma [6]. Several wearable devices are upcoming that support both end users and the medical team for extensive asthma monitoring and analysis of risk factors to avoid adverse effects in advance [7]. Numerous companies have come out with Smart inhalers integrated with mobile apps and sensors that can track, and monitor patients’ conditions and assess personal risk factors [8–10]. More than detecting and analyzing the level of asthma, there is a need for recommending context-aware medication, in-time alarms, and custom-made medications [11]. The researchers have proposed a prototype which can be implemented in real-time in future. Most of the existing systems are either too complex or expensive to afford. AstraZeneca’s improvised smart inhaler is bulky and expensive. Most of the existing ones use too many sensors and Bluetooth modules for connectivity. A smart asthma inhaler with Internet of Things (IoT) technology can connect to the internet and transmit data to other devices, such as smartphones or cloudbased servers, in real-time. 
This allows patients and healthcare providers to access and analyze the data remotely, enabling personalized treatment and better asthma
An Internet of Things-Based Smart Asthma Inhaler Integrated …
561
management. IoT-enabled inhalers can also provide alerts and reminders to patients for timely medication use, and track environmental factors that can trigger asthma symptoms. The proposed system, an IoT-enabled application “Smart Asthma Inhaler” monitors the patient’s health conditions, tracks the location and analyzes what parameters have affected the patient. The objective of this proposed work uses sensors that track medication use, inhaler count, and environmental factors. Connectivity options allow for data transmission to other devices, such as smartphones or cloudbased servers. These Cloud-based analytics enable healthcare providers to access and analyze patient data, helping to improve treatment and care. A database was created to keep track of the complete history of the medications given to asthma patients.
2 Methodology A smart asthma inhaler is a device that uses technology, such as sensors and connectivity, to track and monitor the usage and effectiveness of asthma medication. The proposed work can provide real-time feedback to patients and healthcare providers, helping to improve asthma management and treatment. The system consists of two user interfaces: an Android-based application on the patient’s side, and a Web application with cloud services on the doctor’s side. The proposed architecture consists of four major modules, namely the smart asthma inhaler, the mobile application, the data collection system, and a web portal. The architecture of the entire system is shown in Fig. 1. The IoT system is implemented in the inhaler and its surroundings to collect data about the patient and the environment for monitoring his/her health condition. The smart inhaler is embedded with a counter that records the number of times the patient inhales, while the temperature and air quality sensors collect the ambient temperature and the quality of air that can trigger asthma conditions in such patients. The first module deals with sensing data such as temperature, air quality levels, and the count or frequency of inhaler use by the patient. An ESP8266-12E microcontroller board, which has inbuilt Wi-Fi capability, is connected to the air quality and temperature sensors. This board is a self-contained System on Chip (SoC) that is extremely cost-effective, as it supports several analog and digital pins along with Wi-Fi capability and a 3.3 V AMS voltage regulator. The LM35 temperature sensor is used to measure the atmospheric temperature in °C. The MQ135 air quality sensor is used to measure whether there is air pollution in the environment surrounding the patient; the proper alarm point for this gas detector should be determined after considering the influence of temperature and humidity, and it can sense smoke, CO2, benzene and many other gases. The sensors then send their readings, via the mobile application, to the cloud service Ubidots for data collection (a simplified sketch of this data push is given at the end of this section). The second module deals with the patient’s login and provides the GPS location through the mobile application when the inhaler is pressed. This information
Fig. 1 Architecture of the proposed system
will also be sent to the data collection system module. The third module contains the data sensed from the inhaler as well as the GPS location for each patient. This information is pushed to the cloud service (Ubidots) as well as to a local server, which stores it in the form of a database for further use. In the fourth module, several processes run to generate the final output. As soon as the patient’s information is received, the data is analyzed to create graphs. The reports generated are viewed by the physicians involved so that they can analyze them and prescribe medications, and in case of an emergency the doctor sends a notification of the report to the patient. Initially, the asthma patient registers or logs in to the mobile application. The mobile application then registers or validates the login for authentication, and the server sends a validation acknowledgement to the mobile application, which is in turn shown to the patient. When the asthma patient presses the inhaler, the data sensed by the various sensors attached to the inhaler, along with the GPS location from the mobile application, is sent to the cloud, where it remains for analysis. The patient can then view the data from the mobile application. Similarly, the physician can also log in, analyze the patient’s report and suggest medications.
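To make the first module’s data flow concrete, the sketch below shows in simplified Python how one set of readings (temperature, air quality and the inhalation count) could be pushed to a Ubidots device over HTTP. The device label and API token are placeholders, and the endpoint follows Ubidots’ public REST API conventions; this only illustrates the data flow and is not the firmware that runs on the ESP8266 board.

```python
import requests

UBIDOTS_TOKEN = "REPLACE_WITH_TOKEN"      # placeholder API token
DEVICE_LABEL = "smart-asthma-inhaler"     # placeholder device label
URL = f"https://industrial.api.ubidots.com/api/v1.6/devices/{DEVICE_LABEL}"

def push_reading(temperature_c, air_quality_raw, inhale_count):
    """Send one set of inhaler readings to the Ubidots cloud dashboard."""
    payload = {
        "temperature": temperature_c,     # LM35 reading, degrees Celsius
        "air_quality": air_quality_raw,   # MQ135 reading (raw level)
        "count": inhale_count,            # cumulative inhalation count
    }
    headers = {"X-Auth-Token": UBIDOTS_TOKEN, "Content-Type": "application/json"}
    response = requests.post(URL, json=payload, headers=headers, timeout=10)
    response.raise_for_status()
    return response.json()

if __name__ == "__main__":
    # Example: one press of the inhaler at 31.2 °C with a raw MQ135 level of 180
    print(push_reading(temperature_c=31.2, air_quality_raw=180, inhale_count=5))
```

In the deployed system this transmission happens through the mobile application, which forwards the readings received from the inhaler.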
3 Results and Discussion The smart inhaler is embedded with sensors and a push button to record parameters such as the ambient temperature, the air quality in the environment and the number of times the patient has inhaled. This helps in analyzing how air pollutants and temperature may have affected the patient, and the readings are collected in the database. The ESP8266 Wi-Fi module shares the sensor data continuously with the cloud application. Figure 2 shows the Ubidots cloud platform with the air quality level, temperature, and count (the number of times the medicine is inhaled) values when the inhaler is pressed by the asthma patient. Figure 3a shows the temperature readings, which are graphically analysed as the values are read. Figure 3b shows the graphical analysis of the air quality levels sensed by the air quality sensor; here “0” represents “active high” and “1” represents “active low”. Figure 4a shows the registration page where first-time users can register before using the inhaler application. The patient will have to provide a valid email
Fig. 2 Ubidots screen with three variables
Fig. 3 a Graphical analysis of temperature b Graphical analysis of air quality
Fig. 4 a User registration page b Login page
id and password for sign-up purposes. Figure 4b shows the login page where already registered patients can log in; there is also an option for new users to sign up. The user credentials are verified for authenticity, requiring a unique username, email id and password, to ensure that users do not create multiple accounts or duplicate content in the app; an already registered patient who attempts to register a second time with the same email is therefore rejected. Figure 5a shows the page in the Android application where successfully logged-in patients can send their location, i.e., latitude and longitude details, to the server. Initially, the data collection is made in the WAMP server, where the details of all the registered patients are stored. The encrypted_password field in the database stores an encrypted code for the user’s password, mainly for security purposes. The database also records the date the login was created and collects the latitude and longitude details from the mobile application. Figure 5b shows the validation of the registered user. Based on the data collected, a report is generated from which the doctor can analyze the condition of the asthma patient. This report also shows how the environment plays a role in triggering asthma attacks, and tracks the count of medicine inhalations. The report is accessible to the patient through the mobile app as well as on the cloud.
Fig. 5 a Send location page, b Validation
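The data-collection layer described above can be summarised with a small schema sketch. The snippet uses Python’s built-in sqlite3 module and SHA-256 hashing purely to stay self-contained; the deployed system stores these records on a WAMP (MySQL) server, and the table and column names below are illustrative rather than taken from the actual database.

```python
import sqlite3
import hashlib
from datetime import datetime, timezone

conn = sqlite3.connect("asthma_patients.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS patients (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        username TEXT UNIQUE NOT NULL,          -- prevents duplicate accounts
        email TEXT UNIQUE NOT NULL,
        encrypted_password TEXT NOT NULL,       -- stored only in encrypted form
        created_on TEXT NOT NULL,               -- date the login was created
        latitude REAL,                          -- last reported GPS latitude
        longitude REAL                          -- last reported GPS longitude
    )
""")

def register_patient(username, email, password):
    """Insert a new patient; the UNIQUE constraints reject repeat registrations."""
    hashed = hashlib.sha256(password.encode()).hexdigest()
    conn.execute(
        "INSERT INTO patients (username, email, encrypted_password, created_on) "
        "VALUES (?, ?, ?, ?)",
        (username, email, hashed, datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()

# Illustrative credentials only
register_patient("patient01", "patient01@example.com", "s3cret")
```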
4 Conclusion Smart inhalers can monitor inhaler usage, track environmental parameters such as air quality, location and temperature, and provide personalized feedback to patients. This information can help patients to identify triggers for their asthma attacks, track their symptoms, and manage their medication use more effectively. The uniqueness of the proposed work lies in its cost-effective prototype design. The proposed system targets the Indian market, where pollution is a major factor in triggering asthma attacks. By providing objective and accurate data on a patient’s medication usage, treatment plans tailored to each patient can be developed, dramatically improving their quality of life. Cloud technology can also help to connect patients with healthcare providers and provide real-time monitoring and feedback. This can help to identify potential issues before they become severe and allow for timely interventions and adjustments to treatment plans. By using smart inhalers with integrated sensors and cloud technology, patients can take a more proactive approach to managing their respiratory disease, reduce the risk of severe symptoms and complications, and improve their quality of life. Furthermore, the GPS location tracker can alert the user when they are in an area with high pollution levels or other potential triggers. The limitations of smart asthma inhalers are their dependency on technology and their unreliability in
areas with poor connectivity. One limitation of cloud-based analysis is the need for secure data storage and transmission to protect the privacy of the user. It is also important to ensure that the data is accurate and that the analysis is done by qualified professionals who can interpret the data correctly. While there is promising research on the effectiveness of smart inhalers, more studies are needed to confirm their benefits and identify potential drawbacks.
References 1. Singh S, Salvi S, Mangal DK, Singh M, Awasthi S, Mahesh PA, Kabra SK et al (2022) Prevalence, time trends and treatment practices of asthma in India: the Global Asthma Network study. ERJ Open Res 8(2) 2. Khatoon N, Roy S, Pranav P (2020) A survey on applications of Internet of Things in healthcare. Internet Things Big Data Appl Recent Adv Challenges 89–106 3. Lakshmi GJ, Ghonge M, Obaid AJ (2021) Cloud based iot smart healthcare system for remote patient monitoring. EAI Endorsed Trans Pervasive Health Technol 7(28):e4–e4 4. Anan SR, Hossain M, Milky M, Khan MM, Masud M, Aljahdali S (2021) Research and development of an iot-based remote asthma patient monitoring system. J Healthc Eng 5. Isaac N, Sampath N, Gay V (2018) SAM smart asthma monitoring: focus on air quality data and Internet of Things (IoT). In: 2018 12th international symposium on medical information and communication technology (ISMICT). IEEE, pp 1–6 6. Krishnan MS, Ragavi S, Kumar MSR, Kavitha D (2019) Smart asthma prediction system using internet of things. Indian J Public Health Res Dev 10(2):1103–1107 7. Shahida M, Choudhury S, Venugopal V, Parhi C, Suganya J (2021) A smart monitoring system for asthma patients using IoT. In: 2021 fifth international conference on I-SMAC (IoT in social, mobile, analytics and cloud) (I-SMAC). IEEE, pp 130–135 8. Reddel HK, Bateman ED, Schatz M, Krishnan JA, Cloutier MM (2022) A practical guide to implementing SMART in asthma management. J Allergy Clin Immunol Pract 10(1):S31–S38 9. Pleasants RA, Chan AHY, Mosnaim G, Costello RW, Dhand R, Schworer SA, Merchant R, Tilley SL (2022) Integrating digital inhalers into clinical care of patients with asthma and chronic obstructive pulmonary disease. Respir Med 107038 10. Tsang KCH, Pinnock H, Wilson AM, Salvi D, Shah SA (2022) Predicting asthma attacks using connected mobile devices and machine learning: the AAMOS-00 observational study protocol. BMJ open 12(10):e064166 11. Dey A, Haque KA, Nayan A-A, Kibria MG (2020) IoT based smart inhaler for contextaware service provisioning. In: 2020 2nd international conference on advanced information and communication technology (ICAICT). IEEE, pp 410–415
Machine Learning Prediction Models to Predict Long-Term Survival After Heart and Liver Transplantation Vandana Jagtap, Monalisa Bhinge, Neha V. Dwivedi, Nanditha R. Nambiar, Snehal S. Kankariya, Toshavi Ghatode, Rashmita Raut, and Prajyot Jagtap
Abstract Organ transplantation is an important treatment for incurable end-stage disease. Survival analysis is an appropriate and basic technique to find out the influence of such an operation. The objective of this research is to present a methodology for predicting a patient’s outcome after organ transplantation. The importance of computer-based medical prediction is growing as the number of medical records grows daily, and machine learning techniques are necessary to discern patterns from these massive amounts of data. We propose a multilayer perceptron neural network model designed to predict the survival of a patient who undergoes a transplantation procedure. In this study we use two different datasets for experimentation: a heart transplantation dataset supplied by UNOS and a liver transplantation dataset supplied by the Stanford LT program. For dimensionality reduction, Principal Component Analysis (PCA) with ranking is applied, and an association rule mining algorithm generates rules for association and correlation between the attributes. The model separates the dataset into training and test sets after extracting the pertinent characteristics, and the input dataset is subjected to k-fold cross-validation. The multilayer perceptron neural network model achieved an accuracy of 72.093% on the heart transplantation dataset and 87.5% on the liver transplantation dataset. The study indicates that the multilayer perceptron is well suited to correctly predicting long-term organ survival after transplantation. Keywords Artificial neural network · Classification · Data analysis · Organ transplantation · Survival prediction
V. Jagtap · M. Bhinge (B) · N. V. Dwivedi · N. R. Nambiar · S. S. Kankariya · T. Ghatode · R. Raut · P. Jagtap Dr. Vishwanath Karad, MIT World Peace University, Pune, India e-mail: [email protected] V. Jagtap e-mail: [email protected]
1 Introduction Healthcare is a very important domain, as it is directly related to human life, and any hasty or incorrect decision can lead to loss of life or permanent damage to a living person. Typically, organ transplantation is the last option tried to ensure the survival of a critically ill patient. The reasons necessitating an organ transplant may be organ failure or irreversible damage to the organ due to disease or injury. The survival of the patient depends on the condition of the organ, the number of donors available, and the severity of the patient’s disease. Because the demand for organs is higher than the supply, organ transplant decisions must be taken with due diligence to ensure the maximum survival probability of the organ and hence the patient. The patient’s and the organ’s long-term survival, as well as the quality of life, are crucial factors in organ transplantation. Data mining methods can provide a correct prediction for survival and analyze the transplantation factors to reveal novel patterns. The information maintained by transplantation centers is used as guidance for doctors to better allocate organs. The medical field uses the MELD (Model for End-Stage Liver Disease) score for organ allocation. The MELD score follows a sickest-first policy: the organ is allocated to the first patient on the waiting list without considering both the donor’s and the recipient’s characteristics. Based on the MELD score, medical experts make a judgment and predict the outcome of organ transplantation, but it is challenging to obtain an accurate transplant outcome prediction from the MELD score alone. The low survival rate and high recurrence rate account for incorrect prediction of transplantation outcomes [1, 2]. An appropriate selection of input parameters and classification algorithm results in a high survival rate. There are various machine learning techniques supporting medical diagnostic decisions. An ANN is a classification model inspired by the structure and elements of biological neural systems; it mimics the cognitive learning process, in which neural elements are interconnected by synapses, and it is able to anticipate new observations learned from existing information.
2 Review of Literature We have conducted the literature survey in two parts. The first part covers survey papers relevant to survival in organ transplantation, and the second part surveys the artificial neural network models used by different researchers, together with the datasets used and the results obtained.
3 Literature Survey-I Raji et al. [3, 4] use a multilayer perceptron ANN model to address organ allocation and survival prediction for patients who have undergone liver transplantation, and obtain greater survival prediction accuracy on the larger and richer dataset. Oztekin et al. [5] use four different classification models and propose an integrated method to improve outcome prediction for a combined heart–lung transplantation dataset, indicating that an integrated method enables more effective and efficient analysis. Manuel Cruz Ramírez and his team [6] developed a rule-based system using Pareto fronts obtained by the MPENSGA2 algorithm for allocating donors to receivers and determining the best match between different donor–receiver pairs while maintaining graft survival after transplantation. Vinaya Rao and his team [7] develop two different classification models using WEKA software for clinical decision support; this study indicates that efficient management of organ allocation helps doctors guide related healthcare decisions. Lin et al. [8] compared multiple time-point and single time-point models for graft (recipient and organ) survival prediction in kidney transplantation, achieving good prediction discrimination and model calibration using the ANN model. Dag, Oztekin et al. [9] proposed a hybrid approach using four classification algorithms for prediction at three time points; according to their results, logistic regression with the SMOTE balancing technique yields the best predictions at the three time points (1 year, 5 years, 9 years). Haibo et al. [10] provided a comprehensive review of imbalanced-data learning, with different technologies and metrics for performance evaluation in data imbalance cases, and highlighted challenges and opportunities for future research. Sotiris Kotsiantis and his team [11] described various techniques for handling and learning from imbalanced datasets. Kaur and Wasan [12] provided a case study on classification techniques such as ANN, rule-based, and decision tree algorithms applied to diabetic patients’ medical data.
3.1 Literature Survey-II Oztekin and his team [5] use combined heart–lung transplantation data from the UNOS dataset; their study indicates that the ANN outperforms the decision tree and logistic regression models in terms of accuracy. Lin et al. [8] use USRDS dataset parameters of patients with end-stage renal disease as well as UNOS data on transplant outcomes, obtaining monotonic results for single- and multiple-output ANN and logistic regression. Caocci et al. [13] use 78 patients who received UDHSCT for beta-thalassemia as a dataset and determine sensitivity and specificity values, indicating that the ANN outperforms logistic regression in terms of sensitivity and specificity. Lau et al. [14] use a liver transplantation database
from Austin Health and apply random forest and ANN models to the dataset, obtaining higher AUC-ROC values for the ANN model. Khosravi et al. [15] use a dataset of patients who underwent liver transplantation surgery at Shiraz Namazee Hospital, Iran. They compare the performance of ANN and Cox PH regression models and obtain higher AUC-ROC values for the ANN than for the Cox PH model.
4 Materials and Methods 4.1 Dataset Description In this study, we used a dataset on heart transplantation from the United Network for Organ Sharing (UNOS), the official network of the United States government, and a dataset on liver transplantation from the Stanford Liver Transplant Program.
• Heart Transplantation Dataset Description: The heart transplantation dataset comes from the United Network for Organ Sharing (UNOS) database; the dataset for this study was provided by a legitimate United States Government network run by the Health Resources and Services Administration [5, 16]. From this rich dataset we selected files providing data on all waiting-list registrations and heart transplants performed or reported in the United States between October 1, 1987, and December 31, 2018. It contains 15,580 instances (samples) and 133 attributes (columns).
• Liver Transplantation Dataset Description: This dataset gives information about the transplant experience in a particular region over a period of time. The liver transplantation database includes information on outcomes such as dying while waiting, withdrawing from the list, receiving a transplant, or being censored. Information on patients on the waiting list for liver transplantation from 1990 to 1999 is also included in the database.
4.2 Artificial Neural Network (ANN) Model Prediction is an Artificial Neural Network’s (ANN) core competency; because of the neural connections made throughout the cognitive learning process, an ANN can anticipate new observations derived from previously learned knowledge [9]. Training a multilayer perceptron is the process by which the values of the various weights are chosen so that the network appropriately represents the relationships in the data [5].
4.3 K-fold Cross Validation The bias caused by random sampling of the holdout and training data examples is reduced by using k-fold cross-validation to compare the predictive accuracy of two or more techniques. The full dataset (D) is arbitrarily split into k mutually exclusive folds or subsets (D1, D2, …, Dk) of nearly equal size. The classification model is trained and tested k times: in each iteration t (t = 1, 2, …, k), it is trained on all folds except Dt and tested on the single fold Dt. Rotation estimation is another name for k-fold cross-validation [6].
4.4 Data Splitting Together with the ten-fold cross-validation process, we separated the data into training and test sets for the multilayer perceptron neural network model. We randomly split the dataset in a 75–25% pattern: the training set contained 75% of the data patterns while the test set contained 25%. Splitting the data in this way enhances model performance and prediction.
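A minimal sketch of the splitting scheme described in Sects. 4.3 and 4.4 (a 75–25 hold-out split followed by ten-fold cross-validation on the training portion) is given below. It uses scikit-learn on a synthetic stand-in for the clinical data, and the classifier settings are illustrative assumptions rather than the configuration used in the study.

```python
import numpy as np
from sklearn.model_selection import train_test_split, KFold
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 26))       # stand-in for the 26 clinical attributes
y = rng.integers(0, 2, size=500)     # stand-in for GRFT STAT (0 = survived, 1 = failed)

# 75-25 hold-out split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

# Ten-fold cross-validation on the training portion
kf = KFold(n_splits=10, shuffle=True, random_state=42)
fold_accuracies = []
for train_idx, val_idx in kf.split(X_train):
    clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=42)
    clf.fit(X_train[train_idx], y_train[train_idx])
    fold_accuracies.append(
        accuracy_score(y_train[val_idx], clf.predict(X_train[val_idx])))

print("mean CV accuracy:", np.mean(fold_accuracies))
print("hold-out accuracy:",
      accuracy_score(y_test, clf.fit(X_train, y_train).predict(X_test)))
```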
4.5 Role of Parameters in Survival The United Network for Organ Sharing (UNOS) provided the dataset used in this study. Twenty-six clinical characteristics are given to the model as input, and one attribute, GRFT STAT, which is expressed as binary [14, 17, 18], is taken as the output attribute (Table 1).
Table 1 Representation of the output variable
GRFT STAT = 0   Best survival (graft survived)
GRFT STAT = 1   Poor survival (graft failure)
The dataset has 26 clinical input variables that can be used to accurately predict long-term survival. Age is recorded for both the donor and the receiver as AGE and AGE DON, respectively. Age differences between the donor and receiver are crucial to survival: compared to an organ from a younger donor, an organ from an older donor had worse transplant survival. The age mat level attribute represents the donor–recipient age match level; recipients were at a higher risk of graft failure when they received an organ from a donor who was older than 65 [19, 20]. As noted by Veroux and colleagues, the recipient and donor blood types, BO and BO DON, have a crucial role in the longevity of the transplant. Both the recipient’s and the donor’s body mass indices are shown as numeric attributes, BMI CALC and BMI
DON CALC, respectively. The terminal lab creatinine of the deceased donor is represented numerically as CREAT DON [21, 22]. The nominal attribute DDAVP DON indicates whether synthetic anti-diuretic hormone was administered to the deceased donor prior to transplant. DIAB indicates whether the recipient has diabetes, on which both immediate and long-term survival depend, while DIABETES DON indicates whether the deceased donor had a history of diabetes. The recipient ethnicity category’s numerical value is ETHCAT, while the donor ethnicity category’s value is ETHCAT DON; the ethnicity category match level for donors and recipients determines whether they fall within the same group. The genders of the recipient and donor are indicated by the attributes GENDER and GENDER DON, respectively. Here, a high graft survival rate is mostly due to correct gender matching [12]: clinicians have established that transplants involving male recipients and female donors perform badly or increase the risk to survival. The MALIG TCR factor determines whether or not the recipient has any known malignancies. The recipient’s pre-transplant medical state, MED COND TRR, is one of the key determinants of survival. Whether the patient has had any prior transplantation is indicated by the numeric parameter NUM PREV TX. The terminal values of SGOT, SGPT, and total bilirubin are numerical values associated with the deceased donor and are denoted by the variables SGOT DON, SGPT DON, and TIBILI DON. The weights of donors and recipients are represented numerically by the variables WGT KG DON CALC and WGT KG TCR, respectively [23].
5 Problem Statement and Proposed Methodology The fundamental goal of this study is the survival prognosis of the patient after transplantation using a suitable classification algorithm or technique. A multilayer perceptron ANN model is used for training and outcome prediction with a proper selection of data attributes. PCA (Principal Component Analysis) with ranking is used for dimensionality reduction, and association rule mining algorithms are used to find the relations between attributes. The input dataset is subjected to k-fold cross-validation; during training the errors are reduced, providing the best long-run survival result. The performance of the classification algorithm is assessed using performance measures [24]. Figure 1 shows an outline of the proposed methodology, consisting of five sequential phases.
Fig. 1 Outline of proposed methodology
The five sequential phases of the method’s process flow are data preparation, training a classification model, model building or assessment, measuring the designed MLP model’s performance, and evaluating the MLP model’s performance. Phase 1 is data preparation, which involves data cleaning, data censoring, and identifying the most relevant attributes. In Phase 2, the multilayer perceptron ANN model is trained with ten-fold cross-validation, decreasing the error during training. Model assessment with back-propagation of the errors in the MLP model is done in Phase 3. Phase 4 uses performance measures and performance error measures to evaluate the performance of the created MLP model, and Phase 5 includes evaluating and representing the designed MLP model’s performance using the AUC-ROC curve. The output of the model is the graft status, which is binary, in the form of 0 and 1. The output of the methodology is a prediction based on the performance measure values. The backpropagation algorithm trains the classification model on the clinical attributes of the patient, and while training on these attributes the hidden layer uses the sigmoid activation function. For a patient undergoing transplantation to have the best chance of survival, suitable donor–recipient matching is necessary.
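The training pipeline outlined above (dimensionality reduction with PCA followed by a multilayer perceptron using the sigmoid activation, evaluated with ten-fold cross-validation) can be sketched as follows. Again, the data are synthetic placeholders, and the number of principal components, hidden-layer size and other hyperparameters are assumptions for illustration, not the values reported by the authors.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_validate

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 26))        # placeholder clinical attributes
y = rng.integers(0, 2, size=500)      # placeholder graft status (0/1)

pipeline = Pipeline([
    ("scale", StandardScaler()),                  # normalise attribute ranges
    ("pca", PCA(n_components=10)),                # dimensionality reduction (illustrative)
    ("mlp", MLPClassifier(hidden_layer_sizes=(32,),
                          activation="logistic",  # sigmoid activation in the hidden layer
                          max_iter=500,
                          random_state=1)),
])

scores = cross_validate(pipeline, X, y, cv=10,
                        scoring=["accuracy", "precision", "recall", "f1"])
for metric in ["accuracy", "precision", "recall", "f1"]:
    print(metric, round(scores[f"test_{metric}"].mean(), 3))
```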
6 Result and Analysis We experimented with the ANN model on the heart and liver transplantation datasets. The training is carried out with ten-fold cross-validation; the accuracy is calculated for each fold and then aggregated. We compared the performance of the multilayer perceptron classification model on the basis of the performance measures. The performance results of the multilayer perceptron ANN model on the two datasets are given in Table 2.
Table 2 Comparative study of the classification model on the two datasets
Parameters   Liver transplantation dataset   Heart transplantation dataset
Accuracy     87.50%                          72.09%
Precision    0.875                           0.720
Recall       0.875                           0.721
F1-score     0.875                           0.720
Fig. 2 Comparative result graph—I
Fig. 3 Comparative result graph—II
From these results, comparative performance graphs are drawn. The graphs in Fig. 2 show that the multilayer perceptron neural network model performed best with 87.5% accuracy on the smaller liver transplantation dataset and 72.09% accuracy on the heart transplantation dataset. Figure 3 shows the other performance measures, namely precision, recall, and F1-score.
7 Analysis Medical prognosis, the prediction of the likely course of an illness, aids in the decision-making process for treating patients and increases the likelihood of a patient’s recovery. In organ transplantation, the patient’s long-term survival is crucial. The
choice of input parameters is necessary if patients are to survive for a long time after receiving an organ transplant. To obtain relevant attributes from the dataset, the right data mining techniques must be used. The attributes are retrieved, and PCA is used to reduce their dimensionality; the attributes were ranked by computing the mean and the standard deviation. For association and correlation between characteristics, association rule mining is utilized. This demonstrates that the attributes we chose are appropriate for predicting survival. The choice of dataset and model has an impact on the results of each study and experiment. The ANN is an effective and appropriate tool for medical prognosis. We employed the ANN model on two distinct datasets, namely heart transplantation and liver transplantation. Ten-fold cross-validation is used during ANN model training; accuracy is determined for each fold and aggregated. We improve the performance metrics of the model with adequate preprocessing of the long-term survival data, by choosing appropriate characteristics and an appropriate model.
8 Conclusion Organ transplantation is an important treatment for end-stage terminal disease, and matching donor–recipient characteristics plays an important role in survival. The multilayer perceptron ANN model is used for training and outcome prediction with a proper selection of data attributes. We applied PCA, together with the standard deviation, for extracting relevant attributes, and association rule mining for association and correlation between attributes. The study provides a comprehensive literature survey in the area of survival in organ transplantation, and a survey of the artificial neural networks used by different researchers, with the datasets used and their results. We compared the classification model’s performance on two datasets, heart transplantation and liver transplantation, using accuracy, precision, recall, and F1-measure as performance metrics. With the right choice of input parameters, we might improve accuracy outcomes by suggesting the ANN model. As a result, we can say that the multilayer perceptron ANN model is the best technique for precisely predicting the long-term survival outcome following organ transplantation.
References 1. Zhang M et al. (2012) Pretransplant prediction of posttransplant survival for liver recipients with benign end-stage liver diseases: a nonlinear model. PLoS One 7(3), Art. no. e31256 2. Weiss N, Thabut D (2019) Neurological complications occurring after liver transplantation: role of risk factors, hepatic encephalopathy, and acute (on Chronic) Brain Injury. Liver Transpl 3. Raji CG, Vinod Chandra SS (2017) Long-term forecasting the survival in liver transplantation using multilayer perceptron networks. IEEE Trans Syst Man Cybern Syst 47(8)
4. Raji CG, Vinod Chandra SS (2016) Predicting the survival of graft following liver transplantation using a nonlinear model. Springer J Publ Health 24(5):443–452, Oct 2016 5. Oztekin A, Delen D, Kong ZJ (2009) Predicting the graft survival for heart–lung transplantation patients: an integrated data mining methodology. Elsevier Int J Med Inform 78(12):e84–e96, Dec 2009 6. Ramírez MC, Martínez CH, Fernandez JC, Briceno J, la Mata M (2013) Predicting patient survival after liver transplantation using evolutionary multi-objective artificial neural networks. Elsevier J Artif Intell Med 58(1):37–49 7. Rao V, Behara RS, Agarwal A (2014) Predictive modeling for organ transplantation outcomes. In: IEEE international conference on bioinformatics and bioengineering (BIBE), Boca Raton, USA, Nov 2014 8. Lin RS, Horn SD, Hurdle JF, Goldfarb-Rumyantzev AS (2008) Single and multiple time-point prediction models in kidney transplant outcomes. Elsevier J Biomed Inform 41(6):944–952, Dec 2008 9. Dag A, Oztekin A, Yucel A, Bulur S, Megahed FM (2017) Predicting heart transplantation outcomes through data analytics. Elsevier J Decis Support Syst 19:42–52, Feb 2017 10. He H, Garcia EA (2009) Learning from imbalanced data. IEEE Trans Knowl Data Eng 21(9):1263–1284 11. Kotsiantis S, Kanellopoulos D, Pintelas P (2006) Handling imbalanced datasets: a review. GESTS Int Trans Comput Sci Eng 30 12. Kaur H, Wasan SK (2006) Empirical study on applications of data mining techniques in healthcare. CiteSeerX, J Comput Sci (2):194–200. ISSN 1549-3636 13. Caocci G, Baccoli R, Vacca A, Mastronuzzi A et al (2010) Comparison between an artificial neural network and logistic regression in predicting acute graft-vs-host disease after unrelated donor hematopoietic stem cell transplantation in thalassemia patients. Elsevier J Exp Hematol 38(5):426–433, May 2010 14. Lau L, Kankanige Y, Rubinstein B, Jones et al (2017) Machine learning algorithms predict graft failure after liver transplantation. J Transpl Soc Int Liver Transplant Soc 101(4):e125–e132, Apr 2017 15. Khosravi B, Pourahmad S, Bahreini A, Nikeghbalian S, Mehrdad G (2015) Five years survival of patients after liver transplantation and its effective factors by neural network and Cox proportional hazard regression models. Semantic Scholar, Hepatitis Monthly, 2015 16. Kant S, Jagtap V (2019) Comparative study of machine learning approaches for heart transplantation. In: Saini H, Sayal R, Govardhan A, Buyya R (eds) Innovations in computer science and engineering. Lecture notes in networks and systems, vol 74. Springer, Singapore. https://doi.org/10.1007/978-981-13-7082347 17. Anand HS, Vinodchandra SS (2016) Association rule mining using treap. Springer Int J Mach Learn Cybern 1–9, May 2016 18. Dag A, Topuz K, Oztekin A, Bulur S, Megahed FM (2016) A probabilistic data-driven framework for scoring the preoperative recipient-donor heart transplant survival. Elsevier J Decis Support Syst 86:112, June 2016 19. Raji CG, Vinod Chandra SS (2016) Prediction and survival analysis of patients after liver transplantation using RBF networks. In: Springer international conference on data mining and big data, pp 147–155, June 2016 20. Scientific Registry of Transplant Recipients (2020). Available at: https://www.srtr.org/. Accessed 4 Apr 2020 21. Zhou J, Xu X, Liang Y, Zhang X, Tu H, Chu H (2021) Risk factors of postoperative delirium after liver transplantation: a systematic review and meta-analysis. Minerva Anestesiol 87:684–694 22. Lu R-Y, Zhu H-K, Liu X-Y, Zhuang L, Wang Z-Y, Lei Y-L, Wang T, Zheng S-S (2022) A non-linear relationship between preoperative total bilirubin level and postoperative delirium incidence after liver transplantation. J Pers Med 12:141
23. de Boer JD, Braat AE, Putter H et al (2019) Outcome of liver transplant patients with high urgent priority: are we doing the right thing? Transplantation 24. Kotsiantis SB (2007) Supervised machine learning: a review of classification techniques. Informatica 31
Using Blockchain Smart Contracts to Regulate Forest Policies Aaryan Dalal and Reetu Jain
Abstract This paper explores the implementation of blockchain algorithms, specifically using the Ethereum platform and Solidity programming language, for the management of forest policies, including the Clean Water Act (CWA) and the Sandalwood Act 2022. The use of smart contracts can help automate the compliance checks for these policies and ensure that all parties adhere to the terms outlined in the acts. The deployment of blockchain technology helps improve the implementation and monitoring of these policies by creating a decentralized system of record-keeping, reducing the complexity of the current system, and providing a secure and transparent record of all transactions. The adoption of blockchain, implemented through Solidity, can increase transparency and accountability, reduce administrative overhead, and improve data security in the management of forest policies. Keywords Blockchain · Smart contracts · Ethereum · Solidity · Clean Water Act · Forest policy
1 Introduction Each country has different policies and acts to regulate forests and monitor their conditions. An article published by the U.S. Government Accountability Office mentions that the federal government of the U.S. owns nearly 650 million acres of land in the U.S. which is approximately 30% of the nation’s total surface area [1]. Furthermore, this federal land is managed by 4 major federal land management agencies—National Park Service (NPS), the Department of Agriculture’s Forest Service, Fish and Wildlife Service (FWS) and the Department of Interior’s Bureau of Land Management (BLM)—which are responsible for managing 95% of these lands (U.S. GAO, Managing federal lands and waters). A. Dalal (B) Jakarta Intercultural School, South Jakarta, Indonesia e-mail: [email protected] R. Jain On My Own Technology Pvt Ltd, Mumbai, India e-mail: [email protected]
The title of the United States Code named Conservation contains an array of laws governing how the Forest Service and other agencies manage these public lands. These laws are made up of acts and policies formed through the years. Some landmark laws include, but are not limited to: the Clean Air Act of 1970, an environmental law which regulates air quality levels and provides for the regulation of sources of pollution; the Comprehensive Environmental Response, Compensation and Liability Act of 1980, which is aimed at cleaning up already polluted areas, assigns liability to individuals involved in the improper disposal of hazardous waste, and provides a method of funding for cleanup; and the Endangered Species Act of 1973, which aims not only to prevent the extinction of certain species but also to recover populations under threat of extinction [2]. These are just a few of the landmark laws in the Conservation title followed by the Forest Service (U.S.D.A, Laws and regulations).
2 Problems Faced by Federal Agencies The federal agencies monitoring public land use funds from the Land and Water Conservation Fund to enhance recreational activities or conserve land. However, unlike the other federal agencies, the BLM does not maintain centralized data about this land, resulting in an incomplete inventory of its land and slow responses to information requests from Congress (U.S. GAO, Managing federal lands and waters). This issue stems largely from the fact that the agencies are responsible for a great deal of public land, leading to the distribution of inspections to local land departments, which results in a more complex system and causes miscommunication and delays in processing information (Clearwater Minnesota, Sustainable Forest Management Policy and Statement of Operational Commitments). Because this responsibility is sub-distributed throughout the country, loss of authority and of inspection quality increases, undermining the integrity of the enacted act or policy. The federal government requires coal mining companies to restore the land upon which their activities were conducted, usually secured in the form of a bond. In some cases, rather than securing a bond through another company or providing collateral, coal mine operators are permitted by the federal government to guarantee these costs out of their own finances (self-bonding). However, self-bonds are highly risky due to increasing industry bankruptcies and lower coal demand (U.S. GAO, Managing federal lands and waters). The federal government also requires ranchers to obtain permits or leases from the agencies to graze livestock on federal lands. Unauthorized grazing takes various forms, such as grazing more livestock than permitted or grazing without a permit. Since these issues are usually handled informally—through a phone call—they aren’t recorded officially, and thus the agencies do not have a complete record of unauthorized grazing incidents to analyze (U.S. GAO, Managing federal lands and waters). It is evident that a lot of the problems faced by the federal government revolve around monitoring and recording data on the implementation of policies and permits. Since many of the policies mentioned have no quantifiable measures, the agencies have
trouble recording official data, resulting in inefficiency and wasted resources. Most importantly, the goals set by the policies and acts are not being met, which leads to waste of natural resources, mainly because minimal official action is taken against poor adherence to a policy or act. A new system is therefore required that helps monitor data collection while keeping the integrity of the policy intact.
3 Blockchain Algorithms and Smart Contracts Blockchain algorithms were developed with the purpose of storing immutable records as blocks forming a chain—hence the word blockchain. These algorithms quickly found their way into the digital financial market, since they make transactions safer and more transparent. Typically, transactions between two people happen through a trusted third party such as a bank, which holds transaction records and ledgers. However, with the increased incidence of fraud, embezzlement and hacking across a growing number of banks, third parties are becoming less reliable. With a blockchain algorithm, the transaction process changes completely. If person A requests $5 from person B, the transaction is recorded in a “block” of information, which is then broadcast to each party in the blockchain network. The people in the network verify this transaction—the block of information—by “mining” it. The verified block is then added to the existing chain of verified blocks, forming a reliable, transparent and permanent history of transactions for both person A and person B, since information stored in blockchains is immutable (WTIA Cascadia Blockchain Council 2021). Smart contracts are a set of algorithms stored on a blockchain that run when set conditions are met; they are typically used to execute agreements so that all parties on the blockchain can be immediately certain of the outcome without any third-party intermediary’s involvement. The smart contracts on the blockchain operate following basic conditions which the owner creates. When the conditions are met, the contract executes commands such as issuing tickets, registering vehicles, reserving seats or even granting permits. Once the commands are executed by the contract, the blockchain is updated and the transactions are recorded. Since the data is immutable, those in the blockchain network can view details but cannot change them (IBM, What are smart contracts on blockchain?). Because smart contracts are digital and the completion of a condition independently executes the command, workflow is faster, with higher efficiency and no paperwork to process. The exclusion of third parties and the inclusion of participants in all transactions make blockchain smart contracts reliable and transparent. Blockchains are encrypted and immutable, hence very difficult to hack. Additionally, since all records are connected in a chain, altering a single record requires changing the chain altogether, making data even more secure (IBM, What are smart contracts on blockchain?).
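The tamper-evident, append-only structure described above can be illustrated with a toy example in which each block stores the hash of its predecessor, so that altering any earlier record invalidates every later link. This is a conceptual sketch only; it omits mining difficulty, consensus and the Ethereum machinery used later in the paper.

```python
import hashlib
import json
import time

def make_block(transactions, previous_hash):
    """Bundle transactions with the previous block's hash and fingerprint the result."""
    block = {
        "timestamp": time.time(),
        "transactions": transactions,
        "previous_hash": previous_hash,
    }
    block["hash"] = hashlib.sha256(
        json.dumps(block, sort_keys=True).encode()).hexdigest()
    return block

# A tiny chain: person A sends $5 to person B, then B sends $2 to C.
genesis = make_block([], previous_hash="0" * 64)
block1 = make_block([{"from": "A", "to": "B", "amount": 5}], genesis["hash"])
block2 = make_block([{"from": "B", "to": "C", "amount": 2}], block1["hash"])

# Tampering with block1 breaks the link stored in block2.
block1["transactions"][0]["amount"] = 500
recomputed = hashlib.sha256(json.dumps(
    {k: block1[k] for k in ("timestamp", "transactions", "previous_hash")},
    sort_keys=True).encode()).hexdigest()
print("chain still valid:", recomputed == block2["previous_hash"])   # False
```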
4 The Clean Water Act 1972 Due to the sheer number of forest policies and acts enacted by the federal government, this paper focuses on implementing a sample smart contract for wastewater management, a part of Clean Water Act (CWA) compliance monitoring, the CWA being a landmark law among the forest policies implemented by the federal government. CWA compliance monitoring is conducted by the United States Environmental Protection Agency (EPA) to monitor and ensure compliance with clean water laws and regulations in order to protect the environment and human health, as the CWA is the primary federal law governing water pollution (EPA, Clean Water Act (CWA) Compliance Monitoring). The National Pollutant Discharge Elimination System (NPDES) regulates point sources which discharge pollutants into waters of the United States. Using a range of techniques such as Discharge Monitoring Report reviews and on-site compliance evaluations, compliance monitoring provides assistance to enhance compliance with NPDES permits (EPA, Clean Water Act (CWA) Compliance Monitoring). The EPA conducts investigations of Publicly Owned Treatment Works (POTWs), checking for overflows of raw sewage and inadequately controlled stormwater discharges from municipal sewer systems that end up in waterways or city streets [3]. These inspections will become the conditions for the smart contract.
EPA inspections for combined sewer systems involve:
• reviewing the NPDES permit and any enforcement orders
• verifying the permittee is in compliance with the permit
• verifying that the permittee is preventing combined sewer overflows (CSOs) during dry weather
• reviewing compliance with the nine minimum CSO controls
• verifying that the permittee is adhering to a schedule in the long-term control plan
• implementing a monitoring program
• eliminating overflows in sensitive areas
• minimizing industrial discharges during overflow events.
EPA inspections for sanitary sewer systems involve:
• reviewing the NPDES permit and any enforcement orders
• verifying that the permittee is in compliance with the NPDES standard permit conditions to mitigate and institute proper operation and maintenance
• determining if there are any unpermitted discharges such as sanitary sewer overflows (SSOs).
Pretreatment inspections involve:
• reviewing the approved program, annual reports, NPDES compliance status, previous inspection reports, pretreatment files, and citizen complaints
• interviewing officials knowledgeable of the program
• inspecting various industrial user operations, if appropriate.
(EPA, Clean Water Act (CWA) Compliance Monitoring). These inspections are vital to wastewater management and thus the CWA. The smart contract will treat these inspections as conditions to be met, and execute commands as each condition is met, such as issuing notices to areas where the CWA has been violated and recording data in the blockchain where the CWA has been followed with a timestamp.
5 Karnataka State Sandalwood Policy 2022 India has implemented a sandalwood policy to protect, conserve and regenerate its natural sandalwood resources. The policy is aimed at preserving the sandalwood species, promoting sustainable management of sandalwood forests, and providing secure and fair access to resources. To ensure secure and fair access, it introduces several regulations: sandalwood trees can only be harvested from government-approved plantations and forests by licensed individuals, and the government is expected to monitor and supervise harvesting activities. The policy also grants certain rights to local communities, allowing them to benefit from the exploitation of their sandalwood forests. Furthermore, it promotes the sustainable management of sandalwood resources by providing guidelines for adequately identifying, monitoring and managing sandalwood trees so that their populations can be maintained appropriately, and it encourages the development of sandalwood-based industries, such as the production of essential oils, incense and perfumes. Finally, it promotes research and development activities in sandalwood production and management, including research into the species’ genetic diversity, habitat and ecology and the development of new production and management technologies. The policy lays out several necessary Standard Operating Procedures (SOPs) for protecting the species, which include correctly identifying sandalwood trees, periodically monitoring their populations, and providing secure resource access, alongside the promotion of sandalwood-based industries and research activities. Overall, the policy has provided secure and fair access to resources while promoting sustainable management and research activities, and has thereby helped to protect and conserve India’s natural sandalwood resources. The specific policy implemented for sandalwood protection is the Karnataka State Sandalwood Policy 2022, which removes all existing restrictions on growing and marketing sandalwood and allows for the cultivation of sandalwood trees on private lands. The Karnataka Forest (Amendment) Bill, 2001 also allows sandalwood trees to be grown freely in every garden, park and green space.
6 Implementing the Smart Contract 6.1 General Overview/Outline For this paper, the smart contracts are created in the Ethereum environment, programmed in the Solidity programming language and compiled in the Remix IDE. The contracts are deployed, and transactions between owners and users take place in the Ether (ETH) currency, the native cryptocurrency of Ethereum; transactions are executed by the Ethereum Virtual Machine (EVM) and validated and recorded throughout the Ethereum network by users who mine the corresponding blocks. The current process involves granting a permit to a permittee on a set of conditions reviewed and agreed upon by the EPA and the owner, which requires a lot of paperwork and time for both parties. When official checks are conducted, EPA inspections take place, following a review of the previously granted permit and cross-referencing and measurement of the terms made in the permit. The investigators then decide on the validity of the permit based on the terms mentioned and the extent to which they are being followed, and the decision is recorded. However, since most of this process is on paper and leaves a lot of decision-making to the investigation team, there is a high chance of loss of data or fraud, which is costly on many levels. The implementation of smart contracts in this situation will be two-fold—one contract to replace the permit being granted, and one contract for the investigation
team—both of which will communicate together to verify terms and automatically resolve the situation and record it in the secured blockchain.
6.2 Smart Contract for the Permit The permit is granted when the EPA and the owner set the terms of the permit. According to the Clean Water Act, clean water is measured and classified as clean when a certain number of parameters are met—such as:
• Temperature
• Dissolved Oxygen Levels
• pH level
• Conductivity
• Oxidation Reduction Potential (ORP)
• Turbidity
In the smart contract, written in Solidity 0.8.7, the code first defines a structure of parameters containing one variable for each of the clean-water parameters listed above. An object named “limit” is then created to reference this “parameters” structure, and a function populates limit, assigning each variable the value set by the Clean Water Act. However, the permit needs two other basic conditions for it to be granted to a facility dealing with water sources and discharges: the source of the water needs to be an authorized point source, and the discharge location needs to be authorized. Another structure is therefore created with two Boolean variables authorizing the source and the point of discharge, and a reference to this structure is created with the name conditions. A function then assigns these variables appropriate Boolean values once the source is verified; in the example considered here, the permittee is not authorized to discharge their wastewater. Once the conditions of the permit have been set, all that is left is for the permittee to transfer the permit fee to the EPA. Thus, if either the water source or the discharge point is authorized, the permittee is granted the permit and the fee is paid to the EPA.
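The permit logic just described can be summarised in a short sketch. The authors’ implementation is a Solidity 0.8.7 contract; the Python model below simply mirrors the same pieces: a structure of limit parameters, two authorization Booleans, and a fee transfer that completes the grant. All threshold values and the fee amount are hypothetical placeholders rather than figures taken from the Clean Water Act or from the paper.

```python
from dataclasses import dataclass

@dataclass
class Parameters:
    """Water-quality limits written into the permit (placeholder values)."""
    temperature_c: float = 30.0
    dissolved_oxygen_mg_l: float = 5.0
    ph: float = 7.0
    conductivity_us_cm: float = 500.0
    orp_mv: float = 300.0
    turbidity_ntu: float = 5.0

@dataclass
class Conditions:
    """Authorization flags reviewed by the EPA before the permit is issued."""
    source_authorized: bool = False
    discharge_authorized: bool = False

class Permit:
    PERMIT_FEE = 100  # placeholder fee, transferred to the EPA on grant

    def __init__(self, limit: Parameters, conditions: Conditions):
        self.limit = limit
        self.conditions = conditions
        self.granted = False

    def pay_fee_and_grant(self, fee_paid: int) -> bool:
        """Grant the permit if a condition is authorized and the fee is transferred."""
        authorized = (self.conditions.source_authorized
                      or self.conditions.discharge_authorized)
        if authorized and fee_paid >= self.PERMIT_FEE:
            self.granted = True
        return self.granted

# Example: an authorized point source that is not permitted to discharge.
permit = Permit(Parameters(),
                Conditions(source_authorized=True, discharge_authorized=False))
print(permit.pay_fee_and_grant(fee_paid=100))   # True
```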
6.3 Smart Contract for the Investigation Since the investigation contract uses values from, and compares data with, the permit contract, the first step is to inherit the permit contract, allowing access to the variables in the permit contract file, and to declare the Ethereum address of the owner of the contract—the EPA. Next, as in the permit contract file, a structure for the recorded values is created in the investigation contract file; in this case, however, the values are not set in advance but are recorded by the investigation team. By recording these values in a smart contract, irrespective of the outcome of
the investigation, the data collected is permanently stored in the blockchain, approved by miners on the blockchain network. The recorded values are stored in the values structure, with recorded as the reference name for the structure. A set of values is then recorded and stored in the various variables. The point source of the water is identified, since EPA investigations require it; irrespective of its authorization, the source is stored as a string in the blockchain for comparison with any future investigations. The authorization of the source and the discharge point are then recorded as Booleans. However, the recorded Boolean for the discharge condition is true if the facility discharges and false if the facility does not discharge any water, without indicating whether the discharge is authorized or not. Additionally, the code defines three other unassigned integer variables—violations, amount, and fine (set at $100)—to be used later in the code. Next, a function compares each recorded value individually with the corresponding parameter set in the permit. If the values do not match, the violations variable defined earlier is incremented by 1. Once all the parameters have been compared and the violations have been recorded, the function verifies the point source and discharge authorizations. Since the values in both the permit contract and the investigation contract are stored as Booleans, the && operator is used to cover all possible cases. If the conditions are violated—for example, the permit does not authorize discharge but the investigation records a discharge—then the violations variable is updated again. The last part of the investigation contract sets the permittee’s address as the sender. The transfer-ether function requires a payable Ethereum address as the destination address—the EPA. The number of violations is multiplied by the fine amount set by the EPA, stored in the amount variable declared earlier, and transferred to the destination address declared in the function. A similar smart contract can also be created for sandalwood protection, tracking all existing sandalwood trees from the source to the manufacturing plants. Some conditions that could be used in a sandalwood smart contract are its:
• Scent
• Age
• Weight
• Burning
• Color
A smart contract for sandalwood tracking would typically include details such as the origin of the sandalwood, the date of harvest, and the ownership history.
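To make the comparison-and-fine logic described in this section concrete, the following is a minimal illustrative sketch. It mirrors the described flow in plain Python rather than Solidity, and every parameter name and limit value is an assumption; only the $100 per-violation fine comes from the description above.

# Illustrative Python sketch of the investigation logic (not the actual Solidity contract code).
permit_limits = {"ph": 7.0, "lead": 0.015, "turbidity": 1.0}        # values set in the permit (assumed)
permit_conditions = {"source_authorized": True, "discharge_authorized": False}

recorded_values = {"ph": 7.0, "lead": 0.020, "turbidity": 1.5}      # values recorded by the investigation (assumed)
recorded_conditions = {"source_authorized": True, "discharges": True}

FINE_PER_VIOLATION = 100  # USD, as set by the EPA in the described contract

def count_violations(limits, values, permit_cond, recorded_cond):
    violations = 0
    for parameter, limit in limits.items():
        if values[parameter] != limit:      # a mismatch with the permitted value counts as a violation
            violations += 1
    # discharging without authorization, or an unauthorized source, also counts
    if recorded_cond["discharges"] and not permit_cond["discharge_authorized"]:
        violations += 1
    if not recorded_cond["source_authorized"]:
        violations += 1
    return violations

violations = count_violations(permit_limits, recorded_values,
                              permit_conditions, recorded_conditions)
amount = violations * FINE_PER_VIOLATION    # amount owed to the EPA
print(violations, amount)

In the Solidity version described above, the final step would be a payable transfer of this amount to the EPA's address rather than a print statement.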
7 Conclusion The two smart contracts interact with each other independently, requiring minimal external effort and completely excluding any parties other than the EPA and the Permittee. By essentially digitizing the Clean Water Act, the system not only records all data and stores it in a secure blockchain but, in the long run, creates a massive database
of records which was either not easily accessible or did not exist. Additionally, the use of smart contracts prevents tampering with data due to the transparency of the blockchain network. The paper presented contracts for only a small fraction of the CWA; however, they address many common conditions that each investigation will encounter. More importantly, they clearly illustrate the vast scope of smart contracts in this field. Smart Contract Permits (SMP) can be standardized for each investigation, with each standardized SMP containing different limit parameters and values, different point-source authorizations, and so on. It may seem that the setup of this system is complex; however, once the initial permits and conditions are programmed, their data can be inherited by Standardized Investigation Contracts. Furthermore, each facility or permittee can have a history of investigation contracts, allowing future investigations to inherit other contracts and compare values not just with permits but with past investigations as well. This database will then allow machine learning or AI algorithms to analyze patterns across heavily investigated, important bodies of water throughout the United States of America. The scope of these contracts, however, extends beyond the CWA and can easily be integrated into different forest policies. Blockchain technology could likewise be used to improve the implementation of the Sandalwood Policy: it could provide a secure and transparent way to monitor sandalwood harvesting activities and ensure compliance with government regulations, while also providing an automated way to track the ownership of sandalwood trees, monitor harvesting activities, and grant rights to local communities. The implementation of the policy will have a positive economic impact on the state. It will create jobs, as well as provide additional income to farmers. Furthermore, it will incentivise farmers to practice sustainable harvesting, which will help reduce deforestation. Additionally, it will help protect the environment by preserving a valuable natural resource.
Curbing Anomalous Transactions Using Cost-Sensitive Learning S. Aswathy and V. Viji Rajendran
Abstract Credit card fraud is one of the most common crimes in online transactions, posing a significant threat to banks, credit card companies, and consumers. There are several strategies to prevent and identify these anomalous activities, but they all struggle with the class imbalance problem. Usually, traditional algorithms tend to ignore this unequal distribution of data, thereby neglecting the misclassification of costs. This leads to poor classification, which affects the fraud prediction. Generic data level algorithms are typically used to balance such unequal data, but cost misclassification (Singh and Jain in J Inf Optim Sci 41:1–14, 2020 [1]) persists, so this paper attempts to solve both the class imbalance problem (Singh and Jain in J Inf Optim Sci 41:1–14, 2020 [1]; Singh et al. in J Exper Theor Artif Intell 34:571–598, 2021 [2]) and the cost misclassification problem by using a correlation-based feature selection method, optimizing these features with the help of an optimization algorithm, the Genetic algorithm (GA) (Roseline et al. in Comput Electr Eng 102:1–11, 2022 [3]), and finally, a cost-sensitive (Ling et al. in Cost sensitive learning. Encyclopedia of machine learning. Springer, Boston, pp 231–235, 2011 [4]) classifier to deal with cost misclassification. Keywords Class imbalance problem · Cost-sensitive learning · Genetic algorithm · Metaheuristic · Credit card fraud detection · Cost misclassification
S. Aswathy (B) · V. Viji Rajendran
Computer Science and Engineering, NSS College of Engineering, Palakkad, India
e-mail: [email protected]; [email protected]
V. Viji Rajendran
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_53
1 Introduction The Internet and digitization have improved our lives. They have reshaped people's consumption behaviors. Today, payment can be done in different modes; even cashless transactions are made possible with digitization, which ensures transparent and quick
transactions. Credit cards are one such popular payment method, sponsored by banks and building societies, that helps customers pay with borrowed funds as opposed to funds from a bank account. It is second only to debit cards in popularity. Customers can use credit cards to spread out payments over time and get rewards and points for their purchases. Credit scores are even used to measure a person's financial standing. Even with all these advantages, it is not invincible, due to its vulnerabilities. Suspicious activities and the mishandling of sensitive data pose a serious threat to the integrity of the system. Credit card fraud can be defined as using someone else's card to make purchases or apply for cash advances without their knowledge or agreement. Often, fraudsters use digital methods to acquire the credit card number and the associated personal data needed to carry out nefarious transactions. Credit card theft can be deemed identity theft when the fraudster uses the victim's personal information illegally. According to a recent Nilson Report [5], credit card fraud losses will reach 43 billion dollars in five years and 408.5 billion dollars globally during the next decade, making credit card fraud detection more vital than ever. All parties involved in the payment life cycle will bear the brunt of these rising costs, from banks and credit card companies, which bear the cost of such fraud, to consumers, who pay higher fees or have their credit scores lowered, to merchants and small businesses, which face charge-back fees. Due to the surge in online crime and fraud of all kinds, it is more crucial than ever for businesses to take decisive action to avoid payment card fraud. Traditional machine learning algorithms are used to stop such suspicious activities, but these algorithms are prone to the class imbalance problem. In machine learning, classification refers to a predictive modeling problem in which a class label is predicted for given input data [6]. The label could belong to two or more classes. Class imbalance [1, 2] occurs when the majority of the data belongs to a single class label. It can happen in both two-class and multi-class classification [7]. Generally, machine learning algorithms assume that data is evenly distributed. When there is a class imbalance, the machine learning classifier [8] tends to be more biased towards the majority class, resulting in incorrect classification of the minority class [7]. This occurs because the cost function of a traditional machine learning algorithm constantly attempts to optimize quantities such as the error rate without taking the distribution of the data into account. This leads to misclassification of cost, which causes poor classification. Misclassification costs can be viewed as weights applied to specific outcomes; these weights are incorporated into the model and may alter the prediction. In most cases, the machine learning algorithm ignores the cost of misclassification and attempts to balance the data using various techniques to solve the imbalanced data problem. Data sampling is the most commonly used technique. It is further divided into undersampling [9] and oversampling [9], which have different techniques under them, such as the Synthetic Minority Oversampling Technique (SMOTE) [2, 9–12], the Adaptive Synthetic Algorithm (ADASYN) [9], Tomek links [2, 9, 11], etc. The next section discusses these methods, which are used to curb the class imbalance problem [13]. But then again, cost misclassification remains; it is addressed by employing cost-sensitive learning.
Cost-sensitive learning [4] is one type of algorithm-level approach that can be used to address class imbalance. It is divided into three major groups, namely cost-sensitive resampling, cost-sensitive algorithms, and cost-sensitive ensembles, which are used to study imbalanced datasets. There are dependent [14] and independent costs. If misclassification costs differ between samples, a dependent, cost-sensitive learning approach is used. Depending on the instance [14], different costs are assigned to samples in some cases. In this paper, cost-sensitive learning [4] is used to overcome cost misclassification. Metaheuristic techniques are employed to assist in cost-sensitive learning (CSL) [4]. Initially, the dataset is given to the model, where feature selection occurs. To optimize the features selected using the filtering method, a nature-inspired algorithm is used. The features are chosen based on their dependence or correlation, and a threshold is established for the same. The costs are then assigned to these data using a cost matrix [4]. Finally, the data is classified using LightGBM [11, 15], a gradient-boosting decision tree. The rest of the paper follows this order: Sect. 2 discusses some of the different methods to curb the class imbalance problem and cost misclassification [1], and how feature selection and optimization help in obtaining a better result. Section 3 contains a description of the dataset used. Section 4 presents the proposed system, followed by the results in Sect. 5. Finally, Sect. 6 presents the conclusion and future work.
2 Related Works There are several methods to balance class-imbalanced data, which, if ignored, can cause substantial biases in favour of the majority class, reducing classification accuracy and resulting in more false negatives. While doing so, they deliberately ignore the misclassification of cost, which leads to poor classification. One of the popular approaches used for balancing an imbalanced dataset, discussed by Amit Singh et al. [2], is data-level algorithms. These use sampling techniques to perform balancing, such as SMOTE [2, 9, 12], Tomek links [2, 9], and ADASYN [2, 9]; all of these manage to balance the data, but at the cost of overfitting, underfitting, missing valuable information, and data overlap, all the while ignoring the cost of misclassification. Hybrid techniques like SMOTE-Tomek [2, 9, 16] and SMOTEEN [2, 9, 12] were also introduced to resolve the drawbacks caused by the former techniques; these were successful in balancing the dataset while still ignoring the classification cost. In order to curb this problem, cost-sensitive learning (CSL) [1, 4, 13] was introduced by Ling [4]. In CSL [1, 4], certain samples are assigned a higher or lower cost based on the instance [14], thus helping to prevent poor generalization due to misclassification of cost. By applying cost-sensitive learning [1, 4, 13], misclassification of cost [1] is resolved, but realizing this on a very large dataset would be tedious; hence, a subset of features has to be selected from the dataset and optimized with the help of optimization algorithms. Nature-inspired algorithms have always been a great tool for selecting the best features from a population. Several such algorithms and their variants are available, which include the
flower pollination algorithm, the genetic algorithm, and particle swarm optimization. Xin-She Yang introduced the FPA [1], which was able to curb class imbalance in the dataset and produce relevant features. Later, Ajeet Singh [1] proposed the novel cost-sensitive flower pollination algorithm (CSFPA) [1], which managed to bring down the average cost while balancing the dataset. The problem with the FPA [1], however, is that it gets stuck in local optima, restricting the search from exploring a diverse population. A more diverse search can be achieved with the help of a genetic algorithm, whose convergence rate is higher than that of the FPA [1] and whose search space is vast. Optimization yields relevant features, but to classify them, a base learner must be paired with cost sensitivity [4]. A machine learning technique such as random forest [17] used as the base learner provides an accuracy of 96%. The proposed system with LightGBM [11, 15] as the classifier yields an accuracy of 99%, which is higher than that of the CSFPA model. The study highlights that existing systems often overlook the misclassification of cost, even if they manage to eliminate the class imbalance problem, resulting in poor classification. Additionally, sampling [17] and resampling techniques [2] can lead to information loss, further hampering classification. The proposed technique aims to address this issue by minimizing the cost of misclassifying fraud transactions while simultaneously improving classification accuracy. Unlike the previous CSFPA [1] approach, which failed to provide diverse solutions and got stuck in local solutions, the proposed technique employs a Genetic Algorithm [3], which balances the data and provides a wider spectrum of solutions than the FPA [1]. The contributions of this study are significant, as the proposed approach can help financial institutions improve their fraud detection capabilities, save costs, and reduce losses caused by credit card fraud. By addressing the misclassification of cost, financial institutions can make better-informed decisions while minimizing financial risks, thereby enhancing their overall performance and profitability.
3 German Credit Dataset The German credit dataset [18] is used to classify transactions. The dataset is highly imbalanced. It comprises 1000 instances (loan applicants) and 21 features encompassing financial status, demographic information, and employment history, each instance describing a person's credit status (good or bad). An entry in the dataset denotes a person who requests credit from a bank. Each person is classified as a good or bad credit risk based on a set of characteristics. The dataset is essentially used to assess the creditworthiness of an individual. Since it contains imbalanced data, credit risk analysis becomes tedious. In order to overcome this, several techniques are employed. The features used here are transaction amount, transaction location, transaction time, job, age, number of transactions in a given time period, billing address, ZIP code, credit amount, and card verification value. The dataset's widespread accessibility and free usage have made it easier to develop and test a variety of credit risk models.
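To illustrate how the class distribution of such a dataset can be inspected before any balancing is applied, the short sketch below assumes a local CSV copy of the data and an illustrative target column name; both are assumptions, not part of the original study.

import pandas as pd

# Assumed local export of the German credit data with a target column named "Risk"
# (e.g., 1 = good credit, 2 = bad credit); adjust the file and column names as needed.
data = pd.read_csv("german_credit.csv")
print(data.shape)                                   # expected: (1000, 21)
print(data["Risk"].value_counts())                  # absolute class counts
print(data["Risk"].value_counts(normalize=True))    # class proportions reveal the imbalance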
4 Proposed System This paper aims to address the issue of misclassification of costs resulting from the class imbalance problem, which has significant implications for credit card companies, banks, and loan lending agencies. It seeks to propose effective strategies to mitigate the class imbalance problem and improve the accuracy of classification models, thereby reducing costs for financial institutions. In order to achieve this, the paper proposes a cost-sensitive [4] metaheuristic [1] technique for credit card fraud detection [1], which uses LightGBM [11, 15] as the classification model. The suggested technique aims to minimize the cost of misclassifying [1] fraud transactions as non-fraud, which is a crucial issue in credit card fraud detection. By leveraging this approach, companies can improve their fraud detection capabilities and minimize financial losses due to fraudulent activities. The proposed system consists of four modules: feature selection, feature optimization, cost-sensitive learning, and classification. Figure 1 shows a diagrammatic representation of the system.
4.1 Feature Selection The technique of finding the best features in order to build a useful model is known as "feature selection". There are various techniques available for performing feature selection. They can be broadly classified into supervised and unsupervised techniques. Supervised techniques use models such as regression and classification to identify relevant features in labelled data, whereas unsupervised techniques process unlabelled data. In this work, the filter method is used to select features from the German dataset [18]; it uses the correlation coefficient to find correlated features and passes these relevant variables on to the optimization step. Filter Method. It picks up on the intrinsic characteristics [19] of the features, measured using univariate statistics. These methods are quick and less expensive to compute than wrapper methods, and they deal efficiently with high-dimensional data because they are computationally inexpensive.
Fig. 1 Proposed system
Table 1 Correlation matrix

                 Age        Job        Credit amount
Age              1.000000   0.015673   0.032716
Job              0.015673   1.000000   0.285385
Credit amount    0.032716   0.285385   1.000000
Correlated Coefficient. Correlation measures the linear relationship between two or more variables. The idea behind correlation [1] is that the variables present can be predicted from one another, avoiding redundant features. Here, the best variables will be highly correlated with the target, but at the same time they should not be highly correlated with one another, i.e., if two features are correlated, only one has to be present in the model, since the other can be recovered from the former. Correlation can be determined with the help of a Pearson matrix [20]. A threshold value can be set using this co-linearity between features. A variable is either dropped or selected based on its correlation coefficient [1] value with the target variable: if the variable has a greater value than the threshold, it is kept; otherwise, it is removed. Table 1 shows a correlation matrix from which the most highly correlated features can be identified based on their values.
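A minimal sketch of this filter step, assuming the data is already loaded into a pandas DataFrame (the DataFrame name, the target column, and the 0.1 threshold are illustrative assumptions):

import pandas as pd

def correlation_filter(df, target, threshold=0.1):
    # Pearson correlation of every numeric feature with the target column
    numeric = df.select_dtypes(include="number")
    corr_with_target = numeric.corr()[target].drop(target).abs()
    # keep only the features whose absolute correlation exceeds the threshold
    return corr_with_target[corr_with_target > threshold].index.tolist()

# example usage with the assumed names:
# relevant_features = correlation_filter(data, target="Credit amount", threshold=0.1)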
4.2 Feature Optimization Features are optimized with the help of the Genetic Algorithm (GA) [3]. It is a type of optimization algorithm inspired by the theory of natural selection. It is a heuristic search algorithm that mimics the processes of natural selection, reproduction, and mutation of living organisms in nature to find the best solution (offspring) to a problem. In a typical GA [3], the initial potential solutions (chromosomes) are generated randomly. The effectiveness of each potential solution is then assessed (fitness function) based on how well it solves the problem. Individuals with higher fitness scores are chosen for reproduction (crossover) to produce new offspring (chromosomes). This process is repeated over and over until a viable solution is discovered. During reproduction, chromosomes exchange genetic information (crossover) to produce new offspring with traits inherited from their parents. There is also the possibility of mutation (random changes in an offspring's genetic makeup) introducing new traits that may be useful in solving the problem. The algorithm continues until a stopping criterion is met, such as the maximum number of generations, solution convergence, or a predetermined fitness score. In this paper, the genetic algorithm is used to find relevant and optimized features from the dataset, thereby making classification efficient. First, an optimization function is defined that takes a binary vector indicating which features are selected and returns a scalar fitness value indicating how well the solution performs. Next, the genetic algorithm parameters are defined, including the number of generations, the number of parents mating, the population size, and the mutation rate. Then an initial population is
created randomly, consisting of binary vectors indicating the presence or absence of each feature. This is followed by the creation of a genetic algorithm object using the optimization function and the genetic algorithm parameters. The algorithm is executed to yield the best solution and its corresponding fitness value. Finally, the selected features and the fitness value are obtained. Overall, the genetic algorithm [3] searches for the combination of features that maximizes the fitness value, allowing for better predictions of credit risk. The pseudocode for the algorithm is as follows:

Genetic Algorithm

# candidate features chosen by correlation with the target
data = load_data()
target = "Credit amount"
corr_matrix = compute_correlation_matrix(data, target)
relev_features = select_relev_features(corr_matrix)

# fitness of a binary feature-selection vector
def optimization_function(solution, solution_idx):
    selected_features = get_selected_features(solution, relev_features)
    X_selected = select_data_features(data, selected_features)
    fitness = evaluate_solution(X_selected, target)
    return fitness

# genetic algorithm parameters
num_gen = 20
num_parents_mating = 5
sol_per_pop = 10
num_genes = len(relev_features)
mutation_percent_genes = 10

initial_population = create_initial_population(sol_per_pop, num_genes)
ga_instance = create_genetic_algorithm_instance(
    optimization_function, num_gen, num_parents_mating, sol_per_pop,
    num_genes, mutation_percent_genes, initial_population)
ga_instance.run()

# best feature subset and its fitness
best_solution, best_fitness = get_best_solution(ga_instance)
selected_features = get_selected_features(best_solution, relev_features)
print("Selected features: ", selected_features)
print("Fitness value: ", best_fitness)
After the best features are found, they are fed into a cost-sensitive base learner [1], LightGBM [11, 15], to distinguish genuine from fraudulent transactions. Here, cost-sensitive learning [1, 4] is applied to minimize the cost misclassification that usually occurs with an imbalanced dataset.
4.3 Cost Sensitive Learning Cost-Sensitive Learning (CSL) [4] is a machine learning approach that considers the cost associated with misclassifying the different classes in a classification problem. Traditional algorithms tend to ignore the misclassification of cost and focus only on balancing the data, which affects classification; this is where CSL [4] becomes effective. It addresses the latter issue by incorporating a cost matrix into the training process. The cost matrix [4] defines the cost associated with misclassifying each class. The proposed system applies a reweighting scheme that assigns different weights to each instance based [14] on the cost matrix [4]. Instances with a higher cost of misclassification are given higher weights, which makes them more influential during training. Here, Table 2 depicts a cost matrix [4] where the cost of misclassifying an instance of class A as class B is d, and the cost of misclassifying an instance of class B as class A is e. The cost of correctly classifying an instance is 0. In summary, CSL [4] uses the cost matrix to train a cost-sensitive classifier [1] that can improve classification accuracy and reduce the total cost.
Table 2 Cost matrix

Cost matrix    Predicted A    Predicted B
Actual A       0              e
Actual B       d              0
By adjusting the decision threshold and incorporating the cost matrix, the cost-sensitive classifier [1] can make more informed decisions that minimize the expected cost.
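A small sketch of this reweighting idea; the 2 x 2 matrix below and its cost values are purely illustrative (rows are actual classes, columns are predicted classes, class 0 = genuine, class 1 = fraud):

import numpy as np

# Illustrative cost matrix: missing a fraud (actual 1, predicted 0) is costed
# ten times higher than a false alarm (actual 0, predicted 1); diagonal costs are 0.
cost_matrix = np.array([[0.0, 1.0],
                        [10.0, 0.0]])

def instance_weights(y, cost_matrix):
    # weight each instance by the total cost of misclassifying its true class
    per_class_cost = cost_matrix.sum(axis=1)
    return np.array([per_class_cost[label] for label in y])

y_example = np.array([0, 0, 1, 0, 1])
print(instance_weights(y_example, cost_matrix))   # [ 1.  1. 10.  1. 10.]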
4.4 LightGBM LightGBM [11, 15] is an open-source gradient boosting framework that is used to build highly accurate models for various tasks such as classification, regression, and ranking. It is designed to be efficient, scalable, and highly optimized for handling large-scale datasets. One of the key advantages of LightGBM [11, 15] is its ability to handle imbalanced datasets, as it provides a parameter that automatically adjusts the weights of the different classes in the dataset, which makes it ideal to use as the base learner for a cost-sensitive classifier [1]. After selecting the relevant features using the genetic algorithm, LightGBM [11, 15] classifies the credit risk based on the selected features. A cost-sensitive [1] approach is used by applying different costs to false positives and false negatives [12] in order to address the imbalanced dataset and the cost of misclassification. The first step is to divide the dataset in the desired ratio (70:30) into training and testing sets. After that, LightGBM is employed to train a binary classification model using the chosen features on the training set. Then, to implement cost-sensitive learning [4, 13], the cost matrix is defined based on the severity of false positives and false negatives. The cost matrix [4] is passed to the LightGBM [11, 15] model using the is_unbalance parameter to apply cost-sensitive learning. Once the model is trained, its performance on the testing set is evaluated using metrics such as accuracy, precision, recall, and F1-score. The expected cost of misclassification is found using the defined cost matrix.
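A hedged sketch of this training-and-evaluation step, reusing the cost_matrix and instance_weights helper from the sketch above; X and y denote the optimized feature matrix and binary labels and are assumed to have been prepared earlier, and is_unbalance is used here simply as LightGBM's built-in flag for imbalanced binary targets:

from lightgbm import LGBMClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, confusion_matrix)

# X: data restricted to the selected features, y: binary labels (0 = genuine, 1 = fraud) -- assumed prepared
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)            # 70:30 split as described

model = LGBMClassifier(is_unbalance=True)                        # LightGBM's imbalance handling
model.fit(X_train, y_train,
          sample_weight=instance_weights(y_train, cost_matrix))  # cost-sensitive instance weights

y_pred = model.predict(X_test)
print("accuracy :", accuracy_score(y_test, y_pred))
print("precision:", precision_score(y_test, y_pred))
print("recall   :", recall_score(y_test, y_pred))
print("f1-score :", f1_score(y_test, y_pred))

# expected misclassification cost under the assumed cost matrix
cm = confusion_matrix(y_test, y_pred)                            # rows = actual, columns = predicted
print("total cost:", (cm * cost_matrix).sum())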
5 Results The main evaluation is done on three different optimization algorithms, namely the Genetic Algorithm [3], Particle Swarm Optimization, and the Flower Pollination Algorithm [1], each employing cost sensitivity. We analyzed how these algorithms perform with and without feature selection. Based on the analysis, it is found that both the cost-sensitive genetic algorithm and particle swarm optimization outperform their counterparts without feature selection in terms of precision, recall, accuracy, and F1-score. Between the cost-sensitive genetic algorithm and particle swarm optimization, the former algorithm
Table 3 Performance result

                       Without feature selection            With feature selection
Performance metrics    FPA (%)    PSO (%)    GA (%)         FPA (%)    PSO (%)    GA (%)
Accuracy               96         97         98             96         98         99
Precision              70         75         80             96         85         90
Recall                 85         80         85             99         75         80
F1-score               77         77         82             99         80         85
showed the best overall performance with feature selection, achieving an accuracy of 99%, precision of 90%, recall of 80%, and F1-score of 85%. The cost-sensitive PSO with feature selection obtained an accuracy of 98%, precision of 85%, recall of 75%, and F1-score of 80%. The cost-sensitive FPA [1] with feature selection obtained an accuracy of 96%, precision of 96%, recall of 99%, and F1-score of 99%. Between the cost-sensitive genetic algorithm and particle swarm optimization, the former algorithm also showed the best overall performance without feature selection, achieving an accuracy of 98%, precision of 80%, recall of 85%, and F1-score of 82%. The cost-sensitive PSO without feature selection obtained an accuracy of 97%, precision of 75%, recall of 80%, and F1-score of 77%, whereas the cost-sensitive FPA without feature selection obtained an accuracy of 96%, precision of 70%, recall of 85%, and F1-score of 77%. Among these, the proposed cost-sensitive genetic algorithm performed best in terms of accuracy, obtaining 99% with feature selection and 98% without feature selection (Table 3) [1].
6 Conclusion and Future Work Various methods to deal with the class imbalance problem were discussed in this paper. Data-level approaches [2] and how they use undersampling and oversampling methods to even out the data distribution were reviewed. The cost misclassification caused by data imbalance, which is commonly overlooked in data-level approaches, was also examined. Cost-sensitive learning [4] is investigated, as well as its role in eliminating cost errors, which helps achieve better classification. Here, the features are selected from the dataset using the correlation coefficient [7] through a Pearson matrix [20], which yields relevant features. These features are then optimized using the GA [3], a cost is assigned, and they are then classified using LightGBM [11, 15]. It is understood from the results that the proposed model manages to obtain a slightly higher accuracy compared with the other optimization algorithms. Even though the classification yields better accuracy, it has yet to be employed on a large dataset, as that would provide a better understanding of the performance of the technique under different conditions. Also, with optimization and a nature-inspired algorithm, there will be numerous parameters to tune. This could be addressed by fine-tuning the parameters for better performance of the algorithm. The technique
can be extended to incorporate new and relevant features, such as transaction time and location, to improve its performance and make it more robust against different types of fraud.
References
1. Singh A, Jain A (2020) Cost-sensitive metaheuristic technique for credit card fraud detection. J Inform Optim Sci 41:1–14
2. Singh A, Ranjan RK, Tiwari A (2021) Credit card fraud detection under extreme imbalanced data: a comparative study of data-level algorithms. J Exp Theor Artif Intell 34:571–598
3. Roseline JF, Naidu GB, Pandi VS, alias Rajasree SA, Mageswari N (2022) Autonomous credit card fraud detection using machine learning approach. Comput Electr Eng 102:1–11
4. Ling CS, Sheng VS (2011) Cost sensitive learning. Encyclopedia of machine learning. Springer, Boston, pp 231–235
5. Nilson Report (2020) Card fraud losses worldwide. 1187. https://nilsonreport.com/mention/1313/1link/
6. Arróyave R (2022) Data science, machine learning and artificial intelligence applied to metals and alloys research: past, present, and future. In: Encyclopedia of materials: metals and alloys, vol 4. Elsevier, pp 609–621
7. Paramasivam S, Leela Velusamy R (2022) Cor-ENTC: correlation with ensemble approach for network traffic classification using SDN technology for future networks. J Supercomput 78:1–25
8. Itoo G, Singh MS (2020) Comparison and analysis of logistic regression, Naive Bayes and KNN machine learning algorithms for credit card fraud detection. Int J Inform Technol 13:1–9
9. El-Naby AA, Hemdan EE-D, El-Sayed A (2022) An efficient fraud detection framework with credit card imbalanced data in financial services. Multimedia Tools Appl 81:1–22
10. Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP (2002) SMOTE: synthetic minority oversampling technique. J Artif Intell Res 15:24–37
11. Solorio-Fernández S, Carrasco-Ochoa JA, Martínez-Trinidad JF (2019) A review of unsupervised feature selection methods. Artif Intell Rev 53:1–42
12. Krishnan S (2021) Class imbalance problem and ways to handle it. https://medium.com/nerdfor-tech/class-imbalance-problem-and-ways-to-handle-it4861a195398a
13. Mienye ID, Sun Y (2021) Performance analysis of cost-sensitive learning methods with application to imbalanced medical data. Inform Med Unlocked 25:1–10
14. Höppner S, Baesens B, Verbeke W, Verdonck T (2022) Instance-dependent cost-sensitive learning for detecting transfer fraud. Eur J Oper Res 297:1–10
15. Ge D, Gu J, Chang S, Cai JH (2020) Credit card fraud detection using LightGBM model. In: International conference on e-commerce and internet technology (ECIT), China, pp 232–236
16. Viadinugroho RAA (2021) Imbalanced classification in python: SMOTE-Tomek links method. https://towardsdatascience.com/imbalancedclassification-in-python-smote-tomek-links-method-6e48dfe69bbc
17. Chang V, Doan LMT, Di Stefano A, Sun Z, Fortino G (2022) Digital payment fraud detection methods in digital ages and Industry 4.0. Comput Electr Eng 100:1–21
18. https://archive.ics.uci.edu/ml/datasets/Statlog+%28German+Credit+Data%29
19. Fischer JS (2021) Correlation-based feature selection in python from scratch. https://johfischer.com/2021/08/06/correlation-based-feature-selection-in-python-from-scratch/
20. Gupta A (2020) Feature selection techniques in machine learning. https://www.analyticsvidhya.com/blog/2020/10/feature-selection-techniques-in-machine-learning/
Comparative Analysis of Machine Learning Algorithms on Mental Health Dataset Lakshmi Prasanna and Shashi Mehrotra
Abstract Maintaining good health is very important for the well-being of a person. Due to prolonged working hours and hectic schedules, IT and non-IT professionals are prone to stress. Companies provide health incentives to employees to address this issue, but these are insufficient to alleviate stress. This paper compiles data from Open Source Mental Illness (OSMI) on the working conditions of both IT and non-IT employees. The data was further analyzed to find the essential attributes affecting a person's health. The paper explored the Decision tree, Bayesian classification, and Logistic Regression techniques for predicting employee stress. Experimental results show the Decision tree classification achieves a better result than the Bayesian classification and Logistic Regression. With these findings, industries can now focus on reducing stress and creating a more pleasant work environment for their employees. The primary objective of this study is to help the medical community by providing a novel method for generating relevant rules regarding stress and predicting a patient's stress level using machine learning algorithms. Keywords Decision tree · Naive Bayes theorem · Logistic regression · Confusion matrix
Lakshmi Prasanna (B)
Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur, Andhra Pradesh, India
e-mail: [email protected]
S. Mehrotra
Faculty of Engineering, College of Computing Science and Technology, Teerthanker Mahaver University, Moradabad, India
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_54
1 Introduction These days, depression, a mental disorder, is a concern in society. The affected person experiences excessive worry, uneasiness, and concern about the future. It is a type of disorder that can affect both the psychological and physical health of an individual [1]. It is due to the amygdala in the brain, which is responsible for generating fear responses [2].
Employees in the private sector are frequently depressed [5] or anxious due to long working hours and short time frames [3]. Providing a stress-free environment should be given priority for better employee welfare and higher productivity. One tactic for lowering stress is educating and assisting peers dealing with mental health concerns so they can support one another more effectively [4]. Experts usually train the peer supporters who participate in the internet therapy of individuals. The term depression is still not well understood, and depression is not always consistent throughout a person's life. Depression [5] can be short-lived, intermittent, or persistent, and may become more volatile as people age [7]. There are more or less evident signs of depression besides sadness or anxiety. For instance, elderly persons who report having depression might seem tired, have difficulties falling asleep, or seem cranky and angry. Additionally, disorientation or focus issues are more common in depressed people. The paper is structured as follows: the mental health conditions of employees in technical and non-technical organizations are described in Sect. 1. The literature review is presented in Sect. 2. Section 3 describes the machine learning algorithms used in this study. Section 4 pertains to data collection, data cleaning, and the confusion matrix. The experiment and result analysis are presented in Sect. 5. The work presented in this paper is concluded in Sect. 6.
2 Related Work Mental health is a very sensitive topic and is gaining importance these days. Researchers are working in this direction. This section discusses some related work. Katarya and Maan [1] researched the impact of job overtime on health. They observed that workload and a modernized (hectic) lifestyle affect health. Over time, people become more susceptible to mental illnesses such as anxiety. Data from a poll of working and non-working professionals were collected and preprocessed, and the following methods were used to forecast the employees' mental health: Decision trees, Random Forest, Naive Bayes, Logistic Regression, SVM, and KNN. D'monte et al. [2] designed a hybrid model using decision tree and logistic regression techniques to predict stress levels. The decision tree generates rules, and the logistic regression predicts the level of stress. Cognitive behavioral therapy is one of the finest therapies for anxiety disorders that considers the patient's surroundings and habits. Reddy et al. [3] tried to observe the various stress levels in working people and the factors leading to the causes of stress. Contributing factors to mental pressure include gender, whether the person is a tech employee, and family history. The authors claim that, with the help of these findings, businesses may now focus on finding better ways to make their work atmosphere less stressful and more pleasant.
Fortuna et al. [4] used guided and unguided conversations to assist those dealing with mental health issues. Guided chats use specific prompts, and the authors discovered that guided talks are more beneficial. Chen et al. [5] proposed a novel method to identify users with depression or at risk of it. They performed temporal analysis and assessments of eight fundamental emotions as features from Twitter tweets over time. Gaur et al. [6] attempted to identify people's mental health conditions by utilizing Reddit posts as a web-based intervention, taking social media networks into account. More and more people are using social media sites to discuss and look for guidance on mental health problems. In particular, Reddit users are free to discuss topics in subreddits whose organisation and content may be formally assessed, and the posts they make can be mapped to diagnostic categories for mental health. Yang et al. [7] developed a model to predict older persons' mental health by considering five distinct machine learning algorithms: System Framework, Gradient Boosting Machine, Keras-based Convolutional Neural Network, Regularized Greedy Forests, and Logistic Regression. Adamou et al. [8] worked on the early identification of patient referrals near a suicide incident. The data on patients who committed suicide between 2013 and 2016 were analyzed using structured and free-text medical notes. To help with risk assessment, they focused on exploring various text-mining techniques. Nishant et al. [9] used machine learning methods like Naïve Bayes (NB), Logistic Regression (LR), Support Vector Machine (SVM), Linear Discriminant Analysis (LDA), and two decision-tree classification algorithms, C5.0 and Random Forest, to identify diabetes, breast cancer, and heart disease.
3 Background This section describes the algorithms used in this paper. Machine learning (ML), a part of Artificial Intelligence, is a method of training a model. As there is a lot of data, it is very useful in the healthcare industry. The prediction model will eliminate human error and reduce diagnostic time if fed to an intelligent system correctly and trained accordingly. (i) Decision Tree A decision tree [1, 2, 7] is a tree-like model with which it is possible to model multiple choices, if-else statements, and decisions. In this case, decision trees determine the 20 features that are most likely to cause stress. Because these areas can now receive more attention, taking the necessary steps to control stress is highly beneficial. (ii) Naive Bayes Theorem The Naive Bayes algorithm [1, 2, 7], which uses Bayes' theorem, is a supervised learning algorithm for solving classification problems. The majority of applications involve classifying text using a high-dimensional training dataset.
It makes predictions based on an object's probability, acting as a probabilistic classifier. The Naive Bayes algorithm pertains to many well-known tasks, including spam filtering, sentiment analysis, and article grouping. For a feature vector

X = (x1, x2, x3, ..., xn)  (1)

the posterior probability of class y is

P(y | x1, ..., xn) = [P(x1|y) P(x2|y) ... P(xn|y) P(y)] / [P(x1) P(x2) ... P(xn)]  (2)
(iii) Logistic Regression: Logistic regression [2, 7] is a classification algorithm. It is used to forecast a binary outcome based on a number of independent variables. When working with binary data, the right kind of analysis to use is logistic regression.
4 Design and Methodology This section discusses our study methodology (Fig. 1). The dataset is gathered, and the required preprocessing is then performed. The data is later divided into training and testing datasets. The machine learning algorithms implemented include the decision tree, Naive Bayes, and logistic regression. Finally, comparisons are made between the algorithms' outcomes.
4.1 Data Description We used the Open Source Mental Illness (OSMI) dataset for our experiments.
Fig. 1 Workflow process
The nonprofit organization OSMI seeks to raise people's consciousness of mental health issues by educating business circles about how mental disorders affect a person's productivity, and to improve the safety and standards of the workplace for employees. The organization instructs HR and management on how to encourage mental health at work, describes the various forms of stress and mental disorders among tech professionals, and helps to identify the most significant contributing factors.
4.2 Data Cleaning The initial dataset consists of 27 attributes covering personal and professional lives and includes 1000 responses from various individuals. The aim is to verify the essential characteristics of the survey replies, since some of the worthy features can help to create a specific model. Twenty of the features which apply to our study are considered. "0" is assigned to all the cells with "NaN" (not a number). The label encoder transforms the categorical input into numbers. 20% of the replies are kept for testing, and the remaining 80% are used to train the model.
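A minimal sketch of this cleaning-and-training pipeline; the file name, the target column name ("treatment"), and the model settings are assumptions made for illustration:

import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression

df = pd.read_csv("osmi_survey.csv")          # assumed local export of the OSMI responses
df = df.fillna(0)                            # assign 0 to every NaN cell

for col in df.select_dtypes(include="object").columns:
    df[col] = LabelEncoder().fit_transform(df[col].astype(str))   # categories -> numbers

X = df.drop(columns=["treatment"])           # retained feature columns (names assumed)
y = df["treatment"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)

models = {"Decision tree": DecisionTreeClassifier(random_state=1),
          "Naive Bayes": GaussianNB(),
          "Logistic Regression": LogisticRegression(max_iter=1000)}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, model.score(X_test, y_test))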
4.3 Performance Evaluation This section describes the evaluation measures we used in our study. The confusion matrix presents the true positives, false positives, true negatives, and false negatives [9] (Fig. 2).
TN: True negative
FN: False negative
FP: False positive
TP: True positive

Fig. 2 Confusion matrix representation

Precision: Precision evaluates the model's performance; it gives the percentage of predicted positive outcomes that are correct.
Precision = TP/(TP + FP)

Recall: It depicts the proportion of positive outcomes correctly identified by the model.

Recall = TP/(TP + FN)

f-1 score: It measures the accuracy of the model on the dataset.

f-1 score = 2 * Precision * Recall/(Precision + Recall)
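As an illustrative numeric example (the counts are assumed, not taken from the experiments): with TP = 40, FP = 10, and FN = 5, precision = 40/(40 + 10) = 0.80, recall = 40/(40 + 5) ≈ 0.89, and f-1 score = 2 × 0.80 × 0.89/(0.80 + 0.89) ≈ 0.84.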
5 Result and Analysis See Fig. 3. Analysis: It is seen from Tables 1, 2 and 3 that the Decision Tree achieved a better result than the Bayesian classification model and Logistic Regression in terms of precision, recall, and f-1 score.

Fig. 3 Graphical bar presentation of precision, recall, and f1-score obtained from the models (comparative analysis of the Decision Tree, Naive Bayes, and Logistic Regression algorithms; y-axis: performance metrics)
Table 1 Result presentation of decision tree

                 Precision    Recall    f-score    Support    Accuracy
                 0.8          0.9       0.95       1.075      0.98
Macro avg.       0.6          0.55      0.45       1.259
Weighted avg.    0.83         0.85      0.9        1.259
Table 2 Result presentation of Bayesian classification

                 Precision    Recall    f-score    Support    Accuracy
                 0.76         0.84      0.77       1.84       0.9
Macro avg.       0.66         0.59      0.52       1.055
Weighted avg.    0.45         0.55      0.42       1.259
Table 3 Result presentation of logistic regression

                 Precision    Recall    f-score    Support    Accuracy
                 0.91         0.88      0.89       24         0.8
Macro avg.       0.72         0.74      0.73       315
Weighted avg.    0.99         0.98      0.99       315
6 Conclusion and Future Scope The paper presents a comparison of the Decision tree, Naive Bayesian, and Logistic Regression classification techniques for predicting a person's mental health. The experimental results show that the Decision tree obtained better results than the Bayesian classification model and the Logistic Regression model. It is observed that tech workers are more likely to experience stress. This research aids in identifying stressors among employees so that businesses can take the appropriate steps to improve the work environment. Additionally, it can assist doctors in determining the real cause of a patient's mental health issues and in providing appropriate treatment. In the future, we will implement deep learning and evolutionary models for the diagnosis of mental stress in employees. Later on, this work can be applied to predict other mental health issues such as anxiety and overthinking, thereby providing suitable environmental conditions for the well-being of an individual.
References
1. Katarya R, Maan S (2020, Dec) Predicting mental health disorders using machine learning for employees in technical and non-technical companies. In: 2020 IEEE international conference on advances and developments in electrical and electronics engineering (ICADEE). IEEE, pp 1–5
2. D'monte S, Tuscano G, Raut L, Sherkhane S (2018, Jan) Rule generation and prediction of anxiety disorder using logistic model trees. In: 2018 international conference on smart city and emerging technology (ICSCET). IEEE, pp 1–4
3. Reddy US, Thota AV, Dharun A (2018, Dec) Machine learning techniques for stress prediction in working employees. In: 2018 IEEE international conference on computational intelligence and computing research (ICCIC). IEEE, pp 1–4
4. Yang H, Bath PA (2019, May) Automatic prediction of depression in older age. In: Proceedings of the third international conference on medical and health informatics, pp 36–44
5. Chen X, Sykora MD, Jackson TW, Elayan S (2018, Apr) What about mood swings: identifying depression on twitter with temporal measures of emotions. In: Companion proceedings of the web conference, pp 1653–1660
6. Gaur M, Kursuncu U, Alambo A, Sheth A, Daniulaityte R, Thirunarayan K, Pathak J (2018, Oct) Let me tell you about your mental health! Contextualized classification of Reddit posts to DSM-5 for web-based intervention. In: Proceedings of the 27th ACM international conference on information and knowledge management, pp 753–762
7. Fortuna KL, Naslund JA, LaCroix JM, Bianco CL, Brooks JM, Zisman-Ilani Y, Muralidharan A, Deegan P Digital peer support mental health interventions for people with a lived experience of a severe mental illness: a systematic review. JMIR Mental Health 7(4):e16460
8. Adamou M, Antoniou G, Greasidou E, Lagani V, Charonyktakis P, Tsamardinos I, Doyle M (2018) Toward automatic risk assessment to support suicide prevention. Crisis
9. Nishant PS, Mehrotra S, Mohan B, Devaraju G (2020) Identifying classification techniques for medical diagnosis. In: ICT analysis and applications. Springer, Singapore
Comparative Analysis of Machine Learning Algorithms for Predicting Mobile Price D. N. V. S. Vamsi and Shashi Mehrotra
Abstract While buying a new mobile, deciding whether a mobile with given features is worth its price often takes time. The main theme of this paper is a comparative analysis of the Machine Learning algorithms Decision Tree, Random Forest, and Logistic Regression for predicting the price range of a mobile. The cost of the mobile depends on a set of features such as battery capacity (mAh), clock speed (GHz), memory size (GB), random access memory (RAM), number of cores (dual, quad, octa), etc. Our experimental results demonstrate that the random forest model outperformed the other models, the Decision tree and the regression model, used for the study. In our paper, we have implemented machine learning techniques to train models for predicting the mobile price range, and we have further extended this work to find out which technique among the ones used gives us the prediction with the highest accuracy. Keywords Random forest · Logistic regression · Decision tree
D. N. V. S. Vamsi (B)
Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur, Andhra Pradesh, India
e-mail: [email protected]
S. Mehrotra
College of Computer Science and Engineering, Teerthanker Mahaver University, Moradabad, India
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_55
1 Introduction Nowadays, people are crazy about mobiles; many mobiles from various brands, with different configurations and uses, are available in the market. Most people need help with their budget when buying a new mobile phone. Some people might get attracted to a mobile brand even though it is expensive but does not perform well, and some will not even give much thought to specifications like memory, processor, RAM, etc. [1]. For example, if a camera is good, its storage might be less; if both are good,
then the mobile performance might be slow because of fewer processor cores. To ease the customer's task, ML algorithms can be utilised to predict the price of a mobile using its specifications. Many researchers have devised such types of solutions using ML algorithms. Arora et al., in a project [2], have gathered information that takes into account variables like display size (inches), weight (grams), thickness (mm), internal memory capacity (GB), etc. Weka's ZeroR method, Naive Bayes algorithm, and J48 Decision Tree are used to forecast mobile phone prices. Many studies approximated used car prices in Mauritius; to predict prices, they employed several ML techniques, such as Multiple Linear Regression and k-nearest neighbours (KNN). Priyadarshini et al. [3] used chi-squared-based feature selection and took into account 10 out of 21 available characteristics, including internal memory, RAM, pixels, battery power, and pixel width. Different techniques are being utilised to forecast prices. SVM, RFC, and Logistic Regression produced accuracy results of 95%, 83%, and 76%, respectively. SVM, RFC, and Logistic Regression's accuracy increased to 97%, 87%, and 81%, respectively, after feature selection [4]. To forecast students' test performance, Priyam et al. [5] used educational data and three decision tree algorithms (ID3, C4.5, and CART). These algorithms forecast students' success on the final test based on their performance on internal exams. The author concluded that among the three algorithms, C4.5 is the best for small datasets since it offers greater efficiency and accuracy than the other two. Yang [6] performed two experiments. The first is about implementing a 3-way decision tree, and the second is about a 4-way decision tree, which is an extended version of the first experiment. These days, various algorithms, namely Support Vector Machine (SVM), Random Forest Classifier (RFC), and Logistic Regression, are being used for price predictions. The study aims to explore and identify a machine learning algorithm for mobile price prediction and to target the appropriate segment of customers. The paper presents three algorithms, namely Random Forest (RF), Logistic Regression (LR), and Decision Tree (DT), for predicting the mobile price, and performs a comparative analysis of them. The further sections of the paper are organised as follows: Sect. 2 explains the related work, Sect. 4 the methodology, Sect. 5 the experiment and results, and Sect. 6 the conclusion.
2 Related Work In her study, Singh [1] discovered several correlations and trends between various dependent and independent variables. The most significant insight gained was finding prominent characteristics for price prediction and classifying client preferences into economic groups. She trained the model using carefully chosen data because the categorisation involved supervised learning. Rochmawati et al. [7, 8] used the J48 and Hoeffding tree techniques, which produced precise results on whether a person has a mild, average, or severe case of COVID or no COVID. They also stated that the differences between the J48 algorithm and the Hoeffding Tree are not very noteworthy.
Benllarch et al. [9] investigated the performance of the Very Fast Decision Tree (VFDT) and Extremely Fast Decision Tree (EFDT) algorithms using sklearn packages in Python. The first algorithm develops the value of the Nmin parameter of the model, and the second algorithm deals with the equalising of classes in the dataset. Patel et al. [10] implemented the ID3, C4.5, and CART algorithms on datasets to find the best algorithm using the WEKA tool. This tool has an inbuilt collection of ML techniques for data mining, pre-processing, and classification. Zou et al. [11, 12] in their paper studied the mathematical form of the logistic model, defined the erf (error function), and improved the sigmoid function, thereby reducing the number of iterations and showing that the classification performs well. Their paper confirmed a prediction model which anticipates whether customers accept a specific vehicle or not. It can be used as a reference for studying the binary classification problem. Nishant et al. [13] in their paper implemented Naive Bayes, Support Vector Machines (SVM), Random Forest, Logistic Regression, C5.0, and Linear Discriminant Analysis models for medical diagnosis and experimented with them over three datasets, namely the Cleveland Heart Disease dataset, the Wisconsin Diagnostic Breast Cancer dataset, and the Pima Indians Diabetes dataset.
3 Background This section describes the Logistic Regression, Decision Tree, and Random Forest algorithms, which we used for our study. A decision tree is a strategy where every node represents a succession of data splits until a Boolean result is reached at a leaf. In most applications, every path in the decision tree is a decision that can be converted into human language or a programming language. By taking into account all paths, the whole tree corresponds to a compound Boolean expression which includes conjunctions and disjunctions to settle on a Boolean decision [14]. For setting up classification systems based on multiple covariates or for constructing prediction algorithms for a target variable, decision tree procedures employ a commonly used data-mining method. This method divides the data into sections resembling branches, creating an inverted tree with root, internal, and leaf nodes. Since the algorithm is non-parametric, it can handle large, complicated datasets without requiring a complex parametric design. When there are many examples, the data points can be split into training and validation datasets, using the validation dataset to find the correct tree size and the training dataset to build the decision tree model, which yields an efficient final model [6]. Confusion matrix It gives outputs such as True Positive, True Negative, False Positive, and False Negative for binary classification, but in our case we have taken four classes, so this is a multi-class classification [13].
Random Forest A random forest is an assortment of numerous decision trees. It is a supervised machine learning algorithm that is used widely in classification and regression problems. It builds decision trees on different samples and takes their majority vote for classification and their average in the case of regression [15]. Logistic Regression LR is one of the popular ML algorithms, which comes under the supervised learning method. It is used to predict a categorical dependent variable using a given set of independent variables [13]. Decision Tree It is an important and in-demand tool for classification and prediction. A Decision Tree is a tree-like structure where every internal node denotes an attribute test, every branch shows the result of the test, and every leaf node holds a class label [13].
4 Methodology This section discusses our study methodology (Fig. 1).
Fig. 1 Work flow process diagram
5 Data Description We collected the dataset from Kaggle for our experiments. It contains 2000 data points, 20 features, and a price range as the target variable. The price range (target variable) is classified into four categories, as shown in Table 2. The dataset contains 20 input features, of which six are categorical and the remaining 14 are continuous; both categorical and continuous features are listed in Table 1. Table 1 Features available in our data set
Name of the feature | Type of the feature
Has Bluetooth or not | Categorical
Battery power | Continuous
Clock speed | Continuous
Dual sim support | Categorical
Front camera megapixels | Continuous
Has 4G support or not | Categorical
Internal memory in gigabytes | Continuous
Mobile depth in cm | Continuous
Weight of the mobile | Continuous
Number of cores in processor | Continuous
Primary camera megapixels | Continuous
Pixel resolution height | Continuous
Pixel resolution width | Continuous
RAM | Continuous
Screen size in cm | Continuous
Screen width | Continuous
Battery standby time | Continuous
Has 3G support or not | Categorical
Has touchscreen or not | Categorical
Has Wi-Fi option or not | Categorical
Table 2 Categorical representation of the class
Price range | Categorical representation
0 | Low
1 | Medium
2 | High
3 | Very high
6 Random Forests A random forest is an ensemble of many decision trees. Again, sklearn is used for importing random forests. The criterion is 'Gini', and the max depth is set to 100. It achieved an accuracy of 88%, which is superior to the decision tree and logistic regression algorithms. The Gini index favours large partitions and tries to expand the tree based on nodes (sub-nodes) with the least impurity possible [15].
7 Experiment and Results We implemented all three models, Decision Tree, Logistic Regression and Random Forest, using the same dataset in Python. Algorithms: the sklearn Python library was used for implementing the ML algorithms. We used the following algorithms for the comparative performance evaluation: Logistic Regression, Decision Tree, and Random Forest. The Random Forest criterion is 'Gini', and the max depth is set to 100. Performance Evaluation Metrics This section describes the metrics used for the performance evaluation. Accuracy It gives the fraction of the total samples that were correctly classified by the classifier [9]. This can be calculated using the equation: (TP + TN)/(TP + TN + FP + FN). Precision It gives the fraction of positive predictions that are actually positive [9]. To calculate it, use the following formula: TP/(TP + FP). Recall It gives the fraction of all positive samples that are correctly predicted as positive by the classifier [16]. It is also known as the True Positive Rate (TPR) or Sensitivity. For calculating it, use: TP/(TP + FN). F1-score It merges precision and recall into one measure [17]; mathematically, it is the harmonic mean of precision and recall. It can be calculated as follows: F1-score = 2 ∗ [(Precision ∗ Recall)/(Precision + Recall)] = 2TP/(2TP + FP + FN)
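A minimal sketch (assuming scikit-learn) of these metrics for the four-class case, macro-averaging the per-class precision, recall, and F1-score; the label arrays below are hypothetical.

```python
# Accuracy, macro-averaged precision, recall and F1-score for 4-class labels.
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_test = [0, 1, 2, 3, 3, 2, 1, 0, 2, 3]   # hypothetical true labels
y_pred = [0, 1, 2, 3, 2, 2, 0, 0, 2, 3]   # hypothetical predicted labels

print("Accuracy :", accuracy_score(y_test, y_pred))
print("Precision:", precision_score(y_test, y_pred, average="macro"))
print("Recall   :", recall_score(y_test, y_pred, average="macro"))
print("F1-score :", f1_score(y_test, y_pred, average="macro"))
```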
Fig. 2 Presentation of data correlation
Correlation matrix The correlation matrix [13] is a matrix which shows the correlation between every possible pair of features in the dataset. The correlation matrix for the features in the dataset is plotted and shown in Fig. 2. From Fig. 3 we can observe that the highest accuracy, i.e., 88%, is achieved by the Random Forest algorithm, followed by the Decision Tree algorithm with 83% accuracy. Logistic Regression performed the worst among the algorithms used for the experiment, achieving an accuracy of only 60%. The Random Forest model outperformed all the other models on all the measures, Precision, Recall, and F1-score; its performance metrics improved by a good margin when compared to Logistic Regression and Decision Tree, as shown in Fig. 3.
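A minimal sketch (assuming pandas and matplotlib) of how such a correlation matrix can be computed and plotted; the file name train.csv is only a stand-in for the Kaggle mobile price dataset.

```python
# Compute pairwise feature correlations and plot them as a heatmap.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("train.csv")          # assumed: 2000 rows, 20 features + price_range
corr = df.corr()                       # pairwise Pearson correlations

fig, ax = plt.subplots(figsize=(10, 8))
im = ax.matshow(corr, cmap="coolwarm")
ax.set_xticks(range(len(corr.columns)))
ax.set_yticks(range(len(corr.columns)))
ax.set_xticklabels(corr.columns, rotation=90)
ax.set_yticklabels(corr.columns)
fig.colorbar(im)
plt.tight_layout()
plt.show()
```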
8 Conclusion and Future Scope Uncertainty in buying a mobile is a problem experienced by many users. The paper presents a comparative analysis of three machine learning techniques: logistic regression, decision tree, and random forests. We established models to predict the price range based on the features of the mobile. Our experimental results show that the Random Forest model performed best among the Decision Tree, Logistic Regression and Random Forest models for mobile price prediction. In future research work we will extend our work to customer segmentation based on income, to target the suitable customer segment for each mobile price range.
(Bar chart: "Comparative Analysis of Logistic Regression, Decision Tree and Random Forest Algorithms"; y-axis: measures in percentage (0–100); x-axis: performance metrics (Accuracy, Precision, Recall, F1 score) for Logistic regression, Decision tree, and Random forest.)
Fig. 3 Graphical presentation of performance measures of the models
References 1. Singh R (2021) Exploratory data analysis and customer segmentation for smartphones 2. Aora P, Srivastava S (2022) Mobile price prediction using WEKA 3. Kalaivani KS, Priyadharshini N, Nivedhashri S, Nandhini R (2021) Predicting the price range of mobile phones using machine learning techniques. AIP Conf Proc 2387(1):140010 4. Nowozin S, Rother C, Bagon S, Sharp T, Yao B, Kohli P (2011) Decision tree fields. In: 2011 international conference on computer vision. IEEE, pp 1668–1675 5. Priyam A, Abhijeet GR, Rathee A, Srivastava S (2013) Comparative analysis of decision tree classification algorithms. Int J Current Eng Technol 3(2):334–337 6. Yang F-J (2019) An extended idea about decision trees. Int Conf Comput Sci Comput Intell 2019:349–354. https://doi.org/10.1109/CSCI49370.2019.00068 7. Rochmawati N, Hidayati HB, Yamasari Y, Yustanti W, Rakhmawati L, Tjahyaningtijas HP, Anistyasari Y (2020) Covid symptom severity using a decision tree. In: 2020 third international conference on vocational education and electrical engineering (ICVEE). IEEE, pp 1–5 8. Benllarch M, El Hadj S, Benhaddou M (2019) Improve extremely fast decision tree performance through training dataset size for early prediction of heart diseases. In: 2019 international conference on systems of collaboration big data, internet of things & security (SysCoBIoTS), pp 1–5. https://doi.org/10.1109/SysCoBIoTS48768.2019.9028026 9. Vaca C, RiofrÍo D, Pérez N, BenÍtez D (2020) Buy & sell trends analysis using decision trees. In: 2020 IEEE Colombian Conference on Applications of Computational Intelligence (IEEE ColCACI 2020). IEEE, pp 1–6 10. Patel HH, Prajapati P (2018) Study and analysis of decision tree-based classification algorithms. Int J Comput Sci Eng 6(10):74–78
11. Zou X, Hu Y, Tian Z, Shen K (2019) Logistic regression model optimization and case analysis. In: 2019 IEEE 7th international conference on computer science and network technology (ICCSNT), pp 135–139. https://doi.org/10.1109/ICCSNT47585.2019.8962457 12. Segal MR (2004) Machine learning benchmarks and random forest regression 13. Nishant PS, Mehrotra S, Mohan B, Devaraju G (2020) Identifying classification techniques for medical diagnosis. In: ICT analysis and applications. Springer, Singapore, pp 95–104 14. Safavian SR, Landgrebe D (1991) A survey of decision tree classifier methodology. IEEE Trans Syst Man Cybern 21(3):660–674 15. Higham NJ (2002) Computing the nearest correlation matrix—a problem from finance. IMA J Numer Anal 22(3):329–343 16. Abo-Tabik M, Costen N, Darby J, Benn Y (2019) Decision tree model of smoking behaviour. In: 2019 IEEE smart world, ubiquitous intelligence & computing, advanced & trusted computing, scalable computing & communications, cloud & big data computing, internet of people and smart city innovation (smart world/SCALCOM/UIC/ATC/CBDCom/IOP/SCI), pp 1746–1753. https://doi.org/10.1109/SmartWorld-UIC-ATC-SCALCOM-IOP-SCI.2019.00311 17. Praveena MDA, Krupa JS, SaiPreethi S (2019) Statistical analysis of medical appointments using decision tree. In: 2019 fifth international conference on science technology engineering and mathematics (ICONSTEM), pp 59–64. https://doi.org/10.1109/ICONSTEM.2019.8918766
Investigating Library Usage Patterns Inutu Kawina and Shashi Mehrotra
Abstract Libraries are experiencing a shift in their usage patterns due to the introduction of digital materials and technology. This study aims to investigate the library usage patterns among library patrons. The methodology used the San Francisco dataset to visualize and analyze library usage, and the usage patterns reveal that most people using the libraries are retired staff and staff members. These library users have access to the library materials, are allowed to borrow up to a specific number of materials compared to other library users, and are aware of the library's services. In this regard, libraries need to market their services and sensitize their users to the services provided in order to increase library usage. Keywords Academic libraries · Patrons · Book circulation · Library usage · Library materials technology
1 Introduction Libraries are transitioning in how they deliver information to users and how they operate. With these changes comes the introduction of technology, which has completely changed library usage. These technological advancements show that people now spend more time searching online materials than reading a book, indicating a preference for the internet over libraries [1]. This paper therefore aims to investigate library usage among various patrons to find circulation patterns for library materials. I. Kawina (B) Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Guntur, India e-mail: [email protected] S. Mehrotra College of Computer Science and Engineering, Teerthanker Mahaveer University, Moradabad, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_56
1.1 Definition of Terms According to Velmurugan and Vel [2], an academic library is a repository of knowledge connected to a university or college to support its curriculum. Hassan and Ekoja [3] define book circulation as the procedure of lending library materials to users. The library staff can identify the books lent out to a patron at the circulation desk. Bohyun [4] defines a patron as a library user who accesses the library to use the services it offers. The rest of the paper is organized as follows: Sect. 2 discusses related work, Sect. 3 is the background, and Sect. 4 presents the design and methodology of our study. Section 5 presents the experiment and the analysis of the findings, and finally Sect. 6 is the conclusion.
2 Related Work This chapter presents other researchers' past work on related subjects. It paves the path for additional research by offering guidelines and addressing gaps that other researchers still need to address. According to Yu [5], users need to find the information they require on their own to satisfy their needs. Most current library review practices focus on occurrences and aggregate measures, and these statistics hide essential patterns. In his paper, he focused on using appropriate data mining techniques to analyze the circulation data of the internal information management system and make recommendations, classifying the readers based on grades, majors and other factors. He further studied the readers' interests to identify patterns in book borrowing. The data mining tool used was Microsoft SQL Server 2005; the Microsoft clustering algorithm categorized the readers, the Microsoft association rules algorithm evaluated the readers' interests, and the Microsoft time series algorithm evaluated the peaks and valleys in borrowing. The findings revealed that the books that readers borrow change with their grades. To determine book circulation, Marasinghe [6] examined the lending patterns of library items. The study used a quantitative research approach and secondary data from the Koha Library Management System for 2014–2018. Koha, an integrated library management system, manages circulation, cataloguing and classification, reports on library materials, and the return of library materials; therefore, its circulation module was used to identify the lending patterns of library items. The study used graphs and descriptive statistics like percentages, frequencies, and correlation analyses to depict the data. The results revealed that both the overall annual book borrowings and the annual book borrower rate had declined over time.
Samson and Swanson [7] aimed to identify library services and resources and solicit feedback from campus workers to improve outreach programs. All non-faculty and non-administrative support workers received surveys. The findings revealed that staff members need empowerment through knowledge and awareness of the library, and that support staff are enthusiastic about becoming familiar with and using the library's resources and services. The findings provide librarians with suggestions for facilitating an efficient outreach program to enhance the ability of support staff to perform their duties and share information with the students and faculty with whom they interact.
3 Background Libraries are known to be the pathway to knowledge and have moulded how the world is growing. They play an essential role in education and organizations, as they serve as a basis for learning and connecting people [8]. Over the years, libraries have seen a decline in their usage by their patrons. Many factors contribute to this reduction, which includes the introduction of digital materials such as electronic books and information in CD and DVD format. These materials have changed the behavior of library users to a great extent. Many users use the library sparingly because they prefer to use information readily available on the internet. In response to the increase in demand for digital materials, many libraries have adopted the practice of purchasing electronic materials and employed the use of technology to access information in the library. However, the decline continues even though these measures are being implemented [9].
4 Design and Methodology The methodology encompasses the procedure and processes employed to analyze the dataset and come up with concrete solutions to the problem (Fig. 1).
4.1 Data Collection/Description The research used the San Francisco library usage dataset, available on Kaggle.
(Workflow: data collection (library usage dataset) → data preprocessing (data cleaning, feature selection, missing value treatment) → experiment (visualization using graphs) → analysis (interpretation of results).)
Fig. 1 Graphical representation of the process work flow of the study
4.2 Data Description The dataset used for the study consists of 420,000 records of library users; each column includes statistics on the types of users using the library. These columns are patron-type code, patron-type definition, Age range, home library code, home library definition, active circulation month, notice preference code, notice preference definition, outside of the country, provided an email address and supervisor district. The patron type code was assigned based on the library patrons, which meant that 0 represented adults, 1 represented juveniles, 2 represented young adults, 3 represented seniors, 5 represented staff, 8 represented visitors, and 55 represented retired staff.
4.3 Data Preparation Anaconda Navigator version 2022.10, which supports Python 3.9, was used for data preprocessing and visualization. The 420,000 records contained some columns unrelated
to the study; therefore, it was necessary to drop them. Since a patron-type code was available to identify the type of patron utilizing the library, there was a need to eliminate the patron-type definition. Other columns dropped included home library code, home library definition, active circulation month, notice preference code, notice preference definition, outside of the country, provided an email address, and supervisor district. The dataset was subjected to data visualization following data cleaning. Data visualization was used to graphically represent the information on charts regarding the patron types with total checkouts and total renewal.
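A minimal sketch (assuming pandas and matplotlib) of the preprocessing and visualization steps just described; the file name and the column names Patron Type Code, Total Checkouts, and Total Renewals are assumed stand-ins for the San Francisco library usage dataset.

```python
# Keep only patron type and usage counts, then plot average usage per patron type.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("Library_Usage.csv")  # assumed file name

# Drop the columns unrelated to the study by selecting only the ones we need.
df = df[["Patron Type Code", "Total Checkouts", "Total Renewals"]].dropna()

# Average checkouts and renewals per patron type, as plotted in Figs. 2 and 3.
usage = df.groupby("Patron Type Code")[["Total Checkouts", "Total Renewals"]].mean()
usage.plot(kind="bar", subplots=True, layout=(1, 2), figsize=(12, 4), legend=False)
plt.tight_layout()
plt.show()
```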
5 Experiment and Analysis This chapter presents the findings from the study and discusses the most important findings found in the dataset so that insight can be gained and decisions can be made from the discovered results.
5.1 Experiment This section presents the findings obtained from the 420,000-record sample of the San Francisco library usage dataset. The results in Fig. 2 indicate that the retired staff showed more library checkouts than other patrons, with 1100 total checkouts. They are followed by staff with a total of 900 checkouts and visitors with 400 total checkouts of library materials. The lowest numbers are observed among the adults, with 180 total checkouts, and the juveniles, with 190 total checkouts. Figure 3 shows that the highest renewals were among staff members, with more than 580 total renewals, followed by retired staff with 500 total renewals. The lowest figures were recorded among juveniles, with 40 total renewals, and young adults, with 50 total renewals.
5.2 Analysis The findings in Fig. 2 reveal that retired staff had the highest number of checkouts than other library users, followed by staff members. The retired staff are the library users who were part of the library staff but have reached their retirement age. This group of users is usually given access to the library services after their retirement; they have access and can borrow up to a certain number of books depending on the policies that the library has set up for this group of users. The staff members are those who work in the library and, just like the retired staff, are given privileges on how
Fig. 2 Graphical presentation for an average score of Patron usage of library materials
Fig. 3 Graphical representation of average score of patron type and total renewal
they can access library materials or the services offered by the library; they can borrow a certain number of books based on the policies set up for them. Library employees are typically allowed to borrow more books than others and may also take books home to read before returning them. Juveniles, adults, and senior adults use the library less; this is a result of a lack of interest on the part of these library users and a lack of awareness of the library's services. Another factor contributing to the low usage of libraries is the preference of library users for digital information accessed through computer technologies over searching for information in the library [10]. Figure 3 displays statistics on total renewals of library materials, which show that the highest renewals are among library staff rather than retired staff, owing to the library staff being stationed at the library; these people are aware of the rules governing the collection and renewal of library materials. The lowest totals of renewals are among juveniles and young adults, because they have not yet finished reading the books they borrowed or because they forget to renew the materials. The total circulation and total renewal results show that staff and retired staff have higher total checkouts and total renewals than other categories of library users. Proper sensitization on the use, circulation, and renewal of library materials is required, and marketing of library services is needed to promote the services offered and increase the circulation of library materials. When users need to renew, reminders should be put in place for individuals; these could be messages through the library system, direct messages, or phone calls to the users. Another measure is prolonging the loan period of library materials, to give users time to finish reading them.
6 Conclusion and Future Works The size of libraries has increased over the past ten years, yet most libraries have seen a decline in the usage patterns of their services. The reason is that library items now exist in various formats and are used in technologically advanced ways, such as online digital data, and libraries also make content available digitally on CDs, etc. As a result, the study sought to visualize how library resources are utilized during circulation and renewal. The results show that the most active users are current and former staff members, whereas other users use circulation and renewal of library materials less frequently. To make the library useful for its intended users, it is therefore necessary to promote the available services and make users aware of how they can utilize the library to the fullest. Future research would concentrate on the enhancement and improvement of library and library resource consumption through the use of technology like artificial intelligence. Data mining techniques can be used to increase library circulation
and give users better access to the collection’s resources. This would not only promote library use but also assist library staff in managing the library to its fullest potential.
References 1. Wenborn C (2018) How technology is changing the future libraries. The Wiley network. https://www.wiley.com/en-us/network/research-libraries/libraries-archives-databases/libraryimpact/how-technology-is-changing-the-future-of-libraries 2. Velmurugan S, Vel S (2012) Academic libraries in India. Int J Res Manage 3(2):31–35. https://www.researchgate.net/publication/337242769 3. Hassan DS, Ekoja II (2015) Appraisal of circulation routine duties in academic libraries. Inform Manager 15(1–2):12–16. https://www.ajol.info/index.php/tim/article/view/144885 4. Bohyun (2012) What do libraries call users, and what do library users think of themselves in relation to libraries? Library Hat. http://www.bohyunkim.net/blog/ 5. Yu P (2011) Data mining in library reader management. In: 2011 International conference on network computing and information security, vol 2. IEEE, pp 54–57 6. Marasinghe MMIK (2020) Trend analysis of library book circulation of the Open University of Sri Lanka. J Univ Librarians Assoc Sri Lanka 23(1):57–72. https://doi.org/10.4038/jula.v23i1.796657 7. Samson S, Swanson K (2014) Support your staff employees: they support the academy. Montana, USA 8. White B (2012) Guaranteeing access to knowledge: the role of libraries. WIPO magazine. https://www.wipo.int/wipo_magazine/en/2012/04/article_0004.html 9. Eje OC, Dushu YT (2018) Transforming library and information services delivery using innovation technologies. Library Philos Pract (e-journal) 10. Iwhiwhu EB, Okorodudu OP (2012) Public library information resources, facilities and services: user satisfaction with Edo state central library, Benin-city, Nigeria. Library Philos Pract (e-Journal)
Advanced Application Development in Agriculture—Issues and Challenges Purnima Gandhi
Abstract Voluminous data is generated from complex data sources with varied characteristics in agriculture. The agricultural data is made available in a variety of formats, is generated and collected using diverse techniques, and is exposed through numerous data sources. Data management in agriculture presents more difficult problems for emerging nations than for industrialized nations. The research scope is formulated by evaluating the challenges and opportunities related to advanced application development in agriculture for developing countries. The paper discusses the research evolution of data management, applications, and analytics algorithms development in agriculture. The advanced data systems, applications, and architectures with big data characteristics in agriculture are reviewed. It highlights the significant constraints in providing data management-related solutions for application development in agriculture. The paper poses research questions; and discusses the possible answers for future endeavors. Keywords Big data · Big data analytics · Machine learning · Dark data · IoT · Farm management system
1 Introduction In Indian agriculture, dedicated storage, archival, and retrieval systems are required due to the massive volume of data generated from unverified sources at impulsive rates with a variety of shapes, sizes, formats, and architectures. A massive amount of data emanates from various complex sources like machine-generated data, data services, social media data, Internet of Things (IoT) based data, etc. Data management is critical as it performs various crucial tasks such as ingestion, storage, processing, analytics, integration, visualization, etc. Many research efforts have been made by industries and researchers to develop data management tools and technologies. These P. Gandhi (B) Institute of Technology, Nirma University, Ahmedabad, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_57
products specifically perform data management-related tasks such as storage, processing, integration, visualization, ingestion, etc. Advanced systems such as big data analytics (BDA) have received substantial consideration in various application domains such as marketing and e-commerce, finance, banking, and agriculture. BDA has also fulfilled modern users' demands to some extent: it can generate fast and accurate results to aid faster and more accurate decision-making, and it has the potential to process massive-scale data and produce comprehensive insight via remote sensing, geospatial processing, advanced analytics algorithms, cloud resources, and advanced storage systems. It can be one of the best solutions to bridge the gap between end-users and information through user-friendly, easy-to-access, cost-effective, and multilingual interfaces. The definition of complexity in data changes continually, depending on how far existing tools and technologies are stretched to handle the data efficiently. Data management and analytics are the complex processing of complex data. Data management tools help stakeholders such as government agencies, research organizations, and corporations in agriculture discover valuable information from massive datasets. Data management strategies help modern users effectively perform big data administration, organization, and governance, and they ensure high data quality and accessibility for analytics applications.
1.1 Research Questions The following research questions guide the research: Q1. "What are the major constraints and open challenges in developing advanced applications and systems in agriculture?" Q2. "How can disparate data sources, formats, and sizes in the agriculture domain be dealt with?" Q3. "How can the intervention of advanced technologies make the big data management process effective and efficient?" The main contributions of the paper are: 1. The paper highlights the potential areas of advanced application development in agriculture. 2. It discusses the inventions and research made in application development in agriculture. 3. It identifies the major limitations of advanced application development in agriculture. 4. It discusses open challenges and probable solutions for future endeavors.
2 Advanced Analytics in Agriculture It is clear how crucial the agriculture industry is to India given that there are more than a billion mouths to feed. For more than 58% of rural households that are dependent on agriculture, it is one of the most crucial sources of income. Approximately 1.2 billion people are fed by the Indian agricultural system today, and 1.54 billion people are expected to be fed by 2035. Agriculture data is distinguished from other types of data in terms of technology by its volume (data in large quantity), variety (datasets with structural heterogeneity), velocity (diverse speed/data rate is necessary and relevant), veracity (reliability is crucial), variability (variation in data flow rate), and complexity (complexity in the context of the datasets—a large volume of data coming from disparate sources and needs to be linked, connected, and correlated). The process of making accurate and timely decisions requires a great deal of skill on the part of agriculture stakeholders like end-users, farmers, policymakers, domain experts, academicians, data scientists, and researchers. Agricultural data is widely accessible in both private and public data forms. Through web portals, websites, online repositories, data-driven apps, and other means, the government, private agencies and organizations, researchers, industries, and end consumers largely disseminate these data. These are complex, diverse, cluttered, and multidimensional data. Emerging technologies such as big data analytics analyze such vast volumes of structured and unstructured data to produce more accurate and timely results that will aid in more precise managerial decisions and recommendations.
2.1 Potential Areas of Advanced Analytics Applications in Agriculture People now recognize the value of big data and big data analytics owing to contemporary trends in the application domain. Understanding the usefulness of big data analytics is the biggest struggle. Moreover, big data analytics is used to manage and analyze agricultural data so that the end-users can quickly reach valuable information to use potential opportunities and achieve benefits. Modern technologies help improve productivity through recommendations and a decision support system. Numerous opportunities to create cutting-edge, appropriate, information-intensive and precise applications in agriculture have been made possible by big data analytics. Following is a description of the potential areas of applications for agricultural management systems. Advanced Sensor Technology Systems The advanced sensor-based systems viz. remote sensing applications such as soil health monitoring, watershed management, crop coverage, and crop health estimates
are implemented to develop robust agriculture systems. For example, Unmanned Aerial Vehicles (UAVs) are more accurate in capturing multispectral images than satellites. The UAV applications are used to assess crop health damage and yield. Data Analytics Applications Data analytics applications such as spatial and GIS-based applications are designed to improve natural resource management using advanced tools and technologies. For example, weather analytics applications to improve crop production, reduce cost during crop development, etc.; the prediction systems to estimate crop yield based on spatial–temporal data. Advanced Systems Advanced agriculture applications such as supply chain management are used for agro products to enhance traceability from soil to market. The workforce requirement problem in farms can be solved by using autonomous systems. These systems establish communications with other systems to resolve the issue. Crop growth models are used to optimize pest control measures to estimate yields, harvest dates, potential pests, and disease outbreaks. Recommendation Systems It generates recommendations on agricultural inputs in real-time or batch processing. The information related to soil, fertilizer, seeds, pesticides, water, and crops is collected from local farms in real-time and analyzed using advanced analytics tools such as spatial tools. Decision Support Systems It helps agro users to improve their decisions related to farming activities. For example, it allows farmers to find water requirements/irrigation based on real-time data such as weather, crop development, and soil conditions.
3 Advanced Systems for Agriculture—Literature Survey The agricultural domain’s diverse and complex data has presented numerous storage and processing challenges. Modern technologies, such as data science, big data analytics, artificial intelligence, etc. are more successful and popular in performing massive-scale data management effectively and efficiently than traditional systems. These technologies are fast, fault-tolerant, and scalable. This section depicts the research evolution that has been made in application development in agriculture. The paper has reviewed the existing agricultural systems and applications developed using contemporary technologies such as data mining, machine learning, big data analytics, artificial intelligence, data science, etc. It also identifies key research gaps in the existing systems.
The paper reviews conventional information systems and applications. The web, mobile, and web-GIS-based applications and systems are evaluated. These systems are demonstrated and implemented for agriculture using traditional tools and technologies. The study identified that conventional technologies are often insufficient to depict a broad view of analytics in a natural context. The review is extended to highlight the application development in agriculture with modern technologies. Table 1 represents the state-of-the-art emerging systems and applications in the agriculture domain.
4 Major Constraints—Big Data Management in Agriculture Nowadays, modern technologies have gained substantial popularity amongst agro-users. Agricultural stakeholders and users seem to be very enthusiastic about integrating smart and advanced farming techniques to improve sustainability in agriculture. In contrast to this user demand and interest, the applicability and adoption of big data analytics and key advancements in agriculture are very limited compared to other industries and fields such as banking, finance, and health.
4.1 Diversity in Analytics The existing big data systems are developed with a fixed focus. For example, these systems mainly propose theoretical aspects of applications of modern technologies, conceptual frameworks, platforms, and architectures. The development of big data applications exists only in limited sub-domains such as remote sensing, spatial analytics, crop prediction, and weather analytics. Moreover, the existing big data applications mainly target only one of the characteristics of big data, i.e., volume: the applications are designed only around large-volume datasets, which automatically excludes the other characteristics of big data from analytics. The existing systems show little interest in developing applications with real-life datasets, as it is very hard to gather datasets from disparate sources such as dark datasets. This leads to inadequate solutions after analytics.
4.2 Advanced Architectures and Infrastructures The agriculture data is complex, unpredictable, and multivariate; these data must be interpreted and analyzed properly to address the challenges related to information
Table 1 Advanced systems and applications in agriculture
Advanced systems and applications | Complex data types, formats, tools, and technologies | Agricultural area | Objective and features
Shah et al. [1] | Big data analytics algorithms for prediction | Crop | The Crop Recommendation System is implemented using big data tools and technologies
Sahu et al. [2] | Hadoop, MapReduce programming model | Crop | A system is proposed using a predictive analytics algorithm to identify crops based on various soil parameters
Majumdar et al. [3] | Big data mining | Crop | The various data mining applications, such as efficient yield estimation, optimized crop production, etc., are implemented and compared on big data
Pantazi et al. [4] | Machine learning tools | Crop—wheat | The prediction model is proposed to find the yield of wheat crop within-field variation based on soil data, crop growth characteristics, and satellite imagery data
Singh et al. [5] | IoT, mobile applications, AI | Crop | The innovative methodology is developed to increase yield
Kumar et al. [6] | Quality centric architecture | Crop | A Quality Centric Architecture is proposed for the crop production system
Setiawan et al. [7] | ICT platform | Crop harvesting | Designed and built a database-based platform to trace sugarcane harvesting activities
Armstrong et al. [8] | Data mining tools | Soil | Comparative analysis has been made between existing data mining techniques to find the most effective solution
Foix et al. [9] | Sensing network | Leaf | The simulation-based leaf probing is done using a task-driven active sensing network
Papadopoulos et al. [10] | Fuzzy models | Fertilizer | The fuzzy decision support system is built and implemented to curate the knowledge base and analysis model to simulate fertilizer balance
Bendre et al. [11] | Predictive analytics, MapReduce programming model | Weather, precision agriculture | Identified the data sources for precision agriculture using various ICT components. Weather forecasting using various big data algorithms is performed
AgroDSS—Kukar et al. [12] | Data analytics with predictive models (case study—pest population dynamics) | Farm management | AgroDSS is a cloud-based portal that allows farmers to upload their data and find efficient solutions utilizing various data analytic tools and techniques
Rajeswari et al. [13] | Cloud and mobile-based big data/predictive analytics on IoT data, MapReduce programming model | Fertilizer requirements, analysis of the crops, market, and stock requirements for the crop | An integrated agricultural model is proposed to perform various big data analytics algorithms. Proposes a methodology to improve crop production and regulate the cost of agricultural products using big data analytics
Sharma et al. [14] | Predictive analytics | Information systems | Compared classification methods GWO (Grey Wolf Optimization) and SVM (Support Vector Machine) and proposed a hybrid method to analyze agricultural data
Sørensen et al. [15] | Big data | Farm Information Management System | A farm information management system is built and implemented. It has also been compared with existing systems
Peisker et al. [16] | No tools and technologies are specified | Rural development | A conceptual framework is developed for rural development. An integrated analytics approach is used to design the framework
Shalini et al. [17] | XLMINER and WEKA, data mining algorithms for spatial data | Seismic information, disaster management | Discuss various data mining algorithms for disaster management
Ma et al. [18] | Machine learning algorithms for big data | Plants | The paper discussed machine learning for data-driven discovery
Chaudhary et al. [19] | Data integration techniques | Agro advisory system | A Service-Oriented Architecture is proposed. The Spatial Data Integration techniques for Agro Advisory Systems are applied to perform analytics on data
Shah et al. [20, 21] | Integrated architecture using various big data tools such as NoSQL databases, Spark | Crop, weather, market, farm management system | Developing Big Data Analytics Architecture for Spatial Data
Shah et al. [22] | An integrated platform built on top of various big data tools | Crop, weather, market | A Big Data Analytics and Integration Platform for Agriculture is built and implemented. The different agro applications are developed as proof of concept
Shah et al. [23] | Cassandra and Spark | Information system, crop, weather, market | The Spark-based agricultural information system is implemented for spatial and non-spatial data storage and processing
Shah et al. [24] | Cassandra, Spark, MongoDB | Agro advisory system | Big data analytics architecture for the agro advisory system is proposed. The crop management system is implemented as a proof of concept
systems, recommendations, agro-advisory systems, smart agriculture, etc. Furthermore, the major barriers to the development of big data applications in agriculture are a lack of developers’ APIs, tools, infrastructures, semantics, data standards, data interpretation, integrated and advanced data models, single access points for private and public data, technical expertise, and, finally, data. The research and innovation mainly address the problems related to weather or crop production prediction and forecasting. Developing open-source, cost-effective, and highly efficient architectures and infrastructures is important to enhance big data application development.
4.3 Complex Analytics of Complex Data Types Complex data types, like geographical, spatial–temporal, sensor, and other types, are very important in many data-driven applications. Because big data techniques and technologies are used inefficiently, these data types are primarily ignored or underutilized. Therefore, handling such complex data would be impossible with a
single big data tool. Very little work has been done on developing and implementing integrated infrastructures, platforms, and architectures for big agricultural data management.
4.4 Open Challenges It is very crucial for data scientists to prepare agricultural data for analytics, especially when data is hidden and not easily accessible by end users. A substantial amount of agricultural data is dark data. It is very challenging to manage such data because of the lack of technological solutions as well as missing digital data. This contributes to the lack of benchmark datasets in agriculture.
5 Conclusion The paper has identified major research gaps and feasible solutions in big data management and application development in agriculture. Big data management in agriculture has vast scope for improvement. The proprietary systems are robust but very expensive. On the other hand, the existing open-source systems are less appealing, less focused, and very limited. The intervention of various emerging technologies in agriculture, with effective adoption and applicability, is one of the best solutions to improve agriculture management systems. It is necessary to find new data types, sources, and formats in order to construct complex applications in the agricultural area. To execute advanced analytics, real-world datasets, particularly complex datasets with streaming data, will be gathered and stored in a data repository. Algorithms for data analytics and visualization that analyze real data streams like weather, disasters, etc. will be developed in real-time and close to real-time. Analytical services like crop recommendation, crop disease alert, crop price prediction, fertilizer recommendations, rainfall prediction, supply chain management, agro-inputs procurement, etc., are to be implemented. These services can generate multilingual and customized solutions, such as alerts based on adverse events and weather-based crop calendars. The development of big data applications utilising complex data types, such as spatial data, is required. For instance, find the aggregated weather by agro-climatic zone and the number of regions with rainfall below the threshold value.
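As one hedged illustration of such a spatial aggregation, the sketch below uses PySpark (one possible big data tool among many); the file name, the column names agro_climatic_zone, region_id and rainfall_mm, and the threshold value are assumptions rather than a real dataset.

```python
# Average rainfall per agro-climatic zone and count of regions below a threshold.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("rainfall-by-zone").getOrCreate()
weather = spark.read.csv("weather_observations.csv", header=True, inferSchema=True)

THRESHOLD_MM = 50.0  # assumed rainfall threshold

# Average rainfall per region first, then aggregate regions within each zone.
region_rain = (weather
               .groupBy("agro_climatic_zone", "region_id")
               .agg(F.avg("rainfall_mm").alias("region_avg_rainfall_mm")))

zone_summary = (region_rain
                .groupBy("agro_climatic_zone")
                .agg(F.avg("region_avg_rainfall_mm").alias("zone_avg_rainfall_mm"),
                     F.sum(F.when(F.col("region_avg_rainfall_mm") < THRESHOLD_MM, 1)
                            .otherwise(0)).alias("regions_below_threshold")))
zone_summary.show()
```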
References 1. Shah P, Hiremath D, Chaudhary S (2005) Big data analytics for crop recommendation system. In: 7th international workshop on big data benchmarking (WBDB 2015), New Delhi, organized by ISI Delhi center, and IIPH Hyderabad, 14–15 December 2015 2. Sahu S, Chawla M, Khare N (2017) An efficient analysis of crop yield prediction using Hadoop framework based on random forest approach. In: 2017 international conference on computing, communication and automation (ICCCA). IEEE, pp 53–57 3. Majumdar J, Naraseeyappa S, Ankalaki S (2017) Analysis of agriculture data using data mining techniques: application of big data. J Bigdata 4(1):20 4. Pantazi XE, Moshou D, Alexandridis T, Whetton RL, Mouazen AM (2016) Wheat yield prediction using machine learning and advanced sensing techniques. Comput Electron Agric 121:57–65 5. Singh RS, Gelmecha DJ, Ayane TH, Sinha DK (2022) Functional framework for IoT-based agricultural system. In: AI, Edge and IoT-based Smart Agriculture. Academic Press, pp 43–69 6. Kumar V, Hiremath D, Chaudhary S (2022) An architecture for quality centric crop production system. In: Obi Reddy GP, Raval MS, Adinarayana J, Chaudhary S (eds) Data science in agriculture and natural resource management. Springer Singapore, Singapore, pp 127–141 7. Setiawan A, Wahyuddin S, Rijanto E (2019) An ICT platform design for traceability and big data analytics of sugarcane harvesting operation. In: 2019 international conference on computer, control, informatics and its applications (IC3INA). IEEE, pp 181–186 8. Armstrong L, Diepeveen D, Maddern R (2007) The application of data mining techniques to characterize agricultural soil profiles. In: 6th Australasian data mining conference (AusDM 2007) 9. Foix S, Alenyà G, Torras C (2018) Task-driven active sensing framework applied to leaf probing. Comput Electron Agric 147:166–175 10. Papadopoulos A, Kalivas D, Hatzichristos T (2011) Decision support system for nitrogen fertilization using fuzzy theory. Comput Electron Agric 78(2):130–139 11. Bendre MR, Thool RC, Thool VR (2015) Big data in precision agriculture: weather forecasting for future farming. In: 2015 1st international conference on next generation computing technologies (NGCT). IEEE, pp 744–750 12. Kukar M, Vraˇcar P, Košir D, Pevec D, Bosni´c Z (2019) AgroDSS: a decision support system for agriculture and farming. Comput Electron Agric 161:260–271 13. Rajeswari SKRKA, Suthendran K, Rajakumar K (2017) A smart agricultural model by integrating IoT, mobile and cloud-based big data analytics. In: 2017 international conference on intelligent computing and control (I2C2). IEEE, pp 1–5 14. Sharma S, Rathee G, Saini H (2018) Big data analytics for crop prediction mode using optimization technique. In: 2018 fifth international conference on parallel, distributed and grid computing (PDGC). IEEE, pp 760–764 15. Sørensen CG, Fountas S, Nash E, Pesonen L, Bochtis D, Pedersen SM, Basso B, Blackmore SB (2010) Conceptual model of a future farm management information system. Comput Electron Agric 72:37–47 16. Peisker A, Dalai S (2015) Data analytics for rural development. Indian J Sci Technol 8(S4):50– 60 17. Shalini R, Jayapratha K, Ayeshabanu S, Selvi GC (2017) Spatial big data for disaster management. In: IOP conference series: materials science and engineering, vol 263. IOP Publishing, p 042008 18. Ma C, Zhang HH, Wang X (2014) Machine learning for big data analytics in plants. Trends Plant Sci 19(12):798–808 19. 
Chaudhary S, Kumar V (2018) Service oriented architecture and spatial data integration for agro advisory systems. In: Geospatial infrastructure, applications and technologies: India case studies, pp 185–199 20. Shah P (2019) Developing big data analytics architecture for spatial data. Springer, Cham
21. Shah P, Chaudhary S (2018) Big data analytics framework for spatial data. In: Big data analytics: 6th international conference, BDA 2018, Warangal, India, December 18–21, 2018, Proceedings 6. Springer International Publishing, pp 250–265 22. Shah P, Chaudhary S (2018) Big data analytics and integration platform for agriculture. In: Proceedings of research frontiers in precession agriculture (extended abstract), AFITA/WCCA 23. Shah P, Hiremath D, Chaudhary S (2017) Towards development of spark based agricultural information system including geo-spatial data. In: 2017 IEEE international conference on big data (big data). IEEE, pp 3476–3481 24. Shah P, Hiremath D, Chaudhary S (2016) Big data analytics architecture for agro advisory system. In: 2016 IEEE 23rd international conference on high performance computing workshops (HiPCW). IEEE, pp 43–49
Naïve Bayesian Classifier Committee (NBCC) Using Decision Tree (C4.5) for Feature Selection Manmohan Singh , Monika Vyas , Kamiya Pithode , Nikhat Raja Khan , and Joanna Rosak-Szyrocka
Abstract This paper employs the naïve Bayesian classifier (NBC), a simple probabilistic classifier for quantitative data that applies Bayes' theorem with strong independence assumptions. Depending on the precise formulation of the underlying probability model, different variants of the classifier are obtained. NBC assumes that the attributes are conditionally independent given the class, and Bayes' theorem is then applied to choose the class with the highest probability for each instance. The concept of the naïve Bayesian classifier committee (NBCC) draws inspiration from boosting. NBCC is a technique that generates multiple naïve Bayesian classifiers in sequential trials for a specific classification task. Each committee member is an NBC built on a different portion of the attributes available for the task. Attribute subsets are selected based on the distribution of the attributes and are accepted or modified using leave-one-out cross-validation evaluation. During classification, all committee members contribute to determining the most suitable class. The purpose of the project is to overcome the above-defined problem and to generate a naïve Bayesian classifier committee which may provide better accuracy on relatively large datasets. The training data can be shuffled, a sample taken from it, and the C4.5 decision tree algorithm (DTA) applied to that sample to select an attribute subset on which an NBC can be built. The approach aims to improve the accuracy of NBC on relatively large datasets and may work well with datasets having correlated attributes. Keywords Naïve Bayesian classifier (NBC) · Naïve classifier committee · Classification · c4.5 · Decision tree M. Singh (B) · K. Pithode · N. R. Khan Computer Science and Engineering Department, IES College of Technology, Bhopal, MP 462044, India e-mail: [email protected] M. Vyas Civil Engineering Department, IES College of Technology, Bhopal, MP 462044, India J. Rosak-Szyrocka Department of Production Engineering and Safety, Faculty of Management, Czestochowa University of Technology, Czestochowa, Poland © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), ICT with Intelligent Applications, Lecture Notes in Networks and Systems 719, https://doi.org/10.1007/978-981-99-3758-5_58
1 Introduction Naïve Bayesian classifiers (NBC) are simple, robust to noise, and effective; these merits mean that they can be employed in various classification tasks, and they have been a fundamental approach in information retrieval for a long time. To use naïve Bayesian learning (NBL), probabilities must be estimated for every class and attribute value, relating the estimated probabilities to the corresponding frequencies in the training data; for numeric attributes, a probability distribution is assumed from which the quantitative values are drawn. Despite its simplicity, the classifier provides surprisingly accurate results in machine learning. NBC assumes that all attributes of an instance are independent of each other given the class, and the class probability is then found by applying Bayes' theorem [1–4].
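As a minimal, hedged sketch of this idea, the snippet below (assuming scikit-learn's CategoricalNB and purely illustrative toy data) treats each integer-encoded attribute as conditionally independent given the class and returns the class with the highest posterior probability.

```python
# Naïve Bayes on toy categorical data: posteriors via Bayes' theorem with the
# conditional-independence assumption.
import numpy as np
from sklearn.naive_bayes import CategoricalNB

# Four categorical attributes encoded as integers, plus a class label per row.
X = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 0, 1, 1],
              [1, 1, 0, 0],
              [0, 1, 1, 0]])
y = np.array([0, 1, 0, 1, 0])

nbc = CategoricalNB()
nbc.fit(X, y)
print(nbc.predict_proba([[0, 1, 1, 1]]))  # class posteriors P(class | attributes)
print(nbc.predict([[0, 1, 1, 1]]))        # class with the highest posterior
```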
2 Literature Review The authored analysis is measurement the task of instance of class takes into now is depending on the data set. If applying simple the Bayesian classifier is commonly using the data set instance and its logical attribute [1]. Domingos and Pazzani [5] explained about the surprisingly is to be batter accuracy it exhibits in many more attribute dependences upon data set. 1. The Bayes rate: lowest error rate. 2. A classifier is locally optimal: is similar or equal Bayesian classifier. 3. A classifier is globally optimal: classifier is globally optimal. Rish [6] specified systematic analysis of classification efficiency depend on the several class of data set. Zheng [7] explained effective model of improvement of using algorithm efficiency is depend on every member of class and it’s all so depended on the subset of same class data. There are various approach of accuracy estimation. Kohavi [8] specify the various approach of accuracy estimation. For estimating the final accuracy of a classifier, it requires an estimation method which has low bias and low variance. The bias is a method to estimate a parameter which is defined as the expected value minus the estimated values. An unbiased estimation method is a method that has zero bias. Variance describe the component of error that result from random variation in training data and measures how sensitive an algorithm is to changes the training data. Oza and Tumer [9] deal with uncertainty is a key problem in many decisions making situations. Probability theory and decision theory provide a standard framework for dealing with uncertainty in a normative fashion. Classification technique (CT) for finding a suitable model that to identify and differentiate between related data to construction of suitable model for the prediction of objects whose class labels are unknown. The use of the algorithm is that problem of in Bayesian algorithm when we are using all useful attribute independence interface is continue to give improvement in performance, it can give edge to poor performance of the NBC when all useful
Naïve Bayesian Classifier Committee (NBCC) Using Decision Tree …
639
attributes treated as independent, redundant strong attribute has twice influence. This can result in poor estimation [7]. Ratanamahatana and Gunopulos [10] address this issue, a number of NBC model can be generated using different for all portion is based on the distribution probability, NBC is viewed as a committee member using eave-one-out in training set the purpose is cross validation, and accepted only if it is less than the NBC constructed using the entire feature otherwise discarded. While there is various feature subset selection technique may work better in terms of accuracy for larger datasets and with the dataset which have highly correlated prediction [11, 12].
3 Proposed System Creating Subsets In this paper we use the training dataset to run C4.5 with feature selection by subset, shuffling the data in each trial. It is important to use a proper portion of the data: with a very small portion the resulting decision tree is not representative, while with a very large portion the decision tree becomes overly complex. To address this, the attributes selected in each trial are added to the feature subset, and the resulting subset is applied to the next NBC.
4 Proposed Algorithm NBC_C4.5 (Att, Dtraining , T ) Generate a naïve NBC using a class selected by C4.5-DT. INPUT: Att: a set of attributes, D-training: Att and classes, T: for different object (trials) N sits default value. OUTPUT: committee. METHOD: (1) Algorithm using Att and Dtraining , called NBbase . (2) ε N Bbase = Leave-1-out-Evaluation (NBbase , Dtraining ). (3) Add NB-base into class as the I member which uses in all attributes NB0 = NBbase . (4) l = 1 and t = 1. (5) WHILE (t ≤ T ){-Data-set and take 15% sample. -Run C4.5 on that 15% of sample. -Att-subset = first trial it contains sample attributes that appear only in the first 3. -NBtemp = Algo-classifier Att subset . -εt = Leave-1-out-Evaluation (NBtemp , Att subset ). -IF (εt > ε N Bbase ) NBt = NBbase t = t + 1}else {NBt = NBtemp t = t + 1}} (6) T = t – 1.
(Workflow: dataset → training data → preprocess → samples → C4.5 decision tree induction → attribute subset → naïve Bayesian classifier committee; sets of hypotheses are produced along the way.)
Fig. 1 Decision tree of Committee learning algorithm using C4.5
Figure 1 shows the process of selecting an attribute subset and building an NBC in a trial. To do this, a C4.5 decision tree is applied to a 15% sample from the training data, and whichever attributes appear in the first three levels of the tree are selected as the attribute subset for the first trial, on which a naïve Bayesian classifier can be applied. In each subsequent trial the algorithm shuffles the training data and selects a new 15% sample, and the attribute subset is the union of the subsets selected in all previous trials and the subset that appears in the current trial. Once the attribute subset is created, NBCC needs almost no additional computation to build the NBC, since all the necessary probabilities are available from the generation of the class-based NBC. To determine whether an NBC is accepted as a committee member, leave-one-out cross-validation is applied to the training data, and the NBC is accepted or discarded based on its error rate, denoted εt, compared with the error rate of NBbase, εNBbase. This algorithm generates T NBCs considering all attributes and their subsets. These NBCs are then used for classification; where no NBC performs better than NBbase, the committee effectively reduces to NBbase. The committee of NBCs always includes NBbase.
4.1 Illustration of the NBC_C4.5 Algorithm

This section provides a demonstration of the NBC_C4.5 algorithm, showing its behavior on the sample training instances listed in Table 1. Because this sample dataset contains only a small number of tuples, no committee is generated here (the committee is intended for relatively large datasets); the purpose is only to give an overview of the algorithm. Executing C4.5 on this dataset results in the tree shown in Fig. 2. According to the proposed algorithm, the attributes that appear in the first three levels of the tree, namely tear-prod-rate, astigmatism and spectacle-prescription, are selected, and the naïve Bayesian classifier is then applied to this reduced set of attributes.
Table 1 Training instances for illustration

Age | Spectacle-prescrip | Astigmatism | Tear-prod-rate | Class (contact-lenses)
Young | Myope | No | Reduced | None
Young | Myope | No | Normal | Soft
Young | Myope | Yes | Reduced | None
Young | Myope | Yes | Normal | Hard
Young | Hypermetrope | No | Reduced | None
Young | Hypermetrope | No | Normal | Soft
Young | Hypermetrope | Yes | Reduced | None
Young | Hypermetrope | Yes | Normal | Hard
Pre-presbyopic | Myope | No | Reduced | None
Pre-presbyopic | Myope | No | Normal | Soft
Pre-presbyopic | Hypermetrope | No | Reduced | None
Pre-presbyopic | Hypermetrope | No | Normal | Soft
Pre-presbyopic | Hypermetrope | Yes | Reduced | None
Pre-presbyopic | Hypermetrope | Yes | Normal | Hard
Presbyopic | Myope | No | Reduced | None
Presbyopic | Myope | No | Normal | None
Presbyopic | Myope | Yes | Reduced | None
Presbyopic | Myope | Yes | Normal | Hard
Presbyopic | Hypermetrope | No | Reduced | None
Presbyopic | Hypermetrope | Yes | Normal | Soft
Presbyopic | Hypermetrope | Yes | Reduced | None
Presbyopic | Hypermetrope | Yes | Normal | None
Fig. 2 Committee learning algorithm using C4.5
Applied to this reduced set of attributes, the classifier produces the following results. All five examples of class soft are classified correctly. Of the four examples of class hard, three are classified correctly and one is misclassified as none. Of the fifteen examples of class none, twelve are classified correctly and three are misclassified, two as hard and one as soft. Applying the same dataset without attribute subset selection misclassifies seven instances, whereas the former misclassifies only four, which results in better prediction. A hedged reproduction of this comparison is sketched below.
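The comparison can be reproduced in spirit with the following sketch, which encodes the Table 1 instances, runs a categorical naïve Bayes with leave-one-out cross-validation on all four attributes and on the tree-selected subset, and counts the misclassified instances. This is illustrative scikit-learn code rather than the paper's Weka setup, so the counts obtained this way need not match the figures reported above.

```python
# Hedged reproduction of the attribute-subset comparison on the Table 1 instances.
import numpy as np
from sklearn.naive_bayes import CategoricalNB
from sklearn.model_selection import LeaveOneOut, cross_val_predict

RAW = """\
young myope no reduced none
young myope no normal soft
young myope yes reduced none
young myope yes normal hard
young hypermetrope no reduced none
young hypermetrope no normal soft
young hypermetrope yes reduced none
young hypermetrope yes normal hard
pre-presbyopic myope no reduced none
pre-presbyopic myope no normal soft
pre-presbyopic hypermetrope no reduced none
pre-presbyopic hypermetrope no normal soft
pre-presbyopic hypermetrope yes reduced none
pre-presbyopic hypermetrope yes normal hard
presbyopic myope no reduced none
presbyopic myope no normal none
presbyopic myope yes reduced none
presbyopic myope yes normal hard
presbyopic hypermetrope no reduced none
presbyopic hypermetrope yes normal soft
presbyopic hypermetrope yes reduced none
presbyopic hypermetrope yes normal none"""

rows = [line.split() for line in RAW.splitlines()]
columns = list(zip(*rows))
codes = [{v: i for i, v in enumerate(sorted(set(c)))} for c in columns]
data = np.array([[codes[j][v] for j, v in enumerate(r)] for r in rows])
X, y = data[:, :4], data[:, 4]   # age, spectacle-prescrip, astigmatism, tear-prod-rate

def loo_misclassified(X, y):
    nb = CategoricalNB(min_categories=X.max(axis=0) + 1)   # guard against unseen categories
    return int((cross_val_predict(nb, X, y, cv=LeaveOneOut()) != y).sum())

print("all four attributes :", loo_misclassified(X, y), "misclassified")
print("tree-selected subset:", loo_misclassified(X[:, [1, 2, 3]], y), "misclassified")
```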
4.2 Method Detail

1. shuffle(): Selects data randomly from the dataset. It is implemented as a C program with three arguments: the name of the data file to be shuffled, the name of the new file that receives the randomly selected data, and a number used as the seed for the random selection.
2. getStructure(): Returns the structure of the dataset as an empty set of instances, provided the data is in ARFF format.
3. GainRatioAttributeEval(): One of several attribute selection criteria that can be used to determine the best test attribute at a node.
4. buildClassifier(): Builds the classifier from a training dataset. It accepts the instances as an argument and creates a committee, that is, a set of classifiers over the data, each using a different attribute subset, measuring the error rate on the training dataset; NBC is used as the base classifier to create the committee of classifiers sequentially.
5. crossValidateModel(): Takes as arguments the base classifier, the set of instances, the number of folds, and a random-number generator used to shuffle the dataset (which can be NULL if shuffling is not required). The dataset is randomly divided into k subsets, ensuring that each subset has a proportionate representation of the classes found in the full dataset. An equivalent is sketched after this list.
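The shuffling and class-proportionate splitting performed by crossValidateModel() corresponds to stratified k-fold cross-validation; the sketch below shows an equivalent in scikit-learn, as an illustration only rather than the Weka code used in the paper.

```python
# Illustrative scikit-learn equivalent of the crossValidateModel() behaviour above.
from sklearn.model_selection import StratifiedKFold, cross_val_score

def cross_validate_model(clf, X, y, folds=10, seed=1):
    # shuffle=True with a fixed random_state plays the role of the random-number argument;
    # StratifiedKFold keeps each subset's class proportions close to the full dataset's.
    cv = StratifiedKFold(n_splits=folds, shuffle=True, random_state=seed)
    scores = cross_val_score(clf, X, y, cv=cv)
    return scores.mean(), scores.std()

# usage: mean_acc, std_acc = cross_validate_model(some_classifier, X, y, folds=10)
```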
Table 2 Ecoli dataset information

S. no | Attribute name | Type | Description
1 | mcg | Numerical | McGeoch's method for signal sequence recognition
2 | gvh | Numerical | Von Heijne's method for signal sequence recognition
3 | lip | Numerical | Von Heijne's Signal Peptidase II consensus sequence score
4 | chg | Numerical | Presence of a charge on the N-terminus of predicted lipoproteins
5 | aac | Numerical | Discriminant analysis score based on the amino acid content of outer membrane and periplasmic proteins
6 | alm1 | Numerical | Score of the ALOM membrane-spanning region prediction program
7 | alm2 | Numerical | ALOM score for the sequence after removing putative cleavable signal regions
8 | Class | Predictive | Protein localization site
5 Datasets

5.1 Ecoli

The Ecoli dataset was obtained from an expert system that predicts protein localization sites in Gram-negative bacteria. Each example contains a sequence name and the scores of several signal sequence recognition methods, from which the protein localization site is predicted. Because the underlying signal sequence recognition methods are known, this dataset is particularly useful for testing constructive induction and methods that discover protein localization sites. Table 2 lists the eight attributes of the Ecoli dataset together with their types and descriptions, and Table 3 lists each field name with its possible and distinct values.
5.2 Adult Classification Dataset

A simple database containing 14 attributes, extracted from the Census Bureau database.
Table 3 Ecoli dataset’s value detail S. no.
Field name
Possible values
Distinct values
1
mcg
Continuous
–
2
gvh
Continuous
–
3
lip
Continuous
–
4
chg
Continuous
–
5
aac
Continuous
–
6
alm1
Continuous
–
7
alm2
Continuous
–
8
Class
Om, omL, imL, imS, cp, im, pp, imU
7
Table 4 lists the Adult dataset's 15 field names together with their possible values and the number of distinct values of each field.
5.3 Breast Cancer (Wisconsin) Classification Dataset

The breast cancer database was obtained from the University of Wisconsin Hospitals, Madison, from Dr. William H. Wolberg. It contains 10 attributes, of which attributes 2 through 10 are used to represent instances. Each instance has one of two possible classes: benign or malignant. Table 5 lists the 11 fields of the breast cancer dataset together with their possible values and the number of distinct values of each field.
5.4 Performance Study on Classifier Accuracy

The graph in Fig. 3 shows the accuracy of the model tested with both the NBC_C4.5 and C4.5 methods; accuracy is maintained or improved across all datasets. Table 6 reports the accuracy by method and by number of tuples for the Ecoli, breast cancer and adult datasets (13,440, 58,716 and 32,561 tuples, respectively): with NBC_C4.5 the accuracy is 81.76% on Ecoli, 95.28% on breast cancer and 80.07% on adult.
Table 4 Adult dataset information

S. no | Field name | Possible values | Distinct values
1 | Age | Continuous | –
2 | Workclass | Private, Self-emp-not-inc, Self-emp-inc, Federal-gov, Local-gov, State-gov, Without-pay, Never-worked | 8
3 | Fnlwgt | Continuous | –
4 | Education | Bachelors, Some-college, 11th, HS-grad, Prof-school, Assoc-acdm, Assoc-voc, 9th, 7th–8th, 12th, Masters, 1st–4th, 10th, Doctorate, 5th–6th, Preschool | 16
5 | Education-num | Continuous | –
6 | Marital-status | Married-civ-spouse, Divorced, Never-married, Separated, Widowed, Married-spouse-absent, Married-AF-spouse | 7
7 | Occupation | Tech-support, Craft-repair, Other-service, Sales, Exec-managerial, Prof-specialty, Handlers-cleaners, Machine-op-inspct, Adm-clerical, Farming-fishing, Transport-moving, Priv-house-serv, Protective-serv, Armed-Forces | 14
8 | Relationship | Wife, Own-child, Husband, Not-in-family, Other-relative, Unmarried | 6
9 | Race | White, Asian-Pac-Islander, Amer-Indian-Eskimo, Other, Black | 5
10 | Sex | Female, Male | 2
11 | Capital-gain | Continuous | –
12 | Capital-loss | Continuous | –
13 | Hours-per-week | Continuous | –
14 | Native-country | United-States, Cambodia, England, Puerto-Rico, Canada, Germany, Outlying-US(Guam-USVI-etc.), India, Japan, Greece, South, China, Cuba, Iran, Honduras, Philippines, Italy, Jamaica, Vietnam, Mexico, Portugal, Ireland, France, Dominican-Republic, Laos, Ecuador, Taiwan, Haiti, Columbia, Hungary, Guatemala, Nicaragua, Scotland, Thailand, Yugoslavia, El-Salvador, Trinidad & Tobago, Peru, Hong, Holand-Netherlands | 40
15 | Salary (class) | > 50 k, ≤ 50 k | 2
The accuracy of the model has also been tested for the NBC_C4.5 and NB methods on datasets of different sizes, and the accuracy increases substantially. Table 7 reports the accuracy of NBC_C4.5 versus NB by number of tuples; the results indicate a significant improvement in accuracy, with NB reaching 94.00% on breast cancer against 95.28% for NBC_C4.5. Figure 4 shows the accuracy of both methods on the datasets of varying sizes and confirms this improvement.
Table 5 Breast cancer dataset information

No. | Field name | Possible values | Distinct values
1 | code_number | Continuous | –
2 | cl_thick | 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 | 10
3 | cell_size | 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 | 10
4 | cell_shape | 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 | 10
5 | mar_adhe | 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 | 10
6 | single_epi_cell_size | 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 | 10
7 | bar_nuclei | 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 | 10
8 | bl_chromatin | 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 | 10
9 | norm-nucloeli | 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 | 10
10 | Mitoses | 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 | 10
11 | Class | 2 (benign), 4 (malignant) | 2
Fig. 3 Accuracy performance (NBC_C4.5 versus C4.5): accuracy (%) on the Ecoli, Breast Cancer and Adult datasets
Table 6 Accuracy performance (NBC_C4.5 versus C4.5)

Dataset | No. of tuples | NBC_C4.5 (%) | C4.5 (%)
Ecoli | 13,440 | 81.76 | 77.45
Breast cancer | 58,716 | 95.28 | 94.12
Adult | 32,561 | 80.07 | 78.90
Table 7 Accuracy performance (NBC_C4.5 versus NB)

Dataset | No. of tuples | NBC_C4.5 (%) | NB (%)
Ecoli | 13,440 | 81.76 | 76.21
Breast cancer | 58,716 | 95.28 | 94.00
Adult | 32,561 | 80.07 | 78.34
Fig. 4 Accuracy performance (NBC_C4.5 versus NB): accuracy (%) on the Ecoli, Breast Cancer and Adult datasets
Table 8 Accuracy performance (NBC_C4.5 versus NBTree)

Dataset | No. of tuples | NBC_C4.5 (%) | NBTree (%)
Ecoli | 13,440 | 81.76 | 77.40
Breast cancer | 58,716 | 95.28 | 94.02
Adult | 32,561 | 80.07 | 79.60
Table 8 reports the accuracy of NBC_C4.5 versus NBTree by number of tuples. The results again indicate a significant improvement in accuracy; NBTree reaches 94.02% on breast cancer against 95.28% for NBC_C4.5 (Fig. 5).

Fig. 5 Accuracy performance (NBC_C4.5 versus NBTree): accuracy (%) on the Ecoli, Breast Cancer and Adult datasets
5.5 Performance Study on Accuracy (k-Fold Cross-Validation)

Figure 6 shows the impact of various values of k in k-fold cross-validation on the accuracy of NBC_C4.5 for datasets of various sizes. The experiment shows that accuracy improves slightly as the number of folds increases, since a larger number of folds reduces the standard deviation up to a point. Table 9 reports the accuracy on the Ecoli, breast cancer and adult datasets for different numbers of folds; with 10 folds the accuracies improve to 81.76%, 95.42% and 80.09%, respectively. A sketch of this fold-count study appears below.
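The fold-count study can be mimicked with the sketch below, which evaluates a given classifier with k = 2, 5, 10 and 20 folds and reports the mean accuracy and its standard deviation. This is illustrative scikit-learn code; the numbers in Table 9 come from the paper's Weka-based NBC_C4.5.

```python
# Illustrative sketch: effect of the number of cross-validation folds on accuracy.
from sklearn.model_selection import StratifiedKFold, cross_val_score

def fold_study(clf, X, y, fold_counts=(2, 5, 10, 20), seed=1):
    for k in fold_counts:
        cv = StratifiedKFold(n_splits=k, shuffle=True, random_state=seed)
        scores = cross_val_score(clf, X, y, cv=cv)
        # more folds -> larger training portions, typically slightly higher mean accuracy
        print(f"{k:2d} folds: accuracy {scores.mean():.4f} (std {scores.std():.4f})")

# usage: fold_study(some_classifier, X, y)
```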
Fig. 6 Accuracy performance of NBC_C4.5 with different numbers of folds (2, 5, 10, 20): accuracy (%) on the Ecoli, Breast Cancer and Adult datasets
Table 9 Accuracy performance (with different values of fold)

No. of folds | Ecoli (%) | Breast cancer (%) | Adult (%)
2 | 80.98 | 95.30 | 80.03
5 | 81.42 | 95.40 | 80.08
10 | 81.76 | 95.42 | 80.09
20 | 81.77 | 95.24 | 80.05
6 Conclusions

The proposed NBCC builds a committee of naïve Bayesian classifiers in sequential trials, using a C4.5 decision tree for feature selection: in each trial the training data are shuffled, a sample is drawn, C4.5 is applied to the sample, and the attributes appearing in the first three levels of the resulting tree are added to the attribute subset. The C4.5 tree is used only for creating the attribute subset; it serves to improve NBC learning, and the approach may also perform better than other Bayesian algorithms, with better results expected on considerably larger datasets. Several improvements to the NBCC with C4.5-based feature selection are possible. The proposed algorithm uses a voting scheme without weights; appropriate weighting techniques may increase the efficiency of the algorithm. Other feature selection techniques, such as filter and wrapper methods, may be used to select the attribute subset before applying the NBC. The current implementation of the algorithm is limited to categorical or nominal attributes, but applying a proper discretization technique, such as entropy-based discretization, would enable the algorithm to handle numerical or continuous attributes as well (a minimal sketch of such a preprocessing step follows). The algorithm uses different attribute subsets to create an NBCC in sequential trials; alternatively, a set of classifiers could be built in which each member is trained on a different sample of the training data, varying the sample size relative to the original dataset.
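As an indication of how the discretization extension could be prototyped, the sketch below bins continuous attributes before the categorical naïve Bayes step. Equal-frequency binning is used here purely as a stand-in for the entropy-based discretization suggested above, and the helper names are illustrative only.

```python
# Illustrative sketch: discretize continuous attributes so the categorical NBC can use them.
from sklearn.preprocessing import KBinsDiscretizer
from sklearn.naive_bayes import CategoricalNB

def fit_numeric_nbc(X_num, y, n_bins=5):
    disc = KBinsDiscretizer(n_bins=n_bins, encode="ordinal", strategy="quantile")
    X_binned = disc.fit_transform(X_num).astype(int)   # NB expects non-negative integer codes
    return disc, CategoricalNB(min_categories=n_bins).fit(X_binned, y)

def predict_numeric_nbc(disc, nb, X_num):
    return nb.predict(disc.transform(X_num).astype(int))
```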
References

1. Langley P, Iba W, Thompson K (1992) An analysis of Bayesian classifiers. In: AAAI, 90:223–228
2. Tumer K, Oza NC (1999) Decimated input ensembles for improved generalization. In: IJCNN'99 international joint conference on neural networks, proceedings (Cat. No. 99CH36339), IEEE, 5:3069–3074
3. Klawonn F, Angelov P (2006) Evolving extended naive Bayes classifiers, pp 643–647
4. Lu J, Yang Y, Webb GI (2006) Incremental discretization for naïve-Bayes classifier, pp 223–238
5. Domingos P, Pazzani M (1996) Beyond independence: conditions for the optimality of the simple Bayesian classifier, pp 105–112
6. Rish I (2001) An empirical study of the naive Bayes classifier. In: IJCAI 2001 workshop on empirical methods in artificial intelligence, 3(22):41–46
7. Zheng Z (1998) Naive Bayesian classifier committees, pp 196–207
8. Kohavi R (1995) A study of cross-validation and bootstrap for accuracy estimation and model selection. In: IJCAI, 14:1137–1145
9. Oza NC, Tumer K (2001) Input decimation ensembles: decorrelation through dimensionality reduction, pp 238–247
10. Ratanamahatana CA, Gunopulos D (2002) Scaling up the naive Bayesian classifier: using decision trees for feature selection
11. Das S (2001) Filters, wrappers and a boosting-based hybrid for feature selection. Citeseer, 1:74–81
12. Zheng Z, Webb GI (1998) Stochastic attribute selection committees, pp 321–332