Smart Innovation, Systems and Technologies 312
Jyoti Choudrie Parikshit Mahalle Thinagaran Perumal Amit Joshi Editors
IOT with Smart Systems Proceedings of ICTIS 2022, Volume 2
Smart Innovation, Systems and Technologies Volume 312
Series Editors Robert J. Howlett, Bournemouth University and KES International, Shoreham-by-Sea, UK Lakhmi C. Jain, KES International, Shoreham-by-Sea, UK
The Smart Innovation, Systems and Technologies book series encompasses the topics of knowledge, intelligence, innovation and sustainability. The aim of the series is to make available a platform for the publication of books on all aspects of single and multi-disciplinary research on these themes in order to make the latest results available in a readily-accessible form. Volumes on interdisciplinary research combining two or more of these areas are particularly sought. The series covers systems and paradigms that employ knowledge and intelligence in a broad sense. Its scope is systems having embedded knowledge and intelligence, which may be applied to the solution of world problems in industry, the environment and the community. It also focusses on the knowledge-transfer methodologies and innovation strategies employed to make this happen effectively. The combination of intelligent systems tools and a broad range of applications introduces a need for a synergy of disciplines from science, technology, business and the humanities. The series will include conference proceedings, edited collections, monographs, handbooks, reference books, and other relevant types of book in areas of science and technology where smart systems and technologies can offer innovative solutions. High quality content is an essential feature for all book proposals accepted for the series. It is expected that editors of all accepted volumes will ensure that contributions are subjected to an appropriate level of reviewing process and adhere to KES quality principles. Indexed by SCOPUS, EI Compendex, INSPEC, WTI Frankfurt eG, zbMATH, Japanese Science and Technology Agency (JST), SCImago, DBLP. All books published in the series are submitted for consideration in Web of Science.
Editors Jyoti Choudrie Hertfordshire Business School University of Hertfordshire Hatfield, Hertfordshire, UK
Parikshit Mahalle Vishwakarma Institute of Information Technology Pune, Maharashtra, India
Thinagaran Perumal Universiti Putra Malaysia Serdang, Malaysia
Amit Joshi Global Knowledge Research Foundation Ahmedabad, India
ISSN 2190-3018 ISSN 2190-3026 (electronic) Smart Innovation, Systems and Technologies ISBN 978-981-19-3574-9 ISBN 978-981-19-3575-6 (eBook) https://doi.org/10.1007/978-981-19-3575-6 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
Preface
The Sixth International Conference on Information and Communication Technology for Intelligent Systems (ICTIS 2022) targets state-of-the-art as well as emerging topics pertaining to information and communication technologies (ICTs) and effective strategies for their implementation in engineering and intelligent applications. The conference is anticipated to attract a large number of high-quality submissions; stimulate cutting-edge research discussions among academic pioneering researchers, scientists, industrial engineers, and students from all around the world; provide a forum for researchers to propose new technologies, share their experiences, and discuss future solutions for the design of ICT infrastructure; provide a common platform for academic pioneering researchers, scientists, engineers, and students to share their views and achievements; enrich technocrats and academicians by presenting their innovative and constructive ideas; and focus on innovative issues at the international level by bringing together experts from different countries. The conference was held during April 22–23, 2022: physically on April 22, 2022, at Hotel Pride Plaza, Bodakdev, Ahmedabad, and digitally on April 23, 2022, on Zoom. It was organized by Global Knowledge Research Foundation in collaboration with KCCI and IFIP INTERYIT. Research submissions in various advanced technology areas were received, and after a rigorous peer review process with the help of program committee members and external reviewers, 150 papers were accepted, with an acceptance rate of 16%. All 150 papers of the conference are accommodated in two volumes, and the papers in this book comprise authors from eight countries. The success of this event was possible only with the help and support of our team and organizations.
With immense pleasure and honor, we would like to express our sincere thanks to the authors for their remarkable contributions, to all the technical program committee members for their time and expertise in reviewing the papers within a very tight schedule, and to the publisher Springer for their professional help. We are overwhelmed by our distinguished scholars and appreciate them for accepting our invitation to join us through the virtual platform, deliver keynote speeches, and chair technical sessions analyzing the research work presented by the researchers. Most importantly, we are also grateful to our local support team for their hard work for the conference.
Hatfield, UK: Jyoti Choudrie
Pune, India: Parikshit Mahalle
Serdang, Malaysia: Thinagaran Perumal
Ahmedabad, India: Amit Joshi
Contents
1 Hand Gesture-Controlled Simulated Mouse Using Computer Vision
   Sarvesh Waghmare
2 E-Commerce Web Portal Using Full-Stack Open-Source Technologies
   Archit Tiwari and Shalini Goel
3 Design of a QoS-Aware Machine Learning Model for High Trust Communications in Wireless Networks
   Shaikh Shahin Sirajuddin and Dilip G. Khairnar
4 Application of Machine Learning in Mineral Mapping Using Remote Sensing
   Priyanka Nair, Devesh Kumar Srivastava, and Roheet Bhatnagar
5 An Improved Energy Conservation Routing Mechanism in Heterogeneous Wireless Sensor Networks
   Shashank Barthwal, Sumit Pundir, Mohammad Wazid, and D. P. Singh
6 Prediction of COVID-19 Severity Using Patient’s PHR
   M. A. Bharathi and K. J. Meghana Kumar
7 A Survey on Crop Rotation Using Machine Learning and IoT
   Nidhi Patel, Ashna Shah, Yashvi Soni, Nikita Solanki, and Manojkumar Shahu
8 Survey of Consumer Purchase Intentions of Live Stream in the Digital Economy
   Li-Wei Lin and Xuan-Gang
10 Leveraging Block Chain Concept: Service Delivery Acceleration in Post-pandemic Times
   Praful Gharpure
11 An Effective Computing Approach for Damaged Crop Analysis in Chhattisgarh
   Syed Zishan Ali, Mansi Gupta, Shubhangi Gupta, and Vipasha Sharma
12 An Improve Approach in Core Point Detection for Secure Fingerprint Authentication System
   Meghna B. Patel, Ronak B. Patel, Jagruti N. Patel, and Satyen M. Parikh
13 Investigating i-Vector Framework for Speaker Verification in Wild Conditions
   Asmita Nirmal and Deepak Jayaswal
14 Application of Deep Learning for COVID Twitter Sentimental Analysis Towards Mental Depression
   Naseela Pervez, Aditya Agarwal, and Suresh Sankaranarayanan
15 A Blockchain Solution for Secure Health Record Access with Enhanced Encryption Levels and Improvised Consensus Verification
   Monga Suhasini and Dilbag Singh
16 Addressing Item Cold Start Problem in Collaborative Filtering-Based Recommender Systems Using Auxiliary Information
   Ronakkumar Patel and Priyank Thakkar
17 Research on the College English Blended Teaching Model Design and Implementation Based on the “Internet + Education”
   Yu Liu
18 Qualitative Analysis of SQL and NoSQL Database with an Emphasis on Performance
   Jyoti Chaudhary, Vaibhav Vyas, and C. K. Jha
19 A Rule-Based Sentiment Analysis of WhatsApp Reviews in Telugu Language
   Kalpdrum Passi and Sujay Kalakala
20 Face Model Generation Using Deep Learning
   Rajanidi Ganesh Phanindra, Nudurupati Prudhvi Raju, Thania Vivek, and C. Jyotsna
21 Pipeline for Pre-processing of Audio Data
   Anantshesh Katti and M. Sumana
22 Classification on Alzheimer’s Disease MRI Images with VGG-16 and VGG-19
   Febin Antony, H. B. Anita, and Jincy A. George
23 Combining Variable Neighborhood Search and Constraint Programming for Solving the Dial-A-Ride Problem
   V. S. Vamsi Krishna Munjuluri, Mullapudi Mani Shankar, Kode Sai Vikshit, and Georg Gutjahr
24 Decentralization of Traditional Systems Using Blockchain
   Harsh Mody, Harsh Parikh, Neeraj Patil, and Kumkum Saxena
25 Design of a Secure and Smart Healthcare IoT with Blockchain: A Review
   Trishla Kumari, Rakesh Kumar, and Rajendra Kumar Dwivedi
26 Detecting Deceptive News in Social Media Using Supervised Machine Learning Techniques
   Anshita Malviya and Rajendra Kumar Dwivedi
27 4G Communication Radiation Effects on Propagation of an Economically Important Crop of Eggplant (Solanum melongena L.)
   Trushit Upadhyaya, Chandni Upadhyaya, Upesh Patel, Killol Pandya, Arpan Desai, Rajat Pandey, and Yogeshwar Kosta
28 Annadata: An Interactive and Predictive Web-Based Farmer’s Portal
   Cdt. Swarad Hajarnis, Kaustubh Sawant, Shubham Khairnar, Sameer Nanivadekar, and Sonal Jain
29 A Survey on Privacy Preserving Voting Scheme Based on Blockchain Technology
   P. Priya, S. Girubalini, B. G. Lakshmi Prabha, B. Pranitha, and M. Srigayathri
30 A Survey on Detecting and Preventing Hateful Comments on Social Media Using Deep Learning
   I. Karthika, G. Boomika, R. Nisha, M. Shalini, and S. P. Srivarshini
31 Anomaly Detection for Bank Security Against Theft—A Survey
   G. Pavithra, L. Pavithra, B. Preethi, J. R. Sujasre, and R. Vanniyammai
32 A Framework of a User Interface Covid-19 Diagnosis Model
   M. Sumanth and Shashi Mehrotra
33 Priority Queue-Based Framework for Allocation of High Performance Computing Resources
   Manish Kumar Abhishek, D. Rajeswara Rao, and K. Subrahmanyam
34 A Novel Hysynset-Based Topic Modeling Approach for Marathi Language
   Prafulla B. Bafna and Jatinderkumar R. Saini
35 Identification of Malayalam Stop-Words, Stop-Stems and Stop-Lemmas Using NLP
   Sarath Kumar, Jatinderkumar R. Saini, and Prafulla B. Bafna
36 Face Recognition-Based Automatic Attendance System
   Yerrolla Chanti, Anchuri Lokeshwar, Mandala Nischitha, Chilupuri Supriya, and Rajanala Malavika
37 Soft Computing-Based Approach for Face Recognition on Plastic Surgery and Skin Colour-Based Scenarios Using CNN
   Divyanshu Sinha, J. P. Pandey, and Bhavesh Chauhan
38 Text-to-Speech Synthesis of Indian Languages with Prosody Generation for Blind Persons
   A. Neela Madheswari, R. Vijayakumar, M. Kannan, A. Umamaheswari, and R. Menaka
39 Microservices in IoT Middleware Architectures: Architecture, Trends, and Challenges
   Tushar Champaneria, Sunil Jardosh, and Ashwin Makwana
40 Sustainability of Green Buildings and Comparing Different Rating Agencies
   Devender Kumar Beniwal, Deepak Kumar, and Vineet Kumar
41 On Product of Doubt ψ − O ˛ − Fuzzy Subgroup
   A. Mohamed Ismail, M. Premkumar, A. Prasanna, S. Ismail Mohideen, and Dhirendra Kumar Shukla
42 A Nobel Approach to Identify the Rainfall Prediction Using Deep Convolutional Neural Networks Algorithm
   Prachi Desai, Ankita Gandhi, and Mitali Acharya
43 Disaster Preparedness Using Augmented Reality/Virtual Reality
   Shwetali Jadhav, Sampada Nemade, Sneha Thombre, Ketaki Despande, and Vishakha Chivate
44 A Proposed Blockchain-Based Model for Online Social Network to Detect Suspicious Accounts
   Heta Dasondi, Meghna B. Patel, and Satyen M. Parikh
45 Journal on Delivery Management Platform
   C. Selvarathi, Kuna Harsha Kumar, and M. Pradeep
46 A Comparative Study of Gene Expression Data-Based Intelligent Methods for Cancer Subtype Detection
   R. Jayakrishnan and S. Sridevi
47 Analysis of Visual Descriptors for Detecting Image Forgery
   Mridul Sharma and Mandeep Kaur
48 Analysis of EEG Signals Using Machine Learning for Prediction and Detection of Stress
   Tushar Kotkar, Kaushiki Nagpure, Pratik Phadke, Sangita Patil, and P. K. Rajani
49 Crop Decision Using Various Machine Learning Classification Algorithms
   Jayesh Kolhe, Guruprasad Deshpande, Gargi Patel, and P. K. Rajani
50 Time-Delay Compensator Design and Its Applications in Process Control—A Review
   P. K. Juneja, P. Saini, A. Dixit, N. Varshney, and S. K. Sunori
51 Smart and Safety Traffic System for the Vehicles on the Road
   P. Kumar, S. Vinodh Kumar, and L. Priya
52 Automated System for Management of Hardware Equipment in Colleges
   Vaidehi Bhagwat, Aishwarya Krishnamurthy, Himanshu Behra, Ikjot Khurana, and Gresha Bhatia
53 Smart Student Attendance System Based on Facial Recognition and Machine Learning
   Nangunuri Shiva Krishna, Swetha Pesaru, and M. Akhil Reddy
54 Detecting Zeus Malware Network Traffic Using the Random Forest Algorithm with Both a Manual and Automated Feature Selection Process
   Mohamed Ali Kazi, Steve Woodhead, and Diane Gan
55 Intelligent Classification of Documents Based on Critique Points from Relevant Web Scrapped Content
   Prince Hirapar, Raj Davande, Mittal Desai, Bhargav Vyas, and Dip Patel
56 Survey on Convolutional Neural Networks-Based Object Detection Methods
   Waiel Tinwala, Shristi Rauniyar, and Swapnil Agrawal
57 Implementation of Threats Detection Modeling with Deep Learning in IoT Botnet Attack Environment
   Kanti Singh Sangher, Archana Singh, Hari Mohan Pandey, and Lakshmi Kalyani
58 Performance Prediction Using Support Vector Machine Kernel Functions and Course Feedback Survey Data
   Vikrant Shaga, Haftom Gebregziabher, and Prashant Chintal
59 Car Type and License Plate Detection Based on YOLOv4 with Darknet Framework (CTLPD)
   Hard Parikh, R. S. Ramya, and K. R. Venugopal
60 Dynamic Search and Integration of Web Services
   Sumathi, Karuna Pandith, Niranajan Chiplunkar, and Surendra Shetty
61 SAMPANN: Automated System for Pension Disbursal in India (Case Study: BSNL VRS Pension Disbursal)
   V. N. Tandon, Shankara Nand Mishra, Taranjeet Singh, R. S. Mani, Vivek Gupta, Ravi Kumar, Gargi Bhakta, A. K. Mantoo, Archana Bhusri, and Ramya Rajamanickam
62 FOG Computing: Recent Trends and Future Challenges
   Shraddha V. Thakkar and Jaykumar Dave
63 Study of Fake News Detection Techniques Using Machine Learning
   Debalina Barik, Sutirtha Kumar Guha, and Shanta Phani
64 Identification of Skin Diseases Using Deep Learning Architecture
   K. Himabindu, C. N. Sujatha, G. Chandi Priya Reddy, and S. Swathi
65 IoT-Based Saline Controlling System
   Mittal N. Desai, Raj Davande, and K. R. Mahaadevan
66 In-situ Measurement in Water Quality Status—Udalka Uttarakhand, India
   S. Harini, P. Varshini, S. K. Muthukumaaran, Santosh Chebolu, R. Aarthi, R. Saravanan, and A. S. Reshma
67 Implementing AI-Based Comprehensive Web Framework for Tourism
   Nada Rajguru, Harsh Shah, Jaynam Shah, Anagha Aher, and Nahid Shaikh
68 Smart Water Resource Management by Analyzing the Soil Structure and Moisture Using Deep Learning
   Sharfuddin Waseem Mohammed, Narasimha Reddy Soora, Niranjan Polala, and Sharia Saman
69 The Assessment of Challenges and Sustainable Method of Improving the Quality of Water and Sanitation at Deurbal, Chhattisgarh
   K. S. Prajwal, B. Shivanath Nikhil, Pakhala Rohit Reddy, G. Karthik, P. Sai Kiran, V. Vignesh, and A. S. Reshma
70 Design of Social Distance Monitoring Approach Using Wearable Smart Tags in 5G IoT Environment During Pandemic Conditions
   Fernando Molina-Granja, Raúl Lozada-Yánez, Fabricio Javier Santacruz-Sulca, Milton Paul López Ramos, G. D. Vignesh, and J. N. Swaminathan
71 Smart Congestion Control and Path Scheduling in MPTCP
   Neha Rupesh Thakur and Ashwini S. Kunte
72 Internet of Things-Enabled Diabetic Retinopathy Classification from Fundus Images
   Vinodkumar Bhutnal and Nageswara Rao Moparthi
73 Open Research Issues of Battery Usage for Electric Vehicles
   Hema Gaikwad, Harshvardhan Gaikwad, and Jatinderkumar R. Saini
74 Comparative Cost Analysis of On-Chain and Off-Chain Immutable Data Storage Using Blockchain for Healthcare Data
   Babita Yadav and Sachin Gupta
75 Analysis of Various Toxic Gas Levels Using 5G ML-IoT for Air Quality Monitoring and Forecasting
   Sumeet Gupta, Paruchuri Chandra Babu Naidu, Vasudevan Kuppan, M. Alagumeenaakshi, R. Niruban, and J. N. Swaminathan
Author Index
About the Editors
Prof. Jyoti Choudrie is Professor of Information Systems in the Management, Leadership and Organization (MLO) department of Hertfordshire Business School, where she previously led the Systems Management Research Unit (SyMRU) and is currently a convenor for the Global Work, Economy and Digital Inclusion group. She is also Editor-in-Chief of the Information, Technology and People journal (an Association of Business Schools 3-grade journal). In terms of research, Professor Choudrie is known as the broadband and digital inclusion expert at the University of Hertfordshire, as she was at Brunel University. To ensure her research is widely disseminated, Professor Choudrie co-edited a Routledge research monograph with Prof. C. Middleton, The Management of Broadband Technology Innovation, and completed a research monograph focused on social inclusion, published by Routledge, with Dr. Sherah Kurnia and Dr. Panayiota Tsatsou, titled Social Inclusion and Usability of ICT-Enabled Services. She also works with Age (UK) Hertfordshire, Hertfordshire County Council and Southend YMCA, where she is undertaking a Knowledge Transfer Partnership project investigating the role of online social networks (OSN). Finally, she is focused on artificial intelligence (AI) applications in organizations and society alike, which accounts for her interests in OSN, machine learning and deep learning. She has been a keynote speaker for the International Congress of Information and Communication Technologies and Digital Britain conferences, and supervises doctoral students drawn from around the globe. Presently, she is seeking three to four doctoral students who would want to research AI in society and organizations alike.

Dr. Parikshit Mahalle is a senior member of IEEE and is Professor and Head of the Department of Artificial Intelligence and Data Science at Vishwakarma Institute of Information Technology, Pune, India. He received his Ph.D. from Aalborg University, Denmark, and continued as a postdoctoral researcher. He has 21+ years of teaching and research experience. He is a member of the Board of Studies in Computer Engineering, SPPU, and Ex-Chairman of the Board of Studies (Information Technology) at SPPU and various universities. He has 9 patents and 200+ research publications (citations: 1873, H-index: 19), and has authored/edited 30+ books with Springer, CRC Press, Cambridge University Press, etc. He is Editor-in-Chief of the IGI Global International Journal of Rough Sets and Data Analysis; Associate Editor of the IGI Global International Journal of Synthetic Emotions and the Inderscience International Journal of Grid and Utility Computing; and a member of the Editorial Review Board of the IGI Global International Journal of Ambient Computing and Intelligence. His research interests are algorithms, Internet of things, identity management and security. He has delivered more than 200 lectures at national and international levels.

Dr. Thinagaran Perumal received his B.Eng. in Computer and Communication System Engineering from Universiti Putra Malaysia in 2003. He completed his M.Sc. and Ph.D. in Smart Technologies and Robotics at the same university in 2006 and 2011, respectively. Currently, he is Senior Lecturer at the Department of Computer Science, Faculty of Computer Science and Information Technology, Universiti Putra Malaysia. He is also Head of Cyber-Physical Systems at the university and has been elected Chair of the IEEE Consumer Electronics Society Malaysia Chapter. Dr. Thinagaran Perumal is the recipient of the 2014 IEEE Early Career Award from the IEEE Consumer Electronics Society for his pioneering contribution in the field of consumer electronics. His research interests are interoperability aspects of smart homes and Internet of things (IoT), wearable computing and cyber-physical systems. His recent research activities include proactive architecture for IoT systems and the development of cognitive IoT frameworks for smart homes and wearable devices for rehabilitation purposes. He is an active member of the IEEE Consumer Electronics Society and its Future Directions Committee on Internet of Things. He has been invited to give several keynote lectures and plenary talks on Internet of things in various institutions and organizations internationally.

Dr. Amit Joshi is currently Director of Global Knowledge Research Foundation and also an entrepreneur and researcher. He completed his graduation (B.Tech.) in Information Technology and his M.Tech. in Computer Science and Engineering, and completed his research in the areas of cloud computing and cryptography in medical imaging, with a focus on analysis of current government strategies and the needs of world forums in different sectors for security purposes. He has around ten years of experience in academia and industry in prestigious organizations. He is an active member of ACM, IEEE, CSI, AMIE, IACSIT-Singapore, IDES, ACEEE, NPA and many other professional societies. He is currently also the International Chair of InterYIT at the International Federation for Information Processing (IFIP, Austria). He has presented and published more than 50 papers in national and international journals and conferences of IEEE and ACM. He has also edited more than 20 books published by Springer, ACM and other reputed publishers, and has organized more than 40 national and international conferences and workshops through ACM, Springer and IEEE across five countries including India, UK, Thailand and Europe.
Chapter 1
Hand Gesture-Controlled Simulated Mouse Using Computer Vision
Sarvesh Waghmare
Abstract Hand gestures are an essential component of everyday communication, used alongside nodding and conversation. Many strategies for human–computer interaction exist today, but gesture-based interaction could be transformative because it requires no complicated hardware. With this technology, a person can communicate with a computer more naturally, simply using a hand to navigate around the screen. By showing a hand in front of the camera, the user can move the cursor and control the system. People of all kinds can use such a system easily and effectively to control computers and numerous other gadgets. This paper is built on that concept. It describes the steps involved in color detection and the vision-based strategies used to identify a hand and control the virtual mouse.
Keywords Computer vision · OpenCV · MediaPipe · HCI (Human–computer interaction)
S. Waghmare (B)
Pimpri Chinchwad College of Engineering, Pune, India
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_1

1.1 Introduction
Technology has become an essential part of day-to-day life, and people are increasingly comfortable with it. Computers play a very important role in meeting this need, offering solutions to people of every age and in every area of industrialized society. We interact with computers to make our work and lifestyle simpler, so human–computer interaction (HCI) has become a critical research topic. Today, the computer mouse is the input device we commonly use to communicate with computers, but the mouse and its accuracy have limitations. However much its accuracy has improved recently, a mouse is made of hardware elements, so problems can arise, such as a click not working or the device wearing out. Because the mouse is a hardware tool, it has a limited life span, after which it must be replaced for optimum functionality. As technology advances, more and more interaction is becoming virtualized, as with hand recognition and speech recognition. Hand gestures are a very effective and popular means of communication among people, and we use them frequently in our everyday lives. They are deeply rooted in how we express our thoughts to one another, so interaction between people and computing devices can likewise be carried out through hand gestures. The main challenge is making hand gestures understandable to computers, for which there are two broad approaches: (1) data gloves and (2) vision-based methods. This paper presents a vision-based method for detecting hand gestures and executing functions, including left clicks and right clicks, that are normally carried out with a physical computer mouse. Using computer vision, I created a user interface in which the user can perform right and left clicks with hand gestures. The system needs to be optimized to meet requirements such as accuracy and precision. The hand recognition discussed in this paper is powered by OpenCV, a library of programming functions aimed primarily at real-time computer vision, which is used here to capture images from a webcam. Once the technique is developed, more polished software can be built to make it simple to use. Artificial intelligence and machine learning are trending technologies that allow systems to operate with little or no human interaction, and this system belongs to the family of technologies that will help us work more efficiently.
This system allows the user to control the cursor with hand gestures, using the computer's webcam to monitor the hand. OpenCV is used because its VideoCapture class reads a live video stream frame by frame.
1.2 Existing Systems
Most previous studies place the cursor pointer at the centroid of the palm, found with algorithms such as the convex hull, and use the fingers to issue commands to that cursor, such as right and left clicks. The main drawback is that when the target is at the top of the screen, the hand must be raised until the palm centroid (the cursor) reaches it, which lifts the fingers (the command inputs) above the visible area. The cursor can therefore be moved into the top section of the screen, but no command can be given there because the fingers are outside the action area [1–3]. A further problem is that some systems do not account for the worst case, in which multiple objects are identified as the target.
1 Hand Gesture-Controlled Simulated Mouse Using Computer Vision
Fig. 1.1 The forefinger performs the left click; the middle finger serves as the cursor pointer and also performs the right click
1.3 Proposed Solution
To address this drawback, I assign the cursor pointer to the topmost part of the hand, the middle finger, and use the middle finger for right clicking and the forefinger for left clicking. These fingers are chosen because the thumb, forefinger, and middle finger are more dominant than the ring and little fingers. Since multiple identifications may occur, the approach relies on detecting particular colors. Hence, in this paper, I use red and blue color caps for finger identification (see Fig. 1.1).
1.4 Requirements for Proposed System
For this project, I use OpenCV, an open-source computer vision library with Python bindings, aimed mainly at real-time video capture and processing. The statement “import cv2” imports the OpenCV library into the code. The project also uses NumPy, a Python library for high-level mathematical operations on arrays and matrices [4, 5]. Table 1.1 lists the minimum requirements for this project. Meeting these specifications is important for achieving good results and for avoiding some of the runtime errors and drawbacks of previous projects.
Table 1.1 Requirements
Item               Specification
Microsoft Windows  Supported architectures: 32-bit (x86), 64-bit (x86)
OpenCV             OpenCV v4.5.1
MediaPipe          MediaPipe v0.8.5
PyCharm            PyCharm v2021.1.1
OS                 Windows 7 or above/Ubuntu OS
RAM                4 GB or above
Camera             12 MP or above
CPU                i3-3rd gen or above/AMD zen2 or above
GPU                Any Nvidia/AMD/Intel graphics
1.5 Methodology
Each step of the method is shown in the flowchart of Fig. 1.2, against which the running system can be checked.
Fig. 1.2 The process is depicted in a flowchart
1.5.1 Video Acquisition
This section explains video capture and the related video operations. Video is captured from the system's default webcam at a resolution of 1920 × 1080 and 40 fps (default setting) [2, 6].
Real-Time Video Capturing
An accurate interactive system depends on sensors that supply good input. Here, a webcam captures real-time video at a constant frame rate and resolution. Frames are then extracted from the video one at a time and processed as RGB images: each frame is an m × n matrix in which every element is a 1 × 3 vector of red, green, and blue channel values. This is the main reason for choosing blue and red caps for the forefinger and middle finger, respectively: red, green, and blue are the primary colors of this model, and all other colors are formed from their combinations [7]. For webcam input:
cap = cv2.VideoCapture(0)
Flipping of Video
The captured video is horizontally inverted: when the hand moves to the left, its image moves to the right, and vice versa. Each frame therefore has to be flipped horizontally, which OpenCV does with its flip function [1]. In the same step, the frame is converted from BGR (blue, green, red) order to RGB (red, green, blue):
image = cv2.cvtColor(cv2.flip(image, 1), cv2.COLOR_BGR2RGB)
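As a rough illustration of what the flip and the channel reordering do to a frame, the sketch below works on a tiny frame stored as nested Python lists, so it runs without a camera or OpenCV. The helper names flip_horizontal and bgr_to_rgb are hypothetical and are not part of the original system, where the same effect comes from cv2.flip and cv2.cvtColor.

```python
def flip_horizontal(frame):
    """Mirror a frame left-to-right (frame is a list of rows of pixels)."""
    return [list(reversed(row)) for row in frame]


def bgr_to_rgb(frame):
    """Reorder every (B, G, R) pixel to (R, G, B)."""
    return [[(r, g, b) for (b, g, r) in row] for row in frame]


# A 1 x 2 frame in BGR order; the values are arbitrary illustration data.
frame_bgr = [[(10, 20, 30), (40, 50, 60)]]

# Flip first, then reorder channels, mirroring the order of the one-line call above.
mirrored_rgb = bgr_to_rgb(flip_horizontal(frame_bgr))
```

After the two steps, the pixel that was on the right ends up on the left and each triple is read red-first.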
1.5.2 Color Conversions
OpenCV supports image conversions, and identifying the targeted object requires the chain RGB image → HSV image → grayscale image. Most importantly, the model must detect specific colors; if this step is skipped or performed in the wrong place, detection fails. Color variation can be handled properly in several ways using HSV color codes, which makes the system more robust in use [8]. To obtain the desired colors, the red and blue regions must be detected in the flipped image. A subtractive approach is used: a grayscale image is generated from the flipped image (via its HSV form) and subtracted from the red band and the blue band images individually. The result is the red and blue components of the image in grayscale form, with the background colors, the other fingers, and the skin color removed.
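The subtractive idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the author's code: the grayscale weights are the common BT.601 luminance values, the threshold of 60 is an arbitrary example, and the function names are hypothetical.

```python
def to_gray(r, g, b):
    """Luminance of an RGB pixel (BT.601 weights, assumed here)."""
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def isolate_channel(frame_rgb, channel):
    """Subtract the gray value from one color band, clamping at zero.

    channel is 0 for red and 2 for blue. Pixels where that color
    dominates stay bright; gray background and skin fall near zero.
    """
    return [[max(0, px[channel] - to_gray(*px)) for px in row] for row in frame_rgb]


def to_mask(band, threshold=60):
    """Binary mask of the pixels whose isolated value exceeds the threshold."""
    return [[1 if value > threshold else 0 for value in row] for row in band]


# One row of pixels: red cap, gray background, blue cap, skin-like tone.
frame = [[(200, 30, 30), (120, 120, 120), (30, 30, 200), (210, 170, 150)]]

red_mask = to_mask(isolate_channel(frame, 0))
blue_mask = to_mask(isolate_channel(frame, 2))
```

Only the first pixel survives in the red mask and only the third in the blue mask, which is the behavior the subtraction is meant to give.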
1.5.3 Noise
Noise is a crucial factor in any computer vision system and needs to be addressed explicitly. Dust and other disturbances introduce small unwanted errors when images are captured.
Filtration
After the red and blue components are extracted from the image, a few scattered pixels remain, which creates salt-and-pepper noise. The OpenCV median filter is used to remove it [3].
Median Filtering
The function “cv2.medianBlur()” computes the median of all pixels under the kernel window and replaces the central pixel with that value. This works particularly well against salt-and-pepper noise. With Gaussian and box filters, the filtered value of the central element can be a value that does not occur in the original image. With median filtering this is not the case, because the central element is always replaced by the value of some pixel in the image, so noise is reduced effectively. The kernel size must be an odd positive integer [5]. Syntax: median = cv2.medianBlur(img, ksize), where ksize is the kernel size, for example 5.
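To show why the median removes isolated bright or dark pixels, here is a small pure-Python sketch of a 3 × 3 median filter. It is an illustration only: the function name median_filter3 is hypothetical, and leaving the border pixels untouched is a simplification of this sketch, not a description of how OpenCV treats borders.

```python
from statistics import median


def median_filter3(img):
    """Apply a 3 x 3 median filter to a 2-D list of gray values.

    Border pixels are copied unchanged (a simplification of this sketch).
    """
    height, width = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = [img[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = int(median(window))
    return out


noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],  # one "salt" pixel
    [10, 10, 0, 10],    # one "pepper" pixel
    [10, 10, 10, 10],
]
clean = median_filter3(noisy)
```

Both outliers are replaced by 10, a value that already exists in the image, which is the property of median filtering described above.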
Unwanted Objects from Image
Because the system takes its input from a webcam, objects in the surrounding environment can appear in the image and cause problems. To remove them, the essential packages are imported first: NumPy for numerical processing and OpenCV through “import cv2” [9]. A function named “is_contour_bad” is then defined; it holds the criteria for marking a contour as “bad” so that the contour can be removed from the image.
1.5.4 Cursor Mapping Using Centroid of Red Part
After the steps above, the image is ready, but the cursor pointer has not yet been mapped. The relationship between the finger position in the image and the cursor position must therefore be established. First, the labeled matrix of the segmented image is processed with NumPy, and the major-axis length of the red part of the finger is computed. The centroid $(X, Y)$ of the segmented colored object is then calculated from the following equation [10, 11]:

$$\text{Centroid:}\quad X = \sum_{i=0}^{K} \frac{x_i}{K}, \qquad Y = \sum_{i=0}^{K} \frac{y_i}{K} \tag{1.1}$$

where $x_i$ and $y_i$ are the coordinates of the $i$th pixel, $X$ and $Y$ are the coordinates of the centroid, and $K$ denotes the number of pixels in the image [11]. Once the centroid is known, the tip of the red part of the object is obtained by adding a $y_i$ term to the centroid:

$$\text{Cursor tip:}\quad X_t = X, \qquad Y_t = Y + 2\sum_{i=0}^{K/2} \frac{y_i}{K} \tag{1.2}$$

where $X_t$ and $Y_t$ are the coordinates of the pixel that acts as the tip of the object, like a cursor, and $K$ denotes the number of pixels present in the image (Fig. 1.3).
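The centroid of Eq. (1.1) can be sketched as the mean position of the pixels in a binary mask of the red cap. The code below is a hedged illustration: it averages over exactly the masked pixels, it does not implement the tip offset of Eq. (1.2), and the linear scaling to screen coordinates in to_screen is an assumption added for the example, as are all function names.

```python
def centroid(mask):
    """Mean (x, y) position of the non-zero pixels in a binary mask."""
    points = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not points:
        return None  # no colored cap visible in this frame
    k = len(points)
    return (sum(x for x, _ in points) / k, sum(y for _, y in points) / k)


def to_screen(point, frame_size, screen_size):
    """Scale a frame coordinate to a screen coordinate (linear mapping, assumed)."""
    (x, y), (fw, fh), (sw, sh) = point, frame_size, screen_size
    return (round(x * sw / fw), round(y * sh / fh))


red_mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
c = centroid(red_mask)
cursor = to_screen(c, frame_size=(5, 4), screen_size=(1920, 1080))
```

Returning None for an empty mask lets the caller leave the cursor where it is when the cap is not detected.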
1.5.5 Perform Different Operations
The left click is assigned to the blue color, which marks the forefinger, and the right click is assigned to the red color, which marks the middle finger. At the initial stage of image acquisition, the hand is fully open, so each finger appears at its maximum length, and this length is recorded as its maximum (major) length. The frame captured at this stage serves as the main, or mother, image of the project.
Fig. 1.3 The tip of the red part is used as a cursor pointer, using the red part’s centroid as the origin
Right Click
The mother image provides the major-axis length of the middle finger. Folding the middle finger shortens this length, and when it falls below the defined threshold, a right click is performed.
Condition: Right Click = Major Axis Length < Threshold
Left Click
The mother image provides the major-axis length of the forefinger. Folding the forefinger shortens this length, and when it falls below the defined threshold, a left click is performed.
Condition: Left Click = Major Axis Length < Threshold
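The two click conditions share one test, which can be sketched as follows. This is a simplified illustration, not the author's implementation: the major-axis length is approximated by the longer side of the blob's bounding box, and the threshold is taken as a fraction (0.6, an arbitrary example) of the length measured in the mother image.

```python
def major_axis_length(mask):
    """Approximate a blob's major-axis length by the longer bounding-box side."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for row in mask for x, v in enumerate(row) if v]
    if not ys:
        return 0  # cap not visible at all
    return max(max(ys) - min(ys), max(xs) - min(xs)) + 1


def is_click(mask, reference_length, ratio=0.6):
    """True when the cap looks shorter than ratio * its mother-image length."""
    return major_axis_length(mask) < ratio * reference_length


open_finger = [[0, 1, 0]] * 6                      # cap seen at full length
folded_finger = [[0, 1, 0]] * 2 + [[0, 0, 0]] * 4  # cap shortened by folding

reference = major_axis_length(open_finger)  # measured once on the mother image
```

The same function would be called with the blue mask for the left click and with the red mask for the right click.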
1.6 Further Development
The system can be developed further with additional real-time operations:
Real-Time Operation: Pause/Play.
Real-Time Operation: Page Up/Down.
Real-Time Operation: Tab Shifting.
Real-Time Operation: Dragging/Dropping.
Many applications for people with disabilities could benefit from gesture recognition. A graphical user interface built on this technology could be combined with high-quality cameras to control robots driven by machine learning. With further refinement of the video processing, a more advanced GUI becomes possible, and such a robot might eventually be deployed as a home service robot, a robot for managing complex operations, or a defense mission robot. A well-known application of gestures is digital painting, where users can paint in three dimensions and see their artwork appear in 3-D. The same concepts could also be used in virtual reality, augmented reality, and gaming; for example, games based on hand movements, such as snake games and running games, can be created.
1.7 Conclusion
Computer vision and machine learning techniques aid the development of human–computer interaction based on standard colors extracted from visual input. Segmenting those standard colors reliably is a crucial step toward this goal. This paper demonstrates gesture-based cursor operation using computer vision-based color recognition and image segmentation. The goal of the project was to create a system that captures images and performs mouse functions, such as moving the pointer, dragging, and clicking, by means of colored caps on the fingers. The system was built in an OpenCV, MediaPipe, and NumPy environment. Vision-based technology is far less expensive than contact-based technology and is increasingly replacing it, and this approach uses the camera already built into digital devices. After this research, I believe the technology has a bright future in human–computer interaction (HCI) systems, with applications in robotics, medical equipment, computer games, and other areas.
References
1. Prakash, B., Mehra, R., Thakur, S.: Vision based computer mouse control using hand gestures. In: 2015 International Conference on Soft Computing Techniques and Implementation (ICSCTI). IEEE (2015). 978-1-4673-6792-9/15
2. Varun, K.S., Puneeth, I., Prem Jacob, T.: Virtual mouse implementation using Open CV. In: 2019 International Conference on Trends in Electronics and Informatics (ICOEI). IEEE (2019). ISBN: 978-1-5386-9439-8
3. Shetty, M., Bhatkar, M.K., Lopes, O.P., Daniel, C.A. (Computer Engineering Fr. CRCE, Mumbai): Virtual mouse using object tracking. In: 2020 5th International Conference on Communication and Electronics Systems (ICCES). ISBN: 978-1-7281-5371-1
4. MediaPipe Hands Homepage: https://google.github.io/mediapipe/solutions/hands.html. Last accessed 19 Oct 2021
5. Pyimagesearch Removing Contours Homepage: https://www.pyimagesearch.com/2015/02/09/removing-contours-image-using-python-opencv/. Last accessed 21 Nov 2021
6. Jeon, C., Kwon, O.-J., Shin, D., Shin, D.: Hand-mouse interface using virtual monitor concept for natural interaction. IEEE Access (2017)
7. Xue, X., Zhong, W., Ye, L., Zhang, Q.: The simulated mouse method based on dynamic hand gesture recognition. In: 2015 8th International Congress on Image and Signal Processing (CISP). IEEE (2015). 978-1-4673-9098-9/15/$31.00
8. Gurnani, A., Mavani, V., Gajjar, V.: Hand gesture real time paint tool-box: machine learning approach. In: 2017 IEEE International Conference on Power, Control, Signals and Instrumentation Engineering. IEEE (2017). 978-1-5386-0814-2/17
9. Titlee, R., Ur Rahman, A., Zaman, H.U., Rahman, H.A.: A novel design of an intangible hand gesture controlled computer mouse using vision based image processing. In: 2017 3rd International Conference on Electrical Information and Communication Technology (EICT). IEEE (2017). ISBN 978-1-5386-2307-7/17/$31.00
10. Meena Prakash, R., Deepa, T., Gunasundari, T., Kasthuri, N.: Gesture recognition and fingertip detection for human computer interaction. In: 2017 International Conference on Innovations in Information, Embedded and Communication Systems (ICIIECS). IEEE (2017). 978-1-5090-3294-5/17/$31.00
11. Dehankar, A.V., Thakare, V.M., Jain, S.: Detecting centroid for hand gesture recognition using morphological computations. In: 2017 International Conference on Inventive Systems and Control (ICISC). IEEE (2017). 978-1-5090-4715-4/17/$31.00
Chapter 2
E-Commerce Web Portal Using Full-Stack Open-Source Technologies
Archit Tiwari and Shalini Goel
Abstract E-commerce applications are tools for purchasing products and services over the Internet. Many people now prefer to have goods and services delivered to their homes rather than going out to buy them, and the supporting technologies are evolving rapidly. This paper offers a fresh perspective on how open-source technologies can be used to build e-commerce platforms, thereby reducing costs such as the licensing fees of enterprise software and tools. The full-stack web application described here enables the user to buy products from an online platform. Apart from buying, the user can select products of a particular category and filter products by color, size, and other attributes, which makes the application more user friendly. The data received from or posted to the front end is in an unstructured JSON format, which is difficult to store in tabular form. To store such unstructured data, a NoSQL database, MongoDB, is used.
Keywords Open-source technologies · NoSQL · RDBMS · Full-stack web application · Relational databases · Node.js · React.js · MongoDB
A. Tiwari (B) · S. Goel, Department of Computer Science and Engineering, HMR Institute of Technology and Management, New Delhi, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_2
2.1 Introduction
The Internet and the computer are two of the most important and widely used technologies today. Online work is often easier than offline work and saves time. People prefer to shop for goods and services online rather than going to markets, and this trend has expanded the scope of online business: the number of people who want to sell their goods and services online has risen sharply, and online buying has grown strongly over the last five years. While e-commerce was just getting started a decade ago, consumers now demand more convenience through online shopping. When a customer uses an Internet-enabled mobile device (such as a smartphone or tablet) to explore and browse a store or brand, this is known as mobile shopping. Smartphone shopping affects 90% of purchase behavior among consumers who use mobile devices, and most people are able to buy things using their smartphones. The personal relationship that consumers have with their mobile devices is an additional factor that requires merchants to develop a fundamentally different strategy for this type of shopper. The consumer's goal is not just to buy a product, but also to learn about it and to find out what other options are available. E-commerce merchants have recently recognized the expanding market for mobile-based shopping and have provided customers with an optimized and distinctive mobile buying experience.
To develop full-stack web applications, organizations often use licensed software, for which they pay large sums. Apart from the purchase price, they bear other costs such as license renewal and software maintenance. In addition, if a bug is found or the organization wishes to add a new feature, it must wait for the vendor to update the software. With open-source technologies, by contrast, the source code is public and available to everyone. Anyone from any part of the world can contribute a new feature or a fix for an existing problem and can make modifications. That is why many organizations are switching from paid licensed software to open-source technologies. The “e-Commerce Web App” uses a full-stack set of technologies: React.js for the front end, and Node.js and Express.js for the middleware. The database used to store user records is MongoDB.
The web app is developed using various React.js and Node.js libraries that make it more user friendly. Furthermore, Stripe.js, the JavaScript library provided by Stripe, is used to manage the payment gateway, i.e., to handle the checkout form and collect the credit card details.
2.2 Literature Review
The aim is to develop a user-friendly website with database management that provides services (Aravindhan et al. [1]), makes work easier and more cost-effective, and earns the user's trust (Shetty et al. [2]). Databases are very common in system development. A database system, according to Paradaens et al. [3], is a collection of programs that run on a computer and assist the user in collecting, changing, protecting, and managing information. The popularity of relational databases has increased tremendously over the past 30 years, and they are now very commonly used for organizational systems. Examples of Relational Database Management Systems (RDBMS) are MySQL, Oracle, and PostgreSQL; each is well known and has its own set of advantages. SQL databases, also known as relational databases, were created to store structured information. The design of the database is represented by the schema, to which the data must adhere. The data is saved in a structured tabular form, i.e., in rows and columns.
The information stored in these databases is retrieved with queries written in Structured Query Language (SQL). When the boom in the Internet and Web 2.0 began to generate huge amounts of unstructured user data (Gandhi et al. [4]), the only commercial data storage solution available was the SQL relational database. Because the data collected from the Web was unstructured, storing it in a table-like format was challenging, and a new type of database was needed that could store it easily. This gave rise to NoSQL databases, which had to support data that does not fit fixed schemas, using models such as key-value stores, documents, text, graphs, and wide columns. MongoDB is a NoSQL database. It stores unstructured data easily because it keeps the data in the form of documents in JSON (JavaScript Object Notation) format. MongoDB is generally referred to as a non-relational database because it does not enforce relations between the data of its collections in the way a relational schema does. MongoDB is designed with the CAP theorem in mind, which concerns consistency, availability, and partition tolerance [5]. The proposed application uses the MERN stack, which helps the site process requests and load quickly.
2.3 Technologies Used
2.3.1 Node.js
Node.js is a runtime environment for JavaScript, written in C++. For high performance, Node.js employs Google's V8 engine and offers many system-level APIs for file operations, network programming, and so on. JavaScript code running in a browser is subject to security restrictions at run time, and its access to the client system is limited. Node.js is designed for network services and uses event-driven, asynchronous programming. Event-driven design is at the core of Node.js, and the vast majority of its APIs are provided in an event-based, asynchronous style. Consider the net module, where the net.Socket object has the following events: connect, data, end, timeout, drain, error, close, and so on. The Node.js developer registers a callback function for each relevant event according to the business logic. These callbacks are executed asynchronously: although they appear to be registered sequentially in the code, they do not run in the order in which they appear but wait for the corresponding event to fire. The primary benefit of event-driven, asynchronous programming is that it makes full use of system resources. The code can continue without waiting for a specific operation to complete, and the limited resources can be used for other tasks. This design is ideal for back-end network service programming, which is the goal of Node.js. Handling concurrent requests is a major issue in server development, and blocking functions can waste resources and cause delays. Developers can improve resource utilization and performance by using event registration and asynchronous functions. Many functions, including file operations, are executed asynchronously, which differs from traditional languages, as the modules provided by Node.js show. To facilitate server development, Node.js offers a particularly large set of network modules, on which developers can build a web server using HTTP, DNS, NET, UDP, HTTPS, TLS, and other protocols.
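The difference between registration order and firing order can be illustrated with a very small event registry. The sketch is written in Python for consistency with the other examples in this volume rather than in JavaScript, the EventEmitter class here is a hypothetical stand-in and not the Node.js API, and it dispatches synchronously, so it shows only the callback-registration pattern, not a real event loop or socket.

```python
class EventEmitter:
    """Tiny event registry illustrating callback registration."""

    def __init__(self):
        self._handlers = {}

    def on(self, event, callback):
        """Register a callback for an event name."""
        self._handlers.setdefault(event, []).append(callback)

    def emit(self, event, *args):
        """Run every callback registered for the event."""
        for callback in self._handlers.get(event, []):
            callback(*args)


log = []
socket = EventEmitter()  # stand-in for a socket-like object, not a real connection

# Registration order: close, data, connect.
socket.on("close", lambda: log.append("closed"))
socket.on("data", lambda chunk: log.append(f"got {chunk}"))
socket.on("connect", lambda: log.append("connected"))

# Firing order is decided by the events, not by the registration order.
socket.emit("connect")
socket.emit("data", "hello")
socket.emit("close")
```

Although the close handler was registered first, it runs last because its event fires last.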
2.3.2 React.js
React.js, also referred to simply as React, is one of the most widely used front-end JavaScript libraries in web development. It is an open-source library for building user interfaces, particularly for single-page applications. React.js promotes reusability by letting the developer create components that can be reused in different parts of the code, and it suits large web applications whose data changes without the page being refreshed. Its popularity rests on being fast, simple, and scalable. In React, a component receives a set of immutable values, its properties, through the attributes of its tag. The component cannot change these properties itself; instead, it can be passed a callback function that does so. React builds an in-memory data structure, computes what has changed, and updates the browser accordingly, so only the components that actually change are rendered again. React.js is easy to learn because it follows a component-based approach, has a well-defined lifecycle, and uses plain JavaScript. It employs a syntax known as JSX, which allows HTML and JavaScript to be mixed. JSX is not required; developers can still write plain JavaScript, but JSX is more convenient. Basic knowledge of HTML and CSS is enough to start learning React. Because React.js promotes reuse, much of the same component approach carries over to iOS and Android applications through React Native. React employs one-way data binding, and Flux, an application architecture, controls the flow of data to components through a single control point, the dispatcher.
It is easier to debug self-contained components than an entire React.js web application. React.js applications are also simple to test: React views can be thought of as functions of the state, so we can vary the state passed to a view and examine the output and the triggered actions, events, and functions.
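The idea that a view is a function of the state can be sketched without any framework. The example below is a language-neutral illustration written in Python; the function name render_cart_badge and the shape of the state are hypothetical and are not taken from the application.

```python
def render_cart_badge(state):
    """Pure 'view': the same state always yields the same markup."""
    count = sum(item["quantity"] for item in state["cart"])
    label = "Cart is empty" if count == 0 else f"Cart ({count})"
    return f"<span class='badge'>{label}</span>"


# Testing the view means choosing a state and inspecting the output.
empty = {"cart": []}
filled = {"cart": [{"id": "p1", "quantity": 2}, {"id": "p2", "quantity": 1}]}

empty_markup = render_cart_badge(empty)
filled_markup = render_cart_badge(filled)
```

Because the function has no hidden inputs, each test case is just a state and the markup expected for it.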
2.3.3 Express.js
Express.js is an open-source back-end web application framework for Node.js. It is fast, robust, and asynchronous in nature, and it was designed for building APIs and web applications. Express.js is often referred to as the de facto standard server framework for Node.js. In well-known development stacks such as MERN and MEAN, Express.js serves as the back-end component alongside a JavaScript front-end framework or library and MongoDB as the database.
2.3.4 MongoDB
MongoDB is a NoSQL database management system whose Community Edition is free to use and has publicly available source code. NoSQL is a database technology that can be used as an alternative to conventional relational databases, and it is advantageous when dealing with huge amounts of distributed data. MongoDB can store a variety of data types. It is one of several non-relational database technologies that emerged in the mid-2000s under the NoSQL banner, typically for use in big data applications and other processing tasks involving data that does not suit a traditional relational model. Unlike relational databases, which consist of rows and columns, the architecture of MongoDB consists of collections and documents. Organizations use MongoDB because it provides features such as load balancing, server-side JavaScript execution, indexing, ad hoc queries, aggregation, and many others. MongoDB saves data records as documents, each of which is a data structure made up of key-value pairs. The document is the smallest unit in MongoDB. Documents resemble JSON (JavaScript Object Notation) but use a variant known as BSON (Binary JSON), which supports more data types than JSON. Any type of data can be stored in a collection, but a collection cannot be spread across multiple databases.
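A document of key-value pairs can be pictured as a nested dictionary. The sketch below uses only Python's standard json module and does not connect to MongoDB; the field names and values are illustrative assumptions and are not the application's actual schema.

```python
import json

# One product as a single document. The two variants carry different keys,
# which a fixed table schema would not accommodate directly.
product = {
    "_id": "p1",                      # hypothetical identifier
    "title": "Cotton T-shirt",
    "categories": ["men", "summer"],  # array field
    "variants": [                     # nested documents
        {"size": "M", "color": "blue", "stock": 12},
        {"size": "L", "color": "red", "stock": 0, "discount": 0.1},
    ],
}

encoded = json.dumps(product)  # the JSON text exchanged with the front end
decoded = json.loads(encoded)  # round-trips without a table definition

in_stock = [v["size"] for v in decoded["variants"] if v["stock"] > 0]
```

In a relational design the categories and variants would typically need separate tables joined by keys, whereas here they travel inside the one document.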
2.4 Proposed System
When the user launches the application, the authentication section of the web app appears and asks the user to log in. The user enters his or her credentials in the login form, and the password is verified. If verification succeeds, the user is directed to the main page; otherwise a warning is shown and the user is asked to enter the correct details again. Passwords are protected with AES encryption: when a user registers, the password is encrypted with an encryption key and then stored in the back end, and at login the stored password is fetched and decrypted with the same key for verification. In this way, the privacy and safety of the user's data are protected. When the correct credentials are entered, the login succeeds and the main screen is displayed. If the user then refreshes the web page by mistake, the changes made so far are preserved. This is achieved with Redux Persist, a library that saves a Redux store in the local storage of the application.
The main page consists of an announcement bar, a navigation bar that displays the name of the web app, and a slider that displays trending deals. Beneath the slider, the user can select products belonging to a specific category. A footer displays details about the web shop, such as contact information. When a user clicks on a specific product, he or she is taken to a web page that displays the product's specifications. This is referred to as routing: directing a user to different pages based on an action or request. Dynamic routing in the web app is implemented with React Router DOM, which enables component-based routing according to the needs of the app and platform, as opposed to traditional routing architecture, which handles routing in a configuration outside the running app. On the product page, the user can select the size and increase or decrease the quantity to be bought. A button adds the product to the cart. When the user opens the cart, the chosen product appears there together with the selected size and quantity, and the total amount to be paid is displayed on the right-hand side of the page. When the user clicks the checkout button, a payment gateway appears. This payment gateway is developed using Stripe.js.
The payment gateway is a form that asks the user for details such as name, billing/shipping address, and card details (credit card number, card expiry date, and CVC). When the user enters the correct details, the button turns green to indicate a successful payment. The middleware APIs used for routing, for connecting to the front end, and for connecting to the database have been developed using Node.js. The database used for this web application is MongoDB. Information related to the web app, such as user details, the configuration of each product, and the order history, is stored in MongoDB as documents. These documents contain unstructured data in the form of key-value pairs in JSON format.
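The cart behaviour described in this section (adding a product with its size and quantity, keeping a running total, and surviving a page refresh) can be sketched as a small reducer plus a persistence step. This is a language-neutral illustration in Python, not the Redux or Redux Persist API: the function names, the action shape, and the plain dictionary standing in for the browser's local storage are all assumptions made for the example.

```python
import json


def cart_reducer(state, action):
    """Return a new cart state for an 'add' action; ignore other actions."""
    if action["type"] != "add":
        return state
    item = action["item"]
    return {
        "products": state["products"] + [item],
        "total": state["total"] + item["price"] * item["quantity"],
    }


def persist(state, storage):
    """Save the cart as JSON text (storage stands in for local storage)."""
    storage["cart"] = json.dumps(state)


def rehydrate(storage):
    """Load the saved cart, or start with an empty one."""
    raw = storage.get("cart")
    return json.loads(raw) if raw else {"products": [], "total": 0}


storage = {}
state = rehydrate(storage)
state = cart_reducer(state, {"type": "add", "item": {"id": "p1", "size": "M", "price": 20, "quantity": 2}})
state = cart_reducer(state, {"type": "add", "item": {"id": "p2", "size": "L", "price": 15, "quantity": 1}})
persist(state, storage)

restored = rehydrate(storage)  # what a page refresh would read back
```

The reducer never mutates the previous state, which is what makes it safe to serialize a snapshot at any point.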
2.5 Conclusion

As a result, by using this full-stack web application, the user can easily purchase any product of his or her choice. The size and color filters make this web app more user friendly. Using MongoDB as a back-end database also made it easier to store unstructured documents in JSON format. Furthermore, the React.js and Node.js libraries provided a variety of features that enhanced the usability and reduced the
cost of building and maintaining the web application. Thus, a full-stack web application was developed. In the future, two-factor authentication may be added to the application to improve user security and privacy during authentication.
Chapter 3
Design of a QoS-Aware Machine Learning Model for High Trust Communications in Wireless Networks
Shaikh Shahin Sirajuddin and Dilip G. Khairnar
Abstract Incorporating trust levels in wireless networks always involves a trade-off between quality of service (QoS) and control overheads. It requires an effective trust model that favors nodes with high trust values. Trusted nodes perform well on the parameters that define quality, such as high packet delivery ratio, high energy efficiency, and minimum delay. A protocol designed for this purpose must consider these parameters and classify each node as malicious or legitimate. This classification must also be accompanied by an effective routing mechanism that chooses the shortest legitimate path for better QoS. To design such a dynamic trust-based routing network, this work proposes a machine learning model for QoS-aware routing. The model considers node-to-node distance, node energy, and node clustering as performance parameters in order to improve the overall routing trust levels. The proposed algorithm is compared with an ad hoc on-demand distance vector (AODV)-based non-trust routing algorithm, and an improvement in QoS is observed: both delay and energy consumption are reduced when compared to the existing AODV-based non-trust routing network.

Keywords Trust · Machine learning · Clustering · Routing · QoS
S. S. Sirajuddin (B) · D. G. Khairnar
SSPU University, Pune, Maharashtra, India
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_3

3.1 Introduction

In order to establish trust in any mobile network, both direct and indirect trust evaluation are important. Direct trust evaluation considers parameters like node-to-node distance and node energy levels, while indirect trust evaluation considers parameters like node cluster level and packet forwarding ratio. A wireless network follows the process below in order to establish trust-based communications.

• Initialize network parameters, and select source and destination nodes for communication
• Monitor behavioral node data, which includes
  – Node positions
  – Node energy levels
  – Packets successfully communicated by each node
  – Cluster positions for nodes, etc.
• Using these parameters, evaluate direct and indirect trust values, and provide these values to a trust manager
• The trust manager connects to a certificate authority to evaluate the internal nodes which will be used for communication
• Here, trust levels for each participating node are evaluated, and nodes are categorized as trusted or malicious using a classification algorithm
• Trusted nodes are used for communication, while untrusted nodes are classified as malicious nodes and are separated from the network
• The process is repeated for all communication.
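The monitoring and classification steps listed above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the equal weighting of direct and indirect trust, the 0.5 threshold, the normalised energy scale, and all names (`Node`, `direct_trust`, `indirect_trust`, `classify`) are hypothetical, and a fixed threshold stands in for the classification algorithm, which the text does not specify.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Node:
    node_id: int
    x: float
    y: float
    energy: float    # residual energy, assumed normalised to [0, 1]
    forwarded: int   # packets forwarded successfully
    received: int    # packets received for forwarding


def direct_trust(node: Node, peer: Node, max_range: float) -> float:
    """Direct trust from node-to-node distance and energy (assumed equal weights)."""
    closeness = max(0.0, 1.0 - hypot(node.x - peer.x, node.y - peer.y) / max_range)
    return 0.5 * closeness + 0.5 * node.energy


def indirect_trust(node: Node) -> float:
    """Indirect trust taken as the packet forwarding ratio."""
    return node.forwarded / node.received if node.received else 0.0


def classify(nodes, peer, max_range=100.0, threshold=0.5):
    """Split node ids into trusted and malicious lists using a fixed threshold."""
    trusted, malicious = [], []
    for n in nodes:
        score = 0.5 * direct_trust(n, peer, max_range) + 0.5 * indirect_trust(n)
        (trusted if score >= threshold else malicious).append(n.node_id)
    return trusted, malicious


source = Node(0, 0.0, 0.0, 1.0, 0, 0)
candidates = [
    Node(1, 10.0, 10.0, 0.9, 95, 100),  # close, high energy, forwards most packets
    Node(2, 80.0, 60.0, 0.2, 10, 100),  # far, low energy, drops most packets
]
trusted, malicious = classify(candidates, source)
print("trusted:", trusted, "malicious:", malicious)
```

In a fuller model, the threshold step would be replaced by the trained classifier, and the trust manager would re-run the evaluation for every new communication.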
3.2 Literature Review

In order to establish trust in wireless networks, researchers have proposed various techniques which analyze network parameters and assist in deploying route optimizations, packet transfer optimizations, etc., in the network. These techniques either use single algorithms or combine multiple algorithms for effective trust deployment. For instance, the work in [1] uses a fusion of chicken and dragonfly optimization techniques in order to identify trusted nodes for network routing. This work divides nodes into different clusters, the heads of which are selected using Multi-Objective Taylor Crow Optimization (MOTCO). Nodes selected via the MOTCO algorithm further undergo integrity, consistency, availability, and forwarding checks in order to estimate their trust levels. Hybrid techniques are also termed ensemble techniques, and they tend to improve overall trust establishment efficiency for any network. A similar ensemble technique for trust evaluation can be observed in [2], wherein Tanimoto Support Vector Regression-Based Corrective Linear Program Boost Classification (T-SVR-CLPBC) is proposed. The proposed technique improves secure communication performance by minimizing delay and improving packet delivery ratio for the network. This evaluation can be further enhanced by comparing the ensemble trust performance with trust and reputation management models as suggested in [3], wherein models like Distributed Reputation-based Beacon Trust System (DRBTS), cluster head-based trust, multi-version multi-path (MVMP), parameterized and localized trust management scheme for sensor networks security (PLUS), etc., are described and their performance is compared. Another secure, trustworthy, and energy-efficient trust-based routing model using fuzzy decisions is suggested in [4].
In this work, the multidimensional scaling-map (MDS MAP) routing algorithm is used along with fuzzy logic for trust modeling, due to which overall energy consumption is reduced by 15%, end-to-end delay is reduced by 10%
and packet delivery ratio is improved by 8% when compared with the standard Trust-Aware Routing Framework (TARF) and Trust and Centrality degree-Based Access Control (TC-BAC) algorithms. This comparison can be further extended by referring to other standard algorithms for cluster-based trust management. For example, the work in [5] proposes the use of a low-energy adaptive clustering hierarchy (LEACH)-based trust establishment protocol, which considers member-to-member, master-to-member, master-to-master, and master-to-base-station trust values. It also considers dispositional risk while evaluating the final trust value for any node. The performance can be further improved using a deep-ensemble trust model as described in [6], wherein agent-based trust and reputation management (ATRM), quality-of-security-based distance vector routing (QDV), agent-based trust model (ATSN), reputation-and-trust-based system network (RFSN), collaborative reputation mechanism to enforce node cooperation (CORE), distributed reputation-based beacon trust system (DRBTS), and bio-inspired trust and reputation model for wireless sensor network (BTRM-WSN) are combined. A combination of these models allows improved routing and clustering mechanisms to be put in place, such that the nodes selected for performing communication have high energy, low distance, high packet forwarding ratio, and minimum packet loss. Another interesting research model that uses the Random Repeat Trust Computational Approach (RRTCA) is proposed in [7]. This model continuously updates internal routing mechanisms to add and delete trustworthy nodes in the network. This addition and deletion is done by checking each node's previous QoS values; if these values go below a threshold, the nodes are removed from the trust list, and the thresholds are adjusted. Depending upon these new thresholds, nodes are rescanned, and the probability of non-trust communication is reduced.
Another multi-hop routing framework for trust evaluation can be observed from [8], wherein network parameters like node bandwidth and its rank are considered while performing inter-node routing. Rank is evaluated using node-to-node distance, energy, and traffic intensity between the selected nodes. The algorithm suggests an improvement of 5% in both delay and packet delivery ratio (PDR) performance when compared to Amodu [8] and Tam [8] approaches, but the PDR values mentioned in this paper are over 100%, which suggests re-evaluation of this work before actual implementation. Trust computations are also extended to industrial communications and are needed for improved QoS performance. For instance, the work in [9] proposes GDTMS, which is a Gaussian distribution-based comprehensive trust management system for fog computing-based Industrial Internet of Things (IIoT) devices. This protocol can be further evaluated with other trust-based routing protocols as discussed in [10]. Moreover, the work in [10] also discusses different attacks in wireless networks which include both active and passive attacks. An availability predictive trust factor-based semi-Markov mechanism (APTFSMM) is proposed in [11], wherein a semi-Markov process is adopted for cluster head selection. The performance can be improved via the use of optimization algorithms like Genetic Algorithm, Particle Swarm Optimization, etc. The work in [12] suggests the use of whale optimization for selection of best cluster heads in network communication. The algorithm considers node’s residual energy, average forwarding ratio, average cluster distance, transmission delays, and
node traffic density while performing communication. The work in [15] suggests use of clustering along with protected data aggregation for improved performance in energy optimization. Using these observations, the proposed machine learning dynamic clustering protocol is designed. This protocol is described in detail in the next section, which is followed by its comparative analysis and evaluation. Details about network structure and simulation parameters are also mentioned in the result section of this text.
3.3 Proposed QoS-Aware Machine Learning Model for High Trust Communications in Wireless Networks (QAMLHT)

The proposed architecture (Fig. 3.1) works using the following process.

• The network is set up, and source and destination nodes are selected. Let these nodes be named S and D, respectively.
• Apply k-Means to cluster nodes depending upon their energy levels.
• Let the cluster number of the source node be ‘c,’ and the cluster number of the destination node be ‘d.’
• Evaluate the following equation for all nodes in the current cluster to apply Dempster–Shafer (DS) optimization.

$$f_{de} = \frac{(x_c - x_d)^2 + (y_c - y_d)^2}{E_c \cdot Fr_d} \tag{3.1}$$

where $x_c$ is the x-position of the currently selected node from cluster ‘c,’ $x_d$ is the x-position of a single node in the lower numbered cluster, $y_c$ is the y-position of the currently selected node from cluster ‘c,’ $y_d$ is the y-position of a single node in the lower numbered cluster, $Fr_d$ is the forwarding ratio of the node in cluster ‘d,’ and $E_c$ is the energy of the currently selected node from cluster ‘c.’

• Select the node from cluster number ‘d’ which has the minimum value of $f_{de}$
• Nodes with the minimum value of $f_{de}$ are nodes in the direction of the destination, having minimum distance and maximum energy
• Because distance, forwarding ratio, and energy of these nodes are all considered, the overall trust levels increase: the selected nodes have the highest lifetime, lowest communication delay, and highest packet delivery performance
• In case of node failure, the value of the forwarding ratio reduces, thereby removing the node from the selected communication sequence
• The entire process is repeated for any new communication sequence.

Fig. 3.1 Proposed QAMLHT architecture
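The clustering and selection steps above can be sketched as follows. This is an illustrative sketch only: the one-dimensional k-means on energy, the dictionary layout of a node, the treatment of candidates as "nodes in the next cluster toward the destination", and the function names `kmeans_1d`, `f_de`, and `next_hop` are assumptions made here. The cost follows Eq. (3.1) as printed, that is, the squared distance divided by the product of energy and forwarding ratio.

```python
def kmeans_1d(values, k, iters=20):
    """Plain 1-D k-means on scalar values; returns one cluster label per value."""
    lo, hi = min(values), max(values)
    # Spread the initial centres evenly over the value range (an assumed init).
    centres = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: nearest centre for every value.
        labels = [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]
        # Update step: each centre moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centres[c] = sum(members) / len(members)
    return labels


def f_de(current, candidate):
    """Eq. (3.1): squared distance divided by (E_c * Fr_d)."""
    dist_sq = (current["x"] - candidate["x"]) ** 2 + (current["y"] - candidate["y"]) ** 2
    return dist_sq / (current["energy"] * candidate["fr"])


def next_hop(current, candidates):
    """Return the candidate with the smallest f_de, skipping nodes that forward nothing."""
    usable = [n for n in candidates if n["fr"] > 0]
    return min(usable, key=lambda n: f_de(current, n)) if usable else None


# Hypothetical node energies, clustered into three energy levels.
energies = [0.9, 0.85, 0.5, 0.45, 0.2, 0.1]
labels = kmeans_1d(energies, k=3)

# Hypothetical current node and candidate next hops.
current = {"id": "S", "x": 0.0, "y": 0.0, "energy": 0.9}
candidates = [
    {"id": "A", "x": 30.0, "y": 40.0, "fr": 0.9},  # farther, but forwards reliably
    {"id": "B", "x": 20.0, "y": 20.0, "fr": 0.2},  # closer, but drops most packets
    {"id": "C", "x": 10.0, "y": 10.0, "fr": 0.0},  # failed node, never selected
]
chosen = next_hop(current, candidates)
print(labels, chosen["id"])
```

In this toy input the closer node B loses to node A because its low forwarding ratio inflates the cost, which mirrors the observation that a failing node drops out of the selected communication sequence.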
3.4 Statistical Analysis

In order to evaluate the performance of the proposed machine learning inspired trust-based routing protocol, the following network parameters were considered.

MAC and routing protocol: Mac/802.11 and AODV
Number of nodes: 30 to 100
Network size: 300 × 300
Packet size and interval: 1000 bytes per packet, and 0.01 s per packet

From the evaluations in Table 3.1, it is observed that end-to-end delay has been reduced by 35% when compared with the AODV and ATSN protocols. Similar observations can be made for energy and throughput from Tables 3.2 and 3.3.

Table 3.1 Delay performance

No. of nodes    Delay (ms)
                AODV      ATSN [6]    Proposed
20              0.26      0.30        0.15
30              0.27      0.31        0.16
40              0.35      0.40        0.21
50              0.49      0.56        0.29
60              0.58      0.67        0.35
70              0.66      0.76        0.40
80              0.78      0.90        0.47
90              0.89      1.02        0.53
100             0.92      1.05        0.55

Table 3.2 Energy performance

No. of nodes    Energy (mJ)
                AODV      ATSN [6]    Proposed
20              5.30      6.10        3.17
30              5.90      6.79        3.52
40              6.20      7.13        3.70
50              6.50      7.48        3.88
60              6.70      7.71        4.00
70              6.90      7.94        4.12
80              7.50      8.63        4.48
90              7.90      9.09        4.72
100             0.92      1.05        0.55

Table 3.3 Throughput performance

No. of nodes    Throughput (kbps)
                AODV      ATSN [6]    Proposed
20              321.00    279.13      500.11
30              335.00    291.30      521.92
40              337.00    293.04      525.04
50              339.00    294.78      528.15
60              342.00    297.39      532.83
70              345.00    300.00      537.50
80              347.00    301.74      540.62
90              352.00    306.09      548.41
100             355.00    308.70      553.08

Via this evaluation, it can be observed that overall network lifetime is improved by almost 40% when compared with AODV, and by almost 30% when compared with the standard ATSN protocol. This improvement allows the system to be applied to real-time use cases like low-powered body sensor networks. An increase of 25% in throughput can be observed, which makes the system applicable to high-speed communication applications. From the results, it can be observed that the proposed trust-based routing implementation is able to improve the delay, energy, and throughput performance of the network by 35%, 30%, and 25%, respectively, while keeping packet delivery ratio performance at a respectable value when compared with the standard AODV and ATSN protocols.
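As a worked illustration of how a percentage improvement can be derived from tabulated values, the sketch below computes the mean per-row delay reduction from the Table 3.1 figures. The averaging convention (mean of row-wise relative reductions) and the function name `mean_relative_reduction` are assumptions made here; the text does not state how its percentages were obtained, and other conventions (ratio of totals, a single network size) give different figures, so the result need not coincide with the values quoted above.

```python
# Delay values (ms) copied from Table 3.1: (nodes, AODV, ATSN, proposed)
DELAY_MS = [
    (20, 0.26, 0.30, 0.15),
    (30, 0.27, 0.31, 0.16),
    (40, 0.35, 0.40, 0.21),
    (50, 0.49, 0.56, 0.29),
    (60, 0.58, 0.67, 0.35),
    (70, 0.66, 0.76, 0.40),
    (80, 0.78, 0.90, 0.47),
    (90, 0.89, 1.02, 0.53),
    (100, 0.92, 1.05, 0.55),
]


def mean_relative_reduction(rows, baseline_col, proposed_col=3):
    """Average of (baseline - proposed) / baseline over all rows, in percent."""
    ratios = [(r[baseline_col] - r[proposed_col]) / r[baseline_col] for r in rows]
    return 100.0 * sum(ratios) / len(ratios)


vs_aodv = mean_relative_reduction(DELAY_MS, baseline_col=1)
vs_atsn = mean_relative_reduction(DELAY_MS, baseline_col=2)
print(f"mean delay reduction vs AODV: {vs_aodv:.1f}%")
print(f"mean delay reduction vs ATSN: {vs_atsn:.1f}%")
```

The same function can be applied to the energy table unchanged; for throughput, where larger is better, the sign of the numerator would need to be reversed.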
3.5 Conclusion and Future Scope

Due to the use of energy, distance, and packet forwarding ratio during routing for trust establishment, the overall node selection process is optimized. This optimization allows the network to select proper paths while performing node-to-node communication.
As a result, end-to-end delay is reduced by 35% and energy consumption is reduced by 30%, which indicates that the proposed protocol is not only secure but also delivers high QoS during real-time communications. An increase of 25% in throughput allows the system to be used for high-speed networks and provides higher bandwidth for node-to-node communications. Moreover, the system can be combined with better machine learning models to further improve its real-time performance.
References

1. Rodrigues, P., John, J.: Joint Trust: An Approach for Trust-Aware Routing in WSN, pp. 1–16. Springer Science+Business Media, LLC, part of Springer Nature (2020)
2. Anitha Josephine, J., Senthilkumar, S.: Tanimoto Support Vector Regressive Linear Program Boost-Based Node Trust Evaluation for Secure Communication in MANET, pp. 1–21. Springer Science+Business Media, LLC, part of Springer Nature (2020)
3. Momani, M.: Trust Models in Wireless Sensor Networks: A Survey, pp. 37–46. Springer-Verlag Berlin Heidelberg (2010)
4. Beheshtiasl, A., Ghaffari, A.: Secure and trust-aware routing scheme. In: Wireless Sensor Networks, pp. 1–16. Springer Science+Business Media, LLC, part of Springer Nature (2019)
5. Ramesh, S., Yaashuwanth, C.: Enhanced approach using trust-based decision making for secured wireless streaming video sensor networks. Multimedia Tools Appl. 1–20 (2019)
6. Chen, H., Wu, H., Zhou, X., Gao, C.: Agent-based trust model in wireless sensor networks. In: Eighth ACIS International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing (SNPD 2007), vol. 3, pp. 119–124. IEEE (2007)
7. Nivedita, V., Nandhagopal, N.: Improving QoS and efficient multi-hop and relay based communication frame work against attacker in MANET, pp. 1–11. Springer-Verlag GmbH Germany, part of Springer Nature (2020)
8. Alghamdi, T.A.: Route Optimization to Improve QoS in Multi-Hop Wireless Sensor Networks. Springer Science+Business Media, LLC, part of Springer Nature (2020)
9. Fang, W., Zhang, W., et al.: TMSRS: Trust Management-Based Secure Routing Scheme in Industrial Wireless Sensor Network with Fog Computing, pp. 1–14. Springer Science+Business Media, LLC, part of Springer Nature (2019)
10. He, J.Y., Xu, F.: Research on Trust-Based Secure Routing in Wireless Sensor Networks. ISCME (2019)
11. Amuthan, A., Arulmurugan, A.: An availability predictive trust factor-based semi-Markov mechanism for effective cluster head selection in wireless sensor networks. Int. J. Commun. Syst. 1–16 (2019)
12. Sharma, R., Vashisht, V.: WOATCA: a secure and energy aware scheme based on whale optimisation in clustered wireless sensor networks. IET Commun. 1199–1208 (2020)
13. Umamaheswari, S.: Performance analysis of wireless sensor networks assisted by on-demand-based cloud infrastructure. Int. J. Commun. Syst. 1–11 (2020)
14. Reza, M., Aghdam, G., et al.: Space-time block coding in millimeter wave large-scale MIMO-NOMA transmission scheme. Int. J. Commun. Syst. 1–8 (2020)
15. Fakhet, W., El Khediri, S., et al.: New K-means Algorithm for Clustering in Wireless Sensor Networks
Chapter 4
Application of Machine Learning in Mineral Mapping Using Remote Sensing
Priyanka Nair, Devesh Kumar Srivastava, and Roheet Bhatnagar
Abstract Machine learning is an effective approach to finding patterns in voluminous data, popularly termed big data. Remote sensing is one field in which ML concepts can be employed to find solutions to several environmental problems. From the raw spatial data captured by sensors like LANDSAT, meaningful insights can be drawn that adhere to the specifics of the domain. The images are captured in the form of electromagnetic waves, often termed spectral signatures, based on the reflectance properties of elements on the earth's surface. The paper intends to showcase the relevance of machine learning concepts to a specific area of application in geosciences, with the identification of potential mineral mapping areas as the key objective. To derive the most appropriate results, the focus is on key indicators on the earth's surface, where the dataset is band mapped with multispectral data. Spectral resolution is a key concept that provides an unambiguous picture of mineral spectra across the spectral regions. Image classification provides classifiers whose accuracy can be further assessed to determine several regions, including vegetation, soil, and water. The paper intends to delve into the intricacies of remote sensing as an effective tool for capturing data in the form of spectral signatures, and into the role of machine learning algorithms in effective geospatial analysis to derive regions of exploration interest, with extensive scrutiny of work in the study area.

Keywords Machine learning · Remote sensing · Geospatial analysis · Mineral mapping · Spectral signatures · Spectral unmixing
P. Nair (B) · D. K. Srivastava · R. Bhatnagar
Manipal University Jaipur, Jaipur, Rajasthan, India
e-mail: [email protected]
R. Bhatnagar
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_4
4.1 Background

The application of data science and machine learning is emerging as a promising field that aids solutions in geosciences and remote sensing. Remote sensing is an effective tool for acquiring satellite imagery that can be applied to various geological challenges and areas of improvement. The spatial data acquired can be employed effectively to draw insights from the surface about geological, petrophysical, and logging data. Several potential regions with mineral deposits have been identified on the earth's surface, and research is being carried out to explore regions with substantial mineral reserves below the surface. Even though site mapping and physical scrutiny of geological assets are more effective, aerial and satellite capture through remote sensing facilitates insight into wider geographical locations and captures the required data in the required format, especially in an era where technology has become the primary means of addressing the environmental crisis. Remote sensing can be used to collect valuable information in the form of satellite imagery that can be supplemented with machine learning processing to draw meaningful insights into discovering potential areas of mineral mapping [1]. However, aerial photographs are limited in their ability to provide accurate information about inaccessible regions deep below the earth's surface. Even though the depth data is limited to micrometers to centimeters, the data can still be mapped well with supporting clues. Indirect tracing can be carried out using indicators like surrounding anomalies, oxidation, lithological structure, and areas with potential alteration. The reconnaissance of locations with mineral resources relies on the basic geological construct of the region [2].
To work on the direct and indirect data obtained from satellite imagery, multispectral and hyperspectral sensors play a major role in mineral exploration and mineral mapping. Several hypotheses can be taken into consideration while formulating and processing data concerning mineralogy. The type of rock can indicate the presence of a specific kind of mineral in the region. Environmental conditions affect rock alteration, and thereby, spatial associations can also be derived effectively. From these hypotheses, we can infer that geological structure and existing rock composition can suggest data points of mineral deposits to map accordingly [3]. Satellite imagery is associated with spectral signatures captured in the form of electromagnetic waves. For minerals and rocks, the spectral signatures can be analyzed to detect anomalies and patterns in the underlying lattice structure. With terabytes of spectral signatures gathered by the sensors, automated information extraction and recognition can be supported by data mining approaches and, in turn, by suitable machine learning algorithms. After data has been collected from sensors like LANDSAT and MODIS using remote sensing, preprocessing of the data is the next requisite. Principal component analysis (PCA) can be performed on the geospatial images for exploration of mineral deposits [4, 5]. The visual appearances of the rocks are
essential data captured by the sensors that can draw insights on the associated mineral deposits in the nearby locations [6]. The textures and colors of the rocks captured by the sensors absorb at different wavelengths. To comprehend the relationship between lithology and mineral deposits, let us consider an example association. Rocks and minerals of classes like amphibole, biotite, olivine, plagioclase, pyroxene, and quartz differ in their color, fracture, form, and structure [7]. They range from crystalline, flaky, and granular forms to irregular grains. Exemplifying the association further, biotite mica is part of the felsic mineral group, and dark-colored mica is called biotite. Hardness is another scale of measurement for a rock that has evolved over time. The cleavage of the rock can result in two perfect sets or one perfect set. The spectra, or the spectral resolution, depend on the characteristics of the different elements [8]. Landsat 7 and Landsat 8 data can be used to identify mineral deposits through hydrothermal alterations in rocks. Potential mineral mapping regions can be obtained by spectral unmixing of hyperspectral (or multispectral) data based on the spectral signature of a mineral. Each material on the earth's surface has a unique spectral property, or spectral signature [9]. Landsat 8 collects data points and measures reflectance in nine wavelength bands, whereas advanced hyperspectral sensors measure more than 200 bands over the same range of wavelengths. The reflection from land cover increases minutely from the visible range of the spectrum to the infrared range. The spectral band composition of the intended datasets, Landsat 8 Operational Land Imager (OLI) and Landsat 7 Enhanced Thematic Mapper Plus (ETM+), varies in wavelength.
The SWIR band of Landsat 8 OLI ranges between 1.57 and 1.65 µm and captures soil moisture content and vegetation cover, whereas for Landsat 7 ETM+, the band ranges between 1.55 and 1.75 µm. Hydrothermally altered rocks and the associated mineral accumulations are identified between 2.09 and 2.35 µm of the SWIR. The Thermal Infrared Sensor (TIRS) bands correspond to thermal mapping and identification of soil moisture content [10]. The relevance of bands to image classification is evident. Spectral unmixing of land covers is then followed by characterizing the reflectance from the specific land cover. Soil and rocks show different reflectance within their own cover, which is more critical than differentiating the reflectance between varied land covers. A rugged surface has less reflectance due to multiple scattering due to which the surface appears bright. However, if a surface is smooth and even, mirror-like reflectance is observed [11]. A smooth surface can relate to a moist, clay-like texture, which may thereby signify the presence of minerals like iron oxide or another mineral constituent. Advanced learning techniques have proven to be effective in discriminating against targeted aspects of system recovery.
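The spectral unmixing mentioned above can be illustrated with the simplest possible case, a linear mixture of two endmembers. The sketch below is a hedged example rather than the procedure used in the cited studies: the linear mixing model, the restriction to two endmembers, the function name `unmix_two_endmembers`, and all reflectance values are assumptions chosen for illustration.

```python
def unmix_two_endmembers(pixel, endmember_a, endmember_b):
    """Least-squares fraction f of endmember_a in a pixel, assuming the pixel is
    approximately f * endmember_a + (1 - f) * endmember_b. Clipped to [0, 1]."""
    diff = [a - b for a, b in zip(endmember_a, endmember_b)]
    numerator = sum((p - b) * d for p, b, d in zip(pixel, endmember_b, diff))
    denominator = sum(d * d for d in diff)
    if denominator == 0:
        raise ValueError("endmembers are identical, fraction is undefined")
    return min(1.0, max(0.0, numerator / denominator))


# Hypothetical reflectance values in four bands (not measured data).
altered_rock = [0.30, 0.35, 0.45, 0.20]
vegetation = [0.05, 0.08, 0.40, 0.15]

# Synthetic mixed pixel: 70% altered rock, 30% vegetation.
pixel = [0.7 * r + 0.3 * v for r, v in zip(altered_rock, vegetation)]
fraction = unmix_two_endmembers(pixel, altered_rock, vegetation)
print(f"estimated altered-rock fraction: {fraction:.2f}")
```

With more than two endmembers, the same idea becomes a constrained least-squares problem over all abundance fractions, which is typically solved with a numerical library.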
4.2 Related Work

Minerals have several absorption and adsorption bands. Based on the geo-analysis, various patterns of land cover can be determined, including spectral patterns in remote sensing data and ground truth. The proximity between two bands of the spectrum is scrutinized to evaluate the correlation between data values. Multispectral data provides several narrow bands that are useful in differentiating the spatial distribution of entities captured on the spectrum, such as vegetation, soil, and water [12]. The alteration in rocks due to hydrothermal processes and the moisture content in soil can lead to prediction of minerals and their composition in an area. The studies also foreground the relevance of spectral signatures and of accuracy assessment. There is a tradeoff between spatial and spectral resolution. The higher the number of spectral bands, the greater the information we can obtain for better comprehension and visualization [9]. Researchers have also proposed models that determine the uniaxial compressive strength of rocks using genetic programming in order to analyze the stability of structures. Advances in statistical knowledge have paved the way for several machine learning algorithms, including methods like support vector machines. The Bayes point machine has also been applied to image classification. PCA is used in feature extraction for remotely sensed data. Another state-of-the-art method for geospatial analysis is symbolic machine learning (SML), which includes data reduction, sequencing, and analysis of association rules. It pertains to the analogy with genetic association [13]. Machine learning-based methods have been extensively employed in the field of geosciences for varied application areas. In a methodical survey of mineral exploration by Jung et al. [2], the support vector machine proved to be the most widely used technique, followed by deep learning models.
According to the authors, deep learning methods can be used extensively for mineral prospectivity mapping (MPM). Because the number of known deposits in the study was relatively small, the ability to apply standard deep learning was greatly impaired. According to a review of machine learning in mineral analysis using remote sensing data by Shirmard et al. [1], advanced learning techniques such as CNN have proven to be effective in discriminating against targeted aspects of system recovery. Convolutional neural networks (CNN), random forest (RF), and support vector machines give better results for lithological units, types of alteration, and mineral mapping. The use of Bayesian neural networks in remote sensing to detect and map features associated with mineral accumulation could be a new avenue of investigation. The large number of geological data features makes it difficult to apply CNN architectures directly to MPM while achieving high performance and robustness of separation and prediction. Chung et al. [14] revealed how to identify magnesite and related minerals, including dolomite, calcite, and talc, through mineralogical, chemical, and experimental analysis. Future prospects suggest that high-quality lithological data could lead to better predictability. Francisco, Pastor et al. [15] performed the identification of the variscite emanating from the mining complex of Gavà, Barcelona in
Catalonia by implementing a machine learning algorithm using Raman spectra. Raman spectra were used for identifying mining sites. In this study, the selection of spectral bands played a vital role. Li and Chen [16] studied applications of deep convolutional neural networks in prospecting prediction based on two-dimensional geological big data, using the deep convolutional neural network AlexNet. Schnitzler et al. [17] applied the random forest algorithm to estimate a geochemical variable in mining exploration employing multi-sensor core logging data. As an extension to the work, it was proposed to employ multiparameter databases that can give generalized accuracy. The scope of using all variables should be marked, and the most useful data points should be prioritized automatically.
4.3 Geospatial Analysis Using Machine Learning

Geospatial data analysis can be performed in the cloud using Google Earth Engine (GEE). To map the minerals associated with hydrothermal alterations in rocks, either Landsat 8 Operational Land Imager (OLI) or Landsat 7 Enhanced Thematic Mapper Plus (ETM+) data can be used. The Operational Land Imager (OLI) is the instrument onboard the Landsat 8 satellite, launched in February 2013. The satellite provides a repository of images of the earth with a 16-day repeat cycle. Considering portions of India on geographical maps, we select the target area by marking a polygon on GEE or importing the location points. To limit the data acquisition to fewer than 5000 records, filter bounds are applied to the Landsat 8 OLI dataset. As the target area pertains to locations in India, the filter is applied to that country. The properties and features of the Landsat 8 data can be inspected by printing them after the filter bounds are applied. Different combinations of spectral bands are used to comprehend the spectral reflectance. Landsat data can be visualized by mapping it to different color composites; the code snippet is shown in Fig. 4.1. Each spectral band can be viewed in two primary modes, as grayscale (one band) and as an RGB composite (red, green, blue), which results in images displayed with varied brightness. False color composites assign bands other than the visible red, green, and blue to the display channels, so the resulting colors do not correspond to natural human perception. As discussed, once images have been captured by the sensors, they can be classified using image classification. For classification of raster images, an automated approach or JavaScript code can be used, with multispectral images as input. Individual classes are formulated based on the principle of pattern recognition [15]. Each image pixel relates to a specific class. Unsupervised or supervised algorithms may be used, depending on the prior availability of land cover data.
Unsupervised methods are used where analysts have no control over the data, whereas supervised methods are adopted when detailed land cover information is available [18]. Some of the popular supervised algorithms include nearest neighbor, linear and logistic regression, decision trees, support vector machines, artificial neural networks,
P. Nair et al.
Fig. 4.1 LANDSAT data visualization by mapping to different color composites
and random forests. K-means clustering and association rules are popular unsupervised methods. Once the land cover classes are defined in terms of the spectral bands, the training sites are selected. The data points are then generalized as statistical parameters. The accuracy assessment stage is followed by the output stage of image classification. The dataset is imported with the Google Earth Engine (GEE) explorer. We can employ the Landsat 7 ETM+ or the Landsat 8 OLI dataset, as appropriate. The target coordinates are filtered by geometry and then by date (period). The geometry is then imported as a feature collection with a designated class property label. The training data for the classes is marked on the map (the more points are marked, the better the training). The prediction bands are selected and configured. Sampling is performed on the regions using the chosen prediction bands, and the training data is selected. The classifier is then trained using a supervised machine learning algorithm, and the classified image is thereby obtained. Finally, the classified image is added to the map. The flow of the process is shown in Fig. 4.2. Remote sensing and machine learning can be implemented using Google's cloud platform (GEE), QGIS, and machine learning tools. From this, we can infer the workflow for lithological and mineral mapping. The satellite imagery can be acquired on Google's cloud platform and matched with the training area. Ancillary data is marked using geological maps and by defining signature separability. The comparison of classification algorithms is a complex and open problem. First, the notion of performance can be defined in many ways: accuracy, speed, cost, readability, etc. Second, an appropriate tool is necessary to quantify this performance. Third, a consistent method must be selected to compare the measured values.
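The supervised workflow described above (training sites generalized to per-class statistics, followed by per-pixel assignment) can be illustrated with a minimal sketch. It uses a minimum-distance-to-mean classifier in plain Python as a stand-in; it is not the GEE classifier used in the chapter, and the class names and three-band reflectance values are hypothetical.

```python
from math import dist

def train_class_means(samples):
    """Generalize each training site to a statistical parameter: its mean spectrum.

    samples maps a class label to a list of pixel spectra (tuples of band values).
    """
    means = {}
    for label, pixels in samples.items():
        n_bands = len(pixels[0])
        means[label] = tuple(
            sum(pixel[b] for pixel in pixels) / len(pixels) for b in range(n_bands)
        )
    return means

def classify_pixel(pixel, means):
    """Assign the pixel to the class whose mean spectrum is closest (Euclidean)."""
    return min(means, key=lambda label: dist(pixel, means[label]))

# Hypothetical three-band reflectances marked at two training sites.
training = {
    "vegetation": [(0.05, 0.08, 0.45), (0.06, 0.09, 0.50)],
    "bare_rock": [(0.30, 0.32, 0.35), (0.28, 0.31, 0.33)],
}
class_means = train_class_means(training)

# Hypothetical pixels of the image to be classified.
image = [(0.05, 0.07, 0.48), (0.29, 0.30, 0.34)]
classified = [classify_pixel(pixel, class_means) for pixel in image]
```

In a real workflow the pixel spectra would come from the sampled prediction bands, and a stronger classifier (for example, a random forest) would normally replace the nearest-mean rule.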
Once the image has been classified with a suitable method, accuracy assessment is the next vital step toward exploring potential mapping regions. Accuracy assessment addresses the question of how accurate a classified image is. To carry out the assessment, truth data and a sample of pixels from already available images are required. The truth data can be obtained from field
4 Application of Machine Learning in Mineral Mapping …
Fig. 4.2 Supervised classification of land cover in GEE
samples or aerial images. One commonly used and efficient method of accuracy assessment is the construction of a confusion matrix, with measures including overall accuracy, producer's accuracy, and user's accuracy.
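The accuracy measures named above can be computed directly from a confusion matrix. The following is a minimal sketch, assuming the common convention that rows hold the reference (truth) classes and columns hold the classes assigned by the classifier; the class names and counts are hypothetical.

```python
def accuracy_report(matrix, labels):
    """matrix[i][j] = number of pixels of reference class i assigned to class j."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(labels)))
    report = {"overall": correct / total, "producers": {}, "users": {}}
    for i, label in enumerate(labels):
        reference_total = sum(matrix[i])               # row sum: truth pixels of class i
        assigned_total = sum(row[i] for row in matrix)  # column sum: pixels mapped to class i
        report["producers"][label] = matrix[i][i] / reference_total
        report["users"][label] = matrix[i][i] / assigned_total
    return report

# Hypothetical two-class example with 100 validation pixels.
labels = ["mineralized", "barren"]
matrix = [
    [40, 10],  # reference: mineralized
    [5, 45],   # reference: barren
]
report = accuracy_report(matrix, labels)
```

Producer's accuracy reflects omission errors (how much of the truth class was found), while user's accuracy reflects commission errors (how much of the mapped class is correct).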
4.4 Conclusion
The mining sector is poised to play a crucial role in the revival of the economy. It will form the foundation of efforts toward the much-needed recovery of an economy dented by the pandemic. The remote discovery of metallic and non-metallic minerals can pave the way to effective solutions for exploration, thereby leading to exploitation and reclamation. ML-based approaches are widely used in remote sensing-based applications. This paper has presented the role of sensors in the collection of multispectral images and discussed the role of machine learning classifiers for feature extraction. After images are collected in the form of spectral signatures, a suitable ML approach can be adopted, followed by accuracy assessment using suitable statistical analysis. The current work intends to showcase the application of machine learning to satellite imagery, harnessing the potential of remote sensing data. The work focuses on presenting an extensive scrutiny of spectral unmixing between the land covers. An extension of the work will cover spectral unmixing within the land covers and classify potential mining sites based on lithological data points.
References
1. Shirmard, H., Farahbakhsh, E., Muller, D., Chandra, R.: A review of machine learning in processing remote sensing data for mineral exploration. Remote Sens. Environ. 0034-4257 (2021)
2. Jung, D., Choi, Y.: Systematic review of machine learning applications in mining: exploration, exploitation, and reclamation. Minerals 11(2), 1–20 (2021)
3. Rajan Girija, R., Mayappan, S.: Mapping of mineral resources and lithological units: a review of remote sensing techniques. Int. J. Image Data Fusion 10(2), 79–106 (2019)
4. Li, J.: Texture classification of landsat TM imagery using Bayes point machine. In: Proceedings of the Annual Southeast Conference (2013)
5. Lary, D.J., Remer, L.A., MacNeill, D., Roscoe, B., Paradise, S.: Machine learning and bias correction of MODIS aerosol optical depth. IEEE Geosci. Remote Sens. Lett. 6(4), 694–698 (2009)
6. Cardoso-Fernandes, J., Teodoro, A.C., Lima, A., Roda-Robles, E.: Semi-automatization of support vector machines to map lithium (Li) bearing pegmatites. Remote Sens. 12(14) (2020)
7. Bolouki, S.M., Ramazi, H.R., Maghsoudi, A., Pour, A.B., Sohrabi, G.: A remote sensing-based application of Bayesian networks for epithermal gold potential mapping in Ahar-Arasbaran area, NW Iran. Remote Sens. 12(1) (2020)
8. Rajesh, H.M.: Application of remote sensing and GIS in mineral resource mapping: an overview. J. Mineral. Petrol. Sci. 99(3), 83–103 (2004)
9. Nair, P., Srivastava, D.K., Bhatnagar, R.: Remote sensing roadmap for mineral mapping using satellite imagery. In: 2nd International Conference on Data, Engineering and Applications, IDEA 2020 (2020)
10. Lary, D.J., Alavi, A.H., Gandomi, A.H., Walker, A.L.: Machine learning in geosciences and remote sensing. Geosci. Front. 7(1), 3–10 (2016)
11. Kaplan, U.E., Topal, E.: A new ore grade estimation using combine machine learning algorithms. Minerals 10(10), 1–17 (2020)
12. Notesco, G., Kopačková, V., Rojík, P., Schwartz, G., Livne, I., Ben Dor, E.: Mineral classification of land surface using multispectral LWIR and hyperspectral SWIR remote sensing data. A case study over the Sokolov lignite open-pit mines, the Czech Republic. Remote Sens. 6(8), 7005–7025 (2014)
13. Liu, L., Zhou, J., Jiang, D., Zhuang, D., Mansaray, L.R., Zhang, B.: Targeting mineral resources with remote sensing and field data in the Xiemisitai area, West Junggar, Xinjiang, China. Remote Sens. 5(7), 3156–3171 (2013)
14. Chung, B., Yu, J., Wang, L., Kim, N.H., Lee, B.H., Koh, S., Lee, S.: Detection of magnesite and associated gangue minerals using hyperspectral remote sensing: a laboratory approach. Remote Sens. 12(8) (2020)
15. Díez-Pastor, J.F., Jorge-Villar, S.E., Arnaiz-González, Á., García-Osorio, C.I., Díaz-Acha, Y., Campeny, M., Bosch, J., Melgarejo, J.C.: Machine learning algorithms applied to Raman spectra for the identification of variscite originating from the mining complex of Gavà. J. Raman Spectrosc. 51(9), 1563–1574 (2020)
16. Li, S., Chen, J., Xiang, J.: Applications of deep convolutional neural networks in prospecting prediction based on two-dimensional geological big data. Neural Comput. Appl. 32(7), 2037–2053 (2020)
17. Schnitzler, N., Ross, P.S., Gloaguen, E.: Using machine learning to estimate a key missing geochemical variable in mining exploration: application of the Random Forest algorithm to multi-sensor core logging data. J. Geochem. Explor. 205 (2019)
18. Li, T., Zuo, R., Xiong, Y., Peng, Y.: Random-drop data augmentation of deep convolutional neural network for mineral prospectivity mapping. Nat. Resour. Res. 30(1), 27–38 (2021)
Chapter 5
An Improved Energy Conservation Routing Mechanism in Heterogeneous Wireless Sensor Networks
Shashank Barthwal, Sumit Pundir, Mohammad Wazid, and D. P. Singh
Abstract The wireless sensor network (WSN) is considered one of the dominant developments among emerging technologies. A great deal of research has been done in past years, but problems such as power consumption, network lifespan, and network stability still trouble WSNs. An energy-efficient routing protocol plays a significant role in addressing them. Usually, a routing protocol assumes two, three, or four energy levels of sensor nodes. In reality, however, a WSN comprises sensor nodes of several energy levels. This paper proposes the Enhanced Energy Conservation Routing Protocol (EECRP), which has five levels of heterogeneity. The cluster head (CH) is elected using the residual energy, the average energy of the sensor nodes, and the optimum number of clusters per round. Simulation shows that EECRP outperforms SEP, TSEP, and BEENISH: it enhances the stability of the network by 65.18%, 40.36%, and 30.88% and the throughput by 66.68%, 69.13%, and 49.16% in contrast to SEP, TSEP, and BEENISH, respectively.
Keywords Wireless sensor networks · Energy conservation · LEACH · Dead nodes · Alive nodes
S. Barthwal (B) · S. Pundir · M. Wazid · D. P. Singh
Department of Computer Science and Engineering, Graphic Era Deemed to be University, Dehradun 248002, India
e-mail: [email protected]
S. Pundir e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_5

5.1 Introduction
A WSN comprises numerous nodes, which gather data from inaccessible regions and forward it to the base station (BS) [1]. Sensor nodes play an important role in monitoring numerous environmental factors such as humidity, sound, light, temperature, pressure, and pollutants [2]. Military surveillance, medical diagnosis, and traffic management are some of the areas where WSN is
extensively used [3]. However, the energy resources of a sensor node are restricted [4]; therefore, clustering is commonly used to minimize energy depletion [5]. Clustering boosts the network lifetime and substantially reduces energy consumption [6]. Two types of WSN are as follows:
• Homogeneous WSNs.
• Heterogeneous WSNs.
In homogeneous WSNs, all nodes have the same initial energy, whereas in heterogeneous WSNs, nodes vary in terms of initial energy [7]. WSN protocols are classified as:
• Proactive protocols.
• Reactive protocols.
In proactive protocols, sensor nodes collect data from various geographic regions and continuously forward it to their respective CHs, which forward it further to the BS, whereas in reactive protocols, data is forwarded to the CHs only if there is a drastic change in the sensed value [8]. According to the radio energy dissipation model in Fig. 5.1 [9], the energy expended to transfer K-bit data over a distance d is:

$$E_{TX}(K, d) = \begin{cases} K E_{elec} + K d^{2} E_{fs} & \text{if } d < d_0 \\ K E_{elec} + K d^{4} E_{mp} & \text{if } d \ge d_0 \end{cases} \tag{5.1}$$

Here, the receiver/transmitter circuit consumes $E_{elec}$ energy per bit. If the distance $d$ is greater than or equal to the threshold distance $d_0$, the multi-path energy model ($E_{mp}$) applies; otherwise, the free space model ($E_{fs}$) applies. The threshold distance $d_0$ is:

$$d_0 = \sqrt{\frac{E_{fs}}{E_{mp}}} \tag{5.2}$$

Fig. 5.1 Dissipation of radio energy model

The energy required to receive K-bit data ($E_{RX}$) is:

$$E_{RX}(K) = K \cdot E_{elec} \tag{5.3}$$

During data fusion, the following amount of energy is expended:

$$E_{dx}(K) = p \cdot K \cdot E_{da} \tag{5.4}$$

Here, $p$ is the number of data packets and $E_{da}$ is the energy expended in data fusion of 1-bit data.
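The radio model of Eqs. 5.1 to 5.3 can be sketched in a few lines. The constants below are taken from Table 5.1, with the per-bit electronics energy set to the 50 nJ/bit listed there for transmission and reception; the unit J/bit/m^4 for the multi-path coefficient and the 4000-bit packet size are assumptions made for illustration.

```python
from math import sqrt

E_ELEC = 50e-9   # J/bit, electronics energy per bit (Table 5.1)
E_FS = 10e-12    # J/bit/m^2, free-space amplifier coefficient (Table 5.1)
E_MP = 1.3e-15   # J/bit/m^4, multi-path amplifier coefficient (unit assumed)

# Threshold distance of Eq. 5.2: the two branches of Eq. 5.1 meet at d = D0.
D0 = sqrt(E_FS / E_MP)

def tx_energy(k_bits, d):
    """Energy in joules to transmit k_bits over distance d in metres (Eq. 5.1)."""
    if d < D0:
        return k_bits * E_ELEC + k_bits * d**2 * E_FS
    return k_bits * E_ELEC + k_bits * d**4 * E_MP

def rx_energy(k_bits):
    """Energy in joules to receive k_bits (Eq. 5.3)."""
    return k_bits * E_ELEC

# Hypothetical 4000-bit packet sent over 50 m (free-space branch, since 50 < D0).
packet_bits = 4000
energy_50m = tx_energy(packet_bits, 50.0)
```

With these constants the threshold distance is roughly 88 m, so most links inside a 100 m × 100 m field fall in the free-space branch.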
5.2 Related Work
Heinzelman et al. [10] discussed LEACH, which is considered the earliest reactive and hierarchical routing protocol for WSNs. It first forms clusters and then selects a CH for each cluster. For the selection of CHs, every sensor node in the network generates a random number, and a threshold value is calculated for each node as follows:

$$T(A) = \begin{cases} \dfrac{P}{1 - P\left(r \bmod \frac{1}{P}\right)} & \text{if } A \in Y \\ 0 & \text{otherwise} \end{cases} \tag{5.5}$$

Here, $P$ is the desired fraction of CHs, $r$ is the current round, and $Y$ is the set of nodes that have not been CHs in the last $1/P$ rounds. A sensor node is selected as a CH only when the random number it generates is less than the threshold value. LEACH works in this way, but it is not suitable for heterogeneous WSNs. The Stable Election Protocol (SEP) was proposed by Smaragdakis et al. [11] for heterogeneous wireless sensor networks. In SEP, there are two types of nodes: some are advance nodes, and the rest are normal nodes. The heterogeneity of sensor nodes assures the reliability of the sensor network. SEP works on weighted election probabilities. The threshold values for normal and advance nodes are calculated as follows:

$$T(A_{nrm}) = \begin{cases} \dfrac{P_{nrm}}{1 - P_{nrm}\left(r \bmod \frac{1}{P_{nrm}}\right)} & \text{if } A_{nrm} \in Y \\ 0 & \text{otherwise} \end{cases} \tag{5.6}$$

$$T(A_{adv}) = \begin{cases} \dfrac{P_{adv}}{1 - P_{adv}\left(r \bmod \frac{1}{P_{adv}}\right)} & \text{if } A_{adv} \in Y \\ 0 & \text{otherwise} \end{cases} \tag{5.7}$$
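The LEACH-style threshold of Eq. 5.5 can be sketched as follows. The function and variable names are hypothetical, and rounding 1/P to an integer epoch length is an assumption made so that the modulo is well defined; SEP would call the same function with its weighted probabilities for normal and advance nodes.

```python
import random

def leach_threshold(p, r, eligible):
    """Threshold T(A) of Eq. 5.5 for round r and CH probability p.

    eligible is True when the node has not been a CH in the last 1/p rounds.
    """
    if not eligible:
        return 0.0
    epoch = round(1 / p)  # assumed integer number of rounds per epoch
    return p / (1 - p * (r % epoch))

def elects_itself(p, r, eligible, rng):
    """A node becomes CH when its random draw falls below the threshold."""
    return rng.random() < leach_threshold(p, r, eligible)

# With p = 0.1 the threshold grows from 0.1 at the start of an epoch to 1.0
# in its last round, so every still-eligible node is elected by then.
thresholds = [leach_threshold(0.1, r, True) for r in range(10)]
```

The rising threshold is what guarantees that each node serves as CH once per epoch on average.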
Afterward, numerous routing protocols such as Distributed Energy-Efficient Clustering [12], Heterogeneity-aware Hierarchical Stable Election Protocol [13], and Threshold-Sensitive Energy-Efficient Routing [14] were proposed for heterogeneous WSNs. Threshold-Sensitive Stable Election Protocol was proposed by Kashaf et al. [15] with three levels of heterogeneity. It considered advance nodes, intermediate nodes, and normal nodes which have different energies. TSEP enhanced network lifespan as well as stability, and due to heterogeneous sensor nodes, it also
enhanced the network throughput. The Balanced Energy-Efficient Network-Integrated Super Heterogeneous (BEENISH) protocol was proposed by Qureshi et al. [16], who considered four levels of heterogeneity. The selection of the CH is based on the average energy and residual energy of the network. As a result, higher-energy sensor nodes have a better probability of being chosen as CH than lower-energy nodes.
5.3 Proposed Work
This section discusses the proposed EECRP protocol in detail. It is based on TSEP and has five levels of heterogeneity: ultra-nodes, hyper nodes, mega nodes, super nodes, and normal nodes. $N_N$, $N_S$, $N_M$, $N_H$, and $N_U$ are the total numbers of normal, super, mega, hyper, and ultra-nodes, respectively, and $n$ is the total number of nodes. The proportions of ultra-nodes, hyper nodes, mega nodes, and super nodes are denoted as $b_0$, $b_1$, $b_2$, and $b_3$, respectively.

$$N_U = n \cdot b_0 \tag{5.8}$$

$$N_H = n \cdot b_1 \tag{5.9}$$

$$N_M = n \cdot b_2 \tag{5.10}$$

$$N_S = n \cdot b_3 \tag{5.11}$$

$$N_N = n \cdot (1 - b_0 - b_1 - b_2 - b_3) \tag{5.12}$$

Normal nodes have energy $E_0$. The ultra, hyper, mega, and super nodes have $\delta$, $\varepsilon$, $\zeta$, and $\iota$ times more energy than normal nodes, respectively. The energy levels of ultra, hyper, mega, super, and normal nodes are expressed as $E_U$, $E_H$, $E_M$, $E_S$, and $E_N$, respectively.

$$E_U = E_0 (1 + \delta) \tag{5.13}$$

$$E_H = E_0 (1 + \varepsilon) \tag{5.14}$$

$$E_M = E_0 (1 + \zeta) \tag{5.15}$$

$$E_S = E_0 (1 + \iota) \tag{5.16}$$
Now, the equations for the total energy of ultra-nodes, hyper nodes, mega nodes, super nodes, and normal nodes are as follows:

$$TE_U = N_U \cdot E_U = n \cdot b_0 \cdot E_0 (1 + \delta) \tag{5.17}$$

$$TE_H = N_H \cdot E_H = n \cdot b_1 \cdot E_0 (1 + \varepsilon) \tag{5.18}$$

$$TE_M = N_M \cdot E_M = n \cdot b_2 \cdot E_0 (1 + \zeta) \tag{5.19}$$

$$TE_S = N_S \cdot E_S = n \cdot b_3 \cdot E_0 (1 + \iota) \tag{5.20}$$

$$TE_N = N_N \cdot E_N = n \cdot (1 - b_0 - b_1 - b_2 - b_3) \cdot E_0 \tag{5.21}$$
The network's total energy is calculated as follows:

$$E_{Total} = TE_N + TE_S + TE_M + TE_H + TE_U = n E_0 (1 + b_0 \delta + b_1 \varepsilon + b_2 \zeta + b_3 \iota) \tag{5.22}$$

As a result, the system's total energy is raised by a factor of $1 + b_0 \delta + b_1 \varepsilon + b_2 \zeta + b_3 \iota$. The total energy dissipated in the sensor network per round ($E_{Round}$) can be calculated as follows:

$$E_{Round} = Z \left( 2 n E_{elec} + n E_{DA} + C E_{fs} d_{toBS}^{2} + n E_{fs} d_{toCH}^{2} \right) \tag{5.23}$$

Here, $Z$ is the data size in bits, $C$ denotes the number of clusters, $d_{toBS}$ is the average distance between the CH and the BS, and $d_{toCH}$ is the average distance between a sensor node (SN) in a cluster and its CH. Setting the derivative of $E_{Round}$ with respect to $C$ to zero gives the optimal number of clusters:

$$C_{optimum} = \frac{\sqrt{n}}{\sqrt{2\pi}} \sqrt{\frac{E_{fs}}{E_{mp}}} \, \frac{B}{d_{toBS}} \tag{5.24}$$

The optimal probability of an SN becoming a CH is given below:

$$P_0 = \frac{C_{optimum}}{n} \tag{5.25}$$
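The node counts and energies of Eqs. 5.8 to 5.22 can be checked numerically. In the sketch below, n, E_0, and the node fractions come from Table 5.1, whereas the energy factors δ, ε, ζ, and ι are hypothetical values chosen only for illustration, since the source does not list them.

```python
n = 100    # number of nodes (Table 5.1)
E0 = 0.5   # initial energy of a normal node in joules (Table 5.1)

# Fractions b0..b3 of ultra, hyper, mega, and super nodes (Table 5.1).
fractions = {"ultra": 0.10, "hyper": 0.15, "mega": 0.20, "super": 0.25}

# Hypothetical energy factors (delta, epsilon, zeta, iota); not from the source.
energy_factors = {"ultra": 2.0, "hyper": 1.5, "mega": 1.0, "super": 0.5}

# Eqs. 5.8-5.12: node counts per type; the remainder are normal nodes.
counts = {kind: round(n * share) for kind, share in fractions.items()}
counts["normal"] = n - sum(counts.values())

# Eqs. 5.13-5.16: initial energy per node type.
energies = {kind: E0 * (1 + energy_factors[kind]) for kind in fractions}
energies["normal"] = E0

# Eqs. 5.17-5.21 summed, compared with the closed form of Eq. 5.22.
total_by_sum = sum(counts[kind] * energies[kind] for kind in counts)
total_closed_form = n * E0 * (
    1 + sum(fractions[kind] * energy_factors[kind] for kind in fractions)
)
```

Under these assumed factors both expressions give 87.5 J, which is 1.75 times the 50 J of a homogeneous network of normal nodes.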
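The probabilities used for the selection of CHs are as follows:

$$P_U = \frac{P_0 (1 + \delta)}{1 + b_0 \delta + b_1 \varepsilon + b_2 \zeta + b_3 \iota} \tag{5.26}$$

$$P_H = \frac{P_0 (1 + \varepsilon)}{1 + b_0 \delta + b_1 \varepsilon + b_2 \zeta + b_3 \iota} \tag{5.27}$$

$$P_M = \frac{P_0 (1 + \zeta)}{1 + b_0 \delta + b_1 \varepsilon + b_2 \zeta + b_3 \iota} \tag{5.28}$$

$$P_S = \frac{P_0 (1 + \iota)}{1 + b_0 \delta + b_1 \varepsilon + b_2 \zeta + b_3 \iota} \tag{5.29}$$

$$P_N = \frac{P_0}{1 + b_0 \delta + b_1 \varepsilon + b_2 \zeta + b_3 \iota} \tag{5.30}$$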
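One property worth checking in the weighted probabilities of Eqs. 5.26 to 5.30 is that they keep the expected number of CHs per round at n · P_0 while favoring higher-energy nodes. The sketch below assumes the fractions and P_0 from Table 5.1 and hypothetical energy factors (δ, ε, ζ, ι), which the source does not specify.

```python
n = 100    # number of nodes (Table 5.1)
P0 = 0.1   # optimal CH probability (Table 5.1)

fractions = {"ultra": 0.10, "hyper": 0.15, "mega": 0.20, "super": 0.25}
# Hypothetical energy factors (delta, epsilon, zeta, iota); not from the source.
energy_factors = {"ultra": 2.0, "hyper": 1.5, "mega": 1.0, "super": 0.5}

# Common denominator of Eqs. 5.26-5.30.
denominator = 1 + sum(fractions[k] * energy_factors[k] for k in fractions)

# Eqs. 5.26-5.29 for the four advanced types, Eq. 5.30 for normal nodes.
probabilities = {k: P0 * (1 + energy_factors[k]) / denominator for k in fractions}
probabilities["normal"] = P0 / denominator

# Share of each node type, including the remaining normal nodes.
shares = dict(fractions)
shares["normal"] = 1 - sum(fractions.values())

# Expected number of CHs per round, summed over all node types.
expected_chs = sum(n * shares[k] * probabilities[k] for k in shares)
```

The weighting redistributes the election probability across node types without changing the average number of clusters formed per round.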
For an individual sensor node to be elected as a CH, the random number (RN) generated for that node must be less than the threshold value. If this criterion is not met, the SN becomes a cluster member. $E_{AVG}$ can be computed as:

$$E_{AVG} = \frac{1}{n} E_{Total} \tag{5.31}$$
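The threshold for each type of SN can be calculated as follows, where $E_{Res}$ is the residual energy of the node:

$$Th_U = \begin{cases} \dfrac{P_U}{1 - P_U \left( r \bmod \frac{1}{P_U} \right)} \times \dfrac{E_{Res}}{E_{AVG(U)} \cdot C_{optimum}} & \text{if } N_U \in Y' \\ 0 & \text{otherwise} \end{cases} \tag{5.32}$$

$$Th_H = \begin{cases} \dfrac{P_H}{1 - P_H \left( r \bmod \frac{1}{P_H} \right)} \times \dfrac{E_{Res}}{E_{AVG(H)} \cdot C_{optimum}} & \text{if } N_H \in Y'' \\ 0 & \text{otherwise} \end{cases} \tag{5.33}$$

$$Th_M = \begin{cases} \dfrac{P_M}{1 - P_M \left( r \bmod \frac{1}{P_M} \right)} \times \dfrac{E_{Res}}{E_{AVG(M)} \cdot C_{optimum}} & \text{if } N_M \in Y''' \\ 0 & \text{otherwise} \end{cases} \tag{5.34}$$

$$Th_S = \begin{cases} \dfrac{P_S}{1 - P_S \left( r \bmod \frac{1}{P_S} \right)} \times \dfrac{E_{Res}}{E_{AVG(S)} \cdot C_{optimum}} & \text{if } N_S \in Y'''' \\ 0 & \text{otherwise} \end{cases} \tag{5.35}$$

$$Th_N = \begin{cases} \dfrac{P_N}{1 - P_N \left( r \bmod \frac{1}{P_N} \right)} \times \dfrac{E_{Res}}{E_{AVG(N)} \cdot C_{optimum}} & \text{if } N_N \in Y''''' \\ 0 & \text{otherwise} \end{cases} \tag{5.36}$$

$Y'$, $Y''$, $Y'''$, $Y''''$, and $Y'''''$ serve as the sets of ultra-nodes, hyper nodes, mega nodes, super nodes, and normal nodes, respectively, that have not been CHs in the last $1/P_U$, $1/P_H$, $1/P_M$, $1/P_S$, and $1/P_N$ rounds, respectively. $E_{AVG(U)}$, $E_{AVG(H)}$, $E_{AVG(M)}$, $E_{AVG(S)}$, and $E_{AVG(N)}$ represent the average energy of ultra-nodes, hyper nodes, mega nodes, super nodes, and normal nodes, respectively.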
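A minimal sketch of the energy-aware threshold follows, assuming the thresholds take the form of the LEACH-style term multiplied by E_Res / (E_AVG · C_optimum) for the node's type, as read from the equations above. The function name and all numeric inputs are hypothetical, and rounding 1/P to an integer epoch length is an assumption.

```python
def eecrp_threshold(p_type, r, e_res, e_avg_type, c_optimum, eligible):
    """Energy-scaled threshold for one node of a given type.

    p_type: weighted CH probability of the node's type (e.g. P_U or P_N)
    r: current round
    e_res: residual energy of the node (J)
    e_avg_type: average energy of the nodes of the same type (J)
    c_optimum: optimum number of clusters
    eligible: True if the node was not a CH in the last 1/p_type rounds
    """
    if not eligible:
        return 0.0
    epoch = round(1 / p_type)  # assumed integer number of rounds per epoch
    base = p_type / (1 - p_type * (r % epoch))
    return base * e_res / (e_avg_type * c_optimum)

# Two hypothetical nodes of the same type in the same round: the node with
# more residual energy gets the larger threshold and is more likely to win.
fresh_node = eecrp_threshold(0.1, 0, e_res=1.0, e_avg_type=0.8, c_optimum=10, eligible=True)
drained_node = eecrp_threshold(0.1, 0, e_res=0.5, e_avg_type=0.8, c_optimum=10, eligible=True)
```

The residual-energy ratio is what biases the election toward nodes that have drained less than their peers of the same type.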
5.4 Simulation Results
For the simulation, we considered a 100 m × 100 m field with one hundred sensor nodes deployed randomly on it. For simplicity, we assumed the nodes to be either fixed or micro-mobile and ignored the energy lost due to signal collisions. The radio parameters were kept the same and are shown in Table 5.1.

5.4.1 Stability Period
The stability period is the period between network initialization and the death of the first node. Stability plays a significant role in applications where the loss of even a small amount of data can be costly. The reliability of a routing protocol is directly proportional to its stability period. Figure 5.2 shows that the first node dies after 2906 rounds in EECRP, whereas in BEENISH, TSEP, and SEP, the first node dies after 2009, 1733, and 1012 rounds, respectively. EECRP thus enhances stability by 65.18%, 40.36%, and 30.88% when compared to SEP, TSEP, and BEENISH, respectively.
5.4.2 Dead Nodes Versus Round Number
Figure 5.3 shows that the first node dies after 1012, 1733, 2009, and 2906 rounds in SEP, TSEP, BEENISH, and EECRP, respectively, whereas the last node drains its energy after 4804, 4196, 7767, and 9165 rounds, respectively. The proposed EECRP therefore covers a greater number of rounds than the other protocols.

Table 5.1 Simulation parameters

| Parameter | Details | Value |
|---|---|---|
| x_m, y_m | Dimensions of the field | 100 m × 100 m |
| n | Number of nodes | 100 |
| E_0 | Initial energy of a node | 0.5 J |
| P_0 | Probability of becoming a CH | 0.1 |
| E_TX | Energy dissipated while transmitting data | 50 × 10^-9 J/bit |
| E_RX | Energy dissipated while receiving data | 50 × 10^-9 J/bit |
| E_fs | Amplification energy when d < d_0 | 10 × 10^-12 J/bit/m^2 |
| E_mp | Amplification energy when d ≥ d_0 | 1.3 × 10^-15 J/bit/m^4 |
| E_DA | Energy used in data aggregation | 10 × 10^-9 J |
| r_max | Maximum number of rounds | 15,000 |
| b_3, b_2, b_1, b_0 | Fraction of super, mega, hyper, and ultra-nodes | 0.25, 0.2, 0.15, 0.1 |
Fig. 5.2 Stability period versus number of rounds
Fig. 5.3 No. of dead nodes versus No. of rounds
5.4.3 Throughput
The performance of the network can also be estimated on the basis of the number of packets received at the BS. As shown in Fig. 5.4, EECRP transmits 83,524 packets, whereas SEP, TSEP, and BEENISH transmit 27,827, 25,782, and 42,463 packets, respectively. EECRP enhances throughput by 66.68%, 69.13%, and 49.16% in comparison with SEP, TSEP, and BEENISH, respectively.

Fig. 5.4 Packets transmitted to BS versus number of rounds
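The reported improvement percentages can be reproduced from the raw figures. The sketch below assumes the gain is expressed relative to the EECRP value, that is, (EECRP − baseline) / EECRP, because this is the formula that matches the quoted numbers; measured relative to each baseline instead, the gains would be larger.

```python
def relative_gain(eecrp_value, baseline_value):
    """Gain in percent, measured relative to the EECRP value (assumed convention)."""
    return 100 * (eecrp_value - baseline_value) / eecrp_value

# Round of first node death (Sect. 5.4.1) and packets received at the BS (Sect. 5.4.3).
first_node_death = {"SEP": 1012, "TSEP": 1733, "BEENISH": 2009}
packets_to_bs = {"SEP": 27827, "TSEP": 25782, "BEENISH": 42463}
EECRP_FIRST_DEATH = 2906
EECRP_PACKETS = 83524

stability_gain = {
    name: round(relative_gain(EECRP_FIRST_DEATH, rounds), 2)
    for name, rounds in first_node_death.items()
}
throughput_gain = {
    name: round(relative_gain(EECRP_PACKETS, packets), 2)
    for name, packets in packets_to_bs.items()
}
```

The computed values agree with the reported percentages to within rounding of the second decimal place.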
5.4.4 Alive Nodes Versus Round
As shown in Fig. 5.5, EECRP spans a higher number of rounds in this scenario than BEENISH, SEP, and TSEP. The improvement in alive nodes results from selecting CHs on the basis of residual energy and the optimum number of clusters.
5.4.5 Instability Period
Figure 5.6 shows that the instability period of EECRP is better than that of SEP, TSEP, and BEENISH.
Fig. 5.5 Alive nodes of network versus number of rounds
Fig. 5.6 Instability period
5.5 Conclusion
The proposed EECRP is an energy conservation clustering protocol suited to heterogeneous WSNs with five types of nodes. The CH is selected on the basis of residual energy, average energy, and the optimum number of clusters in every round. As a result, nodes with a high level of energy are more likely to be selected as CH than nodes with low energy. When compared with SEP, TSEP, and BEENISH, EECRP proved to be the better protocol in terms of throughput, lifespan, and stability period.
References
1. Bhattacharyya, D., Kim, T.H., Pal, S.: A comparative study of wireless sensor networks and their routing protocols. Sensors 10(12), 10506–10523 (2010)
2. Ye, D., Gong, D., Wang, W.: Application of wireless sensor networks in environmental monitoring. In: 2009 2nd International Conference on Power Electronics and Intelligent Transportation System (PEITS), vol. 1, pp. 205–208. IEEE (2009)
3. Ali, A., Ming, Y., Chakraborty, S., Iram, S.: A comprehensive survey on real-time applications of WSN. Future Internet 9(4), 77 (2017)
4. Knight, C., Davidson, J., Behrens, S.: Energy options for wireless sensor nodes. Sensors 8(12), 8037–8066 (2008)
5. Raghuvanshi, A.S., Tiwari, S., Tripathi, R., Kishor, N.: Optimal number of clusters in wireless sensor networks: an FCM approach. In: 2010 International Conference on Computer and Communication Technology (ICCCT), pp. 817–823. IEEE (2010)
6. Rozas, A., Araujo, A.: An application-aware clustering protocol for wireless sensor networks to provide QoS management. J. Sens. 2019 (2019)
7. Purkar, S.V., Deshpande, R.S.: Energy efficient clustering protocol to enhance performance of heterogeneous wireless sensor network: EECPEP-HWSN. J. Comput. Netw. Commun. 2018 (2018)
8. Ketshabetswe, L.K., Zungeru, A.M., Mangwala, M., Chuma, J.M., Sigweni, B.: Communication protocols for wireless sensor networks: a survey and comparison. Heliyon 5(5), e01591 (2019)
9. Heinzelman, W.B., Chandrakasan, A.P., Balakrishnan, H.: An application-specific protocol architecture for wireless microsensor networks. IEEE Trans. Wireless Commun. 1(4), 660–670 (2002)
10. Heinzelman, W.R., Chandrakasan, A., Balakrishnan, H.: Energy-efficient communication protocol for wireless microsensor networks. In: Proceedings of the 33rd Annual Hawaii International Conference on System Sciences, pp. 10. IEEE (2000)
11. Smaragdakis, G., Matta, I., Bestavros, A.: SEP: A stable election protocol for clustered heterogeneous wireless sensor networks. Boston University Computer Science Department (2004)
12. Qing, L., Zhu, Q., Wang, M.: Design of a distributed energy-efficient clustering algorithm for heterogeneous wireless sensor networks. Comput. Commun. 29(12), 2230–2237 (2006)
13. Khan, A.A., Javaid, N., Qasim, U., Lu, Z., Khan, Z.A.: HSEP: heterogeneity-aware hierarchical stable election protocol for WSNs. In: 2012 Seventh International Conference on Broadband, Wireless Computing, Communication and Applications, pp. 373–378. IEEE (2012)
14. Manjeshwar, A., Agrawal, D.P.: TEEN: a routing protocol for enhanced efficiency in wireless sensor networks. In: IPDPS, vol. 1, p. 189 (2001)
15. Kashaf, A., Javaid, N., Khan, Z.A., Khan, I.A.: TSEP: Threshold-sensitive stable election protocol for WSNs. In: 2012 10th International Conference on Frontiers of Information Technology, pp. 164–168. IEEE (2012)
16. Qureshi, T.N., Javaid, N., Khan, A.H., Iqbal, A., Akhtar, E., Ishfaq, M.: BEENISH: balanced energy efficient network integrated super heterogeneous protocol for wireless sensor networks. Procedia Comput. Sci. 19, 920–925 (2013)
17. Bazzi, H.S., Haidar, A.M., Bilal, A.: Classification of routing protocols in wireless sensor network. In: International Conference on Computer Vision and Image Analysis Applications, pp. 1–5. IEEE (2015)
18. Zeng, M., Huang, X., Zheng, B., Fan, X.: A heterogeneous energy wireless sensor network clustering protocol. Wirel. Commun. Mobile Comput. (2019)
Chapter 6
Prediction of COVID-19 Severity Using Patient's PHR
M. A. Bharathi and K. J. Meghana Kumar

Abstract A significant health crisis such as the current COVID-19 outbreak presents an opportunity to reflect on how health care can be handled better in the future, so that we are better prepared for and more capable of dealing with such an incident. Because the COVID-19 trend has fluctuated irregularly, health care organizations have remained in the dark, unsure how many resources they will have even in the coming week. In such a difficult period, being able to predict what resources a person will need at the time of a positive test, or even earlier, would be extremely beneficial to organizations, as they could obtain or prepare the resources required to save that patient's life. The aim of the work is to devise a system that is both cost-effective and reliable.
Keywords Prediction model · Efficient · Data preprocessing · Data wrangling
M. A. Bharathi (B)
Department of Artificial Intelligence and Machine Learning, BMS Institute of Technology and Management, Bangalore, India
e-mail: [email protected]
K. J. Meghana Kumar
Department of Computer Science, BMS Institute of Technology and Management, Bangalore, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_6

6.1 Introduction
The new coronavirus infection (COVID-19) was initially reported in Wuhan, Hubei Province, China, in 2019 and has swiftly spread over the world [1]. The causal virus (SARS-CoV-2) has spread with increasing speed, affecting 195 countries, with the United Kingdom, the US, Italy, Spain, and France among the most affected areas [2]. As the infection continued to spread, the World Health Organization (WHO) declared COVID-19 a pandemic. As of 4 May 2020, there had been 3,581,885 confirmed positive cases, resulting in 248,559 deaths [2]. Forecasting is done for the three most significant disease predictors over the following ten days: (1) the number of confirmed new cases, (2) the number of deaths [3], and (3) the number of recoveries. Infectious disease transmission happens in a population and is a dynamic process. Using this approach, a framework is created to analyze and test the mechanisms by which an infectious illness spreads, in order to accurately estimate the future pattern of infectious diseases. As a result, the research and evaluation of infectious disease prediction models have been a popular topic in science, aimed at controlling or avoiding the damage caused by communicable diseases [4]. Because of the spread of the disease and the associated risk, practically all nations have declared either partial or strict lockdowns throughout the affected districts and cities [5]. Clinical researchers throughout the globe are currently working to find a suitable vaccine and medication for the illness. Because there is currently no approved therapy to kill the virus, governments all around the world are working on preventative measures to halt its spread. Among all precautions, being informed about all aspects of COVID-19 is considered particularly important. Many scholars are examining the many aspects of the pandemic and reaching conclusions that add to this body of knowledge. The intention of this proposal is to create a COVID-19 forecasting system to help with the present worldwide calamity. The main motivation for predicting COVID-19 patient severity from the patient's records is the global havoc the disease has created and the suffering people have undergone because of its severity. If there is any method that can reduce the loss of life to a minuscule portion, that approach must be put to use. We attempt to create a machine learning model that can predict a patient's severity from previous medical records. As a patient's previous health state plays a major role in deciding whether a person can survive the pandemic, we decided to use artificial intelligence to identify who suffers the most and which specific and allied factors determine the severity of infection in a patient.
6.2 Methodology
The system is divided into five stages: collection of data, preprocessing of the data, researching the model that will work best for the dataset, model training and testing, and review (Fig. 6.1).

Fig. 6.1 Workflow of machine learning

1. Collection of Data: The data can originate from a variety of places, such as a document, a database, sensors, and several others, but it cannot be used for analysis right away because it may contain a large number of missing values, extremely large values, disorganized or messy entries, or textual information. Data preparation is therefore carried out to deal with this problem.
2. Preprocessing: One of the most critical processes in machine learning is data preprocessing. When data is obtained from multiple sources, it is in a raw format, making analysis hard. We must prepare the data by removing duplicates, correcting errors, dealing with missing values, and converting data types, among other things.
3. Researching the model that will work best for the dataset: The main goal is to use the preprocessed data to develop the most effective model possible by choosing the most appropriate algorithm.
4. Model Training and Testing: We must first train the model, which should answer a question or make a prediction correctly as often as possible. Each iteration of the method is a step in the training process. Following that, we make predictions using a suitable technique, test the model, and evaluate it using various metrics.
5. Review: The evaluation of a model is a key step in its development. It helps us choose the best model to describe the data and to forecast how well that model will perform in future use.
6.3 Implementation
To predict the severity of patients, various techniques were used: the K-Nearest Neighbor Classifier, Linear Regression, the Support Vector Machine Classifier, the Random Forest Classifier, and the Decision Tree Classifier. First, we compute the accuracy, recall, and precision of each technique. By comparing all the techniques, we select the classifier that best predicts severity from the patient's PHR (Personal Health Record).
6.3.1 Data Sets
The "Novel Corona Virus 2019 Dataset" used in this study was obtained from Kaggle. The information was gathered from several sources, including the World Health Organization and Johns Hopkins University. To meet the needs of this research, the dataset was preprocessed substantially. It has a total of 566,602 patient records with 23 attributes.
6.3.2 Code Conventions

To carry out operations such as numerical manipulation and data wrangling, we must first import the necessary libraries. Before applying any algorithm, we split the data into training and testing sets in a 70:30 ratio, as shown in the snapshot (Fig. 6.2). We then create a confusion matrix for the model that has been implemented (Fig. 6.3). The code also assigns a feature importance to each of the predictor variables; the most important features are listed in Figs. 6.4 and 6.5.
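The chapter shows its code only as figure snapshots, so the following is a dependency-free sketch of the two steps named above, a 70:30 split and a confusion matrix. The function names, the fixed seed, and the toy labels are assumptions made for illustration and are not the authors' implementation.

```python
import random
from collections import Counter


def train_test_split(rows, test_fraction=0.3, seed=42):
    """Shuffle the rows and return (train, test) in a 70:30 ratio by default."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)  # seeded so the split is reproducible
    n_test = round(len(rows) * test_fraction)
    return rows[n_test:], rows[:n_test]


def confusion_matrix(actual, predicted):
    """Count (actual, predicted) label pairs; unseen pairs read as 0."""
    return Counter(zip(actual, predicted))


# Toy labels, invented for the example.
actual = ["mild", "severe", "mild", "severe"]
predicted = ["mild", "mild", "mild", "severe"]
cm = confusion_matrix(actual, predicted)
# cm[("severe", "mild")] is the number of severe cases predicted as mild.
```

Keeping the matrix as a counter of label pairs avoids fixing the set of severity classes in advance.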
Fig. 6.2 Loading datasets into data frame

Fig. 6.3 Confusion matrix

Fig. 6.4 Feature importance of all the independent variables

Fig. 6.5 Feature importance of all the independent variables

6.4 Results and Discussion

Figure 6.6 shows the severity of the patients in the dataset. Figure 6.7 shows the sex of the patients in the dataset. We observed that Random Forest performs very well on the given sample. It may reach 98% accuracy, but it has low recall and precision. Linear Regression rounds off the values, and the Support Vector Classifier gives better accuracy than KNN and the Decision Tree Classifier (Fig. 6.8).

Fig. 6.6 Severity of the patients

Fig. 6.7 Comparison with different algorithms

Fig. 6.8 Representing model in graph
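The remark that a model can show high accuracy together with low recall and precision is typical of imbalanced classes. The sketch below illustrates the effect on synthetic labels; the numbers are invented for the example and are not the chapter's results, and `binary_metrics` is a hypothetical helper.

```python
def binary_metrics(actual, predicted, positive):
    """Return (accuracy, precision, recall) for one class treated as positive."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    accuracy = sum(a == p for a, p in pairs) / len(pairs)
    # Guard against division by zero when the class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# Synthetic example: 98 mild and 2 severe patients, and a model that
# always predicts "mild".
actual = ["mild"] * 98 + ["severe"] * 2
predicted = ["mild"] * 100
accuracy, precision, recall = binary_metrics(actual, predicted, positive="severe")
# Accuracy is high because most patients are mild, yet no severe case is found.
```

This is why the comparison between classifiers looks at recall and precision as well as accuracy.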
6.5 Conclusion

The prediction of COVID-19 severity was successfully analyzed in Python using different machine learning algorithms: Decision Tree Classifier, Random Forest, Support Vector Classifier, Linear Regression, and KNN. The implementation is efficient, and accurate results are obtained. It can be concluded that the Support Vector Classifier delivers accurate predictions on the given datasets. Conditions such as mild, high, severe, and high severity in COVID-19 patients were analyzed. SVC gives reliable results, with recall and precision scores of 0.986667 and an accuracy score of 98.67%.
6.5.1 Future Enhancement

1. This model can be deployed with real-time data and used by doctors to stratify patients’ data at the initial stage to minimize death rates.
2. The model can be modified and used for other diseases to examine risk predictions.
3. Research scholars can adopt SVC with confidence for future predictions on different data models.
Chapter 7
A Survey on Crop Rotation Using Machine Learning and IoT

Nidhi Patel, Ashna Shah, Yashvi Soni, Nikita Solanki, and Manojkumar Shahu
Abstract Agriculture plays a significant role in shaping the economy of India. As the population grows day by day, the demand for food also increases, so we need more efficient ways to increase crop production. With the surge of Internet of Things (IoT) and machine learning technologies, we can obtain more effective outcomes. We propose a system that collects soil data such as NPK contents, pH level, and temperature from IoT sensors in real time and sends them to the cloud over MQTT using Node MCU firmware. In the cloud architecture, we apply the k-nearest neighbor (KNN) model to the collected data to get the best suggestion for crop rotation. KNN is a supervised machine learning algorithm that predicts from similar items that lie in close proximity. In this system, we have taken different crops for the study. Farmers can also check the real-time soil contents of their farms on the dashboard.

Keywords Crop rotation · IoT · Machine learning · Smart farming · KNN · Crop prediction · Agriculture · Soil · NPK · Data analysis · Algorithm · Cultivation · Farming
N. Patel (corresponding author) · A. Shah · Y. Soni · N. Solanki · M. Shahu
LDRP Institute of Technology and Research, Gandhinagar, Gujarat, India
e-mail: [email protected]
URL: https://www.ldrp.ac.in

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_7

7.1 Introduction

In India, nearly 70% of the people depend on agriculture for their livelihood. Agriculture is the backbone of the economic system of our country. Because the population is increasing, the demand for food is increasing, and meeting this demand is a challenging task. Advanced concepts such as the Internet of Things and machine learning can make the task easier. Soil parameters (i.e., nitrogen, phosphorous, potassium), the crop rotation system, and the surface temperature of a field play a crucial role in sustainable crop production. This project intends to help farmers plant suitable crops and reduce soil degradation in cultivated fields. By using this system, farmers can get suggestions for crops that are low-risk, profitable, and well suited to their farms.

In this system, we used Internet of Things (IoT) sensors such as NPK sensors, pH sensors, and temperature sensors to measure the soil parameters, and the collected data are sent to the cloud using Node MCU firmware. The system includes a model that uses the k-nearest neighbor (KNN) machine learning algorithm to determine the best crop. This algorithm uses a dataset from Kaggle to predict the appropriate crops. The result is displayed on the frontend dashboard, where farmers can also observe the real-time soil contents of their farm.
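As a rough illustration of the data that travels from the sensors to the cloud, the sketch below packs one soil reading into a JSON message. The field names, the device identifier, and the topic string are hypothetical; the chapter does not specify a message format, and the actual publishing step through an MQTT client on the Node MCU is not shown.

```python
import json

# Hypothetical topic name; the chapter does not state the one it uses.
SOIL_TOPIC = "farm/soil/readings"


def build_payload(device_id, n, p, k, ph, temperature_c):
    """Serialize one soil reading as a JSON string (assumed message layout)."""
    reading = {
        "device_id": device_id,
        "nitrogen": n,
        "phosphorus": p,
        "potassium": k,
        "ph": ph,
        "temperature_c": temperature_c,
    }
    # sort_keys gives a stable message layout, which simplifies debugging.
    return json.dumps(reading, sort_keys=True)


message = build_payload("node-1", 90, 42, 43, 6.5, 20.8)
```

A dashboard or the prediction model in the cloud would decode the same message with `json.loads` before using the values.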
7.2 Background

7.2.1 Internet of Things

IoT is an evolving technology that uses the Internet and aims to connect physical devices or “Things” [9]. The word “Things” refers to any physical object that can be sensed and connected to the Internet. For instance, a device that monitors temperature and humidity at a specific location and transmits the data is considered a “Thing” in the Internet of Things domain. The Internet of Things is a key enabler for increasing yield [4]. With suitable sensors and communication networks, the devices can contribute a large amount of data and provide different services. For example, managing the energy consumption of houses in a smart fashion reduces energy expenses [9]. Internet of Things devices collect and aggregate vast quantities of data for diverse disciplines and application areas [6]. The management and analysis of IoT data can be used to automate processes, anticipate situations, and improve many activities, also in real time [18].

Modern improvements in IoT, information and communication technology (ICT), and wireless sensor networks (WSN) have the potential to address some of the environmental, economic, and technical challenges and opportunities in this area [14]. The Internet of Things (IoT) spans a wide range of areas and industries, from health, manufacturing, communications, and energy to the agriculture sector, where it reduces inefficiency and improves performance across the overall market [1]. IoT is also adopted in environmental monitoring, drone-based services, efficient energy management in buildings, and healthcare services. Businesses are now driven by IoT and the prospects of growing income, reducing operating costs, and improving capabilities. Internet of Things innovation will continue to grow, advancing the conversion of everyday objects into smart connected devices.
Through the end nodes and gateways, the router sends the received data to control centers or the cloud for processing, analysis, and decision-making. After a decision is made, a corresponding command is sent back to the actuator installed in the system in response to the sensed data. As there are various sensor and actuator devices, communication technologies, and data computing methods, in this section we outline the existing technology that enables IoT [9].

Architecture and Design

A suitable architecture design is the backbone of a successful IoT system and helps address many problems in the IoT environment, such as scalability, networking, and routing. Typically, the IoT architecture is based on three main dimensions:

(i) Information items: these cover all things associated with the IoT environment and may be identifying items, control items, and sensing items;
(ii) Independent network: this combines characteristics such as self-adaptation, self-protection, self-configuration, and self-optimisation;
(iii) Intelligent applications: these exhibit intelligent behavior over the Internet in general, such as intelligent control, transferring data through the network, and data processing.

All applications related to the IoT can be grouped according to these dimensions [7]. The intersection among these dimensions creates a new space called the “infrastructure of IoT,” which provides support systems for the connected things and can contain multiple services such as identification of goods, location identification, and data protection. Figure 7.1 represents the three dimensions of the Internet of Things and the relation between them. There are numerous ways to develop an architecture of the Internet of Things [7].
Fig. 7.1 Architecture design of IoT [7]
7.2.2 Machine Learning

Machine learning, regarded as a branch of artificial intelligence, is the technology that gives machines the ability to learn from experience [14]. Machine learning can extract insight from data and enable machines to make predictions. Computers first learn to carry out a task by studying the training datasets [12]. ML is extensively used in real-life applications, such as image classification, social networking, text mining, Internet search engines, and recommendation platforms [21]. Machine learning algorithms use computational methods to learn directly from datasets without relying on predetermined equations as a model. ML methods are powerful tools capable of autonomously solving large nonlinear problems using data provided by sensors or various other interconnected sources. The algorithms adaptively improve their performance as the number of available training samples increases [14].

ML methodologies are continually advancing and are used broadly across most fields. Using them in real-world scenarios simplifies decision-making and informed actions with minimal human intervention. The key features of ML models are widely adopted in smart farming practices such as crop forecast models, decision support for irrigation sensors, and crop sustainability models. ML algorithms allow the extraction of specific data items and insight from large amounts of information [14].

ML provides a computational representation of a phenomenon, aimed at completing a task and producing an outcome under given conditions. A conventional workflow of an ML system, as used in knowledge discovery in databases (KDD), is shown in Fig. 7.2; it focuses on the development of various algorithms and the analysis phase. Typically, a portion of the underlying dataset, known as validation data, is used together with the training data to automatically establish the rules and patterns built into trained models. Lastly, the models are applied to test data; the results are translated into symbolic knowledge, and the insights are delivered either to a dashboard or to other related software components [15].

Fig. 7.2 Machine learning model architecture [15]
7.2.3 Crop Rotation

Throughout human history, wherever food crops have been produced, some form of crop rotation appears to have been practiced. In the major food-producing regions of the world, various rotations of shorter length are widely used. Some of them are designed for the greatest immediate returns, without sufficient concern for the continued usefulness of the underlying resources. Others are intended for high sustained returns while preserving those resources [17].

Selecting each crop is essential in agricultural planning [13]. A crop yield prediction model can help farmers decide what to plant and when to plant it [19]. Field estimation and crop yield prediction are of major significance to global food production [5]. Cultivators and farmers also benefit from yield forecasts when making economic and management decisions [5]. By providing the climatic data of an area, any user can use a user-friendly web portal to predict the yield of a crop of their choice [2]. Yield prediction is an essential agricultural problem. In earlier times, farmers estimated their yield from their experience of previous harvests. For this kind of data analytics in crop prediction there are several techniques, and with the aid of those algorithms we can predict crop yield [2].

Crop rotation is essential for soil health, for maintaining the nutrient balance in the soil, and for pest and disease control. The nutrients that vegetables and fruits take out of the soil vary in both type and quantity. For instance, green vegetables need a lot of nitrogen, whereas root vegetables such as potatoes require more phosphorus. Crops belonging to the same plant family have similar nutrient requirements. As a result, what you grow in a particular location influences soil fertility.
Planting the same crop in the same area year after year depletes the soil nutrients and creates a nutrient imbalance, which in turn leads to poor harvests. Switching crops, on the other hand, enables the soil to recover. Planting different crops can also break the cycle of pests and diseases because it takes away their breeding ground or food source. To decide which crop should grow where, you need to familiarize yourself with the nutrient requirements of each vegetable [8]. A well-planned rotation of crops allows the farmer to see clearly what is to be done each year and gives a realistic estimate of the usual costs and returns that may be expected.
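The rule of thumb above, that crops from the same plant family should not follow each other on the same plot, can be sketched as a simple filter. The crop list and the `rotation_candidates` helper are illustrative assumptions; a real planner would also weigh nutrient demand, climate, and market factors.

```python
# Small illustrative lookup of crops and their plant families (not exhaustive).
PLANT_FAMILY = {
    "potato": "nightshade",
    "tomato": "nightshade",
    "cabbage": "brassica",
    "broccoli": "brassica",
    "carrot": "umbellifer",
    "bean": "legume",
    "pea": "legume",
}


def rotation_candidates(previous_crops, families=PLANT_FAMILY):
    """Return crops whose family differs from every recently grown crop."""
    used = {families[crop] for crop in previous_crops}
    return sorted(crop for crop, fam in families.items() if fam not in used)


# After potatoes, any crop outside the nightshade family is a candidate.
after_potato = rotation_candidates(["potato"])
```

Passing the crops of the last two or three seasons lengthens the rotation, since every family grown in that window is excluded.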
7.3 Literature Survey See Table 7.1.
Table 7.1 Survey table

Paper title: Smart management of crop cultivation using IoT and machine learning [11]
Approach: k-nearest neighbors algorithm (KNN)
Summary: Using records of soil moisture and temperature, it examines the status of the soil. Also, by using IoT sensors and ML algorithms such as KNN, it determines which crop should be planted to gain the best profit.
Advantages: By using this web application portal, farmers can easily send real-time N-P-K contents and can remotely access the application from any location at any time.
Future scope: The future work is to use more accurate sensors so that precise soil and crop data are collected in real time and the efficiency of the system improves.

Paper title: Machine learning convergence for weather-based crop selection [10]
Approach: Recurrent neural network (RNN), random forest classification (RFC)
Summary: It collects weather forecast records and soil parameters and then advises farmers on the appropriate sowing time using ML algorithms such as RNN and RFC.
Advantages: The proposed approach provides a precise sowing period for crops and also infers the required agricultural amenities.
Future scope: The future work is to improve the performance of the algorithm by adopting IoT devices to gather precise climate and soil records for any field. Soil parameters such as the ratio of NPK contents in the soil and the soil temperature can be analyzed for better accuracy of the crop selection scheme.

Paper title: Machine learning approaches for crop yield prediction and nitrogen status estimation in precision agriculture: a review [3]
Approach: M5-Prime, k-nearest neighbors algorithm (KNN), SVR, artificial neural networks (ANNs), MLR, support vector machines (SVMs), random forest regression (RFR), extremely randomized trees (ERT), SOM algorithm, stepwise multiple linear regression (SMLR), spiking neural networks (SNN), fuzzy cognitive mapping (FCM)
Summary: This article reviews research advances made over a period of 15 years on ML-based methods for crop yield prediction and nitrogen status estimation.
Advantages: This article surveys different ML methods and indicates the most effective way to achieve the best results.
Future scope: The combination of various ML and signal processing techniques into hybrid systems, to benefit from the strengths of those techniques and compensate for their deficiencies.

Paper title: Decision support system for crop rotation using machine learning [20]
Approach: Neuro-fuzzy
Summary: By employing a neuro-fuzzy-based ML algorithm and a radial basis function algorithm, it predicts the suitable crop for sowing.
Advantages: It is not a time-consuming procedure, so real-time operation is performed efficiently.
Future scope: The future work is to interface with other systems to send an alert when a very low proportion of nutrients remains in the soil. This will extend to recommending pesticides and compost in a way that enriches the soil up to its saturation level without altering the properties of the existing soil.

Paper title: Crop recommendation using machine learning techniques [16]
Approach: ANNs (artificial neural networks), SVM (support vector machine)
Summary: It gathers records such as soil sample, pH of the soil, N-P-K contents of the soil, permeability of the soil, water holding capacity, average rainfall, temperature, and the previously harvested crop. Crop recommendation is done using ML methods such as supervised and unsupervised learning.
Advantages: The proposed work will help farmers maximize fertility in agriculture, reduce soil degradation in cultivated fields, and minimize the use of fertilizers in crop production by suggesting the right crop considering various attributes. It helps farmers pick the right crop for farming and achieve sustainability.
Future scope: The future goal is to extend the system to estimate market demand, the availability of market infrastructure, expected profit and risk, post-harvest storage, and processing technologies. This would give a complete prediction based on geographical, environmental, and economic aspects.

Paper title: Crop prediction system using machine learning [22]
Approach: Linear and nonlinear regression
Summary: The proposed scheme takes records related to soil, weather, and prior-year production into account and suggests the most profitable crops that can be grown under the given environmental conditions.
Advantages: As the method lists all possible crops, it assists the farmer in deciding which crop to cultivate. It also takes past production data into consideration, which helps the farmer gain insight into the demand for and the cost of different crops in the market yards. As most types of crops are covered by this system, farmers may get to know crops that they have never grown.
Future scope: In the future, all agricultural machines can be connected over the Internet using IoT. Sensors can be deployed in fields to gather data on the current state of the farm, and devices can then adjust the moisture, acidity, etc., accordingly. Vehicles used in fields, such as tractors, will be connected to the Internet in the future, which will give farmers real-time data about crop harvesting and any disease the crops may be suffering from, helping the farmer take appropriate action. Moreover, the most suitable crop can also be determined in light of the commercial and expansion ratio.
7.4 Proposed System

In the proposed system, we collect soil parameters such as temperature, pH, nitrogen, phosphorous, and potassium through the IoT sensors. The collected data are sent to the cloud using Node MCU firmware. Then, using the k-nearest neighbor machine learning algorithm, the system predicts the most suitable crop for the crop rotation. Finally, the result is displayed on the dashboard, so the farmer can easily find out which crop to plant next in the rotation.
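A minimal sketch of the prediction step follows, assuming each training sample is a tuple of soil readings (N, P, K, pH, temperature) with a crop label. The samples below are invented for illustration and do not come from the Kaggle dataset; `knn_predict` is a hypothetical helper, not the authors' code.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority label among the k training samples closest to query.

    `train` is a list of (features, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Invented samples: (N, P, K, pH, temperature in degrees Celsius) -> crop.
TRAIN = [
    ((90, 42, 43, 6.5, 21.0), "rice"),
    ((85, 58, 41, 7.0, 22.0), "rice"),
    ((20, 67, 20, 5.8, 19.0), "kidney beans"),
    ((25, 70, 22, 5.7, 18.0), "kidney beans"),
    ((100, 20, 30, 6.2, 24.0), "maize"),
]

suggestion = knn_predict(TRAIN, (88, 45, 42, 6.6, 21.5), k=3)
```

Because the features have different scales (pH varies far less than the NPK values), scaling each feature before computing distances is usually advisable; it is omitted here to keep the sketch short.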
7.5 Conclusion and Future Scope

In this research paper, we have proposed a system that takes data from a dataset and from IoT sensors and applies the ML technique (KNN) to predict suitable crops. The system will help farmers maximize fertility in agriculture and decrease the use of fertilizer in crop production by suggesting the appropriate crop considering various properties. In the future, the proposed system can be extended to take into account market demand, the availability of market infrastructure, expected profit and risk, post-harvest storage, and processing technologies. The system helps farmers pick the right crop for farming and achieve sustainability.
References

1. Ayaz, M., Uddin, A., Sharif, Z., Mansour, A., Aggoune, E.-H.: Internet-of-things (IoT)-based smart agriculture: toward making the fields talk. IEEE Access 1 (2019)
2. Champaneri, M., Chachpara, D., Chandvidkar, C., Rathod, M.: Crop yield prediction using machine learning. Int. J. Sci. Res. (IJSR) 9, 2 (2020)
3. Chlingaryan, A., Sukkarieh, S., Whelan, B.: Machine learning approaches for crop yield prediction and nitrogen status estimation in precision agriculture: a review. Comput. Electron. Agric. 151, 61–69 (2018)
4. Dewangan, A.K.: Application of IoT and machine learning in agriculture. Int. Res. J. Eng. Technol. (IJERT) 9 (2020)
5. Elavarasan, D., Durairaj Vincent, P.M.: Crop yield prediction using deep reinforcement learning model for sustainable agrarian applications. IEEE Access 8, 86886–86901 (2020)
6. Garg, D., Khan, S., Alam, M.: Integrative use of IoT and deep learning for agricultural applications, pp. 521–531 (2020)
7. Hassan, Z., Ali, H., Badawy, M.: Internet of things (IoT): definitions, challenges, and recent research directions. Int. J. Comput. Appl. 128, 975–8887 (2015)
8. Hassani, N.: A gardener’s guide to crop rotation: how to plan your vegetable garden based on a plant family chart (2020). https://www.thespruce.com/crop-rotation-for-home-gardeners5084167
9. Hossein Motlagh, N., Mohammadrezaei, M., Hunt, J., Zakeri, B.: Internet of things (IoT) and the energy sector. Energies 13, 494 (2020)
10. Jain, S., Ramesh, D.: Machine learning convergence for weather based crop selection. In: 2020 IEEE International Students’ Conference on Electrical, Electronics and Computer Science (SCEECS), pp. 1–6. IEEE (2020)
11. Kumar, T.R., Aiswarya, B., Suresh, A., Jain, D., Balaji, N., Sankaran, V.: Smart management of crop cultivation using IoT and machine learning. Int. Res. J. Eng. Technol. (IRJET) 5 (2018)
12. Louridas, P., Ebert, C.: Machine learning. IEEE Softw. 33(5), 110–115 (2016)
13. Medar, R., Rajpurohit, V.S., Shweta, S.: Crop yield prediction using machine learning techniques. In: 2019 IEEE 5th International Conference for Convergence in Technology (I2CT), pp. 1–5 (2019)
14. Mekonnen, Y., Namuduri, S., Burton, L., Sarwat, A., Bhansali, S.: Review machine learning techniques in wireless sensor network based precision agriculture. J. Electrochem. Soc. 167(3), 037522 (2020)
15. Rafique, D., Velasco, L.: Machine learning for network automation: overview, architecture, and applications. J. Opt. Commun. Netw. 10(10), D126–D143 (2018)
16. Raju, G.T., Mamatha Jajur, S., Soumya, N.G.: Crop recommendation using machine learning techniques. Int. J. Innov. Technol. Explor. Eng. (IJITEE) 9 (2019)
17. The Editors of Encyclopaedia Britannica: Crop rotation (2018). https://www.britannica.com/topic/crop-rotation
18. Tzounis, A., Katsoulas, N., Bartzanas, T., Kittas, C.: Internet of things in agriculture, recent advances and future challenges. Biosyst. Eng. 164, 31–48 (2017)
19. van Klompenburg, T., Kassahun, A., Catal, C.: Crop yield prediction using machine learning: a systematic literature review. Comput. Electron. Agric. 177, 105709 (2020)
20. Vigneswaran, E.E., Selvaganesh, M.: Decision support system for crop rotation using machine learning. In: 2020 Fourth International Conference on Inventive Systems and Control (ICISC), pp. 925–930. IEEE (2020)
21. Wang, M., Fu, W., He, X., Hao, S., Wu, X.: A survey on large-scale machine learning. IEEE Trans. Knowl. Data Eng. 1 (2020)
22. Zingade, D.S., Buchade, O., Mehta, N., Ghodekar, S., Mehta, C.: Crop prediction system using machine learning. Int. J. Adv. Eng. Res. Dev. Spec. Issue Recent Trends Data Eng. 4(5), 1–6 (2017)
Chapter 8
Survey of Consumer Purchase Intentions of Live Stream in the Digital Economy

Li-Wei Lin and Xuan-Gang
Abstract In the digital economy, live e-stream has become an emerging industry. Many consumers have gradually shifted from the e-commerce shopping mode to purchasing through live e-streams. Our research uses the concept of the Internet to observe consumers’ brand loyalty and purchase behavior. We find that, from the perspective of the digital economy, live e-commerce has a positive impact on consumer purchase behavior through perceived value, emotion, and contract, and that consumer purchase behavior has a positive effect on brand loyalty.

Keywords Digital economy · Live stream · Purchase behavior
8.1 Introduction

It is important for brands to gain the trust of consumers on live broadcasting platforms. Consumers view products on live broadcast platforms, communicate and interact with anchors and other consumers in chat rooms, and share their experience. Live streaming combined with product sales is a novel shopping model in China. Shopping and engaging on live streaming platforms are particularly popular among young people. Therefore, this consumer shopping model is worthy of study. In general, the availability of products and the interactive nature of live broadcast platforms can enhance the interest of consumers and promote brand loyalty. Factors affecting the success of such platforms include consumers’ recognition of the products and their continued attention to them. Gefen [5] stated that knowledge-sharing sources on the Internet can be used to indirectly gain the trust of consumers and then transform this trust into belief.
L.-W. Lin
College of Business Administration, Fujian Jiangxia University, Fuzhou 350108, China

Xuan-Gang (corresponding author)
School of Information, Zhejiang University of Finance and Economics Dongfang College, Zhejiang, China
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_8
Hertzum [6] highlighted that transparency of an online purchasing system can help consumers use the system in a more intuitive manner. De Neys [4] indicated that the derivation of trust is associated with an individual’s belief cognition ability. We use the term behavior to refer to the focal entity or actor that is subject to behavior. Behavior entities are often the object of a study. As with the imprinters, imprinted entities can be located at multiple conceptual and analytical levels. We consider this consumer behavior process, as described in the framework, to refer to the actual process and occurrence of imprint formation—the activities that occur during the sensitive period as opposed to the subsequent processes of consumer behavior, amplification, decay, or transformation. In the following sections, we introduce the existing research regarding each factor.
8.2 Literature Review

8.2.1 Contract

Social capital theory discusses the transactional relationship between buyers and sellers. The concept of barter was a part of early Chinese history, and this concept has gradually evolved into the modern use of money to measure the value of an object. This evolution has extended in the modern world from physical shopping to e-commerce platform shopping to online virtual live shopping. Social capital theory focuses on the sharing of information to ensure successful transactions between buyers and sellers.

8.2.2 Brand Loyalty

Brand loyalty is a part of the overall brand equity, which pertains to a type of brand power determined by consumer perceptions, views, and experiences regarding the brand. When customers select a certain brand despite having adequate opportunities and reason to choose other brands, they are considered to exhibit brand loyalty. Cengiz and Cengiz [3] stated that brand loyalty can be evaluated considering the behavior, attitude, and various perspectives of consumers. Lin [8] highlighted that brand loyalty is a result of the combination of the brand experience, personality, and trust. Online live streaming platforms can foster brand loyalty through consumers’ recognition of the products of a brand.
8.2.3 Cognitive Value

Cognitive value refers to the overall evaluation made by customers by comparing the benefits and costs of goods or services. Cognitive value is a key concept for consumers and influences their purchase decisions as well as beliefs. Consumers’ cognitive value is linked to their memory and ideas. Consumers consider the characteristics and values of things or products that they have observed in the past. Different consumers have different perceived values and influences. We aim to understand how consumers perceive value in the dynamic setting of live broadcast platforms.

8.2.4 Emotion

Emotion has been defined as a friendly connection between two parties; another view holds that emotion arises from one’s own cognitive preparation. Emotional connection is one of the important factors in consumer brand loyalty. We studied whether emotions affect brand loyalty: when consumers watch a live broadcast, are they affected by the host’s presentation and the product stories, and does this in turn affect their brand loyalty? Some scholars have pointed out the importance of a movie’s plot to emotions [2, 9, 11]. A live stream can be treated like the plot of a movie, allowing buyers and sellers to form emotional connections more quickly and more closely. The plot of a movie is often very moving, and such a movie creates an emotional connection with its audience. During our investigation, we found that live broadcasts can likewise move consumers emotionally, so that they identify with the product and the seller.

8.2.5 Consumer Behavior Theory

Consumer behavior mainly refers to the decision-making behavior of consumers in the purchase process. Psychological factors shape consumers’ state of mind during the purchase process, and consumers decide their buying behavior based on their feelings and perceptions. Consumers are exposed to a great deal of information during the purchase process. Advertising information affects consumers’ buying behavior [1, 7]. Online stores provide consumers with experiential shopping that can affect their shopping behavior [10].
Fig. 8.1 Research model
8.3 Research Design

8.3.1 Research Design

Our research explores the relationship between brand loyalty, consumer behavior, cognitive value, emotion and interactive behavior, and the incentives and sensitivities of online consumers to purchase products in live streaming scenarios. Based on the literature review, research questions were designed to guide the development of conceptual models. The conceptual model is shown in Fig. 8.1. Four hypotheses were established and empirically tested.
8.4 Data Analysis and Findings

Structural equation modeling (SEM) with AMOS 23.0 was used to analyze the hypothesized relationships of the research model. SEM simultaneously examines the interrelationships among a set of latent constructs, each measured by one or more observed items (measures). It involves the analysis of two models: a measurement (or factor analysis) model and a structural model. The measurement model specifies the relationship between the observed measures and their underlying constructs, allowing the constructs to be intercorrelated. The structural model specifies the assumed causal relationships between constructs (Fig. 8.2). The correlation analysis shows that every pair of variables is significantly correlated at the 0.01 level and that all correlation coefficients are greater than 0, so all correlations are positive (Table 8.1). For example, the correlation coefficient between CT and CV is 0.859. The correlations between the other variables can be interpreted in the same way. The mediation effect test is reported in Table 8.2.
Fig. 8.2 Research model
Table 8.1 Results of correlation coefficient analysis (correlation analysis among the various dimensions)

| Variables | Correlation | CT | CV | EO | CB | BL |
|---|---|---|---|---|---|---|
| CT | Pearson | 1 | | | | |
| CV | Pearson | 0.859** | 1 | | | |
| EO | Pearson | 0.789** | 0.871** | 1 | | |
| CB | Pearson | 0.762** | 0.795** | 0.862** | 1 | |
| BL | Pearson | 0.741** | 0.823** | 0.888** | 0.850** | 1 |

** Significantly correlated at the 0.01 level (two-sided).
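As a minimal illustration of how a coefficient such as those in Table 8.1 is obtained, the Python sketch below computes Pearson's r for two lists of scores. The function name `pearson_r` and the sample scores are hypothetical; the study's raw data and software settings are not reproduced here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant variable has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical construct scores for four respondents (not the survey data).
ct_scores = [1, 2, 3, 4]
cv_scores = [2, 4, 6, 8]
r_demo = pearson_r(ct_scores, cv_scores)
print(f"r = {r_demo:.3f}")
```

A positive value close to 1 indicates that respondents who score high on one construct also tend to score high on the other, which is how the coefficients in Table 8.1 are read.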
Table 8.2 Mediation effect test

| Effect type | Effect | Boot SE | Boot LLCI | Boot ULCI | Proportion of total effect |
|---|---|---|---|---|---|
| Total effect | 0.8098 | 0.0350 | 0.7410 | 0.8786 | |
| Direct effect | 0.5082 | 0.0707 | 0.3691 | 0.6472 | 63% |
| Mediation effect | 0.3016 | 0.0549 | 0.2026 | 0.4161 | 37% |
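The proportions in Table 8.2 follow from dividing each effect by the total effect. The sketch below reproduces that arithmetic from the tabulated values. The variable names are illustrative, and reading a bootstrap interval that excludes zero as evidence of a significant effect is the usual convention, assumed here rather than stated in the chapter.

```python
# Values copied from Table 8.2.
total_effect = 0.8098
direct_effect = 0.5082
mediation_effect = 0.3016
mediation_ci = (0.2026, 0.4161)  # Boot LLCI, Boot ULCI


def share_of_total(effect, total):
    """Return an effect as a percentage of the total effect."""
    return 100.0 * effect / total


direct_pct = share_of_total(direct_effect, total_effect)
mediation_pct = share_of_total(mediation_effect, total_effect)

# Assumed convention: an interval that does not contain 0 suggests
# the mediation effect differs from zero.
ci_excludes_zero = mediation_ci[0] > 0 or mediation_ci[1] < 0

print(f"direct: {direct_pct:.1f}%, mediation: {mediation_pct:.1f}%")
print("mediation interval excludes zero:", ci_excludes_zero)
```

The direct and mediation effects sum to the total effect, so the two percentages sum to 100%.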
8.5 Conclusions and Future Research

This empirical study is one of the first attempts to use consumer behavior theory to explain how several factors influence purchasing on small live broadcast platforms. To address this issue in social media, we investigated the factors affecting consumer behavior. First, we developed a new research model to study the factors that influence consumer behavior on live streaming platforms. Second, the results show that cognitive value, contract mechanisms, and interactive behavior affect consumer behavior, and that brand loyalty is in turn affected through the mediating imprint. The four hypotheses were tested, and all relationships were significant. Live streaming platforms are becoming increasingly popular, so turning viewers into customers is a major issue, and consumer behavior has become one of the key success factors. By extending work on brand loyalty in e-commerce live broadcasting and drawing on imprint theory, this research combines concepts from operations, marketing, and psychology, which can inform future research. The research model combines imprint theory with the overall structure and empirical analysis to explore the future market development of live broadcasting. Although the article contributes an innovative application of consumer behavior theory, the research was subject to methodological limitations. Future research can consider responses and consumer immersion.
Future research can also consider the overall conceptual significance of the various architectures of a given live broadcast. We suggest that subsequent researchers target consumers who shop on live broadcast platforms and try other, updated variables in their design and measurement. Researchers could also consider live broadcasts on different platforms or the latest cross-border e-commerce live broadcasts and compare consumers in different countries or regions in order to distinguish the psychological factors behind different consumers’ purchases. This would be an interesting direction for further development.
References

1. Batra, R., Keller, K.L.: Integrating marketing communications: new findings, new lessons, and new ideas. J. Mark. 80(6), 122–145 (2016)
2. Carroll, N.: Film, emotion, and genre. In: Carroll, N., Choi, J. (eds.) Philosophy of Film and Moving Pictures, pp. 217–233. Blackwell, Oxford (2005)
3. Cengiz, H., Cengiz, H.A.: Review of brand loyalty literature: 2001–2015. J. Res. Mark. 6(1), 407–432 (2016)
4. De Neys, W.: Conflict detection, dual processes, and logical intuitions: some clarifications. Think. Reason. 20, 169–187 (2014). https://doi.org/10.1080/13546783.2013.854725
5. Gefen, D., Karahanna, E., Straub, D.W.: Trust and TAM in online shopping: an integrated model. MIS Q. 27(1), 51–90 (2003)
6. Hertzum, M., Andersen, H.H.K., Andersen, V., Hansen, C.B.: Trust in information sources: seeking information from people, documents, and virtual agents. Interact. Comput. 14(6), 575–599 (2002)
7. Lemon, K.N., Verhoef, P.C.: Understanding customer experience throughout the customer journey. J. Mark. 80(6), 69–96 (2016)
8. Lin, L.Y.: The relationship of consumer personality to brand personality and brand loyalty: an empirical study of toys and video games buyers. J. Prod. Brand Manage. 19(1), 4–17 (2010)
9. Roe, C.A., Metz, C.E.: Dorfman-Berbaum-Metz method for statistical analysis of multireader, multimodality receiver operating characteristic data: validation with computer simulation. Acad. Radiol. 4(4), 298–303 (1997)
10. Roggeveen, A.L., Grewal, D., Schweiger, E.B.: The DAST framework for retail atmospherics: the impact of in- and out-of-store retail journey touchpoints on the customer experience. J. Retail. 96(1), 128–137 (2020)
11. Tan, E.S.: Emotion and the Structure of Narrative Film: Film as an Emotion Machine. Routledge, New York (1995)
Chapter 10
Leveraging Block Chain Concept: Service Delivery Acceleration in Post-pandemic Times

Praful Gharpure

Abstract The service delivery mechanism in our country is at different stages of maturity across 7000 urban centres, where the majority of services operate in manual mode or in standalone automation mode. This makes a personal visit to the office a must for an individual to avail the desired service. In the recent pandemic, it has become challenging to physically visit any public place, making service provision even more difficult. However, with the Digital India initiative, automation has taken place in almost all service offerings across all provider departments. The pandemic situation is an opportunity to leverage the work done in the service automation space to overcome the need for in-person visits. This paper explores an avenue towards meeting this objective by proposing an information linkage solution that can be scaled up to a Block chain initiative over time.

Keywords Service delivery · Information linkage · Process · Block chain · Identity Management
10.1 Background

In simple machines, we see a chain consisting of multiple links forming blocks connected to elements forming a system. This chain of blocks enables all elements in the system to work together, and with the failure of one link the whole system comes to a grinding halt. In electronic service delivery, the system functions on pieces of information as elements or links, which form a chain that supports the validation checks leading to decision-making. The historical record of this chain of information blocks provides the validation checks for repeat transactions, adding a security layer to the overall system. Electronic service delivery solutions need to evolve such information blocks and complement the validation checks of the linked services to eliminate duplication and misuse. Figure 10.1 depicts the concept graphically.

P. Gharpure (B) Infrastructure Planning and Development, Tata Consultancy Services, Mihan SEZ, Nagpur, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_10
Fig. 10.1 Concept of information linkage forming block
Information is key in decision-making. In processes such as granting permissions or validating transactions, the applicant pools information from separate sources for a decision by the authority. Such information is supplemented by an authorized document. Information sharing amongst the associated processes thus becomes an important element in creating a link that leads to a chain. Information linkage is, to a large extent, at the core of the validation checks in service delivery processes. Tracking the information blocks in this validation process is the need of the hour. Block chain is discussed more frequently these days, and its applicability to a wider spectrum of areas is being explored. However, the concept is at the stage where Internet usage was in the 1990s, which scaled up and led to the evolution of cloud technology in IT services over a decade. With the technical know-how available today, Block chain usage has the potential to scale up much faster than the transition to “Cloud”, as the necessary groundwork is in place. However, because information is held in different silos, in databases at different levels of maturity in terms of service provision, an intermediate step, as proposed in this concept, is needed to scale fast.
10.2 IT Service Delivery: Current Scenario

IT initiatives are carried out independently of each other by the service provider departments and agencies. This leads to duplication of effort for both the provider and the end user, who must submit the validation documentation multiple times. Further, historical data is still referred to in manual form for want of digitization, which causes further delays in service delivery despite automation at the provider end. Information linkage is an intermediate step on the journey towards introducing the block chain concept in the service delivery system in our country. Figure 10.2 lists the key issues in the current service delivery mechanism in terms of the need for information exchange, which calls for:

a. Situation analysis of all services on offer in terms of the information gathered during their delivery.
b. Outlining a strategy to standardize the information so that it fits into the existing IT systems in use.
c. Designing a system for information exchange based on information linking, which creates a replica of the information centrally while the original data remains with the provider department.

Fig. 10.2 Issues in current service delivery mechanism
10.3 Services: A Perspective

Historically, urban services have referred to the provision of roads, water, power, and sewage as physical infrastructure services, along with educational institutes, healthcare facilities, recreational and religious places, etc., as social infrastructure services. With the increase in urban population and technological advancements, the service needs of urban citizens have changed dramatically in type as well as scale. In light of the transformation in the quality of life of these end users, demand for a variety of new services has emerged, specifically in the last two decades. The services needed fall into two categories:

a. Core services.
b. Enabling services.

Irrespective of whether a service is core or enabling, its provision calls for a “Process”.
Fig. 10.3 Core and enabling/complementary processes
Core Services

The core services are those which operate in standalone mode, with an in-house team handling the various functions and activities needed to roll out the desired outcome, i.e. the service. An example is the issue of a certificate by a school, where in-house information is used.

Complementing (Enabling) Services

Services whose output serves as input for other services, or as an intermediary step in the delivery of another service, are referred to as enabling services. An example is the issuance of a document needed for validation by another service.

Every organization has a goal to expand business, enhance operational performance, and improve the quality of the services it delivers. The citizen, who is the end user, becomes part of service processes that are extended by different providers. Figure 10.3 illustrates the “Core & Complementary” processes; the information linkage amongst them is the context for the Block chain concept. As Fig. 10.3 depicts, in the majority of services the identity and address of the requester is the default information asked for. Further, in the provision of a service, a key identifier of the service output is also generated. Information linking amongst the provider departments is thus possible through identity management solutions.
10.4 Identity and Access Management in Service Delivery

Identity management refers to the policies, processes and technologies that establish user identities and enforce rules about access to digital resources [1]. Federated Identity Management is a holistic identity solution for the swift, secure sharing of information with partners and providers. It facilitates replication of the same identity information across partners and enables the formation and administration of a single identity per user across enterprise boundaries, thereby forming a Circle of Trust. Figure 10.4 depicts the concept graphically. Federated identity management using Security Assertion Markup Language (SAML) is a potential interim step towards a Block chain-enabled service, wherein a user’s identity, if already created with a central website or service provider, can be linked with the identity of the same user at another provider based on a unique linkage parameter [3]. Federated Identity Management extends the concept of identity management across provider boundaries. Through the formation of a Circle of Trust, it provides a more seamless and accelerated online experience to users, with features that protect the privacy of the data [2]. Current initiatives like “UMANG” and “DigiLocker” already leverage this approach, which needs to be scaled up to help electronic service delivery mature to the Block chain stage faster.
Fig. 10.4 Circle of trust concept for information linkage [2]
10.5 Opportunities to Leverage: Barriers and Bridges

IT implementation has found a place in almost all service provider organizations, and the basic requirement of end-user login creation is in place as services become accessible over the Internet. This establishes identity through standard validation checks and gives both internal and external users defined access to information. It provides an opportunity to scale up the applications so that information linkage can be established amongst two or more providers within a Circle of Trust. Amongst the barriers, the major one is historical data, which departments have not yet digitized for a variety of reasons. A potential acceleration can be induced through end users, by giving them an avenue to upload the documents issued to them by a department, in a prescribed format, through the departmental logins. The information then needs to be verified by a departmental official, through their own login, against the information at their end. Once verified, the uploaded document becomes part of the department’s repository for any future reference from that point onward. The historical data can thus be created with citizen participation, reducing the effort on the part of the department [4]. Once the information is mapped to the end-user profile through authentication such as Aadhaar or PAN card, a link can be established amongst core and enabling services.
10.6 Creating Information Block

Services, as they get delivered, create a record log of the actions that led to closure of the transaction. Every such transaction has a unique ID for the activities carried out in delivering the service under reference. For better clarity, a broad representation of the property registration process is depicted in the illustration below. The process involves key stakeholders and records such as:

a. Requester: identified by an ID input such as a PAN or Aadhaar number.
b. The property: identified in the municipal record by an index number.
c. The property has a Building Permit ID under which it is approved by the authority.
d. The Registrar has a record of the owner who sells, with his ID tagged.
e. The city survey department also has a Record ID for the previous owner.
f. The utility connections also have the owner identity recorded.
All the information parameters highlighted in the list above form an information block for a particular transaction. This block gets validated whenever another transaction involving any of these parameters takes place, which brings in the block chain concept. The information blocks thus formed lay the foundation for the Block chain concept to flow in. Figure 10.5 illustrates this graphically [5]. In addition, the provider department databases from which the information was pooled are updated with the new record after the registration process. This saves the user the downstream activities required in current processes.
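One way to picture an information block is as a record of the linkage parameters plus a hash of the previous block, so that a later change to any earlier record becomes detectable. The Python sketch below is only an illustration under that assumption: the class `InfoBlock`, the helper functions, the field names, and all identifier values are hypothetical and are not taken from any deployed system.

```python
import hashlib
import json
from dataclasses import dataclass

GENESIS_HASH = "0" * 64  # placeholder hash for the first block


@dataclass
class InfoBlock:
    transaction_id: str
    parameters: dict  # linkage parameters pooled from provider records
    prev_hash: str    # digest of the preceding block

    def digest(self) -> str:
        # Serialize with sorted keys so equal content gives an equal hash.
        payload = json.dumps(
            {
                "transaction_id": self.transaction_id,
                "parameters": self.parameters,
                "prev_hash": self.prev_hash,
            },
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, transaction_id, parameters):
    """Return a new chain with one more block linked to the last one."""
    prev_hash = chain[-1].digest() if chain else GENESIS_HASH
    return chain + [InfoBlock(transaction_id, dict(parameters), prev_hash)]


def is_valid(chain):
    """Check that every block points at the digest of its predecessor."""
    expected = GENESIS_HASH
    for block in chain:
        if block.prev_hash != expected:
            return False
        expected = block.digest()
    return True


# Hypothetical property registration followed by a utility transfer.
chain = append_block([], "TXN-001", {
    "requester_id": "PAN-XXXXX0001",
    "property_index_no": "IDX-1234",
    "building_permit_id": "BP-77",
    "seller_id": "PAN-XXXXX0002",
    "survey_record_id": "CS-900",
})
chain = append_block(chain, "TXN-002", {
    "requester_id": "PAN-XXXXX0001",
    "utility_connection_id": "UTIL-55",
})
print("chain valid:", is_valid(chain))
```

Here the second transaction shares the requester ID with the first, which mirrors the idea that a later transaction on any shared parameter revalidates the earlier block.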
Fig. 10.5 Illustrative use case for information linkage
10.7 Implementation Plan

Implementing the concept described above calls for a collaborative effort by all the stakeholders that form part of the city service ecosystem. As a primary step, all ongoing initiatives need to be mapped for the information parameters they capture as part of the current process, and a consistent approach is needed to standardize these, to eliminate duplication of effort and to capture each piece of information only once as far as possible. The citizen or end user needs to be given sufficient clarity regarding the process requirements of the services under reference, much like a menu card in which the categories on offer, specifications, and costs are given concisely. Forming a city service catalogue and listing the inputs of each service is the initial need to enable the concept. Forming an end-user directory is another step that needs to scale up in existing applications, with the user IDs synchronized. Further, Application Programming Interfaces (APIs) need to be developed to enable the federated identity solution, which generates a Globally Unique Identifier (GUID) to facilitate the information linkage. These GUIDs are stored in the respective provider databases and invoked by service transactions to provide the validations; updates are also recorded after successful completion of the transaction. Figure 10.6 outlines the stages and key activities for implementing the concept described.
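The GUID-based linkage can be sketched as follows, assuming each provider derives the identifier from the same unique linkage parameter so that records line up without sharing local user IDs. The namespace string, the `ProviderDirectory` class, and the sample keys are hypothetical; a production design would also need to protect the linkage parameter itself, which this sketch does not attempt.

```python
import uuid

# Hypothetical namespace shared by all providers in the circle of trust.
CITY_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "city-services.example")


def make_guid(linkage_key: str) -> uuid.UUID:
    """Derive a deterministic GUID from the unique linkage parameter."""
    return uuid.uuid5(CITY_NAMESPACE, linkage_key)


class ProviderDirectory:
    """Per-provider store mapping GUIDs to that provider's local user IDs."""

    def __init__(self, name):
        self.name = name
        self._local_id_by_guid = {}

    def register(self, linkage_key, local_user_id):
        guid = make_guid(linkage_key)
        self._local_id_by_guid[guid] = local_user_id
        return guid

    def lookup(self, guid):
        # Returns None when this provider has no record for the GUID.
        return self._local_id_by_guid.get(guid)


municipal = ProviderDirectory("municipal")
registrar = ProviderDirectory("registrar")

# The same citizen registers with both providers using the same key.
guid_a = municipal.register("ID-0001", "MUN-42")
guid_b = registrar.register("ID-0001", "REG-7")

print("same GUID at both providers:", guid_a == guid_b)
print("registrar record for municipal GUID:", registrar.lookup(guid_a))
```

Deriving the GUID deterministically is one design choice; a central issuing service that hands out random GUIDs would serve the same linkage purpose at the cost of an extra lookup.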
Fig. 10.6 Roadmap to implement information exchange
10.8 Conclusion

The information linkage approach has the potential to provide several catalytic effects on service delivery through seamless validations. The acceleration of current service provision, along with the creation of information blocks for each service specific to the users and transactions, paves the way to form the chain of information that we today call Block chain.
References

1. Watkins, B.: Federated Identity Management: Validating Users from Other Organizations (2005)
2. Making Process Lean. Article published in Express Computers (2008)
3. Managing User Identity in e-Governance Projects. NCEG (2010)
4. EODB Grand Challenge Solution; Government of India (2019)
5. Gharpure, P.: Service Delivery Process Framework: Lifecycle Approach (2022)
6. Sathya, N.S.: Securing Land Records through Block Chain. NIC Presentation
Chapter 11
An Effective Computing Approach for Damaged Crop Analysis in Chhattisgarh

Syed Zishan Ali, Mansi Gupta, Shubhangi Gupta, and Vipasha Sharma

Abstract In today’s world, the earth’s temperature is rising rapidly, with adverse effects on several environmental factors. As a result, climatic conditions are changing drastically and affecting our world. Sudden heavy rainfall causes severe damage, and agriculture is one of the key areas that suffers the most, with crop losses contributing to loss of human lives through farmer suicides.

Keywords Damaged crop · Unspoiled crop · Clustering
11.1 Introduction

Agriculture, which has existed since the dawn of time, is no longer regarded as being as conventional and conservative as it once was. Modern farming is digital, technological, and innovative, and this trend is likely to continue as the world’s population grows. Agriculture is one of the world’s fastest-growing businesses, and it does not appear to be slowing down anytime soon. Torrential rains and flash floods in the Indian subcontinent wreak havoc on property, people, agricultural and horticultural crops, and livestock, potentially causing a crisis in the rural economy, livelihood, and environment. Chhattisgarh is located in Central India, between 17°46′ and 24°05′ N latitude and 80°15′ and 84°20′ E longitude, with Uttar Pradesh to the north, Jharkhand to the northeast, Orissa to the east, Telangana to the south, Maharashtra to the southwest, and Madhya Pradesh to the north and northwest [1]. The state covers a total land area of 135,194 km². Chhattisgarh is India’s tenth largest and sixteenth most populous state. In terms of forest area, Chhattisgarh is ranked third in the country with 55,611 km², an increase of 64 km² since last year. Forest occupies 41.13% of the state’s geographical area. The state’s favorable soil and climatic conditions have helped it become one of the country’s main producers of paddy, jowar, groundnut, gram, oilseeds, and wheat. Chhattisgarh is mostly an agrarian state, known for rice cultivation and dubbed the “rice bowl” of the country. Agriculture provides a living for almost 80% of the population. Chhattisgarh is divided into three major agro-climatic zones: the plains of Chhattisgarh, the plateau of Bastar, and the hills of Northern Chhattisgarh, which cover 50.52%, 28.62%, and 20.86% of the state’s total surface area, respectively.

S. Z. Ali (B) · M. Gupta · S. Gupta · V. Sharma Department of Computer Science and Engineering, Bhilai Institute of Technology Raipur, Raipur, Chhattisgarh, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_11
11.2 Factors Affecting Crops

Agriculture depends on the following factors:
11.2.1 Rainfall

The study of precipitation patterns is crucial for any region’s agricultural planning. Climate conditions vary widely across the state, with total annual rainfall of 1200–1600 mm on average. In different parts of Chhattisgarh, annual rainfall is received over roughly 64–91 rainy days [1]. Almost 90% of the rainfall in Chhattisgarh falls during the monsoon season, which runs from June to September. Annual rainfall of 1000–1200, 1200–1400, and 1400–1600 mm has been recorded across the plains of Chhattisgarh, the Bastar plateau, and the northern hills. Temperature and humidity fluctuate in a similar way (Table 11.1). In tropical regions, daily rainfall and rainfall on other timescales play a significant role in meteorological phenomena and aid in estimating agricultural land use potential and in hydrological investigations. Furthermore, effective crop planning in any region requires a thorough grasp of agro-climatic variables [3].
11.2.2 Soil

Information on soil hydro-physical qualities may aid in the development of enhanced water management techniques and contingency crop planning for irrigated and non-irrigated areas in order to improve the region’s yield stabilization prospects [4]. Most soil features and characteristics in the region vary in direct proportion to their location on the landscape. The action and interactions of relief, parent material, and climate are primarily responsible for the development of Chhattisgarh’s soils. Biotic features, particularly natural vegetation, are influenced by climatic trends.
Table 11.1 District-wise rainfall distribution in Chhattisgarh [2]

| S. No. | MET. States/District | Actual (mm) | Normal (mm) | %DEP | CAT | Actual | Normal | %DEP |
|---|---|---|---|---|---|---|---|---|
| 1 | BALRAMPUR | 0.0 | 0.0 | −100 | NR | 42.8 | 58.1 | −26 |
| 2 | BALOD | 0.0 | 0.0 | −100 | NR | 19.0 | 54.4 | −65 |
| 3 | BALODA BAZAR | 0.0 | 0.1 | −100 | NR | 35.5 | 49.8 | 29 |
| 4 | BASTAR | 0.0 | 0.7 | −100 | NR | 110.4 | 108.4 | 2 |
| 5 | BAMETARA | 0.0 | 0.5 | −100 | NR | 51.1 | 46.3 | −10 |
| 6 | BIJAPUR | 0.0 | 0.1 | −100 | NR | 75.6 | 91.0 | −17 |
| 7 | BILASPUR | 0.0 | 0.0 | −100 | NR | 34.6 | 50.7 | −32 |
| 8 | DANTEWARA | 0.0 | 0.2 | −100 | NR | 46.7 | 96.6 | −52 |
| 9 | DHAMTARI | 0.0 | 0.0 | −100 | NR | 23.9 | 50.1 | −52 |
| 10 | DURG | 0.0 | 0.0 | −100 | NR | 9.4 | 51.1 | −82 |
| 11 | GARIABAND | 0.0 | 0.0 | −100 | NR | 25.7 | 65.8 | −61 |
| 12 | JANJGIR CHAMPA | 0.0 | 0.1 | −100 | NR | 23.6 | 43.7 | −46 |
| 13 | JASHPUR | 0.0 | 0.2 | −100 | NR | 49.9 | 83.8 | −40 |
| 14 | KABIRDHAM | 0.0 | 0.2 | −100 | NR | 105.0 | 65.2 | −61 |
| 15 | KANKER | 0.0 | 0.0 | −100 | NR | 41.3 | 59.7 | −31 |
| 16 | KONDAGAON | 0.0 | 0.1 | −100 | NR | 24.4 | 71.9 | −66 |
| 17 | KORBA | 0.0 | 0.1 | −100 | NR | 36.1 | 49.8 | −28 |
| 18 | KOREA | 0.0 | 0.3 | −100 | NR | 23.5 | 38.5 | −39 |
| 19 | MAHASAMUND | 0.0 | 0.0 | −100 | NR | 11.6 | 42.7 | −73 |
| 20 | MUNGELI | 0.0 | 0.0 | −100 | NR | 25.3 | 44.3 | −43 |
| 21 | NARAYANPUR | 0.0 | 0.0 | −100 | NR | 35.5 | 79.3 | −55 |
| 22 | RAIGARH | 0.0 | 0.1 | −100 | NR | 19.5 | 52.8 | −63 |
| 23 | RAIPUR | 0.0 | 0.0 | −100 | NR | 27.3 | 50.9 | −46 |
| 24 | RAJNANDGOAN | 0.0 | 0.3 | −100 | NR | 41.6 | 48.9 | −15 |
| 25 | SUKMA | 0.0 | 0.8 | −100 | NR | 58.4 | 89.9 | −34 |
| 26 | SURAJPUR | 0.0 | 0.0 | −100 | NR | 26.8 | 62.0 | −57 |
| 27 | SURGUJA | 0.0 | 0.1 | −100 | NR | 25.8 | 69.1 | −63 |
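The %DEP columns in Table 11.1 appear to give the percentage departure of actual rainfall from normal rainfall. The sketch below assumes the usual definition, 100 × (actual − normal) / normal, and checks it against a few rows of the second group of columns; the function name is hypothetical, and the formula is an assumption rather than something the table states.

```python
def percent_departure(actual_mm, normal_mm):
    """Percentage departure of actual rainfall from normal rainfall.

    Returns None when the normal is zero, because the ratio is undefined.
    """
    if normal_mm == 0:
        return None
    return 100.0 * (actual_mm - normal_mm) / normal_mm


# (district, actual, normal) taken from the second group of columns.
samples = [
    ("BALRAMPUR", 42.8, 58.1),
    ("BALOD", 19.0, 54.4),
    ("BASTAR", 110.4, 108.4),
]

for district, actual, normal in samples:
    dep = percent_departure(actual, normal)
    print(f"{district}: {dep:.1f}%")
```

Rounded to whole numbers, these three values agree with the tabulated departures for the same districts.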
11.2.3 Groundwater

Since the beginning of time, groundwater has been used in India for irrigation, domestic use, and industry. Groundwater is a major source of irrigation in India, accounting for almost 45% of total irrigation. Although India’s groundwater resources amount to 433.02 billion cubic meters, only 58% is used in agricultural, residential, and industrial applications [5]. The large variations in groundwater usage in India’s eastern region are mostly owing to rainfall during the primary crop season (the rainy season) exceeding agricultural water need, so that crops do not require irrigation except during protracted dry spells. The Indian state of Chhattisgarh was created out of Madhya Pradesh in November 2000, and it has abundant natural resources to help it prosper. Groundwater is a vital irrigation source for growing crops all year and for improving the lives of poor farmers who are otherwise forced to migrate to neighboring states for a better life. Although Chhattisgarh has abundant groundwater resources, their development is severely hampered. Farmers may get a decent return from their land if these resources are utilized during the rabi and summer seasons. Rainfall is the most important source of groundwater recharge. Canals, streams, ponds, and springs, among other things, also play a significant role in improving groundwater levels. Chhattisgarh, which lies in India’s agro-ecological regions eleven and twelve, has a subtropical climate typified by high summer heat and moderate winters and receives an average annual rainfall of roughly 1400 mm in 73 rainy days (ranging from 5 to 70 years).
11.2.4 E-Waste

With the advancement of technology and the proliferation of electronic items manufactured over the last few decades, e-waste disposal has become a significant concern [6]. E-waste is a broad term that refers to a variety of electrical and electronic devices that are no longer useful to their owners, and it is one of the fastest-growing waste streams. E-waste is complicated, non-biodegradable garbage that is commonly piled into mountains. These wastes are claimed to include substantial amounts of lead, cadmium, arsenic, and other heavy metals. It is important to dispose of them carefully because they have the potential to alter soil and water characteristics. Solid waste management is a burgeoning field that aspires to make the world a better place. When e-waste is burned on an open landfill, the wind pattern of the area carries harmful particles into the soil-crop-food chain. Calcium levels that are either too low or too high can create issues.
11.3 Methodology
The methodology of this research is based on taking crop images, marking the length of the field area, and identifying the damaged area. Once a damaged area is detected, a drone captures images of the land and sends them to a database for further processing. There, each image is processed, and a cluster is formed over the damaged crop area. After this step, a report is generated that gives details of the damaged and unspoiled crop areas, along with a graph that shows the percentage of each. We implemented our work at Kendri village, Raipur (Chhattisgarh) (Fig. 11.3) and tested the measurement of field length using a drone (Fig. 11.2) (Tables 11.2 and 11.3).
Step 1 Capture image through drone.
Step 2 Captured image will be sent to database.
Step 3 Processing of image based on damaged and unspoiled area by marking it.
Step 4 Cluster formation of damaged area.
Step 5 Damaged crop calculation.
Step 6 Report generation of damaged and unspoiled area.
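Steps 5 and 6 can be illustrated with a minimal Python sketch. It assumes that the clustering in Step 4 has already produced a Boolean mask of damaged pixels and that the ground area covered by the image is known (for example, the roughly 400 ft2 captured from a height of 20 ft quoted in the figure descriptions). The function name `damage_report` and the dictionary keys are hypothetical and are not taken from the authors' implementation.

```python
import numpy as np

def damage_report(damaged_mask, ground_area_sqft):
    """Summarize damaged vs. unspoiled area for one drone image.

    damaged_mask     -- 2-D Boolean array, True where a pixel was clustered
                        as damaged crop (assumed output of Step 4)
    ground_area_sqft -- ground area covered by the whole image, in ft^2
    """
    mask = np.asarray(damaged_mask, dtype=bool)
    damaged_fraction = mask.mean()  # share of pixels marked as damaged
    damaged_area = damaged_fraction * ground_area_sqft
    return {
        "damaged_area_sqft": damaged_area,
        "unspoiled_area_sqft": ground_area_sqft - damaged_area,
        "damaged_percent": 100.0 * damaged_fraction,
        "unspoiled_percent": 100.0 * (1.0 - damaged_fraction),
    }
```

The sketch treats every pixel as covering the same ground area, which is only a reasonable approximation for a camera pointing straight down.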
Field Work
See Figs. 11.1, 11.2, 11.3, 11.4, 11.5, 11.6, 11.7, 11.8, and 11.9.
Table 11.2 Equipment configuration
Parameter | Specification
Product position | Entry-level professional drone with powerful obstacle avoidance
Weight (battery and propellers included) | 1388 g
Max flight time | Approx. 30 min
Vision system | Forward vision system, backward vision system, downward vision system
Obstacle sensing | Front and rear obstacle avoidance; left and right infrared obstacle avoidance
Camera sensor | 1" CMOS; effective pixels: 20 M
Max. video recording resolution | 4K 60p
Max transmission distance | FCC: 4.3 mi
Video transmission system | Lightbridge
Operating frequency | 2.4 GHz/5.8 GHz*
* 5.8 GHz transmission is not available in some regions due to local regulations
Table 11.3 Figure description
Figure 11.3. Project location: Kendri, the village where we conducted our field research
Figure 11.4. We have utilized four bottles to measure the area of land that the drone has collected. For example, a drone captured 400 ft2 from a height of 20 ft, approx.
Figure 11.5. Aerial view of a drone taking photographs to determine the extent of land it is covering at various heights
Figure 11.6. Image of crop captured by the drone from the distance of 100 ft
Figure 11.7. Image of crop captured by the drone from the distance of 20 ft
Figure 11.8. Cluster formed in the area where the crop is damaged
Figure 11.9. A report will be generated which will be comprised of the area of: 1. Damaged crop 2. Unspoiled crop
Fig. 11.1 Proposed architecture
Fig. 11.2 Drone utilized
Fig. 11.3 Project site
11.4 Conclusion
Farmers will benefit from this research in terms of profitable farming, crop analysis, and farmer suicide prevention. It addresses the root causes of farmer suicide by bringing together, on a single platform, all of the knowledge required for high-yield crop production and all of the relevant facts. Another benefit of this project is that it provides information on yield protection, reduces rumors about crop damage, and sends accurate data to the government. The project's social purpose is to reduce the number of farmers who commit suicide as a result of large losses. By using our project, the government can easily analyze the amount of crop damaged and can circulate genuine information to farmers. Farmers' suicide rates are inversely proportional to the amount of agricultural productivity data available. Any state's administration can adopt this project. We will continue to work on our project in the future, using artificial intelligence and deep learning to make it more valuable to farmers.
Fig. 11.4 Top view
Fig. 11.5 Side view
Fig. 11.6 Top view of crop area (100 ft)
Fig. 11.7 Damaged crop
Fig. 11.8 Cluster over damaged area
Fig. 11.9 Report generation
References
1. Bhuarya, H.K., Sastri, A.S.R.A.S., Chandrawanshi, S.K., Bobade, P., Kaushik, D.K.: Agroclimatic characterization for agro-climatic zone of Chhattisgarh. Int. J. Curr. Microbiol. Appl. Sci. 7(08) (2018). ISSN: 2319-7706
2. https://mausam.imd.gov.in/imd_latest/contents/rainfall_statistics_3.php
3. Bhelawe, S., Chaudhary, J.L., Nain, A.S., Singh, R., Khavse, R., Chandrawanshi, S.K.: Rainfall variability in Chhattisgarh state using GIS. Department of Agrometeorology, Indira Gandhi Krishi Vishwa Vidyalaya, Krishak Nagar, Raipur, India (2014)
4. Ravender Singh, S.K., Chaudhari, D.K., Kundu, S.S., Sengar, Kumar A.: Soils of Chhattisgarh: characteristics and water management options. Water Technology Centre for Eastern Region (Indian Council of Agricultural Research), Bhubaneshwar, India (2006)
5. Singandhupe, R.B., Sethi, R.R.: Ground water resources scenario, its mining and crop planning in Chhattisgarh state of India. ICAR-Central Institute for Cotton Research, Nagpur, India. ICAR-Indian Institute of Water Management, Bhubaneswar, India (2016)
6. Dharini, K., Cynthia, J.B., Kamalambikai, B., Celestina, J.A.S., Muthu, D.: Hazardous e-waste and its impact on soil structure. School of Civil Engineering, SASTRA University, Thanjavur, India (2017)
Chapter 12
An Improve Approach in Core Point Detection for Secure Fingerprint Authentication System
Meghna B. Patel, Ronak B. Patel, Jagruti N. Patel, and Satyen M. Parikh
Abstract Fingerprint recognition is one of the oldest and most reliable biometric recognition methods. It is carried out through preprocessing and post-processing stages. Its two biggest challenges are reducing false features and matching an unequal number of minutiae features; both problems arise from rotation of the fingerprint. To overcome them, a center point detection method is used. The center point of the finger, known as the core point, is used for comparing two differently aligned or displaced fingerprints. The success rate and the speed of fingerprint verification and identification depend on core point detection, yet core point extraction remains a major challenge in fingerprint verification. The proposed core point extraction algorithm follows image enhancement, thinning, and minutiae extraction steps. The research work is implemented on the .Net platform with a custom fingerprint database of 2000 images from 200 different users.
Keywords Biometric · Fingerprint recognition · Finger image enhancement · Finger image thinning · Core point detection
M. B. Patel (B)
A.M. Patel Institute of Computer Studies, Ganpat University, Mehsana, Gujarat, India
R. B. Patel
SRIMCA, Uka Tarsadia University, Bardoli, Gujarat, India
J. N. Patel
DCS, Ganpat University, Mehsana, Gujarat, India
S. M. Parikh
FCA, Ganpat University, Mehsana, Gujarat, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_12
12.1 Introduction
In today's world of information security, biometrics is the most appropriate and secure way to perform personal authentication [1, 2]. It authenticates an individual based on the person's physiological or behavioral characteristics. Because of the uniqueness and consistency of fingerprints, fingerprint recognition is the most well-known and widely accepted biometric authentication method. Its popularity has grown because it is used in sectors such as law enforcement, national and international security, and government. Still, some issues remain in fingerprint recognition [3]. Fingerprint verification is split into four phases [4–6]. The first phase, acquisition, captures the image. The second phase, preprocessing [7], is further divided into image enhancement, image binarization, and thinning. In the third phase, features such as ridge bifurcations and ridge endings are extracted from the thinned image [8–12], and the center point or core point is detected. In the fourth and last phase, the features of two fingerprints are matched using a distance factor and a similarity matching method.
12.2 Related Work
The core point of the finger is used for verification as well as for identification. To increase the speed of these two processes, fingerprints are classified based on their shape, so core point detection is very important for classification [13, 14]. The research papers [15, 16] note that core point detection is used to classify fingerprints based on global ridge patterns. It is also useful for matching when the finger image is rotated, since the singular points of the global pattern are used to rotate an image. Existing algorithms use an estimate of the orientation field and the Poincaré index to detect the core point. These methods do not give precise and reliable results for noisy images. Some methods work well, but others fail on certain fingerprint patterns: they cannot identify the true singular point and instead detect spurious core points caused by smudges, scars, occlusions, etc. Problems therefore arise in post-processing strategies because they use the local characteristics of the core point. The research paper [17] designed a model-based method to calculate ridge orientation. It also proposes a combined model that represents the ridge orientation while taking into account the regularity near the core and delta points. The point-charge model is used to increase the precision near each core and delta point, while the polynomial model is used to describe the orientation field. The authors also note that, for orientation estimation, the filterbank-based approach is more robust than the gradient-based approach but is inefficient and computationally expensive.
The research paper [18] introduces a method for detecting the core point using a region of interest. The direction of curvature (DC) method is used to detect coarse core points, while the geometry of region (GR) method is used to refine them. It also indicates that orientation estimation is the best basis for locating the fine core point. The research paper [19] offers a hierarchical method for classifying fingerprints and shows a reduction in misclassification errors after applying it. The approach searches a clustered database for the finger image using a query-based method. The clustering method segments finger images properly but does not detect the core point effectively. The research paper [20] presents an algorithm that locates the exact core point based on several features retrieved from the finger image, using coherence, the orientation field, and the Poincaré index. Core point detection is critical in developing a correlation-based automatic fingerprint recognition system (AFIS), but such systems still fail to detect the precise core and delta points. The research paper [21] employs core point detection to classify finger images with a data mining method. First, a numeric code is generated for each finger image based on its ridge flow pattern. A seed value and a reliable classification procedure are then chosen for each class, and the finger images are segmented into separate groups using a clustering algorithm. The research paper [22] proposes a method to extract reliable and accurate singular points and so tackle the difficulty described for the previous methods. To enhance the finger image, the approach employs short-time Fourier transform (STFT) analysis.
The algorithm retrieves the center point, but it requires more processing time and is more complex to implement. The research paper [23] suggests an approach that uses a gradient-based method to estimate the ridge orientation and then uses neighborhood averaging to build a smooth orientation field from which a reliable center point is extracted. This approach is simple to compute and yields a consistent center point. The research paper [24] uses a filter-based method with a Gabor filter bank and a compact fixed-length finger code to extract local and global features. Because finger image recognition is based on comparing the feature vectors of two finger codes, this method is faster and more accurate than the minutiae-based method. Its drawback is that it cannot extract a singular point for matching. The research paper [25] combines multi-resolution analysis with the Poincaré index and introduces a singular point extraction algorithm in which the Poincaré index is improved using a zero-pole model to detect singular points at varied resolutions. The image is divided into non-overlapping blocks of varying sizes to extract multi-resolution information from the finger image. Wavelet functions are used to compute directional fields at various resolutions, and the sampling theorem is used to adjust the position of the block.
The research paper [26] proposes an enhanced core point detection algorithm that applies the normalization method [27], followed by enhanced gradient-based orientation estimation to remove inconsistency [28, 29], and then detects the core point using the Poincaré index. The algorithm was implemented and tested on two databases, FVC2000 and FingerDoS, and increased the accuracy.
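Several of the methods reviewed above rely on the Poincaré index. As a minimal sketch, assuming a pixel-wise orientation field in radians (defined modulo π) and a closed path over the 8-neighborhood, the index can be computed as the sum of wrapped orientation differences divided by 2π. The names `LOOP`, `wrap_orientation`, and `poincare_index` are hypothetical, and the sign of the result depends on the traversal direction chosen here; the cited papers may use block-wise fields or larger paths.

```python
import numpy as np

# 8-neighborhood offsets (row, col), visited as one closed loop around a pixel.
LOOP = [(0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1)]

def wrap_orientation(delta):
    """Wrap an orientation difference into [-pi/2, pi/2), since ridge
    orientations are only defined modulo pi."""
    return (delta + np.pi / 2) % np.pi - np.pi / 2

def poincare_index(theta, i, j):
    """Poincare index of orientation field `theta` at pixel (i, j).

    With this traversal, a core-like field gives about +0.5, a delta-like
    field about -0.5, and a smooth field about 0.
    """
    angles = [theta[i + di, j + dj] for di, dj in LOOP]
    total = 0.0
    for k in range(len(angles)):
        nxt = angles[(k + 1) % len(angles)]
        total += wrap_orientation(nxt - angles[k])
    return total / (2 * np.pi)
```

A synthetic field such as θ = φ/2, where φ is the polar angle around the pixel, can be used to check the sign convention before applying the function to real orientation estimates.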
12.3 Proposed Research Work
Numerous matching methods exist in fingerprint recognition. The proposed local center point matching method is effective and easy to understand. The center point is the topmost curved point shown in Fig. 12.1. The geometry of region (GR) technique is one of the best algorithms for core (center) point detection. The following steps show the GR algorithm.
(1) Compute the smoothed orientation field θ(i, j) using an orientation estimation method.
(2) Calculate ε(i, j), the sine component of θ(i, j):
$$\varepsilon(i, j) = \sin\theta(i, j) \tag{12.1}$$
(3) Initialize a label image A, which is used to indicate the core or center point.
(4) Assign to the corresponding pixel in A the difference between the integrated pixel intensities of the two regions:
$$A(i, j) = \sum_{R_1} \varepsilon(i, j) - \sum_{R_2} \varepsilon(i, j) \tag{12.2}$$
Here, the regions R1 and R2 have been determined experimentally, and their geometry is designed to capture the maximum curvature of the ridges. The radius is set to 10–15 pixels, which covers at least 1 edge. Also, R1, which is interspersed with R2, must contain the maximal point.
(5) Find the maximal value in A and assign its coordinates as the center point.
(6) Repeat steps 1–5 while reducing the window size in step 1 until the center point is located.
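A single pass of steps 2 to 5 can be sketched in Python as follows. The source states only that R1 and R2 were determined experimentally, so the geometry used here (R1 as the half-disk below the pixel, R2 as the half-disk above it) is purely an assumption for illustration, as are the names `half_disk_offsets` and `gr_core_point`. The iterative window reduction of step 6 is omitted.

```python
import numpy as np

def half_disk_offsets(radius, lower):
    """(row, col) offsets of a half-disk below (lower=True) or above the pixel.
    Hypothetical stand-in for the experimentally chosen regions R1 and R2."""
    offsets = []
    for di in range(-radius, radius + 1):
        for dj in range(-radius, radius + 1):
            inside = di * di + dj * dj <= radius * radius
            if inside and ((lower and di > 0) or (not lower and di < 0)):
                offsets.append((di, dj))
    return offsets

def gr_core_point(theta, radius=10):
    """Return (row, col) of the maximum of the label image A, Eq. (12.2)."""
    eps = np.sin(theta)                              # Eq. (12.1)
    h, w = eps.shape
    A = np.full((h, w), -np.inf)                     # borders are never selected
    r1 = half_disk_offsets(radius, lower=True)       # assumed region R1
    r2 = half_disk_offsets(radius, lower=False)      # assumed region R2
    for i in range(radius, h - radius):
        for j in range(radius, w - radius):
            s1 = sum(eps[i + di, j + dj] for di, dj in r1)
            s2 = sum(eps[i + di, j + dj] for di, dj in r2)
            A[i, j] = s1 - s2
    row, col = np.unravel_index(np.argmax(A), A.shape)
    return int(row), int(col)
```

The nested Python loops keep the correspondence with Eq. (12.2) explicit; a practical implementation would more likely precompute the two region sums with a convolution.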
Fig. 12.1 Core point
After these steps are performed, the center point is extracted and saved to the database. The features of two images are stored in the database so that the images can be matched. For each minutia, the row and column (the X and Y position of the pixel), the type of ridge feature (bifurcation or ridge ending), and the absolute distance from the center point are extracted. Two minutiae are considered a matched pair if they are of the same type and the difference between their absolute distances is ≥ 0 and ≤ Er, where Er stands for the error rate value. Er is required because two images captured from the same finger commonly yield different, non-zero distance values. After all minutiae pairs have been matched, the recognition result is reported as a match rate rather than as a Boolean value indicating whether the two images match [28].
$$R = \frac{k \times k}{m \times n} \tag{12.3}$$
where R represents the matching rate, k the number of matched pairs, m the number of minutiae in the template image, and n the number of minutiae in the input image. If R is larger than the threshold value, the two fingerprint images could be two copies of the same finger.
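The pairing rule and Eq. (12.3) can be sketched as follows. The `Minutia` record and the greedy one-to-one pairing strategy are assumptions made for illustration; the source does not say how ties between candidate pairs are resolved.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Minutia:
    row: int      # Y position of the pixel
    col: int      # X position of the pixel
    kind: str     # "bifurcation" or "ending"
    dist: float   # absolute distance from the center point

def match_rate(template, query, er):
    """Matching rate R = (k * k) / (m * n) from Eq. (12.3).

    Two minutiae pair up when they have the same type and their distances
    from the center point differ by at most `er`. Each query minutia is used
    at most once (greedy pairing, an assumption of this sketch).
    """
    used = set()
    k = 0
    for t in template:
        for idx, q in enumerate(query):
            if idx in used:
                continue
            if t.kind == q.kind and abs(t.dist - q.dist) <= er:
                used.add(idx)
                k += 1
                break
    m, n = len(template), len(query)
    return (k * k) / (m * n) if m and n else 0.0
```

Because only the distance from the center point is compared, the score does not change when the fingerprint is rotated about that point, which is the motivation given for detecting the center point first.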
12.4 Implementation and Result Discussion
The research is implemented on the .Net platform with a custom fingerprint database of 2000 images from 200 different users. Image enhancement is a necessary phase for improving image quality by removing noise, joining broken ridges, and smoothing. Here, a composite of a Gaussian mask and the Sobel convolution algorithm is used to improve image quality [29]. The Gaussian mask is used to smooth the image, and 3 × 3 Sobel convolutions are used to detect contours [30, 32]. The Zhang-Suen thinning algorithm is applied for image thinning [10, 31]. The result of minutiae extraction and center point detection using the above algorithm is shown in Fig. 12.2. The matching of the two images, the original input image and the enrolled image, is shown in Fig. 12.3.
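The enhancement stage can be illustrated with a small NumPy sketch that smooths an image with a 3 × 3 Gaussian mask and then applies 3 × 3 Sobel kernels to obtain a gradient magnitude. The authors' implementation runs on the .Net platform and its mask size and coefficients are not given, so the kernels and the names `correlate2d` and `enhance` below are assumptions.

```python
import numpy as np

GAUSSIAN_3X3 = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], dtype=float) / 16.0
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def correlate2d(image, kernel):
    """Same-size 2-D correlation with edge-replicated borders.
    (The kernel is not flipped; this only changes the sign of the Sobel
    responses, not the gradient magnitude.)"""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(np.asarray(image, dtype=float), ((ph, ph), (pw, pw)), mode="edge")
    h, w = np.asarray(image).shape
    out = np.zeros((h, w), dtype=float)
    for di in range(kh):
        for dj in range(kw):
            out += kernel[di, dj] * padded[di:di + h, dj:dj + w]
    return out

def enhance(image):
    """Return (smoothed image, Sobel gradient magnitude of the smoothed image)."""
    smoothed = correlate2d(image, GAUSSIAN_3X3)
    gx = correlate2d(smoothed, SOBEL_X)
    gy = correlate2d(smoothed, SOBEL_Y)
    return smoothed, np.hypot(gx, gy)
```

Smoothing before differentiation reduces the noise that the Sobel kernels would otherwise amplify, which matches the order described in the text.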
12.5 Conclusion and Future Work
The proposed algorithm improves the performance of matching rotated and displaced images after detecting the core point of the finger image. The algorithm was tested on a custom database of 2000 images from 200 users, and its overall accuracy is 87%. In future work, the experimental results can be checked against other existing databases, such as FVC2000 and FingerDoS, which contain more images.
Fig. 12.2 Result of feature extraction and center point detection (samples 101_01 to 106_01, each shown as Original Image, Enhanced Image, Thinned Image, and Minutia Extraction and Core Point Detection)
Original Input Image | Enrolled Image in Database | No. of Minutiae in Template | Matched Minutiae Pairs | Matching Score
101_02_Original | 101_01_Enrolled Image | 35 | 32 | 91.42
102_03_Original | 102_01_Enrolled Image | 28 | 23 | 82.14
103_02_Original | 103_01_Enrolled Image | 26 | 21 | 80.77
104_04_Original | 104_01_Enrolled Image | 17 | 12 | 70.59
105_01_Original | 105_01_Enrolled Image | 46 | 46 | 100
106_03_Original | 106_01_Enrolled Image | 28 | 26 | 92.86
Fig. 12.3 Results of fingerprint recognition
References
1. Patel, R.B., Patel, M.B., Nakrani, T., Patel, P.: A comparative study of various authentication methods. Gradiva Rev. J. 7(6), 417–421 (2021)
2. Patel, M., Parikh, S.M., Patel, A.R.: Comparative study of handwritten character recognition system for Indian languages. In: Proceedings of 5th International Conference Information and Communication Technology for Intelligent Systems (ICTIS-2021) held during 23–24 April 2021
3. Kiefer, R., Stevens, J., Patel, A., Patel, M.: A survey on spoofing detection systems for fake fingerprint presentation attacks. In: 4th International Conference on Information and Communication Technology for Intelligent Systems, pp. 315–334. Springer, Singapore (2020)
4. Ali, M.M., Mahale, V.H., Yannawar, P., Gaikwad, A.T.: Fingerprint recognition for person identification and verification based on details matching. In: 2016 IEEE 6th International Conference on Advanced Computing (IACC), pp. 332–339. IEEE (2016)
5. Chugh, T., Arora, S.S., Jain, A.K., Paulter, N.G.: Benchmarking fingerprint details extractors. In: 2017 International Conference of the Biometrics Special Interest Group (BIOSIG), pp. 1–8. IEEE (2017)
6. Patel, M.B., Patel, R.B., Patel, A.R.: Components of Fingerprint Biometric System. Int. J. Eng. Res. Technol. (IJERT) 1(3), 1–5 (2012)
7. Patel, M.B., Parikh, S.M., Patel, A.R.: Performance improvement in preprocessing phase of fingerprint recognition. In: Information and Communication Technology for Intelligent Systems, pp. 521–530. Springer, Singapore (2019)
8. Patel, M.B., Patel, R.B., Parikh, S.M., Patel, A.R.: An improved O'Gorman filter for fingerprint image enhancement. In: 2017 International Conference on Energy, Communication, Data Analytics and Soft Computing (ICECDS), pp. 200–209. IEEE (2017)
9. Patel, M.B., Parikh, S.M., Patel, A.R.: Performance improvement in binarization for fingerprint recognition. IOSR J. Comput. Eng. 19(3), 68–74 (2017)
10. Patel, M.B., Parikh, S.M., Patel, A.R.: An improved thinning algorithm for fingerprint recognition. Int. J. Adv. Res. Comput. Sci. 8(7), 1238–1244 (2017)
11. Patel, M.B., Parikh, S.M., Patel, A.R.: An improved approach in fingerprint recognition algorithm. In: Smart Computational Strategies: Theoretical and Practical Aspects, pp. 135–151. Springer, Singapore (2019)
12. Patel, M.B., Parikh, S.M., Patel, A.R.: An approach for scaling up performance of fingerprint recognition. Int. J. Comp. Sci. Eng. 7(5), 457–461 (2019)
13. Patel, M.B., Patel, N.J., Patel, A.R.: Comparative study on different fingerprint classification approaches. Int. J. Comput. Technol. Appl. 5(1), 189 (2014)
14. Patel, M.B., Patel, A.R.: Performance improvement by classification approach for fingerprint identification system
15. Iwasokun, G.B., Ojo, S.O.: Review and evaluation of fingerprint singular point detection algorithms. Br. J. Appl. Sci. Technol. 4(35), 4918 (2014)
16. Gnanasivam, P., Muttan, S.: An efficient algorithm for fingerprint preprocessing and feature extraction. Procedia Comput. Sci. 2, 133–142 (2010)
17. Zhou, J., Chen, F., Gu, J.: A novel algorithm for detecting singular points from fingerprint images. IEEE Trans. Pattern Anal. Mach. Intell. 31(7), 1239–1250 (2009)
18. Atipat, J., Somsak, C.: An algorithm for fingerprint core point detection. In: Proceedings of IEEE International Symposium on Signal Processing and its Applications (ISSPA 2007) (2007)
19. Bhuyan, M.H., Bhattacharyya, D.K.: An effective fingerprint classification and search method. IJCSNS Int. J. Comput. Sci. Netw. Secur. 9(11), 39–48 (2009)
20. Kekre, H.B., Bharadi, V.A.: Fingerprint core point detection algorithm using orientation field based multiple features. Int. J. Comput. Appl. 1(15), 97–103 (2009). ISSN: 0975–8887
21. Bhuyan, M.H., Saharia, S., Bhattacharyya, D.K.: An effective method for fingerprint classification. Int. Arab J. e-Technol. 1(3), 89–97 (2010)
22. Khalil, M.S., Muhammad, D., Khan, M.K., Alghathbar, V.: Singular points detection using fingerprint orientation field reliability. Int. J. Phys. Sci. 5(4), 352–357 (2010). ISSN 1992–1950
23. Mishra, A., Shandilya, M.: Fingerprint's core point detection using gradient field mask. Int. J. Comput. Appl. 2(8), 19–23 (2010). ISSN: 0975–8887
24. Prajakta, M.M., Ramesh, P.A.: A novel approach to fingerprint identification using Gabor filter-bank. ACEEE Int. J. Netw. Secur. 2(3), 10–14 (2011). 01.IJNS.02.03.159
25. Weng, D., Yin, Y., Yang, D.: Singular points detection based on multi-resolution in fingerprint images. Neurocomputing 74(17), 3376–3388 (2011)
26. Patel, M., Parikh, S.M., Patel, A.R.: An improved approach in core point detection algorithm for fingerprint recognition. In: Proceedings of 3rd International Conference on Internet of Things and Connected Technologies (ICIoTCT), pp. 26–27 (2018)
27. Patel, M.B., Parikh, S.M., Patel, A.R.: Global normalization for fingerprint image enhancement. In: International Conference on Computational Vision and Bio Inspired Computing, pp. 1059–1066. Springer, Cham (2019)
28. Patel, M.B., Parikh, S.M., Patel, A.R.: Performance improvement in gradient based algorithm for the estimation of fingerprint orientation fields. Int. J. Comput. Appl. 167(2), 12–18 (2017)
29. Patel, R.B., Patel, M.B., et al.: Performance improvement in fingerprint image enhancement using Gaussian mask and Sobel convolution. In: 10th International Conference on Transformation of Business, Economy and Society in Digital Era (2019)
30. Patel, R., Hiran, D., Patel, J.: Fingerprint image thinning by applying Zhang–Suen algorithm on enhanced fingerprint image. Int. J. Comput. Sci. Eng. 7(4) (2019). ISSN: 2320–7639
31. Patel, R., Hiran, D., Patel, J.: An algorithm for fingerprint details extraction. Int. J. Comput. Sci. Eng. 7(6) (2019). ISSN: 2347–2693
32. Patel, M.B., Patel, J.N., Parikh, S.M., Patel, A.R.: Comparison on different filters for performance improvement on fingerprint image enhancement. In: 4th International Conference on Smart Computing and Informatics (SCI-2020), Springer (2020)
Chapter 13
Investigating i-Vector Framework for Speaker Verification in Wild Conditions
Asmita Nirmal and Deepak Jayaswal
Abstract Nowadays, most mobile and handheld devices use speech and speaker verification systems (SVS). Even though these systems give satisfactory performance in constrained conditions, there are a number of real-life unconstrained conditions where their performance is not satisfactory. Such SVS cannot accurately authenticate a person when deployed in applications with varying environmental or channel conditions. The actual conditions may be very different from those used during system training. This creates a large uncertainty in the verification scores obtained during the evaluation phase of the system. In this regard, we have implemented a verification system using the state-of-the-art i-vector-based approach. It is based on the total variability subspace (TVS), which models both session and channel variabilities using a single low-dimensional space instead of two different subspaces. Our experiments are conducted using data taken from the speakers in the wild (SITW) database, and the equal error rate (EER) we have obtained is 23.16%.
Keywords i-vector · SVS · TVS · SITW · EER
A. Nirmal (B)
Datta Meghe College of Engineering, Navi Mumbai, India
D. Jayaswal
St. Francis Institute of Technology, Mumbai, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_13
13.1 Introduction
Speaker verification is the task of recognizing a person from his/her speech utterances [1]. Over the last few decades, the development of digital technology has driven fast progress in speaker verification devices. However, the performance of these systems degrades in practical applications. Mismatch between the training and testing conditions of speech utterances is the main cause of degradation. These systems are trained assuming very few of the possible variable conditions of speech utterances. However, the acoustic or environmental conditions observed during practical application may be very different from those assumed during training. Thus, there is a large possibility of verification error during actual use of the system. To handle this training-testing mismatch, it is important to compensate for acoustic or environmental variability in the speech utterances so as to make the verification system robust against real-life conditions. Over the years, various feature extraction methods, modeling techniques, and compensation methods have been recommended in the literature to make the SVS robust in real-life conditions. Initially, the Gaussian mixture model (GMM) was used to model speakers as a linear combination of Gaussian distributions [2]. Next, the maximum a posteriori (MAP) adaptation approach was proposed, in which the mean supervectors of a universal background model (UBM) are adapted to speaker-specific features to obtain a speaker model. This modeling approach is known as GMM-UBM [3]. The issue with GMM-based systems is that their performance drops drastically in uncontrolled conditions. The main focus of subsequent studies was to deal with variable conditions of speech utterances while maintaining satisfactory performance. With this view, the application of support vector machines (SVM) in the GMM supervector space was proposed in [4]. Following this, a technique known as joint factor analysis (JFA) [5] was proposed, which compensates for speaker and channel variabilities in the GMM supervector space by modeling them separately. Further, the use of SVM along with JFA was proposed in [6]. Here, the speaker factors are obtained using JFA and used to train the SVM. This approach used different SVM kernels and showed remarkable results with the cosine kernel. After that, the authors in [7] discovered that the channel factors extracted using the JFA approach also contain speaker information. In this regard, an approach based on TVS was proposed for speaker modeling.
This approach uses a single low-dimensional subspace, known as the total variability subspace, to model both variabilities rather than using two separate high-dimensional subspaces. As the TVS models both the intra- and inter-speaker variabilities, the i-vectors extracted from it also contain these variabilities and therefore need to be compensated. Further, variabilities in the speaker audio recording conditions also cause significant changes in the Baum–Welch (BW) statistics. Computing these statistics is one of the intermediate steps in the i-vector extraction process; they are computed using the UBM and the speaker-specific features extracted from speaker recordings. Thus, the overall effect of uncontrolled conditions is uncertainty in the i-vector extraction process. Therefore, a number of compensation approaches have been suggested in the literature to deal with this issue. They include linear discriminant analysis (LDA) [8], within-class covariance normalization (WCCN) [9], and nuisance attribute projection (NAP) [10]. All these methods aim to maximize between-speaker variance and minimize intra-speaker variance. In addition, score domain compensation techniques, including cosine similarity scoring (CSS) [11] and probabilistic LDA (PLDA) [12], have been suggested in the i-vector framework. Recently, deep learning techniques [13] and convolutional neural network-based [14] approaches have been proposed in the speaker verification field.
Our main focus in this paper is to use the state-of-the-art i-vector framework for the speaker verification task. Our system implementation includes two main stages. The first is the development of a text-independent SVS based on the i-vector framework using the audio recordings of the core–core condition of the speakers in the wild (SITW) database [15]. The main reason for selecting the SITW database is the wide range of noise levels and recording conditions considered while forming it. Further, the use of the SITW database helps us make our system robust in real-world scenarios. In the second stage, performance evaluation, we have used the detection error tradeoff (DET) curve [16]. It is a plot of two error probabilities, the false acceptance rate (FAR) and the false rejection rate (FRR). The equal error rate (EER) is the point on the DET curve where FAR and FRR are equal. At this operating point, the system may sometimes reject the true speaker or accept a false user. Our aim is to keep the EER value as low as possible. The remainder of the paper is organized as follows. Section 13.2 presents the theoretical background of i-vector feature extraction and the i-vector scoring measure. Section 13.3 explains the steps involved in the implementation methodology. Section 13.4 provides details of the experimental setup used while developing our system. Section 13.5 provides the results obtained on our evaluation data. Finally, conclusions about our work are provided in Sect. 13.6.
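The EER defined above can be estimated from two lists of verification scores. The following is a minimal sketch, assuming that higher scores indicate a more likely target and that the EER is approximated at the candidate threshold where FAR and FRR are closest; the function name `equal_error_rate` is hypothetical, and evaluation toolkits typically interpolate between thresholds instead.

```python
import numpy as np

def equal_error_rate(target_scores, nontarget_scores):
    """Approximate EER from target (genuine) and non-target (impostor) scores.

    A trial is accepted when its score is >= the threshold, so
    FAR = share of non-target trials accepted and
    FRR = share of target trials rejected.
    """
    tgt = np.asarray(target_scores, dtype=float)
    non = np.asarray(nontarget_scores, dtype=float)
    best_gap, eer = None, None
    for t in np.sort(np.concatenate([tgt, non])):   # candidate thresholds
        far = np.mean(non >= t)
        frr = np.mean(tgt < t)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return float(eer)
```

Sweeping the same thresholds and plotting FRR against FAR gives the points of the DET curve, usually drawn on normal-deviate axes.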
13.2 Background
This section presents a brief theory of i-vectors and the cosine kernel-based similarity measure.
13.2.1 i-Vectors
Identity vectors, also called i-vectors, are built on the factor analysis concept. Both intra-speaker and inter-speaker variability compensation is done in a low-dimensional i-vector space instead of the high-dimensional supervector space. An i-vector is a fixed-length representation of a speech utterance, regardless of the utterance's duration. The speaker-dependent supervector M, derived from speaker-specific data, is modeled as

$$M = m + Ty \tag{13.1}$$

where m is the speaker-independent supervector obtained by concatenating the mean vectors of the UBM, T is the low-rank total variability matrix (TVM), which models both intra- and inter-speaker variabilities, and y is the i-vector, which follows a zero mean and
A. Nirmal and D. Jayaswal
Fig. 13.1 Speaker verification system based on i-vector framework
unit variance Gaussian distribution. Figure 13.1 outlines the stages required to implement an SVS based on the i-vector framework. The i-vectors are calculated for all the speech utterances of a speaker and averaged to form an enrollment i-vector. It acts as the reference for the speaker, against which the i-vector extracted from a test utterance is compared during the evaluation phase.
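The relation in Eq. (13.1) and the averaging of utterance-level i-vectors can be illustrated with a small NumPy sketch. The sizes, the random values, and the function names `supervector` and `enrollment_ivector` are assumptions made for illustration only. The sketch evaluates the model in the generative direction; a real extractor estimates y from the Baum–Welch statistics, which is not shown here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (assumptions, far smaller than a real system):
# C Gaussian components, F feature dimensions, R i-vector dimensions.
C, F, R = 4, 3, 2

m = rng.normal(size=C * F)        # UBM mean supervector (concatenated component means)
T = rng.normal(size=(C * F, R))   # total variability matrix (tall, low rank)

def supervector(y):
    """Evaluate Eq. (13.1): M = m + T y for a given i-vector y."""
    return m + T @ y

def enrollment_ivector(utterance_ivectors):
    """Average the i-vectors of one speaker's utterances into one reference vector."""
    return np.mean(np.asarray(utterance_ivectors), axis=0)

# Three hypothetical utterance-level i-vectors of the same speaker.
utt_ivectors = [rng.normal(size=R) for _ in range(3)]
y_enroll = enrollment_ivector(utt_ivectors)
M = supervector(y_enroll)
print(M.shape, y_enroll.shape)
```

With y set to the zero vector the supervector reduces to the UBM mean supervector m, which is why m is described as speaker-independent.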
13.2.2 Cosine Similarity Score
In this paper, we use the cosine distance measure to find the similarity between the enrollment i-vector and the test i-vector. The similarity score S is the cosine of the angle between the two i-vectors. For a target i-vector y_target and a test i-vector y_test, obtained from the corresponding target speaker and test speaker utterances, it is calculated as

$$S = \frac{y_{\text{target}}^{T}\, y_{\text{test}}}{\lVert y_{\text{target}} \rVert \, \lVert y_{\text{test}} \rVert} \tag{13.2}$$

The score S is compared with a threshold to decide whether to accept or reject the claimed identity. We chose the cosine kernel because it neglects unwanted information such as intra-speaker and inter-speaker variability. The i-vector magnitude contains this unwanted information, so discarding it helps make our system more robust in uncontrolled conditions. Other benefits of CSS are that it is less complex than other scoring methods such as SVM and JFA, and that there is no need to enroll the target speaker while scoring.
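Equation (13.2) and the threshold decision can be written as a short NumPy sketch. The function names and the example vectors are illustrative assumptions, not part of the system described here.

```python
import numpy as np

def cosine_score(y_target, y_test):
    """Cosine similarity of Eq. (13.2); the result lies in [-1, 1]."""
    y_target = np.asarray(y_target, dtype=float)
    y_test = np.asarray(y_test, dtype=float)
    return float(y_target @ y_test /
                 (np.linalg.norm(y_target) * np.linalg.norm(y_test)))

def accept(y_target, y_test, threshold):
    """Accept the claimed identity when the score reaches the threshold."""
    return cosine_score(y_target, y_test) >= threshold

a = np.array([1.0, 2.0, 3.0])
same_direction = cosine_score(a, 10.0 * a)   # rescaling does not change the score
opposite = cosine_score(a, -a)
```

Because both vectors are divided by their norms, rescaling either i-vector leaves the score unchanged, which is how the magnitude information is discarded.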
Table 13.1 Steps to implement i-vector framework
| Step No. | Description |
|---|---|
| 1 | Data from the core–core condition of the SITW database are divided into three parts, namely development data, enrollment data, and evaluation data |
| 2 | Feature vectors are extracted from all the audio files and stored offline. Later on, the UBM and TVM models are trained using these files |
| 3 | BW statistics are extracted using speaker-specific features and then used along with the TVM to create a unique voiceprint of the speaker, which is represented as an i-vector |
| 4 | Test trials are run on the development test set, and parameters are tuned to check that the trained models respond correctly |
| 5 | Enrollment i-vectors are extracted using step 3 and stored offline |
| 6 | Using the same i-vector extraction steps as for the enrollment utterances, i-vectors are extracted from the test utterances |
| 7 | After applying LDA and WCCN, the enrollment and test i-vectors are used to find the verification scores |
13.3 Proposed Methodology
Table 13.1 lists the steps used to implement the i-vector framework. The implementation process is divided into splitting the database, creating the UBM and TVM, enrolling speakers using i-vectors, and computing the decision score.
13.4 Experimental Setup
This section details the experimental settings used for database formation, feature extraction, and speaker modeling. We use audio files from the SITW database, which are first downsampled to 16 kHz. We use the audio files of the core–core condition, which comprises single-speaker enrollment and test files. The SITW database provides several list files that give all the details about these files.
13.4.1 Database Formation
We divided the audio files of the training data into a male audio set and a female audio set in order to create separate universal background models and total variability matrices for male and female speakers. Before developing our system, we divided the available data from the development folder and the core–core condition of the SITW database into three sets, as explained in Sect. 13.3.
13.4.1.1 Development Data
Development data are the data required to train the UBM and TVM. We used all 1958 audio files from the development folder of the SITW database. The male and female audio files were separated using the meta-list files provided with the database, and two separate UBM and TVM models were trained for male and female speakers. There were 1468 male and 490 female audio files.
13.4.1.2 Enrollment Data
For speaker enrollment, we used the single-speaker audio files listed in the “enroll-core.lst” file. If an audio file is longer than 18 s, we divide it into three equal segments. This ensures that each segment is at least 6 s long, in line with the rule of this database for enrolling a particular speaker.
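The segmentation rule above can be sketched as follows. This is a minimal illustration that assumes the audio is already loaded as a one-dimensional sample array; the function name `split_for_enrollment` and the constants are hypothetical, and only the 18 s threshold and the three-way split come from the text.

```python
import numpy as np

MIN_SPLIT_SECONDS = 18.0   # files longer than this are split (value from the text)
NUM_SEGMENTS = 3

def split_for_enrollment(samples, sample_rate):
    """Return one segment for short files, three near-equal segments otherwise."""
    samples = np.asarray(samples)
    duration = len(samples) / sample_rate
    if duration <= MIN_SPLIT_SECONDS:
        return [samples]
    # array_split tolerates lengths that are not divisible by three;
    # segment lengths then differ by at most one sample.
    return np.array_split(samples, NUM_SEGMENTS)

# Example: a 20 s file at 16 kHz yields three segments of roughly 6.67 s each.
segments = split_for_enrollment(np.zeros(20 * 16000), 16000)
```

Since a file is split only when it exceeds 18 s, every resulting segment is longer than 6 s.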
13.4.1.3 Evaluation Data
We used the core–core list files and the key files provided with the SITW database to obtain the target and impostor labels of the audio files used in our test trials.
13.4.2 Implementation
As explained in Sect. 13.3, the i-vectors in our system are derived from MFCC features. We used the LibROSA Python package [17] to read all audio files from the SITW database and extract the MFCC features. A 60-dimensional feature vector is obtained by concatenating 19 MFCCs and the log energy with their delta and delta–delta coefficients (20 static, 20 delta, and 20 delta–delta coefficients). After MFCC feature extraction, the remaining tasks are UBM and TVM training, i-vector extraction, and scoring of the test trials. These tasks are accomplished with the functions provided in the speaker recognition toolkit based on Bob [18].
During our experiments, we tuned parameters such as the number of Gaussian components of the UBM, the TVM size, and the LDA matrix size. We observed that a UBM with 1024 Gaussian components and a TVM of dimension 400 outperform other sizes, so the gender-dependent UBMs are trained using 1024 Gaussian components. The male audio set was used to create the male UBM and its TVM, and the female audio set was used to create the female UBM and its TVM. The result of this stage is therefore a universal background model and a total variability matrix for each gender. We used a total of 252 female and 571 male reference speaker models for deriving the i-vectors. The LDA matrix is of size 200. With the methodology explained in Sect. 13.3, we obtained a total of 171 i-vectors for female speakers and 513 i-vectors for male speakers. The final i-vectors are obtained after applying LDA and WCCN.
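The WCCN step can be illustrated with a small NumPy sketch. It follows the commonly used formulation in which the projection B satisfies B Bᵀ = W⁻¹, with W the average within-speaker covariance; this is an assumption about the formulation, not the code of the toolkit used in this work. The LDA step is omitted, and all data and function names below are synthetic and hypothetical.

```python
import numpy as np

def within_class_covariance(ivectors, speaker_ids):
    """Average over speakers of each speaker's i-vector covariance."""
    ivectors = np.asarray(ivectors, dtype=float)
    speaker_ids = np.asarray(speaker_ids)
    speakers = np.unique(speaker_ids)
    dim = ivectors.shape[1]
    W = np.zeros((dim, dim))
    for s in speakers:
        x = ivectors[speaker_ids == s]
        centered = x - x.mean(axis=0)
        W += centered.T @ centered / len(x)
    return W / len(speakers)

def wccn_projection(ivectors, speaker_ids):
    """Cholesky factor B of the inverse within-class covariance (B @ B.T = inv(W))."""
    W = within_class_covariance(ivectors, speaker_ids)
    return np.linalg.cholesky(np.linalg.inv(W))

# Synthetic data: 5 speakers, 20 four-dimensional vectors each.
rng = np.random.default_rng(0)
speaker_ids = np.repeat(np.arange(5), 20)
speaker_means = rng.normal(size=(5, 4))
noise = rng.normal(size=(100, 4)) * np.array([1.0, 2.0, 0.5, 3.0])
ivectors = speaker_means[speaker_ids] + noise

B = wccn_projection(ivectors, speaker_ids)
projected = ivectors @ B          # row-wise form of B.T @ w
W_after = within_class_covariance(projected, speaker_ids)
```

After projection the within-speaker covariance becomes the identity matrix, so directions with large within-speaker spread no longer dominate the cosine score.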
13.5 Results and Discussion
All the test trials are scored against the enrolled i-vectors: the match score between an enrolled and a test i-vector is computed using the cosine similarity measure, which gives a score ranging from −1 to 1. In every test trial, the hypothesized speaker and the speaker of the test utterance have the same gender. As explained earlier, during evaluation the test i-vector is compared with all the enrollment i-vectors. Scores of trials in which the claimed speaker is the target speaker are labeled as target scores, and the remaining scores are labeled as non-target scores. The error probabilities are computed from these verification scores, and the performance of our system is evaluated using the DET curve and the EER. For this purpose, we used the bob.measure package from the SPEAR toolkit. It provides functions for computing the EER, the false positive rate (FPR), and the false negative rate (FNR) from verification trials and for plotting a DET curve. In this experiment, results are obtained by pooling all the target and non-target scores together. A single, speaker-independent threshold is swept over these two sets of scores to compute FAR and FRR. For some models the resulting threshold is negative, because the scores generated for these models are themselves negative. FAR and FRR are calculated for a range of thresholds using the bob.measure package and then used to plot the DET curve shown in Fig. 13.2. The equal error rate obtained is 23.16%.
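The threshold sweep over pooled scores can be sketched in plain NumPy. This is an illustrative stand-in, not the bob.measure implementation: the function names and the toy scores are assumptions, and the EER is approximated at the candidate threshold where FAR and FRR are closest.

```python
import numpy as np

def far_frr(target_scores, nontarget_scores, threshold):
    """FAR: share of non-target scores accepted. FRR: share of target scores rejected."""
    far = np.mean(np.asarray(nontarget_scores, dtype=float) >= threshold)
    frr = np.mean(np.asarray(target_scores, dtype=float) < threshold)
    return far, frr

def equal_error_rate(target_scores, nontarget_scores):
    """Sweep every observed score as a threshold and return (EER estimate, threshold)."""
    thresholds = np.unique(np.concatenate([target_scores, nontarget_scores]))
    rates = [far_frr(target_scores, nontarget_scores, t) for t in thresholds]
    far, frr = np.array(rates).T
    best = np.argmin(np.abs(far - frr))
    return (far[best] + frr[best]) / 2.0, thresholds[best]

# Toy cosine scores in [-1, 1]; negative values are possible by construction.
targets = np.array([0.9, 0.6, 0.4, -0.1])
nontargets = np.array([0.5, 0.1, -0.2, -0.3])
eer, threshold = equal_error_rate(targets, nontargets)
```

For these toy scores FAR and FRR meet at 25% with a threshold of 0.4. When most scores of a model are negative, the same sweep settles on a negative threshold.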
13.6 Conclusions
We have evaluated the state-of-the-art i-vector framework on the SITW database to cope with real-life uncontrolled conditions in speech utterances, using MFCC features for i-vector extraction. The EER of 23.16% could be improved further by exploring other types of features and their combination with MFCC features, since the performance of our system has not yet been checked for other types of features. Therefore, in future work, we would like to investigate the performance of our system with other types of features and their combinations with MFCC features.
Fig. 13.2 DET curve obtained for core–core conditions of SITW database
References
1. Campbell, J.P.: Speaker recognition: a tutorial. Proc. IEEE 85(9), 1437–1462 (1997)
2. Reynolds, D.A., Rose, R.C.: Robust text-independent speaker identification using Gaussian mixture speaker models. IEEE Trans. Speech Audio Process. 3(1), 72–83 (1995)
3. Reynolds, D.A., Quatieri, T.F., Dunn, R.B.: Speaker verification using adapted Gaussian mixture models. Digital Signal Process. 10(1), 19–41 (2000)
4. Campbell, W.M., Sturim, D.E., Reynolds, D.A., Solomonoff, A.: Support vector machines using GMM supervectors for speaker verification. IEEE Signal Process. Lett. (2006)
5. Kenny, P., Boulianne, G., Ouellet, P., Dumouchel, P.: Joint factor analysis versus eigenchannels in speaker recognition. IEEE Trans. Audio Speech Lang. Process. 15(4), 1435–1447 (2007)
6. Dehak, N., Kenny, P., Dehak, R., Glembek, O., Dumouchel, P., Burget, L., Hubeika, V., Castaldo, F.: Support vector machines and joint factor analysis for speaker verification. In: Proceedings of ICASSP, pp. 1–4. IEEE (2009)
7. Dehak, N., Kenny, P., Dumouchel, P., Ouellet, P.: Front-end factor analysis for speaker verification. IEEE Trans. Audio Speech Lang. Process. 19(4), 788–798 (2011)
8. Haeb-Umbach, R., Ney, H.: Linear discriminant analysis for improved large vocabulary continuous speech recognition. In: Proceedings of ICASSP, vol. 1, pp. 13–14. IEEE (1992)
9. Hatch, A., Kajarekar, S.S., Stolcke, A.: Within-class covariance normalization for SVM-based speaker recognition. In: Proceedings of 9th International Conference on Spoken Language Processing 2006. Pittsburgh, PA (2006)
10. Campbell, W.M., Sturim, D.E., Reynolds, D.A., Solomonoff, A.: SVM based speaker verification using a GMM supervector kernel and NAP variability compensation. In: Proceedings of ICASSP, pp. 637–640. Philadelphia, USA (2005)
11. Dehak, N., Dehak, R., Kenny, P., Brummer, N., Ouellet, P., Dumouchel, P.: Support vector machines versus fast scoring in the low-dimensional total variability space for speaker verification. In: Proceedings of INTERSPEECH, pp. 1559–1562 (2009)
12. Prince, S.J.D., Elder, J.H.: Probabilistic linear discriminant analysis for inferences about identity. In: Proceedings of 11th International Conference on Computer Vision (2007), pp. 1–8. Rio de Janeiro, Brazil (2007)
13. Lei, Y., Ferrer, L., McLaren, M.: A novel scheme for speaker recognition using a phonetically-aware deep neural network. In: Proceedings of ICASSP, pp. 1695–1699. IEEE (2014)
14. Torfi, A., Dawson, J., Nasrabadi, N.M.: Text-independent speaker verification using 3d convolutional neural networks. In: Proceedings of 2018 International Conference on Multimedia and Expo (ICME 2018), pp. 1–6. West Virginia University, Morgantown (2018)
15. McLaren, M., Ferrer, L., Castan, D., Lawson, A.: The Speakers in the Wild (SITW) speaker recognition database. In: Proceedings of INTERSPEECH, pp. 812–822 (2016)
16. Martin, A., Doddington, G., Kamm, T., Ordowski, M., Przybocki, M.: The DET curve in assessment of detection task performance. In: Proceedings of Eurospeech, pp. 1895–1898 (1997)
17. McFee, B., Raffel, C., Liang, D., Ellis, D.P., McVicar, M., Battenberg, E., Nieto, O.: librosa: audio and music signal analysis in python. In: Proceedings of the 14th Python in Science Conference, pp. 18–25 (2015)
18. Khoury, E., Shafey, L.E., Marcel, S.: Spear: an open source toolbox for speaker recognition based on Bob. In: Proceedings of the ICASSP, pp. 1655–1659. IEEE (2014)
Chapter 14
Application of Deep Learning for COVID Twitter Sentimental Analysis Towards Mental Depression
Naseela Pervez, Aditya Agarwal, and Suresh Sankaranarayanan
Abstract The coronavirus pandemic hit the worldwide population to a large extent. One of the subtle effects of the COVID-19 pandemic was the deterioration of people's mental health. Social media has become an efficient platform to express oneself, and Twitter is one of the most used platforms. Previous work has employed machine and deep learning for tweet sentiment analysis in different applications, including mental depression. Most tweet sentiment analysis has focussed on positive and negative classes, and only some research has taken neutral tweets into consideration. In this research work, we focus on predicting depression from tweets posted during the lockdown period, classifying them as depressed, non-depressed, or neutral, by employing deep learning models such as bidirectional long short-term memory (Bi-LSTM), bidirectional encoder representations from transformers (BERT), and XLNET. The BERT model has been modified by adding a classification layer for tweet classification. In addition, exploratory data analysis was performed for post-lockdown tweets.
Keywords COVID · Bi-LSTM · BERT · XLNET · Tweets
N. Pervez, University of Southern California, Los Angeles, CA, USA
A. Agarwal, Northeastern University, Boston, MA, USA
S. Sankaranarayanan (B), SRM Institute of Science and Technology, Kattankulathur Campus, Chennai, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_14

14.1 Introduction
The coronavirus pandemic affected a considerable portion of the world's population and has been one of a kind in many ways. From the virus being new to humans to its spreading like wildfire, COVID-19 has been an unforgettable experience. The currencies of the most secure nations collapsed, lives were lost, and a significant proportion of people lost their employment. The secondary effects of COVID-19 were depleted economies, unemployment, disrupted food systems, etc. The primary effect, however, has been on people's mental health, and individuals use social media sites like Facebook, Instagram, Twitter, and Reddit to share their thoughts and perspectives.
There has been considerable use of deep learning and machine learning for tweet sentiment analysis in different applications, including mental depression [1–5]. However, data from Twitter are unlabelled, and the clustering strategies employed for labelling them are ineffective. In addition, most Twitter sentiment analysis is focussed on binary classification. Limited work has been reported in which neutral tweets were considered, and the deep learning models applied there, such as CNN and LSTM, go no further [6–10]. No work has used powerful models like BERT and XLNET to achieve higher accuracy. Using hashtags to categorize content is standard practice, but it is not a real solution: a single tweet, for example, may contain several hashtags. Lastly, not much work has been found on COVID tweet analysis for mental depression, except for one study [10] that focussed only on hate speech classification and employed a CNN model for tweet classification.
Given these drawbacks, we propose employing deep learning models to analyse COVID tweets for mental depression, which is the need of the hour. The deep learning models Bi-LSTM, BERT, and XLNET are deployed for multiclass classification of tweets as positive, negative, and neutral with respect to mental depression during COVID. The BERT model has been modified by adding a classification layer for tweet classification. The models were validated in terms of accuracy and error in order to propose the best deep learning model.
Using the best-performing model, BERT, we predicted the class of a small post-vaccination dataset as depressed, non-depressed, or neutral. This analysis of the lockdown and post-lockdown periods could ultimately help reduce the number of suicides that occur because people cannot cope with mental stress, loneliness, and hardship. The contributions of this paper are as follows:
• Labelling the tweets using an active learning approach and classifying them as depressed, not depressed, or neutral using a classifier.
• Deploying and validating the deep learning models Bi-LSTM, BERT, and XLNET on mental depression using tweets pertaining to COVID-19 during lockdown.
• Validating the model with the best accuracy on the post-COVID-19 lockdown period for classification of tweets.
The rest of the paper is organized as follows. Section 14.2 gives a detailed literature review of sentiment analysis using machine learning and deep learning. Section 14.3 describes COVID sentiment analysis using deep learning models. Section 14.4 presents the implementation results and analysis. Section 14.5 gives the concluding remarks and future work.
14.2 Literature Review
Twitter sentiment analysis has been carried out using machine learning (ML) and deep learning (DL). ML models have previously been employed in Twitter sentiment analysis, whereas the use of deep learning models for determining the sentiment of tweets is newer. This section first discusses work that has used lexicon-based and ML approaches, followed by work based on deep learning.
14.2.1 Sentimental Analysis Using Machine Learning
In [4], the authors tried to determine the mood of the public from the tweets that are made. The moods are categorized into six dimensions: tension, depression, anger, fatigue, confusion, and vigour. Twitter sentiment analysis in that paper is based on the 'profile of mood states' (POMS). However, generating a vector of these six sentiments using POMS scoring limits the sentiments that can be captured. In reality, there are tweets that are happy and positive. Although the vector will then have low values for the six labels, it does not give a proper insight into the sentiment represented in the tweet.
Depression campaigns have prompted many Twitter users to come forward and express their mental state. Whereas we focus on tweets related to the COVID-19 pandemic, one similarly situation-specific work is discussed in [5]. Its tweets were taken from the 'Bell Let's Talk' campaign, a social awareness campaign for mental illness. For classification, bag-of-words features were used with a support vector classifier (SVC). The work also addressed the problem of imbalanced datasets by applying the SMOTE oversampling technique. With powerful deep learning models, the data could instead be normalized effectively with the help of the loss function.
Machine learning models sometimes outperform deep learning models in Twitter sentiment analysis, which can be attributed to the data not being cleaned properly. In [6], the author used Twitter data to obtain customer reviews of flight experiences, and the complaints about the various airline companies were also noted. A strong feature extraction technique, TF-IDF, was used. The work used a machine learning model called a voting classifier, which combines logistic regression and a stochastic gradient descent classifier. The classifier outperformed the deep learning model, LSTM, and recorded a final accuracy of 0.791. This accuracy can be attributed to a smaller dataset.
14.2.2 Deep Learning for Sentimental Analysis
In [7], an attempt was made to identify racist tweets using deep learning models such as CNN and GRU, with GloVe as the text embedding technique. The work is aimed at predicting hate speech and treats the problem as binary classification. However, Twitter data may contain tweets that are neutral with respect to hate speech, and the work ignored this considerably large set of neutral tweets.
In [8], researchers used hashtags to label tweets: a tweet with the hashtag 'happy' is classified as a happy tweet, and a tweet with the hashtag 'sad' is labelled as a sad tweet. However, in many cases a tweet has more than one hashtag. A single tweet can carry both 'happy' and 'sad', so the authenticity of the label is questionable. The work also mentions techniques like support vector machine (SVM) and naïve Bayes for building the classifiers. Although machine learning has been used widely in many sectors, deep learning is better suited to textual data, as it has been used for many text problems such as sentence prediction and handwriting recognition.
One paper that focussed on using Twitter data for depression is [9]. It used word-based RNN and GRU models, which achieved accuracies of 97% and 98%, respectively. As mentioned in that work, the high accuracy was due to a small dataset of 13,385 rows. A smaller dataset contains fewer examples, so a model may perform well on it but not on other data. The neutral tweets were also ignored in this case.
Given these drawbacks, we propose to perform tweet sentiment analysis on COVID tweets using the deep learning models Bi-LSTM, BERT, and XLNET. The models have been validated in terms of accuracy and error. Using the best-performing model, BERT, we predicted the class of a small post-vaccination dataset as depressed, non-depressed, or neutral. Following this, a comparative analysis of the two datasets, lockdown and post-lockdown, was carried out to gain insight. These results are discussed in detail in Sect. 14.4.
14.3 COVID Tweet Sentimental Analysis Using Deep Learning
In this work, we aim to detect the deteriorating mental health of people during the pandemic and to give an insight into how mental health improved after vaccinations began and the lockdown was eased. To understand how mental health was affected during the COVID-19 pandemic, we concentrated on Twitter data. We focussed on two datasets: the tweets made during the pandemic, and the tweets made post-vaccination, when the lockdown was eased. The first dataset covers the months of March to December 2020. The second is a very small dataset collected for 15 March 2021.
The project has three aims. First, identify the depressed, non-depressed, and neutral tweets of the users. Most classification models for Twitter sentiment analysis ignore the neutral tweets, that is, tweets completely off topic. Second, identify the section of the population that has raised awareness about mental health using the Twitter platform. Finally, develop a perspective on the tweets made from March through December (when the pandemic was at its peak) and on how mental health has been since vaccinations started and the lockdowns were eased.
The first step in tweet analysis using a deep learning model is data pre-processing and labelling. To label the pre-processed dataset, active learning has been employed, and the labelled portion is then used to label the rest of the dataset. However, the 'text' column containing the tweets cannot be used directly by a classifier, which needs vector or numerical data to work with. We therefore use TF-IDF (term frequency–inverse document frequency), a statistical measure of the relevance of a term in a document, to convert text into vector form. TF-IDF has an edge over other techniques because it takes both term frequency and document frequency into consideration. The classifier uses this TF-IDF representation to link a tweet to one of the categories depressed, non-depressed, or neutral. For classification, a support vector classifier (SVC) was used as part of the active learning algorithm, with a linear kernel selected to separate the depressed, non-depressed, and neutral data points.
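The conversion of tweets into TF-IDF vectors can be illustrated with a minimal pure-Python sketch. It uses the textbook weighting tf × log(N/df) with whitespace tokenization, both of which are assumptions; library implementations typically apply smoothing and normalization, and the classifier stage is not shown. The example tweets and the function name `tfidf_vectors` are invented for illustration.

```python
import math
from collections import Counter

def tfidf_vectors(documents):
    """Return (vocabulary, vectors) using tf * log(N / df) weights."""
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({tok for doc in tokenized for tok in doc})
    n_docs = len(tokenized)
    # Document frequency: number of documents in which each term occurs.
    doc_freq = Counter(tok for doc in tokenized for tok in set(doc))
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        length = max(len(doc), 1)   # guard against empty documents
        vectors.append([
            (counts[tok] / length) * math.log(n_docs / doc_freq[tok])
            for tok in vocabulary
        ])
    return vocabulary, vectors

tweets = ["feeling so alone today", "great walk today", "alone and tired"]
vocab, X = tfidf_vectors(tweets)
```

A term that occurs in every document receives weight zero, while a term that is frequent in one tweet but rare across the collection receives a high weight. Each row of `X` can then be passed to a classifier.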
After the tweets have been labelled, deep learning models such as Bi-LSTM, BERT, and XLNET are employed to classify them as depressed, non-depressed, or neutral. The models employed are described briefly in the next section.
14.4 Implementation Results and Analysis
For this research work, a dataset of 105,000 rows [11] covering March 2020 through December 2020 was collected. The Twitter data for March 2021 was collected using the tweepy library. We collected only a few hours of tweets from 15 March 2021 for the final comparison, and this dataset was used to apply the final model and predict the labels. The dataset was pre-processed by applying TF-IDF vectorization and a support vector classifier to label the tweets as depressed, not depressed, or neutral, represented numerically as 0, 1, and so on. To understand the dataset better, we created a word cloud visualization, in which words with higher frequencies in the text appear larger and words with lower frequencies appear smaller.
In building the Bi-LSTM model, we employed a powerful word embedding technique known as GloVe, which creates a vector representation of the text using both the global and the local statistics of a corpus. Our research work is a multiclass classification problem, for which we used the sparse categorical cross-entropy loss function. It works like the categorical cross-entropy loss function. The only difference is that categorical cross-entropy expects one-hot encoded output labels, whereas sparse categorical cross-entropy takes the labels as integer class indices. To overcome overfitting in the Bi-LSTM, we used a dropout of 0.2. Dropout is a regularization technique in which some unit outputs are randomly ignored (dropped) during training, which regularizes the model and helps prevent overfitting. Optimizers adjust the model weights, under hyperparameters such as the learning rate and weight decay, to reduce the loss and thereby increase the accuracy of the model. The Adam optimizer was used for the Bi-LSTM model.
In the BERT model, the fast BERT tokenizer (BertTokenizerFast) was employed to tokenize the input tweet texts. To classify the tweets as depressed, non-depressed, and neutral, we added a classification layer on top of the BERT model. The AdamW optimizer was used for the model. AdamW decouples the weight decay from the learning rate, so the two can be tuned separately.
XLNet is used here for text classification rather than word prediction, and the Hugging Face Transformers library provides a pre-trained model for this purpose. We have used Bert for sequence classification to predict the class of our tweets. To generate input ids and attention masks, the fast XLNet tokenizer was used to encode all the tweets. The AdamW optimizer with a learning rate of 5e−5 was used for training the model. The learning rate sets the size of each weight update. The weight decay was set to 0.01, which also helps regularize the model during training.
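The relation between the two cross-entropy variants can be checked numerically with a short NumPy sketch. The probabilities and the mapping of the three classes to the indices 0, 1, and 2 are illustrative assumptions, and the functions are written from the standard definitions rather than taken from any deep learning framework.

```python
import numpy as np

def categorical_cross_entropy(one_hot_labels, probabilities):
    """Mean cross-entropy when labels are one-hot encoded rows."""
    return float(-np.mean(np.sum(one_hot_labels * np.log(probabilities), axis=1)))

def sparse_categorical_cross_entropy(class_indices, probabilities):
    """Mean cross-entropy when labels are integer class indices."""
    rows = np.arange(len(class_indices))
    return float(-np.mean(np.log(probabilities[rows, class_indices])))

# Predicted class probabilities for three tweets over three classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.2, 0.3, 0.5]])
labels = np.array([0, 1, 2])        # integer labels (sparse form)
one_hot = np.eye(3)[labels]         # the same labels, one-hot encoded

loss_sparse = sparse_categorical_cross_entropy(labels, probs)
loss_dense = categorical_cross_entropy(one_hot, probs)
```

Both functions return the same value; the sparse form only avoids building the one-hot matrix, which is convenient when labels are stored as integers.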
14.4.1 Comparative Analysis of Deep Learning Models
The comparative analysis of the three deep learning models in terms of accuracy, precision, recall, F1 score, and losses is tabulated (Table 14.1).
Table 14.1 Analysis of deep learning models
| Models | Accuracy (%) | Precision | Recall | F1 Score |
|---|---|---|---|---|
| Bi-LSTM | 96.716 | 0.966 | 0.963 | 0.965 |
| BERT | 96.716 | 0.966 | 0.963 | 0.965 |
| XLNET | 95.802 | 0.956 | 0.955 | 0.956 |
The results indicate that BERT outperforms Bi-LSTM and XLNET in terms of accuracy, with BERT achieving an accuracy of about 96.7% (Table 14.1). The BERT model also achieved a precision of 0.966 and a recall of 0.963, which shows that it classified most tweets correctly, with very few false positives and false negatives. Its F1 score is 0.965 (approximately 0.97), close to the maximum value of 1.0 that a perfect classifier would achieve. In addition, the losses of all three models decrease as the number of epochs increases, which suggests that there is no overfitting.
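The metrics in Table 14.1 can be derived from a multiclass confusion matrix, as the following NumPy sketch illustrates. The confusion matrix is invented for illustration, and macro-averaging over the three classes is an assumption, since the averaging scheme is not stated above.

```python
import numpy as np

def classification_metrics(confusion):
    """Accuracy and macro-averaged precision, recall, and F1.

    Rows of the confusion matrix are true classes, columns are predicted classes.
    Assumes every class is predicted at least once (no zero column sums).
    """
    confusion = np.asarray(confusion, dtype=float)
    true_pos = np.diag(confusion)
    precision = true_pos / confusion.sum(axis=0)
    recall = true_pos / confusion.sum(axis=1)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = true_pos.sum() / confusion.sum()
    return accuracy, precision.mean(), recall.mean(), f1.mean()

# Hypothetical counts for the classes depressed, non-depressed, neutral.
confusion = np.array([[9, 1, 0],
                      [2, 7, 1],
                      [0, 2, 8]])
accuracy, precision, recall, f1 = classification_metrics(confusion)
```

False positives lower precision and false negatives lower recall, so a high F1 score requires both error types to be rare.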
14.4.2 Post Lockdown Tweet Analysis
Using the best-performing model, BERT, we predicted the class of a small post-vaccination dataset as depressed, non-depressed, or neutral and drew a comparative analysis of mental health during lockdown versus post-vaccination/post-lockdown. This gives a faint perspective of improved mental health, but to actually draw an analysis we need to know the distribution of the data over the classes. Figs. 14.1 and 14.2 clearly show a drop of more than 40% in depressed tweets. In fact, the tweets made after vaccination fall predominantly in the non-depressed category.
Fig. 14.1 Distribution of tweets during lockdown
Fig. 14.2 Distribution of tweets post lockdown
14.5 Conclusion and Future Work
Not much work has been found on COVID tweet analysis for mental depression. We therefore focussed on applying deep learning models to a COVID tweet dataset to predict tweets as depressed, non-depressed, or neutral. The models implemented were Bi-LSTM, BERT, and XLNET. BERT outperformed the other two models, Bi-LSTM and XLNET, with an accuracy of approximately 96.7%. The research work has considerable scope for future enhancement. If the models were trained on languages like Italian, French, Spanish, and German, we could also assess mental health in the worst-hit countries like Italy, France, and Spain. In future, once mass vaccination has been carried out, we can better understand how vaccination has affected people's mental health.
Chapter 15
A Blockchain Solution for Secure Health Record Access with Enhanced Encryption Levels and Improvised Consensus Verification
Monga Suhasini and Dilbag Singh
Abstract Blockchain can be described as a permanent ledger that logs information in a decentralized way. This technology has been proposed to widen the horizons of information-driven domains, including medical information. Electronic medical records capture the occurrence, progression, and treatment of illnesses, and therefore have high clinical value. As healthcare information must be kept private and secure, information security and privacy preservation are the most crucial issues to be handled in healthcare. This paper analyzes and identifies the most pertinent security and privacy issues in existing healthcare systems. Blockchain can act as a robust solution to these issues because it combines cryptography and consensus verification with a decentralized architecture and immutable blocks of data. Various consensus and encryption algorithms of blockchain technology have been studied in depth, and an overview is presented in the paper. A novel blockchain-based patient health record (BB-PHR) system is proposed with improvised consensus mechanisms and enhanced encryption levels for medical data access control and protection. The system enforces consensus verification by both the patient and the hospital administration to maintain privacy and tamper-proof data access. The proposed system integrates encryption algorithms, access control mechanisms, and smart contracts on a blockchain framework to address the security and privacy vulnerabilities of a health record. With such a system, strict access and security control can be enforced on healthcare information.
Keywords Blockchain · Patient health records · Consensus · Encryption · Privacy · Security
M. Suhasini (B) · D. Singh Department of Computer Science and Engineering, Chaudhary Devi Lal University, Sirsa (HR), India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_15
15.1 Introduction
With the advancement of technology, electronic medical records (EMRs) have become an essential tool for the healthcare sector. EMRs provide the most valuable information for comprehensive medical analysis and research. Because clinical information is vital for such analysis, it is crucial to secure and safeguard these health records [1]. Consequently, efficiently integrating separate medical databases for complete healthcare while maintaining security and protecting data has become a difficult problem in the healthcare sector [2]. Blockchain could enable a new kind of decentralized application without intermediaries and serve as the foundation for key components of web security architectures. As a cryptography-based distributed ledger, blockchain enables trust and security in transactions among the participants in a network [3]. Blockchain is made possible by the combination of several core technologies, such as cryptographic hashing, digital signatures, and consensus mechanisms [4]. This paper discusses the consensus mechanisms and encryption algorithms embedded in blockchain technology to give a clear picture of its technical aspects. Before designing a system for preserving privacy and safeguarding critical health data, it is imperative to understand the security and privacy issues and challenges in the healthcare sector, so an analytical study was conducted to identify these challenges. Finally, a system integrating blockchain with improvised consensus and enhanced encryption is proposed for privacy-preserving and secure access to patient health records.
15.1.1 Contributions
The major contributions of this paper are as follows:
• A comprehensive review has been conducted, and the major security and privacy challenges in existing healthcare systems have been identified.
• An in-depth study of encryption and consensus in blockchain technology has been carried out, and an overview of their role is presented.
• Finally, to resolve the identified challenges, a blockchain-based patient health record (BB-PHR) system with enhanced encryption levels and improvised consensus verification is proposed for privacy-preserving and secure access to patient health records.
15.2 Literature Review
Frikha et al. [10] proposed a methodology that combines hardware and software and includes the popular proof-of-work (PoW) algorithm for consensus. The authors designed an off-chain/on-chain architecture to execute the PoW consensus computation on an FPGA, and surveyed the consensus algorithms used by embedded and hybrid HW/SW architectures. Lashkari et al. [9] observed that, since their appearance, distributed ledger technologies have opened diverse opportunities in a wide range of application areas. Their article gives an exhaustive survey of the fundamentals of decentralized technology and a complete analysis of consensus algorithms in blockchain application areas using a new architectural categorization. Hathaliya et al. [5] discussed the transition from Healthcare 1.0 to Healthcare 4.0. Their paper presents a comprehensive study of state-of-the-art proposals for maintaining security and privacy in healthcare, uses several taxonomies to examine security and privacy issues, and discusses the benefits and limitations of different security and privacy techniques. Zhang et al. [8] noted that Nakamoto first introduced blockchain as the technology underlying Bitcoin, which drew worldwide attention to cryptocurrency and distributed ledger technology. Their paper analyzes the major and most popular consensus algorithms applied in different blockchain platforms, along with their strengths and weaknesses. The authors divided the algorithms into two main categories, probabilistic-finality (PF) and absolute-finality (AF) consensus mechanisms, according to whether finality is probabilistic or absolute. The present paper examined the literature on the role of consensus, encryption, and hashing in blockchain, as well as articles on current healthcare systems and their challenges.
After this extensive review, major security and privacy challenges were identified; they are described in the next section. A blockchain solution is proposed to overcome these issues in healthcare.
15.3 Security and Privacy Challenges in Healthcare Systems
Security and privacy are now primary concerns of the healthcare industry because of the large volume of medical information stored, retrieved, and communicated over the Internet. More network breaches can be expected where the communication channel is poorly secured. Researchers have used cryptographic algorithms to handle these attacks [5]. In current systems, a central server acts as the data repository, so a server failure can cause medical data to be lost [6]. To mitigate this, a backup of the data is taken and stored on a centralized cloud server. A further issue is that such systems are not shielded from various malicious attacks, for example, confidentiality-based, integrity-based, and availability-based attacks [7]. Table 15.1 lists various security and privacy challenges that the healthcare industry has faced over time.
Table 15.1 Security and privacy challenges
Security and privacy challenge | Description
Ransomware | Malware that threatens to publish data, or blocks access to data or a computer system, until the victim pays a ransom to the attacker
Medical identity theft | Unlawful, unauthorized access by a person to medical data in order to obtain services and care
Insider threats: oblivious, malicious, negligent, professional | A danger to an organization's security or data that comes from the inside. Persons who are part of the organization sometimes use information outside it, intentionally or accidentally, causing loss or harm to the organization
Unsecured/misconfigured databases | Databases with old and less secure encryption levels can lead to a breach of data security. A misconfigured database also allows access to anyone, leading to privacy issues
Third-party vendor compromise | A third-party vendor is an entity contracted by the healthcare organization to provide items or services, such as EHR systems and IT security systems. Healthcare organizations are experiencing data breaches through third-party vendors as well
Email compromise/fraud | Scammers use a spoofed email or a compromised account to trick people into initiating a money transfer to an alternate (fraudulent) account in exchange for free healthcare services
15.4 Overview of Encryption and Consensus in Blockchain
Blockchain combines cryptography, decentralized architecture, distributed networking, and other well-known technologies. It also provides a secure framework in which no one can tamper with the content of transactions and all nodes participate in transactions anonymously [8].
15.4.1 Consensus
A vital part of a blockchain is the way information is accepted onto the distributed ledger by a consensus algorithm that validates the data entries. Consensus plays a major role in maintaining agreement among nodes, because as the network expands it becomes very difficult to maintain agreement and communication between its nodes [9]. Various types of consensus mechanisms have been proposed that differ in their underlying principles. The architecture proposed in this paper uses the Ethereum consensus algorithm based on KECCAK256, which is an inherent part of the Ethash PoW algorithm. Keccak is a family of hash algorithms that serve as building blocks of PoW consensus algorithms in blockchains [10]; KECCAK256 computes the Keccak-256 hash of its input.
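The role of a hash function as a building block of proof of work can be illustrated with a toy sketch. It is not Ethash and not the scheme used in the proposed system: it is a simple hashcash-style search for a nonce, and it uses `hashlib.sha3_256` from the Python standard library as a stand-in. The standardized SHA3-256 uses different padding from the original Keccak-256 used by Ethereum, so its digests do not match those of KECCAK256. The function names and the `difficulty` parameter are assumptions made for illustration.

```python
import hashlib


def _digest(block_data: bytes, nonce: int) -> str:
    # SHA3-256 is a stand-in here; Ethereum's Keccak-256 pads differently.
    return hashlib.sha3_256(block_data + str(nonce).encode()).hexdigest()


def toy_proof_of_work(block_data: bytes, difficulty: int = 2):
    """Search for a nonce whose digest starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = _digest(block_data, nonce)
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


def verify_proof(block_data: bytes, nonce: int, difficulty: int = 2) -> bool:
    """Verification needs a single hash, whereas the search needs many."""
    return _digest(block_data, nonce).startswith("0" * difficulty)


nonce, digest = toy_proof_of_work(b"hypothetical health record block")
print(nonce, digest)
print(verify_proof(b"hypothetical health record block", nonce))
```

The asymmetry between the costly search and the cheap verification is what lets the other nodes check a proposed block quickly.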
15.4.2 Encryption and Hashing
Encryption and hashing algorithms are used to enforce security and privacy in blockchain technology. Cryptography is an inherent component of blockchain: each piece of business data in the blockchain can be mapped by a hash function to a hash value, a seemingly random string of numbers and letters, thereby concealing the specific data [11]. SHA256 is a member of the SHA-2 family of algorithms and is the most widely used hashing algorithm in blockchain; it generates a 256-bit message digest [12]. SHA256 is considered very secure, and tampering is almost impossible to hide because of its complex hash calculation: there must be no practical way to reverse the output to compute the input [13, 14]. In this paper, SHA256 has been implemented for hash generation.
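A short sketch using Python's standard `hashlib` module shows the two properties mentioned above: the digest is always 256 bits long, and a small change in the input produces a different digest. The record strings are hypothetical examples, and `sha256_hex` is a helper name introduced for illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical record contents that differ in a single character.
original = sha256_hex(b"patient: P-001; issue: fever")
tampered = sha256_hex(b"patient: P-001; issue: fewer")

print(len(original) * 4)      # digest length in bits (64 hex digits x 4)
print(original)
print(tampered)
print(original != tampered)   # any modification changes the digest
```

Storing the digest alongside a record therefore allows later tampering to be detected by recomputing the hash and comparing it with the stored value.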
15.5 Proposed Solution: Blockchain-Based Patient Health Record (BB-PHR) System
Current healthcare systems are not secure because data is stored on a central server, which can be compromised, raising security and privacy issues. Whenever data is accessed and critical data is shared between parties, the possibility of data leakage cannot be ruled out. Blockchain is an effective solution because it provides a decentralized, immutable, trusted, and secure structure. This paper proposes a blockchain-based patient health record (BB-PHR) system that resolves the security and privacy issues identified above and ensures privacy-preserving data access and sharing with enhanced encryption levels.
15.5.1 System Design
The proposed system is designed for the unique identification of participating entities and secure data access among them. The BB-PHR system focuses on three major entities: patient, doctor, and administrator. When a patient or doctor registers, a new block address and a global unique identifier are assigned, and the block holds all the information of the new entity. The proposed system is a smart-contract-based solution on the Ethereum blockchain with improvised consensus verification and enhanced encryption levels for new block creation and data access. Figure 15.1 gives an overview of the proposed BB-PHR system, including the major participating entities and the processing flow among them. The roles of the participating entities are as follows:
• Patient (P): A person who wants a consultation registers by entering personal data. After verification, approval, and the grant of the “Patient” role, the person can add medical issues. The patient is assigned a doctor, and a date of appointment is fixed. The patient can view/access health records thereafter, and can remove the doctor's access whenever required.
• Doctor (D): A person who registers as a medical professional, including personal details, is another entity in the system. After verification and approval, the “Doctor” role is granted. A doctor can view/access only the health record of the particular patient to whom he/she is assigned for consultation. After the consultation, the doctor can no longer access the records, which enforces privacy-controlled data access.
• Admin (A): The hospital administrator acts as a bridge for communication and control between the patient and doctor entities. The admin verifies, approves, and grants roles to the other two entities in the system.
Fig. 15.1 Overview of proposed BB-PHR system design
15.5.2 Mapping Among Entities
Encrypted double hash mapping is implemented between the Patient (P) and Doctor (D) entities using the block hash address (BHA):
P(BHA) ⇔ D(BHA)
This mapping is controlled by the Admin (A) for secure data access. To maintain the privacy and security of the records, the Admin (A) or the Patient (P) can remove the doctor's access once the consultation is done.
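One possible reading of the P(BHA) ⇔ D(BHA) mapping is sketched below in Python. The text does not specify how the double hash mapping is built, so everything here is an assumption: the class `DoubleHashMapping`, the use of SHA-256 to hash both addresses, and the placeholder address strings are hypothetical and only illustrate a two-way mapping that can be created and removed.

```python
import hashlib


def hashed(address: str) -> str:
    """Hash a block hash address so that the raw address is not stored."""
    return hashlib.sha256(address.encode()).hexdigest()


class DoubleHashMapping:
    """Hypothetical two-way mapping between hashed patient and doctor addresses."""

    def __init__(self):
        self._doctor_of = {}     # hashed patient BHA -> hashed doctor BHA
        self._patients_of = {}   # hashed doctor BHA -> set of hashed patient BHAs

    def assign(self, patient_bha: str, doctor_bha: str) -> None:
        self.remove(patient_bha)  # drop any earlier assignment first
        p, d = hashed(patient_bha), hashed(doctor_bha)
        self._doctor_of[p] = d
        self._patients_of.setdefault(d, set()).add(p)

    def remove(self, patient_bha: str) -> None:
        p = hashed(patient_bha)
        d = self._doctor_of.pop(p, None)
        if d is not None:
            self._patients_of[d].discard(p)

    def is_mapped(self, patient_bha: str, doctor_bha: str) -> bool:
        return self._doctor_of.get(hashed(patient_bha)) == hashed(doctor_bha)


mapping = DoubleHashMapping()
mapping.assign("0xPATIENT-BHA", "0xDOCTOR-BHA")       # placeholder addresses
print(mapping.is_mapped("0xPATIENT-BHA", "0xDOCTOR-BHA"))
mapping.remove("0xPATIENT-BHA")                       # consultation is done
print(mapping.is_mapped("0xPATIENT-BHA", "0xDOCTOR-BHA"))
```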
15.6 Improvised Encryption and Consensus Mechanism
An enhanced level of encryption is achieved in the system for the different participating entities, and an improvised consensus verification scheme is implemented for approval and access. The process begins with deployment of the smart contract by the admin. Once a new entity registers and inputs its data, an immutable new block is created and appended to the blockchain with the enhanced encryption levels discussed below.
15.6.1 Enhanced Encryption Level
Level 1 assigns a global unique identifier to each block, ensuring there is a single trusted copy of each record. Level 2 uses the SHA256 hashing algorithm to generate a unique hash for each new block. Each immutable new block therefore has a unique Ethereum block hash address, a global unique identifier (GUID), and a SHA256 hash key. Figure 15.2 shows the sequence of immutable block creation using the enhanced encryption level (EEL).
Fig. 15.2 Enhanced encryption level for new block
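The two levels can be sketched in Python under stated assumptions. In the proposed system the block hash address comes from Ethereum; here it is replaced by a SHA-256 stand-in computed from the previous hash, and the field names, the `create_block` and `is_intact` helpers, and the registration data are hypothetical.

```python
import hashlib
import json
import uuid


def create_block(entity_data: dict, previous_hash: str) -> dict:
    """Build a block carrying a GUID (Level 1) and a SHA-256 hash key (Level 2)."""
    guid = str(uuid.uuid4())                              # Level 1: global unique identifier
    payload = json.dumps(entity_data, sort_keys=True)     # canonical form of the data
    hash_key = hashlib.sha256((guid + payload).encode()).hexdigest()   # Level 2
    # Stand-in for the Ethereum block hash address (assumption of this sketch).
    block_address = hashlib.sha256((previous_hash + hash_key).encode()).hexdigest()
    return {"guid": guid, "data": entity_data, "sha256": hash_key,
            "address": block_address, "previous": previous_hash}


def is_intact(block: dict) -> bool:
    """Recompute the hash key and compare it with the stored one."""
    payload = json.dumps(block["data"], sort_keys=True)
    expected = hashlib.sha256((block["guid"] + payload).encode()).hexdigest()
    return expected == block["sha256"]


genesis_hash = "0" * 64
block = create_block({"name": "Hypothetical Patient", "role": "Patient"}, genesis_hash)
print(is_intact(block))            # the untouched block verifies
block["data"]["name"] = "Someone Else"
print(is_intact(block))            # the modification is detected
```

Including the GUID in the hashed payload ties the hash key to one specific record, so a copy of the same data under a different identifier would not verify against it.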
15.6.2 Improvised Consensus Verification
For improvised consensus verification, the proposed system includes the admin as a verifying and approving entity in addition to the other verifying nodes in the blockchain network. The flowchart of the process is shown in Fig. 15.3. Every new block has to be verified and approved by the admin; until then, neither patient nor doctor can add, view, or access any data in the blockchain. Once a block is verified and approved, the admin grants the role “Patient” or “Doctor” using the KECCAK256 algorithm as an additional consensus verification scheme. Once the role is granted, a patient can add medical issues, and a doctor is eligible to be assigned as a medical professional in the system. The admin assigns a doctor to a particular patient through encrypted double hash mapping of the block hash addresses (BHA) of the patient and doctor entities, which permits the doctor to securely access that particular patient's record only. If the mapping is not successful, or the doctor tries to access records before the mapping is done, consensus fails with a message that the entity is not permitted to access records; otherwise, consensus succeeds and access permission is granted. When access to the patient record is no longer required, the admin or the patient removes the doctor's access, ensuring integrity and privacy control.
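The approval and access flow described above can be sketched as a small in-memory Python model. This is only an illustration under simplifying assumptions: it does not model the blockchain network, the verifying nodes, or the KECCAK256-based role grant, and the class `BBPHRSketch`, its methods, and the entity identifiers are hypothetical. An actual implementation would be a smart contract on Ethereum.

```python
class AccessDenied(Exception):
    """Raised when the simplified consensus check fails."""


class BBPHRSketch:
    def __init__(self):
        self.roles = {}      # entity id -> role granted by the admin
        self.assigned = {}   # patient id -> doctor id (set by the admin)
        self.records = {}    # patient id -> list of medical issues

    def grant_role(self, entity_id, role):
        """Admin step: approve an entity and grant it a role."""
        if role not in ("Patient", "Doctor"):
            raise ValueError("unknown role")
        self.roles[entity_id] = role

    def add_issue(self, patient_id, issue):
        if self.roles.get(patient_id) != "Patient":
            raise AccessDenied("entity not approved as Patient")
        self.records.setdefault(patient_id, []).append(issue)

    def assign_doctor(self, patient_id, doctor_id):
        """Admin step: map a doctor to one particular patient."""
        if (self.roles.get(patient_id) != "Patient"
                or self.roles.get(doctor_id) != "Doctor"):
            raise AccessDenied("both entities must be approved first")
        self.assigned[patient_id] = doctor_id

    def remove_access(self, patient_id):
        """Admin or patient step: revoke the doctor's access."""
        self.assigned.pop(patient_id, None)

    def view_records(self, requester_id, patient_id):
        is_owner = (requester_id == patient_id
                    and self.roles.get(patient_id) == "Patient")
        is_assigned = (patient_id in self.assigned
                       and self.assigned[patient_id] == requester_id)
        if not (is_owner or is_assigned):
            raise AccessDenied("entity not permitted to access records")
        return list(self.records.get(patient_id, []))


system = BBPHRSketch()
system.grant_role("p1", "Patient")
system.grant_role("d1", "Doctor")
system.add_issue("p1", "persistent cough")
system.assign_doctor("p1", "d1")
print(system.view_records("d1", "p1"))     # mapped doctor may read the record
system.remove_access("p1")
try:
    system.view_records("d1", "p1")
except AccessDenied as err:
    print("consensus failed:", err)
```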
15.6.3 Algorithm
Figure 15.4 illustrates the improvised consensus and encrypted BB-PHR access algorithm.
Fig. 15.3 Improvised consensus verification process
Fig. 15.4 Improvised consensus and encrypted BB-PHR access algorithm
15.7 Conclusion
A comprehensive review was conducted to understand the vulnerabilities and challenges that current healthcare systems face. Owing to its cryptographic features and consensus algorithms, blockchain is considered a disruptive technology that can resolve the security and privacy challenges of healthcare data access and sharing. This paper gives an overview of the consensus, encryption, and hashing algorithms used in blockchain. Finally, we proposed a blockchain-based patient health record system with improvised consensus verification and enhanced encryption levels for privacy-preserving and secure data access in healthcare. The proposed system integrates an administrator- and patient-centric approach to authorized data access. Unique identification, SHA256 hashing, Keccak256 hashing, and encrypted double hash mapping make critical health records tamper-proof, as any violation of a block or record would be detected by the different validation schemes used in the system.
References
1. Liu, X., Wang, Z., Jin, C., Li, F., Li, G.: A blockchain-based medical data sharing and protection scheme. IEEE Access (2019). https://doi.org/10.1109/ACCESS.2019.2937685
2. Jin, H., Xu, C., Luo, Y., Li, P.: Blockchain-based secure and privacy-preserving clinical data sharing and integration. In: Qiu, M. (ed.) International Conference on Algorithms and Architectures for Parallel Processing, ICA3PP 2020. LNCS, vol. 12454, pp. 93–109. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-60248-2_7
3. Taylor, P.J., Dargahi, T., Dehghantanha, A., Parizi, R.M., Choo, K.K.R.: A systematic literature review of blockchain cyber security. Digital Commun. Netw. 6(2), 147–156 (2019). E-ISSN: 2352-8648
4. Suhasini, M., Singh, D.: Designing a transformational model for decentralization of electronic health record using blockchain. In: Proceedings of First International Conference on Computing, Communications, and Cyber-Security (IC4S 2019). Lecture Notes in Networks and Systems, vol. 121. Springer, Singapore (2020). https://doi.org/10.1007/978-981-15-3369-3_55
5. Hathaliya, J.J., Tanwar, S.: An exhaustive survey on security and privacy issues in healthcare 4.0. Comput. Commun. 153, 311–335 (2020). ISSN: 0140-3664
6. Murugan, A., Chechare, T., Muruganantham, B., Kumar, S.G.: Healthcare information exchange using blockchain technology. Int. J. Electr. Comput. Eng. 10(1), 421–426 (2020). ISSN: 2088-8708. https://doi.org/10.11591/ijece.v10i1.pp421-426
7. Zhang, P., White, J., Schmidt, D.C., Lenz, G., Rosenbloom, S.T.: FHIR chain: applying blockchain to securely and scalably share clinical data. Comput. Struct. Biotechnol. J. 16, 267–278 (2018). https://doi.org/10.1016/j.csbj.2018.07.004
8. Zhang, S., Lee, J.H.: Analysis of the main consensus protocols of blockchain. ICT Express 6(2), 93–97 (2020). The Korean Institute of Communications and Information Sciences (KICS), Elsevier, ISSN: 2405-9595. https://doi.org/10.1016/j.icte.2019.08.001
9. Lashkari, B., Musilek, P.: A comprehensive review of blockchain consensus mechanisms. IEEE Access 9, 43620–43652 (2021). https://doi.org/10.1109/ACCESS.2021.3065880
10. Frikha, T., Chaabane, F., Aouinti, N., Cheikhrouhou, O., Ben Amor, N., Kerrouche, A.: Implementation of blockchain consensus algorithm on embedded architecture. Secur. Commun. Netw. 2021, Article ID 9918697, p. 11 (2021). https://doi.org/10.1155/2021/9918697
11. Zhang, J., Tan, R., Su, C., Si, W.: Design and application of a personal credit information sharing platform based on consortium blockchain. J. Inf. Secur. Appl. 55, 102659 (2020). ISSN: 2214-2126. https://doi.org/10.1016/j.jisa.2020.102659
12. Zhai, S., Yang, Y., Li, J., Qiu, C., Zhao, J.: Research on the application of cryptography on the blockchain. J. Phys. Conf. Ser. 1168(3), 032077 (2019). IOP Publishing. https://doi.org/10.1088/1742-6596/1168/3/032077
13. Kaushik, A., Choudhary, A., Ektare, C., Thomas, D., Akram, S.: Blockchain—literature survey. In: 2nd IEEE International Conference on Recent Trends in Electronics Information & Communication Technology (RTEICT), India (2017)
14. Selmanovic, D.: Cryptocurrency for dummies: bitcoin and beyond. Toptal (Oct 2018). https://www.toptal.com/bitcoin/cryptocurrency-for-dummies-bitcoin-and-beyond
Chapter 16
Addressing Item Cold Start Problem in Collaborative Filtering-Based Recommender Systems Using Auxiliary Information
Ronakkumar Patel and Priyank Thakkar
Abstract Recommender systems (RS) are becoming increasingly popular and are emerging as a viable answer to the problem of information overload. A collaborative filtering-based (CF) recommender system has trouble recommending new items, or recommending to new users, because there are no ratings for new items or by new users. These issues are known as the new item cold start (ICS) and new user cold start (UCS) problems. In many domains, such as ecommerce, we have auxiliary information such as an image and a textual description for each item. These auxiliary data can be used to produce feature vectors for new items, which can subsequently be used to find similarities between new items and existing items in the system, allowing for their recommendation and alleviating the item cold start problem. In this paper, we propose to extract feature vectors from images and textual descriptions of items using deep neural networks such as pretrained VGGNet and BERT, and then use these feature vectors for similarity calculation, allowing CF-based RS to predict ratings of new items accurately or with low error. We present three models that use only images, only text, and both images and text as auxiliary information. Tests on a large dataset from Amazon show that the suggested strategy is effective, with an MAE ranging from 0.77 to 0.86 on a 1 to 5 rating scale.
Keywords Item cold start problem · Item-based collaborative filtering · Recommender systems · Deep learning · Auxiliary information
R. Patel · P. Thakkar (B) CSE Department, Institute of Technology, Nirma University, Ahmedabad, India e-mail: [email protected] R. Patel U & P U. Patel Department of Computer Engineering, Chandubhai S. Patel Institute of Technology (CSPIT), Charotar University of Science and Technology (CHARUSAT), Anand, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_16
16.1 Introduction
Advances in Web technologies and easy access to the Internet have led to a plethora of information available online. Because of this information overload, users have a tough time finding content/items of interest. A recommender system (RS) uses data about users and items to help users identify products of interest. The recommendation of videos on YouTube, products on Amazon, music on Spotify, and movies on Netflix are some examples of RSs. There are three types of recommender systems: (1) content-based filtering (CBF), (2) collaborative filtering (CF), and (3) hybrid [4]. CBF simply recommends items similar to the ones the user has liked in the past; for example, if the user has enjoyed a lot of action movies in the past, the user will be recommended a new action movie that he or she has not watched. CF, on the other hand, suggests items to users based on what other like-minded users have liked in the past. This method aids in the exploration of unknown items which may also be of interest [21]. Different information, such as explicit ratings, implicit ratings, product images, text, audio, and video, is typically available and used for building user and item profiles. Although traditional RSs [19, 20] solve the recommendation problem in general, they find it difficult to recommend new items to users and to provide quality recommendations to new users [15]. These are known as the item cold start (ICS) problem [24] and the user cold start (UCS) problem [5], respectively. They arise because traditional RSs rely on implicit or explicit feedback received by items or given by different users. This feedback is unavailable for a new item or a new user, so a collaborative filtering-based recommender system cannot compute the similarity between the new user and other users in the system, or between the new item and other items in the system, and thus cannot recommend a newly added item or make a solid recommendation to a new user.
Recommender systems are now prevalent in all domains, and items in each domain have different auxiliary information associated with them, such as images and reviews in eCommerce, text in the news domain, and audio in the music domain. This auxiliary information can be used to engineer new features that help alleviate the problem of insufficient information about the items. The features extracted from auxiliary information make it possible to compute the similarity between a new user and current users, as well as between a new item and existing items in the system, allowing the CF-based RS to deal with new items and users. Recent advances in deep learning (DL) techniques such as convolutional neural networks (CNNs), sequential models, auto-encoders (AEs), generative adversarial networks (GANs), and bidirectional encoder representations from transformers (BERT) can aid in the extraction of latent features from auxiliary information, and they are now widely used for this purpose [9, 11, 12].
The remainder of the paper is structured as follows: related work is addressed in Sect. 16.2, while Sect. 16.3 describes the proposed approach, which focuses on ingesting different auxiliary information through DL techniques to solve the ICS problem. The datasets, evaluation metrics, and result analysis are discussed in Sect. 16.4, and Sect. 16.5 concludes the paper with some interesting future research directions.
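The idea of using features extracted from auxiliary information to handle a new item can be sketched in Python. The sketch assumes a standard item-based CF prediction (a similarity-weighted average over the k most similar items the user has rated), which may differ from the exact formulation used later in the chapter. The short feature vectors are hypothetical stand-ins for embeddings produced by a pretrained network, and the function names are introduced only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def predict_rating(new_item_vec, item_vecs, user_ratings, k=2):
    """Predict a user's rating of an unrated item from the k most similar rated items."""
    scored = [(cosine(new_item_vec, item_vecs[item]), rating)
              for item, rating in user_ratings.items()]
    top = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
    top = [(sim, rating) for sim, rating in top if sim > 0]
    if not top:
        return None   # no similar rated item, so no prediction is made
    return sum(sim * rating for sim, rating in top) / sum(sim for sim, _ in top)


def mean_absolute_error(predicted, actual):
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical 2-dimensional feature vectors (real embeddings are much longer).
item_vecs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
user_ratings = {"a": 5, "b": 1, "c": 3}     # one user's ratings on a 1 to 5 scale
new_item = [1.0, 0.0]                       # new item with no ratings at all

prediction = predict_rating(new_item, item_vecs, user_ratings, k=2)
print(round(prediction, 2))
print(mean_absolute_error([prediction], [4]))
```

Because the similarity is computed from content features rather than from ratings, the new item can receive a prediction even though no user has rated it.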
16.2 Literature Review
Various approaches to deal with ICS and UCS issues in CF-based RSs have been proposed in the literature. In this section, we discuss the approaches adopted to address ICS and UCS, from traditional to deep learning-based ones. The authors in [17] used clustering and a decision tree to deal with the ICS problem. Using historical ratings, previously rated items were clustered, and a decision tree was built from the clusters formed and the features of the items, with the clusters as the leaves of the tree. Items that had not been rated were presented to the decision tree and placed in one of the clusters based on their characteristics. Once an item is assigned to a cluster, it can be recommended to users who liked the items in that and similar clusters. The approach used both content-based and collaborative strategies. It is difficult to determine item similarities in the absence of historical ratings, hence the authors of [26] proposed that the similarity be calculated using interrelated attributes of items. Interrelationship mining was used to create binary properties between each pair of items. Two alternative item similarity calculation strategies were introduced in [3, 7]. In the first technique, item similarity was calculated from the movies' genres: the more similar the genres, the more similar the items are to each other [3]. In the second approach, for items that were not directly similar to the target items, the relative similarity was calculated using the transitive property of trust [7]. Because the similarities were not based on ratings, the approach could recommend new items to the user. Wasserstein collaborative filtering (WCF) was proposed in [10] to predict user preference for new items. The researchers modeled the disparity between items with few ratings (warm items) and those with no ratings (new items) using the Wasserstein distance.
The assumption was that the ratings for a new item are similar to those of the warm items that resemble it in terms of content information. MetaCF was proposed in [25] to address the new user cold start problem. The approach relied on meta-learning to enable fast learning for new users: it essentially generalized the learning so that the CF model could learn in a few iterations. The proposed solution to the user cold start problem can also be used to solve the cold start problem for new items. In [18], the authors tackled the new ICS issue in micro open education resources (OERs). To uncover similarities between OERs, the researchers looked at educational environments and other attributes. The OERs were then clustered based on their estimated similarities. A new OER could be recommended to the learner because it shared the same characteristics as the other OERs in its cluster. Imputation is one of the approaches for filling in missing data, and it can also be used to fill in the missing ratings in the sparse user-item rating matrix. The methods proposed in [1] used auxiliary information of items in non-negative matrix factorization to address the new item cold start issue. Using both active learning and item attribute information, the authors of [27] suggested a unique recommendation system for the item cold start problem. They created relevant user selection criteria based on item qualities and user rating history, and then merged the criteria in an optimization framework for user selection.
R. Patel and P. Thakkar
They then produced accurate rating predictions for the remaining unselected users using the feedback ratings, the users’ past ratings, and the item attributes. The proposed method outperformed traditional methods on two real-world datasets. In [6], the authors suggested a method for predicting the user’s prospective preferences by merging the item’s attribute information with the historical rating matrix. The method used a matrix decomposition model to combine attribute and temporal data. The experimental results showed that, compared to the baseline method, the suggested method considerably improved recommendation accuracy when tested on the MovieLens dataset and a crawled JD dataset. Researchers in [16] proposed LARA, an end-to-end adversarial neural network with multiple generators. It mapped item attributes to user representations and helped in finding user profiles similar to the profile generated by LARA for a new item. The use of item attributes such as categories, keywords, and tags together with ratings is common in most RSs. Because these attribute-based approaches were time-consuming and feature extraction required human intervention, researchers have begun combining deep learning algorithms with the CF methodology. DNNRec [13] used user and item side information to mitigate the cold start problem and outperformed state-of-the-art RSs. The model proposed in [23] to tackle ICS integrates two models: item features were learned by a deep learning architecture, the stacked denoising autoencoder (SDAE), and then integrated into the timeSVD++ CF model. In [24], to learn text features from a movie plot, the authors used a model that relied on the bag-of-words approach, which did not preserve the semantic similarity of words. To address this, Anwaar et al. [2] presented the HRS-CE algorithm, which used a word embedding model (Word2Vec) to produce distributed representations of items’ descriptions.
They generated user profiles that captured the users’ tastes and likings. The use of precise item descriptions rather than metadata represented the deep semantics of items, resulting in more complete user profiles. To establish a relevance score between different items, the model proposed in [22] used BERT. The authors used item titles to retrain BERT’s masked language model for ecommerce, recasting next sentence prediction as next purchase prediction. The item title tokens were used as item content to solve the ICS problem. We can observe from the literature review that the bulk of the research employed routine item attributes to calculate similarities or form clusters. However, because these attributes are not readily available for a new item, poor recommendations result. Auxiliary information, such as photographs and descriptions/reviews of items, is also available in the ecommerce domain and can be used to generate more precise estimates of similarity and, as a result, better recommendations. With this in mind, this article proposes to use deep learning (DL) techniques to extract features from this auxiliary data and integrate them into the CF pipeline.
16 Addressing Item Cold Start Problem in Collaborative Filtering …
16.3 Proposed Approach CF-based recommender systems cannot recommend new items because no user in the system has rated them, and therefore, the similarity of these items to the existing items in the system cannot be computed. As previously stated, this issue is known as the item cold start (ICS) problem. In some domains, such as ecommerce, auxiliary information about products is available, such as an image of the product, a textual description, or textual reviews. Feature vectors extracted from this information can be used to compute the similarity of the new item with existing items in the system. As a result, the likelihood of the new item being recommended to users with similar preferences also improves. This is the notion behind the proposed approach, as seen in Fig. 16.1. Visual and text features are extracted from item images and item reviews whenever a new item is added to the system. These features are used to determine how similar the items in the system are. To predict the target user’s rating of a newly added item, we consider the items that the target user has already rated and that are similar to the newly added item (as per the similarity computed using auxiliary information). The idea is captured in Eqs. 16.1 and 16.2.
\[ \mathrm{sim}(i, j) = \frac{X_i \cdot X_j}{\lVert X_i \rVert \, \lVert X_j \rVert} \tag{16.1} \]
\[ \hat{r}_{u,i} = \frac{1}{\sum_{j \in I} |\mathrm{sim}(i, j)|} \sum_{j \in I} \mathrm{sim}(i, j) \times r_{u,j} \tag{16.2} \]
Fig. 16.1 The proposed approach
Equation 16.1 computes the cosine similarity between items $i$ and $j$, where $X_i$ and $X_j$ are the feature vectors of items $i$ and $j$, respectively. These feature vectors are extracted from auxiliary information using deep neural networks. Equation 16.2 computes target user $u$’s rating $\hat{r}_{u,i}$ for the new item $i$ as the weighted sum of the ratings $r_{u,j}$ that $u$ has given to the items similar to item $i$, normalized by the sum of the absolute similarities. In the formula, $I$ is the set of items that are similar to the new item $i$ and that have already been rated by the user $u$.
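The two equations can be illustrated with a minimal sketch in plain Python. The function names, the zero-vector handling, and the `fallback` argument are assumptions made for this illustration and do not come from the chapter. The chapter also does not say how the set I of similar items is selected, so the sketch simply uses every rated item passed to it.

```python
import math


def cosine_similarity(x_i, x_j):
    """Cosine similarity between two feature vectors (Eq. 16.1)."""
    dot = sum(a * b for a, b in zip(x_i, x_j))
    norm_i = math.sqrt(sum(a * a for a in x_i))
    norm_j = math.sqrt(sum(b * b for b in x_j))
    if norm_i == 0.0 or norm_j == 0.0:
        # Assumption: an all-zero feature vector is treated as dissimilar.
        return 0.0
    return dot / (norm_i * norm_j)


def predict_rating(new_item_features, rated_items, fallback=None):
    """Predict a user's rating of a new item (Eq. 16.2).

    rated_items is a list of (feature_vector, rating) pairs for the items
    in the set I, i.e. items the user has already rated.
    """
    numerator = 0.0
    denominator = 0.0
    for features, rating in rated_items:
        sim = cosine_similarity(new_item_features, features)
        numerator += sim * rating
        denominator += abs(sim)
    if denominator == 0.0:
        # Assumption: with no usable neighbours, return a caller-chosen default.
        return fallback
    return numerator / denominator
```

In practice the feature vectors would be the image or text embeddings of the items, and a caller would typically restrict `rated_items` to the most similar items before calling `predict_rating`.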
16.4 Experimental Analysis In this section, we discuss the dataset that is used, evaluation metrics employed, as well as our experimental setup and results analysis.
16.4.1 Dataset and Evaluation Metrics The clothing, shoes, and jewelry 5-core rating dataset from the 2014 Amazon data [8] is used to evaluate the proposed approach. In addition to ratings, this dataset contains product images, reviews, descriptions, prices, categories, etc. Users rate the products on a scale of 1–5. For our experiment, we first cleaned the dataset by removing the items without an image. This left us with 39,387 users, 23,033 items, and 278,677 ratings. Table 16.1 provides brief summary statistics. The first column indicates the number of ratings that an item has received or that a user has given. The second column shows the number of items that have received that number of ratings, whereas the third column shows the number of users who have given that number of ratings. The proposed model was tested using items with at most 5, 6, 7, 8, 9, 10, or 15 ratings as the new items. This means that during training, i.e., during similarity computation or rating prediction, all of these items’ ratings were masked. The masked ratings were used to form the test set. The proposed approach is evaluated using root mean square error (RMSE) and mean absolute error (MAE). Their formulas are given in Eqs. 16.3 and 16.4, respectively.
\[ \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( r_{u,i} - \hat{r}_{u,i} \right)^2} \tag{16.3} \]
\[ \mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| r_{u,i} - \hat{r}_{u,i} \right| \tag{16.4} \]
Table 16.1 Statistics
Number of ratings    Number of items    Number of users
5                    5123               15,558
6                    3409               8622
7                    2538               4979
8                    1803               3092
9                    1417               1936
10                   1163               1340
11                   942                941
12                   742                708
13                   643                449
14                   544                332
15                   457                261
where $r_{u,i}$ is the true (ground-truth) rating, $\hat{r}_{u,i}$ is the predicted rating, and $n$ is the number of ratings in the test set.
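As a small worked illustration of the two metrics, the following sketch computes RMSE and MAE over paired lists of true and predicted ratings. The function names and the input validation are assumptions added for this example, not part of the chapter.

```python
import math


def _check(true_ratings, predicted_ratings):
    """Validate that both lists are non-empty and of equal length."""
    if len(true_ratings) != len(predicted_ratings):
        raise ValueError("true and predicted ratings must have the same length")
    if not true_ratings:
        raise ValueError("at least one rating is required")


def rmse(true_ratings, predicted_ratings):
    """Root mean square error over the test ratings (Eq. 16.3)."""
    _check(true_ratings, predicted_ratings)
    squared = [(r - p) ** 2 for r, p in zip(true_ratings, predicted_ratings)]
    return math.sqrt(sum(squared) / len(squared))


def mae(true_ratings, predicted_ratings):
    """Mean absolute error over the test ratings (Eq. 16.4)."""
    _check(true_ratings, predicted_ratings)
    absolute = [abs(r - p) for r, p in zip(true_ratings, predicted_ratings)]
    return sum(absolute) / len(absolute)
```

Because RMSE squares each error before averaging, it penalizes large individual errors more heavily than MAE, which is one reason both metrics are commonly reported together.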
16.4.2 Experimental Setup and Result Analysis Three distinct models were set up to assess the proposed idea. The first model (denoted M1 in Fig. 16.2) used only the image of the new item as auxiliary information, and the second model (M2) used only the text information related to the new item. The third model (M3) used both image and text data related to the new item as auxiliary information. The pretrained VGG16 [14] model was used to extract visual features from the images of the items. To be precise, the visual features were extracted from the flatten layer (4096 features) of the pretrained VGG16 network. For extracting features from text data, BERT was used. The sentence embedding (128 features) was obtained from a BERT model with 12 transformer blocks, a hidden size of 128, and 2 attention heads; this is smaller than the standard BERT-base configuration, which has a hidden size of 768 and 12 attention heads. As shown in Table 16.2, each model was evaluated in seven distinct experiments. The setting of each experiment is given in the first column of Table 16.2. For example, a value of 5 in this column indicates that all items with 5 or fewer ratings were treated as test items, with their ratings masked and used to build the test set. The rating vector of each of these items is then a null vector with no ratings, making it impossible to compute the similarity between these items and other existing items in the system using a standard CF-based RS that depends solely on ratings. In the proposed technique, however, we use auxiliary information associated with these test items to extract feature representations for them, allowing for similarity computation and possible recommendation. The remaining rows of Table 16.2 can be interpreted in the same way, and it should
Fig. 16.2 M1: Image as auxiliary information, M2: Text as auxiliary information, M3: Image and text as auxiliary information
Table 16.2 RMSE and MAE of models M1, M2 and M3 in 7 different experiments
Items’ ratings    M1 RMSE    M1 MAE    M2 RMSE    M2 MAE    M3 RMSE    M3 MAE
5                 1.15       0.7763    1.15       0.7794    1.15       0.7803
6                 1.17       0.7803    1.16       0.7843    1.16       0.7844
7                 1.18       0.7910    1.18       0.7927    1.18       0.7924
8                 1.19       0.7954    1.19       0.7985    1.19       0.7983
9                 1.20       0.8021    1.20       0.8048    1.20       0.8042
10                1.21       0.8077    1.21       0.8112    1.21       0.8110
15                1.27       0.8500    1.26       0.8516    1.26       0.8520
be noted that each subsequent experiment is a superset of the previous one, with a few additional items added. In all the experiments, the ratings of the test items, which have no ratings at all in the training set, are predicted with good accuracy. The MAE varies between 0.776 and 0.852, which is a reasonably low error considering that we are predicting only for cold start items. There is little to choose between the three models, however, as all of them perform almost identically.
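The masking protocol described above can be sketched as follows. This is an illustration under assumptions: ratings are represented as hypothetical `(user, item, rating)` tuples, and the function name `split_cold_start` is invented for this example rather than taken from the chapter.

```python
from collections import Counter


def split_cold_start(ratings, max_ratings):
    """Split ratings into train and test sets for one experiment.

    ratings: list of (user, item, rating) tuples.
    Items with at most max_ratings ratings are treated as new items:
    all of their ratings are masked from training and form the test set.
    """
    counts = Counter(item for _, item, _ in ratings)
    cold_items = {item for item, n in counts.items() if n <= max_ratings}
    train = [r for r in ratings if r[1] not in cold_items]
    test = [r for r in ratings if r[1] in cold_items]
    return train, test, cold_items
```

Since the threshold only grows from one experiment to the next, the set of cold items returned for a smaller `max_ratings` is always contained in the set returned for a larger one, which matches the superset relationship noted in the text.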
16.5 Conclusion and Future Work Traditional collaborative filtering-based recommender systems suffer from the item cold start (ICS) problem. In this paper, we used auxiliary information to engineer features of cold start items. We used deep neural networks, namely pretrained VGGNet and BERT, to extract features from images and text descriptions of items. Three models were created and evaluated in seven experiments. As auxiliary information, these three models used only images, only text, and both images and text, respectively. We found that all the models enable a collaborative filtering-based recommender system to recommend new items with reasonably low error. The performance of the three models, however, did not differ significantly. This calls for a more thorough analysis and presents an intriguing research opportunity. The user cold start (UCS) problem is also an interesting problem that can be explored in the future.
References
1. Alghamedy, F., Zhang, J., Al-Ghamdi, M.: Imputing item auxiliary information in NMF-based collaborative filtering. In: Computer Science & Information Technology (CS & IT). AIRCC, pp. 21–36 (2018)
2. Anwaar, F., Iltaf, N., Afzal, H., Nawaz, R.: HRS-CE: a hybrid framework to integrate content embeddings in recommender systems for cold start items. J. Comput. Sci. 29, 9–18 (2018)
3. Barman, S.D., Hasan, M., Roy, F.: A genre-based item-item collaborative filtering: facing the cold-start problem. In: Proceedings of the 2019 8th International Conference on Software and Computer Applications, pp. 258–262. ICSCA ’19, Association for Computing Machinery, New York, NY (2019)
4. Bobadilla, J., Ortega, F., Hernando, A., Gutiérrez, A.: Recommender systems survey. Knowl.-Based Syst. 46, 109–132 (2013)
5. Bobadilla, J., Ortega, F., Hernando, A., Bernal, J.: A collaborative filtering approach to mitigate the new user cold start problem. Knowl.-Based Syst. 26, 225–238 (2012)
6. Guo, X., Yin, S.C., Zhang, Y.W., Li, W., He, Q.: Cold start recommendation based on attribute-fused singular value decomposition. IEEE Access 7, 11349–11359 (2019)
7. Hasan, M., Roy, F.: An item-item collaborative filtering recommender system using trust and genre to address the cold-start problem. Big Data Cogn. Comput. 3(3) (2019)
8. He, R., McAuley, J.: Ups and downs: modeling the visual evolution of fashion trends with one-class collaborative filtering. In: Proceedings of the 25th International Conference on World Wide Web, pp. 507–517. WWW ’16, International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, CHE (2016). https://doi.org/10.1145/2872427.2883037
9. Martins, G.B., Papa, J.P., Adeli, H.: Deep learning techniques for recommender systems based on collaborative filtering. Expert Syst. 37(6), e12647 (2020)
10. Meng, Y., Yan, X., Liu, W., Wu, H., Cheng, J.: Wasserstein Collaborative Filtering for Item Cold-Start Recommendation, pp. 318–322. Association for Computing Machinery, New York, NY (2020)
11. Mu, R.: A survey of recommender systems based on deep learning. IEEE Access 6, 69009–69022 (2018)
12. Pan, W.: A survey of transfer learning for collaborative recommendation with auxiliary data. Neurocomputing 177, 447–453 (2016)
13. R, K., Kumar, P., Bhasker, B.: DNNRec: A novel deep learning based hybrid recommender system. Expert Syst. Appl. 144, 113054 (2020)
14. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition (2015)
15. Su, X., Khoshgoftaar, T.M.: A survey of collaborative filtering techniques. Adv. Artif. Intell. 2009(Section 3), 1–19 (2009)
16. Sun, C., Liu, H., Liu, M., Ren, Z., Gan, T., Nie, L.: LARA: Attribute-to-Feature Adversarial Learning for New-Item Recommendation, pp. 582–590. Association for Computing Machinery, New York, NY (2020)
17. Sun, D., Luo, Z., Zhang, F.: A novel approach for collaborative filtering to alleviate the new item cold-start problem. In: 2011 11th International Symposium on Communications Information Technologies (ISCIT), pp. 402–406 (2011)
18. Sun, G., Cui, T., Xu, D., Shen, J., Chen, S.: A heuristic approach for new-item cold start problem in recommendation of micro open education resources. In: International Conference on Intelligent Tutoring Systems, pp. 212–222. Springer, Berlin (2018)
19. Thakkar, P., Varma, K., Ukani, V.: Outcome fusion-based approaches for user-based and item-based collaborative filtering. In: International Conference on Information and Communication Technology for Intelligent Systems, pp. 127–135. Springer, Berlin (2017)
20. Thakkar, P., Varma, K., Ukani, V., Mankad, S., Tanwar, S.: Combining user-based and item-based collaborative filtering using machine learning. In: Information and Communication Technology for Intelligent Systems, pp. 173–180. Springer, Berlin (2019)
21. Wang, J., Yue-xin, L., Chun-ying, W.: Survey of recommendation based on collaborative filtering. J. Phys.: Conf. Ser. 1314(1) (2019)
22. Wang, T., Fu, Y.: Item-based collaborative filtering with BERT. In: Proceedings of The 3rd Workshop on e-Commerce and NLP, pp. 54–58. Association for Computational Linguistics, Seattle, WA (Jul 2020)
23. Wei, J., He, J., Chen, K., Zhou, Y., Tang, Z.: Collaborative filtering and deep learning based hybrid recommendation for cold start problem. In: 2016 IEEE 14th International Conference on Dependable, Autonomic and Secure Computing, 14th International Conference on Pervasive Intelligence and Computing, 2nd International Conference on Big Data Intelligence and Computing and Cyber Science and Technology Congress (DASC/PiCom/DataCom/CyberSciTech), pp. 874–877 (2016)
24. Wei, J., He, J., Chen, K., Zhou, Y., Tang, Z.: Collaborative filtering and deep learning based recommendation system for cold start items. Expert Syst. Appl. 69, 29–39 (2017)
25. Wei, T., Wu, Z., Li, R., Hu, Z., Feng, F., He, X., Sun, Y., Wang, W.: Fast adaptation for cold-start collaborative filtering with meta-learning. In: 2020 IEEE International Conference on Data Mining (ICDM), pp. 661–670 (2020)
26. Zhang, Z.P., Kudo, Y., Murai, T., Ren, Y.G.: Addressing complete new item cold-start recommendation: a niche item-based collaborative filtering via interrelationship mining. Appl. Sci. 9(9) (2019)
27. Zhu, Y., Lin, J., He, S., Wang, B., Guan, Z., Liu, H., Cai, D.: Addressing the item cold-start problem by attribute-driven active learning. IEEE Trans. Knowl. Data Eng. 32(4), 631–644 (2020)
Chapter 17
Research on the College English Blended Teaching Model Design and Implementation Based on the “Internet + Education”
Yu Liu
Abstract The Internet has created unprecedented opportunities for the development of education. The new generation of information technologies such as artificial intelligence and big data plays an important role in optimizing learning methods, improving teaching efficiency, and helping educational equity. This study attempts to design and implement a school-based blended teaching model based on four intelligent platforms. The overall design framework includes a needs analysis, hierarchical teaching objectives, diversified learning resources, and systematic learning processes. The specific implementation strategies include integrated teaching methods, multidimensional teaching interaction, and dynamic teaching evaluation. In addition, qualitative and quantitative empirical studies were conducted to verify the effect of the teaching model. This teaching model has been implemented in a college in China and has achieved satisfactory results.
Keywords Internet + Education · College English · Blended teaching model · Design · Implementation
17.1 Introduction Blended teaching is a new teaching model based on e-learning that integrates teaching in supervised physical places with online teaching in which students independently control the learning process [1]. In 2018, the Ministry of Education of China issued the Education Informatization 2.0 Action Plan, whose core concept is the deep integration of information technology, education, and teaching, offering a new avenue for reforming foreign language education in China. In 2020, China’s University English Teaching Guide suggested that universities build or use quality courses such as online open courses, offline courses, online and offline hybrid courses, and virtual simulation experimental
Y. Liu (B) Institute of Linguistics, Shanghai International Studies University, Shanghai, China e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_17
courses, and implement a hybrid teaching model, so that students can develop in the direction of active, independent, and personalized learning [2]. The integrated “Internet + education” intelligent system provides sufficient network learning space for foreign language education and teaching. However, the reform of university English teaching has long faced many problems, such as reduced English class hours and credits, unsatisfactory learning outcomes, and the difficulty of meeting students’ personalized needs. This study designs and implements a blended English teaching model in an undergraduate college to address these problems.
17.2 Literature Review 17.2.1 “Internet + Education” International and domestic scholars have discussed the relationship between the Internet and education. Sharma and Maleyeff proposed that the emergence of the Internet as a potent medium for commerce and communication will be reflected in the content and the delivery mode of management education [3]. Dogruer et al. noted that, because the Internet has many different functions, it is important to consider to what extent it is used by students in higher education for academic purposes. The Internet also provides students with asynchronous education in which they can reach any kind of information anytime and anywhere [4]. Qin and Zhang, Chinese scholars, defined “Internet + education” as a new form of education. In their view, “Internet + education” is not just the application of Internet and mobile Internet technology in education, nor just the establishment of various education and learning platforms, but the deep integration of the Internet, the mobile Internet, and education. It is a strategic and global educational change that promotes educational progress, efficiency improvement, and organizational change, and enhances educational innovation and productivity [5]. Yang and Yu believe that “Internet + education” should first be regarded as a new paradigm of education development under the new normal. The essence of “Internet + education” lies in innovating the teaching and learning mode and the school running mode, so as to optimize and reconstruct the relationships between teachers and students, school and society, and supply and demand [6].
17.2.2 Blended Teaching Studies on the blended teaching model can be traced from the literature of international and domestic scholars.
Barnum and Paarmann adopted a blended learning model that comprises both electronic and face-to-face interaction. Their model has four components: web-based delivery, face-to-face processing, creating deliverables, and a collaborative extension of learning [7]. Graham regarded blended learning as part of the ongoing convergence of two archetypal learning environments. On the one hand, there is the traditional face-to-face learning environment that has been around for centuries. On the other hand, there are distributed learning environments that have begun to grow exponentially as new technologies have expanded the possibilities for distributed communication and interaction [8]. Bersin designed four stages for the blended teaching process: (1) analyzing the internal needs and expectations of learners; (2) conducting teaching designs and formulating evaluation standards according to the learning status and internal needs of learners; (3) exploring and integrating teaching resources using the appropriate technical media; and (4) carrying out teaching practices, tracking the implementation effects, and evaluating the implementation results [9]. Horn described blended teaching as a formal education program in which students learn at least partly online, with some control over the time, place, path, or pace of learning, and at least partly in supervised physical places away from home [10]. Chinese scholars have also explored the connotation and extension of blended teaching. He proposed that blended teaching combines the advantages of traditional learning methods with those of digital learning. It gives play to the leading role of teachers in guiding, inspiring, and monitoring the teaching process and emphasizes student enthusiasm, initiative, and creativity [11].
Tian proposed that blended learning is a mixture of various learning theories, methods, media, content, modes, student support services, and environments [12]. Liu discussed the connotation and extension of blended teaching: in a broad sense, it includes a mixture of learning theories and teaching media, modes, and methods; in a narrow sense, it refers to a mixture of offline and online teaching [13]. Moreover, Chinese scholars have also explored the construction and implementation of the hybrid teaching model in the Chinese educational environment. Zhou studied reading, cooperative learning, and online participatory learning concepts that rely on online and offline activities, such as classroom demonstration, collaborative learning, instant discussion, interactive learning, peer evaluation, and resource sharing. Researchers at home and abroad have conducted exploratory research on the mixed teaching model and have produced productive results in theory and practice [14]. Hu proposed that hybrid teaching research should go beyond technical decision-making, draw comprehensively on multiple second-language learning and teaching theories from the perspective of building a mixed learning ecology, attend to the three key issues of learning-promotion mechanisms, teaching evaluation, and mixed foreign language learning design, and promote effective teaching practice [15].
For English teaching in Chinese universities, exploration based on the hybrid teaching model is deepening, but many key problems must still be solved. For example, how do we design the hybrid teaching model? What teaching strategies are suitable for mixed teaching models? How effective is the hybrid teaching model? This research aims to solve these problems.
17.3 Blended Teaching Model Design See Fig. 17.1
17.3.1 Needs Analysis Before formally implementing the mixed teaching model, teachers’ understanding of and attention to student learning are particularly important. In addition to understanding the students’ English proficiency, teachers must also understand the internal needs of the students (language ability, skill demands, learning motivation, and learning resource preferences). Teachers must also periodically investigate the student learning process and learning effect during implementation. Methodology The teacher organized the students to complete an online questionnaire at the beginning of the semester. After the end of the semester, two more surveys were completed so that the results could promptly feed back into the teaching of the next school year. The investigation and analysis of the learning situation were conducted in stages throughout the teaching process. At the beginning of the semester, we conducted
Fig. 17.1 Blended teaching model design and implementation
the first student survey of the 2021 cohort of non-English major undergraduates in a mixed teaching model experimental class, covering eight classes, 23 majors, and 737 students. The questionnaire covers student learning needs and expectations of the blended teaching model: language skills, learning motivation, learning focus, and learning resources. Results and Discussion The results show that 86 and 76% of students chose listening and speaking as the language skills they need, and only 17 and 13% of students chose writing and translation, indicating that students have a greater need to improve their listening and oral language skills. In terms of learning motivation, 75% of students think it is most important to complete course tasks and exams, whereas 45 and 35% believe that their career development and study abroad are more important than improving their language ability. In terms of learning focus, 32 and 26% of students indicated that it was better to develop the ability to obtain information and express ideas in English and to integrate into the international environment, whereas 18 and 17% proposed strengthening the training of language skills, such as reading, writing, and translation, and gaining more language input and output opportunities through the information network. Regarding learning resource preferences, 67%, 44%, and 44% of the students chose classic English film appreciation, background culture knowledge, and selected exercises with detailed explanations, respectively. Therefore, we have four findings. First, students have a high demand for listening and speaking skills and a low demand for translation and writing. Second, they pay most attention to short-term development goals, with lifelong language ability development ranking last. Third, they have a high demand for language output and input ability and a low demand for skills training. Fourth, students are more interested in original English films and less interested in routine practice exercises.
17.3.2 Learning Objectives According to the University English Teaching Guide (2020) and the initial learning situation survey, the overall teaching objectives and unit learning objectives are divided into three levels: primary, middle, and higher goals, which correspond to the foundation, improvement, and development goals of the guide, respectively. Primary-Order Goal The primary-order goal is to meet one’s needs for information communication in daily life, study, and future work. Students should remember English speech, vocabulary, grammar, and discourse structure; understand oral or written material of medium language difficulty on common personal and social communication subjects; and apply simple oral and written communication on familiar or other topics. Middle-Order Goal The middle-order goal is to communicate independently in English on familiar topics in daily life, study, and future work. Students should
remember English speech, vocabulary, grammar, and discourse structure; understand the logical relationships, discourse structure, and implied meaning of the material; and clearly describe events, items, reasons, and plans. Higher-Order Goal The higher-order goal is to communicate effectively in many fields, such as daily life, study, and future work. Students should remember English speech, vocabulary, grammar, and discourse structure; understand oral or written materials that contain some difficult language and deal with familiar content or content related to their major; and apply oral and written communication on topics of public concern and professional interest. Moreover, students should synthesize, compare, and analyze information from different sources and draw conclusions or form an understanding.
17.3.3 Online Resources The online learning resources in this study were primarily derived from four Internet education platforms. Unipus Online Learning Platform The Unipus online learning platform integrates learning, practice, testing, evaluation, and research under the Foreign Research Institute. This teaching model uses the online learning resources of New Horizons University English (third edition) on the platform. With the help of the various independent learning resources provided by the Unipus learning platform, such as listening practice, reading tests, and writing and translation training, students can develop the habit of learning English actively and acquiring language knowledge. Rain Classroom Rain Classroom, an online interactive platform organized and developed by the Online Education Office of Tsinghua University, is predominantly used in the classroom teaching process. It integrates teaching resources into PowerPoint and WeChat, with functions including preclass preview, in-class instant tests, teacher-student interaction (barrage, submission, red envelope, roll call, speech, etc.), homework submission (answering questions, taking photos, voice, etc.), and learning data analysis, which provide a good environment for formative evaluation in class. I-Test Intelligent Evaluation Cloud Platform The i-test intelligent evaluation cloud platform is used for the pretests and posttests of students participating in the teaching model practice and plays a role in promoting teaching through testing. It can improve the effect of students’ independent learning and teacher-student interaction, let students practice their English language knowledge, and improve their English language ability. Four-Idea Future (FIF) Intelligent Teaching Platform The four-idea future (FIF) intelligent teaching platform of Beijing iFlytek Education Technology Co., Ltd. involves online courses, an oral training system, and course evaluation. This teaching
model adopts the FIF oral language training system for students to conduct oral language practice and evaluation. The course evaluation system can build student e-learning portfolios, collect learning behavior data, and conduct intelligent analyses. Moreover, teachers can adjust their teaching according to the data analysis results.
17.3.4 Learning Guide
The University English Blended Learning Guide aims to guide students to complete English online hybrid learning tasks. Each learning platform provides learning resources and optional learning strategies, as well as learning evaluation and assessment, helping students adapt to the new teaching model as soon as possible and obtain the expected learning effects. The learning guide is divided into three parts.
Primary-Middle and High-Order Teaching Objectives
Teachers systematically improve the learning guide according to the Guidelines for College English Teaching (2020) promulgated by the Ministry of Education and the newly revised school-based Outline of College English Curriculum.
Pre-During-After-Class Learning Task
All learning tasks before and after class are for students to study independently, engage in group discussion and sharing, and make comments in the form of individual and group combinations. Specifically, the learning task is divided into three steps.
Step 1: Pre-class online learning. Students should conduct collaborative online learning before class and complete the online learning tasks scheduled in each study guide chapter. If necessary, they can turn to group members and teachers for help to achieve effective language input.
Step 2: During-class teaching and learning. The teacher and students solve difficult problems, take tests, display the learning results, and check and fill the gaps to complete the teaching content, combining language input with output (language practice).
Step 3: After-class learning and evaluation. After class, the students complete the language practice projects in the form of group cooperation to conduct comprehensive practice training and improve their comprehensive language application ability. Students must also evaluate and reflect on the process and results of independent learning, and teachers must further provide feedback and personalized guidance on students' doubts and deficiencies.
Evaluation and Reflection on Teaching and Learning
Pre-class and after-class online learning includes autonomous and group study scoring, learning time, and self-reflection. Learning reflection in the class includes learning gains and unresolved
questions. A reasonable and systematic guiding document is formed so that students pursue the low-order goal independently on the platforms, the middle-order goal through live broadcasts and face-to-face teaching, and the high-order goal through online lectures and MOOCs. The document also refines the learning content (key points, difficulties, doubts, and hot spots), enriches materials, and uses vivid language to enhance the learning guide. Two evaluation tables were designed based on a mixed teaching model of pre-class online learning, in-class learning, and after-class online consolidation learning: an online learning evaluation table and an in-class learning reflection table. The online learning evaluation table covers individual and group learning, each recorded as a quantitative learning score and learning time. Self-reflection includes the learning experience, difficulties, and improvement measures. Combining quantitative and qualitative evaluation helps teachers and students grasp the self-evaluation and reflective results of individual and group learning during the online learning process. The in-class learning reflection table primarily contains two qualitative parts: learning gains and unanswered questions. These help teachers understand the students' mastery of knowledge and the points that need further strengthening, and obtain teaching feedback in time.
17.4 Blended Teaching Model Implementation
17.4.1 The Unit Case
Objectives of Unit
Each unit is divided into two parts: Part A and Part B. The primary-order goal is to complete the culture lead-in activities and the Part A words, text, translation, and paragraph writing. The middle-order goal is to complete the Part B words and text content. The higher-order goal is to complete extended learning in the student's field of knowledge. Each unit has four classes.
Pre-During-After-Class Learning Task
Before the first class, the pre-class learning tasks have three steps: unit subject and culture lead-in learning (audiovisual practice), Part A vocabulary preview (listening, word grouping, and translation), and supplementary grammar learning (data collection). In class, teachers and students answer questions and extend the unit lead-in content, share and supplement Part A vocabulary knowledge, elaborate on grammar knowledge, and engage in questions and answers. After class, students complete grammar consolidation practice. The Unipus platform supports Part A vocabulary consolidation questions. Middle-order target students can study the Part B vocabulary on the platform, and higher-order target students can choose the professional vocabulary expansion learning provided on the learning platform.
Before the second class, the interactive pre-reading and full-text understanding (listening, reading, retelling, translation, and discussion) and paragraph writing exercises can be completed before class. In the class, students share and show the structure and meaning of Part A articles. After class, middle-order target students can complete the study of Part B articles on the platform independently, and higher-order learning target students can choose relevant articles in their major for expanded reading training. Before the third class, students can engage in grammar and sentence pattern analysis of Part A and B text before class, ask and answer questions in class, and complete the special consolidation exercises on the learning platform after class. In addition, audiovisual exercises and three-order target set learning tasks can be completed for the fourth class. The Evaluation and Reflection on Teaching and Learning Students complete the language practice projects in the form of group cooperation to conduct comprehensive practice training in the language and improve their comprehensive language application ability. Moreover, students must also evaluate and reflect on the process and results of independent learning, and teachers must provide further feedback and personalized guidance on students’ doubts and deficiencies.
17.4.2 Integration of Teaching Methods
According to the teaching objectives, teachers need to adopt teaching methods that are conducive to student development and that fully reflect student participation, interactivity, and integration. In practice, this teaching model integrates the scaffolding, situational, task-driven, cooperative, and inquiry teaching methods based on constructivist learning theory. For example, in the pre-class learning task, the teacher first establishes the cultural background, theme, real events, or problem situations. The teacher then sets easy to moderate tasks, in which students must first explore the learning content independently. Finally, students engage in group collaborative learning, gradually constructing knowledge and performing self-evaluation and mutual group evaluation. In addition, the inquiry and cooperative methods can be used to help high-order target students explore knowledge in their subject field independently on the learning platform and further cultivate their academic research ability.
17.4.3 Multidimensional Interaction This teaching model aims to realize the four-dimensional interaction model between teachers and students and between students for both online and offline platforms. The
interaction between teachers and students is led by teachers, with students as the main body. Teachers guide, tutor, and supervise the student learning process. They guide the students to learn according to the learning process in the learning guide, train their knowledge and skills, address the difficulties and questionable points that the students encounter, and supervise the students' learning results to help them improve their learning ability. Student interaction refers to collaborative learning between students. The learning tasks in the learning guide are based on students' collaborative and independent learning, with self-evaluation and mutual evaluation of the learning results. Teachers use the four online platforms to create preset modules incorporated into the learning process to help students achieve the three goals and to provide personalized teaching. In the interaction between the online platforms and offline face-to-face teaching, students complete online learning tasks independently and jointly, and offline, teachers assist students in completing, consolidating, and expanding on the learning content to improve their learning ability.
17.5 Conclusion This study attempts to design and implement a school-based blended teaching model based on an intelligent system. The overall framework includes the process of a learning situation investigation, hierarchical teaching objectives, diversified learning resources, and systematic learning processes. The specific strategies include integrated teaching methods, multidimensional teaching interaction, and dynamic teaching evaluation. In addition, qualitative and quantitative empirical studies were conducted to verify the effect of the teaching model. This teaching model has been implemented in a college in China and has achieved satisfactory results. Some deficiencies exist in this study, which will continue to be improved in future studies. First, the evaluation of this teaching model should combine qualitative and quantitative methods to provide scientific feedback (i.e. the student learning evaluation) on the teaching model. Second, we could expand the online learning resources, such as the online live broadcast platform. The post-epidemic era has led to the rise of live-streaming platforms, which have become an indispensable online learning method. Teachers primarily use live broadcast platforms and other Tencent conferences to lecture online. Finally, the quality of the analysis of teacher interviews must also increase in the teaching evaluations to form a multidimensional interactive evaluation model between teachers and students for the online and offline classrooms and the teaching environment and policies.
Chapter 18
Qualitative Analysis of SQL and NoSQL Database with an Emphasis on Performance
Jyoti Chaudhary, Vaibhav Vyas, and C. K. Jha
Abstract Social networking sites and real-time applications generate very large amounts of data, much of it unstructured, which makes it difficult for relational database management systems to handle. SQL databases face new challenges in storing and processing big or unstructured data, because such workloads demand high read and write performance. SQL databases have proved inadequate for storing and querying dynamic user data, especially for large, highly parallel applications such as search engines and social networks. NoSQL databases were created to deal with these cases. This paper provides background on NoSQL databases, their features and specifications, and their available data models. In addition, it grades the performance of NoSQL databases on the basis of CRUD operations, bringing out the benefits and use cases of NoSQL databases to help companies choose a NoSQL database.
Keywords NoSQL · SQL · Comparison · Database system · CRUD operation · Data model
J. Chaudhary (B) · V. Vyas · C. K. Jha
Department of Computer Science, Banasthali Vidyapith, Jaipur, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_18
18.1 Introduction
In the past, many applications used SQL databases to store data. SQL databases organize data in a tabular structure of rows and columns, and the SQL query language is used to insert, query, update, and delete data. Online applications, however, now need technologies that can process large volumes of data, scale, and remain permanently available. As a result, relational (SQL) databases are struggling to keep up with the wave of modernization (Puangsaijai and Puntheeranurak [1]). The growing needs of industry call for non-relational databases that scale efficiently. This has increased the use of NoSQL databases, which are highly scalable and efficient and can store data of any shape: structured, semi-structured, or unstructured. SQL databases, by contrast, handle only structured data effectively. Semi-structured data must first be converted to relational form before it can be stored, and unstructured data likewise cannot be stored directly (Gupta et al. [2]; Raut [3]). Thus, NoSQL emerged not in opposition to existing SQL databases but to satisfy requirements of current industrial data that relational databases did not meet (Sahatqija et al. [4]). A NoSQL database, also called a “non-relational” database, provides a way of storing and retrieving different forms of data other than the tabular format used in SQL databases. Its key principle is to ease the storage and retrieval of data irrespective of its structure and content. A NoSQL system is a distributed non-relational database designed for large-scale data storage and parallel data processing across a large number of servers (Ali et al. [5]). This paper presents the features and characteristics of both database technologies, and in particular how NoSQL has become a new alternative to existing SQL databases. The two differ in many ways depending on the implementation requirements of the application, even though both can be used for similar applications.
18.2 Literature Review
Kausar and Nasar [6] explain that the digital world is growing rapidly and that the volume, variety, and velocity of its data make that data extremely difficult to manage. Two crucial recent changes in data management are NoSQL databases and big data analytics. Although they evolved for different reasons, their independent developments complement one another, and their convergence serves large, multifaceted datasets that can be structured, semi-structured, or unstructured. This brings great benefits to organizations in making timely decisions. On the one hand, a number of software solutions have emerged to support big data analysis; on the other, a number of NoSQL database packages are available in the market. The core purpose of their article is to understand these perspectives and provide a thorough study of the future of some of the major new NoSQL data models.
Babucea [7] notes an unparalleled increase in the amount of data stored and managed over the past few years. This growth has introduced many difficulties for the management systems of huge databases, mainly traditional ones. Most of these systems are relational databases intended to ensure data integrity and transactional consistency. A new database model, named NoSQL, has become the solution to this problem. NoSQL databases usually co-exist with SQL databases because they are complementary, not
just alternatives to SQL databases. At present, SQL databases face challenges of scalability and agility (the ability to change quickly and easily), showing that they cannot fully adapt to conditions such as cheap storage and computing power. NoSQL databases, on the other hand, are progressively demonstrating their ability to handle very large datasets. Thus, there are currently two complementary database models that share the same main purpose of creating, retrieving, updating, and managing data, each with advantages and disadvantages. Babucea's paper is intended to provide an overview of NoSQL's advantages and limitations relative to relational databases.
Corbellini et al. [8] give a broad review of NoSQL databases, covering all available types of NoSQL models. They describe the important features of each type and illustrate them with a concrete database management system that fits the type. The authors conclude that NoSQL databases perform better because of sharding and persistent data storage support, and that they can be considered depending on the requirements of the storage solution. They also mention that a hybrid data layer, which divides the application across different databases, is necessary only in a few scenarios; it allows relational databases to be used where more consistency is needed, while NoSQL databases work best for higher scalability with fast access.
Ali et al. [5] compared SQL databases with NoSQL databases and their four data models, which are easy to understand and run and do not involve the complex methods needed to optimize SQL for big data analysis. They describe NoSQL as a great tool for addressing data availability. In an SQL database, data must be put into tables; if the data does not fit the table, the structure has to be altered at that point.
NoSQL, by contrast, provides schema-free operation and the freedom to add fields to records without dealing with a schema, whose rigidity is a major limitation of SQL databases. Further, the most significant aspect of NoSQL databases is that when an alternative storage technique is needed, developers no longer have to rely on the relational model. The RDBMS is still considered important. However, the storage needs of the new generation of data are substantially different from those of traditional programs. The authors conclude that NoSQL's agile data model is well suited to dynamic scalability and improved efficiency in big data analytics.
Mihai [9] highlights that SQL and NoSQL databases are not mutually exclusive but complementary, each with merits and demerits. The article discusses the categories and properties of NoSQL data models and gives a thorough, synthetic comparison of relational and NoSQL data models. The author explains the suitability of each: SQL databases suit centralized applications, while NoSQL databases are meant for decentralized applications that must scale as data grows while remaining permanently available. The paper shows how NoSQL tends to be the better option for unstructured data and rapidly increasing data volumes, storing and processing complex data effectively.
18.3 NoSQL Database
NoSQL stands for “Not Only SQL”, which signals an approach to data that goes beyond relational databases: data need not be stored in a tabular format with specified rows and columns. It is better thought of as software that can store data according to the user's needs (Jowan et al. [10]). NoSQL databases came into existence as a large family of systems that satisfy the different requirements of organizations and companies, whose structured and unstructured data doubles every 1.2 years (Padhy and Kumaran [11]). The members of this family are known as the types of NoSQL database; each organizes and retrieves data using its own self-sufficient data model.
18.3.1 Key-Value Store
These are very effective and robust data stores built on the key-value pattern. Their application program interface (API) is easy to work with, so users can keep data in a schema-less manner as key-value pairs. They store data like hash tables, with keys used for indexing, which gives them an advantage over an RDBMS for lookups by key. The data model works like a dictionary in which a user makes requests based on specific key values. This works effectively, but such stores cannot present data in customized views, and they do not support secondary keys (Martinez-Mosquera et al. [12]). They can back online shopping carts, managing customer choices and preferences, and serve other similar purposes.
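The dictionary-like behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `KeyValueStore`, its method names, and the `cart:` key prefix are hypothetical and do not correspond to the API of any particular product.

```python
from typing import Any, Callable, Optional


class KeyValueStore:
    """Toy in-memory key-value store backed by a hash table (a Python dict)."""

    def __init__(self) -> None:
        self._data: dict[str, Any] = {}

    def put(self, key: str, value: Any) -> None:
        # The value is opaque to the store: no schema is checked.
        self._data[key] = value

    def get(self, key: str) -> Optional[Any]:
        # Lookup by key is a single hash-table probe.
        return self._data.get(key)

    def delete(self, key: str) -> bool:
        # Returns True if the key existed and was removed.
        if key in self._data:
            del self._data[key]
            return True
        return False

    def find_by_value(self, predicate: Callable[[Any], bool]) -> list[str]:
        # No secondary keys: searching by anything other than the key
        # means scanning every entry.
        return [k for k, v in self._data.items() if predicate(v)]


# Shopping-cart style usage, with the customer id embedded in the key.
cart_store = KeyValueStore()
cart_store.put("cart:alice", {"items": ["book", "pen"], "currency": "INR"})
cart_store.put("cart:bob", {"items": ["lamp"]})

alice_cart = cart_store.get("cart:alice")
carts_with_lamp = cart_store.find_by_value(lambda cart: "lamp" in cart["items"])
```

The `find_by_value` scan is included only to show the cost of the missing secondary keys: every query that is not a lookup by key has to visit all entries.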
18.3.2 Graph Store Database
These data stores are useful for maintaining relationships between data sets. They consist of nodes and edges, which work as in a graph (Martinez-Mosquera et al. [12]), and are used to draw and maintain relationships between items. They are most helpful when the relationships between items are the primary focus. They are used in social networking sites, content management systems, and cloud management services. Examples of graph databases include Neo4j and OrientDB.
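As a rough illustration of the node-and-edge model, the sketch below keeps labelled edges in an adjacency map and answers a typical social-network question. The `Graph` class, the `FRIEND` relation label, and the sample names are assumptions made for this example; real graph databases expose their own query languages.

```python
from collections import defaultdict


class Graph:
    """Toy property-less graph: nodes are strings, edges carry a relation label."""

    def __init__(self):
        # node -> set of (relation, target node)
        self._edges = defaultdict(set)

    def add_edge(self, source, relation, target):
        self._edges[source].add((relation, target))

    def neighbors(self, node, relation):
        # .get() avoids creating an empty entry for unknown nodes.
        return {t for r, t in self._edges.get(node, set()) if r == relation}

    def friends_of_friends(self, person):
        # Two-hop traversal: friends of direct friends, excluding
        # the person and the people they already know.
        direct = self.neighbors(person, "FRIEND")
        second = set()
        for friend in direct:
            second |= self.neighbors(friend, "FRIEND")
        return second - direct - {person}


social = Graph()
for a, b in [("alice", "bob"), ("bob", "carol"), ("carol", "dave")]:
    # Friendship is stored as two directed edges.
    social.add_edge(a, "FRIEND", b)
    social.add_edge(b, "FRIEND", a)

suggestions = social.friends_of_friends("alice")
```

The point of the example is that the relationship itself is the stored object, so a traversal follows edges directly instead of joining tables.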
18.3.3 Wide Column Store Databases
These data stores use a column-oriented data structure in which multiple attributes are stored against a key. They are highly scalable because users can add any number of columns to the database at any time, and values for the new columns need not be filled in for existing rows (Martinez-Mosquera et al. [12]). Because of this, they are widely used in distributed architectures that include distributed storage, passing, sorting, and batch-oriented data processing. Well-known examples are Hypertable, Cassandra, and HBase.
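The property that new columns can be added without touching existing rows can be pictured with a nested mapping from row key to column name to value. The sketch below is a simplification under that assumption; `WideColumnTable` and the sample column names are hypothetical, and column families, timestamps, and on-disk layout are left out.

```python
class WideColumnTable:
    """Toy wide-column table: each row key maps to its own set of columns."""

    def __init__(self):
        self._rows = {}  # row key -> {column name: value}

    def put(self, row_key, column, value):
        # A column exists only in the rows that actually store a value for it.
        self._rows.setdefault(row_key, {})[column] = value

    def get_row(self, row_key):
        # Return a copy so callers cannot mutate the stored row.
        return dict(self._rows.get(row_key, {}))

    def columns(self):
        # The set of all column names seen so far, across all rows.
        names = set()
        for cols in self._rows.values():
            names.update(cols)
        return names


users = WideColumnTable()
users.put("user:1", "name", "Asha")
users.put("user:1", "city", "Jaipur")
users.put("user:2", "name", "Ravi")
# A new column appears for one row only; "user:1" is left untouched.
users.put("user:2", "last_login", "2022-04-01")
```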
18.3.4 Document Store Database
These data stores are designed to manage and store documents, and they scale horizontally very efficiently. The documents have no fixed schema, although they are somewhat similar to records in a relational database. A document store accepts documents of different kinds and encapsulates them in a standard internal format such as XML, PDF, or JSON (Singh et al. [13]; Gupta et al. [2]). Each document is identified in the database by a unique key, which can be a URI, a path, or a string. Well-known document databases include MongoDB and CouchDB.
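A minimal sketch of these ideas, assuming JSON as the internal format and a path-like string as the unique key, might look as follows. `DocumentStore` and its methods are illustrative names only, not the interface of MongoDB or CouchDB.

```python
import json
import uuid


class DocumentStore:
    """Toy document store: each document is kept as JSON text under a unique key."""

    def __init__(self):
        self._docs = {}  # unique key -> JSON text

    def insert(self, document, key=None):
        # Generate a key when the caller does not supply one.
        key = key or str(uuid.uuid4())
        self._docs[key] = json.dumps(document)
        return key

    def get(self, key):
        raw = self._docs.get(key)
        return json.loads(raw) if raw is not None else None

    def find(self, field, value):
        # Documents need not share fields, so a missing field simply does not match.
        return [k for k, raw in self._docs.items()
                if json.loads(raw).get(field) == value]


store = DocumentStore()
# Two documents with different shapes live side by side.
store.insert({"name": "Asha", "courses": ["DBMS", "AI"]}, key="students/1")
store.insert({"name": "Ravi", "address": {"city": "Jaipur"}}, key="students/2")

asha = store.get("students/1")
ravi_keys = store.find("name", "Ravi")
```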
18.4 Feature Comparison of Relational Database and NoSQL Database
Many researchers have noted limitations of SQL databases that keep them from being flexible stores for structured, semi-structured, and unstructured data (Raut [3]). NoSQL databases offer features that address these long-standing issues.
18.4.1 Flexibility
During the development and evolution of a software application, changing the schema of an SQL database is difficult. SQL databases work on a static schema: a pre-defined schema is required before data is inserted. If the schema must change once data already exists, the change can cause service failures and reduced performance. NoSQL databases, in contrast, provide a dynamic schema that need not be pre-defined, so changes are easily accommodated. This feature makes NoSQL databases flexible enough to accept structured, semi-structured, or unstructured data, whereas SQL databases support only structured data (Sahatqija et al. [4]).
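The difference between a static and a dynamic schema can be made concrete with a small sketch. It uses SQLite from the Python standard library as a stand-in for a relational database and a plain list of dictionaries as a stand-in for a schema-less collection; both choices, and the `students` and `email` names, are assumptions made for illustration.

```python
import sqlite3

# Static schema: the table layout is declared before any data is inserted.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO students (id, name) VALUES (1, 'Asha')")

# A record with an undeclared field is rejected by the database.
try:
    conn.execute(
        "INSERT INTO students (id, name, email) VALUES (2, 'Ravi', 'ravi@example.com')"
    )
    rejected = False
except sqlite3.OperationalError:
    rejected = True

# The schema has to be migrated first; only then does the insert succeed.
conn.execute("ALTER TABLE students ADD COLUMN email TEXT")
conn.execute(
    "INSERT INTO students (id, name, email) VALUES (2, 'Ravi', 'ravi@example.com')"
)
rows = conn.execute("SELECT id, name, email FROM students ORDER BY id").fetchall()

# Dynamic schema: each record carries its own fields, so no migration is needed.
students = []
students.append({"_id": 1, "name": "Asha"})
students.append({"_id": 2, "name": "Ravi", "email": "ravi@example.com"})
```

On a small in-memory table the migration is instantaneous; the service and performance risks mentioned above arise when the same kind of change is applied to a large table that is in use.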
18.4.2 Scalability
Relational databases are normally scaled up, which depends on upgrading the server hardware so that it continues to work efficiently. This adds overhead for the administrators who must upgrade these databases (Abourezq and Idrissi [14]). Scalability is crucial when choosing a database management system for any software program. SQL databases support vertical scalability: as the quantity of records grows, the storage capacity and computational power of the existing node, such as the CPU, RAM, and SSD of the database server, must be expanded. This form of scalability is expensive because of the increased risk of hardware failure and the quantity of hardware required for future upgrades. NoSQL databases, on the other hand, employ horizontal scalability: as the amount of data grows, additional nodes, such as new servers in the NoSQL infrastructure, are added to provide storage and processing power (Sahatqija et al. [4]).
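One common way to spread data over additional nodes is to hash each key and assign it to a node by the hash value. The sketch below illustrates that idea with simple modulo hashing; it is an assumption chosen for brevity, not the partitioning scheme of any specific NoSQL product, and the function names are hypothetical.

```python
import hashlib


def shard_for(key: str, node_count: int) -> int:
    """Map a key to a node index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % node_count


def distribute(keys, node_count):
    """Return one list of keys per node."""
    nodes = [[] for _ in range(node_count)]
    for key in keys:
        nodes[shard_for(key, node_count)].append(key)
    return nodes


keys = [f"user:{i}" for i in range(1000)]

# Scaling out: the same data spread over three nodes, then over six.
three_nodes = distribute(keys, 3)
six_nodes = distribute(keys, 6)

largest_share_3 = max(len(n) for n in three_nodes)
largest_share_6 = max(len(n) for n in six_nodes)
```

With more nodes, each node holds a smaller share of the keys. Modulo hashing reassigns many keys whenever the node count changes, which is why distributed stores often use consistent hashing or a similar scheme instead.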
18.4.3 Availability
The number of users of services such as social media, e-commerce, and the cloud has jumped sharply. Relational databases, being scaled up on a single server, have the drawback of a single point of failure, so they do not fit the current usage of Internet applications. Users cannot rely on them because their daily activities depend on uninterrupted connections. The distributed nature of a NoSQL database keeps it usable, since part of the data can still be accessed if a failure occurs (Moniruzzaman and Hossain [15]). This lets a NoSQL database continue to serve requests despite failures in parts of the system.
18.4.4 Performance
Relational databases fall short on this point as well. They retrieve information from non-volatile storage, which is slow to access, while NoSQL databases retrieve data from volatile memory, which is fast. A comparative study also reports that NoSQL databases are fast at reading, updating, and querying, while SQL databases do well at updating values.
Table 18.1 Terminology comparison of SQL and NoSQL database
SQL | NoSQL
Database | Database
Table | Collection
Row | Document
Column | Field
Joins | Embedded documents
Aggregation (by group) | Aggregation pipeline
NoSQL covers a variety of databases whose terminology differs from that of relational databases. Keeping our future work in mind, the comparison here is therefore made between SQL and MongoDB, the most popular document-oriented database according to database ranking engines and, as Mahmood [16] explains, a strong performer among NoSQL databases for large data sets. The comparative study analyzes different databases on parameters such as features, terminology, and CRUD operations to identify the databases best able to handle all kinds of data for current industrial use cases, which earlier research compared under different criteria (Wadhwa and Kaur [17]; Arauja et al. [18]; Aghi et al. [19]; Petri [20]; Seo et al. [21]; Ceresnak and Kvet [22]) (Tables 18.1, 18.2 and 18.3).
Table 18.2 Query comparison of SQL and MongoDB
Query | SQL query | NoSQL query
Select | SELECT * FROM Students | db.students.find()
Create | CREATE TABLE Students (...) | db.createCollection("students")
Insert | INSERT INTO Students (...) VALUES (...) | db.students.insertOne({...})
Delete | DELETE FROM Students | db.students.deleteMany({})
Drop | DROP TABLE Students | db.students.drop()
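The correspondence in Table 18.2 can be tried out without a database server. The sketch below runs the SQL side against SQLite from the Python standard library and imitates the document side with a plain list of dictionaries; the list is only a stand-in for a MongoDB collection, and the comments name the MongoDB shell method that plays the same role. The `Students` columns and sample values are assumptions.

```python
import sqlite3

# --- Relational side (SQLite standing in for an SQL database) ---
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (id INTEGER PRIMARY KEY, name TEXT, grade TEXT)")
conn.execute("INSERT INTO Students (id, name, grade) VALUES (?, ?, ?)", (1, "Asha", "A"))
conn.execute("INSERT INTO Students (id, name, grade) VALUES (?, ?, ?)", (2, "Ravi", "B"))
conn.execute("UPDATE Students SET grade = ? WHERE id = ?", ("A", 2))
conn.execute("DELETE FROM Students WHERE id = ?", (1,))
sql_rows = conn.execute("SELECT id, name, grade FROM Students").fetchall()
conn.execute("DROP TABLE Students")
remaining_tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()

# --- Document side (a list of dicts standing in for a collection) ---
students = []                                   # role of db.createCollection("students")
students.append({"_id": 1, "name": "Asha", "grade": "A"})   # role of insertOne
students.append({"_id": 2, "name": "Ravi", "grade": "B"})
for doc in students:                            # role of updateOne
    if doc["_id"] == 2:
        doc["grade"] = "A"
students = [d for d in students if d["_id"] != 1]            # role of deleteOne
doc_rows = list(students)                       # role of find()
students = None                                 # role of drop()
```

Both sides end with the same single record before the table and the collection are dropped, which is the equivalence the table is meant to convey.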
Table 18.3 Feature comparison of different types of NoSQL databases (Gupta et al. [2])
Data model | Performance | Scalability | Flexibility | Complexity
Key-value store | High | High | High | None
Column store | High | High | Moderate | Low
Document store | High | High | High | Low
Graph database | Variable | Variable | High | High
18.5 Qualitative Comparison of SQL and Different NoSQL Databases
The qualitative results are quantified using a Likert scale. This helps to compare SQL and NoSQL databases on the basis of their performance and to find the database best suited to one's requirements. Many researchers have compared the two kinds of database on different parameters; the aim of this study is to highlight the best-performing databases for large volumes of data in terms of create, read, update, and delete (CRUD) operations (Tables 18.4, 18.5 and 18.6) (Figs. 18.1 and 18.2).
Table 18.4 Quantification of qualitative criteria
Criterion values with quantifiable range | Criteria covered | Explanation
High-5, Variable-4, Moderate-3, Low-2, None-1 | Performance, scalability, flexibility, complexity | From Table 18.3, move from the greatest to the lowest degree of assistance with 5 grades 5–1
Very High-5, High-4, Average-3, Slightly low-2, Low-1 | Create, Read, Write, and Delete operation | In Table 18.6, proceed from the greatest to the lowest degree of support with 5 grades 5–1
Table 18.5 Summarization on the basis of features
Features | Key-value store | Column-oriented | Document-oriented | Graph-oriented
Performance | 5 | 5 | 5 | 4
Scalability | 5 | 5 | 5 | 4
Flexibility | 5 | 3 | 5 | 5
Complexity | 1 | 2 | 2 | 5
Overall rating | 16 | 15 | 17 | 18
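The step from the qualitative labels of Table 18.3 to the scores of Table 18.5 is a direct lookup in the first scale of Table 18.4 followed by a sum. The short sketch below reproduces that calculation; the variable names are chosen here for illustration and the data is copied from the two tables.

```python
# Likert mapping from the first row of Table 18.4.
LIKERT = {"High": 5, "Variable": 4, "Moderate": 3, "Low": 2, "None": 1}

# Qualitative ratings copied from Table 18.3.
TABLE_18_3 = {
    "Key-value store": {"Performance": "High", "Scalability": "High",
                        "Flexibility": "High", "Complexity": "None"},
    "Column store": {"Performance": "High", "Scalability": "High",
                     "Flexibility": "Moderate", "Complexity": "Low"},
    "Document store": {"Performance": "High", "Scalability": "High",
                       "Flexibility": "High", "Complexity": "Low"},
    "Graph database": {"Performance": "Variable", "Scalability": "Variable",
                       "Flexibility": "High", "Complexity": "High"},
}


def to_scores(ratings):
    """Replace each qualitative label by its Likert grade."""
    return {feature: LIKERT[label] for feature, label in ratings.items()}


scores = {model: to_scores(ratings) for model, ratings in TABLE_18_3.items()}
overall = {model: sum(s.values()) for model, s in scores.items()}
```

The sums match the overall ratings in Table 18.5 (16, 15, 17, and 18). Because complexity is added like the other features, the high complexity of the graph model raises its total, which is worth keeping in mind when reading the overall rating as a ranking.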
Table 18.6 Summarization of performance on the basis of CRUD operation
Operation | MySQL | Oracle | MongoDB | Cassandra | Redis
Create | 3 | 2 | 5 | 3 | 4
Read | 1 | 2 | 5 | 2 | 4
Write | 3 | 2 | 5 | 4 | 4
Update | 1 | 1 | 4 | 4 | 4
Overall rating | 8 | 6 | 19 | 13 | 16
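The overall ratings in Table 18.6 are column sums of the per-operation scores, so they can be recomputed and ranked mechanically. The sketch below does this with the values copied from the table; the variable names are illustrative.

```python
# Per-operation scores copied from Table 18.6.
TABLE_18_6 = {
    "MySQL": {"Create": 3, "Read": 1, "Write": 3, "Update": 1},
    "Oracle": {"Create": 2, "Read": 2, "Write": 2, "Update": 1},
    "MongoDB": {"Create": 5, "Read": 5, "Write": 5, "Update": 4},
    "Cassandra": {"Create": 3, "Read": 2, "Write": 4, "Update": 4},
    "Redis": {"Create": 4, "Read": 4, "Write": 4, "Update": 4},
}

# Overall ratings as printed in the last row of Table 18.6.
PRINTED_OVERALL = {"MySQL": 8, "Oracle": 6, "MongoDB": 19, "Cassandra": 13, "Redis": 16}

# Recompute each total and rank the databases from best to worst.
totals = {db: sum(ops.values()) for db, ops in TABLE_18_6.items()}
ranking = sorted(totals, key=totals.get, reverse=True)

# Entries where the recomputed total differs from the printed one,
# stored as (printed, recomputed).
mismatches = {
    db: (PRINTED_OVERALL[db], totals[db])
    for db in totals
    if totals[db] != PRINTED_OVERALL[db]
}
```

The recomputed totals agree with the printed ratings for four of the five databases. For Oracle the listed scores sum to 7 rather than the printed 6, so either one Oracle entry or its total may be misprinted; the resulting order (MongoDB, Redis, Cassandra, MySQL, Oracle) is the same in both cases.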
Fig. 18.1 Graphical analysis of features in NoSQL database
Fig. 18.2 Graphical analysis of CRUD operation
18.6 Conclusion and Future Scope
This study surveys the comparisons between SQL and NoSQL databases, including four types of NoSQL databases that can contribute to big data analytics. These databases do not provide complicated mechanisms to optimize big data analytics. NoSQL databases provide higher availability of data and higher scalability, which is what industries need to deal with larger amounts of data. SQL databases store information in a tabular structure, which makes them harder to adopt for the data that industries currently generate: that data is unstructured or semi-structured and needs a flexible schema that can store it and make processing easier. Thus, this study concludes that NoSQL databases are more efficient to use and have a flexible data model that adapts better to scaling.
References
1. Puangsaijai, W., Puntheeranurak, S.: A comparative study of relational database and key-value database for big data applications. In: 2017 International Electrical Engineering Congress (iEECON), pp. 1–4. IEEE
2. Gupta, A., Tyagi, S., Panwar, N., Sachdeva, S., Saxena, U.: NoSQL databases: critical analysis and comparison. In: 2017 International Conference on Computing and Communication Technologies for Smart Nation (IC3TSN), pp. 293–299. IEEE (2017)
3. Raut, A.B.: NoSQL database and its comparison with RDBMS. Int. J. Comput. Intell. Res. 13(7), 1645–1651 (2017)
4. Sahatqija, K., Ajdari, J., Zenuni, X., Raufi, B., Ismaili, F.: Comparison between relational and NoSQL databases. In: 2018 41st International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), pp. 0216–0221. IEEE (2018)
5. Ali, W., Shafique, M.U., Majeed, M.A., Raza, A.: Comparison between SQL and NoSQL databases and their relationship with big data analytics. Asian J. Res. Comput. Sci. 1–10
6. Kausar, M.A., Nasar, M.: SQL versus NoSQL databases to assess their appropriateness for big data application. Recent Adv. Comput. Sci. Commun. (Formerly: Recent Patents Comput. Sci.) 14(4), 1098–1108 (2021)
7. Babucea, A.G.: SQL or NoSQL databases? Critical differences. Annals of 'Constantin Brancusi' University of Targu-Jiu, Economy Series, p. 1 (2021)
8. Corbellini, A., Mateos, C., Zunino, A., Godoy, D., Schiaffino, S.: Persisting big-data: the NoSQL landscape. Inf. Syst. 63, 1–23 (2017)
9. Mihai, G.: Comparison between relational and NoSQL databases. Econ. Appl. Inform. 3, 38–42 (2020)
10. Jowan, S.A., Swese, R.F., Aldabrzi, A.Y., Shertil, M.S.: Traditional RDBMS to NoSQL database: new era of databases for big data. J. Humanit. Appl. Sci. 29(29), 83–102 (2016)
11. Padhy, S., Kumaran, G.M.M.: A quantitative performance analysis between MongoDB and Oracle NoSQL. In: 2019 6th International Conference on Computing for Sustainable Global Development (INDIACom), pp. 387–391. IEEE (2019)
12. Martinez-Mosquera, D., Navarrete, R., Lujan-Mora, S.: Modeling and management big data in databases: a systematic literature review. Sustainability 12(2), 634 (2020)
13. Singh, N., Chandra, R., Shambharkar, S.B., Kulkarni, J.J.: A review of NoSQL databases and performance comparison between multimodel and polyglot persistence approach. IJSTR 9(01), 1522–1527 (2020)
14. Abourezq, M., Idrissi, A.: Database-as-a-service for big data: an overview. Int. J. Adv. Comput. Sci. Appl. (IJACSA) 7(1) (2016)
15. Moniruzzaman, A.B.M., Hossain, S.A.: NoSQL database: new era of databases for big data analytics-classification, characteristics and comparison (2013). arXiv:1307.0191
16. Seo, J.Y., Lee, D.W., Lee, H.M.: Performance comparison of CRUD operations in IoT based big data computing. Int. J. Adv. Sci. Eng. Inf. Technol. 7(5), 1765–1770 (2017); Mahmood, K., Orsborn, K., Risch, T.: Comparison of NoSQL datastores for large scale data stream log analytics. In: 2019 IEEE International Conference on Smart Computing (SMARTCOMP), pp. 478–480. IEEE (2019)
17. Wadhwa, M., Kaur, E.A.: Review of performance of various big databases. Int. J. Recent Innovat. Trends Comput. Commun. 5(6), 179–182
18. Araujo, J.M.A., de Moura, A.C.E., da Silva, S.L.B., Holanda, M., de Oliveira Ribeiro, E., da Silva, G.L.: Comparative performance analysis of NoSQL Cassandra and MongoDB databases. In: 2021 16th Iberian Conference on Information Systems and Technologies (CISTI), pp. 1–6. IEEE (2021)
19. Aghi, R., Mehta, S., Chauhan, R., Chaudhary, S., Bohra, N.: A comprehensive comparison of SQL and MongoDB databases. Int. J. Sci. Res. Publ. 5(2), 1–3 (2015)
20. Petri, G.: A comparison of Oracle and MYSQL. Select J. 1 (2005)
21. Seo, J.Y., Lee, D.W., Lee, H.M.: Performance comparison of CRUD operations in IoT based big data computing. Int. J. Adv. Sci. Eng. Inf. Technol. 7(5), 1765–1770 (2017)
22. Čerešňák, R., Kvet, M.: Comparison of query performance in relational a non-relation databases. Transport. Res. Proc. 40, 170–177 (2019)
Chapter 19
A Rule-Based Sentiment Analysis of WhatsApp Reviews in Telugu Language
Kalpdrum Passi and Sujay Kalakala
Abstract Sentiment analysis, or opinion mining, is one of the most active areas of research and discussion in natural language processing. It has gained popularity in the past few years, mainly because the derived results are useful: the methods analyze a given text and return results based on the sentiment it expresses, which helps with business growth as well as customer satisfaction and assistance. In this work, a rule-based approach assigns sentiment labels to WhatsApp reviews written in Telugu, and the labeled data is then used to train machine learning classifiers: K-nearest neighbors (KNN), XGBoost, and support vector machines (SVM). The classifiers reached accuracies of 81%, 82%, and 78%, respectively, which indicates that the rule-based approach performs well, with errors of 0.296, 0.288, and 0.252 with TF-IDF and 0.285, 0.285, and 0.234 with bag of words. Along the way, the assigned sentiments were also compared manually against the sentences to find errors in the method. The SVM classifier returned an F1-score of 75% and the lowest error, 0.25, of all the classifiers. The classifiers were judged by their F1-scores and their mean squared error.
Keywords Sentiment analysis · Positive · Negative · NLP · Telugu · Python · Rule-based · K-means · XGBoost · SVM · Bag of words · WhatsApp · Reviews
K. Passi (corresponding author) · S. Kalakala
Laurentian University, Sudbury, ON P32C6, Canada
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_19
19.1 Introduction
19.1.1 Background
Sentiment analysis is the examination and assessment of the feelings, or sentiments, that a piece of information conveys. The ability to identify and understand feelings automatically is a major motivation for research in this area. Sentiment analysis uses natural language processing (NLP) techniques to interpret unstructured text by converting it into a machine-readable representation and working from there. It attempts to recognize patterns that are generally imperceptible to a human observer and can only be detected in a high-dimensional representation.
19.1.2 Problem Statement
The problem is to assign a sentiment, i.e., a polarity, to every sentence in the dataset according to a consistent pattern or rule. Given the large size of the dataset and the additional information already available on the Internet, it becomes much easier to summarize the entire data in a few numbers. Each sentence is assigned one of three polarity values: 0 for a neutral sentence, −1 for a negative sentence, and +1 for a positive sentence. The next step is to build a machine learning classifier that is trained on this labeled data to predict the polarity of text. Both the dataset and the model must also be analyzed so that the techniques applied perform better and can work independently.
19.1.3 Aims and Objectives
All of the objectives of this research fall under sentiment analysis, which is the main focus of this study. The sentiment analysis is carried out in two parts. The first part is the main algorithm of the approach: a traditional rule-based method derives a sentiment value and assigns it to each sentence. This rule-based method uses the auxiliary verbs as well as the positive and negative words present in a sentence to arrive at its sentiment value. The second part uses the resulting labels to train machine learning classifiers. The following research questions are addressed:
• Can the traditional rule-based method prove to be useful and effective?
• Can the model be trained based on what the rule-based approach created?
19.2 Related Work
Our everyday lives are influenced by what other people think, and the conclusions and assessments of experts have always shaped our views. The explosion of Web 2.0 has led to a surge in activities such as tagging, social networking, blogging, contributing to RSS, and podcasting [1]. As a result, interest in mining these data sets has also grown. Sentiment analysis, or opinion mining, is a technique for handling opinions, sentiments, and subjectivity in text [2]. The literature examines the challenges and applications of sentiment analysis, and the different approaches to computing sentiments and opinions [3]. Data-driven techniques for sentiment analysis, such as Naive Bayes, SVM, maximum entropy, and the perceptron, have been discussed along with their strengths and limitations [4]. Another line of work draws on cognitive psychology, in particular the work of Deng and Wiebe [1], which considers how to infer sentiment and perspective in narrative and how to understand the structure of discourse. Specific topics within sentiment analysis and current activity in those areas have also been covered [5].

One study [6] applies sentiment analysis to the Bangla language. It uses text-processing steps such as normalization, tokenization, and stemming. The dataset is in Bangla and was obtained from a GitHub repository. The algorithm, called Bangla Text Sentiment Score (BTSS), is rule-based and follows an approach similar to the one used here; that study inspired the methodology of this research. The rule-based approach is used to detect the positive and negative words in a sentence [7]. This list of words is called a Lexicon Data Dictionary (LDD) [8]. In addition, a supervised learning algorithm, the support vector classifier (SVC), was used and gave an accuracy of about 82%, which shows the effectiveness and precision of the overall methodology [9]. The polarities were calculated and converted into 1 and 0 for applying machine learning algorithms [10]. Since the unlabeled data was turned into a labeled set, it was much easier to fit a model and use a supervised learning algorithm to predict the sentiment polarity [11].

The research problem addressed in this study is the same as in other natural language processing work [12] on sentiment analysis: capture the essence of the text and extract the information needed to predict its sentiment [13]. However, the Telugu language differs from the languages used in other research, as does the procedure followed for predicting the sentiment of the text [14]. For Telugu, some aspects of the process must be modified to incorporate language-specific information into the underlying algorithm [15]. This supports the view that multiple approaches [16] to the problem are available and that they can be combined for an efficient evaluation of sentiment in different languages [17].
19.3 Data and Methods
19.3.1 Data Extraction
Data analysis often requires data in large volumes, and the data is so diverse and so large that there is no option other than to automate its extraction. Automation also helps when client feedback has to be extracted. This creates the need for tools such as APIs that grant access to the necessary data without the trouble of fetching it manually for every single user. With such an API, data is requested from a server and the API services the request by fetching the necessary data [11]. For this purpose, google-play-scraper was used, a Python package for scraping reviews of apps in the Play Store. Using Python in a cloud-based environment, the data was extracted and the dataset was then cleaned and structured [18]. The module was used to extract WhatsApp reviews from the Play Store with Python scripts so that the data could be processed and used for sentiment analysis. It is usually advisable to collect as many reviews as possible, but since this was for research and study purposes, a total of 5000 reviews in the Telugu language were collected and used for developing the rule-based model as well as for training and testing the classifiers. The package acts as a Web crawler over the Play Store and can also extract application information other than reviews. It provides a reviews function that performs all the necessary operations automatically [19]. The function returns two outputs: the first is the list of reviews, and the second is a continuation token that can be passed back to fetch the next batch. This is how the data was obtained. Figure 19.1 shows the word cloud of the WhatsApp data in Telugu scraped from the Play Store.
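The extraction step might look roughly like the following sketch. It is not the authors' script: the helper names, the batch size, and the `lang="te"` and `country="in"` arguments are assumptions made here for illustration. The scraper is imported inside the fetching function so that the pure helper can be used without the package or network access.

```python
def review_texts(batch):
    """Return the non-empty review bodies from a list of review dicts."""
    return [review["content"] for review in batch if review.get("content")]


def fetch_telugu_reviews(app_id="com.whatsapp", count=200):
    """Fetch one batch of reviews from the Play Store (needs network access)."""
    # Imported lazily so that review_texts() works without the package.
    from google_play_scraper import Sort, reviews

    batch, _continuation_token = reviews(
        app_id,
        lang="te",        # assumed language code for Telugu
        country="in",     # assumed store country
        sort=Sort.NEWEST,
        count=count,      # assumed batch size
    )
    return review_texts(batch)
```

Calling `fetch_telugu_reviews()` repeatedly with the continuation token would be one way to accumulate a larger sample such as the 5000 reviews used here; that loop is omitted to keep the sketch short.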
19.3.2 Data Pre-processing
Preprocessing of text data involves data cleaning and vectorization. Data cleaning is the first step after the dataset has been collected and loaded into the notebook. It applies techniques such as removing punctuation and stop words.
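A minimal sketch of the cleaning step is shown below, assuming whitespace tokenization and a tiny, hypothetical stop-word list (`STOP_WORDS`); the authors' actual lists are not given in the text. Punctuation is removed by Unicode category rather than with a pattern such as `[^\w\s]`, because Telugu vowel signs are combining marks and a word-character regex may strip them.

```python
import unicodedata

# Hypothetical stop-word list for illustration; a real one would be much longer.
STOP_WORDS = {"ఇది", "ఈ"}


def strip_punctuation(text):
    """Drop every character whose Unicode category is punctuation (P*)."""
    return "".join(
        ch for ch in text if not unicodedata.category(ch).startswith("P")
    )


def clean(text, stop_words=STOP_WORDS):
    """Remove punctuation, split on whitespace, and drop stop words."""
    return [tok for tok in strip_punctuation(text).split() if tok not in stop_words]


print(clean("ఇది చాలా బాగుంది!"))
```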
Fig. 19.1 Word cloud of the data that was extracted
19.4 Sentiment Analysis
19.4.1 Rule-Based Methodologies
In a rule-based approach, the algorithm derives the polarity of a sentence from the rules defined in the algorithm. A rule-based approach can be carried out in many ways. For example, it can count the number of times the word 'good' or 'bad' occurs in a sentence and assess the score on that basis. Alternatively, the count of negative words in the sentence can be subtracted from the count of positive words, and the result is the polarity score. The idea is simple to read and examine, but when it is applied on a larger scale with appropriate data, the results can be very good [20]. Table 19.1 shows examples of auxiliary verbs. The rules that were used to treat the data and create polarities are given in Table 19.2.

The process in this research relied heavily on Telugu words. A text file was created that contained the main words of the sentences. It was then divided into two files that held the positive and the negative words separately, such as 'good', 'super', 'disgusting', and 'waste'. These words were very important for capturing the intent of a sentence. Each sentence was checked against both lists, and the score was generated from the count of positive and negative words in the sentence. Further help came from similarly structured lists of auxiliary verbs, divided into positive auxiliary verbs and negative auxiliary verbs. If the sentence at hand does not contain any such combination, the score is calculated from the counts of positive and negative words in the sentence, i.e., positive words − negative words = polarity [21].

Table 19.1 Examples of auxiliary verbs
| Positive auxiliary verbs in Telugu | Translation in English | Negative auxiliary verbs in Telugu | Translation in English |
|---|---|---|---|
| | I am | | Won't be there |
| | Located | | Nope |
| | Is | | Nope |
| | Are | | Refuse |
| | Will be | | Resistance |
| | Receives | | |
| | Takes | | |

Table 19.2 Rules to create polarities with examples
| Rules | Sentence in Telugu | Sentence in English | Polarity |
|---|---|---|---|
| Rule 1: POS + POS (aux verb) = POS | Vātāvaranam chala bāgā (pos) + undi (pos aux verb) = Positive statement | The weather is very good | Positive |
| Rule 2: POS + NEG (aux verb) = NEG | Nāku eṇḍa lō bayaṭa ki pōdāṁ iṣṭaṁ (pos) lēdu (neg aux verb) = Negative statement | I do not like to go out in the sun | Negative |
| Rule 3: NEG + POS (aux verb) = NEG | Vātāvaraṇaṁ cālā dāruṇaṅgā (neg) undi (pos aux verb) = Negative statement | The weather is so bad | Negative |
| Rule 4: NEG + NEG = NEU | Vātāvaraṇaṁ anta ceṭṭagā (neg) lēḍu (neg aux verb) = Neutral statement | The weather is not so bad | Neutral |
| Rule 5: POS (words) − NEG (words) | Nēnu ī vātāvaraṇānni iṣṭapaḍutunnānu (pos) kānī nēnu bayaṭaku veḷlaḍāniki iṣṭapaḍanu (neg) | I like this atmosphere, but I don't like to go out | Neutral |

Table 19.3 Output of the rule-based algorithm
| Sentence | Output polarity |
|---|---|
| | +1 |
| | −1 |
| | 0 |

Table 19.3 gives an example output of the rule-based algorithm. In the output, +1 indicates a positive statement, 0 a neutral statement, and −1 a negative statement. A simple example from this output shows how the rule-based algorithm works [21]. Take the sentence with polarity −1, written here in transliterated form for easy understanding: Man̄chidi (pos) + kādu (neg aux verb) = −1. Here manchidi (good) is a positive word and kadu (not) is a negative auxiliary verb, so the result follows Rule 2: POS + NEG (aux verb) = NEG.
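The five rules of Table 19.2 can be sketched as a small function. This is an illustration under stated assumptions, not the authors' implementation: the mini-lexicons are hypothetical ASCII transliterations of the example words, the last auxiliary verb in the sentence is assumed to decide, and the Rule 5 difference is clamped to −1, 0, or +1 to match the three polarity values.

```python
# Hypothetical mini-lexicons: ASCII transliterations of words from Table 19.2.
POS_WORDS = {"baga", "ishtam", "manchidi"}
NEG_WORDS = {"darunanga", "chettaga"}
POS_AUX = {"undi"}
NEG_AUX = {"ledu", "kadu"}


def sign(n):
    """Clamp an integer to -1, 0, or +1."""
    return (n > 0) - (n < 0)


def polarity(sentence):
    """Return +1, 0, or -1 for a whitespace-tokenized sentence."""
    tokens = sentence.lower().split()
    word = sign(sum(t in POS_WORDS for t in tokens)
                - sum(t in NEG_WORDS for t in tokens))
    aux = [1 if t in POS_AUX else -1
           for t in tokens if t in POS_AUX or t in NEG_AUX]
    if word and aux:
        last_aux = aux[-1]  # assumption: the final auxiliary verb decides
        if word > 0 and last_aux > 0:
            return 1        # Rule 1: POS + POS aux = POS
        if word < 0 and last_aux < 0:
            return 0        # Rule 4: NEG + NEG aux = NEU
        return -1           # Rules 2 and 3: mixed signs = NEG
    return word             # Rule 5: sign of (positive - negative) count


print(polarity("manchidi kadu"))
```

The example call reproduces the worked example above: a positive word followed by a negative auxiliary verb falls under Rule 2.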
19.4.2 Using the Polarities to Train the Data
Building on the rule-based methodology, its output is used to create labels for an otherwise unlabeled dataset. An unlabeled dataset cannot be used for supervised machine learning; each sample first needs to be related to a target class for sentiment analysis [22]. Several traditional classifiers were first applied to the data to find the model that performs best.

Bigrams and Trigrams (N-gram range)
These two terms are important in natural language processing. In vectorization, N-grams refer to vectorizing text based on a window of words, for example two words at a time. Instead of creating a vocabulary of all the unique words, bigrams (a two-word window) create a vocabulary of all the two-word combinations that appear [22]. For example, for the sentence 'This application is great to use', a normal vectorizer produces the vocabulary ['This', 'application', 'is', 'great', 'to', 'use']. With bigrams, the vocabulary is ['This application', 'application is', 'is great', 'great to', 'to use']. This lets the algorithm pick up more information from the same amount of data. Trigrams similarly use a window of three words at a time. These can be used in combination with one another and with unigrams (single words). In this research, two different feature sets were made for training the models:
1. Count vectorizer with unigrams.
2. TF-IDF vectorizer with unigrams.
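The windowing described above can be sketched in a few lines of plain Python. The function name `ngrams` is chosen here for illustration; the tokenization is a simple whitespace split.

```python
def ngrams(tokens, n):
    """Return all contiguous n-word windows of a token list as strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = "This application is great to use".split()
unigrams = ngrams(tokens, 1)
bigrams = ngrams(tokens, 2)
trigrams = ngrams(tokens, 3)

print(bigrams)
```

In scikit-learn, `CountVectorizer` and `TfidfVectorizer` expose the same idea through their `ngram_range` parameter, where `ngram_range=(1, 1)` corresponds to unigrams only.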
19.5 Results and Discussion
19.5.1 Evaluation Metrics
The final step after training the predictive models is to evaluate their performance. Before the models are evaluated with metrics, a model selection technique is applied to each algorithm to check its validation accuracy. Model selection is used to choose the best-performing machine learning model among several trained models [23].

K-fold Cross-Validation
Cross-validation is a technique that tests a machine learning model against new, small samples of the data. It validates the model over multiple test runs on the data [24]. The value of K determines the number of groups the data is split into and, with it, the number of times the model is tested [6]. It is widely used because it helps estimate the model's potential with little bias [11]. Figure 19.2 shows a snippet of the code for applying cross-validation. In this research, model selection returns a distinct validation accuracy score for each model, which helps in selecting one final model [20]. The number of folds selected was K = 5 (fivefold cross-validation). Table 19.4 gives the cross-validation accuracy of all the models for the two vectorization techniques used in this research, TF-IDF and count vectorizer (bag of words). The most important benefit of cross-validation over ordinary validation is that all the data is used for both training and testing [21]. In ordinary validation, a single fixed subset serves as the validation set. In K-fold cross-validation, the validation subset changes K times [11]. From Table 19.4, we observe that the validation accuracy of KNN with TF-IDF in the first fold is lower than in the other folds.
Fig. 19.2 Code snippet—cross-validation applied in Python
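The code in Fig. 19.2 is not reproduced in the text, so the following is only a sketch of how a fivefold split works, written in plain Python with a hypothetical helper `k_fold_indices`. It assumes contiguous folds without shuffling.

```python
def k_fold_indices(n_samples, k=5):
    """Split range(n_samples) into k contiguous folds.

    Yields (train_indices, test_indices) once per fold.
    """
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # Spread any remainder over the first `extra` folds.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


folds = list(k_fold_indices(10, k=5))
for train, test in folds:
    print("test fold:", test)
```

Every sample lands in exactly one test fold and in the training set of the other four, which is the property the text attributes to K-fold cross-validation. With scikit-learn, `cross_val_score(estimator, X, y, cv=5)` returns one score per fold, which matches the five columns of Table 19.4.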
Table 19.4 Cross-validation scores for all models
| Model | First fold | Second fold | Third fold | Fourth fold | Fifth fold |
|---|---|---|---|---|---|
| KNN count vec | 0.83315959 | 0.82229071 | 0.83581295 | 0.82723325 | 0.82974535 |
| KNN TF-IDF | 0.61189072 | 0.82869596 | 0.83126569 | 0.82484623 | 0.82598592 |
| SVM count vec | 0.79492568 | 0.78734048 | 0.79611921 | 0.78692337 | 0.78696898 |
| SVM TF-IDF | 0.78317735 | 0.77902967 | 0.78420702 | 0.77886202 | 0.77806452 |
| XGBoost count vec | 0.83065542 | 0.82397648 | 0.83305868 | 0.82345284 | 0.82251208 |
| XGBoost TF-IDF | 0.83398609 | 0.82542988 | 0.83611121 | 0.82518452 | 0.82664806 |
19.5.2 Results
Table 19.5 gives the notation used in the performance graphs for TF-IDF and bag of words (BOW) for each classifier [11]. Figure 19.3 shows the precision, recall, and F1-score of the XGBoost algorithm for the TF-IDF and BOW methods. In Fig. 19.3, it can be observed that TF-IDF gives the same precision, recall, and F1-score values as the BOW method for XGBoost. Figure 19.4 shows the precision, recall, and F1-score of the SVM algorithm for the TF-IDF and BOW methods.

Table 19.5 Notation used for TF-IDF and bag of words (BOW) for each classifier
| TF-IDF | Bag of words |
|---|---|
| XGB | XGB counts |
| SVM | SVM counts |
| KNN | KNN counts |
Fig. 19.3 Model evaluation chart for the XGBoost classifier
Fig. 19.4 Model evaluation chart for the SVM classifier
It can be observed in Fig. 19.4 that the count vectorizer (BOW) gives better performance than the TF-IDF method with SVM. Figure 19.5 shows the precision, recall, and F1-score of the KNN algorithm for the TF-IDF and BOW methods. It can be observed that TF-IDF gives a higher precision than BOW, although the recall and F1-score are lower than with BOW. In Fig. 19.5, the KNN classifier returns an F1-score of 81.51, which is greater than the SVM score but less than the XGBoost score. Table 19.6 gives the precision, recall, and F1-score of the three classifiers for the TF-IDF and BOW methods. It can be observed that XGBoost gives the highest precision of 86.25%, recall of 83.21%, and F1-score of 82.58% for both TF-IDF and BOW. KNN and SVM perform better with BOW than with TF-IDF, whereas XGBoost gives the same performance for both.

Fig. 19.5 Model evaluation chart for the KNN classifier

Table 19.6 Test scores for the models
| Model | Precision | Recall | F1-score |
|---|---|---|---|
| KNN counts | 83.29 | 81.98 | 81.51 |
| KNN TF-IDF | 84.52 | 81.63 | 81.32 |
| SVM counts | 83.74 | 79.61 | 78.73 |
| SVM TF-IDF | 82.62 | 78.78 | 77.79 |
| XGBoost counts | 86.25 | 83.21 | 82.58 |
| XGBoost TF-IDF | 86.25 | 83.21 | 82.58 |

The error rate can be calculated using the mean squared error, but because this is a language-based analysis, the results can only be verified manually to identify the best model [21]. No comparable study on similar data is available to compare the results against, so the verification relies on a native reader. The mean squared error formula and the code snippet for calculating it are as follows.

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(\text{Predicted}_i - \text{Original}_i\right)^2$$

    from sklearn.metrics import mean_squared_error

    def calc(y_true, y_test, name):
        # Mean squared error between the original labels and the predicted labels
        scores = mean_squared_error(y_true, y_test)
        print('MSE for {} is -----> {}'.format(name, scores))

Table 19.7 Error rates for the models
| MSE | Error rate |
|---|---|
| MSE for predict_SVM | 0.2522239665096808 |
| MSE for predict_KNN | 0.29618001046572473 |
| MSE for predict_XGB | 0.2880690737833595 |
| MSE for predict_SVM (BOW) | 0.23443223443223443 |
| MSE for predict_KNN (BOW) | 0.2859759288330717 |
| MSE for predict_XGB (BOW) | 0.2851909994767138 |
Table 19.7 gives the error rates for the three classifiers for TF-IDF and BOW. It can be observed that SVM gives the least error of 0.23443223 for BOW and 0.2522239 for TF-IDF. XGBoost gives the second-best results with an error of 0.285190 for BOW and 0.288069 for TF-IDF. KNN gives the highest error for both TF-IDF and BOW. From the performance metrics and the error rates, it can be concluded that XGBoost and SVM give the best performance in predicting the polarity of the Telugu text [20].
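With polarity labels of −1, 0, and +1, the mean squared error is not a plain misclassification rate: confusing a positive sentence with a negative one costs 4, while confusing either with a neutral one costs 1. The sketch below illustrates this with made-up labels; the function `mse` and the example lists are hypothetical and are not taken from the study's data.

```python
def mse(y_true, y_pred):
    """Mean of squared differences between two equal-length label lists."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [1, 1, -1, 0]
near_miss = [1, 0, -1, 0]    # one positive predicted as neutral
full_flip = [1, -1, -1, 0]   # one positive predicted as negative

print(mse(y_true, near_miss))
print(mse(y_true, full_flip))
```

Both predictions contain exactly one wrong label, yet the second scores four times worse, so two classifiers with similar accuracy can differ noticeably in this error measure.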
19.6 Conclusion and Future Work
19.6.1 Conclusion
This study was made possible by reviewing related work on complex and less commonly studied languages. The preprocessing was handled like any other natural language processing task and was carried out on the dataset without issues. The rule-based methodology turned out to be very helpful and provided a better way than the approach generally followed. The experiments with two vectorization methods (bag of words and TF-IDF) also informed the selection of the final model, which was then tuned and optimized. The count vectorizer provided better results, since the dataset did not contain many instances. The K-nearest neighbors and support vector machine (SVM) models were not far behind in F1-score, but the better scores of XGBoost with both the count vectorizer and TF-IDF suggest that it is the ideal choice. The SVM model, however, had a lower error than the XGBoost model, which created a discrepancy between the results. The error rate was calculated using the mean squared error. SVM gave the least error of 0.23443223 for BOW and 0.2522239 for TF-IDF. XGBoost gave the second-best results with an error of 0.285190 for BOW and 0.288069 for TF-IDF.
19.6.2 Future Work
In future work, more experimental analysis can be done on the data, such as adding new rules: the lists of verbs and words can be extended to capture more words and assign better-adjusted values to the sentences. Another approach to explore is stacking the SVM and XGBoost models, since both perform almost equally well; combining the two into a new and better model is something to carry out in future research. Further findings are left for future work.
References
1. Deng, L., Wiebe, J.: How can NLP tasks mutually benefit sentiment analysis? A holistic approach to sentiment analysis. In: Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, pp. 53–59 (2016)
2. Kumar, A., Srinivasan, K., Cheng, W., Zomaya, A.: Hybrid context enriched deep learning model for fine-grained sentiment analysis in textual and visual semiotic modality social data. Inf. Process. Manage. 57(1), 102141 (2020). https://doi.org/10.1016/j.ipm.2019.102141
3. Saad, S., Yang, J.: Twitter sentiment analysis based on ordinal regression. IEEE Access 7, 163677–163685 (2019). https://doi.org/10.1109/access.2019.2952127
4. Namugera, F., Wesonga, R., Jehopio, P.: Text mining and determinants of sentiments: Twitter social media usage by traditional media houses in Uganda. Comput. Social Netw. 6(1) (2019). https://doi.org/10.1186/s40649-019-0063-4. Accessed 15 July 2021
5. Fang, Y., Tan, H., Zhang, J.: Multi-strategy sentiment analysis of consumer reviews based on semantic fuzziness. IEEE Access 6, 20625–20631 (2018). https://doi.org/10.1109/access.2018.2820025
6. Cherif, W.: Optimization of KNN algorithm by clustering and reliability coefficients: application to breast-cancer diagnosis. Proc. Comput. Sci. 127, 293–299 (2018). https://doi.org/10.1016/j.procs.2018.01.125. Accessed 14 July 2021
7. Nemes, L., Kiss, A.: Social media sentiment analysis based on COVID-19. J. Inf. Telecommun. 5(1), 1–15 (2020). https://doi.org/10.1080/24751839.2020.1790793. Accessed 15 July 2021
8. Mukhtar, N., Khan, M.: Effective lexicon-based approach for Urdu sentiment analysis. Artif. Intell. Rev. 53(4), 2521–2548 (2019). https://doi.org/10.1007/s10462-019-09740-5
9. Poecze, F., Ebster, C., Strauss, C.: Social media metrics and sentiment analysis to evaluate the effectiveness of social media posts. Proc. Comput. Sci. 130, 660–666 (2018). https://doi.org/10.1016/j.procs.2018.04.117
10. El Alaoui, I., Gahi, Y., Messoussi, R., Chaabi, Y., Todoskoff, A., Kobi, A.: A novel adaptable approach for sentiment analysis on big social data. J. Big Data 5(1) (2018). https://doi.org/10.1186/s40537-018-0120-0. Accessed 15 July 2021
11. Li, C., et al.: Using the K-nearest neighbor algorithm for the classification of lymph node metastasis in gastric cancer. Comput. Math. Methods Med. 2012, 1–11 (2012). https://doi.org/10.1155/2012/876545. Accessed 14 July 2021
12. Khattak, A., et al.: Tweets classification and sentiment analysis for personalized tweets recommendation. Complexity 2020, 1–11 (2020). https://doi.org/10.1155/2020/8892552. Accessed 15 July 2021
13. Ray, P., Chakrabarti, A.: A mixed approach of deep learning method and rule-based method to improve aspect level sentiment analysis. Appl. Comput. Inf. (2020). https://doi.org/10.1016/j.aci.2019.02.002
14. Bhowmik, N., Arifuzzaman, M.: Bangla text sentiment analysis using supervised machine learning with extended lexicon dictionary. Nat. Lang. Process. Res. 1(3–4), 34–35 (2019)
15. Deepa, N., et al.: An AI-based intelligent system for healthcare analysis using ridge-adaline stochastic gradient descent classifier. J. Supercomput. 77(2), 1998–2017 (2020). https://doi.org/10.1007/s11227-020-03347-2. Accessed 14 July 2021
16. Nguyen, D., et al.: How we do things with words: analyzing text as social and cultural data. Frontiers Artif. Intell. 3 (2020). https://doi.org/10.3389/frai.2020.00062. Accessed 15 July 2021
17. Alonso, M., Vilares, D., Gómez-Rodríguez, C., Vilares, J.: Sentiment analysis for fake news detection. Electronics 10(11), 1348 (2021). https://doi.org/10.3390/electronics10111348. Accessed 15 July 2021
18. scikit-learn: machine learning in Python - scikit-learn 0.16.1 documentation. Scikit-learn.org (2021). [Online]. https://scikit-learn.org/. Accessed 14 July 2021
19. Steele, M.: Scraping and storing Google Play app reviews with Python. Medium (2021). [Online]. https://python.plainenglish.io/scraping-storing-google-play-app-reviewswith-python-5640c933c476. Accessed 14 July 2021
20. sklearn.linear_model.SGDClassifier - scikit-learn 0.24.2 documentation. Scikit-learn.org (2021). [Online]. http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.SGDClassifier.html. Accessed 15 July 2021
21. Saedsayad, K.: Nearest Neighbors–Classification (2017). [Online]. https://www.saedsayad.com/k_nearest_neighbors.htm. Accessed 14 July 2021
22. Explainer: what is WhatsApp? - Webwise.ie (2021). [Online]. https://www.webwise.ie/parents/explainer-whatsapp/. Accessed 14 July 2021
23. Pocs, M.: Lovecraft with natural language processing, Part 1: rule-based sentiment analysis. Medium (2020). [Online]. https://towardsdatascience.com/lovecraft-with-naturallanguage-processing-part-1-rule-based-sentiment-analysis-5727e774e524
24. Manukonda, D.P., Kodali, R.: Phrase-based heuristic sentiment analyzer for the Telugu language. J. Emerg. Technol. Innov. Res. (JETIR) 6(3)
Chapter 20
Face Model Generation Using Deep Learning
Rajanidi Ganesh Phanindra, Nudurupati Prudhvi Raju, Thania Vivek, and C. Jyotsna
Abstract Computer vision systems have widely adopted learning with neural networks in recent years. Unsupervised learning with convolutional neural networks, on the other hand, has received less attention. The proposed method helps to close the gap between the performance of convolutional networks and that of other machine learning algorithms. The goal of this paper is to use deep convolutional generative adversarial networks, a type of convolutional neural network, to create fake face images. The paper builds on research demonstrating that deep convolutional generative adversarial networks outperform plain generative adversarial networks. By training on image datasets, the deep convolutional adversarial pair learns a series of representations in both the discriminator and the generator, leading to realistic face images. The results obtained with deep convolutional generative adversarial networks are shown at the end of the paper.
Keywords Convolutional networks · Deep convolutional generative adversarial networks · Deep learning · Discriminator · Generator · Unsupervised learning
R. G. Phanindra (B) · N. P. Raju · T. Vivek · C. Jyotsna
Department of Computer Science and Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Bengaluru, India
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_20

20.1 Introduction
Generative Adversarial Networks (GANs) are a type of generative model that can produce an entire image in parallel, and they are trained with unsupervised learning. Like most other kinds of generative models, GANs use a differentiable function represented by a neural network (the generator network). In a GAN, a generator model produces fake examples, whereas a discriminator model decides whether the image it receives is real or fake [1]. This was originally demonstrated with relatively simple fully connected networks. Deep Convolutional Generative Adversarial Networks (DCGANs) are like GANs, but they use convolutional networks instead of the fully connected networks used in vanilla GANs. These convolutional networks help to find deep correlations within images; in other words, they capture spatial correlation. This makes DCGANs a better option for pictures and videos.

This paper uses a DCGAN to design and create a face generation model. A collection of networks and deep learning techniques are used to build the model [2], which is trained on the CelebA dataset. One application of DCGAN face generation is the face model of the actor Keanu Reeves from the movie John Wick, which was used in the game Cyberpunk 2077. Another well-known application is the creation of characters for a new anime from features extracted from the characters of an older anime [3]. Adversarial algorithms are also used increasingly in the security and defence fields. Fake faces are widely used in espionage, where they help to preserve a spy's identity by providing faces that do not exist. Icons8, an IT company, offers to sell diverse photos for marketing brochures and for dating apps that intend to use the images in a chatbot.

Section 20.2 discusses the research done on GAN models and the creation of DCGANs using the Convolutional Neural Network (CNN) architecture, which are more stable than previous GANs. It also discusses the different types of GANs and their drawbacks. Section 20.3 gives an overview of the proposed system design and introduces the generator and the discriminator. Section 20.4 discusses the implementation of the design. Section 20.5 presents the results obtained and discusses them. Section 20.6 concludes the work and outlines some ideas for future scope.
20.2 Related Work
Past attempts to scale up GANs using CNNs to model images have been unsuccessful. The authors of [4] also ran into problems when attempting to scale GANs using the CNN architectures typically used in the supervised literature. After extensive model exploration, they identified a family of architectures that resulted in stable training across a variety of datasets and allowed higher-resolution and deeper generative models to be trained. At the heart of their approach is the adoption and modification of three recently demonstrated changes to CNN architectures. The first is the all-convolutional net, which replaces deterministic spatial pooling functions such as max-pooling with strided convolutions, allowing the network to learn its own spatial downsampling. They use this approach in the generator, allowing it to learn its own spatial upsampling, and in the discriminator. The second is the trend towards eliminating fully connected layers on top of convolutional features. They found that global average pooling improved model stability but slowed convergence. A good compromise was to connect the highest convolutional features directly to the input of the generator and the output of the discriminator, respectively. The third is batch normalisation, which normalises the input to each unit to have zero mean and unit variance. This helps gradient flow in deeper models and deals with training problems caused by poor initialisation.
With the exception of the output layer, which uses the Tanh function, the generator uses the ReLU activation. They observed that using a bounded activation allowed the model to learn more quickly to saturate and cover the colour space of the training distribution. Within the discriminator, they found that the leaky rectified activation worked well, especially for higher-resolution modelling. They propose a more stable set of architectures for training generative adversarial networks and show that adversarial networks learn good representations of images for supervised learning and generative modelling [4]. Traditional GAN models lack control over the style of the synthetic images they produce. The StyleGAN architecture introduces control over the style of the generated images at various levels of detail. When the StyleGAN architecture was used to create synthetic human faces, it produced impressive results [5]. Although DCGANs are known to achieve acceptable accuracy in the end, they are susceptible to instability during training. The RDCGAN, which was inspired by Mode Regularised GANs, shows that adding more feature maps to a model not only improves its efficiency as a representation learning method for supervised tasks but also stabilises the mode in GANs. It has also been found that using the reconstruction error can produce a meaningful error curve that corresponds to the quality of the produced images [6]. A new training method for GANs has also been described. The key idea is to grow both the generator and the discriminator progressively: starting from a low resolution, new layers that model increasingly fine details are added as training progresses. This both speeds up and stabilises the training, making it possible to create images of exceptional quality [7].
The framework for estimating generative models via an adversarial process, in which two models are trained at the same time, has both advantages and disadvantages. Adversarial networks may gain a statistical advantage from the generator network not being updated directly with data examples, but only with gradients flowing through the discriminator; that is, components of the input are not copied directly into the generator's parameters. A further advantage of adversarial networks is that they can represent very sharp, even degenerate distributions, whereas methods based on Markov chains require the distribution to be somewhat blurry so that the chains can mix between modes [8].
20.3 Proposed System Design
A generative adversarial network works essentially as a two-player game. The generator network is given a noise vector as input. This network, which consists of deep convolutional layers, transforms the noise vector into a face image. This "fake face" is then given to the discriminator along with an image from the training set. The discriminator's task is to identify which of the two images is the fake one.
Fig. 20.1 Architecture of a generator
The generator is trained to reduce the accuracy of the discriminator and the discriminator is trained to improve its prediction accuracy. This way both the discriminator and the generator are learning from each other to beat each other.
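The game described above is usually written as the minimax objective introduced by Goodfellow et al. [8]. The symbols below (the discriminator D, the generator G, the data distribution p_data and the noise prior p_z) follow the standard notation of that work and are not defined elsewhere in this chapter:

```latex
\min_{G} \max_{D} V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_{z}(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator maximises V by assigning high probability to real images and low probability to generated ones, while the generator minimises it by producing images that the discriminator scores as real.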
20.3.1 Generator
Random noise is given to the generator network as input, and the network (a differentiable function represented by a neural network) transforms and reshapes it into a structure with the same form as the images in the dataset. The random noise is the input that determines the generator's output: when the generator network is applied to a wide range of random noise inputs, it generates a wide range of useful output images. The model architecture of the generator network is shown in Fig. 20.1 [9].
20.3.2 Discriminator
The discriminator network is a classifier that outputs the probability that an image is real. The images given to the discriminator consist of 50% real images and 50% fake images produced by the generator. The discriminator's aim is to assign real images a probability close to one and fake images a probability close to zero. The generator, on the other hand, aims to generate fake images to which the discriminator assigns a probability close to one [10]. As training progresses, the discriminator becomes more adept at telling real images from fake ones. As a result, the generator is forced to update so that it produces more realistic samples that can fool the discriminator. The general architecture of the discriminator network is shown in Fig. 20.2 [9].
Fig. 20.2 Architecture of a discriminator
20.4 Implementation
The implementation phase is divided into several steps, ranging from data loading to defining and training the adversarial networks.
20.4.1 Data Loader
The dataset used is CelebA, which contains a wide range of poses and backgrounds. The data is obtained by cropping the images and resizing them into 64 × 64 × 3 images of NumPy type. Some images from the dataset are shown in Fig. 20.3. The data loader returned by the loading function shuffles and batches these images as tensors. The batch size is 256 and the image size used for training is 32 × 32 [11].
Fig. 20.3 A few images from the dataset
The data transformation is passed to the image folder wrapper [12]. A get-data function is defined to load the processed data: the images are transformed by resizing and centre cropping and are converted to tensor form [13]. After the transformation, the images are wrapped in an image folder dataset and shuffled, and the transformed data is then loaded through a separate data loader class. Once data loading is complete, sample images can be visualised. For display, the images are converted to NumPy arrays [14], and a helper display function is created to show them. One batch of training images is taken from all the batches and plotted with the corresponding labels. In the pre-processing step, the value range of the images is rescaled. The pixel values are mapped into the output range of the generator's tanh activation, so the images are rescaled to the feature range −1 to 1 [15]. After pre-processing, the range of the scaled images can be checked to confirm that it falls between −1 and 1, for example Min: tensor(−0.96978), Max: tensor(0.98691).
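The rescaling step can be illustrated with a minimal sketch. It assumes that pixel values arrive in the range 0 to 1 (as they do after conversion to tensors) and uses plain Python lists rather than tensors; the function name `scale` and its `feature_range` parameter are illustrative and are not taken from the chapter.

```python
def scale(pixels, feature_range=(-1.0, 1.0)):
    """Map pixel values from [0, 1] to feature_range (default [-1, 1]).

    The default range matches the output range of a tanh activation.
    """
    low, high = feature_range
    return [p * (high - low) + low for p in pixels]


# Example: a black, a mid-grey and a white pixel.
scaled = scale([0.0, 0.5, 1.0])
print(min(scaled), max(scaled))
```

Real images rarely contain both pure black and pure white, which is why the measured minimum and maximum reported above lie slightly inside the interval −1 to 1.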
20.4.2 Discriminator Network
The discriminator becomes more adept at telling real images from fake ones as training progresses. As a result, the generator is forced to update so that it produces more realistic samples that can fool the discriminator [16]. The discriminator is a convolutional classifier, but it does not have any max-pooling layers. Its architecture consists of three convolutional layers followed by a final fully connected layer, which outputs a single value [17]. This value indicates how genuine each image is. Except for the first, each convolutional layer is followed by batch normalisation. The hidden units use a leaky Rectified Linear Unit (ReLU) activation function [18]. The discriminator class defines the inputs and outputs of the network: the input is a 32 × 32 image with 3 colour channels, and the output is a single-value indicator. Once the discriminator module is initialised, the architecture is built and the forward propagation of the neural network is defined.
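Because the discriminator has no pooling layers, the downsampling has to come from the strides of its convolutions. The sketch below checks how a 32 × 32 input shrinks across three convolutional layers. The kernel size of 4, stride of 2 and padding of 1 are assumptions chosen for illustration; the chapter does not state these values.

```python
def conv_out_size(size, kernel=4, stride=2, padding=1):
    """Spatial output size of a convolution: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


size = 32           # input images are 32 x 32
sizes = [size]
for _ in range(3):  # three convolutional layers, as described above
    size = conv_out_size(size)
    sizes.append(size)

print(sizes)  # [32, 16, 8, 4]
```

Under these assumptions each layer halves the width and height, so the final fully connected layer would receive a 4 × 4 feature map per channel.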
20.4.3 Generator Network
The random noise is the input that determines the generator's output: when the generator network is applied to a wide range of random noise inputs, it generates a wide range of useful output images. In the generator class, the input is the length of the noise vector, whilst the output is a 32 × 32 image with 3 colour channels. Once the generator module is initialised, the architecture is built and the forward propagation of the neural network is defined.
There are four transposed convolutional layers. Each layer is defined so that the output width and height are double those of its input. The hidden units use a leaky ReLU activation function. Feature maps are then calculated [19]. All the weights are initialised randomly using a fixed standard deviation.
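The doubling behaviour of the transposed layers can be verified with a short calculation. As a sketch under assumed hyperparameters (kernel size 4, stride 2, padding 1, and a 2 × 2 starting feature map, none of which are given in the chapter), four such layers reach the 32 × 32 output size:

```python
def deconv_out_size(size, kernel=4, stride=2, padding=1):
    """Output size of a transposed convolution: (n - 1) * s - 2p + k."""
    return (size - 1) * stride - 2 * padding + kernel


size = 2            # assumed starting resolution of the reshaped noise vector
sizes = [size]
for _ in range(4):  # four transposed convolutional layers
    size = deconv_out_size(size)
    sizes.append(size)

print(sizes)  # [2, 4, 8, 16, 32]
```

With a different starting resolution or number of layers the same formula applies; only the combination above is an illustrative guess.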
20.4.4 Training
The training phase comprises defining the loss functions, selecting the optimizers, and finally building and training the model. The model is trained for 40 epochs, with the discriminator and the generator trained in turn. The previously defined loss functions are used to calculate the losses. The model is trained on a Graphics Processing Unit (GPU), and the model inputs are also moved to the GPU. The samples and losses are stored, and the images are loaded. The discriminator is then trained on real and fake images. Finally, the loss statistics are analysed. Discriminator loss: the total loss is the sum of the losses on fake and real images. Generator loss: the same loss is used, but with the labels flipped, because the generator's aim is to persuade the discriminator that the images it generates are genuine. An optimizer is defined for each of the two networks, and both are run in the same training loop so that the two networks improve together. The optimizer used is the Adam optimizer [20].
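The two losses can be sketched with scalar probabilities in plain Python. This assumes a binary cross-entropy loss applied to the discriminator's output probability, which is the usual choice for GANs but is not confirmed by the chapter; the function names are illustrative.

```python
import math


def bce(prob, label):
    """Binary cross-entropy for one predicted probability and a 0/1 label."""
    eps = 1e-12  # keep the argument of log away from zero
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def discriminator_loss(d_real, d_fake):
    """Sum of the loss on a real image (label 1) and on a fake image (label 0)."""
    return bce(d_real, 1) + bce(d_fake, 0)


def generator_loss(d_fake):
    """Same loss with the label flipped: the fake image is scored as if real."""
    return bce(d_fake, 1)


# Discriminator output for one real and one generated image.
d_real, d_fake = 0.9, 0.2
d_loss = discriminator_loss(d_real, d_fake)
g_loss = generator_loss(d_fake)
print(round(d_loss, 4), round(g_loss, 4))
```

The generator loss falls as the discriminator's score for a fake image rises, which is exactly the behaviour the flipped labels are meant to encourage.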
20.5 Results and Discussion
20.5.1 Training Losses
The training losses of the two networks, recorded after each iteration, are plotted. Figure 20.4 shows the training losses observed for the model. In Fig. 20.4, the training loss fluctuates because of the input noise, but it decreases towards the end, because the generator starts producing fake images that fool the discriminator while the discriminator learns to distinguish which image is original and which is fake. The y-axis is marked in steps of 1 unit, whereas the x-axis is marked in steps of 25 units.
20.5.2 Generated Images
The model was able to create new images of fake human faces that look as lifelike as possible. It can also be seen that all of the generated faces are lighter in colour, even the dark ones. The samples are loaded from those that the generator network produced whilst training. Figure 20.5 shows the fake faces generated by the DCGAN model.
Fig. 20.4 Loss generation
Fig. 20.5 Fake faces generated
20.6 Conclusion and Future Scope
The model was able to generate fake human face images that look very similar to real ones. It should be noted that all images, including those of brown faces, are lighter in shade. This is because the dataset used is biased: the majority of its images are of celebrities who are white. Overall, the DCGAN model successfully converted noise into realistic human faces. The results could be improved significantly with a more varied dataset. The chosen dataset contains only frontal views of faces, which limits the model to generating faces from a single angle. Future research can address this. Extending this framework to other domains such as video and audio has not yet been done.
References
1. Tariq, S., et al.: Gan is a friend or foe? A framework to detect various fake face images. In: Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing (2019)
2. Amrutha, C.V., Jyotsna, C., Amudha, J.: Deep learning approach for suspicious activity detection from surveillance video. In: 2nd International Conference on Innovative Mechanisms for Industry Applications (ICIMIA), pp. 335–339 (2020)
3. Marra, F., et al.: Detection of gan-generated fake images over social networks. In: 2018 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR). IEEE (2018)
4. Radford, A., Metz, L., Chintala, S.: Unsupervised representation learning with deep convolutional generative adversarial networks (2016)
5. Karras, T., Laine, S., Aila, T.: A style-based generator architecture for generative adversarial networks. In: CVPR (2019)
6. Karras, T., Aila, T., Laine, S., Lehtinen, J.: Progressive growing of GANs for improved quality, stability, and variation. In: ICLR (2018)
7. Yuan, Z., Jie, Z., Shan, S., Chen, X.: Attributes aware face generation with generative adversarial networks (2020)
8. Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y.: Generative adversarial nets. In: NeurIPS (2014)
9. Fake Face Generator Using DCGAN Model. https://towardsdatascience.com/fake-face-generator-using-dcgan-model-ae9322ccfd65 (2022). Last accessed 05 Feb 2022
10. Huang, Y., et al.: FakeLocator: robust localization of GAN-based face manipulations. arXiv preprint arXiv:2001.09598 3 (2020)
11. Ali-Gombe, A., Elyan, E., Jayne, C.: Multiple fake classes GAN for data augmentation in face image dataset. In: International Joint Conference on Neural Networks (IJCNN) (2019)
12. Tamuly, S., Jyotsna, C., Amudha, J.: Deep learning model for image classification. In: International Conference on Computational Vision and Bio Inspired Computing (ICCVBIC) (2019)
13. Krishnamoorthy, A., Sindhura, V.R., Gowtham, D., Jyotsna, C., Amudha, J.: StimulEye: an intelligent tool for feature extraction and event detection from raw eye gaze data. J. Int. Fuzzy Syst. (2020)
14. Reshma, M., Nair, P.C., Gopalapillai, R., Gupta, D., Sudarshan, T.S.B.: Multi-view robotic time series data clustering and analysis using data mining techniques. In: Advances in Signal Processing and Intelligent Recognition Systems, pp. 521–531. Springer, Cham (2016)
15. Amrutha, C.V., Jyotsna, C.: A robust system for video classification: identification and tracking of suspicious individuals from surveillance videos. In: International Conference on Soft Computing and Signal Processing (2020)
16. Dharneeshkar, J., Aniruthan, S.A., Karthika, R., Parameswaran, L.: Deep learning based detection of potholes in Indian roads using YOLO. In: 2020 International Conference on Inventive Computation Technologies (ICICT), pp. 381–385. IEEE (2020)
17. Aloysius, N., Geetha, M.: A review on deep convolutional neural networks. In: 2017 International Conference on Communication and Signal Processing (ICCSP), pp. 0588–0592. IEEE (2017)
18. Rani, N.S., Chandan, N., Jain, A.S., Kiran, H.R.: Deformed character recognition using convolutional neural networks. Int. J. Eng. Technol. 7(3), 1599–1604 (2018)
19. Mansourifar, H., Shi, W.: One-shot gan generated fake face detection. arXiv preprint arXiv:2003.12244 (2020)
20. Yang, X., et al.: Exposing gan-synthesized faces using landmark locations. In: Proceedings of the ACM Workshop on Information Hiding and Multimedia Security (2019)
Chapter 21
Pipeline for Pre-processing of Audio Data
Anantshesh Katti and M. Sumana
Abstract Every machine learning pipeline follows a fixed set of procedures, such as data collection, pre-processing, feature extraction, model training, and testing, to name a few. Each procedure must be followed; otherwise the results obtained at the end of the implementation may deviate from the desired output or be inaccurate. Most research and review papers focus mainly on model implementations, algorithms, results, or performance, and very few discuss the initial procedures of the pipeline such as pre-processing. Although it is one of the most important steps in the pipeline, pre-processing is usually mentioned in only a line or a paragraph. In this paper, pre-processing is discussed and explored in depth.
Keywords Audio pre-processing · Spectrogram · Pipeline · Normalize · Padding
A. Katti (B) · M. Sumana
M. S. Ramaiah Institute of Engineering, Bangalore 560054, India
e-mail: [email protected]
M. Sumana
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_21

21.1 Introduction
Every machine learning or deep learning approach needs data, and all collected data must first be analyzed, cleaned, and preprocessed. Because most papers focus on the model being built or the algorithm being deployed, the pre-processing of data is mentioned in only a line or two. Yet pre-processing is among the most important steps of the pipeline: it is where the data is analyzed and prepared for the later stages, and without it those stages may not generate the required output at the end of the implementation. This paper focuses on the pre-processing of data with audio as the input. The data collected and used in this paper is intended for audio-based sentiment and emotion recognition, so all the papers referred to relate to sentiment and emotion recognition using audio data. In most of the related papers, the audio is converted to text and the pre-processing is then carried out on the text. In a few others, the audio is used directly by the model and is preprocessed with steps such as transforms, de-noising, and filtering.

In this paper, a pipeline is deployed for the pre-processing of audio data, where the pre-processing is done on batches of audio files. The pipeline starts by loading the audio files, then applies padding, extracts the spectrogram, normalizes it, and finally saves the results of the pre-processing. Every step is discussed in depth in Sect. 21.3.
21.2 Literature Survey
Audio data can be pre-processed either by converting the audio to text or by using the audio directly. If the audio is used directly, it must first be established whether it was recorded in a controlled environment or in a normal environment. If it was recorded in a controlled environment, steps such as de-noising can be left out, because the little noise present in the recordings can be neglected. The work in [1] creates a dataset of audio files in wave format and resamples them to a fixed frequency, which gives a standardized pattern to work with, and goes on to discuss feature extraction. In [2], the audio is sampled at 6 kHz and then converted into audio segments [2, 3], each segment containing only a part of the entire audio file. This data was obtained in a normal environment, with the possible presence of noise. In [4], the discussion is based on the textual context of the data using DialogueRNN, with each utterance first embedded into a vector space so that it can be fed into a CNN. Other pre-processing of textual data for sentiment analysis, namely removing punctuation, converting upper case to lower case, stemming words, and removing stop words, is done in [5–7]. In [6], the data is cleaned further by removing links and emoticons because it was collected from Twitter, and in [7] lemmatization is additionally applied to reduce the complexity of the words. In [8], a unique algorithm is developed and used for pre-processing textual data; it looks for the verbs in a sentence to determine the sentiment of that sentence. In [9], more focus is given to grammar and rules in order to understand the context and guide the pre-processing procedure. In [10], pre-processing uses a voice activity detection system, which identifies the audio, categorizes and arranges it, and separates it from the other signals.
In [11], the data is recorded in a controlled environment, with each audio file recorded at 16 kHz as a 16-bit waveform. The effect of audio pre-processing on music when using deep neural networks is investigated empirically in [12], with different time–frequency representations, logarithmic magnitude compression, and scaling. In [13], the discussion covers how audio is processed by converting analog signals to digital for processing and back to analog for listening, together with different representations of audio such as the waveform, FFT, and STFT. Machine learning concepts, techniques, and autoencoders were referred from [14]. Refs. [15, 16] were referred for the concept of scheduling algorithms, which can be used to schedule the methods used and their deployment in a cloud environment.
21.3 Proposed Work
The proposed work is a pipeline for audio data pre-processing in which the audio files are processed in batch. The pipeline consists of a loader class, a padder class, a log spectrogram extractor class, a min–max normalizer class, a saver class, and finally a class that calls all the other classes. Figure 21.1 shows the order in which the pipeline carries out the pre-processing. Each of these classes is explained in detail below.
21.3.1 Load Audio Files
This class is responsible for loading all the audio files in batch from a folder. All the audio files are in the same format, the waveform audio file format (WAV). The init method takes a few parameters: sample rate, duration, and mono. The sample rate is the number of audio samples recorded every second, usually expressed in Hertz. The duration is the length of audio to be loaded from the file, and mono indicates a monophonic, single-channel signal, as opposed to stereo, which has two channels. The load method uses the librosa library to load a file from its file path with the above-mentioned parameters. At the end of this method, each audio file from the specified directory has been loaded and is ready for the upcoming pre-processing techniques. A sample waveform of an audio file is shown in Fig. 21.2, with time on the x-axis and decibels on the y-axis. Figure 21.3 shows an example of the player provided by Python to play an audio file that has been loaded through the loader class.
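The chapter loads files with librosa, whose `librosa.load` accepts `sr`, `duration` and `mono` arguments. As a dependency-free illustration of what such a loader returns, the sketch below uses Python's standard `wave` module instead. It is a stand-in, not the authors' loader: it handles only mono 16-bit WAV files, does no resampling, and first writes its own test tone. The helper names and the chosen sample rate and duration are assumptions.

```python
import math
import os
import struct
import tempfile
import wave


def write_test_tone(path, sample_rate=22050, seconds=1.0, freq=440.0):
    """Write a mono 16-bit sine tone so the loader has something to read."""
    n = int(sample_rate * seconds)
    frames = b"".join(
        struct.pack("<h", int(32767 * 0.5 * math.sin(2 * math.pi * freq * i / sample_rate)))
        for i in range(n)
    )
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(frames)


def load_wav(path, duration):
    """Read at most `duration` seconds of a mono 16-bit WAV as floats in [-1, 1]."""
    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()
        n = min(wav.getnframes(), int(sample_rate * duration))
        raw = wav.readframes(n)
    samples = struct.unpack("<%dh" % n, raw)
    return [s / 32768.0 for s in samples], sample_rate


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "tone.wav")
    write_test_tone(path)
    signal, sr = load_wav(path, duration=0.5)

print(len(signal), sr)
```

The loaded signal is simply a one-dimensional sequence of amplitudes, which is the form that the padding and spectrogram steps operate on.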
Fig. 21.1 Pre-processing pipeline
Fig. 21.2 Waveform of audio file
Fig. 21.3 Option to play loaded audio
21.3.2 Padding the Audio Files
This class is responsible for applying padding to an array. The padder is a thin wrapper around a specific numpy function called pad. The padder class has two methods, left padding and right padding, and each takes two parameters: an array and the number of missing items. An array is needed because the audio file is converted to digital form and represented in a machine-readable pattern, so the audio signal is represented as an array. The second parameter is needed because audio files vary in duration, and hence the array length may vary. The number of missing items is used to bring all arrays to the same length, and the missing values are filled in by left or right padding. Zero-padding is used here, meaning that a zero is added in place of each missing value. For the dataset used, only right padding is required.
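The two padding methods can be sketched without numpy as follows. The chapter wraps `numpy.pad`; plain lists are used here only to keep the example self-contained. The sample rate, duration and signal length in the usage part are assumed values.

```python
def right_pad(array, num_missing_items):
    """Append zeros: [1, 2, 3] with 2 missing items -> [1, 2, 3, 0, 0]."""
    return list(array) + [0] * num_missing_items


def left_pad(array, num_missing_items):
    """Prepend zeros: [1, 2, 3] with 2 missing items -> [0, 0, 1, 2, 3]."""
    return [0] * num_missing_items + list(array)


# Bring a short signal up to the expected number of samples.
sample_rate, duration = 22050, 0.5               # assumed values
num_expected_samples = int(sample_rate * duration)
signal = [0.1] * 10000                           # shorter than expected
if len(signal) < num_expected_samples:
    signal = right_pad(signal, num_expected_samples - len(signal))

print(len(signal))
```

With numpy, `np.pad(array, (0, num_missing_items), mode="constant")` gives the right-padded result and `np.pad(array, (num_missing_items, 0), mode="constant")` the left-padded one.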
21.3.3 Extract Spectrogram
This class extracts a log spectrogram, in decibels, from a time-series signal that it takes as input. The first step is to compute the short-time Fourier transform (STFT) using the frame size and the hop length. The librosa implementation of the STFT returns an array of shape
(1 + FRAME_SIZE/2, NUM_FRAMES)    (21.1)
Equation (21.1) gives the shape of the STFT output: a two-dimensional array whose first dimension is determined by the frame size and whose second dimension is the number of frames. Take a frame size of 1024, which is a standard choice. Applying the first dimension of the formula gives 1 + 1024/2 = 513. Because an even number is needed, the last frequency bin is dropped, leaving 512. Each of the values along the first dimension is a frequency bin of the audio. Figure 21.4 shows a sample spectrogram of an audio file, with time on the x-axis and frequency in Hertz on the y-axis. A spectrogram is a visual representation of the frequencies in a signal with respect to time. The colors represent the intensity of the sound at each time and frequency: the lighter the color in the image, the louder the sound, and vice versa.
Fig. 21.4 Spectrogram of audio file
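The shape arithmetic of Eq. (21.1) can be checked with a few lines of Python. The number of frames used below, 1 + num_samples // hop_length, is what librosa produces with its default centred framing; the hop length of 256 and the signal length of 16384 samples are assumptions made for the example.

```python
def stft_shape(num_samples, frame_size=1024, hop_length=256):
    """Shape (frequency bins, frames) of a centred STFT, following Eq. (21.1)."""
    num_bins = 1 + frame_size // 2
    num_frames = 1 + num_samples // hop_length
    return num_bins, num_frames


bins, frames = stft_shape(num_samples=16384)
trimmed_bins = bins - 1  # drop the last frequency bin to get an even number

print(bins, frames, trimmed_bins)  # 513 65 512
```

Dropping one bin discards only the highest frequency component, which is why it is a convenient way to obtain an even-sized first dimension.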
21.3.4 Normalization
This class applies minimum–maximum normalization to an array. The basic idea is to take an array of arbitrary values and scale them down so that, after normalization, all values lie in the range 0 to 1. The method takes an array as input. The class contains both a normalize and a de-normalize method. In the normalize method, the first step is to scale the array to the range 0 to 1 using the formula
norm_array = (array − array.min()) / (array.max() − array.min())    (21.2)
In Eq. (21.2), norm_array is a variable created to store the normalized values, and array is the array that contains the audio data values. The min() and max() methods of the array, called with the dot operator, give the minimum and maximum of the audio array, which are substituted into the formula to obtain the normalized result. The de-normalize method is the inverse of the normalize method: if the scaled values of an array need to be changed back to their original values, this can be done with the de-normalize method.
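Equation (21.2) and its inverse can be written as a minimal sketch on plain Python lists. The function names are illustrative, and the example values are made-up decibel levels; note that the formula is undefined for a constant array, where the maximum equals the minimum.

```python
def normalise(array):
    """Scale values to [0, 1] as in Eq. (21.2); also return the original min and max."""
    lo, hi = min(array), max(array)
    return [(x - lo) / (hi - lo) for x in array], lo, hi


def denormalise(norm_array, original_min, original_max):
    """Invert the scaling using the stored original min and max."""
    return [n * (original_max - original_min) + original_min for n in norm_array]


log_spectrogram_values = [-80.0, -40.0, 0.0]  # made-up decibel values
norm_array, lo, hi = normalise(log_spectrogram_values)
restored = denormalise(norm_array, lo, hi)

print(norm_array)  # [0.0, 0.5, 1.0]
print(restored)    # [-80.0, -40.0, 0.0]
```

De-normalization needs the original minimum and maximum, which is why the pipeline stores them alongside the features.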
21.3.5 Saving the Results
The saver is responsible for saving the features and the min–max values of each waveform audio file. The save method takes two parameters as input: the feature and the file path. Because the feature is an array, numpy is used to save it to the desired directory, so the output saved in that directory is in the '.npy' format. The original min and max values of the spectrogram are stored in a different file format, a pickle file, whose location is distinct from the directory that holds the features (the .npy files). Pickle is a file storage format, just like .csv or .npy.
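Storing and restoring the min–max values with pickle can be sketched as follows. The file name `min_max_values.pkl`, the dictionary layout keyed by feature path, and the helper names are assumptions for illustration; the features themselves would be written separately with `numpy.save`.

```python
import os
import pickle
import tempfile


def save_min_max(min_max_values, save_dir):
    """Write the min/max dictionary to a pickle file and return its path."""
    path = os.path.join(save_dir, "min_max_values.pkl")
    with open(path, "wb") as f:
        pickle.dump(min_max_values, f)
    return path


def load_min_max(path):
    """Read the min/max dictionary back from the pickle file."""
    with open(path, "rb") as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as save_dir:
    # One entry per saved feature file (hypothetical path and values).
    values = {"spectrograms/audio_0.npy": {"min": -80.0, "max": 0.0}}
    restored = load_min_max(save_min_max(values, save_dir))

print(restored)
```

Keying the dictionary by the path of the saved feature makes it straightforward to find the right minimum and maximum when a spectrogram is later de-normalized.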
21.3.6 Pre-processing Pipeline
The pipeline processes the audio files in a directory, applying all the classes and methods described above. It loads each waveform audio file, pads it if necessary, extracts the spectrogram, normalizes it, and finally saves the outputs. Several methods are implemented in the pipeline. One stores the original min and max values of the log spectrogram so that they can be used for de-normalizing when the signal is regenerated. One iterates over every audio file in the directory and runs it through the steps above. One checks whether padding is needed, and another applies the padding when it is. The constant values of frame size, hop length, duration, and sample rate are declared in the pipeline. The path of the directory that contains the audio files and the path of the directory where the output is to be saved must also be specified in the pipeline class.
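The two padding methods mentioned above can be sketched as follows. The function names, the sample rate of 22050 Hz, and the duration of one second are assumptions made for illustration; signals longer than the expected length are simply returned unchanged here.

```python
import numpy as np

SAMPLE_RATE = 22050  # assumed value, in samples per second
DURATION = 1.0       # assumed value, in seconds
NUM_EXPECTED_SAMPLES = int(SAMPLE_RATE * DURATION)


def is_padding_needed(signal, num_expected_samples):
    # Padding is needed when the signal is shorter than the target length.
    return len(signal) < num_expected_samples


def right_pad(signal, num_expected_samples):
    # Append zeros on the right until the target length is reached.
    num_missing = num_expected_samples - len(signal)
    return np.pad(signal, (0, num_missing), mode="constant")


def pad_if_needed(signal, num_expected_samples=NUM_EXPECTED_SAMPLES):
    if is_padding_needed(signal, num_expected_samples):
        return right_pad(signal, num_expected_samples)
    return signal


short_signal = np.ones(5)
padded = pad_if_needed(short_signal, num_expected_samples=8)
```

Right zero padding of this kind is what produces the block of zeros at the end of a short file's spectrogram.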
21.4 Results
The output at the end of the pipeline is a numpy file containing a multidimensional array that represents the normalized log spectrogram, together with the stored min–max values.
21 Pipeline for Pre-processing of Audio Data
Fig. 21.5 Numpy array output sample of single audio file
Fig. 21.6 Pickle output sample
Figure 21.5 shows the output after the pipeline is executed. The .npy file of each audio file is saved under a unique name in a directory. The sample in Fig. 21.5 is a two-dimensional array: the values at the top are the result of the log spectrogram, and the zero values at the bottom are the result of right zero padding. The pickle file retains the original min–max values so that they can be used to de-normalize the array. Figure 21.6 shows an example in which the pickle file is read to display the min and max values of the audio files. Without a pipeline, one usually has to go through every step manually, sometimes one file at a time. A pipeline like this eases the work and speeds it up in the long run.
21.5 Conclusions
In this paper, the implementation of audio pre-processing is discussed in depth. As with any other part of a machine learning or deep learning procedure, pre-processing is one of the most important steps. Here, a pipeline is implemented that contains all the needed methods, from loading a file from a directory, to padding, to extracting the spectrogram, to normalizing it, and to saving the output to a desired directory. Instead of performing each step individually or manually, sometimes one audio file at a time, the whole process can be run in a single pass with the help of this pipeline.
Chapter 22
Classification on Alzheimer’s Disease MRI Images with VGG-16 and VGG-19 Febin Antony, H. B. Anita, and Jincy A. George
Abstract Balancing the thoughts and memories of our life is one of the most critical functions of the human brain, so its stability and sustenance are important for smooth functioning. Changes in the structure of the brain can lead to disorders such as dementia, and one such condition is known as Alzheimer’s disease. Multimodal neuroimaging, such as magnetic resonance imaging (MRI) and positron emission tomography (PET), is used for the early diagnosis of Alzheimer’s disease (AD) because the modalities provide complementary information. When PET and MRI data are acquired from the same subject, there is a marked relationship between the two. Mild cognitive impairment (MCI) is the initial stage, with few symptoms of AD. Subjects who are likely to convert from MCI to AD need to be recognised so that they can be analysed for further treatment. In this research, convolutional neural networks (CNN) designed for classification, namely the VGG-16 and VGG-19 deep learning architectures, were used to check the accuracy of cognitively normal (CN) versus MCI, CN versus AD and MCI to AD conversion using MRI data. The proposed research is analysed and tested using MRI data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI).
Keywords Alzheimer’s disease (AD) · Mild cognitive impairment (MCI) · Cognitively normal (CN) · Visual geometry group (VGG) · Convolutional neural networks (CNN)
F. Antony (B) · H. B. Anita
Department of Computer Science, Christ (Deemed to be University), Bangalore, Karnataka, India
e-mail: [email protected]
J. A. George
Department of Life Sciences, Christ (Deemed to be University), Bangalore, Karnataka, India
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_22
22.1 Introduction
Ageing is a crucial phase of life in which one begins to settle in with the thoughts and memories of life. But disorders such as dementia creep into
the thoughts and darken the room of memory. Therefore, dementia is a disorder that has to be noticed and treated at an early stage. AD is an irreversible neurological disorder, a type of dementia, in which an affected individual begins to lose memory and eventually loses the ability to respond to or converse with fellow beings. It is generally termed a progressive disorder that alters the structure of the human brain, leading to various impairments in its functioning. It is found to affect people above the age of 65, although studies show that the onset can be predicted at earlier stages by analysing the brain structure. The central nervous system has two important constituents, grey matter and white matter. These tissues, with the cell bodies in them, form a coating around the brain and in turn support the functions of the brain. The depletion or reduction of grey matter and white matter can affect a person’s everyday life. Multiple proteins and environmental factors cause the changes in the structure of grey matter and white matter that in turn lead to the onset of the disease [1, 2]. Deep learning techniques with different model architectures have been used for the early detection of AD from PET and MRI scans. In those studies, VGG-16 and VGG-19 classification methods were applied after the feature extraction process. A gradual decrease in the volume of grey matter, together with an increase in white matter, tends towards AD [3]. Training deep learning architectures from scratch has limitations: deep networks need a large amount of training data, but medical images can be expensive to obtain and may raise ethical issues. Training a deep learning network on a large number of bio-medical images also requires huge computational resources [4].
Networks trained on large-scale data sets can be reused with smaller data sets by retraining the final fully connected layers of convolutional neural networks (CNN). In this paper, we investigate two CNN architectures, VGG-16 and VGG-19. These architectures are pre-trained on a different domain, and the pre-trained weights are reused when the training data is picked intelligently. In deep learning, CNNs are known for high-accuracy prediction in bio-medical image classification. An important feature of this type of network is that it does not require manual feature extraction, as it is able to extract features automatically [5].
22.1.1 Alzheimer’s Disease Neuroimaging Initiative (ADNI)
The data used in our study is from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) (https://www.adni-info.org). ADNI is an initiative that offers insight into Alzheimer’s disease, its onset and its diagnosis. It provides a platform for researchers in the field to gain knowledge and the support they need, including access to data. The study of MCI and other biomarkers can help predict the onset of the disease and point towards methodologies that could reduce its severity for much of today’s population. The resources mainly
provide data sets of five different types: MRI images, PET images, clinical data, biospecimens and genetic data [6]. The data is collected and authenticated to maintain its consistency and quality. The researchers who use these sources for studies on AD are likewise expected to meet standards of quality and credibility. ADNI groups its study participants by stage of the disease. The stages studied are CN (cognitively normal: the study group whose brain structures and behavioural patterns are under control and fall within normal ageing), SMC (significant memory concern: the study group that is likely to be in transition from normal ageing to possible MCI) and MCI, the group of subjects at risk with memory concerns. MCI is further categorised into three stages, early, mild and late, studied as EMCI, MCI and LMCI, respectively. ADNI is funded by the National Institutes of Health (NIH), the National Institute on Aging and companies such as Eisai, Elan, Merck and Forest Laboratories. From ADNI, we include volunteers from both ADNI1 and ADNI2 who have T1-weighted MRI images and PET images and were diagnosed with CN, MCI or AD. Because a volunteer has several MRI and PET scans from different periods, we selected three MRI and PET scans from three time periods, with diagnoses based on the volunteer’s age. ADNI data helps to identify the relationship between the stages of the disease from CN to MCI to AD. The main objective of ADNI is to identify the disease stages of AD using multimodal neuroimaging (MRI and PET), blood or CSF biomarkers and image analysis techniques. ADNI1 uses biomarkers such as blood and images as outcome measures of cognitive decline, whereas ADNI2 uses biomarkers such as CSF and images as predictive measures [7].
22.2 Materials and Methods
In this paper, a deep learning-based algorithm is used to detect Alzheimer’s disease in data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI). The data set contains MRI data of Alzheimer’s patients and image data of normal controls. 3D-to-2D conversion and image resizing were applied before the VGG-16 and VGG-19 CNN architectures were used for feature extraction.
22.2.1 Data Acquisition
3D T1-weighted MRI and PET scans, acquired using Siemens, GE and Philips scanners, were obtained in Digital Imaging and Communications in Medicine (DICOM) format from the official ADNI website. The neuroimaging core of ADNI develops the final multimodal imaging plans for the MRI and PET acquisitions and maintains data quality and scan consistency. ADNI1 is divided into three classes: MCI, AD and CN. Within MCI there are two further classes, early MCI (eMCI) and late MCI (lMCI). Our main goal is to analyse the three classes and predict progression from CN to MCI
and to AD; for that, we grouped eMCI and lMCI together with MCI. From ADNI, we downloaded the original MRI and PET scans associated with MCI, AD and CN, i.e. no pre-trained or processed data were included in this research. Three diagnosis groups from the ADNI data set were considered: MCI, treated as MCI together with eMCI and lMCI at baseline; AD, treated as AD at baseline and not converting back to MCI; and CN, treated as CN at baseline [8]. The proposed model defines three different combinations for the stages of AD using the VGG-16 and VGG-19 architectures.
I. CN versus MCI: this model shows whether a patient is in a cognitively normal stage or at the beginning of MCI.
II. AD versus CN: this model shows whether a patient is in a cognitively normal stage or at the beginning of AD.
III. MCI versus AD: this model shows whether a patient is on the edge of MCI or affected by AD.
The size of the human brain changes with age because of changes in the grey matter (GM) and white matter (WM) of the brain, and these changes were analysed with the data provided for CN, MCI and AD. The changes in the brain regions differ with the subject’s age. The data were divided into classes based on the stage or progression of the disease, as given in Table 22.1 (Fig. 22.1).
Table 22.1 Data distribution
Classes   No. of subjects   Age   Training images   Testing images   Total images
CN        25                65+   200               60               260
MCI       25                65+   200               60               260
AD        25                65+   200               60               260
Total     75                      600               180              780
Fig. 22.1 Proposed methodology for classification on AD
Fig. 22.2 Skull stripping
22.2.2 Data set Preparation
The MRI data set is pre-processed by skull stripping and brain tissue segmentation. Only the brain region is required when analysing brain images, so the skull, which is not needed for the analysis, is removed. Several image segmentation methods can be used for skull stripping, including histogram thresholding, fuzzy C-means and region growing; of these, histogram-based algorithms give the more convincing results. The skull-stripped image is shown in Fig. 22.2. After image pre-processing, the pixel values are scaled down from the range 0–255 to the range 0–1. The data set is then prepared for training and testing.
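The rescaling and splitting steps can be sketched with numpy as below. The function names are hypothetical, the 8 × 8 synthetic images merely stand in for skull-stripped MRI slices, and the split simply takes the first 200 of 260 images per class for training, matching the counts in Table 22.1 but not necessarily the authors' selection procedure.

```python
import numpy as np


def scale_pixels(images_uint8):
    # Map 8-bit intensities from the range 0-255 to the range 0-1.
    return images_uint8.astype(np.float32) / 255.0


def split_train_test(images, num_train):
    # Simple ordered split; shuffling or per-subject splitting is not shown.
    return images[:num_train], images[num_train:]


# Synthetic stand-in for 260 images of one class (8 x 8 pixels each).
images = (np.arange(260 * 8 * 8) % 256).astype(np.uint8).reshape(260, 8, 8)
scaled = scale_pixels(images)
train_images, test_images = split_train_test(scaled, num_train=200)
```

In practice a split by subject may be preferable, so that slices from one person do not appear in both the training and the testing set.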
22.2.3 Data Augmentation
To avoid the overfitting that can result from the limited size of the MRI data set, we use data augmentation. As Fig. 22.3 shows, data augmentation improves the diversity of the data used for training and analysis, which helps to make our classification model more robust.
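As a rough illustration of how augmentation multiplies the available images, the sketch below derives flipped and rotated copies of a single 2D slice with numpy. The choice of transforms is an assumption; the text does not state which augmentations were applied, and the function name is hypothetical.

```python
import numpy as np


def augment(image):
    """Return the original 2D image plus three transformed copies (sketch)."""
    return [
        image,
        np.fliplr(image),  # horizontal flip
        np.flipud(image),  # vertical flip
        np.rot90(image),   # 90-degree counter-clockwise rotation
    ]


slice_2d = np.arange(6).reshape(2, 3)
augmented = augment(slice_2d)
```

Each input slice yields four training samples here; whether a given transform is anatomically sensible for brain MRI has to be judged for the task at hand.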
22.2.4 VGG Concept
The Visual Geometry Group (VGG) neural network is a model used for object recognition and medical image classification. VGG treats the depth of the network as one of its significant features. To increase the depth, it stacks 3 × 3 convolutional layers one over the other. It takes RGB images of 224 × 224 pixels as input. A linear transformation of the image data is performed with 1 × 1 convolution filters, and the convolution stride is fixed at one pixel so that the spatial resolution is preserved after convolution.
Fig. 22.3 Augmented images
22.2.5 VGG-16
VGG-16 is a convolutional neural network model with 16 weight layers. It is widely applicable in medical image classification. VGG-16 replaces filters with large kernel sizes by several 3 × 3 filters stacked one over the other, which increases the depth of the network. The VGG-16 network needs to be pre-trained before it is used to initialise the deep neural network. VGG-16 has fully connected layers and is 533 MB in size. It was initially trained for object recognition on 2D images. Transfer learning with the VGG-16 framework is applied after feature identification and extraction. In Fig. 22.4, two fully connected layers are added to the network, followed by a dropout layer to prevent overfitting. The final part of the framework outputs the probability of AD, together with the loss function.
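The benefit of replacing one large kernel with stacked 3 × 3 filters can be checked with a short calculation. The sketch below assumes stride-1 convolutions with the same number of input and output channels and ignores biases; the function names and the channel count of 64 are illustrative.

```python
def stacked_receptive_field(num_layers, kernel_size=3):
    # Each additional stride-1 layer widens the receptive field
    # by kernel_size - 1 pixels.
    return 1 + num_layers * (kernel_size - 1)


def conv_weights(kernel_size, channels, num_layers=1):
    # Weight count (biases ignored) when every layer has `channels`
    # input channels and `channels` output channels.
    return num_layers * kernel_size * kernel_size * channels * channels


rf_three_layers = stacked_receptive_field(3)        # equals a 7 x 7 kernel
weights_stacked = conv_weights(3, 64, num_layers=3)
weights_single = conv_weights(7, 64)
print(rf_three_layers, weights_stacked, weights_single)
```

Three stacked 3 × 3 layers cover the same 7 × 7 region as a single 7 × 7 layer while using fewer weights and adding two extra non-linearities.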
22.2.6 VGG-19 VGG-19 is another convolutional neural network model with a total of 47 layers including 19 trainable layers. Here a pre-trained network is loaded and trained which
Fig. 22.4 Transfer learning framework of VGG-16 [9]
will classify images into 1000 object categories. VGG-19 is trained on millions of diverse images with classification labels [10]. In Fig. 22.5, the pre-processed images are passed through the weight layers, which are arranged as a stack of convolutional layers. The workflow of the architecture is as follows:
• The first two layers are convolutional layers with 64 filters each, which results in a 224 * 224 * 64 volume.
• Pooling then reduces the height and width of the volume, from 224 * 224 * 64 to 112 * 112 * 64.
• The next two convolutional layers use 128 filters, giving a dimension of 112 * 112 * 128.
• The volume is then reduced to 56 * 56 * 128.
Fig. 22.5 Transfer learning framework of VGG-19
Table 22.2 VGG-16 model
Model    Category        Age   Accuracy   Recall   Precision
VGG-16   CN versus MCI   65+   0.82       0.77     0.81
VGG-16   MCI versus AD   65+   0.80       0.79     0.82
VGG-16   CN versus AD    65+   0.81       0.79     0.76
Table 22.3 VGG-19 model
Model    Category        Age   Accuracy   Recall   Precision
VGG-19   CN versus MCI   65+   0.90       0.88     0.89
VGG-19   MCI versus AD   65+   0.87       0.84     0.88
VGG-19   CN versus AD    65+   0.86       0.78     0.75
• The next convolutional layers use 256 filters, and the height and width are then reduced to 28, which results in 28 * 28 * 256.
• This continues until the final layer, which produces 1000 outputs of size 1.
Comparing the two, VGG-16 has a total of 41 layers and VGG-19 has a total of 47 layers. VGG-16 and VGG-19 have 13 and 16 convolutional layers, respectively. Both networks use 3 × 3 kernels, with the number of filters per layer growing from 64 through 128 and 256 to 512. Both have five max-pooling levels [10].
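The volumes in the workflow above can be traced with a few lines of plain Python. The block layout used here follows the standard published VGG configurations (number of 3 × 3 convolutions and filters per block); the dictionary and function names are illustrative, and the sketch assumes padded stride-1 convolutions followed by 2 × 2 pooling.

```python
# (number of 3 x 3 conv layers, number of filters) for each of the five blocks.
VGG_BLOCKS = {
    "VGG-16": [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)],
    "VGG-19": [(2, 64), (2, 128), (4, 256), (4, 512), (4, 512)],
}


def trace_volumes(blocks, input_size=224):
    """Return (volume after convs, volume after pooling) for each block."""
    size = input_size
    volumes = []
    for _num_convs, filters in blocks:
        # Padded stride-1 convolutions keep height and width unchanged.
        after_conv = (size, size, filters)
        # 2 x 2 pooling with stride 2 halves height and width.
        size //= 2
        volumes.append((after_conv, (size, size, filters)))
    return volumes


def count_conv_layers(blocks):
    return sum(num_convs for num_convs, _ in blocks)


vgg19_volumes = trace_volumes(VGG_BLOCKS["VGG-19"])
for after_conv, after_pool in vgg19_volumes:
    print(after_conv, "->", after_pool)
```

The trace reproduces the 224 → 112 → 56 → 28 progression described in the list and ends at a 7 × 7 × 512 volume, which the fully connected layers then map to the 1000 output classes.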
22.3 Results See Tables 22.2 and 22.3.
22.4 Conclusion
Alzheimer’s disease is generally known to be an incurable disease that leads to the impairment of the brain structure. It depletes functions within the brain that are crucial for an individual’s survival. Several studies have sought a cure for the disease, without great results. Therefore, early detection can help individuals take precautionary measures to reduce the risk of the disease before its onset. The study has put forward models that can predict the onset before the disease is fully fledged and depletes the brain structure. The two methods implemented for the analysis are VGG-16 and VGG-19. The results show VGG-16 with an average accuracy of 81% and VGG-19 with
an average accuracy of 84%. Here, VGG-19 shows higher accuracy than the VGG-16 architecture. This supports the use and application of these models in the early detection of AD.
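The average accuracies can be recomputed from the accuracy columns of Tables 22.2 and 22.3. The sketch below does this in plain Python; the dictionary layout and function name are illustrative choices. On the table values the VGG-16 mean comes to 0.81, while the VGG-19 mean comes to roughly 0.88, somewhat above the 84% quoted in the text, so the ordering of the two models is the same either way.

```python
# Accuracy values copied from Tables 22.2 and 22.3.
ACCURACY = {
    "VGG-16": {"CN versus MCI": 0.82, "MCI versus AD": 0.80, "CN versus AD": 0.81},
    "VGG-19": {"CN versus MCI": 0.90, "MCI versus AD": 0.87, "CN versus AD": 0.86},
}


def mean_accuracy(scores):
    # Unweighted mean over the three classification tasks.
    return sum(scores.values()) / len(scores)


mean_vgg16 = mean_accuracy(ACCURACY["VGG-16"])
mean_vgg19 = mean_accuracy(ACCURACY["VGG-19"])
print(round(mean_vgg16, 2), round(mean_vgg19, 2))
```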
References
1. Frings, L., Yew, B., Flanagan, E., Lam, B.Y.K., Hüll, M., Huppertz, H.-J., Hodges, J.R., Hornberger, M.: Longitudinal grey and white matter changes in frontotemporal dementia and Alzheimer’s disease. PLoS ONE 9(3), e90814 (2014)
2. Hon, M., Khan, N.M.: Towards Alzheimer’s disease classification through transfer learning. In: 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) (2017). https://doi.org/10.1109/bibm.2017.8217822
3. Baron, J.C., Chetelat, G., Desgranges, B., Perchey, G., Landeau, B., De La Sayette, V., Eustache, F.: In vivo mapping of gray matter loss with voxel-based morphometry in mild Alzheimer’s disease. Neuroimage 14(2), 298–309 (2001)
4. Sachdev, P.S., Zhuang, L., Braidy, N., Wen, W.: Is Alzheimer’s a disease of the white matter? Curr. Opin. Psychiatry 26(3), 244 (2013)
5. Vu, T.D., Ho, N.H., Yang, H.J., Kim, J., Song, H.C.: Non-white matter tissue extraction and deep convolutional neural network for Alzheimer’s disease detection. Soft Comput. 22(20), 6825–6833 (2018). https://doi.org/10.1007/s00500-018-3421-5
6. Mueller, S.G., et al.: The Alzheimer’s disease neuroimaging initiative. Neuroimaging Clin. N. Am. 15(4), 869–877, xi–xii (2005)
7. Weiner, M.W., et al.: The Alzheimer’s disease neuroimaging initiative: a review of papers published since its inception. Alzheimer’s Dement. 8(1S) (2012). https://doi.org/10.1016/j.jalz.2011.09.172
8. Kim, S.H., et al.: Neuron-specific enolase and neuroimaging for prognostication after cardiac arrest treated with targeted temperature management. PLoS ONE 15(10), e0239979 (2020)
9. Website. https://doi.org/10.1080/19466315.2021.1884129
10. An Experimental Analysis of Different Deep Learning Based Models for Alzheimer’s Disease Classification Using Brain Magnetic Resonance Images. J. King Saud Univ.-Comput. Inf. Sci. Elsevier, Sept (2021). https://doi.org/10.1016/j.jksuci.2021.09.003
Chapter 23
Combining Variable Neighborhood Search and Constraint Programming for Solving the Dial-A-Ride Problem
V. S. Vamsi Krishna Munjuluri, Mullapudi Mani Shankar, Kode Sai Vikshit, and Georg Gutjahr
Abstract Dial-a-ride problems (DARPs) have become a popular topic in logistics in recent years. They are frequently used in transportation, goods distribution, and fast delivery. The DARP is an NP-hard optimization problem in which the objective is to organize the transportation of geographically dispersed customers from pickup to delivery locations. Multiple exact and heuristic approaches have been proposed in the literature to solve the DARP. In this paper, we propose a novel algorithm that combines a variable neighborhood search with constraint propagation to solve this problem. Variable neighborhood search is a metaheuristic that iteratively modifies routes to improve the quality of an incumbent solution. Constraint propagation makes use of techniques like backtracking, forward filtering, and consistency enforcement to iteratively restrict the possible routes in the problem. Combining the two approaches, one obtains an algorithm that has good properties in terms of runtime and solution quality. In simulations, the algorithm is shown to be more efficient than the basic variable neighborhood search when runtimes are small.
Keywords Dial-a-ride problem · Combinatorial optimization · Variable neighborhood search · Constraint programming
V. S. Vamsi Krishna Munjuluri (B) · M. M. Shankar · K. S. Vikshit Department of Computer Science and Engineering, Amrita Vishwa Vidyapeetham, Amritapuri, India e-mail: [email protected] M. M. Shankar e-mail: [email protected] K. S. Vikshit e-mail: [email protected] G. Gutjahr Center for Research in Analytics & Technologies for Education (CREATE), Amrita Vishwa Vidyapeetham, Amritapuri, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_23
23.1 Introduction
Dial-a-ride problems (DARPs) are generalizations of vehicle routing problems with pickup and delivery (VRPPD) and vehicle routing problems with time windows (VRPTW) [18]. A set of customers specify their pickup and drop-off locations. In most cases, they also provide the locations for their return trip (possibly on the same day). While VRPPDs deal with picking up and delivering goods, the variants of the DARP are also concerned with service-oriented factors like customer satisfaction and total waiting time. In other words, DARPs generalize a class of routing problems concerned with picking up and dropping off customers according to their demands. DARPs are more constrained than most other routing problems, and they are of great interest in operations research. The DARP is highly application-oriented, and multiple problem definitions exist. This also makes the DARP highly flexible, so it can be adapted to the specific needs of the customers. Even after the problem structure has been decided, the constraints can be addressed in multiple ways. For example, the capacity and time constraints can be formulated as a constraint satisfaction problem (CSP), whereas the waiting time can be part of the objective to be optimized. The DARP even allows multiple depots if the customers are highly sensitive to waiting and should receive the service as soon as possible. The inputs can also be dynamic, i.e., customers can choose to modify their locations before entering the vehicle and also while in the vehicle. The fleet of vehicles can be homogeneous (all having the same capacity) or heterogeneous (some having specialized equipment or different capacities). In general, the objective of a DARP can be formulated in two different ways: maximizing the satisfied demands subject to the availability of vehicles and other constraints, or minimizing the total cost incurred in operating these vehicles.
Combinations of these goals are also possible as objectives. The way the DARP is formulated makes it very appealing to organizations that have large supply-chain divisions as part of their business operations. Healy [14] showed that the DARP is NP-hard. Many methods that have been developed for the VRP can also be extended to the DARP [1, 2, 12, 13, 17, 21]. The DARP can be solved exactly using techniques like the branch-and-cut algorithm proposed in [7]. Even though such algorithms yield optimal solutions, they do not scale well to real-world applications, where routes are expected to be prepared with minimal time and computational overhead. Therefore, heuristics like tabu search [8] and genetic algorithms [4] offer a good balance between solution cost and computing power, i.e., they can achieve near-optimal performance with good scalability. There are also multiple metaheuristic-based approaches to solving the DARP, like simulated annealing [3] and guided local search [24]. In [8], users can specify the departure time of their outbound journey (first route) and the arrival time of their inbound journey (second route), and a tabu search heuristic is proposed for solving the DARP. The application of metaheuristics to the DARP is explored in greater detail in [11].
In the present work, we modify a version of the variable neighborhood search (VNS) [5] to solve the DARP. An efficient VNS algorithm was proposed by Parragh et al. [19]. Multiple modifications of this algorithm have been proposed in the literature. In this paper, we consider an idea that has been found to work well for other routing problems, namely to integrate constraint propagation into a VNS. The outline of this paper will be as follows. In Sect. 23.2, VNS, CSPs and how they are combined for solving the DARP are discussed in detail. Section 23.3 includes our simulation setup details and the results. Finally, we conclude in Sect. 23.4.
23.2 Methodology
23.2.1 Variable Neighborhood Search
Variable neighborhood search (VNS) is a metaheuristic based on the idea of systematically changing the neighborhood during a search, with rules that govern when this change occurs. A feasible solution of a DARP is a set of routes for the vehicles that satisfies the set of constraints. Among all feasible solutions, the optimal solution is the one that minimizes some objective function f, which is usually taken to be either the sum of the lengths of the vehicle routes or the sum of the waiting times of the customers. For every possible assignment x of routes to vehicles, we define the 2-opt neighborhood [9] as the set of all assignments that can be obtained from the original assignment x by at most two operations. The operations include combining two existing routes, splitting an existing route into two new routes, moving one customer from one route to another, and swapping the order of two customers in a route [20]. In VNS, the idea of the 2-opt neighborhood structure is extended to multiple neighborhood structures, which we call N_1, ..., N_K. Each neighborhood structure is defined by a set of operations that can be applied to the current assignment. In this way, N_k(x) forms a particular neighborhood around an assignment x, containing all assignments that can be obtained by applying some or all of the operations in the k-th neighborhood structure to x (where k = 1, ..., K).
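Two of the operations named above, swapping two customers within a route and moving one customer to another route, can be sketched as neighborhood generators in Python. This is a simplified illustration: an assignment is represented as a list of routes, each a list of customer labels, and feasibility checks (capacity, time windows, pickup before drop-off) are deliberately left out. The function names are hypothetical.

```python
from itertools import combinations


def swap_neighborhood(routes):
    """All assignments obtained by swapping two customers within one route."""
    neighbors = []
    for r, route in enumerate(routes):
        for i, j in combinations(range(len(route)), 2):
            new_route = list(route)
            new_route[i], new_route[j] = new_route[j], new_route[i]
            neighbors.append(routes[:r] + [new_route] + routes[r + 1:])
    return neighbors


def move_neighborhood(routes):
    """All assignments obtained by moving one customer to the end of another route."""
    neighbors = []
    for src, route in enumerate(routes):
        for i, customer in enumerate(route):
            for dst in range(len(routes)):
                if dst == src:
                    continue
                new_routes = [list(r) for r in routes]
                del new_routes[src][i]
                new_routes[dst].append(customer)
                neighbors.append(new_routes)
    return neighbors


routes = [["a", "b", "c"], ["d"]]
swapped = swap_neighborhood(routes)
moved = move_neighborhood(routes)
```

In a full DARP implementation each generated neighbor would additionally be filtered by the constraints before being considered as a candidate solution.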
The VNS is an iterative algorithm that keeps a feasible solution at all times and tries to improve this current solution using the following three steps: a stochastic phase called shaking, where the algorithm picks a random neighbor of the current solution; a local search around the new solution defined by the K neighborhood structures; and an evaluation step that compares the current and the new solution to decide whether the new solution should become the current solution. We now describe the shaking, local search, and neighborhood change steps in detail before discussing the VNS procedure. Shaking procedure: Shaking is used inside a VNS heuristic to escape local minima. Shaking selects a solution from the k-th neighborhood. The VNS algorithm will cycle through the neighborhood structures in subsequent iterations if the
V. S. Vamsi Krishna Munjuluri et al.
current iteration does not lead to progress; this is described later in the paragraph on the neighborhood change step. Here, we assume that the VNS variant uses a simple shaking approach for the k-th neighborhood, i.e., we choose a solution in the k-th neighborhood of the current solution at random.

Local Search: After the shaking step, we examine how the newly generated solution can help us improve our current solution. Local search is a heuristic that explores the neighborhood of a solution iteratively. Starting from the solution produced by shaking, we check for better solutions in its neighborhood, move to a better one, and repeat until no further progress is made. In our model, we use local search with a best improvement strategy, which means that all the solutions in the neighborhood are explored and the best one is returned. There are other local search strategies, including a first improvement strategy, which almost always takes less time to execute because it returns the first solution that is better than the current one without exploring the neighborhood further.

Neighborhood change step: There are multiple ways in which one can change the neighborhood: sequential, cyclic, pipe, and skewed neighborhood changes [23]. We use the sequential change: after the local search has found a locally optimal solution for the current neighborhood structure, we check whether it is actually better than the current solution. If it is, we make that solution our current solution and start exploring its neighborhood. If it is not, we proceed to look for better solutions in the next neighborhood (increment k).

The VNS algorithm: Now that we have discussed the individual parts, we can give a full description of the VNS algorithm. We start the search from a current state s_0. Here, k is the index of the neighborhood structure, and K is the total number of neighborhood structures that we are willing to explore.
The current state is initially the solution from which we start the neighborhood search. Note that we use two different neighborhoods: the neighborhood structure N_k for the shaking operation and the 2-opt neighborhood for the local search operation. The main loop consists of three steps: first, shaking generates a new solution; then a local search finds an improved solution starting from it. Finally, we check whether this new solution is better than the current one and change the neighborhood accordingly. All three steps (shaking, local search, and neighborhood change) are performed in a while loop that terminates when a stopping criterion is met. Possible stopping criteria include a fixed number of iterations or a minimum required change in the value of the objective function.
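The loop described above can be sketched in Python, the language the authors report using. The sketch is only illustrative and is not the authors' implementation: it works on a toy closed-tour problem rather than the DARP, takes N_k to mean "apply k random swaps", and uses the classical segment-reversal 2-opt move for the local search. The function names (`shake`, `local_search_2opt`, `vns`, `euclidean_matrix`) and the parameters `K`, `max_iter`, and `seed` are assumptions made for this example.

```python
import math
import random


def euclidean_matrix(points):
    """Hypothetical helper: pairwise Euclidean distances between 2-D points."""
    return [[math.hypot(ax - bx, ay - by) for (bx, by) in points]
            for (ax, ay) in points]


def tour_length(tour, dist):
    """Objective f: total length of the closed tour."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))


def shake(tour, k, rng):
    """Shaking: draw a random solution from N_k(x), assumed here to be k random swaps."""
    new = tour[:]
    for _ in range(k):
        i, j = rng.sample(range(len(new)), 2)
        new[i], new[j] = new[j], new[i]
    return new


def local_search_2opt(tour, dist):
    """Best-improvement local search: scan the whole 2-opt neighborhood,
    move to its best member, and stop when no neighbor is strictly better."""
    best, best_len = tour[:], tour_length(tour, dist)
    while True:
        cand_best, cand_len = best, best_len
        n = len(best)
        for i in range(n - 1):
            for j in range(i + 1, n):
                cand = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                length = tour_length(cand, dist)
                if length < cand_len - 1e-12:
                    cand_best, cand_len = cand, length
        if cand_len < best_len - 1e-12:
            best, best_len = cand_best, cand_len
        else:
            return best


def vns(initial, dist, K=3, max_iter=100, seed=0):
    """Basic VNS: shaking, local search, and sequential neighborhood change."""
    rng = random.Random(seed)
    current = local_search_2opt(initial, dist)
    k = 1
    for _ in range(max_iter):  # stopping criterion: fixed number of iterations
        candidate = local_search_2opt(shake(current, k, rng), dist)
        if tour_length(candidate, dist) < tour_length(current, dist) - 1e-12:
            current, k = candidate, 1  # improvement: accept and return to N_1
        else:
            k = k + 1 if k < K else 1  # no improvement: try the next neighborhood
    return current
```

For the four corners of a unit square visited in the crossed order [0, 2, 1, 3], the sketch returns a tour of length 4, the perimeter of the square. A first-improvement variant would return from the inner loops as soon as one better neighbor is found.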
23.2.2 Constraint Satisfaction Problems and Constraint Programming

In a constraint satisfaction problem (CSP), the state of a group of objects must satisfy a set of constraints or limitations. More formally, in a constraint satisfaction problem, we have n variables X_1, ..., X_n, and every variable has a set of possible values called its domain. We also have a set of constraints that these values must satisfy.
23 Combining Variable Neighborhood Search and Constraint …
For example, a simple constraint could be that X_1 cannot have the value 3. Such a constraint is called a unary constraint because it involves only a single variable. We can also have binary constraints that involve two variables; for example, X_2 must be greater than X_3. In general, a constraint that involves k variables is called a k-ary constraint. The goal in a CSP is to find a consistent set of values for all of the variables so that none of the constraints is violated.

Backtracking: The basic backtracking search constructs a partial solution by selecting values for variables until it reaches a dead end, at which point the partial solution cannot be extended any further. When this happens, it undoes its most recent decision and tries a different value. This is done systematically to ensure that all options are explored.

Constraint Propagation: In the basic form described above, backtracking is equivalent to brute-force search. It is conceptually simple but usually not efficient enough to solve larger problems. For this reason, backtracking search is usually combined with techniques that narrow down the search space while backtracking explores the search tree. Such techniques for modifying a CSP are referred to as constraint propagation (CP) techniques. More precisely, they are procedures that enforce some form of local consistency, that is, a set of conditions on the consistency of a group of variables or constraints. Constraint propagation serves two purposes. First, it transforms a problem into an equivalent one that is easier to solve. Second, it can detect whether a partial problem is satisfiable or unsatisfiable. Constraint propagation is the process of communicating the domain reduction of a decision variable to all of the constraints that are stated over this variable. More domain reductions may occur as a result of this process.
The affected constraints are then informed of those domain reductions. This process is repeated until no variable domain can be reduced any further, or until a domain becomes empty and a failure is reported.

Forward Checking: The simplest way to avoid dead ends in the backtracking search is to recognize early that a partial solution cannot lead to a complete solution. To do that, forward checking tests only the constraints between the current variable and the future (not yet assigned) variables, which is why it is regarded as a restricted form of arc consistency. When a value is assigned to the current variable, every value in the domain of a "future" variable that conflicts with this assignment is removed from that domain. The benefit is that if the domain of a future variable becomes empty, the current partial solution is known to be inconsistent. As a result, forward checking allows branches of the search tree that will lead to failure to be pruned earlier than with simple backtracking. The idea of forward checking can be extended to enforcing more general forms of consistency. The handbook of constraint programming [22] gives a comprehensive overview of constraint programming and the implementation of solvers. Kumar [16] gives an overview of algorithms used for CP problems. An overview of using CP solvers for scheduling problems is given by Da Col and Teppan [10].

Combining CP and VNS: CP can be added in three parts of a VNS algorithm:
Table 23.1 Comparison of VNS with CP against VNS without CP but with twice the runtime

Scenario   30 s with CP versus 60 s (%)   100 s with CP versus 200 s (%)
a4-16      100                            100
a4-24       99                            100
a4-32       95                            100
a4-40       94                             97
a4-48       98                             92
a5-40      112                            100
a5-50       98                             95
a5-60       97                             97
a6-48       99                             88
a6-60       98                             97
a6-72      100                             97
a7-56       99                             94
a7-70      100                             95
a7-84      100                             98
Benchmark scenarios from Cordeau [7]. The first comparison gives the percentage of the average objective values from VNS with CP and a runtime of 30 s compared to the average objective from a VNS without CP but with 60 s runtime. A value of 100% would mean that they have the same average objective values when rounding to integer values. A value of 50% would mean that the VNS without CP is twice as good in terms of the average objective value. The second comparison is between VNS with CP and a runtime of 100 s against VNS without CP and a runtime of 200 s
1. To find the initial solution
2. In the shaking phase, when generating new solutions
3. To restrict the variable neighborhood for the local search

For solving the DARP, we propose in this work an algorithm based on a basic variable neighborhood search that uses CP for 1 and 2.
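The backtracking search with forward checking described in Sect. 23.2.2 can be illustrated with a minimal Python sketch. It is not the CP solver used in this work; the representation of binary constraints as a dictionary of predicates and the names `forward_check` and `backtrack` are assumptions made for this example. The toy problem reuses the two constraints mentioned in the text: X1 must not be 3 (unary, applied by filtering the domain up front) and X2 must be greater than X3 (binary).

```python
def forward_check(var, value, domains, constraints, assignment):
    """Remove from the domains of unassigned ('future') variables every value
    that conflicts with var=value. Returns the reduced domains, or None as
    soon as some future domain becomes empty (dead end detected early)."""
    new_domains = {v: list(d) for v, d in domains.items()}
    new_domains[var] = [value]
    for (a, b), ok in constraints.items():
        if a == var and b not in assignment:
            new_domains[b] = [w for w in new_domains[b] if ok(value, w)]
            if not new_domains[b]:
                return None
        elif b == var and a not in assignment:
            new_domains[a] = [w for w in new_domains[a] if ok(w, value)]
            if not new_domains[a]:
                return None
    return new_domains


def backtrack(domains, constraints, assignment=None):
    """Backtracking search that assigns variables in dictionary order and
    prunes with forward checking after every assignment."""
    assignment = dict(assignment or {})
    unassigned = [v for v in domains if v not in assignment]
    if not unassigned:
        return assignment
    var = unassigned[0]
    for value in domains[var]:
        assignment[var] = value
        pruned = forward_check(var, value, domains, constraints, assignment)
        if pruned is not None:
            result = backtrack(pruned, constraints, assignment)
            if result is not None:
                return result
        del assignment[var]  # undo the decision and try the next value
    return None


# Toy CSP: the unary constraint X1 != 3 is applied by filtering the domain.
domains = {
    "X1": [v for v in (1, 2, 3) if v != 3],
    "X2": [1, 2, 3],
    "X3": [1, 2, 3],
}
constraints = {("X2", "X3"): lambda x2, x3: x2 > x3}  # binary constraint
solution = backtrack(domains, constraints)
```

Here the assignment X2 = 1 is rejected immediately because it empties the domain of X3, so the search never descends into that branch; the first solution found is X1 = 1, X2 = 2, X3 = 1.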
23.3 Results

Two algorithms were implemented: a basic VNS with parallel cheapest insertion [6] for finding the initial solution, and a VNS that integrates CP. The algorithms were implemented in Python. Google OR-Tools [15] was used as the CP solver, for parallel cheapest insertion, for 2-opt, and for the local search with best improvement. All simulation scenarios were run on Google Colab notebooks with a Tesla K80 GPU. To evaluate the possible benefits of integrating CP into a VNS algorithm, we compared the algorithms on the benchmark datasets by Cordeau [7]. For each scenario, each of the two algorithms was run 10 times, and the average of the resulting 10 objective values was taken.
The results of the comparison indicate that the difference between the two algorithms is small if both algorithms are given long runtimes. However, the results in Table 23.1 show that integrating CP has the potential to speed up finding good solutions if the runtimes are small. In particular, for small runtimes, the algorithm with CP is almost twice as fast and therefore achieves results that are comparable to those of the algorithm without CP with twice the runtime. One possible explanation is that good initial solutions and efficient shakings are important early on to approach a good solution quickly; however, with enough runtime, the quality of the initial solution and the details of the shaking become less important in a VNS.
23.4 Discussion and Conclusion

Dial-a-ride problems are important in multiple applications in operations research. This paper has described a novel algorithm that combines a variable neighborhood search with constraint propagation to solve this problem. In initial simulations, the algorithm is shown to be more efficient than the basic VNS when runtimes are small. In the future, we hope to perform more expensive simulations to identify more precisely in what situations integrating CP in a VNS is advantageous. There are many variants of the VNS algorithm, such as the reduced VNS, the skewed VNS, the primal-dual VNS, and many more. In this work, we have integrated CP only in the basic form of the VNS. Exploring the combination of CP with other variants of VNS is a topic for further research.
References

1. Ajayan, S., Dileep, A., Mohan, A., Gutjahr, G., Sreeni, K., Nedungadi, P.: Vehicle routing and facility-location for sustainable lemongrass cultivation. In: 2019 9th International Symposium on Embedded Computing and System Design (ISED), pp. 1–6. IEEE, New York (2019)
2. Anbuudayasankar, S., Ganesh, K., Lee, T.R.: Meta-heuristic approach to solve mixed vehicle routing problem with backhauls in enterprise information system of service industry. In: Enterprise Information Systems: Concepts, Methodologies, Tools and Applications, pp. 1537–1552. IGI Global (2011)
3. Baugh, J.W., Jr., Kakivaya, G.K.R., Stone, J.R.: Intractability of the dial-a-ride problem and a multiobjective solution using simulated annealing. Eng. Optim. 30(2), 91–123 (1998)
4. Bergvinsdottir, K.B., Larsen, J., Jørgensen, R.M.: Solving the dial-a-ride problem using genetic algorithms. Informatics and Mathematical Modelling, Technical University of Denmark, DTU (2004)
5. Bräysy, O.: A reactive variable neighborhood search for the vehicle-routing problem with time windows. INFORMS J. Comput. 15(4), 347–368 (2003)
6. Bräysy, O., Gendreau, M.: Vehicle routing problem with time windows, part I: route construction and local search algorithms. Transp. Sci. 39(1), 104–119 (2005)
7. Cordeau, J.F.: A branch-and-cut algorithm for the dial-a-ride problem. Oper. Res. 54(3), 573–586 (2006)
8. Cordeau, J.F., Laporte, G.: A Tabu search heuristic for the static multi-vehicle dial-a-ride problem. Transp. Res. Part B: Methodol. 37(6), 579–594 (2003)
9. Croes, G.A.: A method for solving traveling-salesman problems. Oper. Res. 6(6), 791–812 (1958)
10. Da Col, G., Teppan, E.C.: Industrial size job shop scheduling tackled by present day CP solvers. In: International Conference on Principles and Practice of Constraint Programming, pp. 144–160. Springer, Berlin (2019)
11. Gendreau, M., Potvin, J.Y., Bräysy, O., Hasle, G., Løkketangen, A.: Metaheuristics for the vehicle routing problem and its extensions: a categorized bibliography. In: The Vehicle Routing Problem: Latest Advances and New Challenges, pp. 143–169. Springer, Berlin (2008)
12. Gutjahr, G., Kamala, K.A., Nedungadi, P.: Genetic algorithms for vaccination tour planning in tribal areas in Kerala. In: 2018 International Conference on Advances in Computing, Communications and Informatics (ICACCI), pp. 938–942. IEEE, Berlin (2018)
13. Gutjahr, G., Krishna, L.C., Nedungadi, P.: Optimal tour planning for measles and rubella vaccination in Kochi, South India. In: 2018 International Conference on Advances in Computing, Communications and Informatics (ICACCI), pp. 1366–1370. IEEE, Berlin (2018)
14. Healy, P., Moll, R.: A new extension of local search applied to the dial-a-ride problem. Eur. J. Oper. Res. 83(1), 83–104 (1995)
15. Kruk, S.: Practical Python AI Projects: Mathematical Models of Optimization Problems with Google OR-Tools. Apress (2018)
16. Kumar, V.: Algorithms for constraint-satisfaction problems: a survey. AI Mag. 13(1), 32 (1992)
17. Malairajan, R., Ganesh, K., Punnniyamoorthy, M., Anbuudayasankar, S.: Decision support system for real time vehicle routing in Indian dairy industry: a case study. Int. J. Inf. Syst. Supply Chain Manage. (IJISSCM) 6(4), 77–101 (2013)
18. Parragh, S.N., Doerner, K.F., Hartl, R.F.: A survey on pickup and delivery models part II: transportation between pickup and delivery locations. J. für Betriebswirtschaft 58, 81–117 (2006)
19. Parragh, S.N., Doerner, K.F., Hartl, R.F.: Variable neighborhood search for the dial-a-ride problem. Comput. Oper. Res. 37(6), 1129–1138 (2010)
20. Potvin, J.Y., Kervahut, T., Garcia, B.L., Rousseau, J.M.: The vehicle routing problem with time windows part I: Tabu search. INFORMS J. Comput. 8(2), 158–164 (1996)
21. Rao, T.S.: An ant colony and simulated annealing algorithm with excess load VRP in a FMCG company. In: IOP Conference Series: Materials Science and Engineering, vol. 577, p. 012191. IOP Publishing (2019)
22. Rossi, F., Van Beek, P., Walsh, T.: Handbook of Constraint Programming. Elsevier (2006)
23. Toth, P., Vigo, D.: The Vehicle Routing Problem. SIAM (2002)
24. Voudouris, C., Edward, T.: Guided local search. Technical report CSM-247. Department of Computer Science, University of Essex, UK (1995)
Chapter 24
Decentralization of Traditional Systems Using Blockchain

Harsh Mody, Harsh Parikh, Neeraj Patil, and Kumkum Saxena
Abstract Blockchain is considered to be one of the paradigm-shifting technologies of the 21st century. Before the advent of Blockchain, a practical Peer-to-Peer computing system seemed out of reach because of key issues with security, privacy, and scaling. However, with new inventions in this field, such as smart contracts and new and improved secure protocols, this dream does not seem that far-fetched after all. Blockchain has garnered considerable attention in recent years, which has generated a wide range of applications. Blockchain is the front runner for changing the Peer-to-Peer network system and providing a safe and secure way to carry out transactions. Proof of work (PoW) is a decentralized consensus technique that requires network participants to spend time solving a complex mathematical puzzle to prevent anyone from manipulating the system. The primary problem with this technology is the lack of awareness among people; the general public still confuses Blockchain with Bitcoin. Even under these conditions, several companies have demonstrated use cases of this technology in various fields such as finance, supply chains, smart contracts, and many other areas.

Keywords Blockchain · Peer-to-Peer · Decentralization · Bitcoin · Cryptocurrency · Smart contracts · Security analysis
H. Mody (B) · H. Parikh · N. Patil · K. Saxena
Thadomal Shahani Engineering College, University of Mumbai, Mumbai 400050, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_24
H. Mody et al.
24.1 Introduction

The concept of Blockchain was introduced by Nakamoto, the mastermind behind the creation of Bitcoin [1]. In 2008, he published his research on a cryptography mailing list hosted at metzdowd.com. He called this paper "Bitcoin: A Peer-to-Peer Electronic Cash System" [2]. This was the foundation that initiated a burst of new applications of Blockchain technology in all sectors where decentralization was required. Since this technology is only about a decade old, there is still a lot of research going on in this domain, both to improve the current technology and to find new applications. Decentralization can also remove inefficiencies in a centrally managed bureaucratic system and help decision makers make better choices, as all citizens can take part in the decision making of the country.

A Blockchain is a collection of objects called blocks. A Blockchain can be visualized as a kind of traditional database in which people store large amounts of data. One key aspect in which a Blockchain differs from traditional databases is how the data is structured and stored. Each block in a Blockchain is designed to hold a fixed set of information that consists of the hash value of the previous block, the timestamp of the time when the block was created, and some other data about the transactions and mining. These blocks are then chained onto previously mined blocks, forming a chain of blocks called the Blockchain. The Genesis Block is the first block in any Blockchain. All the blocks in the Blockchain follow the same data structure as the Genesis Block. It is used to initiate the Blockchain and is initialized with certain fixed data (Fig. 24.1). It is practically infeasible to alter a Blockchain because it uses a cryptographic hash function, which is a one-way function that cannot feasibly be inverted.
It is computationally infeasible to invert the hash and recover the original text by any practical amount of computation or cryptanalysis. While it is not possible to alter a Blockchain itself, creating forks is, however, possible. A fork is a divergence from the current path of the Blockchain network. If the users of this technology grow dissatisfied with the functionality offered by the current state of the Blockchain, a fork is created in the existing implementation of the Blockchain to add additional features. This can also happen if there is a change in the network protocol used by the P2P network. Most forks are short-lived and usually arise from disagreement between peers on the Blockchain network, but some forks are more permanent, such as a fork which adds additional
Fig. 24.1 Blockchain blueprint
24 Decentralization of Traditional Systems Using Blockchain
Fig. 24.2 Hash rate owned in BTC by different countries [3]
features to the network, such as new security features to prevent attacks or changes that reduce mining time.

Figure 24.2 shows the hash rate owned in BTC (Bitcoin) mining by different countries. The term "hash rate" refers to the total cumulative processing power used to mine and process transactions on a Proof of Work Blockchain. Current financial institutions are usually monitored and regulated by the governing authorities of their respective countries. However, they have been accused many times of manipulating currencies across the globe in order to gain more influence. Blockchains are typically managed by a P2P (Peer-to-Peer) network, i.e., a distributed networking system which divides workloads between peers on a decentralized network. Each peer has equal privileges, holds a full copy of the data, which is cryptographically protected against modification, and is responsible for verifying the authenticity of the data. Any modification of this data changes the records stored in that copy of the Blockchain, and the modified copy is rejected when it is verified against the other copies on the network.
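The block structure described in this section (hash of the previous block, timestamp, and payload data) and the way tampering is detected can be illustrated with a short Python sketch. This is a simplified model, not the Bitcoin block format: the field names, the use of SHA-256 over a JSON encoding, and the functions `make_block`, `make_chain`, and `is_valid` are assumptions chosen for this example.

```python
import hashlib
import json
import time

HASHED_FIELDS = ("index", "timestamp", "data", "prev_hash")


def block_hash(block):
    """SHA-256 over the block contents, excluding the stored hash itself."""
    payload = json.dumps({k: block[k] for k in HASHED_FIELDS}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def make_block(index, data, prev_hash, timestamp=None):
    """Create a block that records the hash of its predecessor."""
    block = {
        "index": index,
        "timestamp": time.time() if timestamp is None else timestamp,
        "data": data,
        "prev_hash": prev_hash,
    }
    block["hash"] = block_hash(block)
    return block


def make_chain(records):
    """Start from a genesis block with fixed data and chain one block per record."""
    chain = [make_block(0, "genesis", "0" * 64, timestamp=0)]
    for i, record in enumerate(records, start=1):
        chain.append(make_block(i, record, chain[-1]["hash"]))
    return chain


def is_valid(chain):
    """A chain is valid if every stored hash matches the block contents and
    every block points to the hash of the block before it."""
    for i, block in enumerate(chain):
        if block["hash"] != block_hash(block):
            return False
        if i > 0 and block["prev_hash"] != chain[i - 1]["hash"]:
            return False
    return True
```

Changing the data of any block invalidates its stored hash; recomputing that hash in turn breaks the `prev_hash` link of the next block, so an attacker would have to rewrite every later block as well. This is the sense in which a peer's modified copy no longer agrees with the copies held by other peers.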
24.2 Blockchain Security

The participation and access levels of the different entities determine the security level of a Blockchain system. Based on this, Blockchain networks are either public or private in nature. A public Blockchain maintains a public ledger of transactions, and consensus is achieved via the mining of blocks. Any external user can get access to the
transaction ledger as it is available to all users. In a private Blockchain, users need to confirm their identity for membership, and access to the important data is given only to the trusted members of the organization. In a private Blockchain, consensus is achieved only when the authorized users verify the transaction; this is also known as selective endorsement. Tight control and regulation can be achieved with private and permissioned networks, but throughput is low. Public and permissionless networks can have greater decentralization, and the throughput is high since the network is publicly available. A Blockchain can suffer many attacks if it contains vulnerabilities, especially if it is a public Blockchain. Such vulnerabilities are fewer in a private Blockchain system, as external users cannot get access to the transaction ledger.

The Proof of Work (PoW) algorithm is one of the methods for reaching consensus in a Blockchain network. In a network using the PoW algorithm, consensus is achieved and a new block is added to the Blockchain only when a node in the network solves a complex mathematical problem. The miner node which solves the PoW puzzle receives a reward in the form of cryptocurrency. Another method for reaching consensus is Proof of Stake (PoS). In this method, a miner or node can validate blocks according to how many coins it holds. The miner with the highest stake has the greatest weight in reaching consensus. Proof of Elapsed Time (PoET) is another Blockchain consensus algorithm. PoET uses a randomly generated elapsed time to decide mining rights and block winners on a Blockchain network.
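The two consensus ideas above can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not the consensus code of any real network: the puzzle is "find a nonce such that the SHA-256 digest of the block data and nonce starts with a given number of zero hex digits", and the PoS rule simply picks a validator with probability proportional to its stake. The function names and the `difficulty` parameter are hypothetical.

```python
import hashlib
import random


def pow_digest(block_data, nonce):
    """Digest that miners try to push below the difficulty target."""
    return hashlib.sha256(f"{block_data}{nonce}".encode("utf-8")).hexdigest()


def mine(block_data, difficulty=3):
    """Proof of Work: brute-force a nonce whose digest has `difficulty`
    leading zero hex digits. Expected work grows as 16 ** difficulty."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = pow_digest(block_data, nonce)
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


def verify(block_data, nonce, difficulty=3):
    """Checking a solution costs one hash, unlike finding it."""
    return pow_digest(block_data, nonce).startswith("0" * difficulty)


def pick_validator(stakes, rng):
    """Toy Proof of Stake: choose a validator with probability
    proportional to the number of coins it holds."""
    names = list(stakes)
    weights = [stakes[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]
```

The asymmetry between `mine` and `verify` is the point of PoW: solutions are expensive to find but cheap for every other peer to check. In the PoS sketch, a node holding most of the coins is selected most of the time, which matches the statement that the highest stake carries the greatest weight.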
24.2.1 Security Attacks

Eclipse Attack. In a Blockchain network, a node uses a peer selection strategy to obtain its copy of the distributed ledger: it selects some peer nodes from which to get the ledger. In an eclipse attack, the attacker sets up malicious nodes so that the victim's node depends entirely on the attacker's nodes for the distributed ledger. The attacker can then keep a fake ledger on all of its nodes so that the victim also receives the fake ledger, thus manipulating the victim's transactions. Although there may be millions of people running the software, a device cannot connect to all of them. The attacker ensures that all of the victim's peer connections are made to nodes controlled by the attacker. To do so, the attacker first floods the victim with its own IP addresses, to which the victim will most probably connect when its software restarts. A restart can either be forced by a DDoS attack, or the attacker has to wait for one to occur. Once the eclipse attack has been performed, a handful of further attacks become possible.

Sybil Attack. An attacker can create many fake identities in a Blockchain network. If the attacker creates a sufficient number of malicious nodes, they may be able to outvote the honest nodes in the network. They can also limit a node's reception or transmission, essentially cutting other honest nodes off from the network. Bitcoin counters this threat by requiring extensive PoW and offering a good incentive for each mined block.
Selfish Mining. When an attacker mines a block from the transaction pool but does not broadcast it to other nodes, this is known as selfish mining. The attacker continues to mine on top of the withheld block and keeps it from being broadcast to the peer nodes. As a result, the attacker has more PoW than the other nodes. Hence, the attacker receives additional rewards while the other miners are left with incomplete solutions. Although this attack may be feasible in cryptocurrencies with unequally distributed pool shares, it is extremely difficult to execute in Bitcoin. Monacoin, a Japanese cryptocurrency, was subjected to a selfish mining attack, resulting in market value losses of up to $90,000.

51% Attack. A 51% attack happens when a miner or group of miners tries to control over 50% of the Blockchain network. Once they control more than 50% of the network, they can prevent the creation of blocks by preventing the confirmation of transactions. They can also reverse transactions which were already completed. Numerous 51% attacks have taken place in different Blockchain networks in the past few years since the popularization of cryptocurrency and Blockchain technology. As a miner's computing power increases, the probability that the miner will find the solution to the PoW increases. So, a miner with greater computing power will keep on mining more and more blocks. Hence, if the attacker gains control of more than 50% of the hash rate, the attacker can control the Blockchain to a significant degree. They can reverse previously mined blocks in the Blockchain and thus double spend coins. 51% attacks can destroy the whole Blockchain network.

Private Key Security Attack. An asymmetric key cryptographic system is used in Blockchain for encryption and decryption; every user has two keys, a public key and a private key. The private key is used as the decryption key. Every user has a unique private key.
The receiver has to broadcast their public key to receive personal data from other users. To avoid spoofing of the public key, there usually exists a third party between the users which authenticates the users' public keys to each other. An attacker cannot recover the plaintext from the ciphertext without obtaining the private key of the user, as the key space is usually very large and the cryptographic algorithm used is very strong. The private key of a user is stored in a wallet and is used to carry out transactions with cryptocurrencies. If an attacker finds a vulnerability in the digital signature algorithm and is able to determine the private key of the user, they can fully exploit the user's account and carry out any number of transactions using the key. Tracking such an attacker is almost impossible.

Liveness Attack. In a liveness attack, the attacker tries to delay a transaction for as long as possible. It generally proceeds in three stages. The first stage is the Preparation Phase, in which the attacker tries to gain an advantage over the honest miners by selfish mining. The next stage is the Denial Phase, in which the attacker prevents the block from being added to the public chain by withholding it. The third and last stage is the Blockchain Delay Phase, in which the attacker tries to slow the growth of the Blockchain by delaying transactions on the network.

Double Spending Attack. A double spending attack happens when the same cryptocurrency is spent twice. A cryptocurrency is a digital file and there is no centralized authority to
check this, so duplication of cryptocurrency could otherwise be done easily. The Bitcoin Blockchain has a protocol, inspired by the traditional currency system, to counter double spend attacks on the Blockchain network. The records of payments are written as transactions in the decentralized network in the form of a Blockchain. This Blockchain is the status-quo Blockchain, meaning that it is the known, verified chain containing all the valid transactions on the network so far. Thus, to double spend, attackers need to replace the status-quo chain in the network with their own branched chain after receiving the services.

The DAO Attack. DAO stands for Decentralized Autonomous Organization. The aim of such an organization is to create rules and decision-making tools which remove the need for documents and governance by people, by creating a structure with decentralized control. A DAO works on the following principle: a few people, such as the developers, write the smart contracts which govern the functioning of the organization. There is a funding period in which people add funds to the organization by purchasing tokens from it. Just as a new company on the stock market raises funds through an IPO, here there is an ICO in which people fund the organization by giving it cryptocurrency. After the funding is done, the DAO's duty is to approve business proposals that determine how the organization is going to spend the money. Members who bought into the ICO are allowed to vote on these decisions. A DAO is usually just software running on a decentralized network, as in the case of the Ethereum network.
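The link between hash-rate share and the 51% and double spending attacks described in this section can be made concrete with a small calculation. The sketch below uses the gambler's-ruin estimate from the Bitcoin paper [2], which is not derived in this chapter: if the attacker controls a fraction q of the hash rate and the honest network a fraction p = 1 - q, the probability that an attacker who is z blocks behind ever catches up with the status-quo chain is 1 when q >= p and (q/p)^z otherwise. The function name `catch_up_probability` is an assumption made for this example.

```python
def catch_up_probability(q, z):
    """Probability that an attacker with hash-rate share q ever catches up
    from z blocks behind the honest chain (gambler's-ruin estimate)."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be a share between 0 and 1")
    if z < 0:
        raise ValueError("z must be a non-negative number of blocks")
    p = 1.0 - q  # share of the honest network
    if q >= p:
        return 1.0  # with half or more of the hash rate, success is certain
    return (q / p) ** z


def confirmations_needed(q, risk):
    """Smallest number of blocks z for which the catch-up probability
    drops to `risk` or below; only meaningful for q < 0.5."""
    if q >= 0.5:
        raise ValueError("no number of confirmations is enough for q >= 0.5")
    z = 0
    while catch_up_probability(q, z) > risk:
        z += 1
    return z
```

Under this estimate, the chance of replacing the status-quo chain falls off exponentially with the number of blocks mined on top of a transaction as long as the attacker holds a minority of the hash rate, whereas an attacker with a majority succeeds with certainty given enough time.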
24.3 Applications of Blockchain

Internet of Things. The Internet of Things (IoT) is a group of devices connected with one another over the Internet. These devices are used in home automation, healthcare systems, security systems, smart TVs, etc. Most IoT systems currently rely on centralized systems which are usually connected to a single cloud server. Although these solutions work, they are not a secure way to use automated devices. Blockchain-based IoT devices would instead be able to obtain information, communicate with each peer, and store data for a large number of devices. The benefits that the implementation of Blockchain in the IoT field would bring are decentralization, anonymity, and security. This would eliminate the need for a centralized server. Decentralization of IoT would increase trust among users of the technology, as there is no single owner of the data. Blockchain and IoT could be a combination that brings out the best of both worlds.

Smart Contracts. Smart contracts are simple programs or network protocols that are intended to run automatically over the Blockchain network when the conditions of a contract or agreement are fully satisfied. With smart contracts, every transaction, assignment, and payment can have a digital record and signature that is identified, collected, and shared between the two parties bound by the contract. A smart contract
allows us to use Blockchain and encryption technology to create a transparent and immutable contract in a decentralized environment. Smart contract execution binds multiple parties to an agreement, much as a legal contract does, without the need for a trusted third party. For example, when dealing with high-value assets, smart contracts eliminate the need for bankers, lawyers, etc. Traditional contracts are costly because the middlemen must be paid for their services. Smart contracts, by comparison, have no intermediaries, and the only costs come from the underlying infrastructure of the Blockchain governing the smart contract.

Voting System. Blockchain technology can help a government implement a voting system that is unchangeable, transparent to the public, and resistant to tampering. Using Blockchain in the voting system offers an effective way of conducting fair elections. For this system to work, voters must send their details and proof of their identity to the concerned authorities to prevent fake votes. As in the traditional system, a few days are allocated to conduct the voting. In a Blockchain-based voting system, when someone votes, the concerned authorities can access the voter's Blockchain to ensure that the voter is not committing any fraud. The polling station counts the vote if it is valid and rejects it if it is invalid, thus protecting the system from fraud. After the vote is cast, it is encrypted, stored, and becomes part of a transaction on the Blockchain. Once voters have cast their votes, no attacker can change the result, because the Blockchain is immutable and secure.

Health Care System. Blockchain technology provides a framework for applications in the medical field, ranging from storing medical records to managing the logistics of pharmaceuticals.
One of the major ways Blockchain can revolutionize this field is by preserving the medical history of the patient. Over time, medical records grow longer and more complex, and every hospital has a different way of storing them. There have been several cases where hospitals were reluctant to share a patient's medical details with other hospitals. With Blockchain, a common ledger can act as a databank for all medical records, which ensures that health care workers and researchers can obtain timely, accurate, and comprehensive information. Another use for this technology is in the pharmaceutical industry. With Blockchain, every part of the manufacturing process can be monitored, which is useful for tracking costs, labor, and even waste and emissions at every point.

Financial System
Blockchain technology can improve the entire financial sector by allowing data to be shared, stored, and transmitted more transparently and securely. This improves the ability to do business because transactions happen quickly and securely, even across borders. At the moment, international payments are an expensive and time-consuming business for banks. Many errors are possible, and transparency and traceability are not always present in a traditional system. Many of these problems could be reduced by shifting to a Blockchain. If the capabilities of Blockchain technology are properly harnessed, it can potentially change the whole financial sector.
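The voting, health care, and financial use cases above all rely on the same property: a record, once written, cannot be changed without detection. As a minimal sketch of that idea (not the design of any particular platform), the following Python code links blocks by hash and rejects a second vote from the same voter. The names `block_hash`, `append_block`, `cast_vote`, and `is_valid`, and the voter identifiers, are hypothetical and chosen only for illustration.

```python
import hashlib
import json


def block_hash(index, data, prev_hash):
    """Hash the block contents together with the previous block's hash."""
    payload = json.dumps({"index": index, "data": data, "prev": prev_hash},
                         sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, data):
    """Append a record; each block commits to the block before it."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    index = len(chain)
    chain.append({"index": index, "data": data, "prev": prev_hash,
                  "hash": block_hash(index, data, prev_hash)})


def cast_vote(chain, voter_id, choice):
    """Accept a vote only if this voter has not voted before."""
    if any(block["data"]["voter"] == voter_id for block in chain):
        return False
    append_block(chain, {"voter": voter_id, "choice": choice})
    return True


def is_valid(chain):
    """Recompute every hash and check every link to the previous block."""
    prev_hash = "0" * 64
    for block in chain:
        if block["prev"] != prev_hash:
            return False
        expected = block_hash(block["index"], block["data"], block["prev"])
        if block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


ledger = []
print(cast_vote(ledger, "voter-001", "A"))
print(cast_vote(ledger, "voter-002", "B"))
print(cast_vote(ledger, "voter-001", "B"))  # repeat vote by the same voter
print(is_valid(ledger))                     # chain as originally written

ledger[0]["data"]["choice"] = "B"           # simulate tampering with a stored vote
print(is_valid(ledger))                     # stored hash no longer matches
```

A real system would additionally encrypt the ballots, authenticate voters, and replicate the chain across many nodes; the sketch covers only the tamper-evidence step.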
H. Mody et al.
Fig. 24.3 The biggest mining pools and their share in mining BTC [4]
Identity Management
Identity management has been a major concern since the advent of the Internet, and for many users staying anonymous while surfing the web is of paramount importance. If a user's identity falls into the hands of a malicious party, their credentials can be stolen. Blockchain technology offers a solution in which the user's identity remains secure even when they share their credentials on the Internet. Properly implemented, Blockchain allows for self-sovereign identity, which inherits Blockchain properties such as security and transparency. One of the major advantages of this method is anonymous authentication, which offers strong security to the user. Self-sovereign identity is the concept that individuals or organizations store their own information and decide where and which part of their identity to share, without relying on any central databank that stores their IDs. The pie chart in Fig. 24.3 shows the biggest mining pools and their share in mining BTC; as it shows, the major share of mining cannot be attributed to any single pool.

Cryptocurrency
Cryptocurrencies such as Bitcoin, Litecoin, and Ethereum are some of the most popular applications of Blockchain. A major reason behind the sudden popularity of cryptocurrency is the development of exchanges where one can trade a cryptocurrency both for any other cryptocurrency and for traditional (fiat) currency. There have been significant issues in the traditional monetary system that uses credit and debit cards. The number of frauds committed every day puts individuals and organizations in substantial financial distress. Cryptocurrency algorithms are considered more secure and more reliable than traditional forms of payment like cash or cards, and cryptocurrency transactions can have much lower processing fees.
A cryptocurrency is like a traditional currency, but it is based on a distributed ledger in which multiple users participate as nodes. This form of decentralization allows cryptocurrencies to work outside the control of any central authority and makes transactions hard to trace to a real-world identity (Fig. 24.4).
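The idea of a ledger replicated across nodes can be illustrated with a toy example. The sketch below assumes that every node applies the same ordered list of transfers to its own copy and rejects any transfer the sender cannot cover; the class `ToyLedger`, the helper `replicas_agree`, and the account names are hypothetical, and real cryptocurrencies add signatures, mining, and network propagation on top of this.

```python
class ToyLedger:
    """One node's copy of the ledger: balances plus an ordered history."""

    def __init__(self, opening_balances):
        self.balances = dict(opening_balances)
        self.history = []

    def transfer(self, sender, receiver, amount):
        """Apply a transfer only if the sender can cover it."""
        if amount <= 0 or self.balances.get(sender, 0) < amount:
            return False
        self.balances[sender] -= amount
        self.balances[receiver] = self.balances.get(receiver, 0) + amount
        self.history.append((sender, receiver, amount))
        return True


def replicas_agree(replicas):
    """True when every node holds identical balances."""
    first = replicas[0].balances
    return all(node.balances == first for node in replicas)


# Three nodes start from the same state and replay the same transfers.
nodes = [ToyLedger({"alice": 10, "bob": 0}) for _ in range(3)]
for node in nodes:
    node.transfer("alice", "bob", 4)

# An overdraft is rejected independently by every node.
overdraft_results = [node.transfer("alice", "bob", 100) for node in nodes]

print(nodes[0].balances)
print(overdraft_results)
print(replicas_agree(nodes))
```

Because each node validates transfers by the same rule, no single node has to be trusted to keep the balances correct.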
Fig. 24.4 Mining rewards released by BTC this year [5]
The above diagram shows the mining rewards released by BTC by year. Because BTC is limited in supply and many people mine and hold BTC for long periods, it becomes increasingly scarce, which increases the price of the mined rewards over time.

Supply Chain Management
Using Blockchain in the supply chain can transform the whole process. The time required to move final products from manufacturers to consumers can be reduced significantly. One of the essential characteristics of Blockchain is transparency, which allows any employee to pull up the live position of goods being manufactured or to track them during transit. If there is a defect in a particular manufacturing lot, the company can trace back and find the reason behind the problem. This can be crucial in the food industry if an expired ingredient has entered the system and ended up in the final product: the company can recall all the products that were shipped to individual retailers so that they do not harm consumers (Table 24.1).
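The trace-back and recall scenario described above can be sketched as a query over recorded supply chain events. The example assumes that each stage of a lot's journey is logged as an event; the event fields, lot identifiers, and the functions `history_by_lot` and `retailers_to_recall` are hypothetical and stand in for whatever schema a real deployment would use.

```python
from collections import defaultdict

# Hypothetical event log; on a blockchain each event would be a transaction.
events = [
    {"lot": "L-100", "stage": "manufactured", "site": "plant-1"},
    {"lot": "L-100", "stage": "shipped", "site": "warehouse-2"},
    {"lot": "L-100", "stage": "delivered", "site": "retailer-7"},
    {"lot": "L-100", "stage": "delivered", "site": "retailer-9"},
    {"lot": "L-200", "stage": "manufactured", "site": "plant-1"},
    {"lot": "L-200", "stage": "delivered", "site": "retailer-7"},
]


def history_by_lot(event_log):
    """Group events by lot so that one lot can be traced end to end."""
    grouped = defaultdict(list)
    for event in event_log:
        grouped[event["lot"]].append(event)
    return grouped


def retailers_to_recall(event_log, lot):
    """List every retailer that received the given lot."""
    return sorted({event["site"] for event in event_log
                   if event["lot"] == lot and event["stage"] == "delivered"})


print([e["stage"] for e in history_by_lot(events)["L-100"]])
print(retailers_to_recall(events, "L-100"))
```

The value of putting such a log on a Blockchain is that every participant sees the same event history and none of them can quietly rewrite it.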
24.4 Conclusion and Future Trends
With the advent of more and better cryptocurrencies, and with older ones such as Bitcoin regularly reaching new highs, there is a lot of excitement among new buyers. Investing in these currencies is an investment in Blockchain technology itself. A few new and exciting developments that seem to be gaining traction are BaaS, i.e., Blockchain-as-a-Service, which is
Table 24.1 The applications of blockchain in traditional systems

| Applications | Problems with traditional method | Blockchain solution | Companies working in that field |
|---|---|---|---|
| Voting system | A significant problem in this field is security; hackers can change the outcome of the election | As security is one of the most significant properties of the Blockchain, it overcomes this issue | Agora |
| Currency | Counterfeit currency can damage the economy of the country by raising the inflation rate | Cryptocurrency is the safest form of transaction without the need for a trusted third party | Bitcoin, dogecoin |
| Internet of Things | IoT requires a system that is secure, decentralized, easy-to-implement, and transparent | Blockchain can basically fulfill all the requirements of IoT, ranging from decentralization using a distributed ledger to security from the encryption technology | Chronicled, Helium, Archtouch |
| Health care | The medical records of a patient are confined to the hospital and in case of emergency, it will take time to request their records | With the distributed ledger, the patients' medical information could be universally available | BurstIQ, medicalchain |
| Supply chain | The ability to track the product during production and even after selling can be crucial in the food industry | Blockchain provides a way to track the goods at any stage and even after they are sold, which reduces overhead costs | Amazon |
| Digital IDs or Physical IDs | In traditional identity, the user needs some authority for verification, and they do not have the control to share a part of their IDs | Self-sovereign identity changes the currently centralized physical identity as the users can now have full authority to their identity proofs | Microsoft |
used for the creation and management of cloud-based networks, and VC and SSI, i.e., Verifiable Credentials and Self-Sovereign Identity, a universal tool that helps in storing important documents. Decentralized Finance, or DeFi, is an exciting advancement because it eliminates the need for traditional banking intermediaries. Tokens such as NFTs have also seen a lot of investment. NFTs are used to generate a cryptographic digital asset for paintings, music, etc., which cannot be replicated, and they act as collectibles in the digital world. Blockchain, like any new technology, is a concept that first disrupts traditional systems, but over time it has the ability to boost the
growth of a network that encompasses both the old concepts and the new invention. We anticipate significant developments in Blockchain technology in the near future that increase performance and scalability, which is a distinct problem for public Blockchains and for implementations with highly specific uses. With continuing innovation in Blockchain technology and its use cases now extending beyond the financial sector, we see a lot of potential in the technology.
References
1. Wikipedia contributors: Satoshi Nakamoto. In: Wikipedia, The Free Encyclopedia. https://en.wikipedia.org/w/index.php?title=Satoshi_Nakamoto&oldid=1040396149 (24 August 2021)
2. Bitcoin.org (n.d.). Retrieved September 18, 2021, from https://bitcoin.org/bitcoin.pdf
3. Cambridge bitcoin electricity consumption index (CBECI) (n.d.). Ccaf.io. https://ccaf.io/cbeci/mining_map
4. Hashrate Distribution. https://www.Blockchain.com/pools (2017)
5. Miner Stats (n.d.). Retrieved January 19, 2022, from https://www.kaggle.com/jventrone/minerstats-2020 (2020)
Chapter 25
Design of a Secure and Smart Healthcare IoT with Blockchain: A Review
Trishla Kumari, Rakesh Kumar, and Rajendra Kumar Dwivedi
Abstract Health care is a data-intensive system in which a large amount of data is generated regularly. Patient data is used to provide health services, manage health research, generate medical records, and process medical insurance claims. Therefore, this critical patient data must be secured. Blockchain technology can be used to maintain security in such smart healthcare systems. While the focus of blockchain applications in practice has so far been on building distributed ledgers that hold virtual tokens, the momentum of this emerging technology has now expanded to providing security in healthcare systems too. With the growing popularity of the healthcare Internet of things (IoT), it is important to know that such technology-based smart systems can be supported with blockchain technology to provide security services to all related players, viz., patients, doctors, insurance companies, and researchers. This paper presents a state-of-the-art survey on securing healthcare IoT systems with blockchain technology.

Keywords Machine learning · Smart contract · Supply chain management · Internet of things (IoT) · Consensus mechanism
T. Kumari (B) · R. Kumar
Department of Computer Science & Engineering, MMMUT, Gorakhpur, India
e-mail: [email protected]
R. K. Dwivedi
Department of IT & CA, MMMUT, Gorakhpur, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_25

25.1 Introduction
Blockchain is a distributed ledger in which every node is connected to a network that shares information and value. One of the most important uses of blockchain is in the medical field. Blockchain is a ledger that keeps all transactions in digital form. It runs on a distributed peer-to-peer network and maintains a continuously growing, ordered list of records called blocks. Each block in the network contains a series of secure, signed transactions and is itself verified by the network using a validation method. Every copy of blockchain data is
T. Kumari et al.
distributed to all nodes in the network. Blockchain differs from conventional databases in that cryptographic algorithms are used to protect the data against any type of alteration. Any system responsible for managing and maintaining medical data must take into account the user's rights enshrined in applicable law [1, 2], such as the European General Data Protection Regulation (GDPR) EU 2016/679 regarding personal data protection. Applying these rules is a big challenge because a blockchain is immutable: public and private keys are used to create a chain of content that is static, consistent, and time-stamped. Two types of blockchain are used:
• Public mainchain
• Private sidechain.
Each trusted node holds a copy of the sidechain or the mainchain, depending on the type of node. For privacy reasons, the content on the chain consists only of links to permissions, health data, and other useful information. The participating organizations and devices generate a huge amount of data [3, 4], which is stored either on a host or in the cloud. According to the National Institute of Standards and Technology (NIST), the advantages of using blockchain technology are:
• The distributed nature of the digital ledger.
• Resistance to tampering.
• The impossibility of modifying published transactions.
Figure 25.1 shows the important features of blockchain technology. It provides high-level security, supports commitments between two parties in the form of a "smart contract," records data for future use, and reduces fraud in any digital marketplace.

Fig. 25.1 Blockchain technology
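The arrangement described above, in which the chain holds only links to health data while the data itself sits on a host or in the cloud, can be sketched as follows. This is an illustrative assumption about how such a link might be built, not the scheme of any surveyed paper: the dictionaries `off_chain_store` and `on_chain_entries`, the `offchain://` location format, and the functions `store_record` and `verify_record` are all hypothetical.

```python
import hashlib

off_chain_store = {}    # stands in for a host or cloud store
on_chain_entries = []   # stands in for entries written to the blockchain


def store_record(patient_ref, record_bytes):
    """Keep the record off-chain; put only a link and a digest on-chain."""
    digest = hashlib.sha256(record_bytes).hexdigest()
    location = f"offchain://{patient_ref}/{len(on_chain_entries)}"
    off_chain_store[location] = record_bytes
    on_chain_entries.append(
        {"patient_ref": patient_ref, "location": location, "sha256": digest}
    )
    return location


def verify_record(entry):
    """Check that the off-chain data still matches the on-chain digest."""
    data = off_chain_store.get(entry["location"])
    if data is None:
        return False
    return hashlib.sha256(data).hexdigest() == entry["sha256"]


location = store_record("patient-ref-7", b"blood pressure: 120/80")
entry = on_chain_entries[0]
print(verify_record(entry))      # untouched record matches its digest

off_chain_store[location] = b"blood pressure: 180/80"  # off-chain tampering
print(verify_record(entry))      # mismatch is detected
```

One consequence of this design is that the sensitive record can still be deleted from the off-chain store, which is relevant to the data protection requirements mentioned above, while the chain keeps only a reference and a digest.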
25 Design of a Secure and Smart Healthcare …
25.1.1 Motivation
Blockchain provides a solution by distributing the system across its participants, in the spirit of "the law is equal for all." Although it is not a universal solution to our problem, it is the first viable alternative that could fundamentally change the current paradigm.
25.1.2 Organization
The rest of the paper is organized as follows: A brief background of the existing system is given in Sect. 25.2. The survey of existing systems and their work is discussed in Sect. 25.3. Section 25.4 concludes the paper with some future directions.
25.2 Background
Before proceeding further, we present the importance of blockchain technology in health care.
(a) Use cases of blockchain in digital health care
Blockchain plays a very important role in a digital healthcare system as it provides transparency to the system. Here are some commercial applications of blockchain:
• Supply chain visibility: The biggest challenge is to ensure the authenticity of medical equipment. Blockchain-based systems track items from the production site through each phase of the supply chain, giving customers full visibility and transparency of the goods they purchase.
• Patient electronic health records: An electronic health record of a patient is created that can be accessed and modified only by authorized persons across multiple healthcare organizations. Each authorized person has a specific key that unlocks the data for a specific period.
• Smart contracts for insurance and supply chain payments: Smart contracts can significantly reduce disputes among the parties involved over claims for overpayment of prescriptions and other goods.
• IoT security for remote monitoring: The decentralized nature of blockchain helps IoT devices interact with each other directly, without a centralized server, which makes it harder to mount denial-of-service (DoS) and man-in-the-middle attacks.
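The statement that each authorized person can unlock a record only for a specific period can be illustrated with a small access-window check. This sketch leaves out the cryptographic keys entirely and models only the time limit; the functions `grant_access` and `can_access`, the record and user names, and the use of plain integer timestamps are assumptions made for illustration.

```python
def grant_access(grants, record_id, user, start, duration):
    """Record that `user` may read `record_id` from `start` until `start + duration`."""
    grants[(record_id, user)] = (start, start + duration)


def can_access(grants, record_id, user, now):
    """Allow access only to a granted user and only inside the granted window."""
    window = grants.get((record_id, user))
    if window is None:
        return False
    valid_from, valid_until = window
    return valid_from <= now < valid_until


grants = {}
# Hypothetical grant: one doctor, one record, a window of 3600 time units.
grant_access(grants, "record-42", "dr-rao", start=1000, duration=3600)

print(can_access(grants, "record-42", "dr-rao", now=2000))  # inside the window
print(can_access(grants, "record-42", "dr-rao", now=5000))  # window has expired
print(can_access(grants, "record-42", "dr-lee", now=2000))  # never granted
```

In a blockchain setting the grants themselves would be written as transactions, so that both the patient and the healthcare organizations can audit who was allowed to read what and when.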
Fig. 25.2 Challenges with blockchain technology
(b) Security and privacy challenges
Figure 25.2 shows the challenges with blockchain. Blockchain does not by itself provide privacy, because many nodes are connected to the network. Sometimes an attacker or a group of attackers compromises multiple nodes in a system, and if most of the nodes are compromised, the data can be altered. Another problem concerns private data. For example, if the hash values generated from a patient's private keys are kept in the block itself and a trusted physician knows the hash of a patient's private key, the physician may be able to obtain useful information, such as when the patient visits a hospital. This compromises patient privacy [5]. Since blockchains rely entirely on cryptographic algorithms to ensure integrity, the security and integrity of the entire network may be jeopardized if quantum computers become available. Another challenge is that many people are unaware of this technology: what it is, how it works, and why it came into existence.
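The privacy problem described above, where a fixed hash stored in each block lets someone who knows it follow a patient's visits, can be made concrete with a short sketch. The mitigation shown, deriving a different tag for every visit with a keyed hash, is one possible approach assumed here for illustration and is not taken from the surveyed papers; `patient_key`, `fixed_tag`, `per_visit_tag`, and `days_matching` are hypothetical names.

```python
import hashlib
import hmac

patient_key = b"hypothetical-patient-key"   # stands in for a patient's secret
visit_days = ["2022-01-03", "2022-02-14", "2022-03-21"]


def fixed_tag(key):
    """The same tag for every visit: a plain hash of the key."""
    return hashlib.sha256(key).hexdigest()


def per_visit_tag(key, visit_nonce):
    """A different tag for every visit: keyed hash over a per-visit value."""
    return hmac.new(key, visit_nonce, hashlib.sha256).hexdigest()


def days_matching(entries, known_tag):
    """What an observer who knows one tag can learn from the chain."""
    return [e["day"] for e in entries if e["tag"] == known_tag]


linkable = [{"tag": fixed_tag(patient_key), "day": d} for d in visit_days]
unlinkable = [
    {"tag": per_visit_tag(patient_key, str(i).encode("utf-8")), "day": d}
    for i, d in enumerate(visit_days)
]

known = fixed_tag(patient_key)
print(days_matching(linkable, known))     # every visit day is revealed
print(days_matching(unlinkable, known))   # nothing matches the known tag
```

Per-visit tags only address this one linkage; timing, location, and other metadata on the chain can still leak information.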
25.3 Literature Survey
Today, when a person goes to a hospital for treatment, he/she has to complete all the formalities required by the hospital, such as filling in personal and health details in a form, making an appointment with a particular doctor, and taking some tests related to the problem. Considering the form details,
tests, and appointment, the doctor provides proper treatment and gives a prescription. After treatment, the patient has to complete the payment procedure, which is done through Amazon Pay, Google Pay, PhonePe, UPI, etc. This consumes a lot of time, and all of these online transactions are handled by third parties. By using blockchain technology, we can avoid the involvement of third parties, and transferring money takes less time because of its peer-to-peer, decentralized nature.
(a) Blockchain-Based Key Management Scheme in Fog-Enabled IoT Systems
Chen et al. [6] proposed a blockchain-based secure key management scheme for establishing secure group channels in fog-based IoT systems. They showed that the scheme achieves conditional anonymity, non-repudiation, data availability, and service verification. Simulations were also carried out to show the efficiency of the proposed system.
(b) Service Framework Opportunities for Use of Blockchain in Health care
Blockchain is very useful in the field of health care as it supports electronic data storage, biomedical research, drug supply, health insurance, etc. [7]. It has some scalability issues, as well as issues with smart contracts. Due to a lack of technical talent, it is hard to develop new versions of it.
(c) Applications of Blockchain toward Health care
Pandey et al. [8] focused on the use cases of blockchain in the field of health care, where it supports patient data management, supply chain management, billing management, medical goods management, etc. In addition, the paper discusses challenges such as technical and adoption challenges, business challenges, and scalability. Blockchain solves many medical-related problems because it provides trust between parties, offers traceability, and works without intermediaries.
(d) Using Blockchain Technology for eHealth Data Access Management
Tripoli et al. [9] discussed the benefits of blockchain technology and how a large amount of data can be stored and made easily accessible in the healthcare field. Blockchain provides transparency for patients, who can store their past details, grant access to anyone else, and thereby connect to a doctor whenever they want. No third parties are involved in the process. In addition, the paper showed how this technology performs better with the use of the correct tools, models, and protocols.
(e) Blockchain-based Medical Records Safe and Medical Storage
Chen et al. [10] discussed how the large amount of medical data spread across medical institutions can be stored on the blockchain with security and transparency using blockchain-based storage, since it is otherwise hard to store and access all the medical data securely.
(f) A systematic review of blockchain
Xu et al. [11] described the current status of blockchain in the field of business and economics. Using cluster analysis, they identified aspects such as how blockchain benefits the economy, the sharing economy, etc. The literature for this review was retrieved from the Web of Science service. The paper also discusses future directions and applications of this technology.
(g) Lightweight Blockchain for Health care
Ismail et al. [12] focused on a lightweight blockchain, that is, a system that reduces computing and communication overhead as the number of blocks in the system increases. This improves the scalability of the system. It also preserves the confidentiality of data and the security of medical records.
(h) Introducing Blockchain for Health care
Alhadhrami et al. [13] discussed the risk of exposing patient records and medical data to third parties. They implemented an access control list that reduces the risk of data access by unauthorized persons. The paper uses permissioned and permissionless blockchains to allow data to be shared between persons.
(i) Blockchain Technology Use Cases in Health care
Zhang et al. [14] presented a case study that implements blockchain technology: a system in which patients have full control over their data and can allow others, such as doctors, to access their personal information. This provides transparency to the system.
(j) Benefits of Blockchain for Both Patients and Doctors
Nguyen et al. [15] presented, diagrammatically, a conceptual model of a medical application through which patients compare the cost of medicines and medical bills, doctors manage their patients, and transactions are managed. This model helps in the development of medical records in health care for use by public health authorities.
(k) Limitations
An Internet connection is necessary for blockchain technology. Therefore, the patient needs a stable Internet connection along with hardware such as a mobile phone or laptop to interact with the system. Patients have to store their health history on the system. The medical history is useful for assessing the patient's problem, since it describes the patient's health issues and the medicines he/she takes. When the patient fails to upload the history, the doctor has to perform a full check-up to find the problem. This process takes a lot of time, which can be avoided if the patient's details are already provided. We observed the following limitations in the existing works:
• Consensus mechanism
• Limited availability of technical talent
• Immutability
• Lack of awareness
• Key management
• Scalability
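To see why the consensus mechanism appears in the list of limitations, it helps to look at the cost of one well-known mechanism. The surveyed papers do not all use the same mechanism, so the following is only an illustrative assumption: a toy proof-of-work search in which a block is accepted once its SHA-256 hash starts with a required number of zero hex digits. The function `mine` and the block text are hypothetical.

```python
import hashlib


def mine(block_data, difficulty):
    """Search for a nonce whose hash has `difficulty` leading zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        candidate = f"{block_data}:{nonce}".encode("utf-8")
        digest = hashlib.sha256(candidate).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


# Each extra zero digit multiplies the expected number of attempts by 16.
for difficulty in (1, 2, 3):
    nonce, digest = mine("example block", difficulty)
    print(difficulty, nonce + 1, digest[:12])
```

The number of attempts printed in the second column grows quickly with the difficulty, which is the time and resource cost that the limitation refers to; other mechanisms trade this cost for different assumptions.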
Figure 25.3 shows that blockchain technology has many limitations. People are unaware of it, and because blockchain is immutable, no one can modify data after it has been written; changing the data is nearly impossible. Blockchain technology is growing day by day, but there is still a lack of technical expertise among
Fig. 25.3 Limitations of blockchain technology
people to use it. Other limitations include scalability, meaning that transaction speed slows down as the number of nodes in the network increases; key management, which is a difficult task; and the consensus mechanism, which requires every newly created block to reach a common consensus and therefore consumes a lot of time and resources.
(l) Major Findings of the Survey
Our findings show that blockchain technology is growing day by day and is expected to be used in many sectors in the near future. Blockchain is used in health care for security, data storage, data sharing, secure online transactions, easy access to records, etc. However, it is rarely applied in drug prescription management and supply chain management scenarios, so further exploration of blockchain is still needed. Most of the papers present the framework architecture and model of the blockchain system, smart contracts, and the consensus mechanism. There is still a need to explore other blockchain aspects, such as blockchain platforms.
In Table 25.1, we compare the existing systems and related work in terms of the techniques used and their advantages, to show the benefits of blockchain in the healthcare system.
25.4 Conclusion and Future Work
Blockchain provides a secure digital environment and manages patient data in a secure way. This data is accessible in real time to any organization in the chain of medical services once the patient has authorized it. In this paper,
Table 25.1 Comparison of existing schemes

| References | Author | Techniques | Advantages | Description |
|---|---|---|---|---|
| [6] | Tong Chen, Lei Zhang (2021) | Key management scheme is used | It achieves conditional anonymity, non-repudiation, data availability, and service verification | It showed that the scheme they used achieves data recoverability, conditional anonymity, non-repudiation, and resource authentication |
| [7] | Igor Radanović, Robert Likić (2018) | Blockchain-based key management scheme | Reduce cost in healthcare systems, data security, and accessibility | We can use encryption key management schemes to provide high-level security |
| [8] | Sandeep Pandey, Gajendra K. (2018) | Drug blockchain is used | Reduce cost; scalability | Some of the issues are already tackled, like cost reduction, scalability, etc. |
| [9] | Elie Rachkidi, Tripoli, Nada C. Taher (2017) | Access control list is used to prevent unauthorized access; Ethereum is used | Privacy, the confidentiality of the critical medical data | Achieve scalability and security for better performance in the field of healthcare data exchange |
| [10] | Yi Chen, Shuai Ding, Zheng Xu (2019) | The storage scheme is used | Blockchain maintains data storing, sharing, and personal privacy | Blockchain ensures the confidentiality of a person |
| [11] | Min Xu, Gang Kou (2019) | Cross-chain technology | Avoid drug prescription fraud, provide availability and security | Simply put, it provides transparency in the real world |
| [12] | Leila Ismail, Huned Materwala (2019) | Uses lightweight blockchain architecture | Fault-tolerant, secure, scalable, traceable, and private blockchain | This architecture reduces the communication overhead and computational overhead of the system |
| [13] | Zainab Alhadhrami, Salma Alghfeli (2017) | Uses permissioned and permissionless blockchain architecture | Privacy of patients and protection of data from several types of attacks | This architecture allows permissioned and permissionless blockchains to share data |
| [14] | Peng Zhang, Douglas Schmidt (2018) | Case study | Provide privacy of transactions, patient data, and transparency | A system where patients have access control to their medical history |
| [15] | Tran Le Nguyen (2018) | Conceptual model | Inexpensive, customer-friendly | This app facilitates patients and doctors to manage their data accordingly |
we investigated the blockchain technology research trends in healthcare systems and found that blockchain has great potential in the field of healthcare security. Blockchain is a distributed, decentralized, and immutable system. In this paper, we identified the benefits of blockchain technology and its applications in the area of health care. Much more research is possible in the field of secure healthcare systems, for example on data privacy and confidentiality problems. Future research directions can include combining the Internet of things (IoT) with artificial intelligence (AI) to make the healthcare system more effective.
References
1. Dwivedi, R.K., Kumari, N., Kumar, R.: Integration of wireless sensor networks with cloud towards efficient management in IoT: a review. In: 2nd Springer International Conference on Data & Information Sciences (ICDIS 2019), Agra, India, pp. 97–107, Springer, March 29–30 (2019)
2. Dwivedi, R.K., Kumar, R., Buyya, R.: Secure healthcare monitoring sensor cloud with attribute-based elliptical curve cryptography. Int. J. Cloud Appl. Comput. (IJCAC) 11(3), Article 1 (2021, July–September)
3. Dwivedi, R.K., Kumar, R., Buyya, R.: A novel machine learning-based approach for outlier detection in smart healthcare sensor clouds. Int. J. Healthcare Inf. Syst. Inform. (IJHISI) 16(4), Article 26 (2021, October–December)
4. Srivastava, R., Dwivedi, R.K.: A survey on diabetes mellitus prediction using machine learning algorithms. In: 6th International Conference on ICT for Sustainable Development (ICT4SD 2021), Goa, India, Springer, August 05–06 (2021)
5. Srivastava, R., Dwivedi, R.K.: Diabetes mellitus prediction using ensemble learning approach with hyper parameterization. In: 6th International Conference on ICT for Sustainable Development (ICT4SD 2021), Goa, India, Springer, August 05–06 (2021)
6. Chen, T., Zhang, L.: Blockchain-based key management scheme in fog-enabled IoT systems. IEEE Internet Things J. 8(13) (2021, July 1)
7. Radanović, I., Likić, R.: Service framework opportunities for use of blockchain technology in medicine. In: Applied Health Economics and Health Policy (2018)
8. Pandey, S., Katuwal, G.J.: Applications of blockchain in healthcare: current landscape & challenges (2018)
9. Rachkidi, E., Tripoli, C., Taher, N.: Towards using blockchain technology for eHealth data access management. In: International Conference on Advances in Biomedical Engineering (ICABME) (Nov 2017)
10. Chen, Y., Ding, S., Xu, Z.: Blockchain-based medical records secure storage and medical service framework. J. Med. Syst. (2019, November)
11. Xu, M., Kou, G.: A systematic review of blockchain. In: Xu et al. (eds.) Financial Innovation (2019)
12. Ismail, L., Materwala, H.: Lightweight blockchain for healthcare. https://doi.org/10.1109/ACCESS.2019.2947613 (2019)
13. Alhadhrami, Z., Alghfeli, S.: Introducing blockchains for healthcare. In: International Conference on Electrical and Computing Technologies and Applications (ICECTA) (2017)
14. Zhang, P., Schmidt, D.C., White, J.: Blockchain technology use cases in healthcare. In: Advances in Computers (2018)
15. Nguyen, T.L.: Blockchain in healthcare: a new technology benefit for both patients and doctors. In: Proceedings of PICMET '18: Technology Management for Interconnected World (2018)
Chapter 26
Detecting Deceptive News in Social Media Using Supervised Machine Learning Techniques
Anshita Malviya and Rajendra Kumar Dwivedi
Abstract Our society has witnessed numerous incidents concerning fake news, and social media has played a significant role in spreading it. People with malicious intent are often the generators and spreaders of such news, frequently without realizing the effect it has on naive people, who believe the fake news and start behaving accordingly. Fake news appeals to our emotions: it plays with our feelings and can make us angry, happy, or scared. It can also lead to hatred or anger toward a specific person. Nowadays, people easily fool each other by using social media as a tool to spread fake news. In this paper, we propose a machine learning-based model for detecting such deceptive news, which creates disturbances in our society. The model is implemented in Python. We used the machine learning algorithms logistic regression, multinomial Naive Bayes, passive-aggressive classifier, and multinomial classifier with hyperparameter. Different vectorization techniques were used to evaluate the models. Results show that the passive-aggressive classifier with Tf-idf vectorization outperforms the others in terms of accuracy in detecting fake news.

Keywords Logistic regression · Multinomial Naïve Bayes algorithm · Passive-aggressive classifier · Multinomial classifier with hyperparameter · Count vectorizer · Tf-idf vectorizer · Hash vectorizer
A. Malviya (B) · R. K. Dwivedi
Department of Information Technology & Computer Application, MMMUT, Gorakhpur, India
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_26

26.1 Introduction
Social media has become an inescapable part of our community, yet the news sources found on it cannot always be trusted. Nowadays, these social platforms are a medium for spreading fake news, which consists of forged stories and vague quotes, facts, and sources. These hypothetical and spurious stories are used to influence people's opinions on an issue. The term fake news combines several concepts, such as misinformation, disinformation, and mal-information. Over the past few years, the spread
239
240
A. Malviya and R. K. Dwivedi
of fake news is generally through social media platforms like Twitter, Facebook, WhatsApp, YouTube, etc. in the form of videos, memes, advertisements, imposing contents, and many more. This has become a serious issue as it is causing serious crimes and affecting the peace and brotherhood among people. Fake news is an inaccurate, fallacious detail about any event or happening. Fake news is generally aimed toward destroying the image of a particular person and organization by spreading false rumors about it. It can be also considered as propaganda against a particular person, company, and organization. It is often spread in order to befool the naive audience who in general are clueless about that particular piece of info. In spite of the fact that the term fake news can mean differently to different people, in general it means the same, that is, the false that is false. In this growing and digitalized world, most of the population’s youth use social media as a medium for communicating with each other. In addition to this, they rely on social media for acquiring all kinds of news. So, these youths become the easy target by fake news generators for its transmission. Some of the recent examples of fake news are spying technology added in bank notes, fake news about late actor SSR’s last words before he died, and several fake news about corona virus and its vaccine. There are various ways by which this fake news could be identified. Reader’s emotions are influenced by these fake news which leads to anger and disputes. If the Web site address is fake, author is anonymous, source of news is mispresented, and the article is grammatically incorrect; all these clearly indicate that the news is fake. People should also check the publication date of the article to find the authenticity of the news. We readers could also take actions against the misinformation spread on social media. 
If we see a post with misinformation trending on social media, we should report it, or report the people who are spreading it and creating disturbance in society. We can act as editors, find the truth behind the articles, and protect our community from this problem.
The rest of the paper is organized as follows. Section 26.2 presents the literature survey related to deceptive news prediction using machine learning algorithms. The proposed methodology is presented in Sect. 26.3. Section 26.4 gives an overview of the machine learning techniques used in the prediction of fake news. The experimental work is explained in Sect. 26.5. Conclusions and future directions are given in Sect. 26.6.
26.2 Literature Survey
This section reviews work on the detection of fake news and the classification of fake and real news on social media platforms. Researchers have shown great interest in this area over the past few years. Mandical et al. [1] discussed the problem of fake news spreading worldwide and the importance of machine learning in detecting it. The authors proposed a fake news classification system using three machine learning algorithms, namely Naïve Bayes, passive-aggressive classifier, and deep neural networks, on eight datasets. Adinda et al. [2] described two methods of identifying fake news: the first is fact checking, which involves volunteers, and the second is based on intelligent systems. Bhogade et al. [3] collected various news stories using natural language processing, machine learning, and artificial intelligence, and discussed the use of different machine learning models for predicting fake news and checking the authenticity of news stories. Ashtaputre et al. [4] proposed a model that combines machine learning techniques and natural language processing for the detection of fake news. They compared different classification models and techniques to determine the best among them and used Tf-idf vectorization for data preprocessing. Baarir et al. [5] described a current threat to society, namely the wide spread of illegal information in the form of news on social media. They used the Tf-idf vectorizer as the data preprocessing technique and a support vector machine classifier to construct the detection system. Bharath et al. [6] explored five machine learning algorithms, including logistic regression, support vector machines, Naïve Bayes, and recurrent neural network models, and assessed their effectiveness on the fake news detection task. Sharma et al. [7] elaborated on the consequences and disadvantages of using social media platforms. The authors performed binary classification of news articles using artificial intelligence, natural language processing, and machine learning. Ahmed et al. [8] proposed techniques to check the authenticity of news and concluded that machine learning methods are reliable for detecting fake news.
They compared the accuracy of their proposed model with other systems by applying various combinations of machine learning algorithms such as support vector machine, passive-aggressive classifier, logistic regression, and Naïve Bayes. Kumar et al. [9] presented the significance of natural language processing techniques for classifying news as fake or real, using text classification methods with various classification models to predict the result. Waghmare et al. [10] used a blockchain framework with machine learning methods for the detection and classification of social news. Khivasara et al. [11] proposed a Web-based extension that indicates the authenticity of content to readers; the authors used LSTM and GPT-2 to distinguish between real and fake news articles. Mishra [12] proposed the HiMap model, which finds influence associations among readers, for the detection of fake news. Experiments were performed on two Twitter datasets, and accuracy was calculated for the state-of-the-art models. Shaikh et al. [13] measured the accuracy of fake social news detection using various machine learning approaches, with Tf-idf vectorization for feature extraction. Chokshia et al. [14] reported the proportion of users tricked by the spread of fake news; for example, in 2020, 95% of the people protesting against the CAA law thought that their citizenship would be taken away, and they became victims of this problem. The authors used different deep learning methods to address it. Wang et al. [15] investigated issues related to the prediction of fake news. Because models based on deep learning could not cope with the dynamic nature of articles on social media, they proposed a novel setup consisting of a fake article detector, an annotator, and a reinforced selector. Lee et al. [16] proposed a deep learning architecture for detecting fake news in Korean articles. Sentences written in Korean are shorter than English sentences, which creates a problem; therefore, different CNN-based deep learning architectures were used to resolve it. Qawasmeh et al. [17] observed that detecting fake news is harder than the text-based analysis handled by traditional approaches. They noted that machine learning methods are less efficient than neural network models and used the latest machine learning approaches for the automatic identification of fake news. Agarwalla et al. [18] sought an efficient and relevant model for the prediction of fake news using classification algorithms and natural language processing. Han et al. [19] used Naïve Bayes and hybrid CNN and RNN approaches to mitigate the problem of fake news. Manzoor et al. [20] reported the success of fake news detection using various machine learning algorithms, but argued that the ongoing change in the features and orientation of fake news on different social media platforms can be addressed by deep learning approaches such as CNN, deep Boltzmann machine, deep neural network, natural language processing, and many more. Some researchers have also used various techniques for the automatic prediction of rumors. They noted that articles on social media appear to be true but in reality are not. NLP techniques were implemented for the detection of rumors, and random forest and XGBoost models were constructed for fake article detection [21–25].
26.3 Proposed Methodology
The methodology used to build models based on different machine learning algorithms, analyze model accuracy, and distinguish fake from true news consists of the following algorithm, which is also depicted in Fig. 26.1.

Algorithm: Deceptive news prediction
Input: Fake news data from social media
Output: Accuracy of the models
Step 1: Start
Step 2: Repeat steps 3 to 8 until no new datasets are available
Step 3: Enter the news dataset
Step 4: Preprocess the dataset using each vectorizer one by one
Step 5: Split the dataset into train and test datasets
Step 6: Design and train the models with machine learning algorithms like logistic regression, multinomial Naïve Bayes, etc.
Step 7: Test the models
Step 8: Output the accuracy of the models
Step 9: Analyze the models
Step 10: Stop
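The steps above can be sketched as a small evaluation loop. This is a minimal illustration in plain Python and not the authors' implementation: the function names `split_dataset` and `evaluate_grid`, and the shape of the callables passed in, are assumptions made for the example. The default test fraction of 0.33 follows the 67%/33% split reported in the experimental work.

```python
import random

def split_dataset(samples, test_fraction=0.33, seed=0):
    """Step 5: shuffle the samples and split them into train and test lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def evaluate_grid(datasets, vectorizers, trainers, test_fraction=0.33):
    """Steps 2-8: return {(dataset, vectorizer, model): accuracy}.

    datasets:    {name: [(text, label), ...]}
    vectorizers: {name: function(text) -> features}            (hypothetical shape)
    trainers:    {name: function(features, labels) -> predict}  (hypothetical shape)
    """
    results = {}
    for d_name, samples in datasets.items():                  # Steps 2-3
        for v_name, vectorize in vectorizers.items():         # Step 4
            featurized = [(vectorize(text), label) for text, label in samples]
            train, test = split_dataset(featurized, test_fraction)  # Step 5
            if not test:
                continue  # dataset too small to hold out a test set
            x_train = [x for x, _ in train]
            y_train = [y for _, y in train]
            for m_name, train_model in trainers.items():      # Step 6
                predict = train_model(x_train, y_train)
                hits = sum(predict(x) == y for x, y in test)   # Step 7
                results[(d_name, v_name, m_name)] = hits / len(test)  # Step 8
    return results
```

Because the same seed is used for every vectorizer, each model is scored on the same held-out articles, which keeps the comparison across vectorizers (Step 9) like for like.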
Fig. 26.1 Flowchart of methodology for fake news detection
26.4 Selection of Machine Learning Techniques
Machine learning, a subset of artificial intelligence, is now so pervasive that we use it several times a day without being aware of it. It is hard to imagine the world without machine learning, given how much it already provides and will provide in the future. Learning is a native behavior of living beings: they acquire new knowledge from their surroundings and refine it through experiences such as the happiness and hurdles that come their way. Simulating this learning ability in machines is what we know as machine learning. The machine learning algorithms used in the experimental work are discussed below.

A. Multinomial Naïve Bayes (MNB)
Multinomial Naïve Bayes is a probabilistic learning approach commonly used in natural language processing. It is based on Bayes' theorem: the probability of each class is calculated for a given item, and the output is the class with the highest probability [26]. Many algorithms fall under the Naïve Bayes family, which shares the principle that each attribute is independent of the other attributes. Bayes' theorem is as follows.

$$P(A \mid B) = \frac{P(A)\,P(B \mid A)}{P(B)}$$
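As an illustration of how Bayes' theorem turns into a text classifier, the following is a minimal sketch of multinomial Naïve Bayes in plain Python. The class name `TinyMultinomialNB` is hypothetical, and the additive smoothing parameter `alpha` is an assumption introduced so that unseen class/token pairs do not produce a zero probability; the chapter does not state which smoothing, if any, was used.

```python
import math
from collections import Counter, defaultdict

class TinyMultinomialNB:
    """Multinomial Naive Bayes over token lists with additive smoothing."""

    def __init__(self, alpha=1.0):
        # Smoothing hyperparameter (assumed for this sketch); must be > 0.
        self.alpha = alpha

    def fit(self, documents, labels):
        self.class_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        self.vocabulary = set()
        for tokens, label in zip(documents, labels):
            self.token_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def log_posterior(self, tokens, label):
        # log P(label) + sum of log P(token | label). P(B) is identical for
        # every class, so it is dropped when classes are only compared.
        total_docs = sum(self.class_counts.values())
        score = math.log(self.class_counts[label] / total_docs)
        denom = (sum(self.token_counts[label].values())
                 + self.alpha * len(self.vocabulary))
        for token in tokens:
            if token in self.vocabulary:  # tokens never seen in training are skipped
                score += math.log((self.token_counts[label][token] + self.alpha) / denom)
        return score

    def predict(self, tokens):
        return max(self.class_counts, key=lambda c: self.log_posterior(tokens, c))
```

Working in log space avoids numerical underflow when many small probabilities are multiplied, which matters for long news articles.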
The advantages of multinomial NB are that it is easy to implement, it can be used for continuous and discrete data, it can make predictions in real-time applications, and it can handle huge datasets. This algorithm is not suitable for regression; it is suited to classifying textual data rather than predicting numerical values.

B. Passive-Aggressive Classifier (PAC)
The passive-aggressive classifier is a machine learning algorithm that is not very popular among enthusiasts, yet it is very efficient for applications such as the detection of fake news on social media. It is similar to a perceptron model, but it has a regularization parameter and does not use a learning rate. Passive refers to leaving the model unchanged when the prediction is correct, and aggressive refers to updating the model when the prediction is incorrect. It is a classification algorithm for online learning. Online learning is one of the categories of machine learning, alongside supervised, unsupervised, batch, instance-based, and model-based learning. A passive-aggressive classifier can be trained incrementally by feeding it instances sequentially, either individually or in small batches.

C. Multinomial Classifier with Hyperparameter (MCH)
The multinomial classifier with hyperparameter is a Naïve Bayes algorithm that involves tuning a hyperparameter [27, 28]. It is not very popular among machine learning enthusiasts and is mostly suitable for text data.
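The passive and aggressive behavior described in subsection B can be made concrete with a single update step. The sketch below assumes the common PA-I variant for binary labels encoded as +1 and -1, with features held in a dictionary; the names `pa_update` and `pa_predict` and the default `C=1.0` are illustrative choices, not taken from the chapter. In this formulation the model stays passive whenever the hinge loss is zero, that is, when the prediction is correct with a margin of at least 1.

```python
def pa_update(weights, features, label, C=1.0):
    """Apply one PA-I update in place and return the weights.

    weights:  {feature_name: weight}
    features: {feature_name: value}
    label:    +1 or -1
    C:        regularization parameter capping the step size
    """
    score = sum(weights.get(name, 0.0) * value for name, value in features.items())
    loss = max(0.0, 1.0 - label * score)      # hinge loss
    if loss == 0.0:
        return weights                         # passive: nothing to fix
    norm_sq = sum(value * value for value in features.values())
    if norm_sq == 0.0:
        return weights                         # empty example carries no signal
    tau = min(C, loss / norm_sq)               # aggressive: step sized by the loss
    for name, value in features.items():
        weights[name] = weights.get(name, 0.0) + tau * label * value
    return weights

def pa_predict(weights, features):
    """Return +1 or -1 from the sign of the linear score."""
    score = sum(weights.get(name, 0.0) * value for name, value in features.items())
    return 1 if score >= 0 else -1
```

The step size `tau` is computed from the loss itself, which is why no separate learning rate is needed; `C` only limits how far a single noisy example can move the model.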
D. Logistic Regression (LR)
Logistic regression is a supervised machine learning technique and one of the most widely used algorithms. It predicts a categorical dependent attribute from a set of independent attributes. The prediction takes the form of a probability (between 0 and 1), and the algorithm is used to solve classification problems. Instead of a regression line, it fits an 'S'-shaped logistic (sigmoid) function, and a threshold value is used to map the probability to one of two values (0 or 1). Classification can be performed on continuous and discrete data. Logistic regression rests on two main assumptions: the dependent attribute should be categorical, and there should be no multicollinearity among the independent variables. There are three types of logistic regression: binomial, multinomial, and ordinal. The equation of logistic regression is given below.

$$\log\left[\frac{y}{1-y}\right] = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_n x_n$$
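The relationship between the linear equation, the sigmoid curve, and the threshold can be checked numerically. The following sketch uses made-up coefficients purely for illustration; the function names and the default threshold of 0.5 are assumptions for the example.

```python
import math

def sigmoid(z):
    """S-shaped logistic function mapping any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, coefficients, intercept):
    """Probability y for feature vector x: sigmoid(b0 + b1*x1 + ... + bn*xn)."""
    z = intercept + sum(b * xi for b, xi in zip(coefficients, x))
    return sigmoid(z)

def classify(x, coefficients, intercept, threshold=0.5):
    """Map the probability to class 1 or 0 using a threshold."""
    return 1 if predict_proba(x, coefficients, intercept) >= threshold else 0

def log_odds(y):
    """Left-hand side of the logistic regression equation, log[y / (1 - y)]."""
    return math.log(y / (1.0 - y))
```

For example, with intercept -1.0 and a single coefficient 1.5, the input x = 2.0 gives a linear term of 2.0, and `log_odds(predict_proba([2.0], [1.5], -1.0))` recovers that same value up to rounding, which is exactly what the equation above states.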
26.5 Experimental Work
This section presents the experimental details.

A. Data Collection
The first step in developing the classification model is collecting data. The quality of a predictive model depends on the quality and quantity of the data collected, which makes this one of the most important steps in developing a machine learning model. The news datasets are taken from the Kaggle repository. Table 26.1 lists the attributes, instances, and size of the four datasets.
Table 26.1 Details of news datasets

| S. No. | Dataset | Attributes | Instances | Size |
|---|---|---|---|---|
| 1 | Dataset 1 | id, title, author, text, label | 20,800 | 94.0 MB |
| 2 | Dataset 2 | id, title, text, label | 6335 | 29.2 MB |
| 3 | Dataset 3 | urls, headline, body, label | 4009 | 11.9 MB |
| 4 | Dataset 4 | author, publisher, title, text, language, site_url, main_img_url, type, title_without_stopwords, text_without_stopwords, hasImage | 2096 | 10.4 MB |

B. Preprocessing of Fake News Dataset
In data preprocessing, we take the text attribute from the dataset, which comprises the actual news articles. To make the model more predictive, we modify this text attribute so that more information can be extracted. This is done using the 'nltk library.' First, we remove the stopwords present in the article. Stopwords are words that connect sentences and indicate their tense; they carry little importance in the context of a sentence and can be removed. After that, tokenization is performed, followed by vectorization. Vectorization is the natural language processing technique in which words are mapped to vectors of real numbers for semantic prediction. Three vectorization techniques are used in this paper, namely count, hash, and Tf-idf vectorizers.

Tf-idf stands for term frequency-inverse document frequency. This vectorizer transforms text into a meaningful numerical representation and is very commonly used to prepare text for machine learning algorithms. The count vectorizer transforms a given text into a vector based on the number of occurrences of each word in the text, and it is helpful when there are multiple texts. The hashing vectorizer applies a hash function to each token string to map it to an integer index, and it transforms a collection of documents into a sparse matrix of token counts.

C. Design and Train/Test the Models
Before building the models, we divided each dataset into two parts, namely train and test datasets. The train dataset consists of 67% of the total instances, and the test dataset consists of the remaining 33%.

D. Analyze the Models
We evaluated the models using the confusion matrix and accuracy report. Table 26.2 compares the accuracy of the different models using the three types of vectorizer for the four datasets. When the datasets are preprocessed using the Tf-idf vectorizer, the passive-aggressive classifier gives the best accuracy, i.e., 95.2%, 90.9%, 98.4%, and 78.2% for datasets 1 to 4, respectively. With count vectorization, logistic regression gives the highest accuracy for datasets 1 to 3, i.e., 94.9%, 89.9%, and 89.9%, respectively; for dataset 4, Table 26.2 lists 71.1% for the passive-aggressive classifier and 69.8% for logistic regression. With the hashing vectorizer, Table 26.2 shows that the passive-aggressive classifier gave the best performance for datasets 2, 3, and 4, i.e., 89.0%, 89.0%, and 74.2%, respectively, whereas logistic regression gave the best accuracy (92.6%) for dataset 1.
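The accuracy values discussed above are derived from a confusion matrix. A minimal sketch of that calculation in plain Python is given below; the label strings "fake" and "real" and the function names are assumptions for illustration, since the chapter does not describe how the label column is encoded.

```python
def confusion_matrix(y_true, y_pred, positive="fake"):
    """Count true/false positives and negatives for a binary task."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, guess in zip(y_true, y_pred):
        if guess == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts

def accuracy(counts):
    """Fraction of all predictions that were correct."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total

# Toy example with five test articles (labels invented for illustration).
y_true = ["fake", "fake", "real", "real", "fake"]
y_pred = ["fake", "real", "real", "fake", "fake"]
cm = confusion_matrix(y_true, y_pred)
print(cm, accuracy(cm))
```

In the toy example, three of the five predictions match the true labels, so the accuracy is 0.6. Accuracy alone hides the split between false positives and false negatives, which is why the confusion matrix is reported alongside it.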
Figures 26.2 and 26.3 plot the accuracy of the models for the four datasets; each panel shows the accuracy of each model using the three vectorizers. Figure 26.2a, b shows that all the models gave their best result using Tf-idf vectorization for datasets 1 and 2. Figure 26.4 shows the accuracy reports for Tf-idf, count, and hashing vectorization, respectively; each panel gives the accuracy of each model on the four datasets for one vectorization.
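For reference, the three vectorization schemes compared in these results (count, Tf-idf, and hashing) can be sketched in plain Python. This is a simplified illustration, not the code used in the experiments: the function names are hypothetical, the Tf-idf weight is the textbook form tf × log(N/df) without the smoothing and normalization that library implementations typically add, and the bucket count of 16 is an arbitrary choice for the example.

```python
import math
import zlib
from collections import Counter

def count_vector(tokens):
    """Count vectorization: number of occurrences of each token in one text."""
    return dict(Counter(tokens))

def tfidf_vectors(documents):
    """Tf-idf: term frequency scaled by log(N / document frequency)."""
    n_docs = len(documents)
    doc_freq = Counter(token for doc in documents for token in set(doc))
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        vectors.append({
            token: (count / len(doc)) * math.log(n_docs / doc_freq[token])
            for token, count in counts.items()
        })
    return vectors

def hashed_vector(tokens, n_buckets=16):
    """Hashing: map each token to an integer index and count per index."""
    vector = [0] * n_buckets
    for token in tokens:
        index = zlib.crc32(token.encode("utf-8")) % n_buckets
        vector[index] += 1
    return vector
```

A token that appears in every document, such as "news" in a corpus where all articles contain it, receives a Tf-idf weight of zero, whereas the count and hashing vectors still give it full weight. The hashing approach needs no stored vocabulary, at the cost of occasional collisions between tokens that share a bucket.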
Table 26.2 Comparison of accuracy

| S. No. | Dataset | Machine learning algorithm | Tf-idf vectorizer | Count vectorizer | Hashing vectorizer |
|---|---|---|---|---|---|
| 1 | Dataset 1 | Multinomial Naïve Bayes | 0.900 | 0.898 | 0.876 |
| | | Passive-aggressive classifier | 0.952 | 0.935 | 0.923 |
| | | Multinomial classifier with hyperparameter | 0.902 | 0.899 | 0.881 |
| | | Logistic regression | 0.950 | 0.949 | 0.926 |
| 2 | Dataset 2 | Multinomial Naïve Bayes | 0.879 | 0.873 | 0.868 |
| | | Passive-aggressive classifier | 0.909 | 0.876 | 0.890 |
| | | Multinomial classifier with hyperparameter | 0.882 | 0.879 | 0.875 |
| | | Logistic regression | 0.903 | 0.899 | 0.888 |
| 3 | Dataset 3 | Multinomial Naïve Bayes | 0.893 | 0.873 | 0.868 |
| | | Passive-aggressive classifier | 0.984 | 0.876 | 0.890 |
| | | Multinomial classifier with hyperparameter | 0.947 | 0.879 | 0.875 |
| | | Logistic regression | 0.976 | 0.899 | 0.888 |
| 4 | Dataset 4 | Multinomial Naïve Bayes | 0.723 | 0.665 | 0.658 |
| | | Passive-aggressive classifier | 0.782 | 0.711 | 0.742 |
| | | Multinomial classifier with hyperparameter | 0.759 | 0.684 | 0.704 |
| | | Logistic regression | 0.763 | 0.698 | 0.719 |
Fig. 26.2 Accuracy report of models for dataset 1 and dataset 2. Panels: (a) Accuracy Report for Dataset1; (b) Accuracy Report for Dataset2. Each panel plots accuracy for MNB, PAC, MCH, and LR with the Tfidf, Count, and Hashing vectorizers.
Fig. 26.3 Accuracy report of models for dataset 3 and dataset 4. Panels: (a) Accuracy Report for Dataset3; (b) Accuracy Report for Dataset4. Each panel plots accuracy for MNB, PAC, MCH, and LR with the Tfidf, Count, and Hashing vectorizers.
Fig. 26.4 Accuracy report of models for Tf-idf, count, and hashing vectorization. Panels: (a) Accuracy Report for Tfidf Vectorization; (b) Accuracy Report for Count Vectorization; (c) Accuracy Report for Hashing Vectorization. Each panel plots accuracy for MNB, PAC, MCH, and LR on Dataset1 to Dataset4.
26.6 Conclusions and Future Directions
We have developed machine learning models, namely logistic regression, multinomial Naïve Bayes, passive-aggressive classifier, and multinomial classifier with hyperparameter, to predict fake news. For all the datasets used in the experiments, the passive-aggressive classifier gave the best result with Tf-idf vectorization. Further, all the models performed best with the Tf-idf vectorizer in comparison to the other vectorizers. The highest accuracies obtained were 98.4% for the passive-aggressive classifier and 97.6% for logistic regression. These are preliminary observations based on four datasets; more diverse datasets are needed to confirm the findings. As future work, we plan further experiments with more machine learning algorithms, preprocessing techniques, and datasets to find an efficient system for detecting fake news.
References
1. Mandical, R.R., Mamatha, N., Shivakumar, N., Monica, R., Krishna, A.N.: Identification of Fake News Using Machine Learning. IEEE, New York (2020). ISBN: 978-1-7281-6828-9/20
2. Adinda, P.K., Zyblewski, P., Choras, M., Kozik, R., Giełczyk, A., Wozniak, M.: Fake News Detection From Data Streams. IEEE, New York (2020). ISBN: 978-1-7281-6926-2/20
3. Bhogade, M., Deore, B., Sharma, A., Sonawane, O., Changpeng, M.S.: A research paper on fake news detection. Int. J. Adv. Sci. Res. Eng. Trends 6(6) (2021, June). https://doi.org/10.51319/2456-0774.2021.6.0067. ISSN (Online) 2456-0774
4. Ashtaputre, P., Nawale, A., Pandit, R., Lohiya, S.: A machine learning based fake news content detection using NLP. Int. J. Adv. Sci. Technol. 29(7), 11219–11226 (2020)
5. Baarir, N.F., Djeffal, A.: Fake news detection using machine learning. In: 2020 2nd International Workshop on Human-Centric Smart Environments for Health and Well-being (IHSH). ISBN: 978-1-6654-4084-4/21
6. Bharath, G., Manikanta, K.J., Prakash, G.B., Sumathi, R., Chinnasamy, P.: Detecting fake news using machine learning algorithms. In: 2021 International Conference on Computer Communication and Informatics (ICCCI-2021), Jan. 27–29, 2021, Coimbatore, India. IEEE, New York. ISBN: 978-1-7281-5875-4/21/2021
7. Sharma, U., Saran, S., Patil, S.M.: Fake news detection using machine learning algorithms. Int. J. Eng. Res. Technol. (IJERT). ISSN: 2278-0181, Special Issue (2021), NTASU-2020 Conference Proceedings
8. Ahmed, S., Hinkelmann, K., Corradini, F.: Development of fake news model using machine learning through natural language processing. Int. J. Comput. Inform. Eng. 14(12) (2020). ISNI: 00000000-919-5-0263
9. Kumar, K.A., Preethi, G., Vasanth, K.: A study of fake news detection using machine learning algorithms. Int. J. Technol. Eng. Syst. (IJTES) 11(1), 1–7 (2020). ISSN: 0976-1345
10. Waghmare, A.D., Patnaik, G.K.: Fake news detection of social media news in blockchain framework. Indian J. Comput. Sci. Eng. (IJCSE) 12(4) (Jul-Aug 2021). https://doi.org/10.21817/indjcse/2021/v12i4/211204151, e-ISSN: 0976-5166 p-ISSN: 2231-3850
11. Khivasara, Y., Khare, Y., Bhadane, T.: Fake news detection system using web-extension. In: 2020 IEEE Pune Section International Conference (PuneCon), Vishwakarma Institute of Technology, Pune India. Dec 16–18, 2020. IEEE, New York (2020). 978-1-7281-9600-8/20
12. Mishra, R.: Fake news detection using higher-order user to user mutual-attention progression in propagation paths. In: Computer Vision Foundation 2020 Workshop
13. Shaikh, J., Patil, R.: Fake news detection using machine learning. In: International Symposium on Sustainable Energy, Signal Processing and Cyber Security (iSSSC), ISBN: 978-1-7281-8880-5/20 ©2020 IEEE. IEEE, New York (2020). https://doi.org/10.1109/iSSSC50941.2020.93588
14. Chokshia, A., Mathew, R.: Deep learning and natural language processing for fake news detection: a survey. In: International Conference on IoT based Control Networks and Intelligent Systems, ICICNIS (2020)
15. Wang, Y., Yang, W., Ma, F., Xu, J., Zhong, B., Deng, Q., Gao, J.: Weak supervision for fake news detection via reinforcement learning. In: Copyright © 2020, Association for the Advancement of Artificial Intelligence (www.aaai.org)
16. Lee, D.H., Kim, Y.R., Kim, H.J., Park, S.M., Yang, Y.J.: Fake news detection using deep learning. J. Inf. Process Syst. 15(5), 1119–1130 (October 2019). https://doi.org/10.3745/JIPS.04.0142. ISSN 1976-913X (Print), ISSN: 2092-805X (Electronic)
17. Qawasmeh, E., Tawalbeh, M., Abdullah, M.: Automatic identification of fake news using deep learning. In: Sixth International Conference on Social Networks Analysis, Management and Security (SNAMS). IEEE, New York (2019). 978-1-7281-2946-4/19
18. Agarwalla, K., Nandan, S., Nair, V.A., Hema, D.D.: Fake news detection using machine learning and natural language processing. Int. J. Recent Technol. Eng. (IJRTE) 7(6) (March 2019). ISSN: 2277-3878
19. Han, W., Mehta, V.: Fake news detection in social networks using machine learning and deep learning: performance evaluation. In: International Conference on Industrial Internet (ICII). IEEE, New York (2019). 978-1-7281-2977-8/19 ©2019 IEEE. https://doi.org/10.1109/ICII.2019.00070
20. Manzoor, S.I., Singla, D.J.N.: Fake news detection using machine learning approaches: a systematic review. In: Proceedings of the Third International Conference on Trends in Electronics and Informatics (ICOEI 2019). IEEE Xplore Part Number: CFP19J32-ART; ISBN: 978-1-5386-9439-8
21. Jain, G., Mudgal, A.: Natural language processing based fake news detection using text content analysis with LSTM. Int. J. Adv. Res. Comput. Commun. Eng. 8(11) (November 2019). ISSN (Online) 2278-1021, ISSN (Print) 2319-5940
22. Devi, C.U., Priyanka, R., Surendra, P.S., Priyanka, B.S., Nikhila, C.H.N.D.L.: Fake news detection using machine learning. JETIR 6(4) (2019, April). ISSN: 2349-5162
23. Kyeong-hwan, J.J.: Fake news detection system using article abstraction (2019). 978-1-7281-0719-6/19 ©IEEE
24. Lin, J., Tremblay-Taylor, G., Mou, G., You, D., Lee, K.: Detecting fake news articles. In: International Conference on Big Data (Big Data). IEEE, New York (2019). 978-1-7281-0858-2/19
25. Reis, J.C.S., Correia, A., Murai, F., Veloso, A., Cambria, F.E.: Supervised learning for fake news detection. IEEE, IEEE Computer Society, IEEE Intelligent Systems 1541-1672 (2019)
26. Dwivedi, R.K., Rai, A.K., Kumar, R.: Outlier detection in wireless sensor networks using machine learning techniques: a survey. In: IEEE International Conference on Electrical and Electronics Engineering (ICE3), pp. 316–321 (2020)
27. Dwivedi, R.K., Kumar, R., Buyya, R.: A novel machine learning-based approach for outlier detection in smart healthcare sensor clouds. Int. J. Healthcare Inform. Syst. Inform. 4(26), 1–26 (2021)
28. Dwivedi, R.K., Kumar, R., Buyya, R.: Gaussian distribution based machine learning scheme for anomaly detection in wireless sensor network. Int. J. Cloud Appl. Comput. 3(11), 52–72 (2021)
Chapter 27
4G Communication Radiation Effects on Propagation of an Economically Important Crop of Eggplant (Solanum melongena L.)
Trushit Upadhyaya, Chandni Upadhyaya, Upesh Patel, Killol Pandya, Arpan Desai, Rajat Pandey, and Yogeshwar Kosta

Abstract This interdisciplinary research examines the effect of electromagnetic (EM) waves on plants. The economically important crop eggplant (Solanum melongena L.) was used to examine EM wave-induced stress. True-to-type plants were raised in separate electromagnetic environments. Electromagnetic wave exposure was given for 12 to 96 h at a frequency of 1800 MHz. A decline of nearly 60% in the seed germination rate was observed compared to the control. A significant decline in plant physiology and growth was observed through the analysis of plant height (62%), root length (44%), leaf count (40%), and leaf area (75%). The biochemical parameters, viz. chlorophyll, carotenoid, and H2O2 contents, were altered compared to unexposed plants.
Keywords Electromagnetic wave effect · Plant deterioration · Plant physiology · Biochemical effects
T. Upadhyaya (B) · U. Patel · K. Pandya · A. Desai · R. Pandey · Y. Kosta
Chandubhai S. Patel Institute of Technology, Charotar University of Science and Technology (CHARUSAT), Changa, India
e-mail: [email protected]
C. Upadhyaya
Sardar Patel University, Anand, Gujarat, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_27

27.1 Introduction
Mobile communication has become one of the primary requirements of the current generation. The use of mobile phones has increased significantly along with developments in mobile technologies. The prevailing fourth generation mobile communication has reached its peak use among communication users. Electromagnetic waves have been considered a strong mutagen for animal and plant tissues [1, 2]. A large research community has addressed the effects of electromagnetic waves on flora and fauna [3–13]. A stressful lifestyle has created a prime need for a healthy diet and regular consumption of antioxidant-rich food. Cancer, metabolic disorders, and cardiovascular diseases are the prime causes of death. Including fruits and vegetables in the regular diet reduces the risk of chronic diseases, viz. Alzheimer's disease, cancer, diabetes, heart-related disease, and functional decline due to aging [14]. According to the recommendation of the National Research Council (NRC), five servings of vegetables and fruits per day are excellent for better health. Along with the emerging need for mass-scale production of agricultural crops, good quality cultivation has also become essential owing to consumer awareness of food quality. This paper addresses the adverse effect of fourth generation mobile communication on a medicinal plant.

There is a sharp rise in mobile communication subscribers across the globe. To meet communication traffic requirements, new cell tower sites are regularly coming up, and agricultural land is no exception: farmers are permitting the installation of mobile towers on agricultural land in return for a small financial incentive. Radio waves are non-ionizing radiation with low quantum energy, which does not ionize atoms. When such waves interact with living matter, they are absorbed, and the energy transfer increases the frequency of collisions, which raises the temperature; this is referred to as the thermal effect. The effects of such non-ionizing radiation are categorized mainly into thermal and non-thermal effects, which are assessed by different agencies and bodies, viz. the European Committee for Electrotechnical Standardization (CENELEC) and the International Commission on Non-Ionizing Radiation Protection (ICNIRP). Exposure limits are mainly evaluated by the specific absorption rate (SAR) of biological tissue, which is the rate at which energy is absorbed per unit mass of tissue upon exposure to RF-EMFs, as given by [15]:

$$\mathrm{SAR} = \frac{\omega \varepsilon_0 \varepsilon''}{2\rho}\,|E|^2 = \frac{\sigma}{2\rho}\,|E|^2 \qquad (27.1)$$

where E = peak value of the internal electric field, σ = tissue conductivity, ρ = mass density of the tissue, ω = angular frequency, ε0 = free space permittivity, and ε'' = imaginary component of the relative permittivity, which is related to the dielectric losses of the material. Figures 27.1 and 27.2 illustrate the EM exposure setup.
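Equation (27.1) can be evaluated directly once the field and tissue properties are known. The sketch below is illustrative only: the function names are hypothetical, and the tissue values used in the usage note (a conductivity of 1.0 S/m, a density of 1000 kg/m³, and a loss factor of 10) are assumed round numbers, not measurements from this study.

```python
import math

EPS0 = 8.8541878128e-12  # free space permittivity in F/m

def conductivity_from_loss(frequency_hz, eps_imag):
    """sigma = omega * eps0 * eps'' with omega = 2 * pi * f, in S/m."""
    return 2.0 * math.pi * frequency_hz * EPS0 * eps_imag

def specific_absorption_rate(e_peak, conductivity, density):
    """SAR in W/kg from Eq. (27.1).

    e_peak:       peak internal electric field in V/m
    conductivity: tissue conductivity in S/m
    density:      tissue mass density in kg/m^3
    """
    return conductivity * e_peak ** 2 / (2.0 * density)
```

As a usage example under these assumptions, `specific_absorption_rate(7.0, 1.0, 1000.0)` gives roughly 0.0245 W/kg for a 7 V/m peak internal field, and `conductivity_from_loss(1.8e9, 10.0)` gives a conductivity of about 1.0 S/m at 1800 MHz. Note that the equation uses the field inside the tissue, which is generally lower than the field measured in air near the plant.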
Fig. 27.1 EM wave exposure setup
Fig. 27.2 Helix antenna used for exposure
27.2 Experimental Setup
A helical antenna transmitter was used to provide exposure at a frequency of 1800 MHz. A right-hand circularly polarized (RHCP) helix with a directive pattern and a gain of around 15 dBi was employed as the radiator in the presented experiments. The E-field strength at the plant was around 7 V/m, and a consistent magnetic field of around 1.9 mG was measured near the plant using a gauss meter.
27.2.1 Determination of Seed Germination and Preparation of Planting Material
Seeds from a single mother plant were taken for the experiment to avoid any genetic variation and were divided into two groups. One group was kept in an isolated place as the unexposed control, and the other was exposed to the 1800 MHz frequency to analyze the rate of seed germination. The rate was calculated as the number of seeds that gave healthy seedlings divided by the total number of seeds tested, multiplied by 100. The seedlings were transferred to separate pots and incubated for 20 days under the same conditions, viz. soil, water, and fertilizer content. The resulting plantlets were used for further analysis.
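The germination rate calculation described above is a simple ratio. A short sketch follows; the function names and the seed counts in the usage note are hypothetical and chosen only to illustrate the arithmetic.

```python
def germination_rate(healthy_seedlings, seeds_tested):
    """Percentage of tested seeds that produced healthy seedlings."""
    if seeds_tested <= 0:
        raise ValueError("seeds_tested must be positive")
    return 100.0 * healthy_seedlings / seeds_tested

def percent_change(control_value, exposed_value):
    """Relative change of an exposed group against the control, in percent."""
    return 100.0 * (exposed_value - control_value) / control_value
```

For example, 11 healthy seedlings from 20 tested seeds gives `germination_rate(11, 20)` equal to 55.0. The second helper expresses an exposed group's result relative to the control, with a negative value indicating a decline.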
27.2.2 Physiological Analysis
Parameters such as plant height and root length of 20-day-old plantlets were measured with a ruler, and the fresh weight of each plant was measured after removing dirt from the roots. The dry weight of the plants was determined by drying them in a microwave oven at 90 °C for 36 h. The leaf parameters, viz. length of the longest leaf, were measured with a ruler, and the leaf area was measured at the harvest stage with a leaf area meter (WDY-500A, Swastik Scientific Company, India).
27.2.3 Biochemical Analysis
The pigment content, viz. chlorophyll a, chlorophyll b, and carotenoid, of both groups of plants was determined by the spectrophotometric method of Wintermans and De Mots (1965). The H2O2 concentration within the leaf tissue was determined according to Zhou et al. (2006).
27.3 Result and Discussion
Solanum melongena L. seeds were exposed to electromagnetic radiation and assessed for selected physiological and biochemical parameters.
27.3.1 Physiological Alterations
The physiological parameters were evaluated for the unexposed control and for periodically exposed seeds and plantlets of eggplant and are summarized in Table 27.1. The rate of seed germination was enhanced after 12–24 h of exposure compared with the control, which can be considered a positive effect of such radio waves. However, this effect was restricted to short-term controlled exposure. Beyond 24 h of exposure, the rate of germination decreased by 6.0–56.58% compared with the control. The plant growth parameters, viz. plant height and root length, were similarly enhanced for seeds exposed for 12–24 h and then deteriorated progressively up to 96 h of exposure. Plant height decreased by 5.0–62.00% for 36–96 h of exposure, whereas root length decreased by 13.61–58.61% compared with unexposed plants. Consequently, the fresh weight and dry weight of the plants deteriorated in parallel under long-term exposure. The relative water content, a crucial factor for plant health, was also reduced by up to 44.40% upon exposure compared with the control. The leaf parameters of the eggplant plantlets also showed significant modulation: leaf count was reduced by 39.49%, length of the longest leaf by 67.10%, and leaf area by 75.74% after 96 h of exposure compared with unexposed leaves, as shown in Fig. 27.3. The higher rate of deterioration upon long-term exposure of 96 h therefore indicates hazardous effects of radio waves on plant physiology. A similar effect of short-term and long-term exposure at 900 MHz on medicinal plant physiology was reported by Upadhyaya et al. [6].

Table 27.1 Effect of electromagnetic exposure on plant seedling growth and physiology

| Exposure time (h) | Rate of seed germination (%) | Plant height (cm) | Root length (cm) | Fresh weight of plants (g) | Dry weight of plants (g) | Relative water content (%) |
|---|---|---|---|---|---|---|
| Control | 55.00 ± 2.50 | 50.25 ± 1.27 | 18.80 ± 1.67 | 22.85 ± 1.30 | 5.15 ± 1.00 | 72.72 ± 0.93 |
| 12 | 63.50 ± 1.63 | 60.12 ± 0.68 | 23.67 ± 0.88 | 31.16 ± 1.22 | 5.00 ± 1.51 | 71.85 ± 2.00 |
| 24 | 64.50 ± 2.00 | 58.00 ± 1.20 | 18.12 ± 1.12 | 31.95 ± 0.63 | 4.72 ± 0.61 | 71.61 ± 1.25 |
| 36 | 51.70 ± 1.50 | 47.72 ± 0.83 | 16.24 ± 1.00 | 20.52 ± 1.25 | 4.58 ± 0.52 | 65.22 ± 1.54 |
| 48 | 46.60 ± 2.05 | 26.00 ± 0.85 | 15.50 ± 0.72 | 19.37 ± 0.75 | 3.57 ± 1.25 | 64.80 ± 2.17 |
| 60 | 41.33 ± 1.00 | 22.50 ± 1.23 | 12.20 ± 1.50 | 18.16 ± 1.40 | 3.00 ± 0.80 | 58.55 ± 3.30 |
| 72 | 30.77 ± 2.30 | 21.60 ± 2.00 | 10.70 ± 1.69 | 17.45 ± 1.03 | 3.00 ± 0.90 | 49.90 ± 1.75 |
| 84 | 27.62 ± 1.75 | 19.78 ± 1.85 | 10.21 ± 0.95 | 12.51 ± 0.68 | 2.97 ± 0.74 | 45.82 ± 1.70 |
| 96 | 23.88 ± 2.50 | 19.00 ± 1.45 | 7.78 ± 1.26 | 12.00 ± 1.00 | 2.08 ± 0.16 | 40.43 ± 0.18 |

The results are represented as mean ± SD, where n = 3 and p < 0.05.
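The percentage reductions quoted in the discussion can be checked against the mean values in Table 27.1. The sketch below assumes the reductions are computed relative to the control mean; the dictionary keys are illustrative names, and only the control and 96 h means are copied from the table.

```python
# Mean values copied from Table 27.1 (control and 96 h rows).
CONTROL = {
    "germination_pct": 55.00,
    "plant_height_cm": 50.25,
    "root_length_cm": 18.80,
    "relative_water_content_pct": 72.72,
}
EXPOSED_96H = {
    "germination_pct": 23.88,
    "plant_height_cm": 19.00,
    "root_length_cm": 7.78,
    "relative_water_content_pct": 40.43,
}


def percent_decrease(control, exposed):
    """Reduction of the exposed mean relative to the control mean, in percent."""
    return 100.0 * (control - exposed) / control


for name, control_value in CONTROL.items():
    drop = percent_decrease(control_value, EXPOSED_96H[name])
    print(f"{name}: {drop:.2f}% lower than control after 96 h")
```

Under this assumption the germination and relative water content figures reproduce the quoted 56.58% and 44.40%; the plant height and root length figures come out within a few tenths of a percent of the quoted values, which may reflect rounding in the original calculation.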
Fig. 27.3 Effects of electromagnetic exposure on leaf parameters
Fig. 27.4 Effect of electromagnetic exposure on chlorophyll content: (a) Chl a content and (b) Chl b content. Both panels are bar charts of pigment content (mg/cm2) against time duration of exposure (control and 12–96 h in 12 h steps).
27.3.2 Biochemical Analysis
The selected biochemical parameters, viz. chlorophyll pigment content, carotenoid content, and the plant stress marker H2O2, were estimated from leaf extracts of exposed and control plantlets. Chlorophyll a and chlorophyll b (Chl a and Chl b) showed parallel outcomes: both were slightly enhanced after 12–24 h of exposure, which correlates directly with the plant physiology outcomes because chlorophyll pigments are responsible for photosynthesis in plants. The respective declines of 52.5% and 37.3% in Chl a and Chl b indicate a negative effect on the photosynthetic potential of exposed plants compared with the control. The carotenoid content increased from 24 to 48 h of exposure, which indicates induction of the plant defense against abiotic stress in the form of radiation. The reduction in carotenoid content from 60 to 96 h of exposure resulted from the weakening of plant health and indicates premature senescence. The H2O2 content is a notable marker for the assessment of plant stress. The presented outcome showed an exponential increase in H2O2 content of 33.33–66% from 12 to 96 h of exposure, which directly indicates stress on the plant tissue (Figs. 27.4 and 27.5). The presented analysis revealed a time-dependent deteriorative effect of radiation on seed germination and physio-chemical parameters. Prolonged exposure deteriorates plant health, which is not suitable for obtaining a high quality and quantity of crop yield.
Fig. 27.5 Effect of electromagnetic exposure on (a) carotenoid content (mg/cm2) and (b) H2O2 content (µM), each plotted as a bar chart against time duration of exposure (control and 12–96 h in 12 h steps).

27.4 Conclusion
The presented investigation found that all measured physiological parameters indicated deterioration in plant growth. The H2O2, chlorophyll, and carotenoid contents were markedly altered after 96 h of exposure to radiation. These outcomes indicate a weakening of the plant defense system and of plant health. Given the consistent advancement in the domain of wireless communication and our march toward 5G frequencies, the consequences of exposure to such radiation on plants should be investigated thoroughly.
References
1. Stefi, A.L., Vassilacopoulou, D., Margaritis, L.H., Christodoulakis, N.S.: Oxidative stress and an animal neurotransmitter synthesizing enzyme in the leaves of wild growing myrtle after exposure to GSM radiation. Flora 243, 67–76 (2018)
2. Upadhyaya, C., Patel, I., Upadhyaya, T., Desai, A.: Exposure effect of 900 MHz electromagnetic field radiation on antioxidant potential of medicinal plant Withania somnifera. In: Inventive Systems and Control, pp. 951–964. Springer, Singapore (2021)
3. Kaur, S., Vian, A., Chandel, S., Singh, H.P., Batish, D.R., Kohli, R.K.: Sensitivity of plants to high frequency electromagnetic radiation: cellular mechanisms and morphological changes. Rev. Environ. Sci. BioTechnol. pp. 1–20 (2021)
4. Kumar, A., Kaur, S., Chandel, S., Singh, H.P., Batish, D.R., Kohli, R.K.: Comparative cyto-and genotoxicity of 900 MHz and 1800 MHz electromagnetic field radiations in root meristems of Allium cepa. Ecotoxicol. Environ. Saf. 188, 109786 (2020)
5. Upadhyaya, C., Patel, I., Upadhyaya, T., Desai, A., Patel, U., Pandya, K.: Investigation of mobile communication radio frequency exposure on the medicinal property of Jasminum grandiflorum L. In: 2021 5th International Conference on Computing Methodologies and Communication (ICCMC), pp. 212–218. IEEE, New York (2021, April)
6. Upadhyaya, C., Dwivedi, V.V., Upadhyaya, T.: Effects of electromagnetic waves on a medicinal plant. Wireless Commun. 4(13), 807–809 (2012)
7. Chandel, S., Kaur, S., Issa, M., Singh, H.P., Batish, D.R., Kohli, R.K.: Appraisal of immediate and late effects of mobile phone radiations at 2100 MHz on mitotic activity and DNA integrity in root meristems of Allium cepa. Protoplasma 256(5), 1399–1407 (2019)
8. Levitt, B.B., Lai, H.C., Manville, A.M.: Effects of non-ionizing electromagnetic fields on flora and fauna, part 2 impacts: how species interact with natural and man-made EMF. Rev. Environ. Health (2021)
9. Stefi, A.L., Mitsigiorgi, K., Vassilacopoulou, D., Christodoulakis, N.S.: Response of young Nerium oleander plants to long-term non-ionizing radiation. Planta 251, 1–17 (2020)
10. Alattar, E.M., Elwasife, K.Y., Radwan, E.S., Alagha, A.M.: Effect of microwave treated water on the growth of corn (Zea mays) and pepper (Capsicum annuum) seedlings. Romanian J. Biophys. 28(3) (2018)
11. Maffei, M.E.: Plant responses to electromagnetic fields. In: Biological and Medical Aspects of Electromagnetic Fields, pp. 89–110. CRC Press, Boca Raton (2018)
12. Roche, J., Didyk, N.P., Ivanytska, B.O., Zaimenko, N.V., Chudovska, O.O.: Effects of the electromagnetic field of Wi-Fi systems and experimental gadget M4 on growth, development and photosynthesis of wheat. Plant Introduct. (85/86), 15–24 (2020)
13. González-Vidal, A., Mercado-Sáenz, S., Burgos-Molina, A.M., Sendra-Portero, F., Ruiz-Gómez, M.J.: Growth alteration of Allium cepa L. roots exposed to 1.5 mT, 25 Hz pulsed magnetic field. Int. J. Environ. Health Res. pp. 1–13 (2021)
14. Temple, N.J.: Antioxidants and disease: more questions than answers. Nutr. Res. 20(3), 449–459 (2000)
15. Barnes, F.S., Greenebaum, B.: Biological and Medical Aspects of Electromagnetic Fields. CRC Press (2018)
Chapter 28
Annadata: An Interactive and Predictive Web-Based Farmer’s Portal
Cdt. Swarad Hajarnis, Kaustubh Sawant, Shubham Khairnar, Sameer Nanivadekar, and Sonal Jain

Abstract The world is rapidly moving toward digitization. In these tough times, when not everyone is comfortable with the digital world, many sections of society face difficulties in updating themselves while keeping up with their professional requirements. Many sections have been neglected during the ongoing pandemic and their issues have not been addressed, even though they play an important role in society as well as in maintaining ecological balance. Agriculture is an important sector in India. It is indispensable for the sustenance and growth of the Indian economy. On average, about 70% of the households and 10% of the urban population are dependent on agriculture as their source of livelihood. Agriculture is the primary source of food and plays an important role in employment and the economy in India. To address some of these issues, this system is developed to help and uplift the farming community, which plays a vital role in many respects. The issues that are currently of highest priority are addressed first. The whole system is developed with the motive of helping this unaddressed community, which selflessly provides for us, and it reflects a keen interest in working for a social cause.

Keywords Random forest algorithm · Rasa · Crop prediction · Weather forecasting · Chatbot

Cdt. S. Hajarnis (B) · K. Sawant · S. Khairnar · S. Nanivadekar · S. Jain
Department of Information Technology, A.P. Shah Institute of Technology, Thane, Maharashtra, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_28
28.1 Introduction
With the increasing use of machines and the rapid evolution of the world toward digitization, many sections of society are not very familiar with this new trend, and there is a strong need to acknowledge this. Many sections of society have faced considerable difficulties during the ongoing Covid-19 pandemic. One of these is the farming community, which faces many issues in day-to-day life. Many of these issues have gone unaddressed and neglected for ages, even though this community has always worked selflessly to provide us with the best produce every day. Farmers are the sole reason we have fresh, nutrition-rich food products, yet many of the problems they have faced for a long time are overlooked by the concerned authorities. Therefore, with a positive approach, we propose a Web portal for farmers in which many of their problems are identified and solutions are provided. The portal was designed after a detailed study, taking into consideration data from relevant sources about the concerned aspects. The system consists of different modules that help farmers in many respects. The highlight of the system is that it is multi-lingual: farmers can access the data, information, and overall operations of the system in their native language. This enables farmers to gain confidence in the system and make the most of it. Security is also taken into consideration: the farmer first registers in our database using a mobile number. After the mobile number is entered, the system generates login credentials for the user and sends them via SMS.
Once users have registered in our database, they can access the portal and benefit from all the modules. The modules are as follows:
• Weather Prediction
• Government Schemes
• Crop Prediction
• Market Prices
• Farm Bills
• Chatbot
Through these modules, users can access a large amount of relevant data and real-time information. This helps them decide their workflow, benefit from government schemes, get real-time weather predictions, and use the chatbot. The system is developed using different technologies and algorithms after a detailed study of each. The algorithms enable precise predictions and more accurate results. We are all aware of the situation that arose from the misinterpretation of the recently introduced farm bills. Many who were directly affected by these farm bills had no proper, legitimate information about them and were misguided throughout. Recognizing the need to guide the people directly affected by the new farm bills was therefore important. Hence, the portal provides all the legal information that concerns them in simpler terms and in their native language (Fig. 28.1).

Fig. 28.1 Web portal
28.2 Literature Survey
Considerable relevant work has been done in this field. The random forest algorithm is used for both regression and classification. In [1], the authors used a tenfold cross-validation technique, which gave high accuracy and indicated a correlation between climate and crop yield. The accuracy of the model in [1] was found to be 87%. In that project, other factors such as soil quality, pests, and the chemicals used were not considered because, as the authors stated, they depend on the type of field. The KNN algorithm [2] is also efficient for creating a chatbot. As these findings show, existing systems have some drawbacks. Existing systems offer crop prediction, chatbots, AI-based chatbots, and weather prediction. Our system includes all these modules and aims for a higher accuracy rate by using different algorithms and technologies. The highlight of the system is that all findings, research, predictions, information, etc., are provided to farmers in their native language to make it more user-friendly. Farmers can connect easily with the portal because it addresses them in their own language, and this is the key factor. The products currently on the market are very limited and address only narrow specifics; in addition, the long-standing language barrier makes users skeptical about a product and reserved in their approach toward it. Conveying information in the language users are most familiar with, that is, their native language, will help our product win their trust, and that trust will motivate us to work even better and make the most of our skill set. Different data mining techniques and their outputs with respect to crop prediction were studied [3]. According to the study, agriculture contributes more than 10% of India's total GDP, and many families depend on it as their source of livelihood [4]. With the advancement of the IT sector, it becomes necessary to provide farmers with better products that deliver both quantity and quality. Accordingly, we update and modify the current systems and add new features and modules that address more of the issues and problems faced by the target audience. The information on government schemes [5, 6] is very limited and needs to be updated over time; moreover, there is no flow through which beneficiaries are notified about a newly launched government scheme, which is merely updated on the site.
28.3 Methodology and Functionality
Users first register on the portal using their contact number, and the portal asks for some basic user data, as described in Sect. 28.4. After a successful login, the user can access the various modules of the system.

Weather Prediction. Once users enter their district, a request is sent to the OpenWeather API, which returns detailed weather information. This data is fetched every 10 s. During extreme weather conditions, when red flags are raised in the weather forecast, the registered users are notified by SMS through the Fast2SMS API.

Government Schemes. The system contains detailed information on existing government schemes: how and where to apply, the required documents, and all other basic information needed to become a beneficiary of a scheme. The admin adds newly launched government schemes to the portal. Whenever a new government scheme is added to the portal, all registered users are notified by SMS on their registered mobile number.

Farm Bills. The target audience comes from underdeveloped areas and has little knowledge of technical specifics; the literacy rate in those areas is also very low compared with any urban city. As a result, the people who are directly concerned with the newly introduced farm bills know very little about what exactly the bills are. This is an overall loss for the country and not just for one community. The portal provides all the legal information and its technicalities in simpler terms and in the users' native language. This module makes them independent of any external party in understanding the technicalities of the legal forms.
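The weather-alert flow described above (look up the forecast for a district, then send an SMS to registered users when a red flag is raised) could be sketched as follows. This is a minimal illustration rather than the portal's implementation: the report keys, the thresholds that define a red flag, and the `fetch_weather` and `send_sms` callables are all hypothetical stand-ins for the weather and SMS services, which are injected so the logic can be exercised without network access.

```python
from typing import Callable, Iterable

# Hypothetical red-flag thresholds; the source does not state the actual criteria.
RED_FLAG_RAIN_MM_PER_H = 50.0
RED_FLAG_WIND_M_PER_S = 20.0


def is_red_flag(report: dict) -> bool:
    """Return True when the (assumed) report fields exceed either threshold."""
    rain = report.get("rain_mm_per_h", 0.0)
    wind = report.get("wind_m_per_s", 0.0)
    return rain >= RED_FLAG_RAIN_MM_PER_H or wind >= RED_FLAG_WIND_M_PER_S


def notify_on_red_flag(
    district: str,
    fetch_weather: Callable[[str], dict],
    phone_numbers: Iterable[str],
    send_sms: Callable[[str, str], None],
) -> int:
    """Fetch the district's report and text every registered number if it is severe.

    Returns the number of messages handed to the SMS gateway.
    """
    report = fetch_weather(district)
    if not is_red_flag(report):
        return 0
    message = f"Weather alert for {district}: severe conditions expected."
    sent = 0
    for number in phone_numbers:
        send_sms(number, message)
        sent += 1
    return sent
```

In a deployment, `fetch_weather` would wrap the weather service call and `send_sms` the SMS gateway; a scheduler would invoke `notify_on_red_flag` at the chosen polling interval.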
Fig. 28.2 Crop prediction
Crop Prediction. Users select their district and season (Kharif, Rabi, or whole year). Using the random forest algorithm, a list of crops suitable for the selected district is then displayed (Fig. 28.2). This list is generated from parameters such as temperature, rainfall, soil type, and soil pH of the selected district, that is, the geographic and climatic conditions of the district. Each crop has an image and information on the portal, including the climate and soil needed for the crop, the time of sowing, pesticides and precautions, government handouts, and some videos related to that crop. There is a “Will Plant” button; when a farmer clicks it, the choice is stored in the back-end database to keep a record of the farmers growing that particular crop in the same region or district. The click also adds 1 to the existing count for the crop in that region, and the updated value is shown to future users.
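The “Will Plant” bookkeeping described above can be sketched with an in-memory counter. This is an illustrative sketch only: the class and method names are hypothetical, a real portal would persist the data in its database, and counting each farmer at most once per crop and district is an added assumption that the source does not state.

```python
from collections import Counter


class PlantingRegistry:
    """Tracks how many farmers intend to grow each crop in each district."""

    def __init__(self):
        self._counts = Counter()   # (district, crop) -> number of farmers
        self._choices = set()      # (farmer_id, district, crop) already recorded

    def will_plant(self, farmer_id, district, crop):
        """Record a click on the Will Plant button and return the updated count."""
        key = (farmer_id, district, crop)
        if key not in self._choices:   # assumption: one vote per farmer per crop
            self._choices.add(key)
            self._counts[(district, crop)] += 1
        return self._counts[(district, crop)]

    def count(self, district, crop):
        """Number of farmers who chose this crop in this district so far."""
        return self._counts[(district, crop)]


registry = PlantingRegistry()
registry.will_plant("farmer-1", "Thane", "rice")
registry.will_plant("farmer-2", "Thane", "rice")
print(registry.count("Thane", "rice"))
```

Showing this count to later users lets them see how many neighbors already plan to grow the same crop before they decide.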
Chatbot. Our virtual chatbot enables farmers to post their queries and get instant solutions. It is available 24 × 7, so queries can be resolved at any hour of the day. The chatbot offers multi-lingual support: farmers can post queries in their native language, and the answer is provided in the same language. This helps reduce the language barrier for end users. This module is designed using Rasa.

Market Prices. This module addresses the post-cultivation part of the agricultural domain. The government-defined price of each crop is provided along with the market price of that crop so that farmers can maximize the profits they truly deserve.
28.4 System Architecture
We design a portal that addresses the issues faced by our end users, and the system is committed to being user-friendly. New users, for whom we hold no records, first register in our database; they can simply register using their mobile number. After the mobile number is provided, it is verified for security reasons. The system then generates a password that the user can use for further logins (Fig. 28.3). After a successful login, users create their profile, i.e., name, age, sex, region, district, the crop type they have been cultivating, etc.; once this information is provided, the profile is created and recorded in the database. Users can then benefit from all the modules of the system. These modules enable the target audience to obtain relevant information about many fields that concern them. Overall, the system helps farmers by addressing major issues, and we intend to extend the scope of the project and its modules in the future. Multi-lingual support makes the system user-friendly, which can increase the comfort level of the target audience and win their trust in the product. Once users have successfully logged in to the portal, we are obliged to provide them with all kinds of support, and the system ensures this (Fig. 28.4).
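The registration step described above (mobile number in, generated password out via SMS) might look like the following sketch. Everything here is an assumption made for illustration: the function names, the 10-digit mobile number check, the in-memory `users` dictionary standing in for the database, the injected `send_sms` callable, and the choice to store only a salted PBKDF2 hash of the generated password.

```python
import hashlib
import secrets
import string


def generate_password(length=10):
    """Random alphanumeric password drawn from a cryptographically secure source."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


def hash_password(password, salt=None):
    """Return (salt, digest) using PBKDF2-HMAC-SHA256 from the standard library."""
    if salt is None:
        salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return salt, digest


def register(mobile, users, send_sms):
    """Create credentials for a new mobile number and send them by SMS."""
    if not (mobile.isdigit() and len(mobile) == 10):  # assumed number format
        raise ValueError("expected a 10-digit mobile number")
    if mobile in users:
        raise ValueError("mobile number already registered")
    password = generate_password()
    users[mobile] = hash_password(password)   # only the salted hash is stored
    send_sms(mobile, f"Your portal password is {password}")


def verify_login(mobile, password, users):
    """Check a login attempt against the stored salted hash."""
    if mobile not in users:
        return False
    salt, stored = users[mobile]
    _, candidate = hash_password(password, salt)
    return secrets.compare_digest(candidate, stored)
```

Storing a salted hash rather than the password itself is a common design choice, so a leaked user table does not directly reveal the credentials that were sent by SMS.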
Fig. 28.3 Login

28.5 Conclusion
By applying our knowledge and skill set, we are determined to build a completely user-friendly Web portal that is useful to the targeted audience, farmers. The system ensures multi-lingual support and has the other modules mentioned above, which address different specific issues. In the near future, the product will be developed with the motive of helping this unaddressed community resolve its issues. The overall performance will be tried and tested, and the product will be made available once all trials and errors have been worked through, so that users can enjoy, learn, and grow in all aspects using our product.
Fig. 28.4 Flow diagram
References
1. Moraye, K., Pavate, A., Nikam, S., Thakkar, S.: Crop yield production using random forest algorithm for major cities in Maharashtra. Int. J. Innov. Res. Comput. Sci. Technol. (IJIRCST) (2021)
2. Yashaswini, D.K., Hemalatha, R., Niveditha, G.: Smart chatbot for agriculture. Int. J. Eng. Sci. Comput. (2019)
3. Jambekar, S., Nema, S., Saquib, Z.: Prediction of crop production in India using data mining techniques. In: 2018 Fourth International Conference on Computing Communication Control and Automation (ICCUBEA)
4. Deepika, K., Tilekya, V., Mamatha, J., Subetha, T.: Jollity chatbot - a contextual AI assistant. In: 2020 IEEE Third International Conference on Smart Systems and Inventive Technology (ICSSIT)
5. For government schemes. www.businessinsider.in
6. For government schemes. www.nvshq.in
Chapter 29
A Survey on Privacy Preserving Voting Scheme Based on Blockchain Technology
P. Priya, S. Girubalini, B. G. Lakshmi Prabha, B. Pranitha, and M. Srigayathri
Abstract The blockchain methodology uses cryptographic hashes to establish end-to-end verification and thereby secure votes. The aim of the project is to establish a successful voting methodology with the help of the blockchain mechanism. With this mechanism, every vote is stored as a new block and updated in the database. The system ensures that voting maintains the one-person, one-vote (democracy) principle. This is accomplished by matching the voter’s unique face biometric at the start of each voting attempt to avoid double voting. We have created an online voting system based on blockchain that addresses the issues faced by earlier methodologies. Each vote is processed as a separate transaction. Miners reject a vote if it is suspected of being malicious. The blockchain mechanism makes the vote trustworthy and reliable; it can also help raise the number of voters and enhance trust in the government. In this project, we apply the SHA-256 hash function for secure password hashing. Every recorded vote must have a unique cryptographic hash so that each vote can be verified individually. This characteristic supports the verifiability of the overall voting process. In addition, the counted vote is more secure, and no individual, including the operator, knows the details of the recorded vote.

Keywords Voting · Face biometric · Blockchain · Polling · Result prediction
P. Priya (B) · S. Girubalini · B. G. Lakshmi Prabha · B. Pranitha · M. Srigayathri
Department of Computer Science and Engineering, M.Kumarasamy College of Engineering, Karur 639113, Tamilnadu, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_29

29.1 Introduction
Electronic voting has grown in popularity as a way to remove redundancies and inefficiencies in the paper-based voting system. Online polling systems have been emerging as a commercially important technology in fast-developing countries. An online polling system should provide the same reliability and integrity as traditional polling systems. Online polling systems should guarantee reliable service; that is, they should assure voters that their votes cannot be altered by anybody under any circumstances. Online polling systems are mainly designed to conceal the ballots. However, because various intelligence services throughout the world control distinct areas of the Internet and can detect or intercept ballots, such a system does not ensure complete anonymity or integrity. Owing to the security and privacy problems discovered over time, the experience of the last two decades implies that online voting has not been very successful. Using an adjustable blockchain technique, the framework given in this study addresses the effectiveness of the polling procedure, the utility of hashing algorithms, block creation and sealing, data accumulation, and result declaration, and it also prevents fake voting by using a face recognition approach [1, 2]. A face detection system is used to identify the individual’s state and characteristics [3]. Physical attributes such as face, fingerprints, hands, voice, and iris are the features taken into account in biometric recognition [4]. As a result, the system provides real-time authentication that uses face biometrics to authorize people to vote online. This development upgrades the online polling scheme, enhances its reliability, and eases the management of information that is otherwise difficult to handle in a blockchain. It also ensures that there is no connection between the identities of the voters and the operators, to avoid information leakage. During the voting procedure and after the election, the voter must remain anonymous. The votes must be correct. The recorded ballots should be irreplaceable and unchangeable [5, 6]. The polling system should be effective in making the counting of votes more secure. Our solution supports mobility, flexibility, and efficiency in addition to these primary requirements. The discussion in this work is, however, confined to the four main requirements.

In the traditional polling scheme, ballots are cast with pen and paper, which exposes the process to fraud such as the modification of votes; a reliable online polling scheme would help overcome these issues. The information stored in a blockchain is sequential and cannot be modified or dropped by anyone without authorization. As a result, the collected data is unchangeable. The blocks are connected to one another because each block contains a hash that characterizes the preceding block; this mechanism links the whole chain and ensures immutability [7, 8]. The blockchain is set up as a private chain so that access to the recorded votes is restricted and no one can change or modify the ballots in the online polling system. In the blockchain methodology, every activity requires authorization, so only permitted nodes, not all nodes that are created, can communicate with the blockchain. Even though this mechanism has many benefits, it does not guarantee complete security and reliability, because various intelligence services control different parts of the Internet, which would enable them to detect and intercept ballots.
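The hash-linking idea described above, where each block stores the hash of the preceding block so that any later modification becomes detectable, can be sketched with the standard library. This is a simplified illustration and not the scheme surveyed in the chapter: the block fields, the all-zero genesis hash, and the function names are assumptions, and consensus, mining, and voter authentication are left out.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder for the first block's predecessor


def block_hash(index, vote, prev_hash):
    """SHA-256 over a canonical JSON encoding of the block contents."""
    payload = json.dumps(
        {"index": index, "vote": vote, "prev_hash": prev_hash}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_vote(chain, vote):
    """Add a vote as a new block linked to the hash of the previous block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    block = {
        "index": index,
        "vote": vote,
        "prev_hash": prev_hash,
        "hash": block_hash(index, vote, prev_hash),
    }
    chain.append(block)
    return block


def is_valid(chain):
    """Recompute every hash and check each link to the preceding block."""
    prev_hash = GENESIS_HASH
    for position, block in enumerate(chain):
        if block["index"] != position or block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != block_hash(block["index"], block["vote"], block["prev_hash"]):
            return False
        prev_hash = block["hash"]
    return True


chain = []
for choice in ["candidate-A", "candidate-B", "candidate-A"]:
    append_vote(chain, choice)
print(is_valid(chain))
```

Changing the vote stored in any earlier block changes its recomputed hash, so `is_valid` fails unless every later block is rewritten as well, which is the property the text refers to as immutability.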
29.1.1 Features of Blockchain Technology
29.1.1.1 Better Transparency
One of the major difficulties in current industry is transparency. To improve transparency, organizations have tried to introduce additional rules and regulations. However, one factor prevents any system from being completely transparent: centralization. An organization can use blockchain to create a completely decentralized network that eliminates the need for a centralized authority, increasing the system’s transparency [9]. A blockchain has a number of peers that analyze and record the ballots and guide the entire process. Not every peer takes part in the consensus process, but every peer has the right to take part in the verification stage. The consensus method relies on decentralization for validation. Each node preserves a copy of the transaction record once it has been validated. For organizations, transparency has a significant effect. As mentioned previously, governments can use this openness to improve government processes and even to conduct voting.
29.1.1.2 Enhanced Security
Compared with other approaches, blockchain technology offers a high level of security. Every transaction must be agreed upon through a consensus mechanism. In addition, each transaction is encrypted and linked to the previous transaction through hashing. Because security is a central concern in this methodology, every block holds a record of the activities carried out on the network. If a malicious party attempts to alter any information in the network, the attempt fails, because the request to access and modify the block is rejected. Data in the blockchain can only be read; it cannot be modified. This also makes blockchain a good option for systems that rely on immutable data, such as systems that record citizens' ages.
29.1.1.3 Reduced Costs
Businesses currently spend a lot of money to improve the management of their existing systems. That is why they aim to cut costs and invest the savings in something new or in improving current processes. Organizations can save a lot of money by adopting blockchain to reduce the costs associated with third-party vendors. Because blockchain has no inherent centralized party, there are no vendor fees to pay. Furthermore, authenticating a transaction requires less interaction, which avoids spending money or time on basic tasks.
P. Priya et al.
29.1.1.4 True Traceability
Companies can use blockchain to build a supply chain that includes both vendors and suppliers. In a traditional supply chain it is difficult to trace items, which can lead to a variety of problems such as theft, counterfeiting, and product loss [10]. Blockchain has an advantage over the traditional approach because it makes the supply chain transparent. It allows all parties in the supply chain to track goods and verify that they are not being replaced or misused. Companies can also benefit from blockchain traceability by using it internally.
29.1.1.5 Improved Speed and High Efficiency
The final industrial benefit provided by blockchain is increased efficiency and speed. Blockchain automates time-consuming processes in order to increase efficiency, and this automation also eliminates manual errors. The digital ledger makes all of this possible by providing a single place to store transactions. As a result of process simplification and automation, everything becomes efficient and fast.
29.1.2 Applications of Blockchain Technology
29.1.2.1 Financial Services
Blockchain enables many innovative approaches in financial services. It simplifies asset management and payment processing through an automated business supply chain that lets users access and retrieve data in a uniform manner. Because blockchain is used, no intermediaries are needed to transmit the information, and since no third party is involved, the possibility of data leakage is reduced.
29.1.2.2 Health Care
Managing and recording health-related information is very important, and such information should not be altered by anyone; blockchain technology is used to ensure this. To prevent fraudulent activity in health care records, the information is stored in distributed databases through blockchain, with the help of encryption and digital signatures, to assure a reliable and secure service [11].
29.1.2.3 Government Sectors
Even in government sectors, services and operations can be handled with blockchain technology. Blockchain helps remove some of the challenges that government organizations currently face, such as data transfer between departments. Proper linking and sharing of data with blockchain enables better management of data across multiple departments. It increases transparency and provides a better way to monitor and audit transactions.
29.1.2.4 Consumer Packaged Goods and Retail
Using blockchain technology in buying and selling goods makes these services more reliable and efficient. This includes confirming the authenticity of high-value goods, preventing fraudulent transactions, locating stolen items, providing digital warranties, managing loyalty points, and optimizing bargaining processes, among other things.
29.1.2.5 Travel and Hospitality
Blockchain technology helps improve the efficiency of travel-related operations in a user-friendly way. It can be used for money transactions, for storing important documents such as passports and other identification cards, and for making reservations, managing travel insurance, and rewarding loyalty.
29.2 Related Work
Alvi et al. [12] observed that digital technology is used in the polling process in many countries, but that digitalization alone cannot solve the problem completely, since there are various ways of manipulating or tampering with digital technologies in order to prevent people from voting. Their paper combines digitalization with blockchain to develop a polling system that addresses these challenges. Data integrity, privacy, and voter security were achieved in the proposed electronic polling method using a Merkle tree and a fingerprint hash.
Qureshi et al. [13] presented SeVEP, a secure and verifiable polling system that uses well-known cryptographic primitives to provide vote and voter privacy and vote integrity. The identity of voters is checked through a two-stage verification process, which allows many voters to vote at the same time, prevents duplicate ballots, and achieves verifiability and uncoercibility in the presence of an untrusted polling device. The security, performance, and comparative analysis, in terms of security properties and cryptographic costs, showed SeVEP to be a secure, verifiable, and practical online polling system.
Gao et al. [14] designed a blockchain-based polling scheme that ensures greater transparency when ballots are recorded. In addition, the scheme can audit ballots that have been recorded incorrectly, and it uses cryptographic techniques to resist quantum attacks. The scheme was suitable for elections with a small number of voters, where it has some advantages in security and performance.
Yavuz et al. [15] developed and analyzed an online polling system for the Ethereum network using Ethereum wallets and the Solidity language. After an election is completed, the Ethereum blockchain holds the record of every vote. Voters can cast their votes using either an Android device or an Ethereum wallet. The paper also notes that a polling system built on blockchain provides a dependable service.
Hjalmarsson et al. [16] examined the use of blockchain as a service to implement distributed e-voting systems. The study elicited the requirements for building electronic voting systems, as well as the legal and technological limitations of using blockchain as a service to implement such systems. It began by evaluating some of the most popular blockchain frameworks that offer blockchain as a service. The proposed system aims to reduce the cost of holding an election while providing more security than traditional polling.
Chaieb et al. [17] designed a fully verifiable online electronic voting protocol using a blockchain. VYV (Verify-Your-Vote) is an e-voting protocol that uses cryptographic primitives based on Elliptic Curve Cryptography (ECC), pairings, and Identity-Based Encryption (IBE). It guarantees the following privacy and security properties: only eligible voters can vote, voter authentication, vote privacy, receipt-freeness, and fairness, as well as individual and universal verifiability. The authors also used a tool called ProVerif to verify the security of the protocol.
Sharma et al. [18] proposed a network architecture for the smart city that combines the strengths of emerging Software Defined Networking (SDN) and blockchain technologies. The architecture is divided into two parts: a core network and an edge network. Through this hybrid design, the strengths of centralized and distributed network architectures are combined. The approach also includes a Proof-of-Work scheme to ensure security and privacy. To evaluate the feasibility and performance of the proposed model, the authors simulated it and analyzed it against various performance metrics.
Curran [19] described blockchain as a set of records that maintains a continuously growing list of data. It is decentralized, avoiding a single point of failure, with the network working collectively to validate new transactions. It is made up of data-structure blocks that contain batches of individual transactions as well as the results of any blockchain executables.
Tarasov et al. [20] argued that, to be successful, electronic voting requires a more transparent and secure method than present systems provide, and they used blockchain in their model. The voting system's underlying technology is a payment scheme that provides transaction anonymity, a feature that had yet to be seen in blockchain protocols. The proposed protocol ensures that voter transactions are anonymous and private, while the election remains transparent and secure.
Ayed [21] designed an online polling system based on blockchain for conducting local and national elections. The system also offers a protest ballot, which lets people cast a blank vote to show their dissatisfaction with or rejection of the current ruling parties. Every time a voter casts a vote, the activity is recorded and the blockchain is updated. The related work is summarized in Table 29.1.
29.3 Blockchain Technology
Blockchain is built on a peer-to-peer network model and provides a universal data set that can be accessed by anybody, even by parties that do not trust each other. It provides a trusted, shared ledger of transactions, with immutable and encrypted copies of the data stored on every node in the network. Blockchain and derived technologies provide a transparent accounting and governance layer for the Internet. All network participants have equal access to the same data in real time [22, 23]. All actors can see what is happening on the network, and transactions can be traced back to their origin. Blockchain can alternatively be described as a distributed accounting machine or as a public, transparent, supranational governance system. A transaction is permanently written to the blockchain when the network confirms it by majority consensus. Otherwise, the transaction is rejected and is not processed by the system. Only transactions that have been incorporated into the blockchain are valid and final [24, 25]. A blockchain protocol operates on top of the Internet, on a peer-to-peer network of computers that all run the protocol and hold an identical copy of the ledger of transactions, enabling P2P value transactions without an intermediary through machine consensus. In everyday terms, the shared set of records known as a blockchain serves as a common registry that secures all the activities performed. It is a shared, trusted, public ledger of transactions that anyone can inspect but that no single party controls. A blockchain is a distributed database that maintains a continuously growing list of transaction records, cryptographically secured against tampering and revision [26]. Using the blockchain mechanism makes the vote trustworthy and reliable, helps increase the number of voters, and enhances trust in the government. Anyone, from any place, can join and leave Bitcoin and Ethereum, which are open-source blockchain platforms. They are secured by hard mathematical problems [27, 28].
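The majority-consensus rule described above can be illustrated with a short sketch. It is not the protocol of any particular blockchain: the function name `is_confirmed`, the node identifiers, and the strict-majority threshold are assumptions made for this example.

```python
def is_confirmed(node_votes):
    """Return True when a strict majority of nodes accept the transaction.

    node_votes maps a (hypothetical) node identifier to True (accept)
    or False (reject).
    """
    if not node_votes:
        return False
    accepted = sum(1 for ok in node_votes.values() if ok)
    return accepted > len(node_votes) / 2


ledger = []
transaction = {"voter": "V1", "choice": "A"}
votes = {"node-1": True, "node-2": True, "node-3": False}

if is_confirmed(votes):
    ledger.append(transaction)  # confirmed: written to the ledger
# otherwise the transaction is rejected and never recorded
```

A tie is treated as a rejection here, which is one possible design choice; a real network would define its own threshold.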
Table 29.1 Related works
S. No. | Title | Techniques | Findings
1 | Digital Voting: A Blockchain-Based E-Voting System Using Biohash and Smart Contract [12] | Blockchain and fingerprint hash | In the proposed voting blockchain, a hash is generated from the voter's details, and the information gathered is saved as a block in the chain
2 | SeVEP: Secure and Verifiable Electronic Polling System [13] | Cryptographic primitives, graphical password technique | Developed a working prototype and evaluated its flexibility and usability in a real-world deployment
3 | An Anti-Quantum E-Voting Protocol in Blockchain with Audit Function [14] | Decentralization and tamper-resistant features | Provides features that maintain the security and accuracy of the election with the help of a combined ring signature
4 | Towards Secure E-Voting Using Ethereum Blockchain [15] | Ethereum blockchain | Using the Ethereum network and the blockchain structure, it discusses the issues that can arise in an online polling scheme
5 | Blockchain-Based E-Voting System [16] | Blockchain technology and POA Network | Online polling is convenient for voters because it is cheaper and quicker. It also avoids unwanted interaction between voters and candidates during polling, keeping candidates in suspense until the results are known
6 | Verify-Your-Vote: A Verifiable Blockchain-Based Online Voting Protocol [17] | Elliptic Curve Cryptography | Modeled the protocol with the ProVerif tool and demonstrated that it ensures vote privacy, secrecy, and voter authentication
7 | Blockchain Based Hybrid Network Architecture for the Smart City [18] | Software Defined Networking (SDN) | Proposed a model to assure security and privacy and to prevent attackers from tampering with information
8 | E-Voting on the Blockchain [19] | Interplanetary File System (IPFS) | Creating a blockchain-based polling scheme does not require the government to change the entire process; it is enough to make certain alterations to the current polling scheme
9 | The Future of E-Voting [20] | zk-SNARKs | Government-run elections do not come under this system. It can be extended to opinion polls or business elections, providing a voting platform independent of cost or context
10 | A Conceptual Secure Blockchain-Based Electronic Voting System [21] | Secure Hash Algorithm (SHA 256) | Blockchain technology was proposed for an electronic voting system. Trust does not depend on a single party because the system is distributed. Any device with Internet connectivity can be used to cast votes
Processing is much faster in a private blockchain than in a public blockchain because a private blockchain has fewer nodes. A consortium blockchain, on the other hand, exists among organizations, and membership policies are established to manage transactions better. This study employs a consortium blockchain, because the blockchain is regulated by an administrator appointed for each location. The block header contains information about the block, such as the previous hash, nonce value, and difficulty, as well as the time stamps of the block and its transactions. The size of each block ranges from one to eight megabytes. Blocks are identified and stored separately according to their header. Fig. 29.1 shows the process of block creation with a hash link.
Fig. 29.1 Block creation with hash link
Hashing is the process of converting an input of arbitrary size into an output of fixed size. Several algorithms have been devised to perform hashing. The MD5 algorithm produces a 128-bit hash, written as 32 hexadecimal characters. MD5 succeeded the earlier algorithms in the MD series and was designed as a cryptographic hash function [29]; however, it has flaws that allow different inputs to produce the same hash value (collisions), making it vulnerable. Another cryptographic hash algorithm is SHA-1 (Secure Hash Algorithm 1), which produces a 160-bit hash, written as 40 hexadecimal characters. This algorithm was also unable to withstand collision attacks, and its use has declined. Newer algorithms, including SHA 256 and SHA 3, have since been proposed [30]. The US National Security Agency designed the SHA-2 family of algorithms, to which SHA 256 and SHA 512 belong. SHA 256 and SHA 512 have no known collision weaknesses and are considered secure, at least for the time being.
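The digest lengths quoted above can be checked with Python's standard `hashlib` module. The snippet below is only an illustration: the helper name `digest_lengths` and the sample message are assumptions, and the 160-bit SHA algorithm appears under its `hashlib` name `sha1`.

```python
import hashlib


def digest_lengths(message: bytes) -> dict:
    """Map each algorithm name to (bits, hexadecimal characters) of its digest."""
    lengths = {}
    for name in ("md5", "sha1", "sha256", "sha512"):
        hex_digest = hashlib.new(name, message).hexdigest()
        # Each hexadecimal character encodes 4 bits.
        lengths[name] = (len(hex_digest) * 4, len(hex_digest))
    return lengths


for name, (bits, chars) in digest_lengths(b"ballot: candidate A").items():
    print(f"{name}: {bits} bits, {chars} hexadecimal characters")
```

The digest length depends only on the algorithm, not on the size of the input message.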
29.3.1 Voting Process
The aim is to implement an e-voting scheme based on blockchain technology that meets the core e-voting requirements while also providing a degree of decentralization and putting as much control over the system in the hands of the voters as feasible. The system is designed to support a biometric-verification-based polling application for real-world use, taking into account requirements such as privacy, eligibility, convenience, receipt-freeness, and verifiability. Fig. 29.2 shows block creation with a hash link.
Fig. 29.2 Block creation with hash link
29.3.1.1 Voting Interface Creation
Because e-voting, the primary function of this solution, requires privacy, security, anonymity, and verifiability, the underlying technology must be dependable enough to address these requirements, and blockchain technology is well suited to this. This module covers interface creation for a secure voting process. To access the system, the administrator is given a separate user name and password that distinguish the administrator from the voters. The administrator is responsible for keeping all information in the database up to date. In this module, the administrator can view voter details such as name, address, mobile number, age, gender, and Aadhaar card number, which are saved in the database. The administrator can check the voter information held in the system [31].
29.3.1.2 Candidate Details
This module covers the process of adding candidates. The election commission is in charge of preparing the voter list for polling, using prior electoral records as a basis [32]. The administrator adds candidate details such as name, symbol, and party name. These details are verified by the candidates and added to the voting database. During polling, the candidate details are shown to the voters.
29.3.1.3 User Credentials
Users enter their details to register. Voters can access the polling application as soon as enrollment is complete. The details registered are name, mobile number, age, gender, address, Aadhaar card number, voter ID, and password; a face image is also captured for unique verification [33]. These details are stored in the system, where the administrator can view the registration information and reject illegitimate voter entries.
29.3.1.4 User Verification and Polling
This module uses a user ID and password to give voters access to the polling application. Once the entered details are verified by the server, a face image is captured for verification. Facial features are extracted and matched against those stored on the server. During login, these characteristics are taken into account. The polling device displays the candidates' names and the electoral symbols of their parties, so that voters can see them and cast their ballot for the candidate of their choice. After verification succeeds, voters can select the candidate they want to win. Once voting is complete, the details are transferred to the server in a secure manner. A voter can cast a ballot only once; once the vote is recorded, the voter cannot change it, which helps prevent fraud.
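The one-vote-per-voter rule above can be sketched as follows. The class and method names (`BallotBox`, `cast`) are hypothetical, and the sketch assumes that password and face verification have already succeeded before `cast` is called; it is not the authors' implementation.

```python
class BallotBox:
    """Accepts at most one ballot per verified voter (illustrative only)."""

    def __init__(self, candidates):
        self.candidates = set(candidates)
        self.voted = set()   # voter ids that have already cast a ballot
        self.ballots = []    # recorded choices, stored without voter ids

    def cast(self, voter_id, candidate):
        if candidate not in self.candidates:
            raise ValueError("unknown candidate")
        if voter_id in self.voted:
            return False     # a second attempt is refused
        self.voted.add(voter_id)
        self.ballots.append(candidate)
        return True


box = BallotBox(["Candidate A", "Candidate B"])
first = box.cast("voter-001", "Candidate A")    # accepted
second = box.cast("voter-001", "Candidate B")   # refused: already voted
```

Keeping the set of voter ids separate from the list of ballots is one simple way to refuse repeat votes without storing who voted for whom.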
29.3.1.5 Blockchain Implementation
Block generation is a fundamental component of the election process; it is needed to record the votes that are cast. The votes polled and the transactions involved are recorded in blocks, which must be hashed at the end of the polling period. The SHA 256 algorithm is used to hash the block's data (i.e., the complete result). This is done by combining the results of each node and grouping them in pairs. After finishing its transactions, each block generates a hash value that will be used by the following block; the hash is computed from the hash of the preceding block, a randomly chosen number, and the hash of the block's own data. Hashing secures the data in the blocks so that it cannot be modified without detection [34, 35].
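Hashing results in pairs, as described above, resembles a Merkle tree. The sketch below shows one way this could look with SHA 256; the function names, the `|` separator, and the rule of duplicating the last hash when a level has an odd number of entries are assumptions made for illustration, not details given in the text.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def merkle_root(items):
    """Hash each item, then hash neighbours pairwise until one value remains."""
    if not items:
        return sha256_hex(b"")
    level = [sha256_hex(item.encode()) for item in items]
    while len(level) > 1:
        if len(level) % 2:               # odd count: duplicate the last hash
            level.append(level[-1])
        level = [sha256_hex((level[i] + level[i + 1]).encode())
                 for i in range(0, len(level), 2)]
    return level[0]


def block_hash(previous_hash: str, nonce: int, votes) -> str:
    """Combine the preceding block's hash, a number, and the hash of the votes."""
    payload = f"{previous_hash}|{nonce}|{merkle_root(votes)}"
    return sha256_hex(payload.encode())
```

In this sketch the number (`nonce`) is supplied by the caller; choosing it at random, as the text describes, could be done with the standard `secrets` module.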
29.3.1.6 Result Announcement
The principal nodes of the blockchain help retrieve and combine the results recorded in the blocks [36]. Once polling is complete and the election results are announced, no further block generation is needed [37].
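Combining the results recorded in the blocks amounts to a tally over the chain. The following sketch assumes, purely for illustration, that each block is a dictionary with a `votes` list of candidate names; the text does not specify the block layout.

```python
from collections import Counter


def tally(chain):
    """Count the votes recorded in every block of the chain."""
    counts = Counter()
    for block in chain:
        counts.update(block["votes"])
    return counts


chain = [
    {"votes": ["Candidate A", "Candidate B"]},
    {"votes": ["Candidate A"]},
]
results = tally(chain)
winner, winning_votes = results.most_common(1)[0]
```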
29.3.2 Algorithm: Blockchain Technology
Blockchain is a digital scheme for storing data in blocks. The blocks are chained together, which makes their contents impossible to modify. Once a block of data is linked to the other blocks, its data is immutable. It remains publicly available, exactly as it was when added to the blockchain, to anyone who wants to see it [38, 39]. Secure hash algorithms such as SHA 1 and the SHA 2 family (including SHA 256) are used with blockchain because their hash functions produce distinct outputs for different inputs [40, 41]. The main property of a hash is that it creates a unique key that identifies each transaction performed. Blockchain is secure because it uses cryptographic hashing, which produces a strong and robust hash key and converts a piece of data into a string of characters. All transactions are hashed into a single value before being placed in a block [42]. The hash pointer, which holds information that cannot be altered, links one block to the next. Hence, any change made to a transaction results in a different string of characters and affects all subsequent blocks [43, 44].
29.3.2.1 Block and Hash Generation
1. A block carries the current transaction information.
2. Each piece of data generates a hash.
3. A hash is a string of numbers and letters.
4. Transactions are recorded in the order in which they occurred.
5. The current hash depends not only on its own transaction but also on the hash of the preceding transaction.
6. Even a small alteration to a transaction produces a new hash.
7. The nodes check the hashes to verify that no transaction has been modified.
8. A transaction is placed in a block only if every node accepts it.
9. A blockchain is formed only when every block carries the credentials of the preceding block.
10. A blockchain is effective because it is distributed among multiple computers, each holding a copy of the blockchain.
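The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the system described in the chapter: the field names, the use of JSON to serialize a block before hashing, and the all-zero hash for the first block are choices made for the example.

```python
import hashlib
import json


def hash_block(index, previous_hash, transactions):
    """SHA-256 over the block's contents, including the preceding hash."""
    body = json.dumps(
        {"index": index, "previous_hash": previous_hash,
         "transactions": transactions},
        sort_keys=True,
    )
    return hashlib.sha256(body.encode()).hexdigest()


def add_block(chain, transactions):
    previous_hash = chain[-1]["hash"] if chain else "0" * 64
    index = len(chain)
    chain.append({
        "index": index,
        "previous_hash": previous_hash,
        "transactions": transactions,
        "hash": hash_block(index, previous_hash, transactions),
    })


def is_valid(chain):
    """Recompute every hash and check every link to the preceding block."""
    for i, block in enumerate(chain):
        expected_prev = chain[i - 1]["hash"] if i else "0" * 64
        if block["previous_hash"] != expected_prev:
            return False
        if block["hash"] != hash_block(block["index"], block["previous_hash"],
                                       block["transactions"]):
            return False
    return True


chain = []
add_block(chain, ["voter-1 -> A"])
add_block(chain, ["voter-2 -> B"])
print(is_valid(chain))                         # True
chain[0]["transactions"][0] = "voter-1 -> B"   # tamper with an old block
print(is_valid(chain))                         # False
```

Because each hash covers the preceding hash, changing any earlier transaction invalidates the block that holds it, which is how the nodes in step 7 can detect modification.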
29.3.2.2 Hash Function SHA 256
Blockchain is used mainly because it preserves the sequence of the transactions performed. Each block in the chain is linked to the preceding block. The first block in the chain is the base of the stack. Every newly created block is placed on top of the previous block, forming the stack-like structure known as a blockchain [45]. A hash function converts a message of variable length into an output of fixed length, and it is characterized by openness, one-wayness, collision resistance, and high sensitivity. Hash functions are used mainly to protect data integrity, that is, to check whether the information has been illicitly modified. When the data under examination is modified, its hash value changes accordingly. The integrity of the data can therefore be checked on the basis of its hash value even if the data is stored in an insecure environment. The National Institute of Standards and Technology published SHA as a family of cryptographic hash functions [46, 47]. SHA 256 is a member of the SHA-2 family and produces a 256-bit message digest. The computation is divided into two stages: message preprocessing and the main loop. In the preprocessing stage, a message of any length is padded with binary bits and with its length, and the padded message is then split into 512-bit message blocks. In the main loop stage, each message block is processed by a compression function. The output of the previous compression step is the input to the current one [48, 49]. The SHA 256 algorithm is used for password hashing, and it is also used to verify transactions in cryptocurrencies such as Bitcoin. Hash functions protect the integrity of blocks and transactions on the blockchain. The header of the current block contains the hash value of the preceding block, and anyone can compare the hash value they compute with the stored hash value [50]. In this way, the integrity of the previous block's data is verified. In addition, hash functions are used in generating public and private keys.
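The preprocessing stage described above can be illustrated as follows. The sketch follows the padding rule defined for SHA 256 in FIPS 180-4 (a single 1 bit, zero bits, then the 64-bit message length), but it covers only padding and splitting, not the compression function; the function names are hypothetical.

```python
def sha256_pad(message: bytes) -> bytes:
    """Pad a message so that its length is a multiple of 512 bits (64 bytes)."""
    bit_length = len(message) * 8
    padded = message + b"\x80"                   # a single 1 bit, then zeros
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    padded += bit_length.to_bytes(8, "big")      # 64-bit message length
    return padded


def split_blocks(padded: bytes):
    """Split the padded message into 512-bit (64-byte) blocks."""
    return [padded[i:i + 64] for i in range(0, len(padded), 64)]


blocks = split_blocks(sha256_pad(b"abc"))
print(len(blocks), len(blocks[0]) * 8)   # 1 block of 512 bits
```

Each 512-bit block would then be fed to the compression function in turn, with the output of one step serving as the input state of the next.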
29.4 Conclusion
This online voting system based on blockchain technology stores the voter's details, with which the voter can log in and exercise the right to vote. It keeps track of the votes cast for each party and the total number of voters for each party. A database maintained by the Election Commission of India holds the names of all voters together with their complete details. The system is built with a Web-based interface to encourage user participation and includes features such as face recognition to prevent fake and double voting. It provides a user-friendly interface for managing the voters, parties, and candidates. With digital polling, the number of voters casting their votes increases. In addition, blockchain reduces the cost of conducting elections. In the proposed voting system, no one can make changes without knowledge of the hash value. It improves performance and reduces the error rate.
References
1. Haenni, R., Koenig, R.E., Dubuis, E.: Cast-as-intended verification in electronic elections based on oblivious transfer. In: Krimmer, R., Volkamer, M., Barrat, J., Benaloh, J., Goodman, N., Ryan, P.Y.A., Teague, V. (eds.) Electronic Voting, pp. 73–91. Springer, Cham, Switzerland (2017)
2. Karayumak, F., Olembo, M.M., Kauer, M., Volkamer, M.: Usability analysis of Helios: an open source verifiable remote electronic voting system. In: Proceedings of the Electronic Voting Technology Workshop/Workshop on Trustworthy Elections (EVT/WOTE), p. 5 (2011)
3. Rajesh Kanna, P., Santhi, P.: Unified deep learning approach for efficient intrusion detection system using integrated spatial–temporal features. Knowl.-Based Syst. 226 (2021)
4. Qureshi, A., Megías, D., Rifà-Pous, H.: PSUM: peer-to-peer multimedia content distribution using collusion-resistant fingerprinting. J. Netw. Comput. Appl. 66, 180–197 (2016)
5. Kosba, A., Miller, A., Shi, E., Wen, Z., Papamanthou, C.: Hawk: the blockchain model of cryptography and privacy-preserving smart contracts. In: Proceedings of the IEEE Symposium on Security and Privacy (SP), May 2016, pp. 839–858
6. Wu, Y.: An E-voting system based on blockchain and ring signature. M.S. thesis, Dept. Comput. Sci., Univ. Birmingham, Birmingham (2017)
7. McCorry, P., Shahandashti, S.F., Hao, F.: A smart contract for boardroom voting with maximum voter privacy. In: Proceedings of the International Conference on Financial Cryptography and Data Security, pp. 357–375. Springer, Cham (2017)
8. Li, Y., Susilo, W., Yang, G., Yu, Y., Liu, D., Guizani, M.: A blockchain-based self-tallying voting scheme in decentralized IoT. arXiv:1902.03710 (2019)
9. Murugesan, M., Thilagamani, S.: Efficient anomaly detection in surveillance videos based on multi-layer perception recurrent neural network. J. Microprocessors Microsyst. 79 (2020)
10. Thilagamani, S., Shanti, N.: Gaussian and Gabor filter approach for object segmentation. J. Comput. Inf. Sci. Eng. 14(2), 021006 (2014)
11. Perumal, P., Suba, S.: An analysis of a secure communication for healthcare system using wearable devices based on elliptic curve cryptography. J. World Rev. Sci. Technol. Sustain. Dev. 18(1), 51–58 (2022)
12. Alvi, S.T., Uddin, M.N., Islam, L.: Digital voting: a blockchain-based e-voting system using biohash and smart contract. In: 2020 Third International Conference on Smart Systems and Inventive Technology (ICSSIT). IEEE, New York (2020)
13. Qureshi, A., Megías, D., Rifà-Pous, H.: SeVEP: secure and verifiable electronic polling system. IEEE Access 7, 19266–19290 (2019)
14. Gao, S., Zheng, D., Guo, R., Jing, C., Hu, C.: An anti-quantum E-voting protocol in blockchain with audit function. IEEE Access 7, 115304–115316 (2019)
15. Yavuz, E., Koç, A.K., Çabuk, U.C., Dalkiliç, G.: Towards secure E-voting using Ethereum blockchain. In: Proceedings of the 6th International Symposium on Digital Forensic and Security (ISDFS), March 2018, pp. 1–7
16. Hjálmarsson, F.Þ., Hreiðarsson, G.K., Hamdaqa, M., Hjálmtýsson, G.: Blockchain-based E-voting system. In: Proceedings of the IEEE 11th International Conference on Cloud Computing (CLOUD), July 2018, pp. 983–986
17. Chaieb, M., Yousfi, S., Lafourcade, P., Robbana, R.: Verify-your-vote: a verifiable blockchain-based online voting protocol. In: European, Mediterranean, and Middle Eastern Conference on Information Systems, pp. 16–30. Springer, Cham (2018)
18. Sharma, P.K., Park, J.H.: Blockchain based hybrid network architecture for the smart city. Future Gener. Comput. Syst. 86, 650–655 (2018)
19. Curran, K.: E-voting on the blockchain. J. Brit. Blockchain Assoc. 1(2), 1–6 (2018)
20. Tarasov, P., Tewari, H.: The future of E-voting. IADIS Int. J. Comput. Sci. Inf. Syst. 12(2), 19 (2017)
21. Ayed, A.B.: A conceptual secure blockchain-based electronic voting system. Int. J. Network Security Appl. 9(3), 01–09 (2017)
22. Sharma, P., Singh, S., Jeong, Y., Park, J.H.: DistBlockNet: a distributed blockchains-based secure SDN architecture for IoT networks. IEEE Commun. Mag. 55, 78–85 (2017)
23. Pawlak, M., Guziur, J., Poniszewska-Marańda, A.: Voting process with blockchain technology: auditable blockchain voting system. In: International Conference on Intelligent Networking and Collaborative Systems, pp. 233–244. Springer, Cham (2018)
24. Hardwick, F.S., Gioulis, A., Akram, R.N., Markantonakis, K.: E-voting with blockchain: an e-voting protocol with decentralization and voter privacy. In: 2018 IEEE International Conference on Internet of Things (iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber, Physical and Social Computing (CPSCom) and IEEE Smart Data (SmartData), pp. 1561–1567. IEEE, New York (2018)
25. Hanifatunnisa, R., Rahardjo, B.: Blockchain based e-voting recording system design. In: 2017 11th International Conference on Telecommunication Systems Services and Applications (TSSA), pp. 1–6. IEEE, New York (2017)
26. Pandiaraja, P., Aravinthan, K., Lakshmi, N.R., Kaaviya, K.S., Madumithra, K.: Efficient cloud storage using data partition and time based access control with secure AES encryption technique. Int. J. Adv. Sci. Technol. 29(7), 1698–1706 (2020)
27. Dagher, G.G., Marella, P.B., Milojkovic, M., Mohler, J.: BroncoVote: secure voting system using Ethereum's blockchain (2018)
28. Shukla, S., Thasmiya, A.N., Shashank, D.O., Mamatha, H.R.: Online voting application using Ethereum blockchain. In: 2018 International Conference on Advances in Computing, Communications and Informatics (ICACCI), pp. 873–880. IEEE, New York (2018)
29. Thilagamani, S., Nandhakumar, C.: Implementing green revolution for organic plant forming using KNN-classification technique. Int. J. Adv. Sci. Technol. 29(7S), 1707–1712 (2020)
30. Pradeep, D., Sundar, C.: QAOC: noval query analysis and ontology-based clustering for data management in Hadoop 108, 849–860 (2020)
31. Srikrishnaswetha, K., Kumar, S., Rashid Mahmood, Md: A study on smart electronics voting machine using face recognition and Aadhar verification with IOT. In: Innovations in Electronics and Communication Engineering, pp. 87–95. Springer, Singapore (2019)
32. Logeswaran, R., Aarthi, P., Dineshkumar, M., Lakshitha, G., Vikram, R.: Portable charger for handheld devices using radio frequency. Int. J. Innov. Technol. Explor. Eng. (IJITEE) 8(6), 837–839 (2019)
33. Santhi, P., Mahalakshmi, G.: Classification of magnetic resonance images using eight directions gray level co-occurrence matrix (8dglcm) based feature extraction. Int. J. Eng. Adv. Technol. 8(4), 839–846 (2019)
34. Teja, K., Shravani, M.B., Simha, C.Y., Kounte, M.R.: Secured voting through blockchain technology.
In: 2019 3rd International Conference on Trends in Electronics and Informatics (ICOEI), pp. 1416–1419. IEEE, New York (2019) 35. Salman, T., Zolanvari, M., Erbad, A., Jain, R., Samaka, M.: Security services using blockchains: a state of the art survey. IEEE Commun. Surv. Tutorials 21(1), 858–880 (2018) 36. Acemyan, C.Z., Kortum, P., Byrne, M.D., Wallach, D.S.: Usability of voter verifiable, end-toend voting systems: baseline data for Helios, Prêt à voter, and scantegrity II. In: Proceedings of the Electronic Voting Technology Workshop/Workshop Trustworthy Electronics (EVT/WOTE), 2014, pp. 26–56 37. Zhao, H., Bai, P., Peng, Y., Xu, R.: Efficient key management scheme for health blockchain. CAAI Trans. Intell. Technol. 3(2), 114–118 (2018) 38. Pandiaraja, P., Sharmila, S.: Optimal routing path for heterogenous vehicular adhoc network. Int. J. Adv. Sci. Technol. 29(7), 1762–1771 (2020) 39. Takabatake, Y., Kotani, D., Okabe, Y.: An anonymous distributed electronic voting system using Zerocoin. IEICE Technical Report, pp. 127–131 (2016) 40. Çabuk, U.C., Senocak, ¸ T., Demir, E., Çavdar, A.: A proposal on initial remote user enrollment for IVR-based voice authentication systems. Int. J. Adv. Res. Comput. Commun. Eng. 6, 118–123 (2017, July) 41. Li, J., Liang, G., Liu, T.: A novel multi-link integrated factor algorithm considering node trust degree for blockchain-based communication. KSII Trans. Internet Inform. Syst. (2017) 42. Ta¸s, R., Tanrıöver, Ö.Ö.: A systematic review of challenges and opportunities of blockchain for E-voting. Symmetry 12(8), 1328 (2020) 43. Garg, K., Saraswat, P., Bisht, S., Kr Aggarwal, S., Kothuri, S.K., Gupta, S.: A comparative analysis on e-voting system using blockchain. In: 2019 4th International Conference on Internet of Things: Smart Innovation and Usages (IoT-SIU), pp. 1–4. IEEE, New York (2019) 44. Deepa, K., Thilagamani, S.: Segmentation techniques for overlapped latent fingerprint matching. Int. J. Innov. Technol. Explor. Eng. 
8(12), 1849–1852 (2019) 45. Zhu, L., Wu, Y., Gai, K., Raymond Choo, K.-K.: Controllable and trustworthy blockchain-based cloud data management. Future Gener. Comput. Syst. 91, 527–535 (2019)
29 A Survey on Privacy Preserving Voting Scheme …
283
46. Alam, A., Zia Ur Rashid, S.M., Abdus Salam, Md, Islam, A.: Towards blockchain-based evoting system. In: 2018 International Conference on Innovations in Science, Engineering and Technology (ICISET), pp. 351–354. IEEE, New York (2018) 47. Choo, K.-K., Ozcan, S., Dehghantanha, A., Parizi, R.M.: Blockchain ecosystem—technological and management opportunities and challenges. IEEE Trans. Eng. Manage. 67(4), 982–987 (2020) 48. Gourisetti, S.N.G., Mylrea, M., Patangia, H.: Evaluation and demonstration of blockchain applicability framework. IEEE Trans. Eng. Manage. 67(4), 1142–1156 (2020) 49. Khoury, D., Kfoury, E.F., Kassem, A., Harb, H.: Decentralized voting platform based on ethereum blockchain. In: 2018 IEEE International Multidisciplinary Conference on Engineering Technology (IMCET), pp. 1–6. IEEE, New York (2018) 50. Zhang, W., Yuan, Y.H., Huang, S., Cao, S., Chopra, A., Huang, S.: A privacy-preserving voting protocol on blockchain. In: 2018 IEEE 11th International Conference on Cloud Computing, pp. 401–408. IEEE, New York (2018)
Chapter 30
A Survey on Detecting and Preventing Hateful Comments on Social Media Using Deep Learning
I. Karthika, G. Boomika, R. Nisha, M. Shalini, and S. P. Srivarshini
Abstract Freedom of expression is a right that belongs to everybody. However, under the guise of free speech, this privilege is being abused to discriminate against and harm others, either physically or verbally. Hate speech is the term for this kind of bigotry. Hate speech is defined as language used to express contempt toward an individual or a group of people based on characteristics such as religion, ethnicity, gender, race, disability, and sexual orientation. It can take the form of speech, writing, gestures, or displays that target someone because of their affiliation with a particular group. Hate speech has become more prevalent in recent years, both in person and online. Hateful content is created and shared on social media and other online sites, which can ultimately lead to hate crime. The growing use of social media platforms and information exchange has brought significant benefits for humanity. However, it has also resulted in a number of problems, including the spread of hate speech messages. To address this growing problem on social networking forums, recent studies have used a range of machine learning and deep learning methods together with text mining techniques to automatically detect hate speech messages in real-time datasets. Hence, the aim of this paper is to survey the various algorithms for detecting hateful comments and to identify the best-performing algorithms on social media datasets.
Keywords Classification · Neural networks · Comments analysis · Machine learning · Social network · Natural language processing · Back propagation
I. Karthika (B) · G. Boomika · R. Nisha · M. Shalini · S. P. Srivarshini
Department of Computer Science and Engineering, M.Kumarasamy College of Engineering, Karur, Tamil Nadu 639113, India
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_30
30.1 Introduction
Social media is a popular and, most importantly, simple means for people to openly share their thoughts and opinions while also connecting with others online. It has become a vital part of human life [1]. It is also a place where people are easily harassed or abused by others, who express hatred in various forms such as sexism, racism, politics, etc. [2, 3]. The use of these social media platforms for cyberbullying, online harassment, and blackmail is also on the rise [4]. Social networking sites (SNS) have made it easier for us to connect with the various communities or organizations that we are interested in [5]. These sites have reached a significant number of people in society because of the development of capabilities such as high-speed Internet and handheld devices [6, 7]. A large share of the administrators of these communities are younger than thirty. Researchers have taken advantage of the vast amounts of data available on various social networking sites and undertaken extensive research in a variety of fields [8]. Sentiment analysis is a well-known field of study that uses a great deal of data from social media. The various types of social media are shown in Fig. 30.1.
Fig. 30.1 Social media types
30.2 Related Works
Fortuna and Nunes [9] analyzed the challenges of defining hate speech, which is described differently across platforms and contexts, and proposed a unified definition. This area has clear potential for societal impact, especially in online communities and digital media platforms. Progress in automated hate speech detection requires the development and systematization of shared resources, such as guidelines, annotated datasets in multiple languages, and algorithms. Hate speech is language that attacks or diminishes, or that incites violence or hate against groups, on the basis of specific characteristics such as physical appearance, religion, descent, national or ethnic origin, sexual orientation, gender identity, or other attributes, and it can occur in a variety of linguistic styles, even in subtle forms or when humor is used [10].
Al-Makhadmeh and Tolba [11] examined 1500 samples to check whether combining machine learning approaches with NLP was beneficial. This proposed approach
assists the system in analyzing multiple social networking sites. Moreover, the technique was found to be more accurate in detecting hate speech and more time-efficient than the conventional method. This is because the killer natural language processing optimization ensemble deep learning approach (KNLPEDNN) was used to analyze Twitter responses and predict hate and non-hate posts with high accuracy. The proposed method used a large number of tweets as data in a self-learning system; it also classified comments based on analysis of past data, which effectively reduced the misclassification rate.
Cao et al. [12] proposed a single deep learning model called DeepHate, which uses multi-faceted text representations for automated hate speech detection in social media. They conducted extensive experiments on three real-world, publicly available datasets. The results indicate that the proposed framework performs well at hate speech detection. They then conducted empirical analyses of the DeepHate model and provided insights into the salient features that helped in detecting hate speech in social media. The salient feature analysis also improves the explainability of the proposed model.
Waseem and Hovy [13] provided a dataset of 16k tweets annotated for hate speech. They examined which features offer the best identification performance. They also analyzed the features that improve the detection of offensive words in their corpus and found that, despite presumed differences in the geographic and word-length distributions, these have little to no positive effect on overall performance and rarely improve over character-level features. The exception to this rule is gender.
Furthermore, they offered a list of criteria based on critical race theory to identify racist and sexist slurs. These can be used to collect more data and address the problem of a small but highly prolific set of hateful users. While the problem is far from solved, they found that a character n-gram-based approach provides a solid foundation. Demographic information, apart from gender, brings little improvement, but this could be due to the lack of coverage. They plan to improve location and gender classification to update future data and experiments.
Davidson et al. [14] classified tweets as hate speech, offensive language, or neither. They trained a model to distinguish these categories and then examined the results to better understand how they can be told apart. The findings suggest that fine-grained labels can support the detection of hate speech and highlight several significant obstacles to accurate classification. They conclude that future work should better account for context and heterogeneity in the use of hate speech. They also collected tweets containing hate speech keywords using a crowd-sourced hate speech lexicon and used crowd-sourcing to label a sample of these tweets into three categories: those that contain hate speech, those that contain only offensive language, and those that contain neither. To distinguish
between these different classes, they trained a multi-class classifier [1]. An analysis of the predictions and errors shows when hate speech can be reliably separated from other offensive language and when this distinction is more difficult. They found that racist and homophobic tweets are more likely to be labeled as hate speech, while sexist tweets are more likely to be labeled as offensive. It is also more difficult to classify tweets that do not contain explicit hate keywords.
Badjatiya et al. [15] proposed multiple machine learning classifiers. The feature spaces of these classifiers are in turn defined by task-specific embeddings learned using three deep learning architectures: CNN, FNN, and LSTM. As baselines, they examined typical feature spaces such as char n-grams, TF-IDF vectors, and bag-of-words vectors (BoWV). The complexity of natural language constructs makes this task very challenging. They performed extensive experiments with multiple deep learning architectures to learn semantic word embeddings that handle this complexity.
Ibrohim and Budi [16] built an Indonesian Twitter dataset for abusive language and hate speech detection, including detecting the target, category, and level of hate speech. This research discusses multi-label text classification for abusive language and hate speech detection in Indonesian Twitter using machine learning approaches with Naive Bayes (NB), support vector machine (SVM), and random forest decision tree (RFDT) classifiers, and binary relevance (BR), label power-set (LP), and classifier chains (CC) as data transformation methods. Term frequency, orthography, and lexicon features were among the feature extractions used [17].
The results of their experiments showed that, in general, the RFDT classifier with LP as the transformation method provides the best accuracy with a short computation time.
Alfina et al. [18] produced a new dataset for hate speech detection in Indonesian, which covers hate speech in general, including religious, ethnic, racial, and gender-based hatred. They also conducted a preliminary study to determine which combination of machine learning algorithms and features produced the best results. The paper thus addresses the detection of unwanted words in a language other than English. As far as the authors could tell, little research had been done on this subject. The most relevant research they found resulted in a dataset for religious hate speech, but the quality of that dataset was inadequate. The researchers therefore built a new dataset that covers hate speech in general, such as hatred of religion, race, ethnicity, and gender, and conducted a preliminary study using a machine learning approach. To date, machine learning has been the most widely used approach for text classification.
Ibrohim and Budi [19] developed another Twitter dataset for detecting abusive language in Indonesian. In addition, they presented experiments on abusive language detection that examine the abusive words and writing patterns found in Indonesian social media. On their dataset, NB outperforms SVM and RFDT in classifying abusive language in all cases. In terms of feature extraction, word unigrams and combinations of word n-grams outperform the other features across the NB, SVM, and RFDT classifiers.
The experimental results show that classifying tweets into not abusive language, abusive but not offensive language, and offensive language is more difficult than simply deciding whether a tweet is abusive or not. In this setting, the classifier had difficulty distinguishing whether a tweet was abusive but not offensive or offensive.
Salminen et al. [20] developed an online hate classifier in response to the growing demand for a scalable approach. The model works well for detecting hateful comments across multiple social media platforms, uses advanced linguistic features such as Bidirectional Encoder Representations from Transformers (BERT), and is made available to researchers and practitioners for further use and development. They experimented extensively with several classification algorithms and feature representations. While all of the models outperform the keyword-based baseline classifier, XGBoost with all features performs the best (F1 = 0.92). According to the feature importance analysis, BERT features have the greatest impact on the predictions. Since the platform-specific results from social networks are similar to those in their respective source papers, the findings suggest the generalizability of the best model. They also make the code freely available for use in real software systems as well as for further development by online hate researchers. The comparative study is shown in Table 30.1.
30.3 Existing Methodologies
Over the last decade, there has been a substantial increase in research on text classification in social media. Detecting and stopping the use of various kinds of abusive language in blogs [21, 22], microblogs, and social networks is an especially useful part of this work. In this study, we look at how to detect hate speech on social media while distinguishing it from general profanity [23]. We plan to employ supervised classification algorithms and a recently released dataset annotated for this purpose to establish lexical baselines [24]. The necessary steps for hate speech detection are shown in Fig. 30.2. Most social media platforms have implemented their own policies to restrict hate speech [25]; however, enforcing these policies requires extensive manual labor to review each report. Some platforms, such as Facebook, have recently increased the number of content moderators. Automated tools and methods may be used to speed up the reviewing process or to allocate human resources to the posts that demand close human examination [26]. In this section, we look at how automated hate speech detection from text works.
Table 30.1 Table of related works
S. No. | Title | Techniques | Findings
1 | A survey on automatic detection of hate speech in text | Keyword selection and recursive search techniques | Researchers often start by collecting and annotating new messages, and these datasets are frequently kept private. Because less data is available, this slows down the research process
2 | Automatic hate speech detection using killer natural language processing optimizing ensemble deep learning approach | Killer natural language processing optimization ensemble deep learning approach (KNLPEDNN) | The proposed automated system is reported to achieve the lowest loss and the highest prediction accuracy
3 | DeepHate: hate speech detection via multi-faceted text representations | Convolutional neural network and long short-term memory attention mechanism (C-LSTM-Att Encoder) | Used multi-faceted text representations for automated hate speech detection
4 | Hateful symbols or hateful people? Predictive features for hate speech detection on Twitter | Demographic distribution technique | Presented a list of criteria based on critical race theory to identify racist and sexist slurs
5 | Automated hate speech detection and the problem of offensive language | Term frequency and inverse document frequency (TF-IDF) and support vector machine algorithm (SVM) | Future work should find sources of hateful training data without relying on specific keywords or offensive language
6 | Deep learning for hate speech detection in tweets | Long short-term memory networks (LSTMs) | Investigated the use of neural network architectures for hate speech detection
7 | Multi-label hate speech and abusive language detection in Indonesian Twitter | Support vector machine (SVM), Naive Bayes (NB), and random forest decision tree (RFDT) classifiers, with binary relevance (BR), label power-set (LP), and classifier chains (CC) | Using annotation guidelines and gold standard annotations, created a dataset for identifying abusive language and hate speech (including targets, categories, and levels of hate speech)
8 | Hate speech detection in the Indonesian language: a dataset and preliminary study | Support vector machine (SVM), Naive Bayes (NB) | Created a new dataset of Indonesian-language tweets for hate speech detection and conducted a preliminary study comparing the performance of several features and machine learning algorithms
9 | A dataset and preliminaries study for abusive language detection in Indonesian social media | Naive Bayes, support vector machine, and random forest decision tree classifier | Examined abusive words and writing patterns in Indonesian social media as one of the problems in detecting abusive language there
10 | Developing an online hate classifier for multiple social media platforms | Logistic regression (LR), XGBoost (extreme gradient boosted decision trees), feed-forward neural network (FFNN) | Experimented with several machine learning models for online hate detection and found that XGBoost performed best as a classifier of hateful social media comments
30.3.1 Keyword-Based Approaches
A basic approach to identifying hate speech is a keyword-based approach [27]. Using an ontology or dictionary, text that contains potentially hateful keywords is identified. For example, Hatebase maintains a database of derogatory terms for many groups across ninety-five languages. Such well-maintained resources are valuable, as terminology changes over time [28, 29]. However, as we observed in our study of the definitions of hate speech, simply using a hateful slur is not necessarily enough to constitute hate speech.
Keyword-based approaches are fast and easy to understand. However, they have severe limitations. Identifying only racial slurs would yield a highly precise system but one with low recall, where precision is the proportion of relevant items among those detected and recall is the proportion of relevant items detected out of the whole population [30, 31]. In other words, a system that relies mainly on keywords would not identify hateful content that does not use these terms [4]. Conversely, including terms that could be but are not always hateful (e.g., "swine", "trash", and others) would create too many false alarms, increasing recall at the expense of precision [32].
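The precision and recall trade-off described above can be made concrete with a small sketch. The lexicon, the placeholder terms, and the four labeled samples below are all hypothetical and chosen only for illustration; a real system would load a maintained resource instead of a hard-coded set.

```python
import re

# Hypothetical lexicon for illustration only; "slur_a" and "slur_b" are
# placeholders, and "trash" stands for an ambiguous term.
LEXICON = {"slur_a", "slur_b", "trash"}


def flag(text, lexicon=LEXICON):
    """Return True if any lexicon term appears as a whole token in text."""
    tokens = re.findall(r"[a-z_']+", text.lower())
    return any(tok in lexicon for tok in tokens)


def precision_recall(predicted, actual):
    """Compute precision and recall from parallel lists of booleans."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy labeled examples: (text, is_hateful).
SAMPLES = [
    ("you are a slur_a", True),
    ("take out the trash tonight", False),        # ambiguous term, benign use
    ("people like you should disappear", True),   # hateful, but no keyword
    ("nice weather today", False),
]

predicted = [flag(text) for text, _ in SAMPLES]
actual = [label for _, label in SAMPLES]
precision, recall = precision_recall(predicted, actual)
```

On this toy data the ambiguous term produces one false alarm and the keyword-free post is missed, so both precision and recall come out at 0.5. Removing "trash" from the lexicon would raise precision without helping recall.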
Fig. 30.2 Steps for detecting hate comments
30.3.2 Machine Learning Categorization
Machine learning models take samples of labeled text to produce a classifier that can detect hate speech based on labels annotated by content reviewers [4]. Several models have been proposed and shown to be effective in the past. In this paper, we discuss an extension of the open-source systems used in recent studies [33].
(i) Content preprocessing and feature selection. To identify or classify user-generated content, text features that indicate hate must be extracted. Obvious features are individual words or phrases (n-grams, i.e., sequences of n consecutive words). Words can be stemmed to improve feature matching by removing morphological differences from the root [34]. Metaphor processing can extract features as well. In text classification, the bag-of-words assumption is frequently used. Under this assumption, a post is represented as a set of words or n-grams without any ordering. This assumption clearly ignores an important aspect of language [35], but it has proven useful in a variety of situations. There are several approaches for assigning weights to the terms that are more important in this setting, including TF-IDF [36], a standard weighting scheme in information retrieval. Besides distributional features, word embeddings, i.e., assigning a vector to a word, such as word2vec [37], are common when applying deep learning methods in natural language processing and text mining. Several deep learning architectures, such as recurrent and transformer neural networks, challenge the bag-of-words assumption by modeling the order of the words, processing a sequence of word embeddings [38].
(ii) Hate speech detection approaches and baselines. Text classification can be done with multiple machine learning algorithms such as Naive Bayes, support vector machines (SVMs), and logistic regression [39]. Naive Bayes models estimate class probabilities directly, under the assumption that the features do not interact with one another. SVMs and logistic regression are linear classifiers that predict classes based on a combination of scores for each feature [40]. The existing methods are shown in Fig. 30.3.
Fig. 30.3 Existing methods for Hate speech detection
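The feature weighting in (i) and the Naive Bayes baseline in (ii) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the setup of any surveyed paper: the helper names `ngrams` and `tfidf`, the `NaiveBayes` class, the plain `tf * log(N / df)` weighting, the add-one smoothing, and the four training sentences are all choices made here for the example.

```python
import math
from collections import Counter


def ngrams(text, n_max=2):
    """Lowercase, split on whitespace, and return all 1..n_max word n-grams."""
    words = text.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        grams += [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]
    return grams


def tfidf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    bags = [Counter(ngrams(d)) for d in docs]
    df = Counter(term for bag in bags for term in bag)  # document frequency
    weights = []
    for bag in bags:
        length = sum(bag.values())
        weights.append({t: (c / length) * math.log(len(docs) / df[t])
                        for t, c in bag.items()})
    return weights


class NaiveBayes:
    """Multinomial Naive Bayes over n-gram counts with add-one smoothing."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.prior = {c: labels.count(c) / len(labels) for c in self.classes}
        self.counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.counts[label].update(ngrams(doc))
        self.vocab = {t for c in self.classes for t in self.counts[c]}
        return self

    def predict(self, doc):
        def log_score(c):
            total = sum(self.counts[c].values()) + len(self.vocab)
            # Features are treated as independent given the class.
            return math.log(self.prior[c]) + sum(
                math.log((self.counts[c][t] + 1) / total)
                for t in ngrams(doc) if t in self.vocab
            )
        return max(self.classes, key=log_score)


# Toy training data (assumed for illustration).
TRAIN = [
    ("i hate you people", "hate"),
    ("you people are awful", "hate"),
    ("i love this song", "normal"),
    ("what a lovely day", "normal"),
]
texts = [t for t, _ in TRAIN]
labels = [l for _, l in TRAIN]

weights = tfidf(texts)
model = NaiveBayes().fit(texts, labels)
```

In the first document, "hate" occurs in only one of the four texts and therefore receives a higher TF-IDF weight than "you", which occurs in two. The classifier uses raw n-gram counts rather than the TF-IDF weights to keep the sketch short; linear models such as SVMs or logistic regression would typically consume the weighted vectors instead.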
30.4 Proposed Methodologies
Social networking platforms are among the best ways to meet new people. As their popularity has grown, however, people have found illegal and unethical ways to use them. Expressions of hate and harassment are the most widespread and destructive misuses of online social media [9]. Violence, hostility, bullying, coercion, harassment, racism, insults, provocation, and sexism are all examples of
hate speech. These are among the most significant online risks to social networking forums [41]. To classify the data and determine whether the comments are hateful or normal, deep learning-based algorithms are applied.
• Feed-forward (FF) neural networks view text as a bag of words.
• RNN-based models view text as a sequence of words and are useful for capturing word dependencies and text structures [42].
• CNN-based models are trained to recognize patterns in text, such as key phrases, for text classification (TC).
• Capsule networks have recently been applied to TC to address the information loss problem caused by CNN pooling operations [43].
• The attention mechanism is effective at identifying correlated words in text, and it has become a useful tool in developing DL models.
• Memory-augmented networks combine neural networks with an external memory that the models can read from and write to.
• Graph neural networks are designed to capture the internal graph structures of natural language, such as syntactic and semantic parse trees [44].
Finally, we reviewed various machine learning and deep learning approaches to text classification on social media datasets. Figure 30.4 shows the proposed framework.
Fig. 30.4 Proposed work
The extraction and selection of a set of characterizing and discriminating features is the focus of most efforts in developing a robust deep learning classifier [45]. Text mining includes the following steps:
• Tokenize text-based reviews into single terms
• Analyze unigrams, bigrams, and n-grams
• Remove stop words, analyze blocking words, and discard special characters
• Extract key phrases
• Analyze extended (elongated) words that can be replaced with the correct words.
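The text mining steps listed above can be sketched as a small pipeline. Everything named here is an assumption for illustration: the stop-word list is a tiny hand-picked set, `squeeze_elongation` is a crude heuristic that would need a dictionary lookup to restore words such as "cool", and `key_phrases` simply ranks n-grams by frequency.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not a standard resource).
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and"}


def squeeze_elongation(word):
    """Collapse runs of 3+ identical characters to one, e.g. 'sooooo' -> 'so'."""
    return re.sub(r"(.)\1{2,}", r"\1", word)


def preprocess(text):
    """Tokenize, drop special characters, fix elongation, remove stop words."""
    text = re.sub(r"[^a-z\s]", " ", text.lower())   # discard special characters
    tokens = [squeeze_elongation(t) for t in text.split()]
    return [t for t in tokens if t not in STOP_WORDS]


def word_ngrams(tokens, n):
    """Return the list of n-grams (as tuples) over a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def key_phrases(docs, n=2, top=3):
    """Return the `top` most frequent n-grams across all documents."""
    counts = Counter(g for d in docs for g in word_ngrams(preprocess(d), n))
    return [gram for gram, _ in counts.most_common(top)]


tokens = preprocess("You are sooooo STUPID!!! #loser")
bigrams = word_ngrams(tokens, 2)
```

Unigrams are the tokens themselves, and bigrams or longer n-grams come from `word_ngrams`. Frequency ranking is only one possible way to pick key phrases; the source does not specify which method is intended.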
A database of categorized terms is created, which is then used to check messages for inappropriate words. If a message contains any vulgar terms, it is passed to the blacklists, which filter those words out. Finally, as a result of the content-based filtering technique [46], a message free of obscene terms is posted on the user's wall. The suggested deep learning classifier is as follows:
Step 1: Start the neural network model.
Step 2: Construct three types of layers.
Step 3: Activate the layers.
Step 4: Specify the inputs and neurons.
Step 5: Construct key terms as positive and negative.
Step 6: Match with testing keywords.
Step 7: Label as "positive" or "negative".
The system uses blacklists to automatically reject undesired messages based on both the message content and the relationships and characteristics of the message author [47]. In addition to the set of features considered in the classification process, the system uses a different semantics for filtering rules (FRs) to better fit the considered domain and to help users in FR specification [48].
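The blacklist filtering and the keyword matching of Steps 5 to 7 can be sketched as follows. This is a hedged illustration only: it does not implement the neural network layers of Steps 1 to 4, the term database and blocked-author set are hypothetical placeholders, and reading "negative" as "contains an inappropriate term or comes from a blocked author" is an assumption, since the source does not define the two labels.

```python
import re

# Hypothetical database of categorized terms; placeholders stand in for real words.
TERM_DB = {"badword": "vulgar", "slur_x": "hate"}
# Hypothetical author blacklist.
BLOCKED_AUTHORS = {"troll42"}


def filter_message(author, text, term_db=TERM_DB, blocked=BLOCKED_AUTHORS):
    """Return (label, text_to_post); text_to_post is None when rejected."""
    if author in blocked:
        return "negative", None          # reject based on the author
    hits = 0

    def mask(match):
        nonlocal hits
        word = match.group(0)
        if word.lower() in term_db:      # match against the term database
            hits += 1
            return "*" * len(word)       # filter the word out of the message
        return word

    cleaned = re.sub(r"\w+", mask, text)
    return ("negative" if hits else "positive"), cleaned


label, wall_text = filter_message("alice", "You are a BadWord!")
```

Here the offending word is masked and the cleaned text is what would be posted on the user's wall. A learned classifier could replace the dictionary lookup while keeping the same interface.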
30.5 Conclusion
In this survey, we reviewed existing machine learning and deep learning approaches to hate speech detection. We conclude that deep learning models can be used to address a variety of problems. The widely used deep learning and machine learning approaches for text classification were explored and compared. We found that several forms of CNN perform well in sequential learning tasks and mitigate the vanishing and exploding gradient problems that standard text classification algorithms face when learning long-term dependencies. Furthermore, the performance of CNN models can be affected by hidden size and batch size. Our work currently addresses text; in the future, we plan to extend it to detect and prevent hate speech in images and videos.
References 1. Singh, V.K., Radford, M.L., Huang, Q., Furrer, S.: ‘They basically like destroyed the school one day’ on newer app features and cyberbullying in schools. In: Proceedings of the ACM Conference on Computer Supported Cooperative Work and Social Computing, pp. 1210–1216 (2017) 2. Jha, A., Mamidi, R.: When does a compliment become sexist? Analysis and classification of ambivalent sexism using Twitter data. In: Proceedings of the 2nd Workshop on NLP and Computational Social Science. Vancouver, BC, Canada, pp. 7–16 (2017) 3. Murugesan, M., Thilagamani, S.: Efficient anomaly detection in surveillance videos based on multi layer perception recurrent neural network. J. Microprocess. Microsyst. 79 (2020). 4. Haidar, B., Chamoun, M., Yamout, F.: Cyberbullying detection: a survey on multilingual techniques. In: Proceedings of the European Modeling Symposium (EMS), pp. 165–171 (2016) 5. Mozafari, M., Farahbakhsh, R., Crespi, N.: Hate speech detection and racial bias mitigation in social media based on BERT model. PLoS ONE 15(8), 1–26 (2020) 6. Deepa, K., Kokila, M., Nandhini, A., Pavethra, A., Umadevi, M.: Rainfall prediction using CNN. Int. J. Adv. Sci. Technol. 29(7S), 1623–1627 (2020) 7. Wiegand, M., Ruppenhofer, J., Kleinbauer, T.: Detection of abusive language: the problem of biased datasets. In: Proceedings of the HLT-NAACL. Minneapolis, MN, USA, pp. 602–608 (2019) 8. Thilagamani, S., Nandhakumar, C.: Implementing green revolution for organic plant forming using KNN-classification technique. Int. J. Adv. Sci. Technol. 29(7S), 1707–1712 (2020) 9. Fortuna, P., Nunes, S.: A survey on automatic detection of hate speech in text. ACM Comput. Surv. 51(4), 1–30 (2018) 10. MacAvaney, S., Yao, H.-R., Yang, E., Russell, K., Goharian, N., Frieder, O.: Hate speech detection: challenges and solutions. PLoS ONE 14(8), Art. no. e0221152 (2019) 11. 
Al-Makhadmeh, Z., Tolba, A.: Automatic hate speech detection using killer natural language processing optimizing ensemble deep learning approach. Computing 102(2):501–522 (2020) 12. Cao, R., Lee, R.K.-W., Hoang, T.A.: DeepHate: hate speech detection via multi-faceted text representations. In: Proceedings of the 12th ACM Conference Web Science. Southampton, U.K, pp. 11–20 (2020) 13. Waseem, Z., Hovy, D.: Hateful symbols or hateful people? Predictive features for hate speech detection on Twitter. In: Proceeding of the NAACL Student Research Workshop. San Diego, CA, USA, pp. 88–93 (2016) 14. Davidson, T., Warmsley, D., Macy, M., Weber, I.: Automated hate speech detection and the problem of offensive language. In: Proceeding of the ICWSM. Montreal, QC, Canada, pp. 15– 18 (2017) 15. Badjatiya, P., Gupta, S., Gupta, M., Varma, V.: Deep learning for hate speech detection in tweets. In: Proceedings of the 26th International Conference on World Wide Web Companion (WWW Companion). Perth, WA, Australia, pp 759–760 (2017) 16. Ibrohim, M.O., Budi, I.: Multi-label hate speech and abusive language detection in Indonesian Twitter. In: Proceedings of the 3rd Workshop on Abusive Language Online. Florence, Italy, pp. 46–57 (2019) 17. Soni, D., Singh, V.K.: See no evil, hear no evil: audio-visual-textual cyberbullying detection. Proc. ACM Hum. Comput. Interact. 2(CSCW), 1–26 (2018) 18. Alfina, I., Mulia, R., Fanany, M.I., Ekanata, Y.: Hate speech detection in the Indonesian language: a dataset and preliminary study. In: Proceeding of the International Conference on Advanced Computer Science and Information System (ICACSIS). Jakarta, Indonesia, pp. 233–238 (2017) 19. Ibrohim, M.O., Budi, I.: A dataset and preliminaries study for abusive language detection in Indonesian social media. Proc. Comput. Sci. 135, 222–229 (2018)
30 A Survey on Detecting and Preventing Hateful Comments …
297
20. Salminen, J., Hopf, M., Chowdhury, S.A., Jung, S.-G., Almerekhi, H., Jansen, B.J.: Developing an online hate classifier for multiple social media platforms. Hum. Cent. Comput. Inf. Sci. 10(1), 1–34 (2020) 21. Burnap, P., Williams, M.L.: Cyber hate speech on Twitter: an application of machine classification and statistical modeling for policy and decision making. Policy Internet 7(2), 223–242 (2015) 22. Rajesh Kanna, P., Santhi, P.: Hybrid intrusion detection using map reduce based black widow optimized convolutional long short-term memory neural networks. Expert Syst. Appl. 194 (2022) 23. Rajesh Kanna, P., Santhi, P.: Unified deep learning approach for efficient intrusion detection system using integrated spatial-temporal features. Knowl. Based Syst. 226 (2021) 24. Cer, D., Yang, Y., Kong, S., Hua, N., Limtiaco, N., John, R., Constant, N., Guajardo-Céspedes, M., Yuan, S., Tar, C., Sung, Y., Kurzweil, R.: Universal sentence encoder. In: Proceedings of the EMNLP, Brussels, Belgium, pp. 169–174 (2018) 25. Zhang, Z., Robinson, D., Tepper, J.: Detecting hate speech on Twitter using a convolutionGRU based deep neural network. In: Proceedings of the ESWC, Heraklion, Crete, Greece, pp. 745–760 (2018) 26. Chan, T.K.H., Cheung, C.M.K., Wong, R.Y.M.: Cyberbullying on social networking sites: the crime opportunity and affordance perspectives. J. Manage. Inf. Syst. 36(2), 574–609 (2019) 27. Duggan, M.: Online harassment 2017. Pew Research Center, Washington, DC, USA, Technical Report (2017) 28. Thilagamani, S., Shanti, N.: Gaussian and Gabor filter approach for object segmentation. J. Comput. Inf. Sci. Eng. 14(2), 021006 (2014) 29. Waseem, Z., Hovy, D.: Hateful symbols or hateful people? Predictive features for hate speech detection on Twitter. In: Proceedings of the NAACL Student Research Workshop, pp. 88–93 (2016) 30. Deepika, S., Pandiaraja, P.: Ensuring CIA triad for user data using collaborative filtering mechanism. 
In: 2013 International Conference on Information Communication and Embedded Systems (ICICES), pp. 925–928 (2013) 31. Raisi, E., Huang, B.: Cyberbullying detection with weakly supervised machine learning. In: Proceedings of the IEEE/ACM International Conference on Advances in Social Network Analysis Mining, pp. 409–416 (2017) 32. Perumal, P., Suba, S.: An analysis of a secure communication for healthcare system using wearable devices based on elliptic curve cryptography. J. World Rev. Sci. Technol. Sustain. Dev. 18(1), 51–58 (2022) 33. Gunasekar, M., Thilagamani, S.: Performance analysis of ensemble feature selection method under SVM and BMNB classifiers for sentiment analysis. Int. J. Sci. Technol. Res. 9(2), 1536– 1540 (2020) 34. Tarwani, N., Chorasia, U., Shukla, P.K.: Survey of cyberbullying detection on social media big-data. Int. J. Adv. Res. Comput. Sci. 8(5), 831–835 (2017) 35. Sun, C., Qiu, X., Xu, Y., Huang, X.: How to fine-tune Bert for text classification. In: Proceedings of the China National Conference Chinese Computational Linguistics. Springer, Kunming, China, pp. 194–206 (2019) 36. Pandiaraja, P., Sharmila, S.: Optimal routing path for heterogenous vehicular Ad-hoc network. Int. J. Adv. Sci. Technol. 29(7), 1762–1771 (2020) 37. Rosa, H., Pereira, N., Ribeiro, R., Ferreira, P.C., Carvalho, J.P., Oliveira, S., Coheur, L., Paulino, P., Veiga Simão, A.M., Trancoso, I.: Automatic cyberbullying detection: a systematic review. Comput. Hum. Behav. 93, 333–345 (2019) 38. Logeswaran, R., Aarthi, P., Dineshkumar, M., Lakshitha, G., Vikram, R.: Portable charger for handheld devices using radio frequency. Int. J. Innov. Technol. Explor. Eng. (IJITEE) 8(6), 837–839 (2019) 39. Dadvar, M., Eckert, K.: Cyberbullying detection in social networks using deep learning based models; a reproducibility study. Springer, New York, USA (2018) 40. Pradeep, D., Sundar, C.: QAOC: novel query analysis and ontology-based clustering for data management in Hadoop 108, 849–860 (2020)
298
I. Karthika et al.
41. Al-garadi, M.A., Hussain, M.R., Khan, N., Murtaza, G., Nweke, H.F., Ali, I., Mujtaba, G., Chiroma, H., Khattak, H.A., Gani, A.: Predicting cyberbullying on social media in the big data era using machine learning algorithms: review of literature and open challenges. IEEE Access 7, 70701–70718 (2019) 42. Pandiaraja, P., Aravinthan, K., Lakshmi, N.R., Kaaviya, K.S., Madumithra, K.: Efficient cloud storage using data partition and time based access control with secure AES encryption technique. Int. J. Adv. Sci. Technol. 29(7), 1698–1706 (2020) 43. Vidgen, B., Derczynski, L.: Directions in abusive language training data, a systematic review: garbage in, garbage out. PLoS ONE 15(12), Art. no. e0243300 (2020) 44. Deepa, K., Thilagamani, S.: Segmentation techniques for overlapped latent fingerprint matching. Int. J. Innov. Technol. Explor. Eng. 8(12), 1849–1852 (2019) 45. Waseem, Z., Thorne, J., Bingel, J.: Bridging the gaps: multi task learning for domain transfer of hate speech detection, pp. 29–35. Springer, Cham, Switzerland (2018) 46. Santhi, P., Mahalakshmi, G.: Classification of magnetic resonance images using eight directions gray level co-occurrence matrix (8dglcm) based feature extraction. Int. J. Eng. Adv. Technol. 8(4), 839–846 (2019) 47. Karthika, I., Priyadharshini, S.: Survey on location based sentiment analysis of twitter data. IJEDR 5(1). ISSN: 2321-9939 (2017) 48. Ashfaq, R.A.R., Wang, X.-Z., Huang, J.Z., Abbas, H., He, Y.-L.: Fuzziness based semisupervised learning approach for intrusion detection system. Inf. Sci. 378, 484–497 (2017)
Chapter 31
Anomaly Detection for Bank Security Against Theft—A Survey G. Pavithra, L. Pavithra, B. Preethi, J. R. Sujasre, and R. Vanniyammai
Abstract Abnormal event detection is one of the most important objectives in research and in practical applications of video intelligence. To improve public safety, surveillance cameras are increasingly being used in public spaces such as streets, junctions, banks, and shopping centers. One of the most critical capabilities of a video system is recognizing unusual activity, for instance car accidents, crimes, or other illegal activities. Anomalous events generally occur only rarely compared with normal activities. The goal of an anomaly detection system is to signal, in a timely manner, an activity that deviates from normal patterns and to identify the time window in which the anomaly occurs. Once an abnormal activity has been identified, classification techniques are used to assign it to a distinct activity class. This paper provides an overview of anomaly detection, with an emphasis on approaches for detecting unusual actions in banking operations. Banking operations include daily, weekly, and aperiodic activities and transactions performed by or affecting a variety of stakeholders such as employees, clients, and external entities. Events may unfold over time, and early detection can considerably mitigate potential negative consequences and in some cases prevent them altogether. Anomaly detection based on time series is used to recognize individuals who are present during unwanted time periods. In this work, an AI-based anomaly detection strategy is applied to distinguish normal from abnormal events.
Keywords Video surveillance · Face recognition · HAAR cascade algorithm · Alert notification · Gaussian mixture model
G. Pavithra (B) · L. Pavithra · B. Preethi · J. R. Sujasre · R. Vanniyammai Department of Computer Science and Engineering, M.Kumarasamy College of Engineering, Karur, Tamilnadu 639 113, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_31
31.1 Introduction
There is a need for smart surveillance systems for intelligent monitoring that capture data in real time, transmit, process, and interpret information about those being monitored, and detect unauthorized persons using surveillance cameras [1]. Video data is frequently used as a forensic tool for post-crime investigation [2]. Such systems therefore help ensure a high level of security in public places, which is usually an extremely complex challenge. They also need little time for detection and provide alerts. Camcorders and good-quality equipment are available at reasonable cost on the market, so video surveillance systems have become more popular and more actively used [3]. Video surveillance is a straightforward way to keep track of human actions, which are rarely static. In general, a video signal is any sequence of time-varying images. A still image is a spatial distribution of intensities that stays constant over time, while a time-varying image has a spatial intensity distribution that changes with time. A video signal is treated as a series of images called frames. The impression of continuous video is obtained by changing the frames quickly, and the speed of change is usually referred to as the frame rate. The need for digital video is increasing in fields such as video teleconferencing, multimedia authoring, education systems, and video-on-demand systems [4, 5]. A video surveillance system analyses the area in which it is installed in real time, monitors suspicious movements, and generates alarms when they occur. Hence, the system secures the area with high efficiency.
31.2 Related Work
Hwang et al. [6] proposed an effective anomalous traffic detection mechanism, D-PACK, which combines a Convolutional Neural Network (CNN) with an unsupervised deep learning model to automatically profile traffic patterns and filter out abnormal traffic [7]. Notably, D-PACK inspects only the first few bytes of the first few packets in each flow to enable early detection. The design can inspire emerging efforts towards online anomaly detection systems that reduce the volume of processed packets and block malicious flows in time.
Park et al. [8] presented an unsupervised learning approach to anomaly detection that explicitly considers the diversity of normal patterns while reducing the representation capacity of CNNs [3]. The method represents video frames using prototypical features stored in memory items, which limits the capacity of the CNN [9, 10]. To reduce the intra-class variation of the features, the method introduces a feature compactness loss, which maps the features of a normal video frame to the nearest memory item and encourages them to be close.
Wu et al. [11] proposed a deep one-class neural network, named DeepOC, that simultaneously learns compact feature representations and trains a one-class classifier. Only the given normal samples are used to build low-dimensional high-level features, which are then made compact by a one-class classifier. Meanwhile, a decoder reconstructs the raw samples from these low-dimensional feature representations to preserve an accurate mapping and the diversity of the feature representations [12]. The structure is built up gradually with an adversarial mechanism during the training stage.
Tang et al. [13] proposed an approach that combines the advantages and balances the disadvantages of prediction-based and reconstruction-based methods [14]. An end-to-end network is designed to perform future frame prediction and reconstruction sequentially. Frame prediction shapes the reconstruction errors in a way that facilitates the identification of abnormal events, while reconstruction helps to improve the prediction of future frames from normal events. The latter component relies on pre-generated blocks that delimit the outline of the object showing suspicious movement.
Chaker et al. [15] proposed an unsupervised approach for crowd scene anomaly detection and localization using a social network model. In a window-based approach, a video is first partitioned at the spatial and temporal levels, and a set of spatio-temporal cuboids is constructed. Objects exhibiting scene dynamics are detected. A global social network (GSN) is built for the current window to represent the global behavior of the scene from local social networks. In terms of anomaly detection accuracy, the reported results outperform most, if not all, of the state-of-the-art techniques of the time.
Xu et al. [16] proposed the Appearance and Motion DeepNet (AMDN), which learns feature representations using deep neural networks. To exploit the complementary information of appearance and motion patterns, they introduced a novel double fusion architecture that combines the benefits of early and late fusion strategies [17]. Autoencoders are used to learn appearance and motion features separately, as well as a joint representation (early fusion). Then, based on the learned features, multiple one-class SVM models are used to predict the anomaly score of each input. Finally, a novel late fusion strategy combines the computed scores to detect abnormal events.
Medel et al. [18] suggested a composite Conv-LSTM encoder-decoder to learn the regularity of videos from non-overlapping patches of frames within an input segment. The model uses a composite structure and examines the effect of “conditioning” on learning more meaningful representations. The best model is chosen based on reconstruction and prediction accuracy. The Conv-LSTM models are evaluated both qualitatively and quantitatively, and the model shows competitive results on anomaly detection datasets [19].
Zhou et al. [20] proposed a neural network for anomaly detection that deeply integrates feature learning, sparse representation, and dictionary learning in three joint neural processing blocks. Specifically, to learn better features, they designed a motion fusion block followed by a feature transfer block, which together eliminate noisy background, capture motion, and alleviate data deficiency [21]. Moreover, to address some disadvantages of existing sparse coding optimizers (e.g., non-adaptive updating) and to exploit the merits of neural networks (e.g., parallel computing), they designed a novel recurrent neural network that learns the sparse representation and the dictionary by proposing an adaptive iterative hard-thresholding algorithm (adaptive ISTA) and reformulating the adaptive ISTA as a new long short-term memory (LSTM).
Liu et al. [22] tackled the anomaly detection problem within a video prediction framework. By their account, this was the first work to use the difference between a predicted future frame and its ground truth to detect an abnormal event. To predict a higher-quality future frame for normal events, in addition to the commonly used appearance (spatial) constraints on intensity and gradient, they introduced a motion (temporal) constraint that requires the optical flow between predicted frames and ground truth frames to be consistent; they also describe this as the first work to bring a temporal constraint into the video prediction task. Such spatial and motion constraints facilitate future frame prediction for normal events and consequently help to identify abnormal events that do not conform to the expectation.
Li et al. [23] developed a multi-task deep convolutional network that simultaneously detects the presence of the target and the geometric properties of the target with respect to the region of interest. Second, a recurrent neuron layer is adopted for structured visual detection. The recurrent neurons can handle the spatial distribution of visible cues belonging to an object whose shape or structure is hard to define explicitly. Both networks were demonstrated on the practical task of detecting lane boundaries in traffic scenes. The multi-task Convolutional Neural Network (CNN) provides auxiliary geometric information to help the subsequent modeling of the given lane structures. The Recurrent Neural Network automatically detects lane boundaries, including areas containing no markings, without any explicit prior knowledge or secondary modeling.
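Several of the surveyed methods, for example the prediction-based approach of Liu et al. [22], score a frame by how far a predicted or reconstructed frame lies from the observed one. The following is a minimal sketch of that idea in plain Python. It assumes grayscale frames stored as nested lists, mean squared error as the distance, and a hypothetical threshold; none of these choices is taken from the cited papers.

```python
def mean_squared_error(predicted, observed):
    """Mean squared pixel difference between two equally sized grayscale frames."""
    total = 0.0
    count = 0
    for row_p, row_o in zip(predicted, observed):
        for p, o in zip(row_p, row_o):
            total += (p - o) ** 2
            count += 1
    return total / count


def flag_anomalies(predicted_frames, observed_frames, threshold):
    """Return indices of frames whose prediction error exceeds the threshold.

    `threshold` is a hypothetical tuning value; in practice it would be
    chosen on validation data that contains only normal events.
    """
    flagged = []
    for index, (pred, obs) in enumerate(zip(predicted_frames, observed_frames)):
        if mean_squared_error(pred, obs) > threshold:
            flagged.append(index)
    return flagged


# Toy example: the model predicts an empty scene for both frames,
# but the second observed frame contains an unexpected bright object.
predicted = [[[0, 0], [0, 0]], [[0, 0], [0, 0]]]
observed = [[[0, 1], [0, 0]], [[0, 200], [0, 180]]]
anomalous_frames = flag_anomalies(predicted, observed, threshold=100.0)
```

The surveyed papers use learned predictors and more elaborate scores; the sketch only illustrates the common principle that normal frames are predicted well and abnormal frames are not.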
31.3 Video Surveillance System
Video surveillance is one of the fundamental topics in image processing. It began with simple CCTV systems used to capture information and to monitor people, events, and activities. Existing digital video surveillance systems basically provide the infrastructure for capturing, storing, and distributing video, leaving threat detection to human operators. Human monitoring of surveillance video is a labor-intensive task, and detecting multiple events in real-time video is very difficult to do manually [24]. This led to intelligent video surveillance systems, in which analytics software processes video stream images to automatically detect objects (people, equipment, vehicles) and events of interest for security purposes. Observing or analysing a particular site for security or business purposes is known as video surveillance. The deployment of video surveillance cameras is motivated by security and crime management concerns. Video surveillance cameras are used in shopping centers, public places, banking institutions, companies, home security, and ATMs. Research on networked surveillance is currently growing steadily [25], driven by security incidents occurring all over the world. Hence, there is a need for smart surveillance systems for intelligent monitoring that capture data in real time and transmit, process, and analyse the information relevant to those being monitored. The video data can be used as a forensic tool for post-crime investigation [26, 27]. Such systems therefore help ensure a high level of security in public places, which is generally an extremely complex task. As video cameras are available at reasonable cost on the market, video surveillance systems have become more affordable [28]. Video surveillance systems have a variety of applications, such as traffic monitoring and human activity tracking. A video surveillance system analyses behavior in the monitored area in real time and makes the events available for generating real-time alerts and for content-based searching [29].
Multi-view face recognition, in the strict sense, refers to situations in which multiple cameras acquire the subject (or scene) simultaneously and an algorithm jointly uses the acquired images or videos to reach a decision. However, the term has frequently been used for recognizing faces across pose variations [30]. Both cases, multi-view face recognition and recognition from a single camera, depend on the video data gathered, but they differ in what that data contains. While a multi-camera system guarantees the acquisition of multi-view data at any moment, the chance of obtaining equivalent data with a single camera is unpredictable. Non-cooperative recognition applications, such as surveillance, have to cope with such differences. Most current multi-view video face recognition algorithms exploit single-view videos [31]. Given a pair of face images to verify, they look up a collection of reference faces to “align” the appearance of the face in one image to the same pose and illumination as the other image. This approach may also require the poses and lighting conditions to be estimated for both face images. This “generic reference set” idea has also been used to develop holistic matching algorithms, in which the ranking of look-up results forms the basis of the matching measure. There are also works that handle pose variations implicitly, without estimating the pose explicitly.
31.3.1 Face Recognition Approaches
A great deal of research has addressed the automatic detection of anomalous activity and the prevention of situations such as bank robbery. Computer vision techniques are used in the proposed methodology to recognize normal and abnormal human behavior [26]. The system assumes a setting in which objects move relative to a fixed background, and each video frame is processed as follows. First, foreground extraction is used to obtain a clear view of people [32]. The foreground objects are extracted from a video sequence using background subtraction. Banks generally operate in closed premises where the background does not change over time, so the system uses a frame with no moving objects as the background frame. After extracting the foreground objects, the system performs motion detection. Motion templates are a quick and easy approach to recording motion and recognizing gestures [33, 34]. In a motion template, pixel intensity is a linear ramp as a function of time, with brighter (whiter) values representing more recent motion. When an object moves, it leaves a history of its motion behind. The system also detects the face within the moving region, which helps to identify the thief [35, 36]. Motion and face detection are performed during unwanted time periods, and the system sends automatic alerts to authorized persons. Fig. 31.1 outlines the proposed work.
Fig. 31.1 Proposed work
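The background subtraction and motion template steps described above can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: frames are small grayscale nested lists, the threshold and duration values are hypothetical, and the update rule follows the usual motion history image definition (moving pixels receive the current timestamp, stale stamps are cleared).

```python
def foreground_mask(frame, background, threshold=25):
    """Mark pixels that differ from the background frame by more than `threshold`."""
    return [
        [1 if abs(pixel - bg_pixel) > threshold else 0
         for pixel, bg_pixel in zip(row, bg_row)]
        for row, bg_row in zip(frame, background)
    ]


def update_motion_history(history, mask, timestamp, duration):
    """Update a motion history image.

    Moving pixels are stamped with the current timestamp; stamps older
    than `duration` are cleared, so recent motion has the largest values.
    """
    updated = []
    for hist_row, mask_row in zip(history, mask):
        new_row = []
        for stamp, moving in zip(hist_row, mask_row):
            if moving:
                new_row.append(timestamp)
            elif stamp < timestamp - duration:
                new_row.append(0)
            else:
                new_row.append(stamp)
        updated.append(new_row)
    return updated


# A bright object moves left to right across a one-row scene.
background = [[10, 10, 10, 10]]
frames = [
    [[200, 10, 10, 10]],
    [[10, 200, 10, 10]],
    [[10, 10, 200, 10]],
    [[10, 10, 10, 200]],
]
history = [[0, 0, 0, 0]]
for timestamp, frame in enumerate(frames, start=1):
    mask = foreground_mask(frame, background)
    history = update_motion_history(history, mask, timestamp, duration=2)
```

After the last frame the history holds larger values where motion was more recent, and the oldest position has already been cleared, which is the "brighter means later" ramp the text describes.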
31.3.1.1 Video Capturing Framework
This module performs theft detection and tracking based on surveillance cameras. Without the use of sensors, image processing is used to detect theft and the movement of burglars in surveillance camera footage. The system focuses on detecting objects. Through real-time analysis of human movement in the surveillance footage, security personnel can be warned about an individual suspected of burglary, which gives them an opportunity to prevent it [37].
31.3.1.2 Set Time Based Storage
Before the camera starts capturing activity, the administrator sets the time period in which any activity is treated as abnormal (the unwanted time period) [36]. This module takes as input the human detections from the surveillance camera. When a person enters the monitored area, the system checks the timer to measure the time. The system sends an alert email to the administrator when the predefined time limit for human detection is reached [38].
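The time-based rule above amounts to testing whether a detection falls inside a configured window. A minimal sketch, assuming a single daily window that may wrap past midnight; the function name and the 20:00 to 06:00 closing hours are hypothetical and not taken from the chapter:

```python
from datetime import datetime, time


def in_unwanted_period(moment, start, end):
    """Return True if `moment` falls inside the restricted window [start, end).

    The window may wrap past midnight, e.g. 20:00 to 06:00.
    """
    current = moment.time()
    if start <= end:
        return start <= current < end
    return current >= start or current < end


# Hypothetical setting: the bank is closed from 20:00 until 06:00.
CLOSED_FROM = time(20, 0)
CLOSED_UNTIL = time(6, 0)

night_detection = datetime(2022, 3, 14, 23, 30)
day_detection = datetime(2022, 3, 14, 11, 15)

night_is_abnormal = in_unwanted_period(night_detection, CLOSED_FROM, CLOSED_UNTIL)
day_is_abnormal = in_unwanted_period(day_detection, CLOSED_FROM, CLOSED_UNTIL)
```

A deployed system would probably need per-weekday schedules and holidays as well; the sketch covers only the single-window case.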
31.3.1.3 Movement Detection
The movement behavior of a person is observed by the system. If the system detects any movement in the image, it takes a snapshot of the observed image without human intervention and raises the alarm according to the user preferences. The first step is acquiring video frames from CCTV. These frames are used by the motion detection system [39, 40]. If motion is detected, the time stamp and the images with detected motion are saved. The capture time is then checked against the database to classify the activity as normal or abnormal [41], that is, the time of the movement is compared with the configured time limit.
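The acquire, detect, and save steps above can be sketched with simple frame differencing. This is a sketch under stated assumptions: frames are grayscale nested lists, and the thresholds and the event record layout are hypothetical; the chapter does not specify them.

```python
from datetime import datetime, timedelta


def changed_pixel_count(previous, current, threshold=30):
    """Count pixels whose intensity changed by more than `threshold` between two frames."""
    return sum(
        1
        for prev_row, cur_row in zip(previous, current)
        for p, c in zip(prev_row, cur_row)
        if abs(c - p) > threshold
    )


def detect_motion_events(frames, start_time, frame_interval, min_changed=2):
    """Compare consecutive frames and record a time-stamped event for each detected motion."""
    events = []
    for index in range(1, len(frames)):
        changed = changed_pixel_count(frames[index - 1], frames[index])
        if changed >= min_changed:
            events.append({
                "frame_index": index,
                "timestamp": start_time + index * frame_interval,
                "changed_pixels": changed,
                "snapshot": frames[index],  # image saved together with the event
            })
    return events


# Four frames, one second apart; something appears in the third frame and stays.
empty = [[10, 10], [10, 10]]
occupied = [[200, 200], [10, 10]]
events = detect_motion_events(
    frames=[empty, empty, occupied, occupied],
    start_time=datetime(2022, 3, 14, 22, 0, 0),
    frame_interval=timedelta(seconds=1),
)
```

Each saved event carries the time stamp that the next step compares with the configured time limit.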
31.3.1.4 Face Identification
The input is continuously captured video. The video is split into still images, and face detection is performed on them. Facial features are matched against the database using a Grassmann learning algorithm. The temporal information in video sequences enables the analysis of dynamic facial changes and their use as a biometric identifier for person recognition. There are several ways to build the feature vector for classifier training. Some approaches even use the whole image as the feature vector and perform classification on it, which requires heavy computation. Here the feature vector is built from significant image values from each channel, such as energy, mean, and standard deviation, forming a 40-value feature vector for each image. The system exploits the fact that a human shows a small amount of movement, for instance eye blinking and movements of the mouth and face boundary. The system can obtain this information easily because it deals with a video sequence, from which the whole sequence of the object's movements can be obtained [42]. Taking this into account, we can reduce the errors that occur due to false detection of a human face and minimize the simulation time.
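The per-channel statistics described above can be sketched as follows. Several details are assumptions made for illustration: energy is taken as the sum of squared values, the number of channels is left open (the chapter states a 40-value vector but not how it is composed), and a plain nearest-neighbour match stands in for the Grassmann learning step, which is not specified in enough detail to reproduce.

```python
import math


def channel_features(values):
    """Return (energy, mean, standard deviation) for one channel's pixel values.

    Energy is taken here as the sum of squared values, which is an
    assumption; the chapter does not define it.
    """
    n = len(values)
    energy = sum(v * v for v in values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return energy, mean, math.sqrt(variance)


def feature_vector(channels):
    """Concatenate the per-channel statistics into one flat feature vector."""
    vector = []
    for values in channels:
        vector.extend(channel_features(values))
    return vector


def nearest_identity(query, database):
    """Return the enrolled name whose feature vector is closest in Euclidean distance.

    This simple matcher is a stand-in, not the matching method of the chapter.
    """
    return min(database, key=lambda name: math.dist(query, database[name]))


# Hypothetical enrolled faces, each reduced to two tiny channels.
database = {
    "employee_a": feature_vector([[1, 2, 3], [4, 4, 4]]),
    "employee_b": feature_vector([[9, 9, 9], [0, 1, 0]]),
}
query = feature_vector([[1, 2, 4], [4, 4, 5]])
match = nearest_identity(query, database)
```

A practical system would also apply a distance threshold so that an unknown face is rejected rather than mapped to the closest enrolled person.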
31.3.1.5 Send Alert Intimation
In a surveillance environment, the automatic detection of abnormal activities can be used to alert the relevant authority to potentially criminal or dangerous behavior, for example by automatically reporting a person. In the proposed system, an unknown-event alert is sent to the predefined contact numbers of the officials concerned. Image sharing is also implemented for easy identification of criminals [43].
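One way to assemble such an alert is sketched below using Python's standard `email` package. The chapter mentions both email alerts and predefined contact numbers; only the email case is shown. The addresses, subject line, and link are placeholders, and the message is built but not sent.

```python
from datetime import datetime
from email.message import EmailMessage


def build_alert(recipients, detected_at, image_bytes, image_url):
    """Build (but do not send) an alert email with the captured face image attached."""
    message = EmailMessage()
    message["Subject"] = "Security alert: activity detected during restricted hours"
    message["From"] = "camera@bank.example"  # hypothetical sender address
    message["To"] = ", ".join(recipients)
    message.set_content(
        f"Unknown person detected at {detected_at:%Y-%m-%d %H:%M:%S}.\n"
        f"Live image: {image_url}\n"
    )
    message.add_attachment(
        image_bytes, maintype="image", subtype="jpeg", filename="face.jpg"
    )
    return message


alert = build_alert(
    recipients=["security@bank.example"],  # hypothetical recipient
    detected_at=datetime(2022, 3, 14, 23, 30, 5),
    image_bytes=b"\xff\xd8\xff\xe0 placeholder, not a real JPEG",
    image_url="https://bank.example/captures/latest",  # hypothetical link
)
```

Delivery could then go through `smtplib.SMTP(...).send_message(alert)` against whatever mail server the deployment provides; that part depends on the environment and is omitted here.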
31.3.1.6 Novelty
The approach builds the relationship between the heterogeneous distributions of still images and video clips of different quality. Complexity is low and performance is high, and detection takes little time.
31.3.1.7 Feature Extraction
We present an enhanced feature extraction method for anomaly detection. We propose to use a skeleton structure to eliminate the request imbalance problem. The detection model can then pay closer attention to the weak consistent parts and produce an accurate model. The model can be updated by adding new resources, removing old ones, and changing the parameters of certain resources [44].
31.3.1.8 Impact on Gaussian Mixture Model
A Gaussian Mixture Model has many parameters to fit and generally requires a lot of data and many iterations to obtain good results. It assumes that there is a certain number of Gaussian distributions, each of which represents a cluster. Hence, a Gaussian Mixture Model tends to group together the data points that belong to a single distribution [45–47].
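The clustering behavior described above can be illustrated with a one-dimensional mixture in plain Python. The component parameters below are hypothetical and are set by hand rather than fitted; fitting them (typically with expectation-maximization) is the data-hungry, iterative part the text refers to and is not shown.

```python
import math


def gaussian_pdf(x, mean, variance):
    """Density of a one-dimensional Gaussian at x."""
    return math.exp(-((x - mean) ** 2) / (2 * variance)) / math.sqrt(2 * math.pi * variance)


def mixture_density(x, components):
    """Weighted sum of Gaussian component densities; weights are assumed to sum to 1."""
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in components)


def most_likely_component(x, components):
    """Index of the component with the highest weighted density at x."""
    scores = [w * gaussian_pdf(x, m, v) for w, m, v in components]
    return scores.index(max(scores))


# Hypothetical pixel-intensity model: (weight, mean, variance) per component,
# e.g. a dark background cluster and a bright cluster.
components = [(0.7, 50.0, 25.0), (0.3, 200.0, 100.0)]

dark_cluster = most_likely_component(55, components)
bright_cluster = most_likely_component(190, components)
```

Points are grouped with the component that explains them best, which is the sense in which each Gaussian represents a cluster.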
31.3.2 Challenges
The main challenges are extracting suitable features that characterize normal behavior, handling the imbalanced distribution of normal and abnormal data, and addressing the variations in abnormal behavior and the sparse occurrence of abnormal events [48, 49]. Environmental variations and the diversity of real-world events produce highly complex scenes that are difficult to capture with ordinary cameras [50].
31.4 Conclusion
The proposed system implements smart-camera-based anomaly detection that monitors bank activity. It can detect suspicious behavior, and thieves are tracked by motion and face detection during unwanted time periods. If suspicious activity is detected at an unwanted time, the smart camera immediately sends an alert message to the security department. The message states what kind of alert was raised and contains the face image of the thief, the detection time, and a web link to where the live image is stored, so that security staff can respond appropriately. A Gaussian Mixture Model is a parametric probability density function represented as a weighted sum of Gaussian component densities. It is used here for background modeling because it is one of the best models for that purpose and can represent a wide range of pixel distributions. Anomalies are detected as large clusters of pixels whose values differ from the background model of the captured video: when the anomaly analyzer finds such a cluster, a mask is placed over those pixels and the moving object is thereby detected. Existing systems cannot recognize complex or novel movements of the kind involved in planned thefts [46]. Anomaly detection offers several benefits: prevention of security breaches and threats, discovery of hidden performance opportunities, better use of budgets, resources, and staff, and faster results. Existing systems, by contrast, cannot detect novel attacks, suffer from false alarms, and must be reprogrammed for each new pattern to be detected. Machine data is repetitive and does not need human attention. By continuously monitoring a live environment, anomaly detection algorithms can expose themselves to large amounts of live training data.
References
1. Murugesan, M., Thilagamani, S.: Efficient anomaly detection in surveillance videos based on multi layer perception recurrent neural network. J. Microprocess. Microsyst. 79 (2020)
2. Georgescu, M.-I., Barbalau, A., Ionescu, R.T., Khan, F.S., Popescu, M., Shah, M.: Anomaly detection in video via self-supervised and multi-task learning (2020). arXiv:2011.07491
3. Deepa, K., Kokila, M., Nandhini, A., Pavethra, A., Umadevi, M.: Rainfall prediction using CNN. Int. J. Adv. Sci. Technol. 29(7 Special Issue), 1623–1627 (2020)
4. Thilagamani, S., Shanti, N.: Gaussian and Gabor filter approach for object segmentation. J. Comput. Inf. Sci. Eng. 14(2), 021006 (2014)
5. Perumal, P., Subha, S.: An analysis of a secure communication for healthcare system using wearable devices based on elliptic curve cryptography. World Rev. Sci. Technol. Sustain. Devel. 18(1), 51–58 (2022)
6. Hwang, R.H., Peng, M.C., Huang, C.W., Lin, P.C., Nguyen, V.L.: An unsupervised deep learning model for early network traffic anomaly detection. IEEE Access 8, 30387–30399 (2020)
7. Sabokrou, M., Khalooei, M., Fathy, M., Adeli, E.: Adversarially learned one-class classifier for novelty detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3379–3388 (2018)
8. Park, H., Noh, J., Ham, B.: Learning memory-guided normality for anomaly detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 14360-14336 (2020)
9. Wu, P., Liu, J.: Learning causal temporal relation and feature discrimination for anomaly detection. IEEE Trans. Image Process. 30, 3513–3527 (2021)
10. Luo, W., Liu, W., Gao, S.: Remembering history with convolutional LSTM for anomaly detection. In: 2017 IEEE International Conference on Multimedia and Expo (ICME), pp. 439–444. IEEE (2017)
11. Wu, P., Liu, J., Shen, F.: A deep one-class neural network for anomalous event detection in complex scenes. IEEE Trans. Neural Netw. Learn. Syst. 31(7), 2609–2622 (2020)
12. Nie, X., Wang, B., Li, J., Hao, F., Jian, M., Yin, Y.: Deep multiscale fusion hashing for cross-modal retrieval. IEEE Trans. Circuits Syst. Video Technol. 31(1), 401–410 (2021)
13. Tang, Y., Zhao, L., Zhang, S., Gong, C., Li, G., Yang, J.: Integrating prediction and reconstruction for anomaly detection. Pattern Recognit. Lett. 129, 123–130 (2020)
14. Thilagamani, S., Nandhakumar, C.: Implementing green revolution for organic plant forming using KNN-classification technique. Int. J. Adv. Sci. Technol. 29(7S), 1707–1712 (2020)
15. Chaker, R., Al Aghbari, Z., Junejo, I.N.: Social network model for crowd anomaly detection and localization. Pattern Recognit. 61, 266–281 (2017)
16. Xu, D., Yan, Y., Ricci, E., Sebe, N.: Detecting anomalous events in videos by learning deep representations of appearance and motion. Comput. Vis. Image Understand. 156, 117–127 (2016)
17. Peng, X., Feng, J., Xiao, S., Yau, W.-Y., Zhou, J.T., Yang, S.: Structured autoencoders for subspace clustering. IEEE Trans. Image Process. 27(10), 5076–5086 (2018)
18. Medel, J.R., Savakis, A.: Anomaly detection in video using predictive convolutional long short-term memory networks (2016). arXiv:1612.00390
19. Sprechmann, P., Bronstein, A.M., Sapiro, G.: Learning efficient sparse and low rank models. IEEE Trans. Pattern Anal. Mach. Intell. 37(9), 1821–1833 (2015)
20. Zhou, J.T., Du, J., Zhu, H., Peng, X., Liu, Y., Goh, R.S.M.: Anomalynet: an anomaly detection network for video surveillance. IEEE Trans. Inf. Forensics Security 14(10), 2537–2550 (2019)
21. Huo, J., Gao, Y., Yang, W., Yin, H.: Multi-instance dictionary learning for detecting abnormal events in surveillance videos. Int. J. Neural Syst. 24(03), 1430010 (2014)
22. Liu, W., Luo, W., Lian, D., Gao, S.: Future frame prediction for anomaly detection: a new baseline. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6536–6545 (2018)
31 Anomaly Detection for Bank Security Against Theft …
309
23. Li, J., Mei, X., Prokhorov, D., Tao, D.: Deep neural network for structural prediction and lane detection in traffic scene. IEEE Trans. Neural Netw. Learn. Syst. 28(3), 690703 (2017) 24. Chong, Y.S., Tay, Y.H.: Abnormal event detection in videos using spatiotemporal autoencoder. In: International Symposium on Neural Networks, pp. 189–196 (2017) 25. Medel, J.R., Savakis, A.: Anomaly detection in video using predictive convolutional long shortterm memory networks. arXiv preprint arXiv:1612.00390 (2016) 26. Hinami, R., Mei, T., Satoh, S.I.: Joint detection and recounting of abnormal events by learning deep generic knowledge. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3619–3627 (2017) 27. Luo, W., Liu, W., Gao, S.: A revisit of sparse coding based anomaly detection in stacked rnn framework. In: Proceedings of the IEEE International Conference on Computer Vision, Oct 2017, pp. 341–349 (2017) 28. Sharma, A, Varshney, N.: Identification and detection of abnormal human activities using deep learning techniques. Eur. J. Mol. Clin. Med. 7(4), 408–417 (2020) 29. Jiang, F., Yuan, J., Tsaftaris, S.A., & Katsaggelos, A.K.: Anomalous video event detection using spatiotemporal context. Comput. Vis. Image Understand. 115(3), 323–333 (2011) 30. .Pandiaraja, P., Aravinthan, K., Lakshmi Narayanan, R., Kaaviya, K.S., Madumithra, K.: Efficient cloud storage using data partition and time based access control with secure AES encryption technique. Int. J. Adv. Sci. Technol. 29(7), 1698–1706 (2020) 31. Sabokrou, M., Fayyaz, M., Fathy, M., Klette, R.: Deep-cascade: cascading 3D deep neural networks for fast anomaly detection and localization in crowded scenes. IEEE Trans. Image Process. 26(4), 1992–2004 (2017) 32. Santhi, P., Mahalakshmi, G.: Classification of magnetic resonance images using eight directions gray level co-occurrence matrix (8dglcm) based feature extraction. Int. J. Eng. Adv. Technol. 8(4), 839–846 (2019) 33. 
Xu, D., Ricci, E., Yan, Y., Song, J., & Sebe, N.: Learning deep representations of appearance and motion for anomalous event detection. arXiv preprint arXiv:1510.01553 (2015) 34. Cai, R., Zhang, H., Liu, W., Gao, S., Hao, Z.: Appearance-motion memory consistency network for video anomaly detection. Proc. AAAI Artif. Intell. 35(2), 938–946 (2021) 35. Li, W., Mahadevan, V., Vasconcelos, N.: Anomaly detection and localization in crowded scenes. IEEE Trans. Pattern Anal. Mach. Intell. 36(1), 18–32 (2014) 36. Malhotra, P., Vig, L., Shroff, G., Agarwal, P.: Long short term memory networks for anomaly detection in time series. In: Proceedings Presses Universitaires De Louvain, Aug 2015, p. 89 (2015) 37. Cheng, K.-W., Chen, Y.-T., Fang, W.-H.: Video anomaly detection and localization using hierarchical feature representation and Gaussian process regression. In: CVPR (2015) 38. Mousavi, H., Nabi, M., Galoogahi, H.K., Perina, A., Murino, V. : Abnormality detection with improved histogram of oriented tracklets. In: Proceedings of International Conference on Image Analysis and Processing, pp. 722–732 (2015) 39. Neelima, D., Rao, K.L.: A moving object tracking and velocity determination. Int. J. Adv. Eng. Sci. Technol. (IJAEST) 11(1), 96–100 (2011) 40. Chongjing, W., Xu, Z., Yi, Z., Yuncai, L.: Analyzing motion patterns in crowded scenes via automatic tracklets clustering. In: Communications, China 10, no. 4, Apr 2013, pp. 144–154 (2013) 41. Munawar, A., Vinayavekhin, P., De Magistris, G.: Spatio-temporal anomaly detection for industrial robots through prediction in unsupervised feature space. In: 2017 IEEE Winter conference on applications of computer vision (WACV). IEEE, pp. 1017–1025 (2017) 42. Pradeep, D., Sundar, C.: QAOC: Noval query analysis and ontology-based clustering for data management in Hadoop, vol. 108, pp. 849–860 (2020) 43. Smeureanu, S., Ionescu, R.T., Popescu, M., Alexe, B.: Deep appearance features for abnormal behavior detection in video. 
In: International Conference on Image Analysis and Processing, vol. 10485, pp. 779–789 (2017) 44. Gunasekar, M., Thilagamani, S.: Performance analysis of ensemble feature selection method under SVM and BMNB classifiers for sentiment analysis. Int. J. Sci. Technol. Res. 9(2), 1536– 1540 (2020)
310
G. Pavithra et al.
45. Lu, Y., Yu, F., Reddy, M.K.K., Wang, Y.: Few-shot scene-adaptive anomaly detection. In: European Conference on Computer Vision, Oct 2020, pp. 125–141 46. Liu, W., Luo, W., Li, Z., Zhao, P., Gao, S.: Margin learning embedded prediction for video anomaly detection with a few anomalies. In: Proceedings of 28th International Joint Conference on Artificial Intelligence, Aug 2019, pp 3023–3030 (2019) 47. Peng, X., Lu, C., Zhang, Y., Tang, H.: Connections between nuclear norm and frobeniusnorm-based representations. IEEE Trans. Neural Netw. Learn. Syst. 29(1), 218–224 (2018) 48. Rajesh Kanna, P., Santhi, P.: Unified deep learning approach for efficient intrusion detection system using integrated spatial–temporal features. Knowl. Based Syst. 226 (2021) 49. Deepa, K., Thilagamani, S.: Segmentation techniques for overlapped latent fingerprint matching. Int. J. Innov. Technol. Explor. Eng. 8(12), 1849–1852 (2019) 50. Vondrick, C., Pirsiavash, H., Torralba, A.: Generating videos with scene dynamics. In: Advances in Neural Information Processing Systems, vol. 29 (2016)
Chapter 32
A Framework of a User Interface Covid-19 Diagnosis Model
M. Sumanth and Shashi Mehrotra
Abstract The world is facing the global challenge of the COVID-19 pandemic, a topic of great concern. COVID-19 is a contagious disease that spreads very quickly. Artificial intelligence (AI) can assist healthcare professionals in assessing disease risks, supporting diagnosis, prescribing medication, and forecasting future well-being, and may be helpful in the current situation. A user-friendly, Web application-based diagnosis framework is particularly useful in health care. The study focuses on a Web-based model for diagnosing COVID-19 patients without direct contact with the patient. Chest CT scans have been important for testing and diagnosing COVID-19. The Web-based model takes two inputs, CT scan images and the user's symptoms, and displays the classification result: NON-COVID-19 or COVID-19 infected.
Keywords Convolution neural network · User interface · COVID-19
M. Sumanth (✉) · S. Mehrotra
Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram, AP, India
e-mail: [email protected]
S. Mehrotra
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_32

32.1 Introduction
COVID-19 is a disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), first detected in December 2019 in Wuhan, China. It is highly infectious and spread worldwide in just a few months [1, 5]. Shortness of breath, loss of the sense of smell and taste, and elevated body temperature are a few of the symptoms that the virus causes. The combination of a long incubation period and asymptomatic carriers (people who carry the virus without showing signs of disease) makes COVID-19 significantly harder to identify. Data up to April 5, 2020, report 1,133,758 COVID-19 cases and 62,784 deaths in 200 countries [2], and variants such as Delta and Omicron continue to circulate. This is a great global public health concern.
Laboratory testing of swab samples and chest radiography are the major COVID-19 diagnostic approaches. In diagnosing COVID-19, X-ray and computed tomography (CT) are essential [1]. Chest CT is useful for diagnosing COVID-19 complications, and its role in prognosis needs to be investigated further. Artificial intelligence (AI) methods now enhance the power of imaging tools and help medical professionals. Recently, the number of studies using AI techniques to diagnose COVID-19 from chest CT has grown rapidly [3]. Artificial intelligence is becoming more prevalent in various fields, and health care is among the most important of them. Features derived from multiple analysis methods, including artificial intelligence, are available in modern medical imaging. Fuzzy reasoning, evolutionary computation, and neural networks are examples of analyses that use these features. Medical diagnostics can be aided by proper image processing, feature selection, and artificial intelligence methods. Convolutional neural networks (CNNs) are becoming popular tools due to their accuracy and speed [4]. Online symptom checkers could provide healthcare access in low-resource settings such as rural areas and improve public health [5].
The paper presents a framework for online COVID-19 case prediction that takes a CT scan and symptoms as input from the user. Convolutional neural networks are used for diagnosing COVID-19 infection from chest CT scan images. A model gives information about CT scans from patients with COVID-19 or bacterial pneumonia. Such a platform may be safer for hospitals and the health system, since radiologists are costly and not always available. CT scans are relatively affordable and are now a part of routine medical care. This could save both money and time, which is particularly important when dealing with COVID-19.
The paper is organized as follows: Sect. 32.2 discusses related work, Sect. 32.3 describes the design and methodology, and Sect. 32.4 concludes and outlines future work.
32.2 Related Work
Deep learning and machine learning algorithms have been used in health care for disease diagnosis. The research uses classification and clustering techniques for health care. Clustering is a technique for grouping similar objects together and is used for various purposes [6–9]. Classification assigns an object to one of a set of classes. The authors in [10–12] used machine learning and deep learning algorithms for disease diagnosis. Online symptom checker applications allow users to input symptoms to obtain a diagnosis. Berry [13] states that users worldwide search for health information online and that symptom checkers provide a cost-effective way to meet the societal demand for on-demand electronic health care. According to Berry [13], after the 2012 Olympics, online symptom checkers were added to syndromic surveillance with great success. There is a clear connection between online symptom checker results and conventional telephone triage results for several syndromic indicators. Online symptom checker data tend to provide early notice for specific diseases.
Taneja and Gupta [14] provide an overview of Web development. Web development entails client-side and server-side programming. Server-side programming includes writing Web apps, hosting Web sites, and creating Web pages. Client-side programming refers to developing tools or interfaces for interacting with these Web applications, sites, or pages. Ozsahin et al. [3] discuss that thoracic CT is highly sensitive for diagnosing COVID-19, making it a primary tool for detecting the disease. As per García et al. [15], the Internet has become a standard tool in everyday life. Users rely on their data being securely stored and accessible in the cloud, and as the volume of data grows, a quick and dependable Web interface is needed. A thorough understanding of various programming languages is usually required to create such interfaces; the authors aim to demonstrate that the open-source software libraries available today reduce the code and knowledge required. Chamola et al. [16] conducted a study on COVID-19 affected patients to identify the effect of the SARS-CoV-2 virus on various organs of the patients.
32.3 Design and Methodology
The model for predicting COVID-19 cases is based on deep learning. The model consists of two modules:
1. CNN model for diagnosis using chest CT images.
2. COVID-19 diagnosis model using the primary symptoms.
A user-friendly Web page accepts two kinds of input from the user, which are used to diagnose whether a person is infected with COVID-19. The user provides the following two inputs:
• CT scan images of the patient
• List of symptoms the patient is suffering from
The following symptoms are considered for the diagnosis:
• Headache
• Itchy eyes
• Loss of smell
• Runny nose
• Cough
• Trouble in breathing
The user enters 0 or 1 for each of the above symptoms: 0 means the person does not have the symptom, while 1 indicates that the symptom is present. The output is displayed based on these responses and the CT scan image (Fig. 32.1). The input module consists of two Web pages that accept the two types of input: the CT scan image and the symptoms. The data then pass through several preprocessing steps, and the preprocessed data are passed to the classification model. The classification model consists of Model1, which classifies the CT scan image, and Model2, which works with the symptoms entered by the user; finally, the output is displayed. Figure 32.2 shows the classification output, i.e., whether a person is diagnosed as COVID-19 or non-COVID-19.
Fig. 32.1 Workflow of the process (blocks: Input Image, Input symptoms, Input Module with Web page 1 and Web page 2, Preprocess Data, Classification Module with Model 1 and Model 2, Diagnosis Result)
Fig. 32.2 Classification output
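The two-input workflow described in this section can be illustrated with a minimal Python sketch. The chapter does not specify how the outputs of Model1 and Model2 are combined, so the fusion rule below (averaging two scores in [0, 1] and applying a 0.5 threshold) is purely an assumption made for illustration, as are the identifier names `SYMPTOMS`, `encode_symptoms`, and `combine`. The sketch only shows how the 0/1 symptom answers could be turned into a fixed-order vector and how a final label could be produced.

```python
from typing import Dict, List

# Symptom order follows the list in the text; the identifier names are illustrative.
SYMPTOMS = (
    "headache", "itchy_eyes", "loss_of_smell",
    "runny_nose", "cough", "trouble_breathing",
)


def encode_symptoms(answers: Dict[str, int]) -> List[int]:
    """Convert form answers into a fixed-order 0/1 vector (missing answers count as 0)."""
    unknown = set(answers) - set(SYMPTOMS)
    if unknown:
        raise ValueError(f"unknown symptoms: {sorted(unknown)}")
    vector = []
    for name in SYMPTOMS:
        value = answers.get(name, 0)
        if value not in (0, 1):
            raise ValueError(f"{name} must be 0 or 1, got {value!r}")
        vector.append(value)
    return vector


def combine(ct_score: float, symptom_score: float, threshold: float = 0.5) -> str:
    """Hypothetical fusion rule (an assumption): average both model scores, then threshold."""
    for score in (ct_score, symptom_score):
        if not 0.0 <= score <= 1.0:
            raise ValueError("scores must lie in [0, 1]")
    combined = (ct_score + symptom_score) / 2
    return "COVID-19" if combined >= threshold else "NON-COVID-19"
```

In a real deployment, `ct_score` would come from the CNN applied to the preprocessed CT image and `symptom_score` from the symptom model; both are left as plain numbers here because the chapter gives no model details.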
32.4 Conclusion and Future Work
The paper discusses the detection of COVID-19 by a user interface model that uses CT scan images and symptoms obtained from patients, with two separate modules for CT scan and symptom input. The most common symptoms of COVID-19 infection are dry cough, fever, and trouble breathing. Some COVID-19 patients have also reported weakness, muscle aches, and a loss of smell or taste. However, generalizing the model's results requires caution and collaboration with a domain expert. During initial inpatient treatment, hospitals could use the model to distinguish COVID-19 infected patients from non-COVID-19 patients, which would help control the spread of the disease to other patients. The symptom checker could also be used to monitor patient symptoms and forewarn healthcare officials in resource-limited communities.
References
1. Shi, F., Wang, J., Shi, J., Wu, Z., Wang, Q., Tang, Z., Shen, D.: Review of artificial intelligence techniques in imaging data acquisition, segmentation and diagnosis for COVID-19. In: IEEE Reviews in Biomedical Engineering (2020)
2. WHO: Coronavirus disease 2019 (COVID-19), Situation Report-76, 5 Apr 2020. Available: https://www.who.int/docs/default-source/coronaviruse/situation-reports/20200405-sitrep-76covid-19.pdf
3. Ozsahin, I., Sekeroglu, B., Musa, M.S., Mustapha, M.T., Uzun Ozsahin, D.: Review on diagnosis of COVID-19 from chest CT images using artificial intelligence. Comput. Math. Methods Med. (2020)
4. Ma, N., Zhang, X., Zheng, H.T., Sun, J.: ShuffleNet V2: practical guidelines for efficient CNN architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018)
5. Morita, T., Rahman, A., Hasegawa, T., Ozaki, A., Tanimoto, T.: The potential possibility of symptom checker. Int. J. Health Policy Manag. 6(10), 615 (2017)
6. Mehrotra, S., Kohli, S., Sharan, A.: To identify the usage of clustering techniques for improving search result of a website. Int. J. Data Mining Model. Manag. 10(3), 229–249 (2018)
7. Mehrotra, S., Kohli, S., Sharan, A.: An intelligent clustering approach for improving search result of a website. Int. J. Adv. Intel. Paradigms 12(3–4), 295–304 (2019)
8. Mehrotra, S., Kohli, S.: Comparative analysis of K-Means with other clustering algorithms to improve search result. In: 2015 International Conference on Green Computing and Internet of Things (ICGCIoT), pp. 309–313. IEEE, Oct 2015
9. Mehrotra, S., Kohli, S.: Data clustering and various clustering approaches. In: Intelligent Multidimensional Data Clustering and Analysis, pp. 90–108. IGI Global (2017)
10. Obaid, O.I., Mohammed, M.A., Ghani, M.K.A., Mostafa, A., Taha, F.: Evaluating the performance of machine learning techniques in the classification of Wisconsin Breast Cancer. Int. J. Eng. Technol. 7(4.36), 160–166 (2018)
11. Abdullah, D., Susilo, S., Ahmar, A.S., Rusli, R., Hidayat, R.: The application of K-means clustering for province clustering in Indonesia of the risk of the COVID-19 pandemic based on COVID-19 data. Qual. Quant. 1–9 (2021)
12. Nishant, P.S., Mehrotra, S., Mohan, B., Devaraju, G.: Identifying classification technique for medical diagnosis. In: ICT Analysis and Applications, pp. 95–104. Springer, Singapore (2020)
13. Berry, A.C.: Online symptom checker applications: syndromic surveillance for international health (2018)
14. Taneja, S., Gupta, P.R.: Python as a tool for web server application development. Int. J. Inf. Commun. Comput. Technol. 2(1), 2347–7202 (2014)
15. García, G.P., Bejar, J.O., Romero, J.J.F.: Development of dynamic web sites using Python and Google Closure
16. Chamola, V., Hassija, V., Gupta, V., Guizani, M.: A comprehensive review of the COVID-19 pandemic and the role of IoT, drones, AI, blockchain, and 5G in managing its impact. IEEE Access 8, 90225–90265 (2020)
Chapter 33
Priority Queue-Based Framework for Allocation of High Performance Computing Resources
Manish Kumar Abhishek, D. Rajeswara Rao, and K. Subrahmanyam
Abstract In a high performance computing environment built on containers, application performance is always a concern in terms of resource allocation. A cluster is formed from containers that are provisioned according to the configuration the application needs. The environment has its own distributed cluster of physical machines on which these containers are launched. Dynamic allocation of containers gives clusters flexibility but, because resources are limited, does not ensure that multiple requests can be processed at the same time. It is difficult to decide which request should receive memory, GPU and CPU resources and which can be deferred when resources run short. This paper describes a priority queue-based framework, built on an algorithm that schedules incoming resource allocation requests for the containers required to execute high performance computing (HPC) applications. It behaves like a priority-based scheduler for resource allocation: a request with higher priority gets the requested resources first, and requests with the same priority are served in order of submission time. The framework is implemented to give a high performance computing environment a way to allocate the required computing resources to containers and to ensure that critical applications with higher priority get them. The proposed framework allocates resources to containers in a well-balanced environment. Its performance has been evaluated and compared with existing algorithms and schedulers.
Keywords Container · Docker · High performance computing · Priority queue · Scheduler · Virtualization
M. K. Abhishek (✉) · D. R. Rao · K. Subrahmanyam
Koneru Lakshmaiah Education Foundation, Vaddeswaram, India
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_33

33.1 Introduction
Containerization is based on virtualization. Virtualization provides resources through abstraction, partitioning and isolation. It helps in provisioning computing instances in the form of containers on demand, as needed to execute a high performance computing application. Containers are based on operating system virtualization and package the application and all its dependencies as a single unit. The computing resources, in terms of CPU, GPU and memory, are allocated from physical machines hosted in data centers. High performance computing (HPC) applications running in containers are a current research trend. Containers provide an easily scalable environment for research labs, where users can run machine learning-based models on large datasets without worrying about infrastructure configuration, cluster startup and maintenance. Containers are created from Docker files as executable images that wrap the whole application, its dependencies and everything else needed to run it. Resource allocations, in turn, are simply defined in configuration files from which the containers are spawned. These configurations hold the minimum and maximum values for resources in terms of CPU, memory and GPU [1]. The cluster of physical servers on which containers are provisioned has limited computing resources. A few requests make no difference, but because the cluster is used for high performance applications, the number of requests matters. With a huge number of requests, resources will be exhausted at some point, so they need to be monitored. It is difficult to differentiate among requests when resources are about to be exhausted. Allocating memory at run time is difficult at this point and can result in failure. A resource allocation policy is used, but it does not take the priority of incoming requests into account. Every request is treated as important and is allocated resources, which affects other critical running applications that need resources at the same time.
We propose a priority queue-based framework, implemented using an algorithm, that allocates resources to the containers executing applications within an HPC cluster environment. Priority-based allocation has a significant impact when multiple requests arrive over time. The main issue this framework resolves is allocating resources to higher-priority requests, which avoids the impact of a resource crunch at run time when using containers. There are well-known resource managers for managing HPC resources [2]. Many resource allocation scheduling algorithms already exist, but they do not give much consideration to GPU and CPU requirements when optimizing resource allocation to containers for HPC applications [3]. We attach a priority to each incoming request for resources from the available pool of computing resources. There are five priority types: urgent, high, major, normal and low. Requests with the same priority are served based on their submission timestamp. Here, we analyze the challenges and explore the alternatives for allocating resources to containers in a high performance computing environment. We focus on the implementation of the priority queue-based approach, that is, algorithm-based scheduling of computing resource allocation, and on its performance evaluation. This framework is applicable not only to containers but also to other computing instances, for example virtual machines. It will be highly beneficial for services wherever allocation of computing resources is required [4]. The main advantage of this framework is its use of a priority scheduler for allocating and scheduling computing resources to containers. The other sections of this paper are arranged as follows: Sect. 33.2 describes virtualization and containerization. Section 33.3 describes our proposed algorithm using a priority queue, along with its implementation. Section 33.4 describes the evaluation methodology. Section 33.5 presents the results and a discussion of performance. Section 33.6 gives the concluding remarks.
33.2 Virtualization and Containers
The key concept of virtualization dates from the mainframe era. Virtualizing computing resources such as storage, CPU, network and I/O has always been popular as a way to share systems and improve their utilization. Infrastructure virtualization at the hardware level provisions virtual machines (VMs) [5], and hypervisors are used to achieve this. The applications executed in these VMs are isolated from the underlying hardware resources. It is one way to improve the overall efficiency of virtualization. It is widely used across the IT industry as a cost-effective, energy saving and infrastructure reducing model. In cloud computing, it is used as an infrastructure-as-a-service solution. It offers high availability, remote access, scalability and effective use of resources. Using virtualization, multiple applications and operating systems can reside on the same physical server, which increases the flexibility of the hardware [6]. There are various types of virtualization, one of which is operating system level virtualization. It refers to a feature in which the kernel allows the existence of multiple isolated user-space instances [7]. Containers are these instances and act like physical hosts for the running applications. Containerization is used to create, package, deploy and run an application in an isolated, lightweight execution environment in the form of a container. A container executes on a real host by sharing the operating system, required libraries, binaries and other dependencies. It reduces host management by running the isolated system on top of a single OS [8]. Containers are very lightweight in comparison with virtual machines and are gaining popularity from a usage and OS virtualization perspective. Most Google apps run in containers. Containers can run anywhere, from the public cloud to a private data center [9]. Docker is used to build the image of an application and to deploy and execute it in a container.
In short, Docker is used to orchestrate and monitor containers. The whole process is achieved mainly using three components: the CLI client, the REST API and the daemon process. Through the CLI client, Docker provides a range of commands to build, tag and deploy images and to check the running status of the application. The REST API is used to communicate with the Docker registry, which is used to manage information about the images [10]. The application-related requirements and configuration are defined in a file known as the Docker file [11]. If no resource configuration is defined, the container is not subject to any constraint on computing resources. A real-time scheduler can be configured, but it is mostly kept at the default because CPU scheduling is a kernel-level feature. If these settings are not configured properly, the result can be host instability. To access and set GPU resource capabilities in a container, the Docker command needs to be run with the "--gpus" flag. Capabilities and other configuration are defined as environment variables in the image (Fig. 33.1).
Fig. 33.1 Priority-based resource allocator using Docker containers within HPC cluster
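As a small illustration of the resource flags discussed above, the following Python sketch assembles a `docker run` command line that constrains CPU, memory and GPU access. The flags `--cpus`, `--memory`, `--gpus` and `--rm` are standard `docker run` options; the image name `hpc-app:latest` and the helper name `docker_run_command` are hypothetical and used only for illustration. The sketch builds the argument list without executing it, and GPU access typically also requires NVIDIA container support on the host.

```python
import shlex
from typing import List, Optional


def docker_run_command(
    image: str,
    cpus: Optional[float] = None,
    memory: Optional[str] = None,
    gpus: Optional[str] = None,
) -> List[str]:
    """Build (but do not execute) a `docker run` argument list with resource limits."""
    cmd = ["docker", "run", "--rm"]
    if cpus is not None:
        cmd += ["--cpus", str(cpus)]      # CPU limit, e.g. 2 or 1.5
    if memory is not None:
        cmd += ["--memory", memory]       # memory limit, e.g. "4g"
    if gpus is not None:
        cmd += ["--gpus", gpus]           # e.g. "all" or a device count such as "1"
    cmd.append(image)
    return cmd


# Hypothetical image name, shown only to illustrate the resulting command line.
print(shlex.join(docker_run_command("hpc-app:latest", cpus=2, memory="4g", gpus="all")))
```

Omitting all three limits yields a plain `docker run --rm <image>`, which corresponds to the unconstrained case described in the text.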
33.3 Priority Queue-Based Framework
Here, we introduce a priority queue-based framework for allocating computing resources for HPC application execution. With it, we address the problem of resource allocation when resources are about to be exhausted and multiple requests are waiting in the queue for resources. With a priority-based queue, the higher-priority request gets the resources. Selecting the best available resources is one of the most important factors that need to be addressed for user requests at run time. This is challenging because different types of applications with multiple workloads run in multiple containers within the cluster. Policies have been defined that are applied to the queue to determine the priority of the requests submitted to run applications, along with their resource configurations. Below, we describe the framework and its implementation [12].
33.3.1 Proposed Approach
The application is deployed on multiple containers through n requests, each of which looks for its configured computing resources. Resource consumption increases and decreases with the workload. This mainly concerns applications with multiple workloads in an HPC environment [13]. The aim is to allocate the required resources first to the higher-priority requests for containers hosted on the HPC cluster. In addition, every host in the data center is also container based. The scheduler helps redirect a computing instance request to an appropriate host, which in turn serves the incoming request by provisioning a new container sized according to the defined configuration. The priority and all host state data are shared with the computing instances so that appropriate decisions can be taken in time [14]. Based on the policy, requests with priority "urgent" are considered first. There are five types of priority, in the following order: urgent, high, major, normal and low. The priority-based queue helps in making decisions instantly and in differentiating among requests, keeping lower-priority requests in the queue until sufficient computing resources are available. If requests arrive with the same priority, they are served by submission timestamp on a first come, first served basis (Fig. 33.2).
Fig. 33.2 Priority queue-based high level architecture
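The ordering rule of the proposed approach (five priority levels, with ties broken by submission time) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation; the names `Priority`, `Request` and `service_order`, and the use of a numeric timestamp, are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class Priority(IntEnum):
    """Five priority levels from the text; a smaller value is served earlier."""
    URGENT = 0
    HIGH = 1
    MAJOR = 2
    NORMAL = 3
    LOW = 4


@dataclass(frozen=True)
class Request:
    name: str
    priority: Priority
    submitted_at: float  # submission timestamp in seconds (illustrative)


def service_order(requests: List[Request]) -> List[str]:
    """Order requests by priority first, then first come, first served within a level."""
    ranked = sorted(requests, key=lambda r: (r.priority, r.submitted_at))
    return [r.name for r in ranked]
```

For example, two urgent requests submitted at times 2.0 and 3.0 are both served before any normal or low request, and the one submitted at 2.0 goes first.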
33.3.2 Implementation
The priority queue-based framework is implemented as part of this research work using our own stack of container-based clusters to run HPC applications. We also use Docker to create the application images and to deploy them as containers. Resource allocation to requests is based on priority and is implemented in the form of Algorithm 1 (pseudocode). Containers are deployed on multiple servers within the HPC cluster. Here, we consider both CPU and GPU as the computing resources needed to execute the application, and resources are served based on the priority of the request. We have deployed multiple applications with multiple workloads as containers, so that different amounts of resources are required and allocated based on priority. The proposed policy, known as the priority blocking policy, retrieves requests from the queue based on priority. The container request CRi for resource allocation for an application is defined in terms of flavors and persisted in the queue in FIFO order. These flavors are the five types of priority: PU, PH, PM, PN and PL. When a request is retrieved, a custom sorting CS based on priority is applied to fetch the higher-priority requests in sequential order. In the framework, we consider the queue Q1 at two levels. At the first level, requests are submitted in first in, first out order; at the second level, requests are considered according to the availability of GPU, CPU and memory, taking the priority of the request into account [15].
Algorithm 1: Pseudocode for the priority blocking policy
Input: Sn, CRn, Q1
Output: Request with higher priority

Sn = ϕ
RL = new list<>()
for each request ∈ CRn do
    for each server ∈ Sn do
        check available CPU, RAM, GPU
        set priority based on flavor
        RL.add(request)
    end for
end for
pull request
highest = RL.get(0)
for each req ∈ RL do
    if highest.priority < req.priority then
        highest = req
    end if
end for
RL.remove(highest)
return highest

• Sn: number of servers where containers will be provisioned
• CRn: container deployment requests for resource allocation
• Q1: the queue to persist the incoming requests
• PU, PH, PM, PN and PL: priority urgent, high, major, normal, low
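The selection logic of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it replaces the linear scan of Algorithm 1 with a binary heap (Python's heapq module), and the class names, the request names, the numeric ranks given to the five flavors and the arrival counter used for first come first serve tie-breaking are all hypothetical.

```python
import heapq
import itertools
from dataclasses import dataclass, field

# Five priority flavors from the text, most urgent first.
# The numeric ranks are an assumption of this sketch.
PRIORITY_RANK = {"urgent": 0, "high": 1, "major": 2, "normal": 3, "low": 4}


@dataclass(order=True)
class ContainerRequest:
    rank: int                          # lower rank = higher priority
    seq: int                           # arrival order, used as FCFS tie-breaker
    name: str = field(compare=False)   # hypothetical request identifier


class PriorityRequestQueue:
    """Queue that returns the highest-priority request first."""

    def __init__(self):
        self._heap = []
        self._arrivals = itertools.count()

    def submit(self, name, priority):
        request = ContainerRequest(PRIORITY_RANK[priority], next(self._arrivals), name)
        heapq.heappush(self._heap, request)

    def pull(self):
        """Remove and return the most urgent request, or None if empty."""
        return heapq.heappop(self._heap) if self._heap else None


queue = PriorityRequestQueue()
for name, priority in [("job-a", "normal"), ("job-b", "urgent"),
                       ("job-c", "urgent"), ("job-d", "low")]:
    queue.submit(name, priority)

# Requests come out by priority; equal priorities keep their arrival order.
served = [queue.pull().name for _ in range(4)]
print(served)
```

The heap keeps each pull at logarithmic cost, whereas the scan in Algorithm 1 visits every queued request; both should return the same request as long as ties are broken by arrival order.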
33.4 Evaluation Methodology
In this section, we explain the results of the experiments performed in our in-house lab environment, where containers execute multiple high performance applications with different workloads.
33.4.1 Experimental Test Bed
We built our own cluster-based environment, in which containers are deployed and request GPU and CPU resources. Docker is used to provision containers, instead of virtual machines, to execute the HPC application [16]. The containers are deployed with multiple flavors, categorized by the computing resources they require. We used two large physical hosts with the configuration listed in Table 33.1 for the multiple containers.
33.4.2 Benchmarks and Applications
To benchmark the requests, a set of particular specifications and test beds has been considered. The proposed framework is benchmarked using the CloudSim toolkit, in which the performance of the priority blocking queue-based approach and of existing algorithms is evaluated. The toolkit is used to describe the instances, computing resources and policies for spawning and scheduling resource allocation [17]. The requests are submitted over a time period with some delay. The containers' network bandwidth differs with the physical servers used. A standard algorithm, first come first serve, and a scheduler, random scheduling, are used for comparison with the proposed framework.

Table 33.1 Test bed specifications
Resource | Containers
Operating system | CentOS 7.0
Memory | 126 GB
Network | Emulated 1GigE
Number of processors | 12 × QEMU virtual CPU @ 2.67 GHz
33.5 Results and Discussion
We used Docker Engine to launch the containers in our in-house lab setup. With it, containers are provisioned easily and quickly, even in a distributed environment. We opted for the default networking mode, bridge networking. First, we executed and evaluated the framework based on the priority queue algorithm. We observed that when few resources remain available and multiple requests arrive at the same time, the higher-priority request is served for resource allocation rather than the first-arrived one. Figure 33.3 shows the impact on resource utilization for our priority-based queue and for the existing first come first serve algorithm. With the priority queue, the processing capacity of the containers is fully consumed because of the scheduler. The evaluation aims to find the optimum scheduler for consuming the different types of priority requests present in the queue, based on the computing resources they require. Figure 33.4 shows the container workload balance obtained with the proposed and existing algorithms. The existing first come first serve scheme dispatches requests to the running containers but cannot account for the multiple workloads of multiple containers, which results in longer response times, whereas random scheduling only guarantees a balanced workload across containers. We observed that when the number of requests is small and the remaining buffers of the containers are full, it is difficult to balance the workload of a few containers, but priority-based scheduling handles this case easily.
Fig. 33.3 Number of requests versus utilization rate of resources allocation
Fig. 33.4 Comparison between existing algorithm, scheduling and priority queue-based workload
Figure 33.5 shows the average makespan of the existing first come first serve algorithm and of random scheduling. The average makespan increases with the number of requests and appears to be directly proportional to it, but the proposed priority-based queue takes less time, on the order of 50–60 s. The analysis shows a significant performance improvement when the priority-based queue is used for resource allocation requests. In a high performance computing environment, an application executing in containers continuously requests resources, and its consumption varies at run time.
Fig. 33.5 Average makespan versus number of requests with existing and proposed algorithm
33.6 Conclusion
As containerization with Docker gains popularity, several research efforts have executed HPC applications in containers and benchmarked them to draw conclusions about resource allocation with existing algorithms and schedulers. An application deployed in containers consumes computing resources, and its consumption varies at run time with the workload. We demonstrated that, when computing resources must be allocated to containers, requests can be served based on priority: the higher-priority request gets the resources first when computing resources are close to exhaustion. We used Docker for the orchestration of containers. Our previous research work demonstrated that HPC applications can be run efficiently using Docker containers instead of VMs, and that containers are useful in environments where research work is carried out by students [14]. Here, as a next step and an enhancement, we addressed resource allocation to containers so that incoming requests are considered appropriately. Our aim is to allocate resources to the application that needs them most at the point when the underlying infrastructure is about to exhaust its computing resources, i.e., RAM, GPU and number of CPU cores. Existing scheduling algorithms and policies do not address this problem efficiently [18]. We proposed a novel framework design for container resource management, together with its execution and estimation. The proposed priority queue multiplexes virtual to real resources based on multiple varying requirements. It is efficient in allocating resources to the neediest requests, which also improves overall performance. As future work, we will consider other computing resources, such as I/O performance and network bandwidth, and address other existing challenges of adoption in cloud environments [18].
References
1. Younge, A.J., Pedretti, K., Grant, R.E., Brightwell, R.: A tale of two systems: using containers to deploy HPC applications on supercomputers and clouds. In: 2017 IEEE International Conference on Cloud Computing Technology and Science (CloudCom), pp. 74–81. Hong Kong (2017). https://doi.org/10.1109/CloudCom.2017.40
2. Gannon, D., Sochat, V.: Singularity: a container system for HPC applications (2017)
3. Gupta, A., Milojicic, D., Kalé, L.: Optimizing VM placement for HPC in the cloud. pp. 1–6 (2012). https://doi.org/10.1145/2378975.2378977
4. Younge, A.J., Pedretti, K., Grant, R.E., Gaines, B.L., Brightwell, R.: Enabling diverse software stacks on supercomputers using high performance virtual clusters. In: 2017 IEEE International Conference on Cluster Computing (CLUSTER), pp. 310–321 (2017)
5. Heyman, T., Preuveneers, D., Joosen, W.: Scalar: a distributed scalability analysis framework. In: Norman G., Sanders W. (eds.) Quantitative Evaluation of Systems. QEST 2014. Lecture Notes in Computer Science, vol. 8657, Springer, Cham (2014)
6. Peng, C., Rajan, A.: CLTestCheck: measuring test effectiveness for GPU kernels. In: Hähnle R., van der Aalst W. (eds.) Fundamental Approaches to Software Engineering. FASE 2019. Lecture Notes in Computer Science, vol. 11424, Springer, Cham (2019)
7. Barham, P., Dragovic, B., Fraser, K., Hand, S., Harris, T., Ho, A., Neugebauer, R., Pratt, I., Warfield, A.: Xen and the art of virtualization. In: Proceedings of the 19th ACM Symposium on Operating Systems Principles, SOSP 2003, Bolton Landing, USA, Oct. (2003)
8. Netto, M., Calheiros, R., Rodrigues, E., Cunha, R., Buyya, R.: HPC cloud for scientific and business applications: taxonomy, vision, and research challenges. ACM Comput. Surv. 51. https://doi.org/10.1145/315022
9. Benedičič, L., Cruz, F., Madonna, A., Mariotti, K.: Sarus: highly scalable Docker containers for HPC systems. (2019). https://doi.org/10.1007/978-3-030-34356-9_5
10. Gravvanis, G.A., Morrison, J.P., Marinescu, D.C., et al.: Special section: towards high performance computing in the cloud. J. Supercomput. 74, 527–529 (2018). https://doi.org/10.1007/s11227-018-2241-9
11. Cito, J., Ferme, V., Gall, H.: Using Docker containers to improve reproducibility in software and web engineering research. pp. 609–612 (2016). https://doi.org/10.1007/978-3-319-38791-8_58
12. Wang, Y., Evans, R., Huang, L.: Performant container support for HPC applications. pp. 1–6 (2019). https://doi.org/10.1145/3332186.3332226
13. Langholtz, H.J., Marty, A.T., Ball, C.T., Nolan, E.C.: An introduction to resource-allocation behavior. In: Resource-Allocation Behavior, Springer, Boston, MA (2003)
14. Abhishek, M.K., Rao, D.R.: A scalable framework for high-performance computing with cloud. In: Tuba M., Akashe S., Joshi A. (eds.) ICT Systems and Sustainability. Lecture Notes in Networks and Systems, vol. 321. Springer, Singapore (2022). https://doi.org/10.1007/978-981-16-5987-4_24
15. Abhishek, M.K., Rao, D.R.: Dynamic allocation of high-performance computing resources. Int. J. Adv. Trends Comput. Sci. Eng. 9, 3528–3543 (2020). https://doi.org/10.30534/ijatcse/2020/159932020
16. Abhishek, M.K., Rao, D.R.: Framework to secure Docker containers. In: 2021 Fifth World Conference on Smart Trends in Systems Security and Sustainability (WorldS4), pp. 152–156 (2021). https://doi.org/10.1109/WorldS451998.2021.9514041
17. Tao, Y., Wang, X., Xu, X., Chen, Y.: Dynamic resource allocation algorithm for container-based service computing. pp. 61–67 (2017). https://doi.org/10.1109/ISADS.2017.20
18. Arima, E., Schulz, M.: Pattern-aware staging for hybrid memory systems. In: Sadayappan P., Chamberlain B., Juckeland G., Ltaief H. (eds.) High Performance Computing. ISC High Performance 2020. Lecture Notes in Computer Science, vol. 12151, Springer, Cham (2020)
19. Mirzoev, T., Yang, B.: Securing Virtualized Datacenters. Int. J. Eng. Res. Innov. 2(1), springer (2010)
20. Calheiros, R., Ranjan, R., De Rose, C., Buyya, R.: CloudSim: a novel framework for modeling and simulation of cloud computing infrastructures and services (2009)
21. Abhishek, M.: High performance computing using containers in cloud. Int. J. Adv. Trends Comput. Sci. Eng. 9, 5686 (2020). https://doi.org/10.30534/ijatcse/2020/220942020
22. Madhumathi, R., Radhakrishnan, R.: Priority queue scheduling approach for resource allocation in cloud. 15, 472–480 (2016). https://doi.org/10.3923/ajit.2016.472.480
Chapter 34
A Novel Hysynset-Based Topic Modeling Approach for Marathi Language
Prafulla B. Bafna and Jatinderkumar R. Saini
Abstract Topic modeling is an unsupervised technique that represents a corpus through topics which are inherent but hidden. Each topic contains relevant words, and the relevance is calculated using latent features. The topic model is built with latent Dirichlet allocation (LDA), which is constructed on the corpus and is based on latent features. The existing system uses the vector space model for Marathi (VSMM), which does not consider the meaning-based relationships between words. The Hysynset vector space model for Marathi (HSVSMM) is proposed, which uses the hyponym–hypernym relationship along with synonyms to form groups known as Hysynsets. It reduces the dimensions and improves the quality of the topics. The term ‘Hysynset’ is used for the first time in research. Topic models built on VSMM and HSVSMM are compared using several parameters such as entropy, purity and coherence measure. The corpus contains more than 1200 text documents in the form of Marathi stories.
Keywords Coherence measure · Entropy · Hysynset · Latent Dirichlet allocation (LDA) · Marathi · Topic model · Vector space model (VSM) · Synonyms · WordNet
34.1 Introduction
Natural language processing (NLP) has evolved over the last two decades. It is a sub-branch of text mining in which the text data may be present in different languages. India is a diverse country with more than 30 languages, but very few research works explore languages such as Gujarati [1], Punjabi, Marathi and Hindi. As these are official languages of different states, abundant documents are generated in forms such as PDFs and tweets [2–5]. Organizing these documents and retrieving exact information from them is the need of the hour. Text mining techniques such as topic modeling and text clustering help to extract the data.

Several NLP steps [4] need to be performed. Tokenization divides the text into smaller logical units. After tokenization, the part of speech (PoS) [6] of each token can be extracted using a PoS tagger, which labels each word as a noun, pronoun, adjective, punctuation mark, and so on. Stop words [7, 8] and special characters are removed to reduce the noisy features of the text. Lemmatization then converts each word or token into its meaningful base form. For example, for the statement ‘उंदराला वाटले याची एक छानशी टोपी शिवावी!’, tokenization separates the text into eight tokens, including ‘!’. The PoS tagger identifies ‘टोपी’ as a noun. ‘एक’, being a number, is not considered in the next NLP step. Similarly, ‘याची’, being a pronoun, is ignored. In the lemmatization step, ‘छानशी’ is converted to ‘छान’, which is termed a lemma.

The TF-IDF [9] weight is a statistical measure that evaluates the significance of a word with respect to the corpus. The significance of a term increases with its frequency in a document and is offset by the frequency of the word across the text corpus. The TF-IDF value is compared with a specific threshold to decide whether a term is significant. Different types of relationships between words, such as synonyms, hypernyms and hyponyms, are reflected in WordNet. A hypernym and a hyponym are related as superordinate and subordinate. For example, प्राणी (animal) is a hypernym and उंदीर (rat) is a hyponym; that is, the hyponym is a type of the hypernym.

There are two steps in implementing a vector space model: first, text documents are represented as word vectors; second, the vectors are converted into numerical form. The vector space model for Marathi (VSMM) is built on a Marathi corpus of stories and poems. Different techniques, such as LDA [10], can be applied to this numeric form. LDA is one of the topic modeling methods. A document is made up of different words, and a topic is represented by various words. LDA finds the topics, and the words representing them, in the corpus using latent features. Topic coherence is used to evaluate the topic model intrinsically. It is a statistical score of a topic obtained by measuring the semantic similarity among the top-scoring words in the topic [11–15].

Topics can be represented visually using clusters. Hierarchical agglomerative clustering (HAC) is one way to group similar objects. The groups are called clusters; when the objects are documents, the clusters are document clusters. These clusters are evaluated using parameters such as entropy and purity. Entropy measures the disorder in the clustering, so lower entropy indicates better clustering. Purity measures the proportion of correctly classified data points, so greater purity indicates better clustering.

P. B. Bafna · J. R. Saini (corresponding author), Symbiosis Institute of Computer Studies and Research, Symbiosis International (Deemed University), Pune, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_34
This research is unique because:
1. A vector space model using Hysynset is built.
2. The performance of both topic models is evaluated.
3. The topics retrieved by the existing and proposed models are compared using several parameters.
The rest of the paper is organized as follows. Section 34.2 reviews the literature. Section 34.3 describes the research methodology. Results and discussions are detailed in Sect. 34.4, and the conclusion is presented in Sect. 34.5.
34.2 Literature Review
A question-answering (QA) framework is proposed in which, when a question is received as text input, its readability level is determined from incorrect spelling, grammar errors or slang terms. The candidate answers are retrieved using the readability index [16].

Hate speech is detected automatically using supervised machine learning techniques. Features such as bag of words and character-level representations are used, and classification performance is evaluated. Character-level embedding works better than token-level embedding. A slur list, combined with other features, is used as a lexical resource to give accurate results. Dependency parsers, or features that model imperatives and politeness, are also used. Such linguistic features and linguistic knowledge are useful for identifying subtypes of hate speech. Evaluation is sometimes difficult because it depends on the availability of specific data sets [17].

An NLP software framework is developed to compare different algorithms on ease of use and scalability. The framework uses the document streaming concept and processes all corpora serially. Topic analysis is carried out using latent Dirichlet allocation and latent semantic analysis. The corpus size of the training dataset does not affect the performance of the algorithm. The framework can easily be extended by other researchers. An existing digital library is the input to the algorithm [18].

There are two main challenges in using a topic model in a qualitative study. The first is to assess quality with metrics that agree with human judgment, and the second is to identify subtopics of qualitative data. To address the first, TF-IDF coherence is used as a quality metric; it is closer to human judgment than regular coherence. For the second, an interval-based semi-supervised technique known as ISLDA is developed, in which specific groups of keywords predefined by the researchers are assigned to the topics. A case study is presented to show the benefit of ISLDA and of assigning predefined keyword groups [19].

A generative topic model is developed using latent Dirichlet allocation. It is an iterative process that integrates documents in a generative manner and uses their temporal ordering. The documents are divided into time segments. Each segment discovers certain topics, which are carried over into the succeeding time segments to influence topic discovery there. To demonstrate the quality of the topics, documents from the CiteSeer repository are input to the segmented model to discover distinct topics. The model, S-ATM, detects topic-word and author-topic distributions for scientific research [20].
34.3 Research Methodology
The experimental setup used to extract topics from the corpus is described in this section. The corpus is preprocessed, and the TF-IDF value of each token is then computed. VSMM [5] and HSVSMM, which is based on Hysynset, are constructed. The concept of Hysynset is used for the first time in research. It includes hypernym–hyponym and synonym relationships. In Fig. 34.1, ‘प्राणी’ and ‘वनचर’ are synonyms, whereas ‘सिंह’ and ‘प्राणी’ are related as hyponym and hypernym. The entire group is called a Hysynset. If the TF-IDF weights of प्राणी, वनचर and सिंह are 0.1, 0.5 and 0.2, respectively, the feature weight of the entire group (Hysynset) is 0.1 + 0.5 + 0.2 = 0.8. LDA is built on both matrices, VSMM and HSVSMM, and the two are compared using parameters such as topic coherence, number of topics, dendrogram structure and cluster validation measures. Figure 34.2 shows the stepwise flow of the experimental analysis.
Fig. 34.1 Hysynset
Fig. 34.2 Steps in research methodology
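The Hysynset feature weight described above, where the TF-IDF weights of related terms are summed, can be sketched in Python as follows. This is an illustrative sketch only: the function name hysynset_weights, the representation of a group as a frozenset, and the extra term ‘टोपी’ with weight 0.3 are assumptions, and the groups are supplied by hand here rather than looked up in WordNet.

```python
def hysynset_weights(tfidf, groups):
    """Sum the TF-IDF weights of terms that share a Hysynset group.

    Terms that belong to no group remain individual features.
    """
    term_to_group = {term: frozenset(group) for group in groups for term in group}
    weights = {}
    for term, weight in tfidf.items():
        key = term_to_group.get(term, frozenset([term]))
        weights[key] = weights.get(key, 0.0) + weight
    return weights


# The first three weights follow the worked example in the text;
# the fourth term and its weight are hypothetical.
tfidf = {"प्राणी": 0.1, "वनचर": 0.5, "सिंह": 0.2, "टोपी": 0.3}
groups = [{"प्राणी", "वनचर", "सिंह"}]   # one Hysynset: synonyms plus hyponym

features = hysynset_weights(tfidf, groups)
animal_group = frozenset(groups[0])
print(round(features[animal_group], 2))
```

Four term-level dimensions collapse into two features here, which is the dimension reduction that the grouping is meant to achieve.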
Marathi text in the form of stories is processed using the udpipe library (http://cran.r-project.org/web/packages/udpipe/vignettes/udpipe-annotation.htm). The corpus is created using R, an open-source programming language, and udpipe provides the language model. The code first downloads the model, then loads it, and finally calls it. The commands used are model <- udpipe_download_model(language = 'marathi-ufal') and udmodel <- udpipe_load_model(file = 'marathi-ufal-ud-2.4-190531.udpipe'). Corpus preprocessing operations such as lemmatization and PoS tagging are supported by the same library. To annotate the corpus, udmodel is passed along with the preprocessed corpus x to the command udpipe_annotate(udmodel, x). The steps of the research methodology are as follows.
34.3.1 Data Collection and Corpus Construction
A total of 1256 stories are downloaded and collected from various websites (www.matrubharti.com/novels/marathi, https://marathi.pratilipi.com/marathi-short-storiespdf-free-download, https://www.hindujagruti.org/hinduism-for-kids-marathi/category/marathi-katha/marathi-stories). In total, 12 MB of data is processed. The data needs to be stored in UTF-8 encoding to prepare the corpus.
34.3.2 Preprocessing Corpus
Corpus preprocessing starts with tokenization, which yields 1,56,365 tokens in total. The udpipe library provides a PoS tagger that identifies all parts of speech, numerals, special characters, and so on. Adjectives, nouns, adverbs and verbs are retained, which automatically removes noise. The number of tokens at this stage is 85,145. Lemmatization, again using udpipe, generates 47,109 unique lemmas.
34.3.3 Constructing VSMM and HSVSMM
VSMM is constructed from the terms that are significant according to their TF-IDF score. HSVSMM is constructed from Hysynset groups. Similar terms are grouped, and the group is termed a Hysynset because the similar terms in it include hyponyms, hypernyms and synonyms. The TF-IDF scores of all similar terms in one Hysynset are added, and the sum is called the Hysynset frequency. The terms and Hysynset groups above the threshold are used to construct HSVSMM and are termed the features of HSVSMM. The threshold is set at 75% of the highest frequency.
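The scoring and thresholding steps above can be illustrated with a short Python sketch. The paper does not state which TF-IDF variant is used, so the formula below (relative term frequency multiplied by the natural logarithm of the inverse document frequency) is an assumption, as are the function names and the toy data; the 75% cut-off follows the text.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def significant_features(feature_weights, fraction=0.75):
    """Keep features whose weight exceeds `fraction` of the highest weight."""
    cutoff = fraction * max(feature_weights.values())
    return {f: w for f, w in feature_weights.items() if w > cutoff}


docs = [["वाघ", "जंगल"], ["वाघ", "प्राणी"]]      # two toy lemmatized documents
per_doc = tfidf_weights(docs)                    # 'वाघ' occurs everywhere, so it scores 0

toy_weights = {"f1": 1.0, "f2": 0.8, "f3": 0.5}  # hypothetical feature weights
selected = significant_features(toy_weights)     # cut-off is 0.75 * 1.0
print(sorted(selected))
```

A term that appears in every document receives a weight of zero under this variant, which matches the idea that corpus-wide frequency offsets the significance of a term.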
34.3.4 Intrinsic Evaluation of Model
Both matrices, VSMM and HSVSMM, are used to construct LDA models. The coherence score is calculated as an intrinsic evaluation for each matrix. As these matrices represent the corpus, the coherence score indicates topic independence. Four topics are discovered using VSMM and three using HSVSMM.
34.3.5 Constructing the Unsupervised Model
To assign a topic to every document, both VSMM and HSVSMM are input to the model. These feature vectors are used to calculate probabilities: the degree of association of each feature with every topic is computed, and the top N informative words are generated for each topic from these probabilities. The topic features obtained from the two matrices differ, so the topic assignment of the documents also varies. Three topics are derived using HSVSMM and four topics using VSMM.
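Once a topic model has produced its probabilities, selecting the top N words per topic and assigning each document to a topic reduce to simple ranking steps. The Python sketch below illustrates this under assumptions: the probabilities are invented for the example and are not results from the paper, the function names are hypothetical, and assigning a document to its single most probable topic is one common convention that the paper does not spell out.

```python
def top_words(topic_word_prob, n=3):
    """Return the n most probable words of each topic."""
    return {
        topic: sorted(words, key=words.get, reverse=True)[:n]
        for topic, words in topic_word_prob.items()
    }


def assign_topics(doc_topic_prob):
    """Assign each document to its most probable topic."""
    return {doc: max(probs, key=probs.get) for doc, probs in doc_topic_prob.items()}


# Hypothetical probabilities, not results from the paper.
topic_word_prob = {
    "topic1": {"प्रेम": 0.40, "हृदय": 0.30, "राजा": 0.05},
    "topic2": {"राजा": 0.50, "सिंह": 0.35, "प्रेम": 0.02},
}
doc_topic_prob = {
    "S1": {"topic1": 0.8, "topic2": 0.2},
    "S2": {"topic1": 0.1, "topic2": 0.9},
}

words = top_words(topic_word_prob, n=2)
assignment = assign_topics(doc_topic_prob)
print(words, assignment)
```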
34.3.6 Visualization and Validation of Results
A dendrogram is generated for both topic models using hierarchical agglomerative clustering (HAC). The clusters generated by HAC are evaluated using entropy and purity. The evaluation is performed with functions provided by the entropy library in R, on data sets of different sizes, and entropy and purity are calculated for both dendrograms. Both cluster evaluation measures show improved values for the topic model built on HSVSMM for all data sets. This indicates better topic coherence and more similar features retrieved with the proposed approach, HSVSMM.
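The two validation measures can be sketched in Python as follows. The paper computes them in R and does not give formulas, so this sketch assumes the standard definitions: purity is the fraction of documents that carry the majority label of their cluster, and entropy is the size-weighted average of the label entropy within each cluster (base 2). Both need reference labels for the documents; the labels and function names below are hypothetical.

```python
import math
from collections import Counter, defaultdict


def _members_by_cluster(cluster_ids, true_labels):
    """Collect the reference labels of the documents in each cluster."""
    groups = defaultdict(list)
    for cluster, label in zip(cluster_ids, true_labels):
        groups[cluster].append(label)
    return groups


def purity(cluster_ids, true_labels):
    """Fraction of documents matching the majority label of their cluster."""
    groups = _members_by_cluster(cluster_ids, true_labels)
    majority_total = sum(max(Counter(g).values()) for g in groups.values())
    return majority_total / len(true_labels)


def entropy(cluster_ids, true_labels):
    """Size-weighted average of the per-cluster label entropy (base 2)."""
    groups = _members_by_cluster(cluster_ids, true_labels)
    total = len(true_labels)
    score = 0.0
    for members in groups.values():
        size = len(members)
        h = -sum((c / size) * math.log2(c / size) for c in Counter(members).values())
        score += (size / total) * h
    return score


clusters = [0, 0, 0, 1, 1, 1]                       # toy cluster assignment
labels = ["love", "love", "animal", "animal", "animal", "animal"]
p = purity(clusters, labels)
e = entropy(clusters, labels)
print(round(p, 3), round(e, 3))
```

Under these definitions a perfect clustering has purity 1 and entropy 0, which is why higher purity and lower entropy are read as better clustering.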
34.4 Results and Discussions
Sample sentences from the stories, together with the output of each NLP step, are presented in Table 34.1. The corpus contains 32,183 sentences in total. The Sentence column shows a sample sentence from the stories, and the Tokenization column lists the terms obtained after tokenization. The Noise removal column lists the terms with the required PoS, such as adjectives and adverbs, so stop words and punctuation are automatically discarded. For example, ‘एका’, ‘जंगलात’, ‘एक’, ‘वाघ’ and ‘राहत’ are obtained after tokenization. ‘एका’, being a numeric value, is ignored; stop words and punctuation are ignored in the same way. The Unique lemmas column lists the words obtained after lemmatization, that is, all unique lemmas. Lemmatization converts ‘वाघाने’ into ‘वाघ’. Hysynsets are formed using the relationships between lemmas. WordNet (http://www.cfilt.iitb.ac.in/wordnet) is used to identify the relationships between lemmas or words. ‘वाघ’ and ‘प्राणी’ are related as hyponym and hypernym, so they are grouped. Similarly, tokens that are synonyms of each other are grouped. This in turn reduces the dimensions. Each group or token is termed a feature of Hysynset. The existing approach to building VSMM does not include the Hysynset formation step.

The 32,183 sentences contain 1,56,365 tokens. Removing stop words leaves 85,145 tokens, and lemmatization yields 47,109 lemmas. Because of multiple synonyms and hyponym–hypernym pairs, the final number of features in the form of Hysynsets is 33,121. For each Hysynset feature, a TF-IDF weight is calculated. If a feature has two or more tokens, their TF-IDF weights are added and the sum is termed the feature weight. Table 34.2 shows the TF-IDF weight of each extracted token. Hysynsets are then formed and their feature weights calculated. In Table 34.2, ‘गावकरी’ and ‘लोक’ are treated separately and their TF-IDF weights are 0.1 and 0.4, respectively. In Table 34.3, these two weights are added and the resultant weight is 0.1 + 0.4 = 0.5, which clearly shows the dimension reduction.

The significant features are chosen using a threshold value. The features with a feature weight of more than 0.75 are significant and are depicted in Table 34.4. There are 4,892 significant features. The term ‘विचार’ has the lowest feature weight, 0.71. Every story is represented as a feature vector built using Hysynset. There are 1256 stories in total. The significant features, such as ‘आळशी’, are represented in columns. The feature weight of {गावकरी, लोक} with respect to story 1 is 0.97. In the Hysynset-based vector space, story 1 is represented as {0.97, 0.65, …, 0.21}. Table 34.5 represents HSVSMM using the significant features.

Table 34.1 NLP steps on corpus sentences
S. No. | Sentence | Tokenization | Noise removal | Unique lemmas | Hysynset
1 | एकाजंगलात कल्पना सचली | एका, जंगलात, एक, वाघ, राहत,केला, विचारझकास.. कल्पना | जंगलात, वाघाने,. | जंगल, वाघ, प्राणी,.. विचार, | वाघ, प्राणी, विचार, कल्पना

Table 34.2 Term and TF-IDF count
Term | दंड | चर्चा | … | गावकरी | लोक
TF-IDF weight | 0.02 | 0.05 | … | 0.1 | 0.4

Table 34.3 Features and their weights
Term | दंड | चर्चा | … | {गावकरी, लोक}
TF-IDF weight | 0.02 | 0.05 | … | 0.5

Table 34.4 Significant term and TF-IDF count
Significant term | गावकरी, लोक | वाघ, प्राणी | झकास | विचार
TF-IDF weight | 0.98 | 0.83 | 0.77 | 0.71

Table 34.5 Features of HSVSMM
Stories | गावकरी, लोक | आळशी | … | प्रेम
S1 | 0.97 | 0.65 | … | 0.21
S2 | 0.56 | 0.78 | … | 0.31
… | … | … | … | …
S1256 | 0.54 | 0.32 | … | 0.11
Two LDA models are built, and the coherence score is calculated for both. Figures 34.3 and 34.4 show the coherence scores for different numbers of topics extracted using HSVSMM and VSMM, respectively. In both figures, the X-axis shows the number of topics and the Y-axis shows the coherence constant. The number of topics retrieved using HSVSMM and VSMM is 3 and 4, respectively. The maximum coherence score observed is 0.6 using VSMM and 0.9 using HSVSMM (Fig. 34.5).
Fig. 34.3 Coherence constant using HSVSMM (x-axis: number of topics; y-axis: coherence constant)
Fig. 34.4 Coherence constant for different number of topics using VSMM (x-axis: number of topics; y-axis: coherence constant)
Fig. 34.5 β constant values of features of topic 3 by HSVSMM
Table 34.6 Topics and their representative words using HSVSMM
Topic 1 Premkatha | Topic 2 Tenalikatha | Topic 3 Panchatantra
आवड | तेनाली | व्यवहार, धोरण
प्रेम, मिठी | संतुष्ट, आनंद | चतुरधूर्त
सुंदर | कृष्णदेवराय | राजा
हृदय | स्तुती | प्राणी, सिंह

Table 34.7 Topics retrieved using VSMM
Topic 1 sensitive stories | Topic 2 Tenali stories | Topic 3 animal stories | Topic 4 animal stories
नात | तेनाली | वन | व्यवहार
मन | संतुष्ट | सिंह | आनंद
सुंदर | कृष्णदेवराय | राजा | आवड
हृदय | स्तुती | प्राणी | तज्ञ
Table 34.6 shows the topics and their corresponding words using HSVSMM. Table 34.7 presents the topics and their features using VSMM. The difference between the words detected by the two models can be understood using pragmatic knowledge. For example, the features of topic 3 derived using HSVSMM indicate, through the Hysynset {व्यवहार, धोरण}, that values of life are taught through animal stories. This suggests that these could be ‘Panchatantra’ stories. In contrast, topic 3 and topic 4 derived using VSMM do not point to a specific class of stories. The words generated using HSVSMM are more relevant to the topic and more informative than those of VSMM. The topics generated by HSVSMM are semantically interpretable and can easily be titled ‘Romantic’, ‘Tenali Raman’ and ‘Panchatantra’. The topics detected by VSMM can be categorized as ‘Touching stories’, ‘Tenali Raman’, ‘Animal’ and ‘General’. Figure 34.5 shows the β constant of the words that represent a specific topic, namely the features of the topic ‘Panchatantra’ retrieved by HSVSMM. The β constant of the Hysynset {प्राणी, सिंह} is 0.5. The features of topic 4, ‘General stories’, extracted using VSMM, and their β constant values are shown in Table 34.8. HSVSMM gives features a stronger membership value with respect to the topic. For example, the highest β constant among the VSMM features of topic 4, that of the token ‘व्यवहार’, is 0.3.
Table 34.8 Tokens and β constant values
Tokens | β constant
व्यवहार | 0.3
आनंद | 0.1
आवड | 0.09
तज्ञ | 0.08
Fig. 34.6 Dendrogram representing topic wise clusters
Fig. 34.7 Entropy and purity comparison using VSMM and HSVSMM (x-axis: number of documents, 25 to 1256; y-axis: 0 to 1; series: VSMM Entropy, VSMM Purity, HSVSMM Entropy, HSVSMM Purity)
A dendrogram is presented to visualize which documents belong to the same topic and are similar. HAC is used for clustering. Each cluster, shown in a box, represents one topic, and the documents in a cluster are the most similar to one another. Figure 34.6 shows the three topics retrieved by HSVSMM as clusters in the dendrogram. For example, topic 2 contains the stories with ids 23, 3, and 15. For clear visualization, only 26 documents are used to build the dendrogram. Entropy and purity are used to validate the clusters. Both measures are calculated for the HAC results obtained with VSMM and with HSVSMM, on incrementally larger sets of stories. Entropy and purity improve with HSVSMM for all dataset sizes, which confirms that the HSVSMM method yields better-separated topics and more similar words within each topic. Figure 34.7 shows these improved values clearly, and the improvement is consistent across dataset sizes.
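The two validation measures can be illustrated with a minimal sketch. It assumes the commonly used definitions (purity as the share of documents carrying the majority label of their cluster, entropy as the size-weighted label entropy per cluster); the chapter does not state its exact formulas, and the cluster ids and labels below are hypothetical.

```python
from collections import Counter
from math import log2


def _group_labels(clusters, labels):
    """Collect the true labels of the documents assigned to each cluster."""
    groups = {}
    for cluster_id, label in zip(clusters, labels):
        groups.setdefault(cluster_id, []).append(label)
    return groups


def purity(clusters, labels):
    """Fraction of documents that carry the majority label of their cluster."""
    groups = _group_labels(clusters, labels)
    majority = sum(Counter(g).most_common(1)[0][1] for g in groups.values())
    return majority / len(labels)


def entropy(clusters, labels):
    """Size-weighted average of the label entropy inside each cluster, in bits."""
    total = len(labels)
    result = 0.0
    for group in _group_labels(clusters, labels).values():
        n = len(group)
        h = -sum((k / n) * log2(k / n) for k in Counter(group).values())
        result += (n / total) * h
    return result


# Hypothetical toy data: six stories assigned to two clusters.
cluster_ids = [0, 0, 0, 1, 1, 1]
true_topics = ["Tenali", "Tenali", "Romantic",
               "Panchatantra", "Panchatantra", "Panchatantra"]
print(purity(cluster_ids, true_topics), entropy(cluster_ids, true_topics))
```

Under these definitions, higher purity and lower entropy both indicate cleaner clusters.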
34.5 Conclusions
VSMM is extended to incorporate semantic relationships using Hysynset, a novel concept used here for the first time. LDA is implemented using HSVSMM, which not only reduces dimensions but also produces semantically interpretable topics. Two types of evaluation are carried out: intrinsic evaluation of the topic model using the coherence score, and entropy and purity to measure the quality of the clusters representing topics. The coherence score is maximum at four topics for VSMM and at three topics for HSVSMM. Hierarchical clustering is
applied to the corpus of 1256 text stories to produce clusters, each of which represents a topic. HSVSMM produces better entropy and purity when tested on varied data sizes. The topics derived by HSVSMM are ‘Romantic’, ‘Tenali Raman’, and ‘Panchatantra’. The proposed approach is useful for deriving the hidden topics or domains of a corpus in an unsupervised way.
Chapter 35
Identification of Malayalam Stop-Words, Stop-Stems and Stop-Lemmas Using NLP
Sarath Kumar, Jatinderkumar R. Saini, and Prafulla B. Bafna
Abstract For the retrieval of information from different sources and formats, pre-processing of the collected information is one of the most important tasks, and stop-word elimination is part of this pre-processing phase. This paper presents, for the first time, lists of stop-words, stop-stems and stop-lemmas for the Malayalam language of India. Initially, a corpus of the Malayalam language was created. The corpus contained more than 2.1 million words, of which approximately 0.33 million were unique. This was processed to yield a total of 153 stop-words. Stemming was possible for 20 words, and lemmatization could be done for only 25 words. The final refined stop-word list consists of 123 stop-words. Malayalam is widely spoken by people living in India and in many other parts of the world. The results presented here are expected to be useful for any NLP task involving this language.
Keywords Malayalam · Stop-words · Stemming · Lemmatization · Natural language processing (NLP)
35.1 Introduction
Malayalam is one of India’s 22 scheduled languages. It is the official language of the state of Kerala and of many parts of the Lakshadweep and Puducherry Union Territories. It is spoken by more than 34 million people around the world [1]. The Malayalam alphabet consists of 37 consonant letters and 15 vowel letters [1]. A sample list of Malayalam consonants and vowels along with their phonetics is presented in Table 35.1.
S. Kumar · J. R. Saini (B) · P. B. Bafna
Symbiosis Institute of Computer Studies and Research, Symbiosis International (Deemed University), Pune, India
e-mail: [email protected]
P. B. Bafna
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_35
Table 35.1 Malayalam consonants [1]
In natural language processing (NLP), elimination of stop-words is a very common step in the data pre-processing phase. This phase consists of the identification of words, phrases and sentences, removal of stop-words, stemming, lemmatization, etc. Stop-words are the most common words used in a natural language and carry little semantic content. Like any other language, Malayalam contains stop-words. Stemming is a process in which the suffix of a word is removed so as to reduce the word to its root form. For example, “walking” has the suffix “-ing”; if we remove the suffix, the root word becomes “walk” [2]. Lemmatization is a process in which words derived from one another are mapped to a central word, especially if they have the same core meaning. The root word attained after this process is known as the lemma. For example, “runs”, “running” and “ran” are all forms of the word “run”, so “run” is the root word or lemma [2]. These pre-processing steps improve the performance of the system and reduce the search time.
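The difference between the two operations can be seen in a minimal sketch on the English examples above. Both the suffix list and the lemma table are hypothetical toy data chosen for illustration; they are not the procedure used for Malayalam in this chapter, where stemming and lemmatization were done manually.

```python
# Toy suffix list and lemma table (illustrative assumptions only).
SUFFIXES = ("ing", "ed", "s")
LEMMAS = {"runs": "run", "running": "run", "ran": "run"}


def stem(word, suffixes=SUFFIXES):
    """Strip the first matching suffix, keeping a stem of at least 3 letters."""
    for suffix in suffixes:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def lemmatize(word, table=LEMMAS):
    """Look the word up in a lemma table; unknown words are returned as-is."""
    return table.get(word, word)


for w in ("walking", "runs", "running", "ran"):
    print(w, stem(w), lemmatize(w))
```

Plain suffix stripping leaves “running” as “runn” and cannot relate “ran” to “run”, whereas the table lookup maps all three forms to the same lemma. This is why a lemmatization pass can still merge entries after stemming.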
35.2 Literature Review
Truica et al. [3] used stop-words as well as diacritics for automatic language identification among Romance languages. Na and Xu [4] proposed the generation of a stop-word list from a corpus of Chinese patents. Rakholia and Saini [5] presented an approach based on lexical classes (parts-of-speech tags) to construct and categorize the stop-words list for the Gujarati language. Khan et al. [6] used Zipf’s law of two-factor dependency to identify stop-words for the Urdu language. Haque et al. [7] proposed a corpus-based technique for the elimination of stop-words in the Bengali language. Pimpalshende and Mahajan [8] used the concept of finite automata to remove stop-words from Devanagari text documents. Manalu [9] conducted a comparative study of automatic review summarization with and without stop-words using TextRank. Amarasinghe et al. [10] proposed a methodology for selecting optimal domain-specific stop-words for improved accuracy in text mining. Popova and Skitalinskaya [11] proposed a methodology based on key-phrase extraction from short texts to create an extended list of stop-words.
Sharma and Mittal [12] proposed an algorithm that compares different translation-based techniques for experimental analysis of Hindi–English cross-lingual information retrieval to create a refined stop-words list. Namly et al. [13] presented an approach that combines the stop-words list with a machine learning-based algorithm to get the most appropriate stop-words. Rani and Lobiyal [14] constructed an unbiased domain-specific list of stop-words in the Hindi language using Borda’s rule, along with its application in text mining. Kaur and Saini [15] presented the list of stop-words in the Punjabi language with its parts-of-speech-based classification and its Gurmukhi and Shahmukhi script versions. Khalifa and Rayner [16] proposed an aggregation technique that uses different approaches to automatically construct a general list of Malay stop-words. Mathews and Abraham [17] presented and implemented the pre-processing phases in sentiment analysis for the Malayalam language to achieve meaningful and more precise results. Miretie and Khedkar [18] proposed an aggregation-based methodology to identify stop-words in the Amharic language. Kaur and Saini [19] presented an approach to the categorization of Gurmukhi stemmed stop-words based on parts-of-speech word classes. Kaur and Saini [20], in another work, identified stop-words in the Punjabi language with an approach based on natural language processing. Saini and Rakholia [21] carried out a comprehensive analysis of the stop-words lists of international languages based on statistical measures. Rakholia and Saini [22] proposed a rule-based approach focusing on automatic and dynamic identification of a list of stop-words. Raulji and Saini [23] presented and implemented a stop-words removal algorithm for the Sanskrit language. Raulji and Saini [24] proposed an approach to generate a list of stop-words in the Sanskrit language based on a hybrid methodology.
Raulji and Saini [25] proposed a methodology to evaluate stop-words in Sanskrit using a rule-based morphological analyzer and then retrieve stop-words in the Gujarati language from it. Venugopal et al. [26] presented an approach to generate a comprehensive stop-lemma list. Alajmi et al. [27] proposed the generation of a stop-words list for the Arabic language. The motivation for this research is that the literature review revealed no research paper that explicitly provides a table or list of Malayalam stop-words; we therefore work in this direction. The rest of the paper is structured as follows. The next section describes the methodology, which is followed by the results, and finally we conclude the paper with conclusions and some directions for future work.
35.3 Research Methodology
In this section, we present the detailed methodology for creating a stop-words list for the Malayalam language, including the stemmed and lemmatized lists of stop-words. The proposed approach is depicted pictorially in the research methodology diagram (Fig. 35.1).
Fig. 35.1 Diagrammatic representation of research methodology
35.3.1 Data Collection
Malayalam texts in different formats, such as news reports [28, 29], mythological scripts [30], stories [31] and poems [32], were collected to create a corpus. The size of the corpus was 15.3 megabytes (MB), and it contained a total of 2,104,638 words.
35.3.2 Stop-Words List Generation
A Python script was used to process and sort the data. The output was generated as an Excel file with two columns, “Words” and “Frequency”. The “Words” column contains the unique words from the data, and the “Frequency” column contains the number of times each unique word occurred. The Excel file contains a total of 328,025 unique words, and its size was 16 megabytes (MB). The list was then sorted by frequency. The Python code used for pre-processing and for extracting the unique Malayalam words along with their frequencies is shown in Fig. 35.2. Once the frequencies were known, a cut-off value, called the threshold value, was needed to cover the important stop-words. To find this threshold value, the list was shown to 12 native speakers of Malayalam; this number was determined by the availability of speakers. Because native speakers know which words are stop-words and which are not, each was asked to give the threshold value that they felt should be used. Their answers are given in Table 35.2. The average of these 12 values, 164, was taken as the final threshold value.
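The counting and sorting step can be sketched as follows. This is not the authors' code from Fig. 35.2, which is not reproduced in the text; it is a minimal illustration that assumes words are maximal runs of characters from the Malayalam Unicode block (U+0D00 to U+0D7F) plus the zero-width joiners that can occur inside words. The sample sentence is hypothetical.

```python
import re
from collections import Counter

# Assumption: a word is a run of Malayalam-block characters, optionally
# containing zero-width non-joiner/joiner characters (U+200C, U+200D).
MALAYALAM_WORD = re.compile(r"[\u0D00-\u0D7F\u200C\u200D]+")


def word_frequencies(text):
    """Return (word, frequency) pairs sorted by descending frequency."""
    counts = Counter(MALAYALAM_WORD.findall(text))
    # Ties are broken alphabetically so that the order is deterministic.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


sample = "ഒരു വീട്, ഒരു മരം. ഈ വീട് ഒരു കഥ"
for word, frequency in word_frequencies(sample):
    print(word, frequency)
```

The resulting pairs correspond to the “Words” and “Frequency” columns described above and could be written to a spreadsheet with any suitable library.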
Fig. 35.2 Screenshot of Python code
Table 35.2 Average of threshold value
Speakers    Threshold value
1    170
2    165
3    166
4    158
5    162
6    164
7    161
8    167
9    164
10    161
11    166
12    164
Total    1968
Average    164
After applying the threshold value, we found that the list still contained some non-stop-words, which had to be removed manually. After the removal of those words, the list contained 153 stop-words.
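The thresholding step can be sketched as below. The speaker values are those of Table 35.2. The chapter does not say whether the threshold is a rank in the frequency-sorted list or a frequency count; the sketch assumes a rank cut-off, and the function name and toy word list are hypothetical.

```python
from statistics import mean

# Threshold values reported by the 12 native speakers (Table 35.2).
speaker_thresholds = [170, 165, 166, 158, 162, 164, 161, 167, 164, 161, 166, 164]
threshold = round(mean(speaker_thresholds))  # 1968 / 12


def candidate_stop_words(ranked_words, cutoff, rejected=()):
    """Keep the top `cutoff` words of a frequency-ranked list, then drop
    the words that were manually judged not to be stop-words."""
    rejected = set(rejected)
    return [word for word in ranked_words[:cutoff] if word not in rejected]


# Toy usage with placeholder words instead of the real ranked list.
ranked = ["w1", "w2", "w3", "w4", "w5", "w6"]
print(threshold, candidate_stop_words(ranked, cutoff=4, rejected={"w2"}))
```

The `rejected` argument stands in for the manual review described above, which remains a human judgment rather than something the code decides.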
35.3.3 Stemming
After the list of stop-words was created, we categorized the words based on their semantics and found that, out of the 153 stop-words, 34 were synonyms. This is an additional piece of information that we provide about the stop-words. A sample of these synonyms is presented in Table 35.3, and a sample of the stemmed list in Table 35.4.
Table 35.3 Categorizing words based on their semantics
Table 35.4 Lemmatized stop-words list
35.3.4 Lemmatization
After stemming, we found that a few words still shared a common lemma, so we processed the list for lemmatization to refine it further. In the refined list of stop-words, we found 25 words that could be lemmatized. The lemmatization was done manually. A sample of the lemmatized stop-words is presented in Table 35.4, and the revised stop-words are presented in Table 35.5, which shows a sample of the stop-words out of 80.
Table 35.5 Revised stop-words list
The most frequently used consonants from the lists of stop-words and non-stop-words are shown in Table 35.6. The most frequently used vowels from the lists of stop-words and non-stop-words are shown in Table 35.7. The top 5 most frequently used stop-words along with their frequencies are shown in Fig. 35.3.
Table 35.6 Most frequently used consonants from the list of stop-words and non-stop-words
Table 35.7 Most frequently used vowels from the list of stop-words and non-stop-words
Fig. 35.3 Top 5 most used stop-words and their frequency from the Corpus
35.4 Results and Discussions
Stop-words are the most commonly used words in a natural language and usually do not add much value to the semantics of a sentence or document. We found no publicly available list of stop-words for the Malayalam language. In this paper, we have presented a list of 123 stop-words for the Malayalam language, along with the lists of lemmatized and stemmed stop-words. We also compared the 100 most commonly used Malayalam words found through Google with the most common Malayalam words in the corpus. In addition, we identified the most frequently used consonants and vowels in the lists of stop-words and non-stop-words (Tables 35.6 and 35.7) and the most frequently used stop-words (Fig. 35.3). We hope that this list of stop-words and the other findings will help researchers working with Malayalam documents.
35.5 Conclusions and Future Work
Stop-words are the most commonly used words in a natural language and usually do not add much value to the semantics of a sentence or document. We found no publicly available list of stop-words for the Malayalam language. In this paper, we have presented a list of 123 stop-words for the Malayalam language, along with the lists of lemmatized and stemmed stop-words. We also compared the 100 most commonly used Malayalam words found through Google with the most common Malayalam words in the corpus.
We also identified the most frequently used consonants and vowels in the lists of stop-words and non-stop-words (Tables 35.6 and 35.7) and the most frequently used stop-words (Fig. 35.3). We hope that this list of stop-words and the other findings will help researchers working with Malayalam documents. In future work, we plan to carry out performance evaluation, such as time complexity analysis.
References
1. Available: https://en.wikipedia.org/wiki/Malayalam. Accessed: 10 March 2021
2. Available: https://www.datacamp.com/community/tutorials/stemming-lemmatization-python. Accessed: 10 March 2021
3. Truica, C.O., Velcin, J., Boicea, A.: Automatic language identification for romance languages using stop words and diacritics. In: 2015 17th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC), pp. 243–246. IEEE (2015)
4. Na, D., Xu, C.: Automatically generation and evaluation of stop words list for Chinese patents. Telkomnika 13(4), 1414 (2015)
5. Rakholia, R.M., Saini, J.R.: Lexical classes based stop words categorization for Gujarati language. In: 2016 2nd International Conference on Advances in Computing, Communication, & Automation (ICACCA) (Fall), pp. 1–5. IEEE (2016)
6. Khan, N., Bakht, M.P., Khan, M.J., Samad, A., Sahar, G.: Spotting Urdu stop words by Zipf’s statistical approach. In: 2019 13th International Conference on Mathematics, Actuarial Science, Computer Science and Statistics (MACS), pp. 1–5. IEEE (2019)
7. ul Haque, R., Mehera, P., Mridha, M.F., Hamid, M.A.: A complete Bengali stop word detection mechanism. In: 2019 Joint 8th International Conference on Informatics, Electronics & Vision (ICIEV) and 2019 3rd International Conference on Imaging, Vision & Pattern Recognition (icIVPR), pp. 103–107. IEEE (2019)
8. Pimpalshende, A., Mahajan, A.R.: Test model for stop word removal of devnagari text documents based on finite automata. In: 2017 IEEE International Conference on Power, Control, Signals and Instrumentation Engineering (ICPCSI), pp. 672–674. IEEE (2017)
9. Manalu, S.R.: Stop words in review summarization using TextRank. In: 2017 14th International Conference on Electrical Engineering/Electronics (2017)
10. Amarasinghe, K., Manic, M., Hruska, R.: Optimal stop word selection for text mining in critical infrastructure domain. In: 2015 Resilience Week (RWS), pp. 1–6. IEEE (2015)
11. Popova, S., Skitalinskaya, G.: Extended list of stop words: does it work for keyphrase extraction from short texts. In: 2017 12th International Scientific and Technical Conference on Computer Sciences and Information Technologies (CSIT), vol. 1, pp. 401–404. IEEE (2017)
12. Sharma, V., Mittal, N.: Refined stop-words and morphological variants solutions applied to Hindi–English cross-lingual information retrieval. J. Intell. Fuzzy Syst. 36(3), 2219–2227 (2019)
13. Namly, D., Bouzoubaa, K., Tajmout, R., Laadimi, A.: On Arabic stop-words: a comprehensive list and a dedicated morphological analyser. In: International Conference on Arabic Language Processing, pp. 149–163. Springer, Cham (2019)
14. Rani, R., Lobiyal, D.K.: Social choice theory based domain specific Hindi stop words list construction and its application in text mining. In: International Conference on Intelligent Human Computer Interaction, pp. 123–135. Springer, Cham (2018)
15. Kaur, J., Saini, J.R.: Punjabi stop words: a Gurmukhi, Shahmukhi and Roman scripted chronicle. In: Proceedings of the ACM Symposium on Women in Research 2016, pp. 32–37 (2016)
16. Khalifa, C., Rayner, A.: An automatic construction of Malay stop words based on aggregation method. In: International Conference on Soft Computing in Data Science, pp. 180–189. Springer, Singapore (2016)
17. Mathews, D.M., Abraham, S.: Effects of pre-processing phases in sentiment analysis for Malayalam language. Int. J. Comput. Sci. Eng. 6, 361–366 (2018)
18. Miretie, S.G., Khedkar, V.: Automatic generation of stopwords in the Amharic text. Int. J. Comput. Appl. 975, 8887 (2018)
19. Kaur, J., Saini, J.R.: POS word class based categorization of Gurmukhi language stemmed stop words. In: Proceedings of First International Conference on Information and Communication Technology for Intelligent Systems, vol. 2, pp. 3–10. Springer, Cham (2016)
20. Kaur, J., Saini, J.R.: A natural language processing approach for identification of stop words in Punjabi language (2015)
21. Saini, J.R., Rakholia, R.M.: On continent and script-wise divisions-based statistical measures for stop-words lists of international languages. Proc. Comput. Sci. 89, 313–319 (2016)
22. Rakholia, R.M., Saini, J.R.: A rule-based approach to identify stop words for Gujarati language. In: Proceedings of the 5th International Conference on Frontiers in Intelligent Computing: Theory and Applications, pp. 797–806. Springer, Singapore (2017)
23. Raulji, J.K., Saini, J.R.: Stop-word removal algorithm and its implementation for Sanskrit language. Int. J. Comput. Appl. 150(2), 15–17 (2016)
24. Raulji, J.K., Saini, J.R.: Generating stopword list for Sanskrit language. In: 2017 IEEE 7th International Advance Computing Conference (IACC), pp. 799–802. IEEE (2017)
25. Raulji, J.K., Saini, J.R.: Sanskrit lemmatizer for improvisation of morphological analyser. J. Stat. Manag. Syst. 22(4), 613–625 (2019)
26. Venugopal, G., Saini, J.R., Dhanya, P.: Novel language resources for Hindi: an aesthetics text corpus and a comprehensive stop lemma list. Int. J. Adv. Comput. Sci. Appl. 11(1) (2020)
27. Alajmi, A., Saad, E.M., Darwish, R.R.: Toward an ARABIC stop-words list generation. Int. J. Comput. Appl. 46(8), 8–13 (2012)
28. Available: https://www.manoramaonline.com/home.html. Accessed: 10 Feb 2021
29. Available: https://www.janmabhumi.in/. Accessed: 12 Feb 2021
30. Available: https://archive.org/. Accessed: 12 Feb 2021
31. Available: https://www.kadhajalakam.com/. Accessed: 12 Feb 2021
32. Available: http://malayalamkavithakal.com/. Accessed: 12 Feb 2021
Chapter 36
Face Recognition-Based Automatic Attendance System
Yerrolla Chanti, Anchuri Lokeshwar, Mandala Nischitha, Chilupuri Supriya, and Rajanala Malavika
Abstract In today’s digital world, face recognition technology plays a vital role in almost every field. Attendance marking has become a difficult and interesting problem. An automated system of this kind can be used for surveillance, authentication and recognition of the face of a specific person, and it has many more benefits. The conventional method of taking attendance is still widely used, but it consumes more time and leaves room for proxy attendance. We used several libraries in this automation framework, such as OpenCV, face recognition, and the Haar-cascade classifier. Face detection and recognition are executed using a Haar-cascade classifier, and the attendance is then updated in an Excel sheet.
Keywords OpenCV · Face recognition · Haar cascade
Y. Chanti (B)
Department of Computer Science and Artificial Intelligence, S R University, Telangana, Warangal 506371, India
e-mail: [email protected]
A. Lokeshwar · M. Nischitha · C. Supriya · R. Malavika
Department of Computer Science Engineering, S R Engineering College, Telangana, Warangal 506371, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_36
36.1 Introduction
This project focuses on a face recognition-based automatic attendance system. The traditional method of taking attendance consumes a lot of time and leaves room for proxy attendance: every class consists of around 60 students, and the faculty member has to call out every student in each class. In our approach, we first record photographs of every registered student from various angles, capturing nearly 60 photographs of each student. The captured images are then analysed using a classifier, which is distributed as already trained machine learning code in XML format. The images trained by the classifier are stored, and these trained images are later compared with newly captured images using machine learning algorithms. In this system, we use an available machine learning classifier (Haar_cascade_classifier). Many alternative products have been proposed, but in the current pandemic situation face recognition is the most suitable. In our proposed system, we collect data from the live-streamed class and compare the photos captured from the stream with the pictures present in the database. If the images are similar, attendance is marked. If they are dissimilar, the captured images are stored in an unknown folder. Many institutions have proposed other mechanisms such as fingerprint recognition, but during COVID-19 face recognition is preferable because it does not involve any physical contact. With a fingerprint mechanism, everyone has to touch the same machine, which creates a chance of spreading the virus. Face recognition is therefore the best system in this pandemic situation.
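The decision described above (mark attendance for a match, otherwise file the image as unknown) can be sketched in a library-independent way. The sketch assumes that each face has already been turned into a numeric feature vector and that similarity is measured by Euclidean distance against a tolerance; the chapter does not specify either, so the vectors, the tolerance value and the function names are hypothetical.

```python
from math import dist


def match_face(probe, enrolled, tolerance=0.6):
    """Return the id of the closest enrolled face, or None when no enrolled
    face lies within `tolerance` (an illustrative value) of the probe."""
    best_id, best_distance = None, float("inf")
    for student_id, vector in enrolled.items():
        d = dist(probe, vector)
        if d < best_distance:
            best_id, best_distance = student_id, d
    return best_id if best_distance <= tolerance else None


def mark_attendance(probes, enrolled, tolerance=0.6):
    """Split captured faces into recognised student ids and unknown faces."""
    present, unknown = set(), []
    for probe in probes:
        student_id = match_face(probe, enrolled, tolerance)
        if student_id is None:
            unknown.append(probe)  # would be stored in the unknown folder
        else:
            present.add(student_id)
    return present, unknown


# Toy 2-D vectors standing in for real face features.
enrolled = {"S01": (0.0, 0.0), "S02": (1.0, 1.0)}
captured = [(0.1, 0.0), (0.9, 1.1), (5.0, 5.0)]
print(mark_attendance(captured, enrolled))
```

In a real deployment the tolerance would need tuning, since a loose value admits proxies and a strict one rejects enrolled students.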
36.2 Motivation
• While fingerprint biometrics is a common option, facial recognition for time monitoring is an alternative biometric approach with some distinct benefits.
• There are some circumstances we notice and others we overlook.
• We saw this as a relevant topic for ourselves.
• There is a probability of duplicated enrolment in most educational establishments.
• We took up this project to eliminate such circumstances and to improve the efficiency of paperwork.
36.3 Existing Systems
During lecture time, the lecturer customarily takes attendance by calling on every student or by passing an attendance sheet around the room. The lecturer then enters the attendance information into the online web-based attendance system, and students can access the system to obtain the final attendance report. There are two types of approach in student attendance systems [1]: manual attendance and computerised attendance. In the manual method, the lecturer is responsible for obtaining, checking, and maintaining student records, which is especially demanding when there is a large number of students in the classroom. In practice, keeping track of and calculating each student’s average attendance is difficult and takes longer with the manual method. The automatic approach, as opposed to the manual system, may therefore offer the lecturer greater benefits, because the tasks that the lecturer previously found difficult are taken off the lecturer’s hands.
36.4 Libraries Used
OpenCV: OpenCV is a cross-platform computer vision library that supports the development of real-time computer vision applications. Its main areas of interest are image recognition [2], video capture, and analysis, with capabilities such as face detection and object detection. Using the OpenCV library, you can read and write images, record and store videos, process images (filter, transform), identify features, and detect specific objects such as faces, eyes or automobiles in videos or photos. You can also analyse a video by estimating the motion in it, subtracting the background, and tracking the objects in it.
Face Recognition: This library lets you recognise and manipulate faces from Python or from the command line, and it aims to be a very simple facial recognition library. It is built on dlib’s state-of-the-art face recognition, which uses deep learning. On the Labeled Faces in the Wild benchmark, the model achieved an accuracy of 99.38%. A simple command-line programme for facial recognition is also included. The facial recognition features are:
• Find all the faces that appear in an image.
• Get the location and outline of the eyes, nose, mouth, and chin of each person.
• Recognise who appears in each picture.
36.5 Proposed System All pupils in the class must register themselves by supplying the required information, after which their images will be gathered and saved in the dataset. Faces are identified from live streaming during each session. Student Information Folder: This is an Excel file that contains the details of all the students who have enrolled in the folder. Ex:—In the Excel sheet Fig. 36.1, we have stored two student details with their Id and name. After their faces has been recognised. Training Image Folder: It has around 60 images of a single person from different angles. Fig. 36.2 are the examples of two students images stored in training image folder. Haar_Cascade Classifiers: A cascade is a function learned from a series of both positive and negative aspects images in this machine learning method. It is then used to detect the items in the other photographs based on the instruction. As a result, they are large individual .xml files with a variety of characteristics, each of which corresponds to a particularly distinct use case type [3]. We will be working on face expressions recognition here. To train the classifier, the method necessitates a high
Y. Chanti et al.
Fig. 36.1 Students details
Fig. 36.2 Students images stored in training image folder
number of positive images (images of faces) and negative images (images without faces). After that, we must extract features from them. This is done with the Haar features shown in Fig. 36.3 [4]. They are similar to convolution kernels. Each feature is a single value obtained by subtracting the sum of the pixel values beneath the black rectangle from the sum of the pixel values beneath the white rectangle. To compute a large number of features, all possible sizes and positions of each kernel are examined. For each feature, the sums of the pixel values beneath the white and black rectangles must be determined. Fig. 36.3 shows the Haar-cascade frontal face default .xml file.
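The rectangle sums described above can be sketched in a few lines of plain Python. The summed-area table (integral image) used here is an assumption borrowed from the Viola–Jones method rather than something this section describes, and the function names and the tiny 4×4 image are hypothetical.

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column (assumed helper)."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of pixel values in the rectangle with top-left (x, y) and size w x h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Two-rectangle feature: white (left half) minus black (right half)."""
    half = w // 2
    white = rect_sum(ii, x, y, half, h)
    black = rect_sum(ii, x + half, y, half, h)
    return white - black


# Hypothetical 4x4 grayscale patch: bright on the left, dark on the right.
image = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
]
ii = integral_image(image)
feature = two_rect_feature(ii, 0, 0, 4, 4)
```

With the table in place, each rectangle sum costs four lookups regardless of the rectangle size, which is what makes examining every size and position of a kernel affordable.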
36 Face Recognition-Based Automatic Attendance System
Fig. 36.3 Haar-cascade frontal face default .xml file
Fig. 36.4 Attendance sheets with dates and time
Attendance Folder: This will contain the final Excel document, which records the attendance of all students. Attendance sheets with dates and times are saved in this folder, as shown in Fig. 36.4.
36.6 Four Stages of Project
36.6.1 Dataset Creation
Photographs of students are captured with a webcam. Several images of the same student are taken with various poses and angles. These photos are then preprocessed. First, the images are cropped. The cropped photos will then be
resized to a fixed pixel size. These photos are then converted from RGB to grayscale and saved in a folder named after the student.
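A minimal sketch of the crop, resize, and grayscale steps follows, written in plain Python so that it runs without an image library. The nearest-neighbour resize, the BT.601 luma weights, and all function names are assumptions made for illustration; the chapter does not state which routines were used.

```python
def crop(img, x, y, w, h):
    """Keep the w x h region whose top-left corner is (x, y)."""
    return [row[x:x + w] for row in img[y:y + h]]


def resize_nearest(img, new_w, new_h):
    """Nearest-neighbour resize (assumed method) to new_w x new_h."""
    old_h, old_w = len(img), len(img[0])
    return [[img[j * old_h // new_h][i * old_w // new_w] for i in range(new_w)]
            for j in range(new_h)]


def to_grayscale(img):
    """RGB to gray using the common ITU-R BT.601 luma weights (assumed)."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in img]


# Hypothetical 6x4 RGB frame standing in for a webcam capture.
frame = [[(x * 10, y * 10, 0) for x in range(6)] for y in range(4)]
face = crop(frame, 2, 1, 2, 2)          # cropped face region
resized = resize_nearest(face, 4, 4)    # fixed size for every sample
gray = to_grayscale(resized)            # single-channel training image
```

The order mirrors the text: crop first, then resize, then convert to grayscale before saving.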
36.6.2 System Requirements
System requirements are written with technical terminology and specifications in mind, and they are intended for development or testing teams. They cover:
1. Processor
2. Memory
3. Web technologies
4. Python
Functional Requirements (FRs): These specify a system's functionality, such as how it should react to a given set of inputs and what the outcome should be.
• Recording attendance
Non-Functional Requirements (NFRs): These do not concern the system's desired functionality. Instead, NFRs typically define how the system should behave in specific situations. An NFR could, for example, specify that the system should have 256 MB of RAM. In certain circumstances, an NFR may be more important than an FR. Non-functional requirements are further subdivided into categories such as:
• Product requirements:
1. Should be able to support Firefox 5 on Ubuntu.
• Performance requirements:
1. Works 24/7.
2. Supports heavy network traffic.
Hardware requirements:
• Processor: Intel Core i5.
• RAM: 256 MB.
Software requirements:
• Operating system: Windows.
• Programming language: Python 3.
36.6.3 Software Design
Python is a flexible, high-level, interpreted programming language with a wide range of applications. It makes object-oriented programming easier to use when building applications. It is easy to pick up, and it comes with a variety of high-level data structures, which makes it well suited to application development [5]. Python's syntax and dynamic typing, together with its interpreted nature, make it a good language for scripting and rapid application development. Python supports object-oriented, imperative, functional, and procedural programming styles. It is not built for one specific task such as web development; because it can be used for web, enterprise, 3D CAD, and other applications, it is considered a general-purpose programming language. Because variables are dynamically typed, we do not need to declare their data types. Python allows quick development and debugging because there is no compilation stage in the development process, so the edit-test-debug cycle is very fast.
36.6.4 Algorithm
1. Begin.
2. Is the student enrolled? If yes, proceed to step 7; otherwise, proceed to the next step.
3. Enter the student's ID/roll number.
4. Fill in the student's name.
5. Press the take photographs button to capture images of the student from various angles.
6. Once you have finished capturing photos, click Save Profile.
7. Select the Take Attendance option.
8. The system captures the student's face, displays the result on the console, and saves it to an Excel file.
9. Download the attendance Excel sheet.
10. End.
36.6.5 Face Detection
Face detection is performed with OpenCV and the Haar-cascade classifier. Before the Haar-cascade method can be used to detect human faces, it must first be trained. This is referred to as feature extraction. The Haar-cascade training data used here is an XML file called haarcascade_frontalface_default.xml. OpenCV's detectMultiScale function is used here. It returns the rectangles needed to draw a box around each face in a photograph [6]. There are three parameters to consider: scaleFactor, minNeighbors,
Fig. 36.5 minSize specifies the minimum object size
and minSize. scaleFactor specifies how much the image size is reduced at each image scale. minNeighbors specifies how many neighbours each candidate rectangle must have for it to be retained. Higher values yield fewer detections, but the detections are of higher quality. minSize specifies the minimum object size; smaller objects are ignored (Fig. 36.5).
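To make the effect of these parameters concrete, the sketch below enumerates the image scales implied by a given scaleFactor and filters candidate rectangles by a minimum size. It is an illustration in plain Python, not OpenCV's implementation; the 24×24 base window, the function names, and the sample rectangles are assumptions.

```python
def pyramid_scales(image_size, window_size=24, scale_factor=1.1):
    """Return the scales at which a square detection window still fits.

    image_size is (width, height). At each step the image is treated as
    shrunk by scale_factor, which is the role scaleFactor plays.
    """
    scales = []
    scale = 1.0
    while min(image_size) / scale >= window_size:
        scales.append(scale)
        scale *= scale_factor
    return scales


def filter_by_min_size(rects, min_size):
    """Drop candidate rectangles (x, y, w, h) smaller than min_size (w, h)."""
    min_w, min_h = min_size
    return [r for r in rects if r[2] >= min_w and r[3] >= min_h]


fine = pyramid_scales((640, 480), scale_factor=1.1)    # many small steps
coarse = pyramid_scales((640, 480), scale_factor=1.3)  # fewer, larger steps
kept = filter_by_min_size([(10, 10, 20, 20), (50, 40, 80, 80)], (30, 30))
```

A scale factor closer to 1 produces more pyramid levels, so more face sizes are examined at a higher computational cost, while a larger factor is faster but may skip the scale at which a face would have matched.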
36.6.6 Face Recognition
When the train .py code is run, the project's user interface appears. To train on a student's images, enter the student's ID and name. The webcam then captures images for a few seconds. The trained model is saved in .yml format. The system can now recognise faces in a frame: each face is surrounded by a rectangle, with the student's ID and name displayed. Face recognition examples for two students are shown in Fig. 36.6.
36.6.7 Attendance Updation
After the face recognition procedure, the recognised students are marked as present, together with their ID, name, date, and time, in the Excel sheets in the attendance folder, while the remaining students are marked as absent (Figs. 36.7 and 36.8).
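The update step can be sketched as follows. This is a minimal illustration that writes CSV text as a stand-in for the Excel sheet; the column names, the function names, and the sample students are assumptions, not details taken from the system described here.

```python
import csv
import io
from datetime import datetime


def build_attendance(enrolled, recognised_ids, when):
    """Return one row per enrolled student, marked Present or Absent."""
    rows = []
    for student_id, name in sorted(enrolled.items()):
        present = student_id in recognised_ids
        rows.append({
            "Id": student_id,
            "Name": name,
            "Date": when.strftime("%Y-%m-%d"),
            "Time": when.strftime("%H:%M:%S"),
            "Status": "Present" if present else "Absent",
        })
    return rows


def to_csv(rows):
    """Serialise the rows as CSV text (assumed stand-in for the Excel file)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["Id", "Name", "Date", "Time", "Status"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical enrolment list and recognition result for one session.
enrolled = {101: "Student A", 102: "Student B", 103: "Student C"}
rows = build_attendance(enrolled, {101, 103}, datetime(2022, 1, 10, 9, 30, 0))
sheet = to_csv(rows)
```

Iterating over the enrolment list rather than over the recognised faces is what lets every unrecognised student be recorded as absent.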
Fig. 36.6 Face recognition examples for two students
Fig. 36.7 Attendance updation
Fig. 36.8 Face recognition-based attendance system
36.7 Literature Review
Several works on automated attendance systems use various machine learning algorithms and facial recognition techniques.
Automated Control Scheme for Attendance Based on Facial Recognition Algorithms
This article proposes an integrated attendance management method. Face detection and identification algorithms are at the heart of the system, which recognises students and automatically identifies them as they enter the classroom. The approach is based on the observation that LBPH outperforms other algorithms in terms of recognition rate and false positive rate. SVM and Bayesian classifiers are used because they are superior to distance classifiers. In the system workflow, a person entering the classroom is photographed by the camera at the door [7]. The face region is then extracted and preprocessed for further processing. Face detection is slow because only two people can enter the classroom at a time. As future work, the authors recommend improving recognition rates when a person's appearance changes unintentionally, for example through tonsuring, wearing a mask, or growing facial hair. A drawback of the system is that it can only handle angle variations of up to 30°, which needs to be improved. To boost system performance, gait recognition should be used in conjunction with facial recognition.
Facial Recognition Algorithms and Performance in Unconstrained Factors: A Video-Based Evaluation
This article examines Eigenfaces, Fisherfaces, and LBPH, three well-known algorithms, using a database of human faces in a variety of settings and expressions. According to the test results, LBPH has the best accuracy under external factors such as light exposure, noise, and video resolution.
However, LBPH is more limited than the other algorithms under negative light exposure and high noise levels [8]. Recognition accuracy was tested at three camera resolutions: 720p, 480p, and 360p. According to the results, LBPH was most accurate at 720p, while the other algorithms were most accurate at 360p. Because it is based on histogram similarity, LBPH can provide precise identification, although in some cases it was vulnerable.
Class Room Attendance System with Face Recognition
This paper proposes building a 3D facial model as a novel technique for identifying a student in a classroom setting using a facial recognition system. It aims to create an integrated attendance system that detects students from an image or video source and tracks and analyses their attendance at lectures or sections using facial recognition technology.
Attendance Monitoring System with Real-Time Face Recognition
This study proposes an automated attendance monitoring system with real-time facial recognition. Using the principal component analysis (PCA)
algorithm, the system is built around a database of student information. This is a challenging task because real-time background subtraction in an image remains a concern. The method must also cope with maintaining a large database of student records. The three main steps in implementing this approach are locating the facial region, extracting the template, and recognising the face. All input photos are converted from RGB to grayscale before feature extraction [9]. To normalise the images, the system performs histogram equalisation and resizes them so that all photos are the same size. During matching, the Euclidean distance measures the distance between the extracted image and each image in the template database. The approach is then demonstrated.
Face Recognition-Based Automatic Attendance System
This work also includes an attendance scheme that allows lecturers and teaching assistants to keep track of student attendance. The Viola–Jones method and PCA are used to detect and recognise faces. The system takes two photographs per class, the first at the beginning and the second at the end. Both photos are needed for a student to be recognised and recorded. A student is marked present only if he or she can be identified in both images.
Using MATLAB and the Raspberry Pi 2, Create a Strong Face Recognition-Based Attendance Control Framework
This work discusses the development of an automatic attendance system using the Raspberry Pi camera module and MATLAB R2014a. Two alternative methods are used to extract features [10]: local binary patterns (LBP) and histograms of oriented gradients (HOG) computed on the recorded database, which are used to detect students who are absent from class.
The support vector machine (SVM) classifier is used to match the features stored in the database with the features extracted from the captured image.
Implementation of a SMART-FR-Based Attendance Tracking System
The authors of this paper describe a Raspberry Pi module for face detection and identification. The Raspberry Pi module is attached to the camera. GSM technology is used to send the students' attendance to their parents. The framework handles students using a fingerprint device interfaced with OpenCV and the Raspberry Pi module.
Facial Recognition Technique is Used to Track Student Attendance in the Classroom
This article combines discrete wavelet transforms (DWT) and discrete cosine transforms (DCT) to extract features from the student's face and uses radial basis functions (RBF) to classify facial objects. Face recognition is performed on an input photograph taken during a lecture in which the students were present.
36.8 Conclusion and Future Work
• In learning and teaching scenarios, an automated student attendance system is required. Most existing systems are time-consuming and require the teacher or students to carry out a semi-manual activity during lecture time, such as calling out student IDs or passing attendance sheets around the room. The proposed framework seeks to address these difficulties by incorporating face recognition into the attendance management system, which saves time and effort during tests or lectures. Other researchers have used facial recognition, but their systems have a number of shortcomings in terms of usability, accuracy, illumination handling, and so on, which the proposed system can address.
• The new scheme would also make the current attendance system more efficient for students by reducing the time spent on attendance marking while increasing the time available for actual instruction.
• Improving the system's overall performance.
• Increased security, yet even teachers would have no idea how many students were present.
References
1. Chintalapati, S., Raghunadh, M.V.: Automated attendance management system based on face recognition algorithms. In: 2013 IEEE International Conference on Computational Intelligence and Computing Research (2013)
2. Jaturawat, P., Phankokkruad, M.: An evaluation of face recognition algorithms and accuracy based on video in unconstrained factors. In: 2016 6th IEEE International Conference on Control System, Computing and Engineering (ICCSCE), pp. 240–244. IEEE (2016)
3. https://realpython.com/face-recognition-with-python/
4. https://www.pyimagesearch.com/2018/06/18/face-recognition-with-opencv-python-and-deep-learning/
5. Jha, A.: ABES Engineering College, Ghaziabad, "Class room attendance system using facial recognition system". Int. J. Math. Sci. Technol. Manage. 2(3), ISSN: 2319-8125
6. Ramesh, D., Chanti, Y., Pasha, S.N., Sallauddin, M.: Face-recognition based security system using deep learning. J. Mech. Continua Math. Sci. 15(8), pp. 457–463, ISSN (online): 2454-7190, ISSN (print): 0973-8975, August (2020)
7. Ramesh, D., Pasha, S.N., Roopa, G.A.: Comparative analysis of classification algorithms on weather dataset using data mining tool. Orient. J. Comp. Sci. Technol. 10(4); Ramesh, D., et al. IOP Conference Series: Materials Science and Engineering, vol. 981, p. 022016 (2020)
8. Ramesh, D.: Enhancements of artificial intelligence and machine learning. Int. J. Adv. Sci. Technol. 28(17), 16–23. Retrieved from http://sersc.org/journals/index.php/IJAST/article/view/2223 (2019)
9. Ramesh, D.: An research to enhance the old manuscript resolution using deep learning mechanism. Int. J. Innovative Technol. Explor. Eng. Journal article EID: 2-s2.0-85071496360 Part of ISBN: 22783075. https://doi.org/10.35940/ijitee.F1321.0486S419 (2019)
10. Ramesh, D.: Variation analysis of artificial intelligence, machine learning and advantages of deep architectures. Int. J. Adv. Sci. Technol. Journal article EID: 2-s2.0-85080151171 Part of ISBN: 22076360-20054238 (2019)
Chapter 37
Soft Computing-Based Approach for Face Recognition on Plastic Surgery and Skin Colour-Based Scenarios Using CNN
Divyanshu Sinha, J. P. Pandey, and Bhavesh Chauhan
Abstract In recent years, face recognition has become one of the most common technologies for identifying humans and is widely used as a biometric. Among the many biometric systems already available, facial recognition has become one of the most important for quickly identifying people without requiring their cooperation, so it causes no unnecessary delays. Several face-recognition systems rely on traditional machine learning, but they do not cope with variations in facial expression, posture, occlusion, and scale. The face-recognition method described in this article employs a convolutional neural network (CNN) to recognise faces within a picture. Viola–Jones face detection is used to locate faces in a picture, and a pre-trained CNN automatically extracts facial features from them. A large database of facial images is built so that there are more images for each subject, with different noise levels, for the best training of the CNN. The overall accuracy of the system increased to 98.81%, which demonstrates the effectiveness of deep face recognition. We attempt to design an algorithm that overcomes the difficulties that skin colour causes for facial identification by feeding faces into a CNN for feature extraction and then passing them to a softmax classifier for classification. We also aim to develop an algorithm that overcomes the challenges that arise in face recognition due to external and artificial features such as plastic surgery.
Keywords CNN · Face recognition · Artificial features · Machine learning
D. Sinha (B) Uttarakhand Technical University, Dehradun, India e-mail: [email protected] J. P. Pandey Department of Electrical Engg, MMMUT, Gorakhpur, India B. Chauhan Department of ECE, SRMCEM, Lucknow, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_37
D. Sinha et al.
37.1 Introduction
Identifying people is an important task, and the face is one of the most important human features for doing so. Facial biometric matching is mostly used to verify that people are who they say they are. Face-recognition technology used to be thought of as science fiction, but in the last decade it has become not only possible but commonplace. It is now almost impossible to read about technology without seeing a story about face recognition. A person's face can serve as a biometric for establishing who they are. In biometric identification, content-based data search, video surveillance, access control, and social media, face recognition has long been an active and exciting topic. Unlike other biometric systems, facial-recognition systems do not need the person's cooperation, so they do not slow things down. They can also recognise multiple people at the same time, which speeds things up further. Several approaches to facial recognition based on classic machine learning can be found in the literature. Computer vision and machine learning are continually improving. However, most traditional methods cannot handle changes in lighting, facial expression, scale, occlusion, and posture, so they do not perform well. Computer vision systems have evolved significantly over the past decade, owing to the proliferation of large data and graphical computing. Deep learning has been central to these changes. This paper presents convolutional neural networks for recognising faces. A Viola–Jones face detector finds faces in an image, and a pre-trained CNN then recognises those faces. The CNN is trained on a database built from a large number of facial images, so that there are more images for each person under different lighting and noise conditions and the network can be trained properly. A trained model and the set of hyperparameters that work best for deep face recognition are identified. We show that deep face recognition can be used in automated biometric authentication systems, with an overall accuracy of 98.76% (Fig. 37.1).
37.2 Related Work
Face-recognition technology used to be thought of as science fiction, but in the last decade it has become not only possible but commonplace. It is now almost impossible to read about technology without seeing a story about face recognition. A person's face can serve as a biometric for establishing who they are. When it
37 Soft Computing-Based Approach for Face Recognition …
Fig. 37.1 Face recognition to unlock a device
comes to biometric identification, content-based data search, video surveillance, access control, and social media, face recognition has long been an active and exciting topic. Unlike other biometric systems, facial-recognition systems do not need the person's cooperation, so they do not slow things down. They can also recognise multiple people at the same time, which speeds things up further. Several approaches to facial recognition based on classic machine learning can be found in the literature [1–5]. Computer vision and machine learning are continually improving. However, most traditional methods cannot handle changes in lighting, facial expression, scale, occlusion, and posture, so they do not perform well. Computer vision systems have evolved significantly over the past decade, owing to the proliferation of large data and graphical computing. Deep learning has been central to these changes. This paper presents convolutional neural networks for recognising faces. A Viola–Jones face detector finds faces in an image, and a pre-trained CNN then recognises those faces. The CNN is trained on a database built from a large number of facial images, so that there are more images for each person under different lighting and noise conditions and the network can be trained properly. A trained model and the set of hyperparameters that work best for deep face recognition are identified. We show that deep face recognition can be used in automated biometric authentication systems, with an overall accuracy of 98.76%.
37.3 Methodology
Differences in skin colour pose many challenges in face recognition. On the basis of skin colour information [6], a face is recognised from an already trained
database of individual persons. Skin colour is a robust feature-based technique: it is not affected by scale, illumination, orientation, or changes in facial features due to accidents or other events. A further issue is the ambiguity caused by external and artificial features in face recognition. Little work has been done on face recognition under such ambiguity, for example after facial surgery. We use both feature-based and knowledge-based algorithms [7] to partition areas of the face using skin colour and the geometrical characteristics [8] of the face.
37.3.1 Recognition of Face Based on Skin Colour Using CNN
Face recognition is accomplished by feeding these region proposals into a CNN for feature extraction and then passing the features to a classifier such as softmax. In this part, we first go through how skin pixels can be segmented using several methods. Second, we discuss how to detect a face based on its geometry. Finally, we discuss face recognition in greater detail.
37.3.1.1 Segmentation of Skin
(a) Skin colour model
One of the important properties that differentiates a human face from other objects is skin colour (Fig. 37.2).
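A skin colour model of this kind can be sketched as a per-pixel test in the YCbCr colour space. The conversion below is the full-range BT.601 (JPEG-style) formula; the Cb and Cr ranges are commonly quoted values from the skin-detection literature and are an assumption here, since the chapter does not state its thresholds. Function names and sample pixels are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range ITU-R BT.601 (JPEG-style) RGB to YCbCr conversion."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def is_skin(r, g, b, cb_range=(77, 127), cr_range=(133, 173)):
    """True if the chrominance falls inside the assumed skin ranges."""
    _, cb, cr = rgb_to_ycbcr(r, g, b)
    return (cb_range[0] <= cb <= cb_range[1]
            and cr_range[0] <= cr <= cr_range[1])


def skin_mask(image):
    """Binary mask: 1 where a pixel passes the skin test, 0 elsewhere."""
    return [[1 if is_skin(*px) else 0 for px in row] for row in image]


# Hypothetical 2x2 image: two skin-like tones, pure blue, and white.
image = [
    [(220, 170, 140), (0, 0, 255)],
    [(255, 255, 255), (200, 150, 120)],
]
mask = skin_mask(image)
```

Only the chrominance channels are tested, so the decision is largely independent of brightness, which is one reason this colour space is popular for skin segmentation.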
37.3.1.2 Morphological Operations
Morphological operations are applied to the skin-segmented image to filter out non-skin pixels (Figs. 37.3 and 37.4).
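The text does not say which morphological operation is applied, so the sketch below assumes an opening (erosion followed by dilation) with a 3×3 structuring element on a binary skin mask. It is a plain-Python illustration with hypothetical names, not the authors' implementation.

```python
# 3x3 neighbourhood offsets (assumed structuring element).
OFFSETS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def _inside(mask, y, x):
    return 0 <= y < len(mask) and 0 <= x < len(mask[0])


def erode(mask):
    """A pixel stays 1 only if its whole 3x3 neighbourhood is 1."""
    return [[int(all(_inside(mask, y + dy, x + dx) and mask[y + dy][x + dx]
                     for dy, dx in OFFSETS))
             for x in range(len(mask[0]))] for y in range(len(mask))]


def dilate(mask):
    """A pixel becomes 1 if any pixel in its 3x3 neighbourhood is 1."""
    return [[int(any(_inside(mask, y + dy, x + dx) and mask[y + dy][x + dx]
                     for dy, dx in OFFSETS))
             for x in range(len(mask[0]))] for y in range(len(mask))]


def opening(mask):
    """Erosion followed by dilation: removes specks, keeps larger blobs."""
    return dilate(erode(mask))


# Hypothetical mask: one isolated noise pixel and one 3x3 skin region.
mask = [
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
opened = opening(mask)
```

In this example the isolated pixel in the corner disappears while the 3×3 region survives, which is the filtering behaviour the section relies on.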
37.3.1.3 Edge Detection and Overlapping
In the image from the preceding step, face regions are often connected to non-face regions. Edges detected in the original image are used to separate them. The Canny edge detector is used here for its robustness (Fig. 37.5).
37.3.1.4 Segregating Faces and Non-faces
To distinguish between faces and non-faces, we rely on our understanding of the geometrical properties of the human face (Figs. 37.6 and 37.7).
Fig. 37.2 Face detection
37.3.2 Face Recognition
37.3.2.1 Pre-processing
The RGB colour photos are obtained and then resized to a resolution of 64 × 64. Pixel values are also rescaled to lie within the range [0, 1].
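The rescaling step can be sketched as below, assuming 8-bit RGB input so that dividing by 255 maps every channel into [0, 1]. Resizing to 64 × 64 would normally be delegated to an image library and is left out; the function names are hypothetical.

```python
def normalise(img):
    """Map 8-bit RGB values from [0, 255] to floats in [0, 1]."""
    return [[tuple(c / 255.0 for c in px) for px in row] for row in img]


def in_unit_range(img):
    """Check that every channel of every pixel lies in [0, 1]."""
    return all(0.0 <= c <= 1.0 for row in img for px in row for c in px)


# Hypothetical 64x64 RGB image that has already been resized.
photo = [[(x * 4 % 256, y * 4 % 256, 128) for x in range(64)]
         for y in range(64)]
prepared = normalise(photo)
```

Keeping inputs in a small fixed range is a common choice because it keeps the early activations and gradients of a CNN on a comparable scale across images.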
Fig. 37.3 Original image
Fig. 37.4 After skin segmentation
Fig. 37.5 Skin and non-skin pixels
Fig. 37.6 Face and non-face
37.3.2.2 System Architecture
Convolutional Layer: In this layer, the decoder's weights are used as the basis for the convolutional network.
Fig. 37.7 Image with detected faces
Fig. 37.8 Proposed convolutional neural network
Pooling Layer: This layer subsamples rectangular blocks from the convolutional layer, producing a single output for each block.
Softmax Classifier: A softmax classifier is used to classify the feature blocks from the pooling layer (Fig. 37.8).
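The two layers above can be illustrated with a short plain-Python sketch. Max pooling over non-overlapping 2×2 blocks is an assumption about the pooling rule, and the function names and sample values are hypothetical.

```python
import math


def max_pool(feature_map, size=2):
    """Reduce each non-overlapping size x size block to its maximum."""
    h, w = len(feature_map), len(feature_map[0])
    return [[max(feature_map[y + dy][x + dx]
                 for dy in range(size) for dx in range(size))
             for x in range(0, w - size + 1, size)]
            for y in range(0, h - size + 1, size)]


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)                      # subtract the max for stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


feature_map = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
pooled = max_pool(feature_map)          # 4x4 map becomes 2x2
probs = softmax([2.0, 1.0, 0.1])        # hypothetical scores for 3 identities
predicted = probs.index(max(probs))     # index of the most likely identity
```

Pooling halves each spatial dimension here while keeping the strongest response in every block, and softmax makes the final scores directly comparable as class probabilities.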
37.3.3 Recognition of Face from Plastic Surgery
People deliberately change their facial appearance through plastic or cosmetic surgery, for example to alter the shape of the nose or mouth. Our analysis is based on the following datasets: the Plastic Surgery Database (PSD) from IIIT Delhi's Image Analysis and Biometrics Lab and the American Society of Plastic Surgeons Face Database (ASPSD). To maintain a uniform distribution of data, the train-test split ratio was kept at 70:30. Both databases contain images taken before and after the subjects had surgery. The PSD includes images of various skin tones and facial expressions. It contains 541 photos: 329 of subjects who have undergone Rhinoplasty (RPY), 129 of subjects who have undergone Blepharoplasty (BPY), and 15 of subjects who have undergone Lip Augmentation (LA). In accordance with the train-test split, 432 photos were used for training and 109 for testing. We further enhanced the model using the ASPSD, which contains a total of 1,000,000 images (Figs. 37.9 and 37.10).
37.3.3.1 Proposed Method
This work proposes and designs a convolutional neural network.
Fig. 37.9 Sample face images from plastic surgery face database: (a, c, e) nose, eyelids, and lips pre-surgery; (b, d, f) nose, eyelids, and lips variation post-surgery
Fig. 37.10 Face images from ASPSD: (a, c, e) pre-plastic surgery nose, eyelids, and lips; (b, d, f) post-plastic surgery variation in nose, eyelids, and lips
1. Layer of Convolution
The convolution layer is composed of two elements: the weight matrix w and the bias matrix b. We suppose the convolution kernels are 5×5 and 3×3 in size. The convolution layer is represented mathematically as

$$x_j^{l} = f\left(\sum_{i \in N_j^{l-1}} x_i^{l-1} * k_{ij}^{l} + b_j^{l}\right) \qquad (37.1)$$

Here the feature maps are indexed hierarchically: $i$ is the index of an input feature map and $j$ the index of an output feature map, $N_j^{l-1}$ refers to the input feature maps that contribute to output map $j$, $k_{ij}^{l}$ is the convolution kernel, $b_j^{l}$ is the bias, and $f$ is the ReLU activation function.
2. Pooling Layer
This layer retains and condenses the most critical information using nonlinear downsampling. The number of feature maps remains the same after the pooling layer, but the dimension of every feature map is reduced. The sampling strategy used here is max pooling, which keeps the greatest value in each pooling window.
3. FC Layer
We use two fully connected layers, each of which is connected to every neuron in the preceding layer. The layer is expressed mathematically as

$$y_j^{l} = f\left(\sum_{i=1}^{n} x_i^{l-1} w_{ji}^{l} + b_j^{l}\right) \qquad (37.2)$$

where $n$ is the number of neurons in the preceding layer $l-1$, $w_{ji}^{l}$ is the weight of the connection between neuron $i$ in layer $l-1$ and neuron $j$ in layer $l$, $b_j^{l}$ is the bias of neuron $j$, and $f$ is the activation function of layer $l$ (Fig. 37.11).
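Equations (37.1) and (37.2) can be traced with a small plain-Python sketch. It assumes "valid" sliding windows without kernel flipping (cross-correlation, as most deep learning frameworks implement convolution) and a ReLU activation in both layers; the function names, the 2×2 kernel, and the numbers are hypothetical and much smaller than the 5×5 and 3×3 kernels mentioned in the text.

```python
def relu(v):
    return max(0.0, v)


def conv2d_valid(x, k):
    """Slide kernel k over map x without padding (no kernel flip)."""
    kh, kw = len(k), len(k[0])
    return [[sum(x[i + a][j + b] * k[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(len(x[0]) - kw + 1)]
            for i in range(len(x) - kh + 1)]


def conv_layer(inputs, kernels, bias):
    """Eq. (37.1) for one output map: sum over input maps, add bias, ReLU."""
    maps = [conv2d_valid(x, k) for x, k in zip(inputs, kernels)]
    h, w = len(maps[0]), len(maps[0][0])
    return [[relu(sum(m[r][c] for m in maps) + bias) for c in range(w)]
            for r in range(h)]


def fc_layer(x, weights, biases):
    """Eq. (37.2): weights[j][i] links input neuron i to output neuron j."""
    return [relu(sum(w_ji * x_i for w_ji, x_i in zip(row, x)) + b_j)
            for row, b_j in zip(weights, biases)]


x = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]   # one hypothetical input feature map
k = [[1, 0], [0, -1]]                   # hypothetical 2x2 kernel
feature = conv_layer([x], [k], bias=5.0)
hidden = fc_layer([1.0, 2.0], [[0.5, -1.0], [1.0, 1.0]], [0.0, 0.5])
```

Each window of the input differs from its diagonal neighbour by 4, so every pre-activation is -4 + 5 = 1 and survives the ReLU, whereas the first fully connected neuron has a negative pre-activation and is clipped to zero.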
37.4 Results
37.4.1 Based on Skin Colour
The filters used to discriminate between faces and non-face objects achieve high accuracy and a low false positive rate, demonstrating their value. To begin with,
Fig. 37.11 Proposed system
Table 37.1 Performance of proposed method
|                 | True_Positive | False_Positive | Accuracy_Metric |
| Proposed method | 96%           | 4.80%          | 98.81%          |
the existence of non-facial skin-coloured areas causes the false positives, since they fall within the boundaries of our skin colour model. The neck, hands, and other body parts, as well as clothes and backgrounds of a similar hue, fall into this category. This approach has a greater detection rate than other methods, and its precision is superior. Utilising both the YCbCr [9] and RGB colour spaces has also been shown to give better results than using a single colour space. Performance is measured as follows. True positive: a face is detected correctly. False positive: a non-face object is detected as a face. False negative (dismissal): a face goes undetected. The suggested technique correctly identified 334 of 338 faces in 300 photos. In addition, there were eight non-face detections. As a result, 98.81% of true positives are detected, with 2.2% false positives. Table 37.1 summarises the method's effectiveness in detecting faces. The suggested approach produces results that are much better than those obtained using the Viola and Jones method [10].
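As a quick check of the detection figure quoted above, the arithmetic can be written out. Only the counts 334 and 338 come from the text; the function name is hypothetical, and the sketch assumes the rate is simply detected faces divided by all faces.

```python
def detection_rate(detected, total):
    """Percentage of faces that were found (assumed definition)."""
    return 100.0 * detected / total


faces_total = 338       # faces present in the 300 test photos
faces_detected = 334    # faces correctly identified
missed = faces_total - faces_detected
rate = detection_rate(faces_detected, faces_total)
```

Under this definition four faces are missed and the rate is roughly 98.8%, which matches the 98.81% reported when the value is truncated to two decimals.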
Table 37.2 Performance of proposed method on the basis of type of plastic surgery
| Type of plastic surgery | Dataset_PSD | Dataset_ASPS |
| RPY                     | 97.1        | 92           |
| LA                      | 100         | 100          |
| BPY                     | 94          | 98           |
37.4.2 Based on Plastic Surgery
Here, we propose a CNN framework for faces altered by plastic surgery and evaluate its performance on two datasets, PSD and ASPS. We use a binary cross-entropy loss function and Adam as the optimiser. Because the loss is binary cross-entropy, the class mode is set to binary: there are two classes, one before surgery and one after surgery (Table 37.2).
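For reference, the binary cross-entropy loss named above can be written out directly. This is the standard textbook definition averaged over a batch, not code from the authors; the function name, the clipping constant, and the sample labels are hypothetical, and the Adam update itself is not shown.

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean of -[t*log(p) + (1-t)*log(1-p)] over the batch."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)   # keep log() away from 0
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)


# Hypothetical labels: 1 = after surgery, 0 = before surgery.
labels = [1, 0]
confident = binary_cross_entropy(labels, [0.9, 0.1])
unsure = binary_cross_entropy(labels, [0.5, 0.5])
```

Predictions that sit at 0.5 for both classes cost ln 2 per sample, and the loss falls as the predicted probabilities move toward the correct labels, which is the signal the optimiser follows.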
37.5 Conclusion
In this work, we used a CNN to investigate facial recognition based on skin colour and plastic surgery. In the skin colour study, the methods detect and recognise faces in images with a high level of accuracy. We apply thresholds in the RGB and YCbCr colour spaces to separate skin pixels from the rest of the image. Morphological operations then remove non-skin regions, and the Canny detector finds edges in the original image. A Haar cascade and filters based on the geometry of a person's face are applied to the remaining regions. The system achieved an accuracy of 98.81%.
References
1. Ling, H., Soatto, S., Ramanathan, N., Jacobs, D.W.: Face verification across age progression using discriminative methods. IEEE Trans. Inf. Secur. 5(1), 82–91 (2010)
2. Bianco, S.: Large age-gap face verification by feature injection in deep networks. Pattern Recogn. Lett. 90, 36–42 (2017)
3. Huang, Y., Ao, X., Li, Y.: Real time face detection based on skin tone detector. Int. J. Comput. Sci. Netw. Secur. 9(7), 71–77 (2009)
4. Hu, G., Yang, Y., Yi, D., Kittler, J., Christmas, W., Li, S.Z., Hospedales, T.: When face recognition meets with deep learning: an evaluation of convolutional neural networks for face recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 142–150 (2015)
5. Gong, D., Li, Z., Tao, D., Liu, J., Li, X.: A maximum entropy feature descriptor for age invariant face recognition. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5289–5297 (2015)
6. Chai, D., Phung, S.L., Bouzerdoum, A.: A novel skin color model in ycbcr color space and its application to human face detection. In: Proceedings of IEEE International Conference on Image Processing, vol. 1, pp. 1–289 (2002)
7. Mairal, J., Bach, F., Ponce, J., Sapiro, G.: Online dictionary learning for sparse coding. In: Proceedings of ACM International Conference on Machine Learning, pp. 689–696 (2009)
8. Starovoitov, V., Samal, D.: A geometric approach to face recognition. In: Proceedings of the IEEE-EURASIP Workshop on Nonlinear Signal and Image Processing (NSIP'99), pp. 210–213 (1999)
9. Ghimire, D., Lee, J.: A robust face detection method based on skin color and edges. J. Inf. Process. Syst. 9(1), 141–156 (2013)
10. Viola, P., Jones, M.: Rapid object detection using a boosted cascade of simple features. In: Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR, pp. I–I (2001)
Chapter 38
Text-to-Speech Synthesis of Indian Languages with Prosody Generation for Blind Persons
A. Neela Madheswari, R. Vijayakumar, M. Kannan, A. Umamaheswari, and R. Menaka

Abstract Speech is the major form of human communication for exchanging thoughts and feelings. A number of research works have focused on automatic text-to-speech conversion in various fields, and its applications are helpful for visually impaired and other physically challenged persons. This paper focuses on implementing text-to-speech synthesis using Python for various Indian languages: Bangla, Gujarati, Hindi, Kannada, Malayalam, Marathi, Nepali, Punjabi, Tamil, and Telugu. The parameters considered for analysis are the execution time and the size of the audio file generated for the given text. The proposed system has two phases, text-to-speech conversion for various Indian languages and text-to-speech conversion with prosody generation for the same languages, and the two are compared to see which gives better results. Finally, we analyze which Indian language gives better performance with prosody generation on these parameters.

Keywords Execution time · Prosody · Indian languages · Text-to-speech · Audio file size · gTTS
38.1 Introduction
Speech is the best mode of communication for information exchange. Text-to-speech synthesis, in which a given text is converted into speech, is an important field of research in computer science. India was the first country in the world to put forward a national programme for the control of blindness, in 1976, with the goal of reducing the prevalence of blindness to 0.3% by the year 2020. Blindness is defined as visual acuity < 3/60 in the better eye with available correction [1]. Vision plays a major role in connecting people with their surroundings. With the help of vision, we can easily perform many day-to-day activities such as identifying objects or persons, reading, writing, playing, and dancing. Visually impaired people cannot carry out these tasks as easily as sighted persons unless they are properly trained to observe their surroundings and act accordingly. With inventions such as Braille, visually impaired people have improved their skills and their employment status. Even then, there is a lot of room to improve their quality of life. In particular, visually impaired people lack interaction with their colleagues in the workplace [2]. Of the estimated 30 million blind persons in the world, 6 million are in India. The main aim of the Tamilnadu government programme named Tamilnadu State Blindness Control Society is to reduce the prevalence of blindness to 4 per 1000 population [3]. These statistics, together with the issues faced by disabled people described in [4, 5], show that it is very important to design a model for blind people, especially for when they are away from home. To help visually impaired persons, we propose to develop a system for text-to-speech synthesis. The given input text is converted into an audio file in the required Indian language, such as Hindi or Tamil. The input to the proposed system is English text, and the expected output is speech in the required language, with prosody generation.

A. Neela Madheswari (B) · R. Vijayakumar · M. Kannan · A. Umamaheswari
Mahendra Engineering College, Namakkal, India
e-mail: [email protected]
R. Menaka
Velalar College of Engineering and Technology, Erode, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_38
38.2 Related Background
The main idea of text-to-speech conversion is to convert text input into spoken output by generating synthetic speech. Some applications of text-to-speech synthesis systems are: (i) applications for the blind, (ii) applications helpful for deaf and vocally handicapped persons, (iii) educational applications, (iv) applications for telecommunications and multimedia, and (v) human–machine interaction. The synthesis starts from text input. The major process involved is shown in Fig. 38.1 [6]. Speech usually involves two main components: (i) a verbal component and (ii) a prosodic component. The verbal component contains two systems: the first forms words from phonemes, and the second combines words to form sentences. The prosodic component is used to express emotions, and also to stress a particular word or to mark the end of a sentence [7]. Many factors affect prosody generation, such as the pitch, the duration of the speech, stress patterns, and intonation [8]. Nowadays, prosody generation also plays a major role in text-to-speech synthesis. A number of works have studied prosody generation [9, 10], which shows its significance.

Fig. 38.1 Text-to-speech synthesis
38.3 System Methodology
38.3.1 Converting Text Input to Speech
There are several application programming interfaces (APIs) available for converting text to speech in Python. In this work, we use the Google Text-to-Speech API through the gTTS library. Using gTTS, the input text is easily converted into an audio file in mp3 format, which is stored for further analysis. gTTS supports several languages, including English, French, German, Hindi, Tamil, and many more.
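A minimal sketch of this step is given below. It assumes the gTTS package is installed and that network access is available when `text_to_mp3` is called. The language codes are the ISO 639-1 codes for the ten languages named in this chapter and are assumed to be accepted by the installed gTTS version (this can be checked with `gtts.lang.tts_langs()`); the helper names and the output file naming are hypothetical. gTTS vocalises the text it is given, so how the English passage is rendered into each target language is not described in the chapter and is not shown here.

```python
# ISO 639-1 codes for the ten languages named in the chapter (assumption:
# the installed gTTS version accepts each of them as its `lang` argument).
LANGUAGE_CODES = {
    "Bangla": "bn", "Gujarati": "gu", "Hindi": "hi", "Kannada": "kn",
    "Malayalam": "ml", "Marathi": "mr", "Nepali": "ne", "Punjabi": "pa",
    "Tamil": "ta", "Telugu": "te",
}

def output_filename(language: str) -> str:
    """Hypothetical naming scheme for the generated mp3 files."""
    return f"speech_{LANGUAGE_CODES[language]}.mp3"

def text_to_mp3(text: str, language: str, slow: bool = False) -> str:
    """Synthesize `text` in the given language and save it as an mp3 file."""
    from gtts import gTTS  # third-party package; contacts Google when saving

    path = output_filename(language)
    gTTS(text=text, lang=LANGUAGE_CODES[language], slow=slow).save(path)
    return path
```

A call such as `text_to_mp3(passage, "Tamil")`, where `passage` holds text in the target language, would write `speech_ta.mp3` to the working directory.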
38.3.2 Speech Engine
The proposed system uses Python for text-to-speech conversion, with the pyttsx3 speech engine. The additional features of the pyttsx3 speech engine are: (i) it supports speech synthesis with different voices and at different frequencies, and (ii) it supports SAPI5 (the Microsoft Speech Application Programming Interface), which is a standard technology for voice recognition and synthesis.
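The sketch below shows how the pyttsx3 engine might be driven. The chapter does not say which engine properties are used for prosody generation, so the speaking rate and volume shown here are only the properties pyttsx3 exposes, chosen for illustration; the helper names and default values are assumptions.

```python
def build_settings(rate_wpm: int = 150, volume: float = 1.0) -> dict:
    """Validate illustrative engine settings (rate in words per minute, volume 0.0 to 1.0)."""
    if rate_wpm <= 0:
        raise ValueError("rate_wpm must be positive")
    return {"rate": rate_wpm, "volume": min(max(volume, 0.0), 1.0)}

def speak_to_file(text: str, path: str, settings: dict) -> None:
    """Render `text` to an audio file with pyttsx3 using the given settings."""
    import pyttsx3  # third-party package; uses SAPI5 on Windows

    engine = pyttsx3.init()
    for name, value in settings.items():
        engine.setProperty(name, value)  # e.g. "rate" and "volume"
    engine.save_to_file(text, path)
    engine.runAndWait()  # blocks until the queued synthesis has finished
```

Different voices can be selected in the same way, by reading `engine.getProperty("voices")` and passing the `id` of one entry to `engine.setProperty("voice", ...)`.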
38.4 Results and Discussion
The proposed work takes an English passage as input and converts it into speech in a number of Indian languages (Tamil, Telugu, Malayalam, Bangla, Gujarati, Hindi, Kannada, Marathi, Nepali, and Punjabi) in the form of an mp3 audio file. The output is obtained in two forms. The audio file with prosody is generated, and the corresponding execution time is measured. Similarly, the audio file without prosody is generated, and the corresponding execution time is measured. The size of the output audio file is also noted for every execution. The parameters used to evaluate the system are therefore the execution time and the audio file size for every Indian language, with and without prosody generation. Table 38.1 gives these parameters for the various Indian languages. The English input text is shown in Fig. 38.2; its source is [11]. The text input is converted into speech in the specified Indian language, as an mp3 audio file with prosody generation, and the execution time and audio file size are noted. Similarly, the English text input is converted into the specified Indian language and the audio file is generated without prosody; again, the execution time and audio file size are noted.

Table 38.1 Parameter values for various Indian languages with and without prosody generation

Language     Execution time (ms)            Audio file size (KB)
             With prosody  Without prosody  With prosody  Without prosody
Tamil        53            51               212           273
Telugu       56            50               265           295
Malayalam    58            47               307           176
Bangla       60            50               334           307
Gujarati     60            50               327           261
Hindi        56            49               305           376
Kannada      56            50               282           295
Marathi      54            47               195           257
Nepali       57            46               307           245
Punjabi      57            51               334           428

Fig. 38.2 English text input file considered for speech synthesis

Figure 38.3 shows the execution time in milliseconds for obtaining the audio file with and without prosody. Figure 38.4 shows the audio file size generated for the Indian languages with and without prosody. From Fig. 38.3, Marathi is among the languages that take the least execution time for speech synthesis, both with and without prosody, and Punjabi and Bangla are among those that take the most.
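The two evaluation parameters can be collected with a small harness such as the sketch below. It assumes that any synthesis routine is wrapped as a callable taking the text and an output path; the stand-in synthesizer only writes a fixed number of bytes so that the harness runs without a speech engine, and the use of 1 KB = 1024 bytes is an assumption, since the chapter does not say how file sizes were measured.

```python
import os
import time

def measure(synthesize, text, path):
    """Run synthesize(text, path) and return (elapsed_ms, size_kb)."""
    start = time.perf_counter()
    synthesize(text, path)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    size_kb = os.path.getsize(path) / 1024.0  # assumption: 1 KB = 1024 bytes
    return elapsed_ms, size_kb

def fake_synthesizer(text, path):
    """Stand-in for a real TTS call: writes 2048 bytes to `path`."""
    with open(path, "wb") as handle:
        handle.write(b"\0" * 2048)
```

Calling `measure` once per language, with and without prosody, would yield one row of a table like Table 38.1.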
Fig. 38.3 Execution time for obtaining audio file with prosody and without prosody
Fig. 38.4 Audio file size in KB obtained with prosody and without prosody
From Fig. 38.4, Marathi produces one of the smallest audio files, both with and without prosody, and Punjabi produces the largest audio files in both cases.
38.5 Conclusion
This paper gives an overview of text-to-speech synthesis for various Indian languages with and without prosody. The main parameters used to evaluate the system are the execution time and the size of the generated audio file. From the observed values, Marathi produces synthesized speech in less execution time than a few of the other Indian languages considered in this paper, and it also produces a smaller audio file than the others. In the future, the work can be extended to provide text-to-speech synthesis from the language of any country to that of any other country, and we plan to design a user-friendly interface for selecting the source and destination languages. Further work can study how to include emotion in addition to prosody generation when synthesizing speech, and how to improve the quality of speech. The proposed system is developed mainly to assist blind people while they are away from the safety of their home environment.
References
1. National Blindness & Visual Impairment survey India, 2015–2019: A Summary Report. https://npcbvi.gov.in/writeReadData/mainlinkFile/File341.pdf
2. Arora, A., Shetty, A.: Common problems faced by visually impaired people. Int. J. Sci. Res. 3(10) (2014)
3. Tamilnadu State Blindness Society. https://tnhealth.tn.gov.in/tngovin/blindnesscontrol/blindnesscontrol.php
4. Census of India 2011, Census Disability. http://www.disabilityaffairs.gov.in/upload/uploadfiles/files/disabilityinindia2011data.pdf
5. Srivastava, P., Kumar, P.: Disability, its issues and challenges: psychosocial and legal aspects in Indian scenario. Delhi Psy. J. 18(1) (2015)
6. Ifeanyi, N., Ikenna, O., Izunna, O.: Text-to-speech synthesis. Int. J. Res. Inf. Technol. 2(5), 154–163 (2014)
7. Valizada, A., Jafarova, S., Sultanov, E., Rustamov, S.: Development and evaluation of speech synthesis system based on deep learning models. Symmetry (2021)
8. Archana, B., Agrawal, S.S., Dev, A.: Speech synthesis: a review. Int. J. Eng. Res. Technol. 2(6) (2013)
9. Ronanki, S.: Prosody generation for Text-To-speech synthesis. University of Edinburgh, Thesis (2019)
10. Kenter, T., Sharma, M., Clark, R.: Improving the prosody of RNN-based english Text-to-Speech synthesis by incorporating a BERT model, INTERSPEECH (2020)
11. Source file url. https://www.poetryfoundation.org/poems/153876/what-if-a-much-of-a-which-of-a-wind. Accessed on 1.Dec.2021
Chapter 39
Microservices in IoT Middleware Architectures: Architecture, Trends, and Challenges
Tushar Champaneria, Sunil Jardosh, and Ashwin Makwana

Abstract The Internet of Things (IoT) enables humans and computers to learn from and interact with billions of devices such as sensors, actuators, services, and other Internet-connected gadgets. The implementation of IoT technologies leverages seamless integration of the cyber and physical worlds, radically altering and empowering human interaction. Middleware, commonly described as a software system designed to be the intermediary between IoT devices and applications, is a fundamental technology in developing IoT systems. IoT middleware solutions must match the requirements of the IoT ecosystem to achieve widespread adoption. Among the various approaches to middleware, the service-oriented approach (SOA) is the most suitable. Extending the advantages of SOA, a special case of the service orientation paradigm called the microservices approach has created a hype in the domain of cloud and enterprise application business. Furthermore, the microservices model has several advantages, particularly in dynamic IoT applications, where it is highly straightforward to utilize microservices-based architectures. This paper provides an overview of the current state of the art and practice regarding the usage of microservice architectures in IoT. More specifically, we examine the requirements of a typical IoT middleware and present an in-depth investigation of microservice-based IoT middlewares, the middleware requirements they address, and their implementation.

Keywords Internet of Things · Cyber-physical systems · Middleware · Microservices
T. Champaneria (B) · A. Makwana
U & P U. Patel Department of Computer Engineering, Chandubhai S. Patel Institute of Technology, Charotar University of Science and Technology (CHARUSAT), Changa, India
e-mail: [email protected]
A. Makwana
e-mail: [email protected]
S. Jardosh
Progress Software, Hyderabad, India
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_39
39.1 Introduction
IoT, or the Internet of Things, is used to describe the fast-growing paradigm of cyber-physical systems. IoT will eventually connect and communicate with billions of different objects. Everything from tiny sensors with energy management, to supporting hardware with communication protocols, location awareness, and addressing, to software stacks and libraries, to cloud computing and very large volumes of data, contributes to making the IoT a genuine possibility. As Fig. 39.1 shows, it is reasonable to consider the IoT design as a three-tiered architecture with sensors at its base, middleware in the center, and applications at its top. After being collected by sensors, the data are sent by the edge devices to the middleware, which abstracts the data and offers them to the application layer as needed. Finally, the application layer provides access to the software that various clients use. The IoT middleware serves as a link between disparate application domains that communicate via different interfaces. Middleware is the software layer that sits between the technology and application levels. It is mainly employed to provide applications with shared services and functions and to abstract implementation details, making increasingly sophisticated applications easier to build [1].

Fig. 39.1 IoT middleware as a gluing layer

The IoT has application areas such as smart homes and smart buildings, health care, smart transportation, utilities, mobile, and environment monitoring [2], to name a few. Various challenges must be addressed to implement IoT applications. The application domains of the IoT are many, and their deployment at scale generates massive amounts of data, which must be handled by backend technologies; this leads to scalability requirements. As the IoT consists of many kinds of devices, from sensors to servers, handling heterogeneity at different levels is a critical requirement. Interoperability among the various devices, communication protocols, and technologies is another critical requirement. Reliability in case of faults is essential for the success of an IoT deployment. Adaptability, resilience, and reconfigurability, as parts of sustainability, are further requirements to be addressed in IoT. The security requirement in the IoT domain is crucial, as it includes the challenges of integrity, confidentiality, and availability [3, 4].

In the domain of IoT, various approaches exist in the literature for implementing IoT middleware. In [5], the authors surveyed different middleware architectures and classified them into service-based, cloud-based, and actor-based middlewares. Service-based middlewares employ a service-oriented architecture and allow IoT devices to be treated as services. Cloud-based middleware makes it easy to connect devices and to collect and interpret data. Finally, actor-based middlewares provide a flexible architecture that emphasizes the plug-and-play paradigm. Furthermore, in [1], the author classified middleware based on its technical characteristics. The broad classification includes the following approaches: (i) publish/subscribe, (ii) SOA, (iii) semantic Web, (iv) semantic ontology, and (v) context-awareness. Each of these approaches addresses different requirements of the IoT middleware. Among the various approaches presented in the literature, the service-oriented approach is the best suited to realizing IoT architectures [4]. The SOA technique aids in separating functionality at each layer and in interacting with other applications via APIs. Service-oriented computing (SOC) is a critical paradigm and plays an essential role in IoT implementation. Additionally, cloud computing helps realize IoT requirements such as scalability and availability, and also aids in hosting SOA. An IoT device can be envisioned as a little window into the cloud's vast resources.

Service-oriented architecture (SOA) allows the flexibility to break down large monolithic applications into manageable ones. Microservices are an extension of SOA. The microservices approach is not different from SOA; instead, it is a refined and fine-grained SOA. Microservices work at the application scale instead of the enterprise scale. Microservices are an architectural style rather than a technology in themselves. In this approach, a single application is defined as a set of small Web services, each running separately and independently to achieve a specific task. Microservices also scale independently of each other. Each is designed to cater to a specific business capability, and together they make up a complex Web application. Some microservices are targeted toward end users and others cater to other microservices; hence, they may not have all the layers of a typical architecture, such as the presentation, application, and data layers. The data of each microservice are controlled and managed by that microservice. Each microservice is made up of components that connect over a defined interface, resulting in a fully functional application.

The rest of the paper is organized as follows: Sect. 39.2 presents microservices and their pros and cons, along with their significance to the IoT middleware requirements. Section 39.3 discusses various microservice architectures for IoT. Finally, Sect. 39.4 concludes the paper with remarks and future directions.
39.2 IoT Middleware Requirements and Microservices
Middleware is a gluing layer between IoT devices and software applications. IoT middleware inherits the requirements of IoT, such as interoperability, scalability, abstraction provision, spontaneous interaction, unfixed infrastructure, multiplicity, security and privacy, context awareness (which includes detecting a context and taking the necessary actions), management of data at scale with big data technologies, trust, mobility management, random topology, unknown data point availability, actuation conflicts, bootstrapping, extensibility, modularity, and real-world integration [1, 6, 7]. In a typical IoT scenario such as a smart city, which consists of millions of communicating devices, the middleware should address the requirements highlighted in Fig. 39.2. Let us discuss each of these requirements with respect to the IoT middleware ecosystem. Device heterogeneity is inherent in the IoT domain and must be handled in various aspects, such as protocol and device make and model. Interoperability is a crucial requirement that must be addressed between the various IoT actors, whether machine-to-machine (M2M) or machine-to-human (M2H). Availability is a requirement that middleware must ensure, especially in Internet of Medical Things (IoMT) scenarios where 100% availability is required. Scalability, where millions of devices must communicate seamlessly, indicates the magnitude of sensor and user request traffic that the system can handle. Scalability is of two types: horizontal and vertical. Reliability is crucial for ensuring that the IoT implementation responds to faults gracefully rather than with disruption. Security and privacy are essential IoT middleware requirements that apply vertically across all layers [8], from the device to the application.
In addition to the above requirements, in light of the changing IoT paradigm and business requirements, extensibility, the capability of adding new functions to the system, is nowadays a challenge that middleware must address. Flexibility is required to allow modification of a particular service provided by the middleware. At the same time, the design of a middleware solution must support the reusability of components to avoid service duplication. With the increase in AI-assisted solutions, context awareness has become one of the desirable requirements of middleware, as it helps to make the right decision at the right time. Data management also arises as a crucial requirement, since the data being generated are very high in volume and need to be managed effectively [9].

Fig. 39.2 Essential IoT middleware requirements

Microservices help to address the middleware requirements discussed above. They are a new paradigm for implementing a service-oriented architecture. Microservices are an evolution of SOA over time; they are SOA done well. They form an architectural pattern and a particular case of service orientation that helps realize service orientation in the real sense. The characteristics of microservices are functional decomposition, technology agnosticism, and polyglot implementation. Microservices are prominent in IoT architectures because of their advantages. The microservice paradigm makes the development of software, middleware, and applications for IoT more convenient and intuitive, as the goals of the two match: lightweight communication, minimal management, accessibility, and independent deployment [9]. The advantages of using microservices are manifold: technological heterogeneity, which makes it possible to adopt cutting-edge technology for a specific task; horizontal scaling of an application by replication; resilience, and in turn availability, by hosting microservices on different servers and locations; and small functions, which make the code easy to manage and easy to reuse. Small teams can manage the entire development cycle, from analysis and design to development and testing. Ease of deployment ensures faster adoption of a newer version of an application. Besides these advantages, microservices also have a few disadvantages. Microservices need complex coordination at the central level, as they run in a distributed environment.
For example, when a database is distributed across microservices, a transaction may span several microservices, and such situations need to be managed. In addition, testing a microservices-based application can be more cumbersome than testing a single monolithic application, as each microservice must be up and running with the latest version before testing can start. Finally, communication between microservices is a nontrivial task and needs careful handling; otherwise, in case of malfunction, locating the problem becomes a time-consuming task for the teams. Nevertheless, the shortcomings of microservices can be mitigated effectively by adequate planning for centralized management of microservice interfaces, a well-defined testing strategy, and graceful handling of communication [10].
39.3 Microservice-Based IoT Middleware Architectures
As discussed in Sect. 39.2, the characteristics of the microservice paradigm help in addressing the requirements of IoT middleware. Figure 39.3 shows an envisaged microservice-based middleware architecture in which the middleware hosts microservices that perform dedicated tasks. As shown in the figure, data reach the middleware from the gateway and device layers. The inter-service communication component handles communication between microservices using event-based message queuing.

Fig. 39.3 Typical microservice-based middleware architecture

Various middlewares using the microservices-based approach have been proposed in the literature. This section presents an overview of the microservices-based middlewares that are most relevant to this study. In [11], the authors present a framework that facilitates the broad distribution of transdisciplinary simulation models through a microservice IoT-big data architecture. To achieve real-time digital synchronization, the platform optimizes the flow of data from the shop floor while maintaining the integrity and confidentiality of critical data. The authors of [9] propose an IoT platform that uses a microservice architecture and strives to provide scalability, stability, interoperability, and reusability while minimizing costs. The platform's capabilities are validated using a smart farming use case, and the performance of the proposed platform is evaluated to ensure scalability. In [12], a general microservice framework is presented to alleviate the limitations of monolithic systems. The authors consider interoperability, heterogeneity, flexibility, scalability, and platform independence requirements. To support interoperability, the authors of [13] present a microservice-based middleware that aims to preserve system scalability while ensuring cohesiveness between various devices, services, and communication protocols. Interoperability is also the goal of the authors of [14], who describe an interoperable microservices architecture on the Web of Objects that includes virtual objects (VO) and composite virtual objects (CVO). The architecture aspires to interoperability both for IoT services in the same domain and for cross-domain services. Like any other service, microservices are bound to fail at some point.
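The event-based message queuing attributed above to the inter-service communication component of Fig. 39.3 can be sketched in a few lines. This is an in-process stand-in for a real message broker, written only to illustrate the idea; the class name, topic names, and payload fields are hypothetical and do not come from any of the surveyed platforms.

```python
from collections import defaultdict, deque

class EventBus:
    """Minimal in-process stand-in for a message broker between microservices."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of handlers
        self._queue = deque()                  # pending (topic, payload) events

    def subscribe(self, topic, handler):
        """Register a handler (one service's callback) for a topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic, payload):
        """Enqueue an event; the publisher never calls a consumer directly."""
        self._queue.append((topic, payload))

    def dispatch(self):
        """Deliver queued events in order and return the number of deliveries."""
        delivered = 0
        while self._queue:
            topic, payload = self._queue.popleft()
            for handler in self._subscribers[topic]:
                handler(payload)
                delivered += 1
        return delivered
```

Because publishers only enqueue events, a storage service and an alerting service can both consume the same sensor readings without knowing about each other, which is the loose coupling that message queuing is meant to provide.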
The authors of [15] propose a reactive microservice architecture for another critical middleware aspect, availability, which it ensures by detecting and responding to failures. The authors demonstrate it using the smart agriculture domain and a precision agriculture application. The reactive microservice architecture uses a circuit breaker pattern to increase the availability of the service functionalities provided by the architecture components. Because context is critical in decision-making, the authors of [16] present an approach for detecting context. To extract business insights from a vast volume of raw application data, this work presents location- and context-based microservices such as a contextual trigger microservice, a visualization microservice, an anomaly detection microservice, and a root cause analysis microservice, among others. One of the advantages of microservices is their reusability. Leveraging microservices to promote reusability, the authors of [15] presented the reuse of objects in a Web of Objects (WoO)-based IoT environment, eliminating duplication and cutting down the time it takes to find and instantiate objects from their registries. The approach also includes a discovery technique for microservices and associated objects that considers object reusability via a central objects repository. The required object matching mechanism is also provided to support object reuse. The authors of [17] propose the DIMMER platform, which supports middleware and smart city services to address interoperability and other requirements associated with smart cities. DIMMER manages the metadata of IoT devices using a decentralized data management approach based on microservices. It also considers the requirements for context awareness and interoperability. The authors of [18] propose that microservices be treated as agents that perform specific duties. Microservice autonomous agents deliver services to end users by cooperating, demonstrating reduced coupling and cohesion due to their cooperative nature. For the IoT domain, security and privacy are key concerns, so any middleware must address them effectively. In the automation and IoMT domains, security is the primary concern of any consumer.
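The circuit breaker pattern mentioned above for the reactive architecture of [15] can be illustrated with the generic sketch below. It is not the implementation from [15]; the class name, the thresholds, and the injectable clock are assumptions made so that the example is self-contained.

```python
import time

class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""

class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # failures before opening
        self.reset_timeout = reset_timeout          # seconds to stay open
        self._clock = clock
        self._failures = 0
        self._opened_at = None                      # None means closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; call rejected")
            # Timeout elapsed: half-open state, one trial call is let through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()     # open, or re-open after a failed trial
            raise
        self._failures = 0                          # success closes the circuit
        self._opened_at = None
        return result
```

Wrapping calls to a downstream microservice in `CircuitBreaker.call` lets callers fail fast while that service is down, instead of queuing up behind timeouts, and restores normal traffic once a trial call succeeds.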
The authors in [11] proposed a microservicebased MAYA platform primarily designed for a smart factory environment. Furthermore, it provides a set of functionalities: Privacy-enhancing technologies (PETs) that encompass authentication, authorization, and encryption mechanisms. The authors of [19] present risks related to microservice security in general and address them in the early design phase. Furthermore, in light of implementation, the serverless paradigm was also experimented with within the existing literature for IoT domains. The serverless paradigm is essentially a cloud computing execution model where it does not store resources in volatile memory; instead, it performs the computation in brief bursts with the results saved to disk. There are no CPU resources assigned to an app when it is not used. Thus, it helps in harnessing server CPU power effectively. The serverless paradigm works hand in hand with the microservice paradigm. In [20], the authors discuss an iFaasBus framework that primarily focuses on security and privacy. iFassBus implements serverless functions to detect COVID-19 patients using incoming data from IoT devices and ML models. Authors use TLS and OAuth 2.0 protocols to secure the health data. The authors of [21] propose the microservice-based middleware architecture, which focuses on improving the performance and security of IoT solutions. When considering microservices-based middleware/platforms, it is critical to establish quality of service (QoS) characteristics. For this reason, an attempt is made in [22] to describe a QoS aware microservice architecture that monitors service delivery in terms of microservice QoS (mQoS)
metrics such as response time, throughput, availability, and dependability while also minimizing service delivery costs. As data are at the heart of IoT, they must be handled efficiently to ensure the system's efficiency. In IoT, readings gathered from sensors, especially analog sensors, tend to be noisy because of thermal and electromagnetic interference. Data in IoT also pass through edge devices and other hardware or software gateways, where buffering makes the data bursty. In [23], the authors present a framework to monitor groundwater with four components: Data retrievers extract the data and transform it into a standard format. The collector receives the data from the data retrievers and further transforms it into a standardized format to store it in k maintenance times and non-retrieval of data, and data API management manages user authentication and API accessibility. Collectively, the proposed framework lets stakeholders concentrate on analyzing the data rather than on collecting and manipulating it. The authors of [24] present a distributed IoT platform infrastructure that uses machine learning for air pollution monitoring in South African cities. The authors collect real-time environmental data, analyze the data collected from specific locations to determine whether the examined area is polluted, and compare the performance of the machine learning algorithms adopted. Although the authors of [23, 24] propose data management approaches for specific application areas in the smart cities domain, namely water management and air pollution monitoring, these approaches can be extended to microservice-based approaches. Table 39.1 summarizes the middleware requirements addressed by the above studies. It can be seen that different approaches address different sets of requirements.
In Table 39.1, yes, no, and partial indicate whether a middleware addresses a requirement. The middlewares are implemented for specific application domains of IoT, for example, smart farming, the digital factory, and IoMT. It is evident from Table 39.1 that, among the whole set of IoT middleware requirements, context awareness and data management are essential requirements to be addressed in any IoT middleware. To increase the reliability of IoT systems, middleware must also address fault tolerance. Fog computing and 5G also play a crucial role in emerging IoT-based applications [26, 27]. The integration of fog computing with microservice architecture can prove to be a promising solution.
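The mQoS metrics named in this section (response time, throughput, and availability) can be computed from a log of service calls. The sketch below is only illustrative and is not the monitoring method of [22]: the `Call` record, the `mqos_summary` function, and the sample values are hypothetical, and availability is assumed here to be the fraction of calls that succeeded.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Call:
    """Hypothetical record of one request to a microservice."""
    start_s: float     # time the request arrived, in seconds
    duration_s: float  # time taken to answer, in seconds
    ok: bool           # True when the call succeeded


def mqos_summary(calls):
    """Return mean response time, throughput, and availability for a log."""
    if not calls:
        raise ValueError("at least one call is required")
    first = min(c.start_s for c in calls)
    last = max(c.start_s + c.duration_s for c in calls)
    window = last - first
    return {
        "mean_response_time_s": mean(c.duration_s for c in calls),
        # Throughput: calls handled per second over the observed window.
        "throughput_per_s": len(calls) / window if window > 0 else float("inf"),
        # Assumed definition: share of calls that succeeded.
        "availability": sum(c.ok for c in calls) / len(calls),
    }


log = [Call(0.0, 0.2, True), Call(1.0, 0.4, True),
       Call(2.0, 0.3, False), Call(3.0, 1.0, True)]
summary = mqos_summary(log)
```

A QoS-aware middleware would compare such values against per-service targets; the targets themselves, and how cost is traded off against them, are outside the scope of this sketch.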
Table 39.1 IoT middleware requirements addressed by relevant studies
Requirements (columns): Heterogeneity, Interoperability, Availability, Scalability, Fault tolerance, Security and privacy, Extensibility, Flexibility, Reusability, Context awareness, Data management
Microservice-based IoT middlewares (rows):
A microservice-based middleware for the digital factory [11]
An open IoT framework based on microservices architecture [12]
Architecture of an interoperable IoT platform based on microservices [13]
An IoT platform based on microservices and serverless paradigms for smart farming purposes [9]
Location and context-based microservices for mobile and Internet of Things workloads [16]
Microservices in Web objects-enabled IoT environment for enhancing reusability [15]
Exploiting interoperable microservices in Web objects-enabled Internet of Things [14]
Designing a smart city Internet of Things platform with microservice architecture [17]
Microservices as agents in IoT systems [18]
Security in microservices architectures [19]
iFaaSBus: a security and privacy-based lightweight framework for serverless computing using IoT and machine learning [20]
In.IoT—a new middleware for Internet of Things [21]
Enhancing the microservices architecture for the Internet of Things [22]
Increasing the availability of IoT applications with reactive microservices [25]

39.4 Conclusion
In this paper, we gave a brief overview of the use of microservice architectures in IoT. The IoT domain exhibits peculiarities such as heterogeneity, interoperability, scalability, rapid development and fixes, and the adoption of new technologies such as blockchain and machine learning. As discussed above, microservices offer an array of advantages, outperform older approaches, and best suit current IoT implementations. Since most of the requirements of IoT middleware are satisfied by the microservice paradigm, it has massive potential to become a de facto standard for development in the IoT domain. Microservices are challenging to introduce into existing monolithic environments, but most IoT architectures and solutions are designed from scratch, which makes microservices the best architectural style to follow for IoT architectures. Based on the in-depth analysis, this paper advocates, as a future research direction, the design of a microservice architecture that can address the maximum number of IoT middleware requirements.
References
1. Fersi, G.: Middleware for internet of things: a study. In: Proceedings—IEEE International Conference on Distributed Computing in Sensor Systems, DCOSS 2015, pp. 230–235 (2015). https://doi.org/10.1109/DCOSS.2015.43
2. Asghar, M.H., Mohammadzadeh, N., Negi, A.: Principle application and vision in Internet of Things (IoT). In: International Conference on Computing, Communication and Automation (ICCCA2015), pp. 427–431 (2015). https://doi.org/10.1109/CCAA.2015.7148413
3. Bandyopadhyay, D., Sen, J.: Internet of things: applications and challenges in technology and standardization. Wireless Pers. Commun. 58(1), 49–69 (2011). https://doi.org/10.1007/s11277-011-0288-5
4. Gunes, V., Peter, S., Givargis, T., Vahid, F.: A survey on concepts, applications, and challenges in cyber-physical systems. 8(12), 4242–4268 (2014)
5. Ngu, A.H., Gutierrez, M., Metsis, V., Nepal, S., Sheng, Q.Z.: IoT middleware: a survey on issues and enabling technologies. IEEE Internet Things J. 4(1), 1–20 (2017). https://doi.org/10.1109/JIOT.2016.2615180
6. Chaqfeh, M.A., Mohamed, N.: Challenges in middleware solutions for the internet of things. In: Proceedings of the 2012 International Conference on Collaboration Technologies and Systems, CTS 2012, pp. 21–26 (2012). https://doi.org/10.1109/CTS.2012.6261022
7. Bandyopadhyay, S., Sengupta, M., Maiti, S., Dutta, S.: A survey of middleware for internet of things. Commun. Comput. Inf. Sci. 162 CCIS, 288–296 (2011). https://doi.org/10.1007/978-3-642-21937-5_27
8. Razzaque, M.A., Milojevic-Jevric, M., Palade, A., Cla, S.: Middleware for internet of things: a survey. IEEE Internet Things J. 3(1), 70–95 (2016). https://doi.org/10.1109/JIOT.2015.2498900
9. Trilles, S., González-Pérez, A., Huerta, J.: An IoT platform based on microservices and serverless paradigms for smart farming purposes. Sensors (Switzerland) 20(8) (2020). https://doi.org/10.3390/s20082418
10. Butzin, B., Golatowski, F., Timmermann, D.: Microservices approach for the internet of things. In: IEEE International Conference on Emerging Technologies and Factory Automation, ETFA, vol. 2016 (2016). https://doi.org/10.1109/ETFA.2016.7733707
11. Ciavotta, M., Alge, M., Menato, S., Rovere, D., Pedrazzoli, P.: A microservice-based middleware for the digital factory. Proc. Manuf. 11, 931–938 (2017). https://doi.org/10.1016/j.promfg.2017.07.197
12. Sun, L., Li, Y., Memon, R.A.: An open IoT framework based on microservices architecture. pp. 154–162 (2016). https://doi.org/10.1109/CC.2017.7868163
13. Vresk, T., Čavrak, I.: Architecture of an interoperable IoT platform based on microservices. In: 2016 39th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), pp. 1196–1201 (2016). https://doi.org/10.1109/MIPRO.2016.7522321
14. Jarwar, M.A., Ali, S., Kibria, M.G., Kumar, S., Chong, I.: Exploiting interoperable microservices in web objects enabled Internet of Things. In: International Conference on Ubiquitous and Future Networks, ICUFN, pp. 49–54 (2017). https://doi.org/10.1109/ICUFN.2017.7993746
15. Jarwar, M.A., Kibria, M.G., Ali, S., Chong, I.: Microservices in web objects enabled IoT environment for enhancing reusability. Sensors (Switzerland) 18(2) (2018). https://doi.org/10.3390/s18020352
16. Bak, P., Melamed, R., Moshkovich, D., Nardi, Y., Ship, H., Yaeli, A.: Location and context-based microservices for mobile and internet of things workloads. In: 2015 IEEE International Conference on Mobile Services, pp. 1–8 (2015). https://doi.org/10.1109/MobServ.2015.11
17. Krylovskiy, A., Jahn, M., Patti, E.: Designing a smart city internet of things platform with microservice architecture. In: Proceedings—2015 International Conference on Future Internet of Things and Cloud, FiCloud 2015 and 2015 International Conference on Open and Big Data, OBD 2015, pp. 25–30 (2015). https://doi.org/10.1109/FiCloud.2015.55
18. Krivic, P., Skocir, P., Kusek, M., Jezic, G.: Microservices as agents in IoT systems. Smart Innov. Syst. Technol. 74, 22–31 (2018). https://doi.org/10.1007/978-3-319-59394-4_3
19. Mateus-Coelho, N., Cruz-Cunha, M., Ferreira, L.G.: Security in microservices architectures. Proc. Comput. Sci. 181(2019), 1225–1236 (2021). https://doi.org/10.1016/j.procs.2021.01.320
20. Golec, M., Ozturac, R., Pooranian, Z., Gill, S.S., Buyya, R.: iFaaSBus: a security and privacy based lightweight framework for serverless computing using IoT and machine learning. IEEE Trans. Ind. Inform. 1–1 (2021). https://doi.org/10.1109/TII.2021.3095466
21. Cruz, M.A.A., et al.: In.IoT—a new middleware for internet of things 8(10), 7902–7911 (2021)
22. Al-Masri, E.: Enhancing the microservices architecture for the internet of things. In: Proceedings—2018 IEEE International Conference on Big Data, Big Data 2018, pp. 5119–5125 (2019). https://doi.org/10.1109/BigData.2018.8622557
23. Senožetnik, M., et al.: IoT middleware for water management. Proceedings 2(11), 696 (2018). https://doi.org/10.3390/proceedings2110696
24. Mandava, T., Chen, S., Isafiade, O., Bagula, A.: An IoT middleware for air pollution monitoring in smart cities: a situation recognition model, pp. 1–19 (2018)
25. Santana, C., Andrade, L., Delicato, F.C., Prazeres, C.: Increasing the availability of IoT applications with reactive microservices. SOCA 15(2), 109–126 (2021). https://doi.org/10.1007/s11761-020-00308-8
26. Kumhar, M., Bhatia, J.: Emerging communication technologies for 5G-enabled internet of things applications. In: Blockchain for 5G-Enabled IoT, pp. 133–158. Springer, Cham (2021)
27. Modi, A., et al.: Process model for fog data analytics for IoT applications. In: Fog Data Analytics for IoT Applications, pp. 175–198. Springer, Singapore (2020)
Chapter 40
Sustainability of Green Buildings and Comparing Different Rating Agencies
Devender Kumar Beniwal, Deepak Kumar, and Vineet Kumar
Abstract The excessive use of natural resources such as land, water, and air that accompanies industrialization and urbanization creates an imbalance in the natural ecosystem. As a result of rapid industrialization, tons of waste are produced every year, damaging our environment. A huge amount of this waste comes from demolished buildings and large structures made of sand, gravel, concrete bricks, etc. Recycling this demolished material as aggregate for the construction of new houses, buildings, and industries can fill the demand–supply gap and also help reduce the waste produced. To meet these requirements, green building construction is essential for slowing the depletion of the ecosystem. This paper first presents a systematic study of the literature on the GBRS adopted in different countries and reviews them individually. A brief case study of three leading green buildings in India is also reviewed. Various green building rating agencies such as LEED, BREEAM, CASBEE, GRIHA, and GREEN MARK are discussed below. This paper also focuses on the minimum energy approach, emphasizes the green energy concept to be adopted by the various rating agencies, and makes a brief comparison of the leading rating agencies.
Keywords GBRS · Sustainable · NHBC · Green building
D. K. Beniwal (B) · D. Kumar · V. Kumar, Department of Civil Engineering, UIET MDU, Rohtak, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_40
D. K. Beniwal et al.
Abbreviations
BCAS: Building and Construction Authority of Singapore
BREEAM: Building Research Establishment Environmental Assessment Method
CASBEE: Comprehensive Assessment System for Built Environment Efficiency
ERF: Estimated ranking framework
Fuzzy AHP: Fuzzy Analytic Hierarchy Process
GB: Green building
GBDF: Green building design factors
GRIHA: Green Rating for Integrated Habitat Assessment
GBRS: Green building rating systems
GBCA: Green Building Council of Australia
GBTs: Green building tools
IGBC: Indian Green Building Council
JSBC: Japan sustainable building consortium
LEED: Leadership in energy and environmental design
NHBC: National House Building Council
SDGBs: Sustainable Development of Green Buildings
TERI: Tata energy and research institute
USGBC: United States Green Building Council
40.1 Introduction
In ancient times, zero embodied energy materials such as leaves, sand, stone, and unprocessed timber were used for building purposes. The quest for durable materials, however, led to the use of contemporary building materials such as metals, plastics, cement, and bricks. A large amount of energy is required to manufacture these materials and transport them to the construction site. Indian manufacturing companies are rapidly moving from zero energy materials to energy-intensive materials. Continuous growth in the country's population and urbanization has raised the demand for modern construction materials. Modern building materials also emit greenhouse gases, which cause environmental depletion, global warming, and loss of biodiversity. According to a recent study, the construction sector accounts for 23% of air pollution, nearly 49–50% of climate change, 50% of landfill waste, almost 40% of energy consumption, and 40% of drinking water pollution. Role of green building in environmental sustainability: Green building promotes building efficiency with respect to water, energy, and material use while minimizing the impact of buildings on the health of individuals and nature through improved design, construction, operation, maintenance, and removal. Using sustainable materials in the construction industry, and thereby reducing energy consumption, requires concentrated efforts. Today, we need more
eco-friendly and green construction materials and house building materials to minimize the use of energy and to slow the depletion of our environment. We have to use our energy, materials, water, and land efficiently. Green building not only helps us decrease our daily energy demand but also helps reduce the depletion of the environment in several ways. In the term green building, green does not refer to a color; it refers to the nature of the building and the purpose of its construction. Regulatory requirements are still not mandated in many places around the globe, and specific rules for GB design either do not exist in these places or are still being drawn up. Although a GB design standard is regarded as a practical means of implementation, it is quite tough to apply at the construction stage with different investors. Such standard practices and codes are followed under the supervision of a certified national body. For example, NHBC is a United Kingdom national body that gives protection to home buyers and also sets standards for construction. Nearly 80% of the new houses constructed in the UK every year have an NHBC ten-year Buildmark guarantee. NHBC is also the UK's principal construction regulation inspector. Today, the smart city model combines new technologies to increase the quality of life of city residents, provide efficient use of resources, and reduce operating costs. To reach this aim, it is essential to provide an efficient framework among the various components involved to support various smart city projects. To make the environment smart, it becomes important to take stock of prototypes, ideas, frameworks, protocols, and techniques. The principal goal of implementing the GB concept in the context of the smart environment is to provide rapid construction and reduce the cost of such construction using stone dust, blast furnace slag, fly ash, tire waste, etc.
Current methods provide a solution up to a certain level, but existing techniques have specific shortcomings and hence cannot be applied to every application in the smart network. A smart environment has various specific needs for smart applications and thus needs to address related challenges and issues in implementing and operating smart system applications. Advanced smart systems are required to ensure efficiency, reliability, and sustainability, and these expectations can be met by developing or enhancing intelligent approaches. We require new methods to address various issues in our environment, including new construction techniques in which the green building concept is used to minimize the damage that construction does to our ecological system. There are various examples where green construction concepts can be used, such as highway construction. The green building concept is followed by various organizations to significantly improve the utilization of city resources, improve the quality of life, and reduce operational costs.
40.2 Literature Review
The Need for Green Buildings for Environmental Sustainability: Today, various smart city projects are being run by different governments in order to achieve sustainability goals. Substitute materials with low carbon emissions and ecological properties are being sought for the construction sector. Green building is a key component of sustainable development, but earlier studies offer little in the way of a systematic review of green building development from a project management view [1]. It is understood that cities should be smart and green in the twenty-first century, which is perhaps only possible by promoting the sustainable design of buildings and cities. Sustainability is a broad problem involving various connected studies about society, the environment, and people. As construction affects society, the environment, and people, it becomes more important to study sustainability in how we build and maintain our buildings [2]. Green materials and sustainable resources are the main issue in planning buildings for sustainability. Although sustainable materials have become significant, lack of knowledge and long time frames make them difficult to adopt. In ancient times, buildings were 100% green and built with zero energy usage, but our greed for more durable and comfortable materials has led us to materials that degrade the environment. Today, India is growing rapidly and is currently the third-largest economy in the world, with its energy consumption almost doubling since the year 2000. A recent study has found that 41% of the country's total energy generation is solely for building construction. The Indian Green Building Council (IGBC) has recognized this as a clear opportunity and claims that India has the second-largest registered green building footprint.
Well-structured teams of highly qualified and trained people who can implement sustainable approaches and methods during construction are in short supply in developing countries like India. National guidelines should exist and be followed strictly, and trained professionals should be available to teach both the guidelines and the necessary steps for the construction of green buildings. Aladdin [3] submitted that global green building design factors (GBDF) and ranking scores do not deal with the environmental considerations of the various local design systems. The authors applied the ERF to present environmental and energy modeling tools and proposed that it could fill gaps in these tools in terms of renewable energy generation and weather conditions by promoting green practices. The paper compares, classifies, and ranks GBDFs and suggests that an evidence-based GBDF ranking system sets out a fixed number of criteria for the selection and scrutiny of design factors through energy conservation and environmental studies. Hemsath researched the impact of new construction on the environment using the Zero Net Energy Test House as a framework: a four-bedroom, three-and-a-half-bathroom, 1,800-square-foot home with a 1,000-square-foot basement. The house is being used to validate many research projects and provides a platform for applied research on many technological advancements. Teng et al. [4] examined the techniques and approaches related to the concept of SDGBs and its physical/potential driving forces, including:
• Degree of social participation
• Market development environment
• Ecological value
• Economic value
A structural equation modeling (SEM) method explores the key roles of the driving forces and the dynamic interactions of the SDGB, based on data collected from a survey of 220 interviewees. Shen et al. [5] examine how database methodology can support building performance forecasts, which have become a vital aspect of today's green building studies. Many scientific institutions have looked at building performance databases, using new technologies to incorporate occupants' satisfaction with indoor environmental quality and energy consumption information. Their paper also outlines the information types and collection methods involved in the USA, the EU, Australia, China, and Japan. Offering insight into the current critical problems of limited coverage, poor quality, and methods of GB in China, the paper presents a “3-D framework” for a Green Building Performance Database.
Review of different rating agencies: Lwin et al. [6] conducted research on green building assessment indicators for Myanmar. The study focused on identifying suitable green building assessment indicators (GBAIs) as an important first step toward a future rating system for Myanmar. The authors initially adopted nine categories and forty-eight criteria from a review of seven broadly adopted rating systems, and currently certified green buildings were also examined. The importance levels of the identified assessment indicators were determined and ranked using the Fuzzy Analytic Hierarchy Process (Fuzzy AHP). The results showed that “energy efficiency” and “water efficiency” are the most important categories, with weights of 17.40% and 14%, respectively. The authors suggested that the framework of the rating system should include evaluation indicators, score allocation, and certification. As an outcome of this research, the most appropriate assessment indicators, along with their relative weights, can be identified for the development of an indigenous rating system for Myanmar.
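How a fuzzy AHP turns pairwise comparisons into category weights can be illustrated with a small sketch. The text does not state which fuzzy AHP variant was used in [6], so the code below assumes Buckley's geometric-mean method with triangular fuzzy numbers and centroid defuzzification; the three category names and all comparison judgments are hypothetical and are not data from [6].

```python
from math import prod

# Triangular fuzzy numbers are (low, mid, high) tuples.
ONE = (1.0, 1.0, 1.0)


def reciprocal(t):
    """Reciprocal of a triangular fuzzy number."""
    low, mid, high = t
    return (1.0 / high, 1.0 / mid, 1.0 / low)


def fuzzy_ahp_weights(matrix):
    """Crisp, normalized weights from a fuzzy pairwise comparison matrix."""
    n = len(matrix)
    # Fuzzy geometric mean of each row, computed component-wise.
    means = [
        tuple(prod(cell[k] for cell in row) ** (1.0 / n) for k in range(3))
        for row in matrix
    ]
    total = tuple(sum(m[k] for m in means) for k in range(3))
    # Fuzzy weight = row mean times the reciprocal of the fuzzy total.
    inv_total = reciprocal(total)
    fuzzy = [tuple(m[k] * inv_total[k] for k in range(3)) for m in means]
    # Centroid defuzzification followed by normalization.
    crisp = [sum(w) / 3.0 for w in fuzzy]
    scale = sum(crisp)
    return [c / scale for c in crisp]


# Hypothetical judgments: energy vs. water, energy vs. materials,
# and water vs. materials.
e_w, e_m, w_m = (1.0, 2.0, 3.0), (2.0, 3.0, 4.0), (1.0, 2.0, 3.0)
matrix = [
    [ONE, e_w, e_m],
    [reciprocal(e_w), ONE, w_m],
    [reciprocal(e_m), reciprocal(w_m), ONE],
]
categories = ["energy efficiency", "water efficiency", "materials"]
weights = dict(zip(categories, fuzzy_ahp_weights(matrix)))
```

With these made-up judgments the category judged most important in the pairwise comparisons receives the largest weight, and the weights sum to one, which is the form in which percentage weights such as those reported above are usually expressed.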
Similarly, Liu et al. [7] review some leading green building assessment systems (GBAS) across the globe, for instance, the UK assessment system BREEAM, the US assessment system LEED, the Japanese assessment system CASBEE, and the Canadian assessment system GB tools, and relate them to Taiwan's EEWH system, a separate set of assessment items for calculating the sustainability level of a building project. These findings may be useful for improving the existing assessment systems, particularly from environmental and ecological aspects. Vishanthini et al. examine the effectiveness of the current green building tools (GBTs) in ASEAN countries, which are as follows:
(i) Green Mark (Singapore)
(ii) Green Building Index (Malaysia)
(iii) GREENSHIP (Indonesia)
(iv) BERDE (Philippines)
(v) Lotus (Vietnam)
These GBTs were analyzed in terms of their similarities and differences in implementing sustainable design and maintenance. The authors developed a model to estimate the life cycle cost, GHG emissions, and energy usage of buildings that might be used across the region. Jalaei et al. [8] argue that studying the energy use of building components at the conceptual design phase helps designers decide on the most appropriate design alternative, one that will lead to an energy-efficient building. Building information modeling (BIM) lets users assess various design alternatives and choose key energy strategies and systems at the conceptual design phase of proposed projects. Their paper proposes an integrated procedure that links BIM with green building certification systems and energy analysis tools. The technique is useful in the initial design phase of a project's life: it helps designers identify and measure the potential loss or gain of energy for various design alternatives, calculate the potential LEED points they may collect, and select the best one. An actual building project is used to illustrate the capability and workability of the proposed methodology. Illankoon et al. [9] review various green building rating tools from around the globe and give recommendations in the context of Australia. They note that green building rating tools have spread around the world and that countries follow different regulations, incentives, and rules. However, the environmental impact of buildings in Australia is still substantial, regardless of the promotion of green building rating tools. The authors compared green building rating tools in Australia with those of other countries or regions around the world and found that rating tools in Australia lack mandatory criteria, regulations, and incentives. The paper suggests that government incentives should be promoted.
40.3 Sustainability in Green Buildings
Green building construction also ensures minimum degradation of the environment over the building's lifetime through its conservation techniques. The natural features of the development site, and how they will be maintained, should be analyzed during the construction of a green building. The use of materials with a low carbon footprint in GB design is also important for the sustainable use of natural resources, and it encourages the sustainable use of existing resources. Listed below are some fundamentals of green building construction.
40.4 Energy-Reduction and Renewable Energy Use
Energy efficiency plays an important part in minimizing the environmental effect of a building during its lifetime. A highly efficient building can significantly reduce greenhouse gas emissions. Energy optimization in the design of a GB consists of two elements: the first relates to construction and the second to the operation of the building. For instance, in an office building, 60% of energy losses occur due to ventilation and infiltration. These losses are reduced by providing an airtight building and a well-designed ventilation system. Enhanced window glazing and superior insulation decrease the losses through the building façade. The aim of minimizing carbon dioxide emissions is reflected in recently released guidelines, which include a new section on energy strategy development. Another important element is the enhancement of the systems in the building, e.g., refrigerators, boilers, lighting, and other devices, and the use of renewable heat. Compared with a conventional condensing boiler, a high-performance, higher-rated boiler will perform better and save more energy during its lifespan. The efficiency of a device is indicated by the energy label on it (Fig. 40.1).
Fig. 40.1 Energy optimization in the design of green building
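The boiler comparison in Sect. 40.4 can be made concrete with a small worked calculation. This is a sketch under stated assumptions: fuel input is taken as useful heat demand divided by seasonal efficiency, and the heat demand, the two efficiencies, and the lifespan are hypothetical round numbers rather than values from this chapter.

```python
def fuel_input_kwh(heat_demand_kwh, efficiency):
    """Fuel energy needed to deliver a given amount of useful heat."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in the interval (0, 1]")
    return heat_demand_kwh / efficiency


def lifetime_saving_kwh(heat_demand_kwh, eff_old, eff_new, years):
    """Fuel energy saved over the lifespan by the more efficient boiler."""
    yearly = (fuel_input_kwh(heat_demand_kwh, eff_old)
              - fuel_input_kwh(heat_demand_kwh, eff_new))
    return yearly * years


# Hypothetical example: 12,000 kWh of useful heat per year,
# a 75%-efficient boiler replaced by a 90%-efficient one, 15-year lifespan.
saving = lifetime_saving_kwh(12_000, eff_old=0.75, eff_new=0.90, years=15)
```

Because fuel input scales with the inverse of efficiency, the same efficiency gain saves more fuel in a building with a higher heat demand, which is one reason envelope improvements and system upgrades are usually considered together.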
40.5 Various Green Building Assessment Techniques
Green buildings can be of many types. Two buildings with the same energy efficiency in different topographical areas and different climatic conditions cannot be compared directly, so the need for different green building rating systems was recognized, and different rating systems were developed in different countries; some of them are accepted worldwide and are very effective. These rating systems are tools that assess the performance of buildings against nationally accepted benchmarks [10]. A rating system creates a sense of competitiveness, has a positive impact on the health of occupants, and promotes sustainable green energy use. It evaluates the performance of a building on different aspects and its impact on the environment, using a predefined set of criteria to judge the design, construction, and operation of green buildings. A green building certification or rating system helps us broaden the focus beyond basic energy-saving techniques. All building certification systems reward a building with different rating levels based on its performance and its ability to achieve environment-saving goals (Table 40.1).

Table 40.1 Rating tools in different countries
Country | Name of Green Building Council (GBC) | Green building rating tool | Region
USA | US GBC | LEED, Green Globes | America
UK | UK GBC | BREEAM | Europe
South Africa | GBC South Africa | Green Star SA (adapted from Green Star Australia) | Africa
Singapore | Singapore GBC | Green Mark | Asia–Pacific
New Zealand | New Zealand GBC | Green Star (adapted from Green Star Australia) | Asia–Pacific
Japan | Japan Sustainable Building Consortium | CASBEE (Comprehensive Assessment System for Built Environment Efficiency) | Asia–Pacific
India | Indian GBC | IGBC (Indian Green Building Council) Rating, LEED | Asia–Pacific
Germany | German Sustainable Building Council | DGNB, BREEAM Germany | Europe
France | France GBC | HQE (Haute Qualité Environnementale) | Europe
Canada | Canada GBC | LEED Canada, Green Globes | America
Australia | GBC Australia | Green Star | Asia–Pacific

Some of the green building rating agencies are as follows:
40 Sustainability of Green Buildings and Comparing …
“Leadership in Energy and Environmental Design”: The LEED rating system was developed by the US Green Building Council. It is a voluntary rating system that provides building owners and operators with a framework to identify and implement measurable green building design [11]. A LEED certificate is awarded to a building project when the project satisfies all the prerequisites; LEED provides independent third-party verification for building and neighborhood development projects. The seven basic components of a “green building” are as follows:
• “Environmentally Preferable Building Materials and Specifications.”
• “Energy Efficiency and Renewable Energy.”
• “Toxics Reduction.”
• “Smart Growth and Sustainable Development.”
• “Waste Reduction.”
• “Water Efficiency.”
• “Indoor Air Quality.”
A building project assessed under the LEED rating system must earn the required credits and a minimum of 40 points on a 110-point scale. The number of credit points achieved by a building determines its level of LEED certification, which is divided into four levels:
• LEED-certified “40–49 points”
• LEED silver “50–59 points”
• LEED gold “60–79 points”
• LEED platinum “80 points and above.”
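The mapping from credit points to certification level can be illustrated with a short sketch. It assumes the commonly published LEED bands on the 110-point scale (Certified 40–49, Silver 50–59, Gold 60–79, Platinum 80 and above); the function name `leed_level` is a hypothetical helper, not part of any LEED tool.

```python
from typing import Optional

# Assumed LEED point bands on the 110-point scale, highest band first.
LEED_LEVELS = [
    (80, "Platinum"),
    (60, "Gold"),
    (50, "Silver"),
    (40, "Certified"),
]


def leed_level(points: int) -> Optional[str]:
    """Return the LEED level for a point total, or None if below the 40-point minimum."""
    if not 0 <= points <= 110:
        raise ValueError("LEED point totals range from 0 to 110")
    for minimum, level in LEED_LEVELS:
        if points >= minimum:
            return level
    return None


if __name__ == "__main__":
    for score in (35, 45, 55, 72, 91):
        print(score, leed_level(score))
```

A project scoring below 40 points returns `None`, which mirrors the statement that 40 points is the minimum for certification.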
This rating system was created to motivate project teams to strive for innovative solutions that protect public health and the environment and save energy throughout a project’s life cycle (Fig. 40.2).

Fig. 40.2 LEED certification

“Building Research Establishment Environmental Assessment Method”: BREEAM is the first environmental certification system; it was established in the United Kingdom in 1990. BREEAM uses a wide range of criteria and categories, from energy to ecology [12]. BREEAM scoring covers management, water, pollution, energy, waste, materials, health and well-being, and innovation. The BREEAM rating is divided into five categories, with lower scores left unclassified (Table 40.2).

CASBEE “Comprehensive Assessment System for Built Environment Efficiency”: CASBEE is a green building assessment system that originated in Japan. It rates and evaluates the environmental performance of buildings and the built environment. CASBEE was developed in Japan and is also available in English. It was established in 2001 by a research committee formed in collaboration with industry, national and local government, and academia, which was later established as the Japan Sustainable Building Consortium (JSBC). CASBEE is applicable in the US market and follows the “BEE approach” for performance evaluation. CASBEE is designed to increase the quality of people’s lives and to decrease resource use and environmental load in the built environment [13]. CASBEE is a family of tools, each suited to a particular type of structure and purpose of assessment. All the tools are listed below.
• “CASBEE Health Checklist.”
• “CASBEE for Buildings (New Construction) 2014 edition*.”
• “CASBEE for Housing Renovation Checklist.”
• “CASBEE for Buildings (Existing Buildings) 2014 edition.”
• “CASBEE for Buildings (Renovation) 2014 edition.”
• “CASBEE Community Health Checklist.”
• “CASBEE for Markets Promotion (2014 edition)*.”
• “CASBEE for Commercials Interiors 2014 edition.”
• “CASBEE for Temporary Construction 2007 edition.”
• “CASBEE for Heat Island (2010 edition).”
• “CASBEE for Cities (2013 edition)*.”
• “CASBEE for Cities - Pilot version for worldwide use (2015 edition)*.”
• “CASBEE for Detached Houses 2014 edition.”
• “CASBEE for Dwelling Unit (New Construction) 2014 edition.”
(*Marked items are also available in English.)
GRIHA “Green Rating for Integrated Habitat Assessment”: The word GRIHA is derived from Sanskrit and means “abode”. GRIHA is a rating …

Table 40.2 Categories of BREEAM rating
BREEAM rating | Score
Outstanding | >84
Excellent | >69
Very good | >54
Good | >44
Pass | >29
Unclassified | 0
(58.2)
Fig. 58.1 Prediction through classifiers
V. Shaga et al.
Gaussian radial basis function: $K(x, x') = \exp\left(-\gamma \lVert x - x' \rVert^{2}\right)$ (58.3)
Sigmoid: $K(x, x') = \tanh\left(\gamma\, x^{T} \cdot x' + C\right)$ (58.4)
where $x^{T} \cdot x'$ is the inner product of $x$ and $x'$, which are vectors in the input space, ‘d’ is the degree of the polynomial, $\lVert x - x' \rVert$ is the Euclidean distance between $x$ and $x'$, $\gamma$ represents the gamma value (varies from 0 to 1), and $C$ is the bias or a constant term.
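To make the kernel definitions concrete, the following is a minimal pure-Python sketch of the four kernel functions. The RBF and sigmoid forms follow Eqs. 58.3 and 58.4; the polynomial form `(gamma * <x, x'> + coef0) ** degree` and all function names and default parameter values are assumptions for illustration, not the authors' code.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def dot(x: Vector, y: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x: Vector, y: Vector) -> float:
    return dot(x, y)


def polynomial_kernel(x: Vector, y: Vector, degree: int = 3,
                      gamma: float = 1.0, coef0: float = 1.0) -> float:
    # Assumed form: (gamma * <x, y> + coef0) ** degree
    return (gamma * dot(x, y) + coef0) ** degree


def rbf_kernel(x: Vector, y: Vector, gamma: float = 0.5) -> float:
    # exp(-gamma * ||x - y||^2), as in Eq. 58.3
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(x: Vector, y: Vector, gamma: float = 0.5,
                   coef0: float = 0.0) -> float:
    # tanh(gamma * <x, y> + C), as in Eq. 58.4
    return math.tanh(gamma * dot(x, y) + coef0)


if __name__ == "__main__":
    a, b = [1.0, 2.0], [3.0, 4.0]
    print(linear_kernel(a, b), polynomial_kernel(a, b, degree=2))
    print(rbf_kernel(a, b), sigmoid_kernel(a, b))
```

The RBF kernel equals 1 for identical vectors and decays toward 0 as the Euclidean distance grows, which is why the choice of γ strongly affects how local the resulting decision boundary is.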
58.3 Dataset and Attributes The dataset consists of responses from 205 postgraduate students in two different countries. The responses were collected through the structured questionnaire of the Gnomio e-learning portal. Table 58.1 shows the feedback survey attributes with their type and description. Relevance, reflective thinking, interactivity, tutor support, peer support, and interpretation each receive a maximum of 20 points. Students can give between 1 and 20 points for each of these feedback attributes based on their engagement in learning the course online. ‘Time taken for feedback’ records how much time (in minutes) a student takes to fill in the course feedback survey form.

Table 58.1 Dataset and attributes description
S. No. | Attributes | Type | Description
1 | Relevance | Numeric | How important the course is in professional practice
2 | Reflective thinking | Numeric | Critically thinking about ideas
3 | Interactivity | Numeric | Discussing ideas with other peers
4 | Tutor support | Numeric | How tutor interacts with students
5 | Peer support | Numeric | Other students’ opinions about participation
6 | Interpretation | Numeric | Interpreting other students’ messages
7 | Time taken for feedback (in minutes) | Numeric | Time taken by student to fill the feedback form
8 | Result | Numeric | Either pass or fail
58 Performance Prediction Using Support …
Fig. 58.2 Methodology [10]
58.4 Methodology Students’ feedback is collected and preprocessed before being classified as PASS or FAIL depending on the result the student received. A total of 205 postgraduate student feedback records were collected, with 41 of them (20%) being used as a training set and the remaining 164 records as a testing set (the classification reports in Tables 58.2–58.5 each list a support of 41 records). To obtain the results, all types of SVM kernel functions were used individually. The preprocessing stage consists of loading the dataset, exploring the data, and splitting the data; in addition, all relevant attributes are converted to numeric data types. Feature extraction is then performed, and the extracted features are used to train the classifiers and perform classification with the SVM kernel functions. Finally, the experimental results are examined (Fig. 58.2).
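The 20%/80% partition of the 205 records can be sketched as follows. This is a minimal illustration using only the standard library; the helper name `split_indices`, the fixed seed, and the use of a simple shuffle (rather than, say, a stratified split) are assumptions, since the chapter does not say how the partition was drawn.

```python
import random
from typing import List, Tuple


def split_indices(n_records: int, fraction: float,
                  seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle record indices and split off `fraction` of them as the smaller subset."""
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    cut = round(n_records * fraction)
    return indices[:cut], indices[cut:]


if __name__ == "__main__":
    subset_20, subset_80 = split_indices(205, 0.20)
    print(len(subset_20), len(subset_80))  # 41 164
```

Fixing the seed keeps the partition reproducible, so every kernel function is trained and evaluated on exactly the same records.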
58.5 Experimental Result and Analysis We implemented the SVM using the Python Scikit-Learn library and evaluated the performance of each kernel function separately. In this study, a confusion matrix is used to measure classification performance. ‘Precision’ and ‘Recall’ are two important performance metrics that can be calculated from a confusion matrix and used to assess the models. The ratio of correct positive predictions to the total number of positive predictions is known as ‘precision’ (Eq. 58.5), and the ratio of correct positive predictions to the total number of positive examples in the test set is known as ‘recall’ (Eq. 58.6). The ‘f-score’ is defined as the harmonic mean of the model’s precision and recall (Eq. 58.7). The number of actual occurrences of a class in the specified dataset is referred to as ‘Support’, and the percentage of correctly classified observations is referred to as ‘Accuracy’ (Eq. 58.8). Higher accuracy indicates a more effective model [11, 12]:
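$\text{Precision} = \dfrac{TP}{TP + FP}$ (58.5)
$\text{Recall} = \dfrac{TP}{TP + FN}$ (58.6)
$\text{F-Score} = \dfrac{2}{\dfrac{1}{\text{Recall}} + \dfrac{1}{\text{Precision}}}$ (58.7)
$\text{Accuracy} = \dfrac{TP + TN}{TP + FP + FN + TN}$ (58.8)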
where TP is the number of positive examples correctly predicted as positive, FP is the number of negative examples incorrectly predicted as positive, TN is the number of negative examples correctly predicted as negative, and FN is the number of positive examples incorrectly predicted as negative. Following the implementation of the SVM code for the various kernel functions, the results in Tables 58.2, 58.3, 58.4, and 58.5 were obtained. The performance of the SVM was evaluated in terms of accuracy, and the best performance was achieved by the linear kernel (93%). Among the other three kernels, the radial basis kernel function performed significantly better (90%) in correctly classifying the students’ results, as shown in Fig. 58.3.
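As a worked check of Eqs. 58.5–58.8, the sketch below recomputes the class 1 metrics of the linear kernel from its confusion matrix in Table 58.5. It assumes that rows of the confusion matrix are actual classes and columns are predicted classes (the scikit-learn convention) and treats class 1 as the positive class; the function name `binary_metrics` is a hypothetical helper.

```python
from typing import Dict


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Precision, recall, F-score and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Harmonic mean of precision and recall (same value as Eq. 58.7).
    f_score = (2 * precision * recall / (precision + recall)
               if (precision + recall) else 0.0)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall,
            "f_score": f_score, "accuracy": accuracy}


if __name__ == "__main__":
    # Table 58.5, linear kernel: [[20, 0], [3, 18]] with class 1 as positive.
    metrics = binary_metrics(tp=18, fp=0, fn=3, tn=20)
    for name, value in metrics.items():
        print(f"{name}: {value:.2f}")
```

Rounded to two decimals, the values are 1.00, 0.86, 0.92, and 0.93, which agree with the class 1 row and the accuracy reported for the linear kernel.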
Table 58.2 Result for sigmoid kernel function
Kernel function: Sigmoid
Accuracy: 71%

Confusion matrix | 0 | 1
0 | 0 | 12
1 | 0 | 29

Classification report | Precision | Recall | f-score | Support
0 | 0.00 | 0.00 | 0.00 | 12
1 | 0.71 | 1.00 | 0.83 | 29
Micro-average | 0.71 | 0.71 | 0.71 | 41
Macro-average | 0.35 | 0.50 | 0.41 | 41
Weighted average | 0.50 | 0.71 | 0.59 | 41
Table 58.3 Result for polynomial kernel function
Kernel function: Polynomial
Accuracy: 73%

Confusion matrix | 0 | 1
0 | 0 | 11
1 | 0 | 30

Classification report | Precision | Recall | f-score | Support
0 | 0.00 | 0.00 | 0.00 | 11
1 | 0.73 | 1.00 | 0.85 | 30
Micro-average | 0.73 | 0.73 | 0.73 | 41
Macro-average | 0.37 | 0.50 | 0.50 | 41
Weighted average | 0.54 | 0.73 | 0.73 | 41
Table 58.4 Result for Gaussian radial basis kernel function
Kernel function: Gaussian RBF
Accuracy: 90%

Confusion matrix | 0 | 1
0 | 12 | 4
1 | 0 | 25

Classification report | Precision | Recall | f-score | Support
0 | 1.00 | 0.75 | 0.86 | 16
1 | 0.86 | 1.00 | 0.93 | 25
Micro-average | 0.90 | 0.90 | 0.90 | 41
Macro-average | 0.93 | 0.88 | 0.89 | 41
Weighted average | 0.92 | 0.90 | 0.90 | 41
Table 58.5 Result for linear kernel function
Kernel function: Linear
Accuracy: 93%

Confusion matrix | 0 | 1
0 | 20 | 0
1 | 3 | 18

Classification report | Precision | Recall | f-score | Support
0 | 0.87 | 1.00 | 0.93 | 20
1 | 1.00 | 0.86 | 0.92 | 21
Micro-average | 0.93 | 0.93 | 0.93 | 41
Macro-average | 0.93 | 0.93 | 0.93 | 41
Weighted average | 0.94 | 0.93 | 0.93 | 41
Fig. 58.3 Accuracies of SVM functions
58.6 Conclusion The student’s feedback on the online course is critical for instructors to understand whether they are progressing in the right direction in terms of learning or not. It also increases the student’s self-awareness, enthusiasm for learning, and motivates them to gain confidence. The goal of this study is to improve theoretical understanding of the effects of kernel functions for the support vector machine on student feedback data. In this paper, we also focused on improving student performance based on feedback they provided for online courses during e-learning. According to the experimental results, the classification model based on the SVM linear kernel function is more effective in predicting student performance with 93% accuracy.
References
1. Mundt, F., Hartmann, M.: The Blended Learning Concept e:t:p:M@Math: Practical Insights and Research Findings, pp. 11–28 (2018). https://doi.org/10.1007/978-3-319-90790-1_2
2. Shaga, V., Sayyad, S., Vengatesan, K., Kumar, A.: Fact findings of exploring ICT model in teaching learning. Int. J. Sci. Technol. Res. 8, 2051–2054 (2019)
3. Joughin, G.: Introduction: refocusing assessment. Assess. Learn. Judgement High. Educ. 1–11 (2009). https://doi.org/10.1007/978-1-4020-8905-3_1
4. Shaga, V., Gebregziabher, H., Chintal, P.: Predicting Performance of Students Considering Individual Feedback at Online Learning Using Logistic Regression Model, pp. 111–120 (2022). https://doi.org/10.1007/978-981-16-0739-4_11
5. Cristianini, N., Shawe-Taylor, J.: An Introduction to Support Vector Machines and Other Kernel-based Learning Methods (2000). https://doi.org/10.1017/CBO9780511801389
6. Géron, A.: Hands-On Machine Learning with Scikit-Learn, Keras and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems. O’Reilly Media, p. 851 (2019)
7. Stoean, C., Stoean, R.: Introduction. Intell. Syst Ref. Libr. 69, 1–4 (2014). https://doi.org/10.1007/978-3-319-06941-8_1
8. Samsudin, N.A.M.: Modeling student’s academic performance during covid-19 based on classification in support vector machine. Turkish J. Comput. Math. Educ. 12 (2021). https://doi.org/10.17762/turcomat.v12i5.2190
9. Samanta, D., Kharagpur, I.: Support Vector Machine Autumn 2018 (2018)
10. Asraf, H.M., Nooritawati, M.T.B., Shah, R.: A Comparative study in kernel-based support vector machine of oil palm leaves nutrient disease. Proc. Eng. 41, 1353–1359 (2012). https://doi.org/10.1016/j.proeng.2012.07.321
11. Burkov, A.: The hundred page ML book. Book 5 (2019)
12. Lumbanraja, F.R., Fitri, E., Junaidi, A., Prabowo, R.: Abstract classification using support vector machine algorithm (case study: abstract in a computer science journal). J. Phys. Conf. Ser. 1751, 12042 (2021). https://doi.org/10.1088/1742-6596/1751/1/012042
Chapter 59
Car Type and License Plate Detection Based on YOLOv4 with Darknet Framework (CTLPD) Hard Parikh, R. S. Ramya, and K. R. Venugopal
Abstract Vehicle and license plate detection are important functions in an intelligent transportation system. This paper proposes automatic vehicle identification and categorization, together with the detection and recognition of license plates and the characters on them. The state-of-the-art You Only Look Once (YOLO)-Darknet deep learning framework is used to address these tasks. Two different YOLO-Darknet frameworks are applied. The first one detects and categorizes the cars. The other detects the license plates of the identified cars. Experimental results show that the proposed technique is efficient and outperforms traditional machine learning and deep learning techniques such as the SVM classifier, traditional CNN, and fast R-CNN networks for object detection and categorization and for character recognition. Keywords Character segmentation · Deep learning · Darknet · Sliding window · Vehicle detection · Vehicle type detection · YOLOv4
H. Parikh (B) · R. S. Ramya, Dayananda Sagar College of Engineering, Bangalore 560078, Karnataka, India, e-mail: [email protected]
R. S. Ramya, e-mail: [email protected]
K. R. Venugopal, Bangalore University, Bangalore 560056, Karnataka, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_59

59.1 Introduction Object detection is an important methodology used in applications such as video surveillance, image retrieval systems, and forensics, and in the latest technologies such as driverless cars [7]. Detection of objects is accomplished in two steps: localization and image classification [13]. Localization is the process of determining the location of one or more objects in an image and drawing bounding boxes around them. The region proposal network (RPN)-based [10] and regression-based [2] approaches are the two types of object detection algorithms. In RPN-based approaches, detection occurs in two steps: first, interesting regions of a picture are selected, and then the image is classified using a convolutional neural network. On the other hand, You Only Look Once (YOLO) and single shot detection (SSD) [2] are regression-based techniques and are single-step procedures. Darknet is an open-source neural network framework. We propose using two distinct Darknet frameworks, one for identifying the class of the cars in the frame and the other for detecting the license plate (if visible) of each car. The Stanford 10K dataset was used to train the car-type detection framework, along with some personally gathered photos to improve performance. The license plate detection framework was trained on around 500 manually annotated and labeled images. The proposed model can also recognize the license plate characters automatically. This automatic license plate recognition (ALPR) task is accomplished by first segmenting the characters [1, 6] on the detected license plate. The detected license plate image is preprocessed [1, 6, 8, 9, 15] to make segmentation easier and more accurate. After segmentation, the characters are fed into the Tesseract OCR engine for character recognition.
59.2 Motivation In the existing works on vehicle detection and categorization, emphasis was given more on the frontal view of the vehicles and not much on the side and rear views. To overcome these limitations, it was necessary to train the detecting and categorizing system on a dataset which had an equal distribution of the images with respect to the positional configuration (frontal, rear, and side views) of the vehicle, using an approach (YOLO) which is capable of capturing small details, which can be used for categorizing the vehicles accurately, in an image. Moreover, the proposed work also aims at providing a reliable system for license plate detection and recognition.
59.3 Contributions The principal contributions of this work are as follows:
1. An efficient and accurate model to detect and categorize the cars in the input image or frame on the basis of their shape and features.
2. The cropped images of the detected cars are fed to another model, which detects the license plate of the car, if visible.
3. The cropped images of the detected license plates are preprocessed before being fed to the Tesseract OCR model for character recognition.
59 Car Type and License Plate Detection …
59.4 Related Work In this section, we look at the recent works that use machine learning and deep learning techniques in the context of object detection and character recognition.
59.4.1 Object Detection Methods The YOLO detection system is compared to several top detection frameworks, highlighting key similarities and differences.
Deformable Parts Model: Objects are detected using a sliding window approach in deformable parts models (DPM) [3]. DPM uses a disjoint pipeline to extract static features, classify regions, predict bounding boxes for high-scoring regions, and so on. All of these disparate parts are replaced by a single convolutional neural network in our system. Because of this unified architecture, our model is faster and more accurate than DPM.
R-CNN: R-CNN and its variants find objects in images by using region proposals rather than sliding windows. Potential bounding boxes are generated by selective search [14], a CNN extracts features, and the boxes are scored using a support vector machine (SVM). YOLO is similar to R-CNN in some ways: potential bounding boxes are proposed by the grid cells and scored using the convolutional features. By placing spatial constraints on the grid cell proposals, the proposed system also produces far fewer bounding boxes than selective search.
Other Fast Detectors: Fast and faster R-CNN focus on speeding up the R-CNN framework by sharing computation and by using neural networks, rather than selective search, to propose regions [4, 10]. Many research efforts are aimed at accelerating the DPM pipeline [2, 5, 12]. They accelerate HOG computation, use cascades, and push computation to GPUs. The 30 Hz DPM [12] is the only DPM that runs in real time. YOLO is designed to be fast: instead of trying to optimize the components of a large detection pipeline individually, it discards the pipeline entirely.
59.4.2 Character Recognition Methods Optical character recognition (OCR) is the process of converting scanned or printed text images, as well as handwritten text, into editable text for further processing. The first step for an OCR model is to extract or identify the text regions in the input image. Phan et al. used the Laplacian operator to analyze edge pixel density and maximum gradient differences to identify text regions. Shivakumara et al. [11] used gradient difference maps and global binarization. To extract text in uniform colors, Nikolaou and Papamarkos [8] used color reduction. The second step for an OCR model is the segmentation and classification of the extracted text. Chen and Yuille [1] used five different Haar-based block patterns to train the classification system with the AdaBoost learning model. Kim et al. [15] treated the text as a specific texture and used a support vector machine (SVM) model to analyze the textural features of characters.

Fig. 59.1 Pipeline of the proposed system
59.5 Proposed System In this section, the proposed system is explained in detail in four subsections:
1. Car Detection and Car Categorization.
2. License Plate Detection.
3. Character Recognition.
4. Dataset.
Figure 59.1 shows the workflow/pipeline of the proposed system.
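The two-stage flow of the pipeline can be sketched as follows. This is a minimal illustration only: the detectors, the cropping function, and the OCR step are passed in as callables, and every name (`Detection`, `CarResult`, `run_pipeline`) is hypothetical. The default thresholds of 0.5 and 0.7 are the confidence thresholds reported in the results section.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


@dataclass
class Detection:
    label: str
    confidence: float
    box: Box


@dataclass
class CarResult:
    car_type: str
    car_box: Box
    plate_box: Optional[Box]   # relative to the cropped car image
    plate_text: Optional[str]


def run_pipeline(image: Any,
                 detect_cars: Callable[[Any], List[Detection]],
                 detect_plates: Callable[[Any], List[Detection]],
                 crop: Callable[[Any, Box], Any],
                 read_plate: Callable[[Any], str],
                 car_threshold: float = 0.5,
                 plate_threshold: float = 0.7) -> List[CarResult]:
    """Stage 1 detects and categorizes cars; stage 2 finds and reads each plate."""
    results = []
    for car in detect_cars(image):
        if car.confidence < car_threshold:
            continue  # discard low-confidence car detections
        car_image = crop(image, car.box)
        plates = [p for p in detect_plates(car_image)
                  if p.confidence >= plate_threshold]
        if not plates:
            # Plate not visible or not detected confidently enough.
            results.append(CarResult(car.label, car.box, None, None))
            continue
        best = max(plates, key=lambda p: p.confidence)
        text = read_plate(crop(car_image, best.box))
        results.append(CarResult(car.label, car.box, best.box, text))
    return results
```

Passing the models in as callables keeps the sketch independent of any particular Darknet or OCR binding; in a real system they would wrap the two trained YOLO models and the OCR engine.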
59.5.1 Car Detection and Car Categorization The Stanford 10K Car dataset is used to train the YOLO model that classifies cars into six categories. The YOLOv4 model and pre-trained weights were used as a base. The YOLOv4 architecture was modified to match our needs and to lower the model’s computational cost: the number of classes was reduced from 80 (in the original architecture) to six, so that cars are classified into only six categories. After training, the model achieved an average loss of 0.3399 on this task. Figure 59.2 shows how the training proceeded.

Fig. 59.2 Training of YOLO model for car categorization model
59.5.2 License Plate Detection A separate YOLO model was trained on about 500 images to detect license plates (if visible) in images of the cars. To ensure the model’s robustness, the images in the dataset were chosen so that the frontal, rear, and side views of the license plate were evenly distributed. The YOLOv4 architecture was modified to match our needs and to lower the model’s computational cost: the number of classes was reduced from 80 (in the original architecture) to one, so that only license plates are detected. After training, the model achieved an average loss of 0.2025 when detecting license plates. Figure 59.3 shows how the training proceeded.
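Reducing the number of classes also changes the size of the detection head. The sketch below assumes the usual Darknet convention that the convolutional layer feeding each YOLO layer has (classes + 5) × 3 filters, i.e., four box coordinates, one objectness score, and the class scores for each of three anchors per scale. The chapter does not list its configuration, so this is an illustration of the arithmetic rather than the authors' settings; `yolo_head_filters` is a hypothetical helper.

```python
def yolo_head_filters(num_classes: int, anchors_per_scale: int = 3) -> int:
    """Filters in the convolutional layer that feeds each YOLO layer."""
    if num_classes < 1:
        raise ValueError("at least one class is required")
    # 4 box coordinates + 1 objectness score + one score per class, per anchor.
    return (num_classes + 5) * anchors_per_scale


if __name__ == "__main__":
    for name, classes in [("original (80 classes)", 80),
                          ("car categorization (6 classes)", 6),
                          ("license plate (1 class)", 1)]:
        print(f"{name}: filters = {yolo_head_filters(classes)}")
```

Under this convention the head shrinks from 255 filters for 80 classes to 33 for six classes and 18 for one class, which is one way the class reduction lowers the computational cost.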
Fig. 59.3 Training of YOLO model for license plate detection model
59.5.3 Character Recognition Tesseract OCR is one of the most widely used, high-quality open-source optical character recognition engines. However, when it was applied directly to the detected license plates, it confused similar-looking characters. For example, the character ‘S’ was confused with the number ‘5’, the number ‘4’ was confused with the number ‘9’, the character ‘O’ was confused with the number ‘0’, and so on. The reason for this was that the license plate images obtained after detection were pixelated and unclear. In addition, Tesseract OCR supports many characters that are unwanted when recognizing license plate characters, which increased the confusion and led to poor results. To overcome these problems, it was necessary to preprocess the license plate images before running the Tesseract OCR model over them to recognize the characters. The Tesseract OCR model was also configured so that it does not recognize the unwanted characters.
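One way to restrict Tesseract to license plate characters is a character whitelist. The sketch below only builds the configuration string and filters raw OCR output; it does not call Tesseract itself. The option `tessedit_char_whitelist` and page segmentation mode 7 (treat the image as a single text line) are real Tesseract settings, but the chapter does not state which settings were used, so the chosen values, the assumed character set (uppercase Latin letters and digits), and both function names are assumptions.

```python
import string

# Assumed plate alphabet: uppercase Latin letters and digits.
PLATE_CHARACTERS = string.ascii_uppercase + string.digits


def build_tesseract_config(whitelist: str = PLATE_CHARACTERS,
                           page_seg_mode: int = 7) -> str:
    """Build a Tesseract option string restricting recognition to `whitelist`."""
    return f"--psm {page_seg_mode} -c tessedit_char_whitelist={whitelist}"


def keep_plate_characters(raw_text: str,
                          allowed: str = PLATE_CHARACTERS) -> str:
    """Drop every character of the OCR output that is not in `allowed`."""
    return "".join(ch for ch in raw_text.upper() if ch in allowed)


if __name__ == "__main__":
    print(build_tesseract_config())
    print(keep_plate_characters("ka-01 ab|1234\n"))
```

Whether the whitelist is honored depends on the Tesseract version and recognition engine in use, so filtering the output with `keep_plate_characters` is a cheap safeguard in either case.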
59.5.4 Dataset The dataset used for car detection and categorization was a mixture of the Stanford 10K Car images dataset and personally collected images. The images used for training most car and license plate detection models focus on the frontal view of the car. Models trained on such images work well when the input images contain the frontal view of the car, but the results fall apart when the image contains only the rear or side view. In addition, for the categorization model to perform well, it was important to have an equal distribution of images across the categories. To train the license plate detection model, around 500 images from the collected dataset were chosen such that the frontal and rear views of the cars were equally represented.
59.6 Results In this section, the results obtained after carrying out experiments to verify the effectiveness and robustness of the proposed system are reported.
59.6.1 Car Detection and Categorization Results In this stage, a confidence threshold of 0.5 was employed for the detection and categorization of the cars. Even though this confidence threshold was much higher than that of other available systems, the system detected all the cars and achieved an F1 score of 99.98% when tested on images from the training dataset, which consisted of around 5K images. Some of the detection and categorization results are shown in Fig. 59.4.
59.6.2 License Plate Detection and Recognition Results In this stage, a confidence threshold of 0.7 was employed for the detection of the license plates from the images of the detected cars. A high threshold was kept to make sure that the license plate detector was detecting license plates only as it is easy for a detecting system to confuse certain rectangular textual boxes with the license plates. It is observed that the system was able to achieve an F1 rate of 84.53% while detecting the license plates, which can be considered fairly high given that the testing dataset consisted of medium to low-quality images. Figure 59.5 shows some of the license plate detection results.
Fig. 59.4 Some of the car detection and categorization results
Fig. 59.5 Some of the license plate detection results
59.7 Conclusion In this paper, two systems are presented: a car detection and categorization system and a license plate detection and recognition system. Both systems use YOLO for detection and categorization. The experiments show that the average loss for the car categorization model was 0.3399. It achieved an accuracy of 97.50% when categorizing high to medium quality car images and performed well even on medium to low-quality images. The license plate detection model achieved an accuracy of 99.17% when detecting license plates in high to medium quality car images and performed well even on medium to low-quality images.
59.8 Future Work One task for future work is to improve the performance of the car categorization system on medium to low-quality car images. The number of epochs used during training can be increased to improve performance. In addition to the current car categories, categories such as multi-purpose vehicles and crossovers can be covered. To address this task, more data will need to be collected, and the model will need to be re-trained. In addition, a two-step license plate detection and recognition model can be developed, which would first detect/predict the license plate format/type and then use that knowledge to recognize the license plate characters accurately.
Acknowledgements The author would like to thank the Department of Computer Science and Engineering at Dayananda Sagar College of Engineering (DSCE), particularly Dr. Ramya R.S., for her assistance and support throughout this work.
References
1. Chen, X., Yuille, A.L.: Detecting and reading text in natural scenes. In: Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004, vol. 2. IEEE. pp. II–II (2004)
2. Dean, T., et al.: Fast, accurate detection of 100,000 object classes on a single machine. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1814–1821 (2013)
3. Felzenszwalb, P.F., et al.: Object detection with discriminatively trained part-based models. IEEE Trans. Pattern Anal. Mach. Intell. 32(9), 1627–1645 (2009)
4. Girshick, R.B.: Fast R-CNN. CoRR. arXiv:1504.08083 (2015)
5. Hinton, G.E., et al.: Improving Neural Networks by Preventing Co-Adaptation of Feature Detectors. arXiv preprint arXiv:1207.0580 (2012)
6. Kumar, S., et al.: Text extraction and document image segmentation using matched wavelets and MRF model. IEEE Trans. Image Process. 16(8), 2117–2128 (2007)
7. Lowe, D.G.: Object Recognition from local scale-invariant features. In: Proceedings of the Seventh IEEE International Conference on Computer Vision, vol. 2. IEEE. pp. 1150–1157 (1999)
8. Nikolaou, N., Papamarkos, N.: Color reduction for complex document images. Int. J. Imaging Syst. Technol. 19(1), 14–26 (2009)
9. Ofek, B.E.E.: YW: Detecting Text in Natural Scenes with Stroke Width Transform (2010)
10. Ren, S., et al.: Faster R-CNN: towards real-time object detection with region proposal networks. Adv. Neural Inf. Process. Syst. 28 (2015)
11. Risnumawan, A., et al.: A robust arbitrary text detection system for natural scene images. Expert Syst. Appl. 41(18), 8027–8048 (2014)
12. Sadeghi, M.A., Forsyth, D.: 30 Hz object detection with DPM-v5. In: European Conference on Computer Vision. Springer, pp. 65–79 (2014)
13. Sermanet, P., et al.: Overfeat: Integrated Recognition, Localization and Detection Using Convolutional Networks. arXiv preprint arXiv:1312.6229 (2013)
14. Uijlings, J.R.R., et al.: Selective search for object recognition. Int. J. Comput. Vis. 104(2), 154–171 (2013)
15. Yin, C., et al.: A new SVM method for short text classification based on semi-supervised learning. In: 2015 4th International Conference on Advanced Information Technology and Sensor Application (AITS). IEEE, pp. 100–103 (2015)
Chapter 60
Dynamic Search and Integration of Web Services Sumathi, Karuna Pandith, Niranajan Chiplunkar, and Surendra Shetty
Abstract This paper focuses on searching for and integrating Web services dynamically according to the user request. Because Web services are dynamic and many of them are similar, search and automatic integration of Web services are required to satisfy the user request. The unavailability of UDDI motivated this system to use a search engine to discover the required Web services. The methods used here are retrieval of the WSDL of services using the search engine’s search technique, processing of the WSDL links to extract operation elements, and integration of Web services, which generates more than one integration plan. This paper also addresses the selection of a suitable integration plan from similar integration plans. Keywords Web services · Web scraping · Dynamic search · Web service composition
60.1 Introduction Web services are loosely coupled software modules that can be accessed programmatically over the Internet and that provide the required solution to the customer. WSDL documents are used to describe WSDL-based Web services, and the Web Ontology Language (OWL-S) is used to describe semantic Web services [1]. The three types of WSDL-based discovery methods are (1) text-based, (2) structure-based, and (3) semantics-based [2]. WSDL-based discovery is the popular discovery method and is supported by both industry and development tools.

Sumathi (B) · K. Pandith · N. Chiplunkar · S. Shetty, NMAMIT University (Deemed to be University), Nitte, Karkala, Karnataka, India, e-mail: [email protected]
K. Pandith, e-mail: [email protected]
N. Chiplunkar, e-mail: [email protected]
S. Shetty, e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_60
The proposed method uses WSDL-based discovery of Web services with the help of the Bing search engine and creates clusters that group similar operations. The system also performs intra-cluster, inter-cluster and horizontal searches to integrate the required operations. In this paper the word “integration” is used as an alternative to “composition”. Additional effort is required to discover Web services and to obtain the required Web service at low cost. Public UDDI registries are no longer available for discovering Web services, so discovery has to depend on search engines. This research searches and integrates only WSDL-based Web services.
As a scenario, the user may request the “Weather of city Washington”. Many Web services return “weather” for different parameters such as zipcode, city and state. The proposed system finds the required Web service among these similar Web services through WSDL processing. During invocation of a discovered Web service, some input parameters may be unknown. For example, when invoking the “GetWeatherByZipcode” operation, the zipcode of the required area may be unknown to the user. In this situation, the proposed system automatically searches for and integrates the operation “GetInfoByCity” of the service “UsZip”, which gives the zipcode of the requested city. Integration continues until all intermediate unknown input parameters are resolved and the requested output is found.
The proposed research uses the “Bing” search engine to search for the requested functional word with a query of the form “Query_word?WSDL” [3–5]. This query returns a list of WSDL links of Web services that are semantically similar to the query word; these are referred to as “community Web services” in the proposed research. The listed WSDL documents contain semantically similar operations, and more than one WSDL link is retrieved for a requested functional word.
Among these results, however, some operations may not match the query words requested by the user. It is therefore necessary to process each WSDL to find suitable Web services. The operations that semantically match the requested query word are clustered. The input elements of all operations in this cluster are compared against the input requirement of the user request. If the input requirement of the user does not match the input element of any matched operation, intra-cluster search, inter-cluster search, horizontal search and vertical search are performed. The operations returned by these searches are integrated to produce the final output. Because Web services are similar, more than one integration plan may be generated for each request, so a suitable integration plan is selected through the weight matrix of the composition plan. The data mining techniques used to implement this system are Web scraping, intra-cluster search, inter-cluster search and horizontal search.
The paper is organized as follows. Section 60.2 reviews the literature, Sect. 60.3 presents the mathematical model of integration, Sect. 60.4 describes the selection of a suitable integration plan, Sect. 60.5 reports the results and Sect. 60.6 gives the conclusions and future work.
60.2 Literature Review
The literature on existing methods of Web service selection and composition is surveyed here. The proposed system selects suitable Web services through WSDL processing and clustering, and composes (integrates) Web services when the parameters requested by the user are not available in the selected Web services. The review therefore covers both selection and composition.
Semantically similar Web services are selected with the help of ontologies, and the QoS parameters of many functionally similar Web services are mined according to the user's preference with a ranking algorithm. The authors take the user requirement as input and consider user feedback and QoS parameters to rank the Web services [6]. The appropriate service is selected at runtime according to the objectives defined by the customer; the QoS factors considered are response time (Rt), availability (At) and cost (Ct) [7]. Once the Web services are ranked according to QoS, the selection procedure starts. Dynamic transactional composite Web service operations are performed by three components: a Services Selection Engine (SSE), a Transaction Management Engine (TME) and a Composition Generator Engine (CGE). The dynamic workflow is reconfigured automatically at runtime, which supports multiple instantiation of services and thus allows more than one instance of a service activity at the same time. Abortion, cancellation and compensation dependencies are followed to handle failures and to recover information. However, user interaction is not taken into consideration if the service is compensable. The proposed system considers user interaction during the selection of composed Web services to satisfy the user requirement. Selection of the best user-desired service without user interaction is proposed using a concept called skyline [8].
The service model defines two important concepts: the service schema and the service model. Composite Web services are represented by a service graph, which includes the set of operations to be followed in sequence to achieve the goal. A topological sort is then performed on this graph, which orders the operations according to their dependencies. Service execution plans are generated dynamically for each request. Since Web service operations are static, the QoS factors of each operation are stored in the repository and indexed by operation ID. Execution plans that are dominated by other plans are filtered out according to the QoS factors. Web service composition based on bio-inspired algorithms uses the ant colony algorithm. The efficiency of the ant colony algorithm is very low, which is compensated by the genetic algorithm and the particle swarm algorithm [9]. In the genetic algorithm, the mutation and crossover operators are relatively fixed, and random search without guidance causes degeneration [10]. A combination of the genetic algorithm and the cultural algorithm is used to search for Web services. Workflow-based and AI planning-based algorithms are the two categories of Web service composition [11]. An approach for Web service composition is proposed
in this paper, but only the input/output parameters of WSDL are used. In the proposed research, the operation element is one of the important elements used in the discovery of Web services. Generating random test cases to check the composability of Web services produces numerous test cases and has been shown to be an efficient way to find new behavior [12]. However, the test cases covered all possible sets of Web services, which makes checking the composition combinations a lengthy process. Different aspects of service composition are modeled by Petri nets. To make the testing process flexible, the test code can be encapsulated in an aspect by using aspect orientation. The QoS parameters of a composite service are evaluated by considering the QoS of its Web services [13]. A trusted third party needs to record the QoS parameters of Web services, because nowadays Web services do not carry QoS parameters. In TESSI, a tool based on TASSA, test cases are defined in XML form, executed by sending and receiving messages through the SOAP protocol, and then analyzed [14]. In an enhanced approach, an AI planning technique is applied through OWL-S: the descriptions are used for planning with PDDL, and semantic information is used for the composition of Web services. OWL-S is static in nature, however, and is therefore not used in the proposed research [15]. The genetic algorithm is another approach used in Web service selection and composition [16]. Its genetic operations are selection, mutation and recombination, and choosing them so as to increase efficiency and correctness is a major challenge. Another drawback of the genetic algorithm is that it does not terminate on its own, so users need to fix a number of iterations, and there is no guarantee that increasing the number of iterations improves the quality of the result. The genetic algorithm is therefore not a suitable choice for selecting optimal compositions.
The Causal Link Matrix (CLM) [17] is used for semantic Web service composition through an AI technique; the CLM provides the causal relationships between semantic Web services. Because of the very small number of experimental results, pre-computed parameters are used for composition. In Web service composition, input and output data are mapped to the data nodes of a network [18]. A graph search algorithm finds the shortest path between the source data node (corresponding to the input data) and the destination data node, and this shortest path is translated into a Web service composition solution. The composition problem is solved by both the breadth-first search (BFS) and depth-first search (DFS) algorithms on the network. However, generating synthetic networks from unrelated atomic Web services is a problem. The authors proposed a model for the composition of Web services based on complex networks [19]. In this composition system, a service is represented by a service node and two data nodes. Real-world Web services are usually more complicated and cannot be represented by this simple method. The status of a Web service may change after its execution. Representing the Web service network as a static network implies that the nodes and edges of Web services do not change. The proposed system, however, shows that the structure of a Web service may change from time
to time, which requires dynamic composition of Web services for each consumer request. Service execution logs have been used to identify service composition patterns [20]. The authors traced the control flow of frequently executed Web services and identified patterns, enabling reuse of service composition patterns, which improves the productivity of SOA developers. In this system, generating the network nodes between the client and the destination Web service requires analysis of network-layer packets. The Web service composition example “Vacation planner” plans a trip by booking flight tickets and hotel rooms [21]. This composition considers quality-of-service attributes such as availability, cost, response time and reliability when selecting Web services. The authors used finite state models, which depend on the input/output sequences of the service specifications, to improve the QoS evaluation. Monitoring in the Declarative Framework for Self-Healing Web Services Composition (DISC) is used to avoid deadlocks and to trigger recovery actions [22].
The literature survey shows that dynamic discovery of Web services is essential and poses challenges that need to be solved according to the nature of the available services.
60.3 Dynamic Search and Integration of the Web Services
The proposed system reads a user-given functional word and an input parameter name through its input interface. The functional word is submitted to the search engine as a query of the form “query?wsdl”; this is called dynamic search of Web services. The retrieved result is scraped to keep only the WSDL links. The operation elements of these WSDL links are clustered into exact-matching clusters and semantic-matching clusters. If the input elements of an operation in an exact-matching cluster are resolved by the user-given inputs, that operation is invoked and the result is returned to the user. Otherwise, operations are integrated (composed) dynamically to obtain the values of the unknown inputs, until all unknown input values are resolved. Dynamic integration of operations may result in more than one integration plan. A suitable integration plan is then selected and executed by the system according to the user's opinion.
A user request S_r = ⟨F, R_i⟩ contains a functional word F and a requested input R_i. The set OW contains the n retrieved WSDL links of the Web services available on the World Wide Web for the requested functional word, where n is a natural number. A Web service is selected if F exists as an operation name in any WSDL of the set OW; the input R_i may or may not match an input parameter of these operations.
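The search-and-scrape steps described above can be sketched as follows. This is only an illustrative sketch, not the implementation used in the paper: the helper names, the sample HTML and the sample WSDL are hypothetical, no search engine is contacted, and a simple substring test stands in for semantic matching.

```python
import re
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"  # WSDL 1.1 namespace


def build_query(functional_word):
    """Form the search query in the "query?wsdl" style described in the text."""
    return f"{functional_word}?wsdl"


def extract_wsdl_links(html):
    """Scrape a result page and keep only links that end in ?wsdl (any case)."""
    links = []
    for href in re.findall(r'href="([^"]+)"', html):
        if href.lower().endswith("?wsdl") and href not in links:
            links.append(href)
    return links


def operation_names(wsdl_text):
    """Return the operation names declared under portType in a WSDL 1.1 document."""
    root = ET.fromstring(wsdl_text)
    names = []
    for port_type in root.iter(f"{{{WSDL_NS}}}portType"):
        for op in port_type.findall(f"{{{WSDL_NS}}}operation"):
            name = op.get("name")
            if name and name not in names:
                names.append(name)
    return names


def cluster_operations(functional_word, names):
    """Split names into an exact-match cluster and a looser 'related' cluster.

    Assumption: 'related' means the name contains the functional word; the
    paper's semantic matching is not reproduced here.
    """
    word = functional_word.lower()
    exact = [n for n in names if n.lower() == word]
    related = [n for n in names if word in n.lower() and n.lower() != word]
    return exact, related


# Hypothetical sample data standing in for a scraped result page and a WSDL file.
SAMPLE_HTML = (
    '<a href="http://example.org/Weather.asmx?WSDL">w</a>'
    '<a href="http://example.org/about">a</a>'
)
SAMPLE_WSDL = """<definitions xmlns="http://schemas.xmlsoap.org/wsdl/">
  <portType name="WeatherSoap">
    <operation name="Weather"/>
    <operation name="GetWeatherByZipcode"/>
    <operation name="GetCitiesByCountry"/>
  </portType>
</definitions>"""

links = extract_wsdl_links(SAMPLE_HTML)
names = operation_names(SAMPLE_WSDL)
exact_cluster, related_cluster = cluster_operations("weather", names)
```

In this sketch the exact-matching cluster plays the role of the set CS introduced in the model, while the related cluster only approximates the semantic-matching cluster.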
The selected operation S_o has the same operation name as the functional word, and the input parameter name S_i of the operation is either the same as R_i or R_i ∈ C_o, where C_o is the set of outputs of the composable operations. Let CS = {op_1, …, op_n} be the operations that exactly match the given functional word; CS is the cluster of exact-matching operations, and n is a natural number. Suppose the input parameters S_i of these operations do not match R_i, that is, R_i ∉ S_i. The input elements of these operations are then extracted and searched for as outputs of the existing available Web services. Let IN_i be the set of extracted input parameters of the ith operation of CS, where i ranges from 1 to n. Searching for these parameters as outputs of the existing online Web services and feeding the result as the input of an operation of CS is the process of integration of Web services. During integration, many intermediate operations may result as composable operations; let CSO_i = {opc_1, …, opc_n}, where opc_1 to opc_n are the operations composable with op_i of CS. Let the integration of an operation i be the tuple IOP_i = ⟨op_i, CSO_i, Sc_i⟩ for each operation op_i ∈ CS, where i ranges from 1 to n. S is one of the search procedures: intra-cluster search, inter-cluster search or horizontal search [23]. At the beginning CSO_i and Sc_i are empty for each operation. During the search for an unknown input of the composable operations, each search procedure performed is added to Sc_i, and each intermediate composable operation “opc” found by a search procedure is added to CSO_i. The search continues until R_i is found as an input element in these search results.
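The resolution of unknown inputs described above can be illustrated with a small backward-chaining sketch. It is a simplification under stated assumptions: operations are plain records with named inputs and one output, the target is matched by its output name rather than by operation name, the intra-cluster and inter-cluster searches are collapsed into one lookup over a single list, and the inputs given for CityStateToZipcode are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operation:
    name: str
    inputs: tuple
    output: str


def resolve(op, known, operations, visited=frozenset(), depth=4):
    """Return every chain of operation names that ends in `op` and needs only known inputs."""
    missing = [p for p in op.inputs if p not in known]
    if not missing:
        return [[op.name]]
    if depth == 0:
        return []
    blocked = visited | {op.name}  # never revisit an operation already on this path
    chains = [[]]
    for param in missing:
        providers = [q for q in operations
                     if q.output == param and q.name not in blocked]
        extended = []
        for prefix in chains:
            for q in providers:
                for sub in resolve(q, known, operations, blocked, depth - 1):
                    extended.append(prefix + sub)
        chains = extended  # becomes empty if no operation provides `param`
    return [chain + [op.name] for chain in chains]


def integration_plans(requested_output, known, operations):
    """Collect the chains for every operation that produces the requested output."""
    plans = []
    for op in operations:
        if op.output == requested_output:
            plans.extend(resolve(op, known, operations))
    return plans


# Illustrative operations; parameter names for CityStateToZipcode are assumed.
OPS = [
    Operation("GetWeatherByZipcode", ("zipcode",), "weather"),
    Operation("GetWeather", ("cityName", "countryName"), "weather"),
    Operation("GetCitiesByCountry", ("countryName",), "cityName"),
    Operation("CityStateToZipcode", ("city", "state"), "zipcode"),
]
plans = integration_plans("weather", {"countryName", "city", "state"}, OPS)
```

Each returned chain corresponds to one candidate integration plan; the depth limit and the blocked set keep the search from looping when operations feed each other.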
After all search methods have been applied, CSO_i contains the composable operations of the ith operation and Sc_i contains all search procedures followed for operation i. The tuples IOP_i thus contain similar integration plans for the operations of the set CS. For an operation op_i of CS, if the search procedures in Sc_i do not return a suitable operation, then there is no exactly matching composable operation for it and CSO_i remains empty. An operation op_i ∈ IWS is a 3-tuple ⟨I, O, P⟩, where I = {i_1, …, i_k} are the input parameters that op_i accepts, k ∈ N and N is the finite set of natural numbers; O = {o_1} is the single output that the operation produces; and P is the process that generates the output from the input. Successor operations in composable Web services: the successor operator is represented by the symbol ‘>’, which maps an output element of an operation to an input element of a set of inputs. Given an operation op_i ∈ IWS, S ⊂ IWS is the set of successor operations of op_i if and only if ∀op_j ∈ S: op_i.O ∩ op_j.I ≠ Ø. The successor operator is a unary operator that returns the services directly invoked by an operation (the successor services). Suppose an operation op_i ∈ IWS invokes an operation op_j ∈ IWS; then op_j ∈ (>op_i). If op_j is not known in advance, we write >op_i = op_{i+1} unless stated otherwise. If the operation op_i directly invokes a
set of operations (op_1, op_2, …, op_k) ⊂ IWS, then >op_i = {op_1, …, op_k}. If the operation op_i does not invoke any operation from the set IWS, then >op_i is empty. “Integration of operation is the aggregation of facilities provided by the ‘n’ operations as a single operation where n ∈ N, i.e., no. of operations. Sequential integration and parallel integration are the two typical behaviors of the integration processes”. Let ‘⊕_s’ and ‘⊕_p’ be the symbols that represent sequential integration and parallel integration, respectively. They are defined as follows.
Sequential Integration: Given two operations op_i, op_j ∈ IWS with op_j ∈ (>op_i), the sequential integration of op_i and op_j (written op_i ⊕_s op_j) yields a composite service op_k ∈ IWS such that (∀x ∈ op_i.I: (op_i.P(x) = n) ∧ (n ∈ op_j.I) → (x ∈ op_k.I) ∧ (op_k.P(x) ⊂ op_j.P(n))) [24]. Here ‘op_i.P(x) = n’ denotes the process of operation op_i applied to input x, which gives the output value n, and ‘n ∈ op_j.I’ states that n belongs to the inputs of op_j.
Parallel Integration: Given two operations op_i, op_j ∈ IWS: op_j ∈ (>op_i), the parallel integration of op_i and op_j (written op_i ⊕_p op_j) yields a composite service op_k ∈ IWS whose input parameter set is the union of the input parameter sets of op_i and op_j and whose output parameter set is the union of their output parameter sets:
\[ op_i \oplus_p op_j \triangleq \{\, op_k : op_k.I = (op_i.I \cup op_j.I) \wedge op_k.O = (op_i.O \cup op_j.O) \wedge op_k.P = (op_i.P \cup op_j.P) \,\} \tag{60.1} \]
The sequential and parallel integration operators are both represented by the symbol ⊕ (dropping the suffixes s and p from ⊕_s and ⊕_p) [24]. The symbol ‘≜’ means “is equal by definition to”, ‘∧’ is logical conjunction and ‘∪’ is set union. Let op_i, op_j ∈ IWS be two operations whose integration (op_i ⊕ op_j) is possible. In the proposed research, sequential and parallel integration are achieved by intra-cluster, inter-cluster and horizontal searches of operations to find the unknown inputs of the required operations.
Recursive Integration: op_i ⊕ {(>R op_i)}. This type of integration generates a directed tree with op_i as root. Every path in the tree is called a trace. The expression above is the representation of recursive integration, and ‘i’ is a natural number that indicates the operation position. A canonical set C_i for an operation op_i ∈ IWS with respect to the set IWS is the subset of IWS consisting of all leaf nodes (other than the root node) of the tree generated by applying the recursive composition operation to op_i. Canonical sets are used to find deadlocks. Given two operations op_i and op_j, fulfillment of the condition in Eq. 60.2 implies that the traces (T) generated by op_i (T_{op_i}) and op_j (T_{op_j}) may lead to a deadlock condition. (60.2)
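The successor relation and the two integration operators can be mirrored in a few lines of code. The sketch below is a simplified reading of the definitions rather than the formal algebra of [24]: only the input and output sets are modeled, the process component P is omitted, and the rule that a sequential composite still needs those inputs of the second operation that the first does not produce is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    name: str
    I: frozenset  # input parameter names
    O: frozenset  # output parameter names


def successors(op, ops):
    """Operations whose inputs overlap the outputs of `op` (op.O and other.I intersect)."""
    return [other for other in ops if other is not op and op.O & other.I]


def parallel(a, b):
    """Parallel integration: union of inputs and union of outputs, as in Eq. 60.1."""
    return Op(f"({a.name} || {b.name})", a.I | b.I, a.O | b.O)


def sequential(a, b):
    """Sequential integration: the output of `a` feeds `b`.

    Assumption: inputs of `b` that `a` does not supply remain inputs of the composite.
    """
    if not (a.O & b.I):
        raise ValueError(f"{b.name} is not a successor of {a.name}")
    return Op(f"({a.name} ; {b.name})", a.I | (b.I - a.O), b.O)


cities = Op("GetCitiesByCountry", frozenset({"countryName"}), frozenset({"cityName"}))
weather = Op("GetWeather", frozenset({"cityName", "countryName"}), frozenset({"weather"}))

chained = sequential(cities, weather)
side_by_side = parallel(cities, weather)
```

Here `chained` needs only "countryName" and yields "weather", which is the behavior expected from the plan GetCitiesByCountry → GetWeather discussed in the results.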
Table 60.1 Weight matrix of a composition plan
Composition plan (col1) | MFWSDL of i to i+1 (col2) | MFWSDL of i to i+1 (col3) | Total Weight (col2+col3)
Operation i | Partial 0.3 × 0.3 = 0.09 | Exact 0.5 × 0.2 = 0.1 | 0.19
Operation++i | Synonym 0.2 × 0.3 = 0.06 | Exact 0.5 × 0.3 = 0.15 | 0.21
Operation++i+n−i | Synonym 0.2 × 0.3 = 0.06 | Fail 0.2 × 0.0 = 0 | 0.06
60.4 Selection of Suitable Integration Plan
When the above process generates more than one composition plan, the plans are passed through the WSDL weight matrix. Each operation of a composition plan is evaluated against the columns of the matrix. A composition plan consists of more than one Web service operation, and an element of the ith operation of a plan is compared with an element of the (i + 1)th operation of the same plan. The process is summarized in Table 60.1. The weights are normalized and assigned to the columns of the WSDL weight matrix: the first “i to i+1” column (col2) is assigned the weight of element WEL = 0.3, and the second “i to i+1” column (col3) is assigned WEL = 0.5. The matching factors exact, partial, synonym and fail are assigned the values 0.5, 0.3, 0.2 and 0, respectively. The element of the ith operation is referred to as “i” and the name element of the (i + 1)th operation as “i+1”. Element i is compared with element i+1: if it matches exactly, the matching factor MF is 0.5; if it matches partially, MF is 0.3; if it matches as a synonym, MF is 0.2; and if it does not match (fail), MF is 0. The total weights of the different composition plans of a request are summed and compared, and the plans are stored in order of priority, highest weight first. The user can thus choose the highest-scoring composition plan. This amounts to mining the composition plans with respect to their WSDL elements before the operations of the Web services are invoked.
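The scoring rule can be sketched in code as follows. The matching factors and the two element weights are taken from the text above; the function names, the plan names and the match labels in the example are hypothetical, and the sketch applies the stated weights 0.3 and 0.5 uniformly, so its totals need not coincide with every entry of Table 60.1.

```python
# Matching factors (MF) and element weights (WEL) as stated in the text.
MATCH_FACTOR = {"exact": 0.5, "partial": 0.3, "synonym": 0.2, "fail": 0.0}
ELEMENT_WEIGHT = (0.3, 0.5)  # weights of the two compared columns (col2, col3)


def step_weight(match_col2, match_col3):
    """Weight of one operation: MF x WEL for col2 plus MF x WEL for col3."""
    return (MATCH_FACTOR[match_col2] * ELEMENT_WEIGHT[0]
            + MATCH_FACTOR[match_col3] * ELEMENT_WEIGHT[1])


def plan_weight(steps):
    """Total weight of a composition plan given (col2 match, col3 match) per operation."""
    return sum(step_weight(m2, m3) for m2, m3 in steps)


def rank_plans(plans):
    """Return plan names ordered from highest to lowest total weight."""
    return sorted(plans, key=lambda name: plan_weight(plans[name]), reverse=True)


# Hypothetical match results for two candidate plans.
candidate_plans = {
    "plan_a": [("synonym", "fail")],
    "plan_b": [("partial", "exact")],
}
ranking = rank_plans(candidate_plans)
```

Keeping the factors and weights in two small tables makes it easy to experiment with other normalizations without touching the ranking logic.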
60.5 Results
The outcome of the proposed system is a set of integrated Web services based on the user requirement, together with a mechanism for selecting a suitable integration plan from several similar integration plans.
http://webservicex.net/globalweather.asmx?wsdl
http://www.webservicex.net/WeatherForecast.asmx?WSDL
http://www.ejse.com/WeatherService/Service.asmx?WSDL
http://www.tempe.gov/wx/Default.asmx?WSDL
http://graphical.weather.gov/xml/SOAP_server/ndfdXMLserver.php?wsdl
http://wsf.cdyne.com/WeatherWS/Weather.asmx?wsdl
Available URLs: 6
URL Required: 4
Fig. 60.1 Retrieved WSDL links for the request currency? WSDL
60.5.1 Links of Online Available Web Services for Different User Requests
When the request given by the user is “weather”, the system retrieves the WSDL links listed in Fig. 60.1 [4]. The snapshot of the result in Fig. 60.1 shows the extracted operation names of the retrieved WSDL links after WSDL processing.
60.5.2 Results of currency?WSDL After WSDL Processing
Figure 60.2 shows the operation elements of the “currency” Web service and the output of executing the “ConvertionRate” operation. Figure 60.3 shows the support and confidence values of the WSDL links returned by the online search engine for different requests [4]. The existence of the Web service is justified by the 67% support value of the search for the “weather” service in any element of the WSDL, and a 40% confidence value is obtained for the operation elements of the WSDL. For requests such as “Zipcode”, “Temperature” and “Currency”, the confidence and support values of the search results are shown in Fig. 60.3. Equations 60.3 and 60.4 give the support and confidence of the Web service search result.
Fig. 60.2 Execution of currency Web Service
[Bar chart “Support and Confidence of Search Methods”: series Support and Confidence; categories Weather, Temperature, ZipCode, Currency; vertical axis 0% to 100%]
Fig. 60.3 Confidence and Support of dynamic search result
60.5.3 Support and Confidence of Search Results
\[ \text{support } s(t_1) = \frac{|S_{t_1}|}{|S|} \tag{60.3} \]
\[ \text{confidence } c(t_1) = \frac{|S_{ot_1}|}{|S_{t_1}|} \tag{60.4} \]
Here |S_{t_1}| is the number of WSDL links that contain the requested functional word in any element, |S| is the number of WSDL links retrieved, and |S_{ot_1}| is the number of operations that contain the requested functional word.
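As a small worked illustration of Eqs. 60.3 and 60.4, the two ratios can be computed as below. The function and argument names are hypothetical. The counts 4 and 6 follow the “URL Required” and “Available URLs” figures of Fig. 60.1 and appear to correspond to the 67% support reported for “weather”; the counts used for confidence are invented purely to show the calculation.

```python
def support(links_with_word, links_retrieved):
    """Eq. 60.3: share of retrieved WSDL links that contain the functional word."""
    if links_retrieved <= 0:
        raise ValueError("at least one retrieved link is required")
    return links_with_word / links_retrieved


def confidence(operations_with_word, links_with_word):
    """Eq. 60.4: operations containing the word relative to links containing it."""
    if links_with_word <= 0:
        raise ValueError("at least one matching link is required")
    return operations_with_word / links_with_word


weather_support = support(4, 6)        # counts read from Fig. 60.1
example_confidence = confidence(2, 5)  # hypothetical counts, for illustration only
```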
60.5.4 Multiple Composition Plans
Figure 60.4 shows the result for composable operations [25]. In this result, the request for GetWeather requires “cityName” and “countryName” as inputs. When the cityName is unknown to the user, the proposed system searches for an operation that gives “cityName” as its output. “GetCitiesByCountry” is one operation that matches this unknown input; it is found in the same cluster of operations through intra-cluster search. The calculated support of 40% is the ratio of the number of similar operations retrieved to the number of similar operations required. Another composable operation that matches the user request is “GetWeatherByZipcode”. Its input, “zipcode”, is unknown to the user and has to be found as the output of another operation. Because there is no such operation in the same cluster, this unknown input is searched for through inter-cluster search. The inter-cluster search over the Bing search result gives “CityStateToZipcode” as the composable operation. The support of this composable operation is 60%, which indicates that only 60% of the retrieved operations are required operations. Similarly, the composable operations GetInfoByState → GetHumidity score 40% support. The support of a composition plan is calculated by Eq. 60.5.
\[ S = \frac{\text{Count of Web services participating in composition}}{\text{Number of available WSDL links}} \tag{60.5} \]
[Bar chart “Composable Web services for "weather" request”: bars CityStatetoZipCode → GetWeatherByZipCode, GetCitiesByCountry → GetWeather, GetInfoByState → GetHumidity; series Support; vertical axis 0% to 70%]
Fig. 60.4 Support of multiple composition plans
The composition plans generated for the “weather” request are therefore [25]:
• CityStateToZipcode → GetWeatherByZipcode
• GetCitiesByCountry → GetWeather
• GetInfoByState → GetHumidity
Considering the support and the WSDL weight matrix of a composition plan (Table 60.1), the selected composition plan is GetCitiesByCountry → GetWeather.
60.6 Conclusions
The unavailability of UDDI and the analysis of the literature justify the need for dynamic discovery of Web services, which is implemented in this research. Authors described an approach for the “automatic showing of Web service descriptions and their further enhancement based on the calculation of semantic similarity measures” [26]. The proposed research, however, does not use semantic similarity to enrich service descriptions; instead, it uses integration of Web services to interpret the names of unknown input parameters. The authors of [27] focused on Artificial Intelligence (AI) planning, which uses semantic Web services with OWL for the composition plan. In this system syntactic web
services are converted into semantic Web services using a tool such as Protégé. The proposed research, in contrast, is implemented with the open-source Eclipse software and does not depend on proprietary software, which is a further advantage of this system. The authors of [28] used a data staging tool to share data across networks, and the authors of [29] designed an algorithm that increases network performance by considering the minimum route advertisement interval time. The signatures of remote Web services are not known to the user in advance. Dynamic invocation of Web services is therefore a research issue, which is resolved by the proposed system. Mining and maintaining a repository of composition plans is future work. As further future work, deadlock will be detected by creating the graph of the operations visited during integration and detecting cycles in that graph.
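The cycle-based deadlock check mentioned as future work could take a form like the sketch below. It is not part of the reported system: the graph representation (a dictionary from an operation name to the operations it invokes) and the operation names are assumptions, and the check is an ordinary depth-first search that reports the first cycle it meets.

```python
def find_cycle(graph):
    """Return one cycle as a list of nodes (first node repeated at the end), or None."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    color = {}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for nxt in graph.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:      # back edge: nxt is already on the current path
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in list(graph):
        if color.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# Hypothetical invocation graphs built while integrating operations.
acyclic = {"GetCitiesByCountry": ["GetWeather"], "GetWeather": []}
cyclic = {"OpA": ["OpB"], "OpB": ["OpC"], "OpC": ["OpA"]}
```

A plan whose invocation graph yields a cycle would be flagged as a potential deadlock before any of its operations is invoked.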
References
1. Web Services Tutorial (tutorialspoint.com). Last accessed 2019/12/11
2. Sumathi, P., Chiplunkar, N.N., Ashok Kumar, A.: Dynamic discovery of web services. IJITCS 6(10), 56–62 (2014). ISSN: 2074-9015, Mecs Publisher. https://doi.org/10.5815/ijitcs.2014.10.08
3. Sumathi, P., Chiplunkar, N.N.: Survey on discovery of web services. Indian J. Sci. Tech. 11(16), 1–10 (2018). Print ISSN: 0974–6846, Online ISSN: 0974–5645. https://doi.org/10.17485/ijst/2018/v11i16/120397
4. Sumathi, P., Chiplunkar, N.N.: Populating parameters of web services by automatic composition using search precision and WSDL weight matrix. IJCSE 13(7), 1742–7193. InderScience Publishers–Scopus indexed (2018). https://doi.org/10.1504/IJCSE.2016.10007953
5. Sumathi, P., Chiplunkar, N.N.: Open source APIs for processing the XML result of web services. In: International Conference on Advances in Computing, Communications and Informatics, MIT, Manipal, pp. 1848–1854 (2018)
6. Sachan, D., Dixit, S.K., Kumar, S.: A system for web service selection based on QoS. In: International Conference on Information Systems and Computer Networks, IIT Roorkee, pp. 139–144 (2013)
7. Abbassi, I., Graiet, M., Hamel, L., Jaoua, Z.: An event-B driven approach for ensuring reliable and flexible service composition. Int. J. Services Comp. 2(1), 45–57. ISSN 2330-4472 (2014)
8. Joseph Manoj, R., Chandrasekar, A.: A literature review on trust management in web services access control. Int. J. Web Service Comp. (IJWSC) 4(3), September, 1–18 (2013)
9. Wang, L., Shen, J., Yong, J.: A survey on bio-inspired algorithms for web service composition. In: Proceedings of the 2012 IEEE 16th International Conference on Computer Supported Cooperative Work in Design (CSCWD), Wuhan, pp. 569–574 (2012)
10. Su, S., Zhang, C., Chen, J.: An improved genetic algorithm for web services selection. In: Indulska, J., Raymond, K. (eds.) Distributed applications and interoperable systems. DAIS 2007. Lecture Notes in Computer Science, vol. 4531. Springer, Berlin, Heidelberg (2007)
11. McGovern, J., Tyagi, S., Stevens, M., Matthew, S.: Java web services architecture. Morgan Kaufmann, First Edition (2003)
12. Fan, G., Yu, H., Chen, L., Liu, D.: Aspect orientation based test case selection strategy for service composition. In: Proceeding of IEEE International Symposium on Theoretical Aspects of Software Engineering, Birmingham, United Kingdom, pp. 95–104 (2013)
13. Rathore, M., Suman, U.: Evaluating QoS parameters for ranking web service. In: 3rd IEEE International Advance Computing Conference (IACC), UP, India, pp. 1437–1442 (2013)
14. Ilieva, S., Pavlov, V., Manova, I., Manova, D.: A framework for design-time testing of service-based applications at Bpel level. Serdica J. Computing 5, 367–384 (2011)
15. Hatzi, D.V., Nikolaidou, M., Bassiliades, N., Anagnostopoulos, D., Vlahavas, I.: An integrated approach to automated semantic web service composition through planning. IEEE Trans Services Comp 5(3), 319–332 (2012)
16. Jaeger, M.C., Muthl, G.: QoS-based selection of services: The implementation of a genetic algorithm. KiVS 2007, pp. 350–359. Workshops, Bern, Switzerland, VDE Verlag (2007)
17. Lécué, F., Léger, A.: A formal model for semantic web service composition. In: Cruz, I., et al. (eds.) The semantic web-ISWC 2006. Lecture Notes in Computer Science, vol. 4273, pp. 385–398. Springer, Berlin, Heidelberg (2006)
18. Shang, J., Liu, L., Wu, C.: WSCN: Web service composition based on complex networks. In: Proceeding of IEEE International Conference on Service Sciences (ICSS), Shenzhen, pp. 208–213 (2013)
19. Cavalcanti, D.J.M., Souza, F.N., Rosa, N.S.: Adaptive and dynamic quality-aware service selection. In: Proceeding of IEEE 21st Euromicro International Conference on Parallel, Distributed and Network-Based Processing, Belfast, pp. 323–327 (2013)
20. Tang, R., Zou, Y.: An approach for mining web service composition patterns from execution logs. In: 2010 IEEE International Conference on Web Services, Miami, FL, pp. 678–679 (2010)
21. Kondratyeva, O., Kushik, N., et al.: Using finite state models for quality evaluation at web service development steps. Int. J. Serv. Comp. 1(1), 1–14. ISSN 2330–4472 (2014)
22. Zahoor, E., Munir, K., Perrin, O., Godart, C.: An event based approach for declarative, integrated and self-healing web services composition. Int. J. Services Comp. 1(1), 13–24 (2013)
23. Sumathi, N. N.C.: Necessity of dynamic composition plan for web services. In: Proceedings of 2015 International Conference on Applied and Theoretical Computing and Communication Technology (iCATccT), Davangere, pp. 737–742 (2015)
24. Gopal, N.R., Gangadharan, G.R., Padmanabhan, V.: Algebraic modeling and verification of web service composition. In: Proceeding of Science Direct-6th International Conference on Ambient Systems Networks and Technologies, London, UK, pp. 675–679 (2015)
25. Sumathi, N.N.C.: Dynamic composition of web services by service invocation dynamically. IJEME, ISSN: 2305–3623 (Print), ISSN: 2305–8463, Mecs Publishers, pp. 41–50 (2017)
26. Bravo, M., Rodríguez, J., Reyes, A.: Enriching semantically web service descriptions. In: Meersman, R., et al. (eds.) On the move to meaningful internet systems. Lecture Notes in Computer Science, vol. 8841, pp. 776–792. Springer, Berlin, Heidelberg (2003)
27. Remli, M.A., Deris, S., Jamous, M.: Web services composition for concurrent plan using artificial intelligence planning. J Theoretical Appl. Info. Tech. 7(2), 228–235 (2015)
28. Arjunan, V.R., Kishore, B.: Data staging under supply planning landscape. CiiT Int. J. Data Mining Knowledge Eng. 5(5), 209–215 (2013)
29. Vadakkepalisseri, M.V.N., Chandrashekaran, K.: Border gateway routing protocol convergence time analysis with minimum route advertisement information. In: Proceeding of International Conference on Information Processing, Communications in Computer and Information Science, vol. 70. Springer, Berlin, Heidelberg (2010)
Chapter 61
SAMPANN: Automated System for Pension Disbursal in India (Case Study: BSNL VRS Pension Disbursal)
V. N. Tandon, Shankara Nand Mishra, Taranjeet Singh, R. S. Mani, Vivek Gupta, Ravi Kumar, Gargi Bhakta, A. K. Mantoo, Archana Bhusri, and Ramya Rajamanickam

Abstract India is growing steadily with respect to Information and Communication Technology (ICT). ICT has been extended not only to complex projects and micro technologies but also to touch the life of every person in the country. In today’s scenario, many successful portals have been deployed on cloud, and a number of organizations offer cloud computing as a service. Technology can be considered omnipresent only when the digital gap is bridged, so that the government is able to reach every citizen located at the last mile. This paper discusses a system deployed on cloud that helps remove intermediaries and provides direct disbursal to pensioners, and shows how cloud computing facilitates implementing a Comprehensive Pension Management System called SAMPANN (System for Accounting and Management of Pension). The objective of SAMPANN is to provide transparent end-to-end delivery of services under a single integrated platform, covering Pension Processing, Authorization, Sanctioning, and Disbursement. It also allows the pensioner to track the latest status of the pension, and it has enabled faster processing of pension cases. The challenges faced in developing an ICT platform for pension management for the Finance and Accounts Department of the Telecom Department are discussed, and the enhancement of the software to include the BSNL Voluntary Retirement Scheme 2019 pensioners is studied as a further case study. The vision is to develop automated and scalable software on cloud for pension processing, authorization, sanctioning, and disbursement, which can cater to all departments in future irrespective of differences in their functionalities.

Keywords CPMS · SAMPANN · Pension · BSNL VRS · Pensioners · e-PPO

V. N. Tandon · S. N. Mishra · T. Singh
Department of Telecommunications, Sanchar Bhawan, Ashoka Road, New Delhi 110001, India
R. S. Mani · V. Gupta · R. Kumar · G. Bhakta · A. K. Mantoo · A. Bhusri · R. Rajamanickam (B)
National Informatics Centre, A-Block, CGO Complex, Lodhi Road, New Delhi 110003, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_61
61.1 Introduction
Central government employees have been the bedrock of governance in the country. After retirement, the government provides them with social security benefits such as gratuity, commutation, and regular pension so that they can lead a comfortable life after having spent decades in public service. Payment of these retirement benefits to pensioners, who are normally senior citizens, is therefore an extremely important function of any related department.

In the traditional process of pension payments for Telecom pensioners, pension for 3.5 lakh DoT and BSNL retirees was sanctioned and authorized by the CCA (Controller of Communication Accounts) offices across the country. The pension was thereafter disbursed on a commission basis by intermediary agents (Public Sector Banks and Post Offices) to the pensioners. The amount of pension paid is to the tune of Rs. 10,500 crores per annum. However, several shortcomings were seen in the traditional process. Some of these are as follows:
• Delay in disbursement of the first pension due to the time taken in physical movement of the PPO from sanctioning authorities to banks via Central Pension Processing Cells (CPPCs) or to Post Offices.
• Delay in payment of arrears of pension after revision, and wrong disbursement of pension due to error followed by painful recovery.
• Difficulty in redressal of pensioners’ grievances, and lack of transparency and accountability, owing to the multiplicity of organizations involved.
• Non-refund of excess payment by Banks/POs to the Telecom Department, actual figures not communicated to the pensioner or auditor, and ineffective paper-based audit.
In spite of all these limitations, the commission outgo to Banks and Post Offices is approximately Rs. 35 crores per annum. Therefore, to mitigate the above, SAMPANN was conceptualized as a seamless pension processing system through integrated software, which would bring the processing, sanctioning, authorization, and payment units under a common platform. It was decided to introduce direct credit of pension to the bank accounts of pensioners, leveraging modern technology, thus realizing the goal of “Sanchar Pension, Seedha Vitaran.”
61.2 Research Methodology
The methodology followed in the paper is secondary research of published work to study the issues that have traditionally existed in pension settlement and disbursement. These issues are then compared with the performance of SAMPANN to see whether SAMPANN has been effective in solving them.
61.3 SAMPANN: System for Accounting and Management of Pension
SAMPANN is a seamless pension processing system through integrated software, which brings the processing, sanctioning, authorization, and payment units under a common platform. It was dedicated to the nation by the Honorable Prime Minister on 29th December 2018 at Varanasi. The system falls under the government’s objective of “Minimum Government, Maximum Governance,” envisaging Paperless, Cashless, and Faceless services across the country, especially in rural and remote parts of India (Fig. 61.1).
61.4 Technology Platform and Business Architecture
SAMPANN is deployed on the Government of India Cloud, Meghraj, to maintain a robust and scalable system with:
• 3 Application servers
• 2 Database servers
• 2 Reporting servers
• DR site comprising one Application server, one Database server, and one Reporting server.
Fig. 61.1 Disbursal of pension
61.5 Benefits to Pensioners
SAMPANN was introduced in December 2018 to make the pension settlement of DOT/BSNL pensioners more accountable, transparent, informative, and responsive. SAMPANN is an end-to-end solution covering the following functions: processing of pension cases, issue of e-PPO, payment of gratuity/commutation and monthly pension, and any subsequent changes to pension. Implementation of SAMPANN has accrued various benefits to the pensioners, some of which can be quantified, e.g., the cost of travel to collect PPOs and register grievances, and some of which are qualitative, e.g., transparent and informed pension settlement (Fig. 61.2).
Fig. 61.2 Benefits to pensioners

61.5.1 e-PPO
The pension payment order (PPO) is the most important document for any pensioner. A PPO contains the personal details, pension settlement information, family pensioner details, etc., of the pensioner. It also acts as the reference document for the pensioner and the PAO for any pension revision, conversion, transfer, restoration, and grievance settlement. The upkeep and record keeping of such a PPO is a herculean task for the pensioner and his/her family. PPOs are often lost by pensioners, and on the death of a pensioner the family is often unable to trace the PPO. SAMPANN offers an Electronic Pension Payment Order (e-PPO) to its pensioners, whereby a pensioner may access his/her PPO through the SAMPANN website or mobile application at any time, from anywhere.
e-PPO not only offers transparency, convenience, and safety of record keeping, but also brings spillover benefits such as cost, time, and effort saving in collection and safe keeping of physical PPOs. Advantages of e-PPO:

1. Cost, time, and effort saving in collection of PPOs
SAMPANN has processed more than 1 lakh pension cases, for which e-PPOs have been generated for the pensioners. In the absence of such a solution, the pensioners would have had to collect these PPOs physically from the department’s field units. Such collection would require pensioners to travel to the office premises, which would involve cost and time. In addition, since most pensioners have health issues and require assistance to travel, physical collection creates additional hardship, which is addressed through the issuance of e-PPO.
• Total number of pension cases settled by SAMPANN: 101,299
• Amount saved by the pensioners on collection of PPOs: Rs. 1,51,94,850/-
• This presumes a very conservative estimate of Rs. 150/- as the cost of travel for a pensioner to collect the PPO.

2. Protection from health risks
Pensioners belong to one of the most vulnerable groups with respect to health risks. In the situation prevailing due to COVID-19, physical travel for the collection of PPOs would involve exposure to health risk; e-PPO therefore provided an additional safeguard for the health of DOT/BSNL pensioners.

3. Accessibility and Reliability of information
e-PPO is accessible from anywhere and at any time, at the convenience of the pensioners. It therefore eliminates the need to maintain a physical copy. Any pensioner may print the e-PPO multiple times, as needed. Further, the information given on the e-PPO is not subject to physical rectification or modification, which makes it more reliable than its physical counterpart. It thus gives pensioners peace of mind regarding the upkeep of their PPO.
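The saving quoted under the first advantage above follows directly from the two figures given there. The short Python sketch below reproduces the arithmetic; the helper `format_indian` is a hypothetical name introduced here only to print the result in the lakh/crore digit grouping used in the text.

```python
# Figures quoted in the text for e-PPO collection savings.
cases_settled = 101_299        # pension cases settled through SAMPANN
travel_cost_per_case = 150     # Rs., the conservative travel estimate per pensioner


def format_indian(amount: int) -> str:
    """Group digits in the Indian style: last three digits, then pairs."""
    s = str(amount)
    if len(s) <= 3:
        return s
    head, tail = s[:-3], s[-3:]
    groups = []
    while len(head) > 2:
        groups.insert(0, head[-2:])
        head = head[:-2]
    if head:
        groups.insert(0, head)
    return ",".join(groups + [tail])


total_saved = cases_settled * travel_cost_per_case
print(f"Rs. {format_indian(total_saved)}/-")  # prints "Rs. 1,51,94,850/-"
```

The product, 15,194,850, matches the amount of Rs. 1,51,94,850/- stated in the list.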
61.6 Online Grievance Redressal
SAMPANN offers online registration and settlement of pensioners’ grievances through the SAMPANN website and mobile application. Pensioners can therefore not only lodge their grievances but also monitor, track, and get them resolved, all through the SAMPANN application from the comfort of their home. This functionality acts as a single-window grievance redressal system, which eliminates the need to run between sections and offices for the resolution of pension grievances. Once a grievance is logged in the system, it becomes the responsibility of the office to resolve it, irrespective of the section involved in its resolution.
61.6.1 Grievances Settled Through SAMPANN
Offices of the Controller of Communication Accounts (CCA) across India have been responsive in resolving the grievances of pensioners as and when received. The following chart shows the grievances resolved by various field offices across India. As of June 2021, a total of 8000 grievances had been resolved online through the SAMPANN application. In its absence, the pensioners would have had to come to the office physically or send their application by post to get it resolved, both of which are not only time consuming but also pose a huge financial burden on the pensioners.
61.7 Pensioner’s Dashboard
The following details are readily available in the pensioner dashboard on the SAMPANN website/app. These details are immensely useful to pensioners, who can log in and check any details about their pension anytime and anywhere.
1. Facility to track status of pension case
2. e-PPO, Commutation, and Gratuity sanction
3. Life certificate expiry date
4. Commutation restoration date
5. View/Download revision order issued
6. Details of monthly pension paid
7. Raise grievance and monitor status of grievance
8. Update of email, mobile number, and contact address.
61.7.1 Informative Pension Settlement and Disbursal
Pensioners often have difficulty finding out the stage of their pension settlement and the components of their disbursement. Such information is crucial for their personal finance management, self-assessment of pension, and tax management. They were often dependent on the helplines offered by their pension disbursing offices, or on frequent visits to these offices, for the resolution of such queries. SAMPANN provides a one-stop solution for all such information. Each pensioner is provided with a personal dashboard, which he/she can access to get information not only about the pension settlement but also about the disbursement. SAMPANN has been integrated with Jeevan Pramaan to ensure direct updating of Life Certificate details, so that payment of pension to pensioners is uninterrupted.
61.7.2 Timely SMS Alerts to Pensioners
1. On step-wise progress of pension case settlement, an SMS is sent to the pensioner providing details of the pension settlement stage. The same can be viewed from the Pensioner’s dashboard on SAMPANN.
2. On monthly credit of pension, an SMS alert providing details of the various components of the pension paid is sent to the pensioner.
3. On update of the life certificate, an SMS alert is also sent so that the pensioner is aware that his/her life certificate has been updated.
4. The Department of Pension and Pensioners’ Welfare, as per its guidelines, has requested pension disbursing agencies to provide the breakup of monthly pension to the pensioners. The banks were impressed upon to undertake this welfare measure, as this information is required by pensioners in connection with Income Tax, Dearness Relief payment, DR arrears, etc. This is already implemented in SAMPANN, which has been providing the breakup of the pension disbursement to its pensioners, both in the form of SMS and in the form of a ledger in the pensioner login.
61.7.3 Income Tax-Related Facilities to Pensioners
1. A facility is available on both the SAMPANN website and app for pensioners to submit income tax declaration details and investment proof online; these are verified and taken into account in the tax calculation.
2. Form 16 is made available to the pensioners in their dashboard for downloading.
61.8 Benefits to CCA Offices
SAMPANN has been instrumental in replacing the archaic process of pension settlement and disbursement. In addition to saving cost, time, and effort in pension processing, it has introduced digitization in pension processing. Such digitization is not limited to automation of activities; it has also transformed the processing of pension cases (Fig. 61.3).
61.9 Cost Saving
The use of SAMPANN has saved the Department of Telecommunications a huge direct cost in the settlement and disbursement of pension payments. Since pension settlement is a continuous process and pension disbursement recurs monthly, the savings in the cost of processing the cases accumulate with time. What SAMPANN offers is thus not a one-time cost saving but a cumulative saving of government expenditure on pension payment with every passing month.
61.10 Disbursement of Pension
Pensioners take their monthly pension payment either through public sector banks or through post offices. In the erstwhile, non-SAMPANN setup, these agencies acted as the pension disbursement authority (PDA) and took a commission on the number of pension transactions done via them. The commission per transaction charged by the disbursement agencies is indicated below:

Disbursement agency                            Commission per transaction
Public sector banks (CPPC), up to June 2019    Rs. 65/-
Public sector banks (CPPC), from July 2019     Rs. 75/-
Post Offices                                   Rs. 80/-

Fig. 61.3 Benefits to controller of communications account
SAMPANN provides direct transfer of pension to the bank accounts of the pensioners. It thereby eliminates the intermediary disbursement offices and enables the offices of the Controller of Communication Accounts (CCA) across the country to act as the pension disbursement authority (PDA). This has resulted in enormous cost saving for the department in pension disbursement, and the benefit increases with every successive month as more pensioners are included.
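As a rough cross-check of the commission figures, the sketch below estimates the yearly commission outgo from the per-transaction rates listed above. It assumes, purely for illustration, that all 3.5 lakh pensioners mentioned in the Introduction receive one payment per month through a single channel; the actual split between banks and post offices is not given in the paper, and the function names are hypothetical.

```python
# Commission per transaction quoted in the text (Rs.).
COMMISSION_RS = {
    "bank_cppc_until_jun_2019": 65,
    "bank_cppc_from_jul_2019": 75,
    "post_office": 80,
}


def annual_commission(pensioners: int, rate_rs: int, payments_per_year: int = 12) -> int:
    """Commission paid in one year if every pensioner is paid monthly via an intermediary."""
    return pensioners * payments_per_year * rate_rs


def in_crores(amount_rs: int) -> float:
    """Convert rupees to crores (1 crore = 10 million)."""
    return amount_rs / 10_000_000


pensioners = 350_000  # "3.5 lakh DoT and BSNL retirees" from the Introduction

# Illustrative assumption: the whole pensioner base uses one channel.
for channel, rate in COMMISSION_RS.items():
    crores = in_crores(annual_commission(pensioners, rate))
    print(f"{channel}: about Rs. {crores:.1f} crores per annum")
```

Under these assumptions the estimate lies between Rs. 27.3 and 33.6 crores per annum, which is of the same order as the commission outgo of approximately Rs. 35 crores per annum stated in the Introduction.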
61.10.1 Digitization of Pension Settlement
SAMPANN offers a fully digitized form of pension settlement in which the pension case is transferred among offices and officers digitally. The cases are also reviewed and processed by the officers digitally after approvals at various levels. Notably, the cases have to be digitally signed before finalization, which fixes the accountability of the officers. At each stage of processing of the case, an alert is sent to the pensioner through SMS regarding the progress of the pension settlement, which can also be viewed on the dashboard of the pensioner. Once the pension cases are settled, SAMPANN also offers Pension Disbursement Agency (PDA) functionalities. The pension, as finalized by the office, along with all the retirement benefits, is thus transferred directly to the bank account of the pensioner, without the involvement of intermediaries. This has revolutionized the pension settlement process in the department and checked the delay in settlement and disbursal of the rightful pension payment to the pensioners.
61.10.2 Data Generation and Upkeep
SAMPANN offers a centralized data management system. At each stage of pension settlement and pension disbursement, the data generated in the processing and in the transactions performed is captured in the system. Such data is easily accessible and manageable when compared to its physical counterparts. This has also vastly improved the reliability of the data and the ability to reconcile it. Relevant reports needed by officers at various levels can also be generated from SAMPANN conveniently and instantly.
61.10.3 Reconciliation, Monitoring, and Assessment
Since the data generated through SAMPANN is digitally managed, reconciliation is faster and effortless. Various reconciliation reports can be generated to ascertain the authenticity of the payments made through SAMPANN. Further, due to the reduction of intermediaries and agencies involved in the pension payment, the need for audit of scrolls is reduced too. This has optimized the workforce engagement in the settlement of the pension cases.
61.10.4 Inter-Circle Transfer of Pension Cases
SAMPANN also intends to offer inter-circle transfer of cases. Such a transfer will be instant, compared to the physical transfer of cases in the erstwhile settlement. The need to collect the PPO back from Banks or Post Offices is dispensed with, thereby vastly improving the turnaround time involved in the transfer of cases among telecom circles.
61.10.5 Accountable Grievance Redressal
SAMPANN offers a digital grievance redressal system in which pensioners can lodge their grievances directly through the SAMPANN portal anytime and anywhere. Such grievances are registered and given a unique ID for tracking and settlement. At each stage of settlement, the pensioner can view the office handling the registered grievance and its resolution. The login ID of the officer accountable for settling the grievance is captured and can be viewed for fixing accountability. SAMPANN thus offers a robust and accountable grievance redressal mechanism to its valued pensioners.
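The mechanism described above (a unique ID per grievance, a visible trail of the office handling it, and the accountable officer's login ID) can be sketched as a small in-memory data structure. This is only an illustrative sketch, not SAMPANN's actual implementation: the class names, the `GRV-` ID format, and the field names are all hypothetical.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Grievance:
    grievance_id: str          # unique ID used for tracking
    pensioner_id: str
    description: str
    # Each entry records (office, officer_login_id, note) for accountability.
    history: list = field(default_factory=list)
    resolved: bool = False


class GrievanceRegistry:
    """Hypothetical single-window registry keyed by a unique grievance ID."""

    def __init__(self):
        self._sequence = count(1)
        self._grievances = {}

    def lodge(self, pensioner_id: str, description: str) -> str:
        # The ID format is an assumption made for this example.
        grievance_id = f"GRV-{next(self._sequence):06d}"
        self._grievances[grievance_id] = Grievance(grievance_id, pensioner_id, description)
        return grievance_id

    def update(self, grievance_id: str, office: str, officer_login_id: str,
               note: str, resolved: bool = False) -> None:
        grievance = self._grievances[grievance_id]
        grievance.history.append((office, officer_login_id, note))
        grievance.resolved = resolved

    def status(self, grievance_id: str) -> Grievance:
        return self._grievances[grievance_id]


registry = GrievanceRegistry()
gid = registry.lodge("PENSIONER-1", "Arrears not credited")
registry.update(gid, "CCA office", "officer01", "Arrears processed", resolved=True)
print(registry.status(gid))
```

Recording the officer's login ID with every update is what makes the trail usable for accountability: any stage of the settlement can be traced to a specific office and officer.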
61.10.6 Empowering Pensioners
SAMPANN allows pensioners to enter their own basic information through the SAMPANN portal. This greatly eases conflict resolution and reduces the delay in correcting minor yet necessary personal details. Pensioners have the flexibility to update their basic details themselves, when needed, at their convenience. The combination of information provided through the dashboard and edit rights for the pensioners transforms them from passive beneficiaries into active participants who are a part of SAMPANN’s improvement and evolution.
61.11 Manpower Optimization
Pension payment through intermediaries such as banks and post offices required due checking of the payment scrolls and slips. This reconciliation is not only time consuming but also involves a significant movement of records. This movement involves multiple agencies and therefore introduces red tape in reconciliation. Such reconciliation at CCA level is done by a dedicated team, Pension Vouchers Audit (PVA), using dedicated software for scroll reconciliation, the PVA software. In SAMPANN, since the CCAs act as the pension disbursing agencies, the need for transfer and audit of scrolls is eliminated. Reconciliation in SAMPANN is done at the click of a button, which even eliminates the need for a dedicated team for this purpose. It thereby saves the manpower and cost involved in the process, which can be optimally used for other tasks (Fig. 61.4).
Fig. 61.4 Comparison—before and after SAMPANN
61.12 BSNL VRS 2019: Case Study
The Government of India introduced a mass Voluntary Retirement Scheme (VRS) for BSNL as part of its revival package. The scheme has the following salient features:
1. All VRS optees will retire on 31st January 2020.
2. Gratuity will be deferred and will be paid upon the actual Date of Superannuation or in February 2025, depending on the age of the optee.
3. The option to opt for Commutation will also be deferred until the Date of Superannuation or February 2025, depending on the age of the optee.
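The deferral rule in features 2 and 3 can be sketched as a small date calculation. The paper states only that the payment date is the Date of Superannuation or February 2025 "depending on the age of the optee"; the sketch below assumes this means whichever of the two comes first, and uses 1 February 2025 as a placeholder day. Both points are assumptions made for illustration, not details confirmed by the source.

```python
from datetime import date

# Assumption: the text gives only "February 2025"; the 1st is a placeholder day.
DEFERRAL_CUTOFF = date(2025, 2, 1)


def deferred_payment_date(superannuation_date: date,
                          cutoff: date = DEFERRAL_CUTOFF) -> date:
    """Assumed rule: pay on the earlier of the superannuation date and the cutoff."""
    return min(superannuation_date, cutoff)


# An optee superannuating before the cutoff is paid on superannuation.
print(deferred_payment_date(date(2023, 6, 30)))
# An optee superannuating after the cutoff is paid at the cutoff.
print(deferred_payment_date(date(2028, 3, 31)))
```

Whatever the exact rule, the point for the system is that the payment date becomes a computed attribute of each VRS case rather than the retirement date itself, which is why the scheme required changes to SAMPANN.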
Because of these conditions, these retirees were not normal retirees, and therefore modifications had to be made to the system. The key challenges faced were:
1. The specific features of the scheme listed above, which warranted changes in the system.
2. The huge number of employees who opted for VRS. Around 77,000 employees opted for VRS, and these cases had to be settled as soon as possible.
3. Data migration for such a huge number of pensioners, as manual entry would have consumed a lot of time.
4. Manpower utilization and optimization in order to ensure faster settlement of cases.
Keeping the above in mind, a strategy was devised along with BSNL, the highlights of which were:
1. Data of all employees will be obtained directly from BSNL’s ERP system and imported into SAMPANN to save time.
2. BSNL offices will perform the role of Head of Office and submit all forms to the CCA office online via SAMPANN.
3. BSNL VRS optees will fill all the forms online only.
4. Till the cases are settled, BSNL VRS optees will be given Provisional Pension so that pensioners do not suffer.
The above strategy worked well, and the department was able to settle close to 76,000 cases within a span of 6 months, ensuring that all pensioners received their pension on time without any delay. The enormity of the work can be gauged from the fact that from January 2019 till January 2020, only around 12,000 pensioners had been on-boarded on SAMPANN, whereas with BSNL VRS 2019, 76,000 were on-boarded within 6 months. A comparison between the number of pensioners on-boarded in SAMPANN from January 2019 till January 2020 and from February 2020 till July 2020 is shown in Charts 61.1, 61.2, and 61.3.
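The scale-up described above can be expressed as a monthly on-boarding rate. The sketch below uses the approximate figures from the text and assumes that "January 2019 till January 2020" spans 13 calendar months inclusive; with 12 months the ratio would be slightly lower.

```python
# Approximate figures from the text; the 13-month span is an assumption
# (January 2019 through January 2020, both months included).
before_vrs = {"pensioners": 12_000, "months": 13}
during_vrs = {"pensioners": 76_000, "months": 6}   # February 2020 to July 2020


def monthly_rate(period: dict) -> float:
    """Average number of pensioners on-boarded per month in a period."""
    return period["pensioners"] / period["months"]


rate_before = monthly_rate(before_vrs)
rate_during = monthly_rate(during_vrs)
speed_up = rate_during / rate_before

print(f"Before VRS: about {rate_before:.0f} pensioners per month")
print(f"During VRS: about {rate_during:.0f} pensioners per month")
print(f"Increase:   about {speed_up:.1f} times")
```

Under these assumptions, the on-boarding rate rose from roughly 900 to roughly 12,700 pensioners per month, an increase of more than an order of magnitude.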
Chart 61.1 Pensioners on board in SAMPANN from January 2019 to January 2020
Chart 61.2 Pensioners on board in SAMPANN from February 2020 to July 2020
Chart 61.3 No. of BSNL VRS 2019 pensioners out of total pensioners in SAMPANN

The key success factors for implementation of the scheme were as follows:
• Scalable and flexible design of SAMPANN, which enabled accommodating the modifications
• Direct data import from BSNL’s ERP
• Online filling of forms by all retirees
• Online forwarding of forms and documents by BSNL to CCA offices
• Dedicated team effort by all CCA office personnel
• Dedicated support team at the headquarters
• Timely infrastructure modifications
• Timely monitoring by a high-level committee.

In fact, the work carried on smoothly even after the COVID-19 pandemic struck, owing to the online, cloud-based nature of SAMPANN.
61.13 Conclusion
SAMPANN has been implemented to address the challenges in providing an automated pension system using cloud. The system speaks for itself, as it not only manages pension for the Department of Telecommunications but also extends its functionalities to BSNL pensioners, whose pension calculation and disbursal differ because of VRS. The paper discusses the challenges faced and the attributes considered while designing and implementing the software. The inclusion of the BSNL VRS pensioners was also implemented in a very short time frame. So far, approximately Rs. 15,625 Cr has been disbursed since the launch of SAMPANN. The provision for each department to customize and use the System for Accounting and Management of Pension makes the system scalable. An innovation becomes an invention only when it is widely accepted and used by the users. India is a country that is growing strongly in ICT by using recent technologies and state-of-the-art techniques, in which virtualization and cloud play a major role. By using these technologies, we can move forward in reaching each citizen in every possible way; one such example is SAMPANN.
Chapter 62
FOG Computing: Recent Trends and Future Challenges Shraddha V. Thakkar and Jaykumar Dave
Abstract Cloud computing is a technology that connects multiple nodes to each other in a network. It facilitates the sharing of computational tasks and resources between clients and servers. It also operates in heterogeneous environments, but its main issue is the privacy and security of data. The data generated by IoT is also very large. If all of this data is passed to the cloud, latency will increase. Fog computing was introduced to reduce these problems of cloud computing. Fog computing does not replace cloud computing; it is a service that enhances cloud services by isolating the user data that needs to reside on nodes at the edge of the network. Fog computing extends the processing, storage, and networking services of cloud data centers to end users. Keywords Cloud computing · Cloud · Data center · Fog computing · CISCO
62.1 Introduction
CISCO coined the phrase fog computing, or fogging. Fog is the layer that resides between the cloud and the IoT device layer. IoT produces a large volume of data, not all of which the cloud can handle while meeting privacy, security, and latency requirements. Fog was introduced to resolve this issue: its main goal is to solve the problems the cloud faces during IoT data processing, and it is useful whenever IoT deals with the cloud.

All the IoT devices are located in the device layer, as shown in Fig. 62.1; the physical devices operate there, and information is transmitted to the cloud for processing and storage. Fog is similar to middleware or a middle layer in that it does some calculation, processing, and storing, at least transient storage, before sending the data sensed by these devices to the cloud. This is the whole idea behind the use of fog computing.

Fig. 62.1 Fog computing [1]

The existing cloud computing model’s ability to manage large amounts of data is insufficient to fulfil the requirements of the IoT. The main issues that arise with IoT and cloud are volume of data, latency, and bandwidth. Private factories and aviation companies produce colossal amounts of data every day, which a traditional cloud cannot handle well because it may be located anywhere on the continent. The data also needs to be filtered, which can be done by the fog computing architecture. Latency is the time a data packet takes to complete a round trip. Some data is also time-sensitive. If edge devices send time-sensitive data to the cloud for processing and then wait for the cloud to take appropriate action, it can result in a slew of undesirable outcomes. A millisecond can make a tremendous difference when dealing with time-sensitive data.

S. V. Thakkar (B)
Government Polytechnic Gandhinagar, Gujarat Technological University Ahmedabad, Ahmedabad, Gujarat, India
J. Dave
Sankalchand Patel College of Engineering, Sankalchand Patel University Visnagar, Visnagar, Gujarat, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_62
62.2 The Role of FOG Computing
Fog computing stores data, computes, and provides networking services between end devices and regular clouds. It filters data for the Cloud Data Center and performs analysis for the end device [2].
62.2.1 Fog Computing’s Role in the Internet of Things (IoT)
The amount of data emitted by IoT devices is huge, so big data analytics is needed to handle it. From the network point of view, fog plays an important role in data processing time.
Fig. 62.2 FOG’s role in IoT and datacenters [3]
The fog computing system collects and processes data generated by end devices. Fog analyzes the received data online and makes local decisions. The data collected by the IoT devices is kept on hand to monitor the situation. At the same time, the fog nodes must collect real-time data from the end nodes. Body area networks (BANs) are one example application: body sensors continuously upload a vast amount of data. The data is processed, filtered, and aggregated by the fog node. The fog node handles any emergency data uploaded by the ECG sensor as soon as possible, according to local policy. Fog can also store data locally and perform calculations as well as intelligent data analysis (Fig. 62.2).
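The fog-node behaviour described above (filtering, aggregating, and giving emergency readings priority under a local policy) can be illustrated with a minimal Python sketch. The reading format, the function name `process_batch`, and the heart-rate threshold are hypothetical choices made for this example; they are not taken from the paper or from any clinical guideline.

```python
from statistics import mean

# Hypothetical local policy: readings at or above this value are emergencies.
EMERGENCY_HEART_RATE = 150  # beats per minute, illustrative only


def process_batch(readings, threshold=EMERGENCY_HEART_RATE):
    """Filter a batch of sensor readings, separate emergencies, aggregate the rest.

    readings: list of dicts such as {"sensor": "ecg-1", "heart_rate": 72};
              a heart_rate of None marks a failed measurement.
    Returns (emergencies, summary), where summary is None if nothing routine remains.
    """
    # Filter: drop readings with no usable value.
    valid = [r for r in readings if r["heart_rate"] is not None]
    # Emergency data is separated first so it can be handled immediately.
    emergencies = [r for r in valid if r["heart_rate"] >= threshold]
    routine = [r for r in valid if r["heart_rate"] < threshold]
    # Aggregate: only a compact summary of routine data goes to the cloud.
    summary = None
    if routine:
        summary = {
            "count": len(routine),
            "mean_heart_rate": mean(r["heart_rate"] for r in routine),
        }
    return emergencies, summary


batch = [
    {"sensor": "ecg-1", "heart_rate": 72},
    {"sensor": "ecg-1", "heart_rate": None},
    {"sensor": "ecg-2", "heart_rate": 180},
    {"sensor": "ecg-1", "heart_rate": 68},
]
emergencies, summary = process_batch(batch)
print(emergencies)
print(summary)
```

In this sketch, only the emergency reading and a two-value summary leave the fog node, instead of the four raw readings, which is the kind of reduction in upstream traffic and reaction time the section attributes to fog.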
62.2.2 Fog Computing’s Role in Data Center When a device linked to the Internet produces data, it may not be necessary to synchronize the device or upload the data to the cloud, because the data duplicates what already exists [4]. In the fog model, the cloud continues to analyze data and make global decisions. Although it frequently makes more sense to localize analysis and decision-making at the edge, the cloud can also allocate a portion of its jobs to the fog node. Also, if in-depth analysis is not required at the fog, it simply filters the constrained data and sends it to the cloud [5]. The dissemination efficiency of a widely distributed sensor network is therefore important.
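As a small illustration of skipping uploads for duplicate data, the following Python sketch forwards a reading only when it differs from the last value sent. The function name `select_for_upload` and the sample values are assumptions for illustration; real gateways may use tolerances or time windows instead of exact equality.

```python
def select_for_upload(readings, last_sent=None):
    """Return only the readings that differ from the previously sent value."""
    to_upload = []
    for value in readings:
        if value != last_sent:
            # New information: forward it and remember what was sent.
            to_upload.append(value)
            last_sent = value
    return to_upload


# Six temperature samples, of which only the changes need to reach the cloud.
uploads = select_for_upload([21.5, 21.5, 21.5, 22.0, 22.0, 21.5])
```

Here three of the six samples are suppressed at the fog node, which reduces the traffic sent to the data center.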
S. V. Thakkar and J. Dave
Fig. 62.3 FOG architecture [6]
62.3 The FOG Implementation Fog computing is based on an architecture of edge servers that provide limited computing and storage, in addition to network services, in a distributed fashion between end-user devices and cloud data centers. Fog computing provides end-user devices with logical intelligence as well as data filtering for data centers, with the main goal of achieving low latency in latency-sensitive Internet of Things (IoT) applications, particularly in health care. Figure 62.3 depicts a fog computing architecture. As shown in the architecture, the fog lies between the end devices and the cloud, and each smart node is connected to one of the fog nodes. Fog nodes are low-performance devices installed at the edge of the network that perform activities such as storage, network resource scheduling, and management of distributed processing. In this work, these devices are referred to as “Fogs.” Fog nodes are diverse, ranging from devices such as mobile phones, smart watches, and sensors to high-end servers, edge routers, access points, and set-top boxes [2].
62.4 Securing Clouds with FOG Traditional security approaches have been shown to fail for a variety of reasons, such as faulty implementations, flawed code, misconfigured services, insider attacks, and creative new attacks. Building a safe and dependable cloud computing infrastructure is not sufficient on its own, since attacks on data may still occur, and once information is lost there is no way to recover it. Solutions should therefore be considered for the case in which an incident does occur. The core idea is that by limiting the value of stolen data to the attacker, one can reduce the harm caused by the theft. One can get there by taking “preventive” actions. By implementing such additional security features, cloud services can be secured.
62.5 FOG Computing Application Domains According to Cisco, there are several areas where fog computing plays an essential role [7].
62.5.1 Smart Grids A smart grid is an application that takes advantage of fog computing. Depending on the energy requirement, smart devices can also switch to renewable energy sources such as wind and solar. The edge processes the data collected by the fog collectors and issues directives to the recipients. The data generated is consumed locally before being passed on to a higher tier for visualization, real-time reporting, and transaction analysis. For the highest tier, fog computing provides temporary storage.
62.5.2 Connected Vehicle Self-driving vehicles are a new trend that is gaining traction on the road. Tesla vehicles are being used to enhance automatic control and enable “hands-free” operation. This feature is being used to test and deliver self-parking technologies that do not require a person behind the wheel. Because fog computing also allows real-time interaction, new cars on the road will be able to connect to surrounding cars and to the Internet in the near future. Cars, traffic signals, and access points can interact with each other via fog computing, allowing connected cars to help save lives in car accidents.
62.5.3 Smart Traffic Lights By sensing the flashing lights of an ambulance, fog computing helps traffic signals open lanes. It detects pedestrians and nearby vehicles and measures their speed and distance. Sensor-driven lights turn on and off when movement is detected. Fog devices use smart lights to synchronize and send warning signals to approaching vehicles. 3G, Wi-Fi, roadside units, and smart traffic lights are used to connect the car to the access point.
62.6 FOG Computing Characteristics The following are the main characteristics of fog computing:
– Geographical distribution: Fog computing allows connected vehicles and access points to communicate more effectively with a nearby location [7].
– Edge location awareness and low latency: Fog computing serves endpoints at the edge of the network, which brings location awareness and reduced latency.
– Real-time interaction: To provide quick service, fog computing relies on real-time interaction.
– Mobility: Fog devices support mobility by using the Cisco Locator/ID Separation Protocol (LISP), which decouples the identity of the host from the identity of its location.
– Node heterogeneity: Because fog nodes are heterogeneous, they are used in a wide range of scenarios.
– Interoperability of fog components: Fog components can communicate with one another to provide a large number of services, such as cascading.
– Cloud support for online analytics and collaboration: The fog is critical for data absorption and processing (Table 62.1).
62.7 Literature Survey
62.8 Conclusion The vision and important elements of fog computing are discussed in this study. The fog is a consolidated architecture that enables the new generation of latency-sensitive IoT applications by executing data center tasks at the edge of the network. Network congestion and latency can be addressed via fog computing. Fog computing also provides a user-friendly platform for dealing with the distributed and real-time nature of the growing Internet of Things (IoT) infrastructure. Using fog computing to implement services at the edge will open up new commercial opportunities.
62.9 Future Work Future research will focus on local storage and computing solutions, as well as fog computing implementation in vehicular networks and other areas.
Table 62.1 Literature survey
No. | Source | Problem defined | Approach used | Results
1 | Fog computing with P2P (Peer-to-Peer): enhancing fog computing with bandwidth IoT scenarios [8] | To improve network functionality, this study divides a peer-to-peer network architecture into different layers: client, fog, and cloud | The fog architecture improves fog computing by introducing a peer-to-peer (P2P) technique between fog layers, allowing fog nodes to collaborate and thereby meet client needs | The system prioritizes requests that satisfy the majority of the users in the vicinity. The proposed model improves fog computing bandwidth
2 | Fog-Assisted IoT-Enabled Patient Monitoring in Smart Homes [8] | This study demonstrates a fog-based IoT health monitoring approach | The proposed architecture is a five-layer solution for several types of health surveillance: (1) data acquisition, (2) event classification, (3) data mining, (4) decision making, (5) cloud storage | Fog computing based on IoT is more efficient at passing sensitive patient data to the end user, allowing for more reliable monitoring
3 | From cloud computing to fog computing: Platforms for Internet of Things [9] | The focus of this study is on end-device flexibility and edge computing | Fog computing, as opposed to centralized computing nodes, pushes computing power, applications, and data to boundary networks | Real-time interactions improve data processing efficiency and reduce connection delay
4 | IoT-Fog-based Healthcare Framework to identify and control Hypertension [10] | The use of an algorithm and sensors to detect and control hypertension attacks is discussed in this work | They employ a sensor to determine location, together with diet and heart rate monitoring, to show the application of an artificial neural network algorithm. It covers: stage classification; temporary data granulation; hypertension attack risk assessment; alert generation | Using an IoT-fog-based health monitoring technique, blood pressure and other health metrics are continually monitored to identify the stage of hypertension
5 | Fog-Assisted IoT: A Smart Fog Gateway for End-to-End Analytics in Wearable Internet of Things [11] | This research provides an end-to-end approach to generating smart analytics of wearable data, which depends upon data and intelligence | Smart Gloves feature a flex sensor from Spectra Symbol | The suggested device monitors hand motion and corrects diseases such as Parkinson’s disease by feeling it
6 | Fog computing in Healthcare Internet of Things: A Case study on ECG Feature Extraction [12] | This study illustrates the disadvantages of health monitoring systems, such as remote supervision and diagnosis portability, which cause patients and clinicians more inconvenience | (a) Medical sensor node, (b) fog computing service, (c) embedded operating system | This study proposes a system that provides location awareness and a graphical user interface with access management, as well as a method for extracting ECG properties
7 | Health monitoring and tracking system for soldiers using Internet of Things [13] | The research concerns tracking a soldier on the border who is suffering from an injury; if information is not provided in time, the injury could result in permanent disability or death | An algorithm is used in this research to determine the correct position of the injured soldier | A sensor is placed on the soldier’s body to keep track of his or her health and whereabouts with the use of the Global Positioning System
8 | IoT-based Emergency Health Monitoring System [14] | Health monitoring of patients was always interrupted; therefore, the patient was continuously monitored | (i) Photoplethysmography principle, (ii) algorithms for coding | Using a wireless application, the danger of infection is minimized, mobility is improved, and patients may be monitored at the same time
9 | A Secure and Energy Efficient Resource Allocation Scheme for Wireless Body Area Network [15] | The study discusses how smart health care may be made more efficient | The authors employed a star topology and a certificateless signcryption algorithm | The scheme is applicable. The sensors used are energy efficient, and packets are queued in the queue buffer and transferred to the hub
References
1. https://internetofthingsagenda.techtarget.com/definition/fog-computing-fogging
2. Shi, Y., Ding, G., Wang, H., Roman H.E., Lu, S.: The fog computing service for healthcare. 2015 2nd International Symposium on Future Information and Communication Technologies for Ubiquitous HealthCare (Ubi-HealthTech), pp. 1–5 (2015). https://doi.org/10.1109/Ubi-HealthTech.2015.7203325
3. https://medium.com/dataseries/a-primer-on-edge-computing-3ef550c3d84e
4. Aazam, M., Hung, P., Huh, E.: Smart gateway based communication for cloud of things. IEEE, 1–6 (2014)
5. Fog Computing—clearly the way forward for IoT [Online]. Available: http://blog.opengear.com/fog-computing-clearly-the-way-forward-foriot
6. https://www.omnisci.com/technical-glossary/fog-computing
7. Bhardwaj, S., Tomer, S.: Fog computing: A survey. Int. J. Eng. Res. Tech. (IJERT). ISSN: 2278–0181 Published by, www.ijert.org ICADEMS (2017)
8. Verma, P., Sood, S.: Fog assisted-IoT enabled patient health monitoring in smart homes. IEEE Internet of Things J., 1 (2018). https://doi.org/10.1109/JIOT.2018.2803201
9. Ahuja, S.P., Deval, N.: From cloud computing to fog computing. Int. J. Fog Comp. 1(1), 1–14 (2018). https://doi.org/10.4018/ijfc.2018010101
10. Sood, S., Mahajan, I.: IoT-fog-based healthcare framework to identify and control hypertension attack. IEEE Internet of Things J. 1. https://doi.org/10.1109/JIOT.2018.2871630
11. Constant, N., Borthakur, D., Abtahi, M., Dubey, H., Mankodiya, K.: Fog assisted wIoT: A smart fog gateway for end-to-end analytics in wearable Internet of Things. ArXiv, abs/1701.08680
12. Gia, N., Tuan, Jiang, M., Rahmani, A.M., Westerlund, T., Liljeberg, P., Tenhunen, H.: Fog computing in healthcare internet-of-things: a case study on ECG feature extraction (2015). https://doi.org/10.1109/CIT/IUCC/DASC/PICOM.2015.51
13. Patil, N., Iyer, B.: Health monitoring and tracking system for soldiers using Internet of Things (IoT). 2017 International Conference on Computing, Communication and Automation (ICCCA), Greater Noida, pp. 1347–1352 (2017). https://doi.org/10.1109/CCAA.2017.8230007
14. Ruman, M.R., Barua, A., Rahman, W., Jahan, K.R., Jamil Roni, M., Rahman, M.F.: IoT based emergency health monitoring system. 2020 International Conference on Industry 4.0 Technology (I4Tech), pp. 159–162 (2020). https://doi.org/10.1109/I4Tech48345.2020.9102647
15. Abiramy, V., Smilarubavathy, G., Rangaranjan, N., Dinesh, A.: A secure and energy efficient resource allocation scheme for wireless body area network, 729–732 (2018). https://doi.org/10.1109/ISMAC.2018.8653789
Chapter 63
Study of Fake News Detection Techniques Using Machine Learning Debalina Barik, Sutirtha Kumar Guha, and Shanta Phani
Abstract In today’s world, due to the growing use of social media platforms, fake news spreads like wildfire in the online world. Social media and various other news media broadcast fake news to increase viewership and readership and to make posts trend. People are easily drawn to fake news psychologically. In this paper, we propose a new fake news detection method that uses the 1000 most frequent words in a corpus consisting of the statements of an open, freely accessible dataset named Liar (Wang in “Liar, liar pants on fire”: a new benchmark dataset for fake news detection, 2017 [18]). This paper also aims at studying different standard and basic techniques useful for text classification in the context of fake news detection. We have studied the performance of basic machine learning (ML) algorithms using basic lexicon-based features on the standard fake news detection dataset “Liar” (Wang in “Liar, liar pants on fire”: a new benchmark dataset for fake news detection, 2017 [18]). We have also studied the effect of the feature–classifier combination on a Bengali fake news detection dataset, “BanFakeNews” (Hossain et al. in BanFakeNews: a dataset for detecting fake news in Bangla, 2020 [8]).
Keywords Fake news detection · Text classification · Text mining · TF · TF-IDF · Machine learning
D. Barik · S. Phani (B) Bengal Institute of Technology, Kolkata, West Bengal, India e-mail: [email protected]
S. K. Guha Meghnad Saha Institute of Technology, Kolkata, West Bengal, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_63
63.1 Introduction In this digital era, we come across a lot of information from various news media, social networking sites, etc. Not all the news that we see is real; some of it is false or spurious information presented as news, which is called fake or false news. Fake news aims at ruining the goodwill of a person or entity, or
making money through deceptive publicity. Such information can be broadly classified into three categories: (i) Misinformation: false information communicated publicly without doing any harm to others. (ii) Disinformation: false information created and shared by people with the objective of doing some harm to others. (iii) Malinformation: deliberately changing the context and content of a piece of information with the intent of causing harm to others. We come across all three types of information in our daily life, from different news media, of which social media has become the easiest means of sharing information worldwide. Information spreads on social media rapidly and at very low cost. This has steered people to look for news on social media nowadays. Social media enables the extensive spread of “fake news”, which is false, misleading and low-quality information. For these reasons, identifying false news has gained tremendous importance. Many researchers have proposed effective methods to solve the problem. Some of these works are studied and detailed in Sect. 63.2. Many researchers have used machine learning (ML)-based techniques. In such solutions, the type of data plays an important role. The wide spread of fake news adversely affects individuals and society to a great extent. Fake news intentionally convinces consumers to hold false beliefs and take biased decisions. Fake news is usually manipulated by propagandists to spread political messages, communal messages or untrustworthy statements. For example, some fake news is spread to confuse people or society. People lose the ability to differentiate true news from fake news. People’s behaviour is also changing abruptly because of the wide spread of fake news worldwide. Fear, insecurity and distrust are the implicit effects seen among individuals nowadays.
In this paper, we have tried to analyse such state-of-the-art datasets using simple lexical features with ML-based algorithms. We have analysed two datasets, namely Liar [18] and BanFakeNews [8]. In each dataset, we examined the statements and their corresponding labels against each document ID. From the statements, we extracted the 1000 most frequent words and arranged them in descending order of occurrence. We calculated the term frequency (TF) and the term frequency inverse document frequency (TF-IDF) of each word against each document ID. From the TF-IDF values, we tried to predict the label for each document ID in the test set and classified the statements as fake or true. We divided the dataset into training and test sets in various ways (80% training and 20% test, 75% training and 25% test, and 70% training and 30% test) and tested the accuracy of our proposed method using basic ML algorithms. The rest of the paper is arranged as follows. Section 63.2 outlines a brief literature review on fake news detection. Section 63.3 outlines the details of the Liar [18] dataset and the classifiers that we have used in this paper. Section 63.4 describes our proposed method for fake news detection. Section 63.5 describes the performance metrics that we have used to verify our results. Section 63.6 presents the results that we have obtained with different classifiers in different scenarios, our key observations and a brief discussion of the results. Section 63.7 concludes the paper.
63.2 Related Works Distinguishing fake news from real news has become a burning research topic. Survey papers like [11, 20] give a glimpse of the fake news detection techniques used by researchers over the years. Oshikawa et al. [11] systematically reviewed and compared the task formulations, datasets and NLP solutions for fake news detection and discussed their potential and limitations. They used three datasets: Liar [18], Fever [16] and Fakenewsnet [14]. They reviewed the effect of different classification models, both non-neural and neural, on these three datasets. Castillo, Mendoza and Poblete [3] used words denoting sentiment as a text feature and concluded that sentimental tweets generally indicate more erroneous information. Vlachos and Riedel [17] first introduced fake news and fact-checking datasets. They collected 221 statements from Channel 4¹ and Politifact.com². They examined contemporary fact-checks from each website at the time of writing. For each statement, they recorded the date it was made, the speaker, the label of the verdict and the URL. They aligned the labels of the verdicts to a five-point scale: true, mostly-true, half-true, mostly false and false. Shu et al. [15] reviewed existing fake news detection approaches from a data mining perspective, including feature extraction and model construction. Conroy et al. [4] drafted a typology of methods available for further refinement and evaluation and provided a basis for designing an exhaustive fake news detection tool. Zhang et al. [19] proposed a deep diffusive network model named Gated Diffusive Unit (GDU) based on the interrelations among news articles, creators and news subjects.
They performed experiments on a PolitiFact network dataset and illustrated the performance of their proposed model in identifying fake news articles, creators and news subjects in the network. da Silva et al. [6] reviewed different methods of handling the problems associated with the detection of fake news, rumours and misinformation. They reviewed ML techniques involving composite classifiers such as neural networks. They focused on lexical analysis of the data for prediction. As a preliminary step, they also used external contextual information (e.g. topological distribution of microblogging entries, users’ profiles, social media metrics, etc.) to improve their classification results. Several researchers have introduced and used standard datasets for fake news detection. Wang [18] introduced a dataset for fake news detection named “Liar” and performed fake news detection on it using bidirectional long short-term memory networks (Bi-LSTM). Hossain et al. [8] introduced the first labelled dataset on Bangla fake news, named “BanFakeNews”, and evaluated it with linear classifiers and neural network models. They showed that linear classifiers with conventional linguistic features outperform the neural network-based models.
1 http://blogs.channel4.com/factcheck/
2 http://www.politifact.com/
Table 63.1 Features of the Liar [18] dataset
Serial No. of columns | Fields
1 | The ID of the statement ([ID].json)
2 | The label
3 | The statement
4 | The subject(s)
5 | The speaker
6 | The speaker’s job title
7 | The state info
8 | The party affiliation
9–13 | The total credit history count, including the current statement
14 | The context (venue/location of the speech or statement)
63.3 Fake News Detection In this paper, we have tried to analyse two standard fake news datasets, Liar [18] and a Bengali dataset, BanFakeNews [8] using basic lexical features and ML classification algorithms. The main objective of this paper is to study the performance of these feature–classifier combinations for the dataset Liar [18] and then verify whether we get the same performance over the dataset BanFakeNews [8].
63.3.1 Liar Dataset We have used the Liar [18] dataset available at Kaggle³. The dataset includes both fake and genuine news statements from different domains. Table 63.1 shows the fields, i.e. the features, of the “Liar” [18] dataset. It consists of fourteen features. The data is classified into six labels, namely “half-true”, “false”, “mostly-true”, “true”, “barely-true” and “pants-fire”, as given in the second field of the dataset. For our work, we have considered only three features, namely “ID”, “label” and “statement”. The Liar [18] dataset is divided into three parts, namely a training set, a validation set and a test set. The sizes of the training, validation and test sets are shown in Table 63.2. The number of rows with each label in the training, validation and test sets of Liar [18] is shown in Tables 63.3, 63.4 and 63.5, respectively.
3 http://www.kaggle.com/
Table 63.2 Liar [18] dataset statistics
Division of dataset | Size (no. of rows)
Training set size | 10,269
Validation set size | 1284
Testing set size | 1283

Table 63.3 Label statistics for training set of Liar [18] dataset
Labels | Size (no. of rows)
Half-true | 2114
False | 1995
Mostly-true | 1962
True | 1676
Barely-true | 1654
Pants-fire | 839

Table 63.4 Label statistics for valid set of Liar [18] dataset
Labels | Size (no. of rows)
False | 263
Mostly-true | 251
Half-true | 248
Barely-true | 237
True | 169
Pants-fire | 116

Table 63.5 Label statistics for test set of Liar [18] dataset
Labels | Size (no. of rows)
Half-true | 265
False | 249
Mostly-true | 241
Barely-true | 212
True | 208
Pants-fire | 92
63.3.2 Feature Extraction We have used basic lexical features, which have been found to give good results in text classification problems. We have used TF and TF-IDF in this paper. TF and TF-IDF are often used in information retrieval (IR), natural language processing (NLP) and text mining. TF denotes how frequently a word occurs in a particular document relative to the total number of words in that document. TF-IDF is a statistical measure of how important a particular word is to a particular document in a corpus.
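A minimal Python sketch of the two features follows. It assumes the common textbook definitions, with TF as the word count divided by the document length and IDF as log(N / df); the chapter does not state which variant it uses, and libraries often apply smoothed versions. The function names and the three toy documents are illustrative only.

```python
import math
from collections import Counter


def term_frequency(tokens):
    """TF of each word: occurrences divided by the number of tokens (non-empty doc)."""
    counts = Counter(tokens)
    total = len(tokens)
    return {word: n / total for word, n in counts.items()}


def inverse_document_frequency(docs):
    """IDF of each word: log(N / df), where df is the number of docs containing it."""
    n_docs = len(docs)
    df = Counter(word for doc in docs for word in set(doc))
    return {word: math.log(n_docs / d) for word, d in df.items()}


def tf_idf(docs):
    """Per-document dictionary of TF-IDF weights."""
    idf = inverse_document_frequency(docs)
    return [
        {word: tf * idf[word] for word, tf in term_frequency(doc).items()}
        for doc in docs
    ]


# Toy corpus of already tokenized statements (illustrative data).
docs = [["tax", "cut", "tax"], ["tax", "jobs"], ["jobs", "report"]]
weights = tf_idf(docs)
```

In the first toy document, "tax" has the higher TF (2/3 against 1/3 for "cut"), but "cut" receives the higher TF-IDF weight because it appears in only one of the three documents.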
63.3.3 Classifiers We have considered non-neural network models. We have tested the accuracy on our test data using supervised machine learning algorithms, namely logistic regression (LR), support vector machine (SVM), decision tree classifier (DT), random forest classifier (RF) and k-nearest neighbour classifier (KNN).
– Logistic Regression (LR): LR is an algorithm that uses the logistic function to decide which class a data point belongs to. It is fitted by minimizing a cost function.
– Support Vector Machine (SVM): This [2, 5] is a supervised ML algorithm used for classification as well as regression problems. SVM determines a hyperplane in an N-dimensional space that clearly separates the data points under consideration. The dimension of the hyperplane is determined by the number of features considered for the problem. We have used three different kernels for SVM: linear, polynomial and radial basis function (RBF).
– Decision Tree Classifier (DT): Decision trees are another type of supervised ML algorithm used for both classification and regression problems. In a decision tree, the data under consideration is repeatedly split according to definite parameters, and a true or false decision is made at each step. The splitting is based on the classification features.
– Random Forest Classifier (RF): RF builds several decision trees, each on a subset of the given dataset. The splitting in the decision trees is based on random subsets of features instead of all of them.
– k-Nearest Neighbours (KNN): KNN is a distance-based classifier, where each new data point is compared with the existing data points, i.e. its neighbours. The new data point is assigned the class that is most common among its nearest existing data points [9]. The “k” indicates the number of neighbours that are checked when a new data point has to be assigned a label. We have considered the values k = 3 and k = 5 in our paper.
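To make the distance-based voting of KNN concrete, here is a minimal pure-Python sketch. It is not the implementation used in the paper, which relies on a library; the function name `knn_predict`, the Euclidean distance, and the five toy points are assumptions for illustration.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Label x by majority vote among its k nearest training points."""
    # Pair every training point's Euclidean distance to x with its label.
    distances = sorted(
        (math.dist(row, x), label) for row, label in zip(train_X, train_y)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy 2-D feature vectors with binary labels (illustrative data).
train_X = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1), (1.2, 0.8)]
train_y = [0, 0, 1, 1, 1]

label_k3 = knn_predict(train_X, train_y, (0.2, 0.1), k=3)
label_k5 = knn_predict(train_X, train_y, (0.2, 0.1), k=5)
```

With these points the prediction is class 0 for k = 3 but class 1 for k = 5, because the larger neighbourhood takes in all three class-1 points. This is one reason to evaluate more than one value of k.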
63.4 Methodology We have combined all three sets of the Liar [18] dataset, i.e. the training, validation and test data, into a single set, shuffled it and created new training and test sets to remove any bias in the data. We have performed the following steps before training our model:
– Loaded the Liar [18] dataset as a dataframe in Python and used only the ID, Label and Statement fields as our features.
– Combined all the statements of the entire dataset into a corpus.
– Performed noise removal on the dataset: we removed punctuation, symbols, numbers and extra spaces, and then removed English stopwords using the NLTK [10] library.
– Counted the occurrences of each word in the corpus and arranged the words in descending order of count.
– From the list, took only the 1000 most frequent word unigrams.
– Built a dataframe with the ID, the 1000 words and the Label corresponding to each ID, giving 1002 columns in the dataframe.
– Calculated TF and TF-IDF for each of the 1000 words occurring in each statement against the corresponding ID field.
– Coded the six labels of the Liar [18] dataset numerically as 0–5, 0–1 or 0–2 and inserted the codes into the label field of our dataframe.
– Divided the combined dataset into training and test sets in ratios of 0.7–0.3, 0.8–0.2 and 0.75–0.25 using the Scikit-learn library [12] and analysed the accuracy of our proposed method using different ML classifiers.
– Fed all the columns except the label field into the classifiers as independent variables for training. The label field is fed as the dependent variable.
– Trained our proposed model using all the machine learning classifiers: LR, RF, linear SVM, polynomial SVM, radial SVM, decision tree, and KNN with k = 3 and k = 5.
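The preprocessing steps above can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the authors' code: the small `STOPWORDS` set stands in for the NLTK English stopword list, `shuffle_split` stands in for the Scikit-learn splitting utility, and all function names and sample statements are hypothetical.

```python
import random
import re
from collections import Counter

# Tiny stand-in stopword set (assumption); the paper uses NLTK's English list.
STOPWORDS = {"the", "a", "an", "of", "in", "is", "to", "and", "on"}


def tokenize(statement):
    """Lower-case, keep alphabetic runs only, and drop stopwords."""
    words = re.findall(r"[a-z]+", statement.lower())
    return [w for w in words if w not in STOPWORDS]


def top_words(statements, n=1000):
    """The n most frequent words of the corpus, in descending order of count."""
    counts = Counter(w for s in statements for w in tokenize(s))
    return [w for w, _ in counts.most_common(n)]


def count_vector(statement, vocabulary):
    """Occurrences of each vocabulary word in one statement."""
    counts = Counter(tokenize(statement))
    return [counts[w] for w in vocabulary]


def shuffle_split(rows, test_ratio=0.2, seed=0):
    """Shuffle the rows and split them into (train, test) lists."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = round(len(rows) * test_ratio)
    return rows[n_test:], rows[:n_test]


statements = [
    "The tax cut is a tax on jobs!",
    "Jobs report: 3 million jobs added.",
    "Taxes rose in 2010.",
]
vocab = top_words(statements, n=2)
train, test = shuffle_split(range(10), test_ratio=0.2)
```

For the full pipeline, `n` would be 1000 and each statement's count vector, weighted by TF or TF-IDF, would form one row of the feature dataframe.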
63.5 Performance Metrics Different metrics are used in this paper to evaluate the performance of the basic classifier algorithms. The metrics used by us are discussed below:
– Accuracy: Accuracy is the most commonly used metric, which represents the total percentage of accurately predicted observations. Accuracy is used to calculate the performance of a model. The following is the equation for accuracy:
\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{63.1} \]
Here, TP, TN, FP and FN stand for true positives, true negatives, false positives and false negatives, respectively. In most cases, a high accuracy value indicates that the model is good. However, since we are training a classification model, a statement that is predicted as true when it is actually false (a false positive) can have a negative impact; similarly, a statement that is predicted as false when it contains authentic data can generate trust issues. Therefore, we have also considered the following four metrics, which take the incorrectly classified observations into account:
– Precision: It is the ratio of true positives to the total predicted positives (TP plus FP), expressed as (63.2):
\[ \text{Precision} = \frac{TP}{TP + FP} \tag{63.2} \]
– Recall: The recall is the ratio of true positives to the total number of data present in the positive class, i.e. true positives plus the false negatives. The equation for recall is shown below:
\[ \text{Recall} = \frac{TP}{TP + FN} \tag{63.3} \]
– F1-score: F1-score is the combination of precision and recall. It is the harmonic mean of precision and recall. F1-score is calculated using the following formula:
\[ \text{F1-score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{63.4} \]
– Root Mean Square Error (RMSE): RMSE, also known as root-mean-square deviation (RMSD), is commonly used to measure the differences between the values predicted by a model and the values actually observed. RMSE is always non-negative, and a value of zero indicates a perfect fit to the data, so a lower RMSE is better than a higher one.
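The five metrics can be computed directly from true and predicted labels, as in the following minimal Python sketch. It assumes binary labels with 1 as the positive class and uses the standard RMSE definition (square root of the mean squared difference). The function name `binary_metrics` and the sample labels are illustrative only.

```python
import math


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, F1-score and RMSE for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)                      # Eq. (63.1)
    precision = tp / (tp + fp) if tp + fp else 0.0         # Eq. (63.2)
    recall = tp / (tp + fn) if tp + fn else 0.0            # Eq. (63.3)
    f1 = (2 * precision * recall / (precision + recall)    # Eq. (63.4)
          if precision + recall else 0.0)
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in pairs) / len(pairs))
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "rmse": rmse}


y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
scores = binary_metrics(y_true, y_pred)
```

In this example TP = 3, TN = 3, FP = 1 and FN = 1, so accuracy, precision, recall and F1-score all equal 0.75, and RMSE is the square root of 2/8, i.e. 0.5.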
63.6 Results
The classifiers are compared on precision, recall, F1-score and RMSE. We have numerically coded the six labels of the Liar [18] dataset according to the four cases listed below. For Case 1, we have done an additional analysis by splitting the data into training and test sets, using a Scikit-learn library [14] function, in three different ratios: 0.8–0.2, 0.75–0.25 and 0.7–0.3. The four cases are:
– Case 1: scores = {‘true’: 1, ‘mostly-true’: 1, ‘half-true’: 0, ‘barely-true’: 0, ‘false’: 0, ‘pants-fire’: 0}
– Case 2: scores = {‘true’: 1, ‘mostly-true’: 1, ‘half-true’: 1, ‘barely-true’: 0, ‘false’: 0, ‘pants-fire’: 0}
– Case 3: scores = {‘true’: 1, ‘mostly-true’: 0, ‘half-true’: 0, ‘barely-true’: 0, ‘false’: 0, ‘pants-fire’: 0}
– Case 4: scores = {‘true’: 0, ‘mostly-true’: 1, ‘half-true’: 2, ‘barely-true’: 3, ‘false’: 4, ‘pants-fire’: 5}

Table 63.6 summarizes the results that we have obtained when we applied each of the above-mentioned ML algorithms to our model. First, the Liar [18] dataset is split 80–20%, i.e. 80% training data and 20% test data, and we have considered the Case 1 (Sect. 63.6) coded labels. In this case, RF, linear SVM and polynomial SVM have shown the highest accuracy of 84% and a precision of more than 70%.

Table 63.7 summarizes the results that we have obtained after training our model with each ML algorithm for a 75–25% split of training and test data, respectively, for the Liar [18] dataset. In this case, the highest accuracy obtained is about 66%, and the highest precision is 77%, obtained with linear SVM.

Table 63.8 summarizes the results we have obtained for each ML algorithm on the Liar [18] dataset for a 70–30% split of training and test data, respectively, considering Case 1 (Sect. 63.6). Here the accuracy and precision are almost the same as for the previous split.

Table 63.9 summarizes the results obtained by splitting the training and test data into 80% and 20%, respectively, for the Case 2 (Sect. 63.6) coded labels. Here the accuracy and precision are a bit lower than in Case 1. We have obtained a maximum accuracy of 61.2% for RF and radial SVM, with a precision of 60.9%.

Table 63.10 summarizes the results obtained by splitting the training and test data into 80% and 20%, respectively, for the Case 3 (Sect. 63.6) coded labels. We have obtained a maximum accuracy of 84.8% for LR, linear SVM and radial SVM, with a highest precision of 76.9%.
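The four label codings can be written down directly as Python dictionaries. The sketch below is illustrative only: the dictionaries are copied from Cases 1 to 4 above, while the helper `encode_labels` and the sample label list are hypothetical names introduced here, not the authors' code.

```python
# Label codings copied from Cases 1 to 4 in the text.
CASES = {
    1: {"true": 1, "mostly-true": 1, "half-true": 0,
        "barely-true": 0, "false": 0, "pants-fire": 0},
    2: {"true": 1, "mostly-true": 1, "half-true": 1,
        "barely-true": 0, "false": 0, "pants-fire": 0},
    3: {"true": 1, "mostly-true": 0, "half-true": 0,
        "barely-true": 0, "false": 0, "pants-fire": 0},
    4: {"true": 0, "mostly-true": 1, "half-true": 2,
        "barely-true": 3, "false": 4, "pants-fire": 5},
}


def encode_labels(labels, case):
    """Map textual Liar labels to the numeric codes of the given case."""
    scores = CASES[case]
    return [scores[label] for label in labels]


# Hypothetical sample of labels, for illustration only.
sample = ["true", "half-true", "pants-fire", "mostly-true"]
for case in sorted(CASES):
    codes = encode_labels(sample, case)
    n_classes = len(set(CASES[case].values()))
    print(f"Case {case}: codes={codes}, classes={n_classes}")
```

Cases 1 to 3 reduce the task to two classes and differ only in where the boundary between "true" and "false" is drawn, whereas Case 4 keeps all six classes.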
Table 63.11 summarizes the results obtained by splitting the training and test data into 80% and 20%, respectively, for the Case 4 (Sect. 63.6) coded labels. Here the accuracy and precision are the lowest among the four cases under consideration.
D. Barik et al.
Table 63.6 Classification results on 80–20% split for Case 1 (Sect. 63.6). Train–test split ratio used: 0.8–0.2

| Classifier | Accuracy | Precision | Recall | F1-score | RMS error |
|---|---|---|---|---|---|
| LR | 0.657 | 0.626 | 0.657 | 0.556 | 0.586 |
| RF | 0.847 | 0.769 | 0.847 | 0.778 | 0.391 |
| Linear SVM | 0.848 | 0.718 | 0.848 | 0.778 | 0.390 |
| Polynomial SVM | 0.841 | 0.742 | 0.841 | 0.777 | 0.399 |
| Radial SVM | 0.660 | 0.630 | 0.660 | 0.589 | 0.583 |
| Decision tree | 0.586 | 0.584 | 0.586 | 0.585 | 0.644 |
| KNN (k = 3) | 0.615 | 0.579 | 0.615 | 0.585 | 0.620 |
| KNN (k = 5) | 0.625 | 0.580 | 0.625 | 0.581 | 0.612 |
Table 63.7 Classification results on 75–25% split for Case 1 (Sect. 63.6). Train–test split ratio used: 0.75–0.25

| Classifier | Accuracy | Precision | Recall | F1-score | RMS error |
|---|---|---|---|---|---|
| LR | 0.662 | 0.640 | 0.662 | 0.563 | 0.581 |
| RF | 0.657 | 0.617 | 0.657 | 0.582 | 0.586 |
| Linear SVM | 0.654 | 0.774 | 0.654 | 0.518 | 0.588 |
| Polynomial SVM | 0.654 | 0.608 | 0.654 | 0.558 | 0.588 |
| Radial SVM | 0.666 | 0.639 | 0.666 | 0.595 | 0.578 |
| Decision tree | 0.583 | 0.580 | 0.583 | 0.582 | 0.645 |
| KNN (k = 3) | 0.611 | 0.576 | 0.611 | 0.583 | 0.623 |
| KNN (k = 5) | 0.621 | 0.577 | 0.621 | 0.581 | 0.615 |
For Case 4, we have obtained a maximum accuracy of about 25% for LR and radial SVM, with the highest precision of about 29.7%.

We have further examined our approach on a Bengali fake news dataset named BanFakeNews [8]. We have combined the training, validation and test sets into a single dataset to avoid any bias in the data. We have then removed the Bengali stopwords, which we obtained from [1], and applied other preprocessing steps such as the removal of punctuation and spaces. Next, we have calculated the TF and TF-IDF of the 1000 most frequently occurring Bengali words in the statements of the BanFakeNews [8] dataset. We have divided the data into 80% training and 20% test data using the Scikit-learn library [9]. We have trained our proposed model using all the classifiers that we have discussed in Sect. 63.3.3. The results are shown in Table 63.12. We have obtained an accuracy of 99.4% using the RF classifier.

Table 63.8 Classification results on 70–30% split for Case 1 (Sect. 63.6). Train–test split ratio used: 0.7–0.3

| Classifier | Accuracy | Precision | Recall | F1-score | RMS error |
|---|---|---|---|---|---|
| LR | 0.663 | 0.641 | 0.663 | 0.563 | 0.580 |
| RF | 0.664 | 0.631 | 0.664 | 0.589 | 0.580 |
| Linear SVM | 0.656 | 0.774 | 0.656 | 0.519 | 0.587 |
| Polynomial SVM | 0.655 | 0.606 | 0.655 | 0.558 | 0.587 |
| Radial SVM | 0.666 | 0.636 | 0.666 | 0.597 | 0.578 |
| Decision tree | 0.573 | 0.576 | 0.573 | 0.574 | 0.653 |
| KNN (k = 3) | 0.605 | 0.572 | 0.605 | 0.580 | 0.628 |
| KNN (k = 5) | 0.621 | 0.579 | 0.621 | 0.584 | 0.615 |

Table 63.9 Classification results on 80–20% split for Case 2 (Sect. 63.6). Train–test split ratio used: 0.8–0.2

| Classifier | Accuracy | Precision | Recall | F1-score | RMS error |
|---|---|---|---|---|---|
| LR | 0.603 | 0.604 | 0.603 | 0.574 | 0.630 |
| RF | 0.612 | 0.609 | 0.612 | 0.599 | 0.623 |
| Linear SVM | 0.599 | 0.614 | 0.599 | 0.548 | 0.634 |
| Polynomial SVM | 0.571 | 0.587 | 0.571 | 0.481 | 0.655 |
| Radial SVM | 0.612 | 0.609 | 0.612 | 0.598 | 0.623 |
| Decision tree | 0.542 | 0.542 | 0.542 | 0.542 | 0.676 |
| KNN (k = 3) | 0.533 | 0.548 | 0.533 | 0.531 | 0.683 |
| KNN (k = 5) | 0.525 | 0.543 | 0.526 | 0.522 | 0.689 |
Table 63.10 Classification results on 80–20% split for Case 3 (Sect. 63.6). Train–test split ratio used: 0.8–0.2

| Classifier | Accuracy | Precision | Recall | F1-score | RMS error |
|---|---|---|---|---|---|
| LR | 0.848 | 0.718 | 0.848 | 0.778 | 0.390 |
| RF | 0.847 | 0.769 | 0.847 | 0.778 | 0.391 |
| Linear SVM | 0.848 | 0.718 | 0.848 | 0.778 | 0.390 |
| Polynomial SVM | 0.841 | 0.742 | 0.841 | 0.777 | 0.399 |
| Radial SVM | 0.848 | 0.718 | 0.848 | 0.778 | 0.390 |
| Decision tree | 0.742 | 0.754 | 0.742 | 0.748 | 0.507 |
| KNN (k = 3) | 0.804 | 0.743 | 0.805 | 0.769 | 0.442 |
| KNN (k = 5) | 0.832 | 0.753 | 0.832 | 0.780 | 0.409 |

Table 63.11 Classification results on 80–20% split for Case 4 (Sect. 63.6). Train–test split ratio used: 0.8–0.2

| Classifier | Accuracy | Precision | Recall | F1-score | RMS error |
|---|---|---|---|---|---|
| LR | 0.250 | 0.297 | 0.250 | 0.227 | 1.804 |
| RF | 0.250 | 0.272 | 0.250 | 0.238 | 1.835 |
| Linear SVM | 0.227 | 0.243 | 0.227 | 0.168 | 1.701 |
| Polynomial SVM | 0.227 | 0.288 | 0.227 | 0.151 | 1.635 |
| Radial SVM | 0.251 | 0.286 | 0.251 | 0.238 | 1.837 |
| Decision tree | 0.207 | 0.207 | 0.207 | 0.207 | 2.112 |
| KNN (k = 3) | 0.203 | 0.224 | 0.204 | 0.197 | 2.239 |
| KNN (k = 5) | 0.194 | 0.199 | 0.195 | 0.191 | 2.158 |

63.6.1 Key Observations
1. While calculating the TF-IDF of each statement with the 1000 most frequently occurring words in the Liar [18] dataset, we got a very sparse matrix, because only a few of these words are present in the statement against a particular ID field. The words missing from the statement under consideration are filled with zero, hence the resultant array became sparse.
2. When the method proposed by us is applied to other datasets like “BanFakeNews” [8], an accuracy of 95–99% is obtained.
3. Without using traditional NLP techniques like the “bag of words” model, lemmatization or stemming, we get a decent accuracy with our proposed method.
4. The Liar [18] dataset has six labels, so the accuracy is very low, i.e. 25%, when we index the labels from 0 to 5. When we map the six labels to fewer indices, we get a higher accuracy. For example, if the six labels are relabelled as true (1) and false (0) only, we get a high accuracy of 84%.
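The sparsity described in the first observation can be reproduced with a small sketch. Everything here is an assumption made for illustration: the function names, the three toy statements and the TF-IDF variant (term frequency divided by statement length, inverse document frequency as log(N / df)) are hypothetical, and the vocabulary is limited to the 2 most frequent words instead of the 1000 used in the chapter.

```python
import math
from collections import Counter


def top_k_vocabulary(documents, k):
    """Return the k most frequent words over all documents."""
    counts = Counter(word for doc in documents for word in doc.split())
    return [word for word, _ in counts.most_common(k)]


def tfidf_matrix(documents, vocabulary):
    """Build one TF-IDF row per document over a fixed vocabulary."""
    tokenized = [doc.split() for doc in documents]
    n_docs = len(tokenized)
    doc_freq = {w: sum(1 for toks in tokenized if w in toks)
                for w in vocabulary}
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        row = []
        for word in vocabulary:
            tf = counts[word] / len(toks) if toks else 0.0
            idf = math.log(n_docs / doc_freq[word]) if doc_freq[word] else 0.0
            row.append(tf * idf)  # stays 0.0 when the word is absent
        rows.append(row)
    return rows


def zero_fraction(rows):
    """Fraction of matrix entries that are exactly zero."""
    cells = [value for row in rows for value in row]
    return sum(1 for value in cells if value == 0.0) / len(cells)


# Hypothetical toy statements, not taken from the Liar dataset.
statements = [
    "the tax cut helps families",
    "the senator voted against the tax bill",
    "crime fell last year",
]
vocabulary = top_k_vocabulary(statements, k=2)
rows = tfidf_matrix(statements, vocabulary)
print(vocabulary, rows, zero_fraction(rows))
```

The third statement contains none of the vocabulary words, so its whole row is zero. With a 1000-word vocabulary and short statements, most entries of every row are zero for the same reason.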
63 Study of Fake News Detection Techniques Using Machine Learning
Table 63.12 Classification results on 80–20% split for the BanFakeNews [8] dataset. Train–test split ratio used: 0.8–0.2

| Classifier | Accuracy | Precision | Recall | F1-score | RMS error |
|---|---|---|---|---|---|
| LR | 0.955 | 0.911 | 0.955 | 0.933 | 0.213 |
| RF | 0.994 | 0.994 | 0.994 | 0.994 | 0.079 |
| Linear SVM | 0.955 | 0.911 | 0.955 | 0.933 | 0.213 |
| Polynomial SVM | 0.97 | 0.968 | 0.97 | 0.964 | 0.174 |
| Radial SVM | 0.973 | 0.974 | 0.973 | 0.968 | 0.164 |
| Decision tree | 0.982 | 0.984 | 0.982 | 0.982 | 0.135 |
| KNN (k = 3) | 0.959 | 0.954 | 0.96 | 0.956 | 0.201 |
| KNN (k = 5) | 0.966 | 0.962 | 0.966 | 0.96 | 0.183 |
63.6.2 Discussion
We have achieved a very high accuracy and precision when we coded the labels of the Liar [18] dataset as only “0” for false and “1” for true statements. We have also tested our proposed method on a fake news dataset for the Bengali language, namely BanFakeNews [8], which is freely available on Kaggle. We obtained an accuracy between 95 and 99% on the BanFakeNews [8] dataset using our proposed method.
63.7 Conclusion
In this paper, we have proposed a fake news detection method and checked its results on the “Liar” [18] dataset and the “BanFakeNews” [8] dataset. We have calculated the TF and TF-IDF of the 1000 most frequently occurring words in the “statement” feature of the Liar [18] dataset. Based on the TF-IDF values of these words, we tried to predict the label for each record in the test set. We have obtained a substantial accuracy and precision using our proposed method. We found a significant rise in accuracy when using our fake news detection model with fewer coded labels on the dataset. The maximum accuracy reached 84.8% when we scaled down the six labels of the Liar [18] dataset to only two labels, i.e. true (1) and false (0). For the “BanFakeNews” [8] dataset, too, we have obtained a high accuracy using our model. We have used basic ML algorithms in this paper. We would further like to test our proposed method using Bidirectional Encoder Representations from Transformers (BERT) [7], recurrent neural network (RNN) and long short-term memory (LSTM) [13] networks.
References
1. Bengali stopwords. https://www.ranks.nl/stopwords/bengali. Accessed 20 Nov 2021
2. Boser, B.E., Guyon, I.M., Vapnik, V.N.: A training algorithm for optimal margin classifiers. In: Proceedings of the Fifth Annual Workshop on Computational Learning Theory, pp. 144–152 (1992)
3. Castillo, C., Mendoza, M., Poblete, B.: Information credibility on Twitter. In: Proceedings of the 20th International Conference on World Wide Web, pp. 675–684 (2011)
4. Conroy, N.K., Rubin, V.L., Chen, Y.: Automatic deception detection: methods for finding fake news. Proc. Assoc. Inf. Sci. Technol. 52(1), 1–4 (2015)
5. Cortes, C., Vapnik, V.: Support-vector networks. Mach. Learn. 20(3), 273–297 (1995)
6. da Silva, F.C.D., Vieira, R., Garcia, A.C.: Can machines learn to detect fake news? A survey focused on social media. In: Proceedings of the 52nd Hawaii International Conference on System Sciences (2019)
7. Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
8. Hossain, M.Z., Rahman, M.A., Islam, M.S., Kar, S.: BanFakeNews: a dataset for detecting fake news in Bangla. arXiv preprint arXiv:2004.08789 (2020)
9. Nearest neighbors. http://scikit-learn.org/stable/modules/neighbors.html. Accessed 30 Sept 2020
10. NLTK 3.6 documentation. https://www.nltk.org/. Accessed 30 Sept 2020
11. Oshikawa, R., Qian, J., Wang, W.Y.: A survey on natural language processing for fake news detection. arXiv preprint arXiv:1811.00770 (2018)
12. Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., et al.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
13. Sherstinsky, A.: Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network. Phys. D Nonlinear Phenom. 404, 132306 (2020)
14. Shu, K., Mahudeswaran, D., Wang, S., Lee, D., Liu, H.: FakeNewsNet: a data repository with news content, social context and spatialtemporal information for studying fake news on social media. arXiv preprint arXiv:1809.01286 (2018)
15. Shu, K., Sliva, A., Wang, S., Tang, J., Liu, H.: Fake news detection on social media: a data mining perspective. ACM SIGKDD Explor. Newsl. 19(1), 22–36 (2017)
16. Thorne, J., Vlachos, A., Christodoulopoulos, C., Mittal, A.: FEVER: a large-scale dataset for fact extraction and verification. arXiv preprint arXiv:1803.05355 (2018)
17. Vlachos, A., Riedel, S.: Fact checking: task definition and dataset construction. In: Proceedings of the ACL 2014 Workshop on Language Technologies and Computational Social Science, pp. 18–22 (2014)
18. Wang, W.Y.: “Liar, liar pants on fire”: a new benchmark dataset for fake news detection. arXiv preprint arXiv:1705.00648 (2017)
19. Zhang, J., Dong, B., Philip, S.Y.: Fakedetector: effective fake news detection with deep diffusive neural network. In: 2020 IEEE 36th International Conference on Data Engineering (ICDE), pp. 1826–1829. IEEE (2020)
20. Zhou, X., Zafarani, R.: A survey of fake news: fundamental theories, detection methods, and opportunities. ACM Comput. Surv. (CSUR) 53(5), 1–40 (2020)
Chapter 64
Identification of Skin Diseases Using Deep Learning Architecture
K. Himabindu, C. N. Sujatha, G. Chandi Priya Reddy, and S. Swathi

Abstract Dermatology is the branch of medicine dealing with the skin. Skin diseases vary from place to place and from season to season and are among the most common diseases that people face. The cause of a disease may be fungal, bacterial, viral, or one of many others. Proper identification of the disease is the most difficult part of dealing with skin diseases. Sometimes a skin disease may be an indication or symptom of a chronic disease. In a vast country like India, educating people in remote areas about skin disease is a difficult task. A system with the ability to identify skin diseases will therefore be of great help in giving timely treatment to all people irrespective of their location. The aim of the project is to develop a system, using deep learning, that is able to identify seven skin diseases from an image of the affected area. The process involves steps like choosing the dataset and developing a model that can predict the skin disease with higher accuracy and lower loss.
Keywords HAM dataset · AlexNet architecture · Adagrad · Accuracy · Loss
K. Himabindu · C. N. Sujatha (B) · G. Chandi Priya Reddy · S. Swathi
Sreenidhi Institute of Science and Technology, Hyderabad, India
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_64

64.1 Introduction
Dermatology remains one of the most complicated and uncertain branches of science because of the complexity of diagnosing diseases that affect the skin, hair, and nails. Skin disease diagnosis entails a series of pathological lab tests to determine the right disease. These diseases have been a source of concern for the past ten years, as their sudden emergence and complexities have increased the risk of death. These skin abnormalities are highly contagious and must be treated as soon as possible to prevent them from spreading, as many of them are fatal if not treated promptly. At present, diagnosis depends on a doctor. This is not available to everyone, which has led to an increase in computer-based diagnosis. Many models are being developed in this area using various techniques like image processing, deep learning, and machine learning.

In this paper, we discuss how to create a model using deep learning. Deep learning is a subset of machine learning that involves the construction of a neural network with three or more layers. Such neural networks try to simulate the behavior of the human brain, with limited success, allowing them to “learn” from extremely large amounts of data. A single-layer neural network can still make approximate predictions, but additional hidden layers can help to optimize and refine the accuracy. Most artificial intelligence (AI) applications and services use deep learning techniques to improve automation by performing analytical and physical tasks without human intervention.

Of the many types of learning techniques, supervised learning is the most widely used. Supervised learning is a type of machine learning in which machines are trained using labeled training data and then predict the output based on that labeled data. The primary goal of a supervised learning algorithm is to find a mapping function that maps the input variable (x) to the output variable (y). This paper is about building such a model using the AlexNet architecture.
64.2 Literature Survey
In “The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions”, Philipp Tschandl, Harald Kittler and Cliff Rosendahl describe how the HAM10000 dataset was created and review the datasets that already existed. The paper also gives brief information about the diseases in the dataset and their diagnosis process [1].

In “Melanoma Skin Cancer Detection using CNN AlexNet Architecture”, Shikha Rai A, T Nithin Kumar, Velentina Laira Veigas, Joswin Mark Monteiro and Namitha build a CNN model using the AlexNet architecture to predict whether a given skin disease is melanoma or not. Their model reached a test accuracy of 70% [2].

Agnes Lydia and F. Sagayaraj Francis, in “Adagrad—An Optimizer for Stochastic Gradient Descent”, discuss the various optimizers available and how their performance changes with the hyperparameters [3].

In “Automated Skin Disease Identification using Deep Learning Algorithm”, Sourav Kumar Patnaik and Mansher Singh Sandhu build models using the Inception V3, InceptionResnetV2, and MobileNet architectures. They trained the models on their own dataset and determined the output by voting across the models, obtaining an accuracy of 88% with the ability to predict 20 skin diseases [4].

Parvathaneni Naga Srinivasu, Muhammad Fazal Ijaz and Jalluri Gnana Sivasai, in “Classification of Skin Disease Using Deep Learning Neural Networks with MobileNet V2 and LSTM”, build a model using the HAM10000 dataset. They chose the MobileNet V2 and LSTM architectures and attained an accuracy of 85% [5].

“A Smartphone-Based Skin Disease Classification Using MobileNet CNN”, written by Jessica Velasco, Cherry Pascion and Jean Wilmar Alberio, describes the development of a smartphone application that predicts seven skin diseases using the MobileNet architecture [6].

“Skin lesion/cancer detection using deep learning”, written by Neema M, Arya S Nair, Asiya Haris, Amal Pradeep Menon, and Annette Joy, describes a deep convolutional neural network model that classifies melanoma into a malignant or benign class. They trained their model using the HAM10000 dataset, gave it images from different datasets to predict the output, and obtained 70% accuracy [7]. Figure 64.1 gives an overview of the proposed model for skin cancer detection.

Fig. 64.1 Proposed model for training and testing phases

“Skin Disease Detection: Machine Learning vs Deep Learning”, written by Payal Bose, Prof. Amiya Bhaumik and Dr. Sandeep Poddar, compares three machine learning and three deep learning techniques. They trained their models on a dataset from Kaggle using the ML techniques K-nearest neighbor, bagged tree ensemble and support vector machine (SVM), and the DL techniques VGG16, GoogleNet, and ResNet50 [8] (Fig. 64.2).

Fig. 64.2 Graphical comparison of performance analysis

Li-sheng Wei, Tao Ji and Quan Gan, in “Skin Disease Recognition Method Based on Image Color and Texture Features”, develop a method that is capable of predicting a skin disease based on the color and texture of the skin. They first removed the noise from the image and used the gray-level co-occurrence matrix method to segment the images, and then used the SVM algorithm for classification. Figure 64.3 gives the step-wise procedure for skin disease identification [9].

Fig. 64.3 Process for skin disease identification

In “Optimization of Deep Learning using various Optimizers, Loss functions and Dropout”, S.V.G. Reddy, K. Thammi Reddy, and V. ValliKumari build deep learning models for CNNs and recurrent neural networks (RNNs) using various optimizers like RMSProp, Adam and Adagrad, and loss functions like mean squared error (MSE) and binary cross entropy. They also applied dropout and evaluated the models' performance in terms of accuracy and loss [10].

In “Deep Learning in Skin Disease Image Recognition: A Review”, Ling-Fang Li, Wei-Jian Hu, Xu Wang and Neal N. Xiong review 45 research studies, published since 2016, on the identification of skin disease using deep learning technology. They examine those 45 studies in terms of dataset, disease type, data augmentation technology, data processing technology, skin disease image recognition model, evaluation indicators, deep learning framework, and model performance [11].

Jyotsna More, Maitreyee Nath, Pranali Yamgar and Anjali Bhatt wrote the paper “Skin Disease Classification Using Convolutional Neural Network”. Using mobile devices, the proposed system assists users in determining whether skin spots are cancerous or benign. The model was trained using a convolutional neural network on the HAM10000 dataset with the MobileNet architecture. The model was integrated with Android, and the images are taken from the camera [12].

“Malayalam Handwritten Character Recognition Using AlexNet Based Architecture”, written by Jay James, Manjusha J and Chandran Saravanan, describes a CNN model built using the AlexNet architecture, with the support vector machine algorithm as the classifier. This model is able to recognize 44 primary and 36 compound Malayalam characters with 98% accuracy [13].
64.3 Pre-requirements
64.3.1 Dataset
Training a model to classify and detect skin disease is a demanding task. It can be accomplished properly by training the model on a large set of images, and the HAM10000 dataset helps to solve this problem. It is a dataset that contains images of seven kinds of diseases, namely melanocytic nevi, benign keratosis-like lesions, melanoma, actinic keratoses, basal cell carcinoma, vascular lesions, and dermatofibroma [7]. The dataset contains 10,015 images in total and is freely available through Kaggle and the ISIC archive. This benchmark dataset can be used for both machine learning and expert comparisons. All of the major diagnostic categories in the field of pigmented lesions are represented in the cases [12]. More than half of the lesions were verified by pathology, with the rest of the cases relying on follow-up, in-vivo confocal microscopy, or expert consensus for confirmation. Figure 64.4 is a graph with the number of cases on the y-axis and the type of disease on the x-axis.

Fig. 64.4 Graph of number of each type disease in the dataset

The dataset also contains CSV files with information about the age, gender, location, and how the disease was diagnosed. The diseases in this dataset are diagnosed using four methods [1]: histopathology, confocal microscopy, follow-up, and consensus.
64.3.2 Interface
Google Colab
Colaboratory, most commonly known as Colab, is a research product of Google. It enables anyone to write and execute Python code in the browser and is especially helpful for machine learning, data analysis, and deep learning. Colab is a hosted Jupyter Notebook service that requires no setup and provides free access to computing resources such as GPUs. It enables users to work on large data at a fast rate and also allows data to be stored in Google Drive.
64.4 Implementation
Following data collection, we must resize the data to meet our needs [9]. Reducing the data size allows the model to learn at a faster rate. The dimensions of the images in the HAM10000 (Human Against Machine) dataset are 450 × 600 × 3. Images of this size are costly for a TensorFlow model to process, so we need to resize them. In our project, we have resized the images to 100 × 75 × 3. The next step is to split the resized data. We divided the data so that 80% was used for training and 20% for testing. To guard against overfitting, the training data is split again, in the ratio 90:10, to obtain validation data. After splitting the data, we developed a CNN model using the AlexNet architecture.
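The two-stage split described above can be sketched with the standard library alone. This is a minimal illustration, not the authors' code: the function name `split_indices`, the fixed seed and the rounding of the subset sizes are assumptions; only the 10,015 image count and the 80:20 and 90:10 ratios come from the text.

```python
import random


def split_indices(n_samples, test_fraction=0.2, val_fraction=0.1, seed=0):
    """Shuffle sample indices, then split into train, validation and test."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # reproducible shuffle

    # First split: hold out the test set (20% of all samples).
    n_test = round(n_samples * test_fraction)
    test, remainder = indices[:n_test], indices[n_test:]

    # Second split: take validation data from the remaining training data.
    n_val = round(len(remainder) * val_fraction)
    val, train = remainder[:n_val], remainder[n_val:]
    return train, val, test


train_idx, val_idx, test_idx = split_indices(10015)
print(len(train_idx), len(val_idx), len(test_idx))
```

Under these rounding assumptions the 10,015 images divide into 7211 training, 801 validation and 2003 test samples; a different rounding rule would shift these counts by one.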
64.4.1 Architecture
There are many architectures available for training a deep learning model. Some of them are TSM12, InceptionV3, ResNet152V2, and VGG16 [8]. We used the AlexNet architecture to train our model. AlexNet won the ImageNet Large Scale Visual Recognition Challenge in 2012. Alex Krizhevsky and his colleagues proposed the model in the research paper “ImageNet classification with deep convolutional neural networks” in 2012 [2]. The model is composed of five convolutional layers, some of which are followed by max-pooling layers, and three fully connected layers; each of these layers, with the exception of the output layer, uses ReLU activation [13].
The convolutional (Conv2D) layer is the first. It is similar to a set of learnable filters. We chose 32 filters for the first two Conv2D layers and 64 filters for the latter two. Each filter transforms a part of the image (specified by the kernel size) using the kernel filter, and the kernel filter matrix is applied to the entire image. Filters can be thought of as image transformations. From these transformed images (feature maps), the convolutional neural network can extract features that are useful elsewhere.

The pooling (MaxPool2D) layer is the second important layer in a convolutional neural network. This layer is simply a down-sampling filter: it examines neighboring pixels and selects the maximum value. Pooling is used to reduce the computational cost and overfitting. We must choose the pooling size (i.e., the area pooled each time); the larger the pooling size, the stronger the down-sampling. By combining convolutional and pooling layers, a CNN can combine local features and learn more global features of the image.

Dropout is a regularization method in which a proportion of the nodes in a layer is randomly ignored (their outputs are set to zero) for every training sample. This randomly drops a part of the network and forces the network to learn features in a distributed manner. The technique also improves generalization and reduces overfitting.

The activation function used is the rectified linear unit (ReLU), defined as max(0, x). The rectifier activation function is used to add nonlinearity to the network. Figure 64.5 is a line plot of the ReLU activation for both negative and positive inputs.

Fig. 64.5 Line plot of ReLU function

The flatten layer combines all of the final feature maps into a single 1D vector. This flattening step is required so that fully connected layers can be used after the convolutional/max-pooling layers. It combines all of the local features discovered by the previous convolutional layers.

Finally, we created an artificial neural network (ANN) classifier from the features using two fully connected (dense) layers. The final layer (dense (10, activation = “softmax”)) outputs the probability distribution over the classes. The softmax function was used as the final activation function. With softmax activation, the output of a neuron depends on the outputs of all the other neurons in its layer [11].
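The ReLU, max-pooling and softmax operations described above can be illustrated without any deep learning framework. The sketch below is an assumption-laden toy: the function names, the 4 × 4 feature map and the non-overlapping 2 × 2 pooling window are hypothetical choices for illustration and are not taken from the authors' model.

```python
import math


def relu(x):
    """Rectified linear unit: max(0, x)."""
    return max(0.0, x)


def max_pool_2x2(feature_map):
    """Down-sample a 2D list by taking the maximum of each 2x2 block."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1])
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical 4x4 feature map produced by a convolution.
feature_map = [
    [1, -2, 3, 0],
    [4, 5, -6, 2],
    [-1, 0, 7, 8],
    [2, -3, 1, -4],
]
activated = [[relu(v) for v in row] for row in feature_map]
pooled = max_pool_2x2(activated)                 # 2x2 result
flattened = [v for row in pooled for v in row]   # 1D vector
probabilities = softmax(flattened)
print(pooled, probabilities)
```

Because softmax divides each exponential by the sum over all outputs, changing one score changes every probability, which is the dependence between neurons mentioned in the text.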
64.4.2 Optimizers
After we have added the layers to the model, we need to define a loss function and an optimization algorithm. When mapping inputs to outputs, an optimization algorithm determines the values of the parameters (weights) that minimize the error [6]. These optimization algorithms, or optimizers, have a significant impact on the accuracy of the deep learning model and on the speed of training. An optimizer is a function or algorithm that modifies the weights, learning rates, and other attributes of the neural network. As a result, it helps to reduce the overall loss and to improve the accuracy, so the optimizer must be chosen wisely. Out of all the available optimizers, we used Adagrad [10].

Adagrad stands for adaptive gradient descent. It differs from other gradient descent algorithms in one respect [3]: it changes the learning rate from iteration to iteration. The change in learning rate is determined by how much each parameter has changed during training: the more a parameter has been updated, the smaller its learning rate becomes. Because the HAM10000 dataset contains both sparse and dense features, this modification is extremely beneficial. The characteristics of each disease's skin lesions vary greatly, and each lesion has its own features. As a result, using the same learning rate for all features is unsuitable. To update the weights, the Adagrad algorithm employs the following formula:

$$w_t = w_{t-1} - \eta'_t \, \frac{\partial L}{\partial w_{t-1}}, \qquad \eta'_t = \frac{\eta}{\sqrt{\alpha_t + \epsilon}}$$

In this equation, $\eta'_t$ is the learning rate used at iteration $t$, $\alpha_t$ accumulates the squared gradients of the iterations so far (which is why the learning rate differs at each iteration), $\eta$ is a constant, and $\epsilon$ is a small positive number that avoids division by zero.
By using the Adagrad optimizer, we avoid having to adjust the learning rate manually. It is more reliable than plain gradient descent and its variants, and it achieves convergence faster.
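A minimal sketch of the Adagrad update is given below, assuming the standard form of the algorithm in which the accumulator holds the running sum of squared gradients for each weight. The function names and the two-parameter quadratic loss are hypothetical and serve only to show how the per-weight learning rate adapts; they are not the skin disease model.

```python
import math


def adagrad_step(weights, grads, accum, lr=0.01, eps=1e-8):
    """Apply one Adagrad update and return the new weights and accumulators."""
    new_weights, new_accum = [], []
    for w, g, a in zip(weights, grads, accum):
        a = a + g * g                          # accumulate squared gradient
        step = lr / math.sqrt(a + eps) * g     # per-weight learning rate
        new_weights.append(w - step)
        new_accum.append(a)
    return new_weights, new_accum


def loss(w):
    """Toy loss with very different curvature along the two weights."""
    return (w[0] - 3.0) ** 2 + 10.0 * (w[1] + 1.0) ** 2


def gradient(w):
    """Analytic gradient of the toy loss."""
    return [2.0 * (w[0] - 3.0), 20.0 * (w[1] + 1.0)]


weights, accum = [0.0, 0.0], [0.0, 0.0]
initial_loss = loss(weights)
for _ in range(100):
    weights, accum = adagrad_step(weights, gradient(weights), accum, lr=0.1)
final_loss = loss(weights)
print(initial_loss, final_loss, weights)
```

On the first step both weights move by almost exactly the base learning rate, even though their gradients differ by a factor of more than three, because each gradient is divided by its own accumulated magnitude.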
64.4.3 Loss Function
A loss function is defined to measure how well the model performs on data (images) with known labels. It measures the difference between the observed and predicted labels. The lower the loss value, the better the model. We used mean squared error as the loss function:

$$\text{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2$$
Fig. 64.6 a Epoch vs. Accuracy and b Epoch vs. Loss
The mean squared error (MSE) is never negative, because the errors are always squared. MSE is useful for ensuring that our trained model does not make predictions with large errors, because the squaring gives such errors more weight. After choosing all the parameters, we wrote the code using the above architecture and optimizer and started to train our model. Batch size is a hyperparameter that specifies how many samples must be processed before the internal model parameters are updated [8]. The number of epochs is a hyperparameter that specifies the number of times the learning algorithm runs through the entire training dataset. One epoch means that each sample in the training dataset has had an opportunity to update the internal model parameters. We trained our model for 150 epochs with a batch size of 10 and a learning rate of 0.01.
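The MSE loss and the relation between batch size and parameter updates can be sketched as follows. The helper names, the one-hot target, the predicted probabilities and the training set size of 1000 samples are hypothetical values chosen for illustration; only the batch size of 10 and the 150 epochs come from the text.

```python
import math


def mean_squared_error(y_true, y_pred):
    """Mean of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def updates_per_epoch(n_samples, batch_size):
    """Number of parameter updates needed to pass once through the data."""
    return math.ceil(n_samples / batch_size)


# Hypothetical one-hot target (class 3 of 7) and predicted probabilities.
target = [0, 0, 1, 0, 0, 0, 0]
predicted = [0.05, 0.05, 0.70, 0.05, 0.05, 0.05, 0.05]
mse = mean_squared_error(target, predicted)

# Hypothetical training set size; batch size and epochs as in the text.
steps = updates_per_epoch(n_samples=1000, batch_size=10)
total_updates = steps * 150
print(mse, steps, total_updates)
```

In this toy case the error of 0.3 on the true class contributes 0.09 to the sum, six times more than all the 0.05 errors on the other classes combined, which shows how squaring emphasizes large errors.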
64.5 Results
Upon building and running the model, we obtained a training accuracy of 90.94% and a test accuracy of 78.43%. The training loss is 1.9% and the test loss is 4.4%. Figure 64.6a plots the accuracy (y-axis) against the epochs (x-axis) for both the training and validation data. Figure 64.6b plots the loss (y-axis) against the epochs (x-axis) for both the training and validation data.
64.6 Conclusion
This paper discussed the development of a deep learning model that uses the AlexNet architecture to identify skin diseases. Using deep learning in the field of medicine can have a great positive impact on people's lives. In the field of dermatology especially, it can help a lot of people, because it reduces the complexity of diagnosing skin diseases. The model we built helps in diagnosing seven kinds of skin diseases. It helps people in remote areas to get treatment for skin disease in advance. Most skin disease identification models use MobileNet, Inception V3, and GoogleNet. The model built with AlexNet in this paper gives a training accuracy almost similar to that of TSM12, InceptionV3, ResNet152V2, and VGG16, and its test accuracy is better than that of the TSM12 and InceptionV3 architectures. Using the Adagrad optimizer helps by adapting the learning rate at each iteration. This model can be developed further and deployed as a mobile app [5] that is capable of detecting the skin disease from a photo taken with a mobile camera.
References
1. Tschandl, P., Rosendahl, C., Kittler, H.: The HAM10000 dataset, a large collection of multisource dermatoscopic images of common pigmented skin lesions. Scientific Data, 1–9 (2018)
2. Shikha Rai, A., Joswin Mark Monteiro, N., Velentina Laira Veigas, T.N.K.: Melanoma skin cancer detection using CNN AlexNet architecture. IJRASET 8(V), 301–305 (2020)
3. Lydia, A., Francis, S.: Adagrad - an optimizer for stochastic gradient descent. Int. J. Information Computing Science 6(5), 566–568 (2019)
4. Patnaik, S.K., Sidhu, M.S., Gehlot, Y., Sharma, B., Muthu, P.: Automated skin disease identification using deep learning algorithm. Biomedical & Pharmacology J. 11(3), 1429–1436 (2018)
5. Srinivasu, P.N., SivaSai, J.G., Ijaz, M.F., Bhoi, A.K., Kim, W., Kang, J.J.: Classification of skin disease using deep learning neural networks with MobileNet V2 and LSTM. Sensors 21(8), 1–27 (2021)
6. Velasco, J., Pascion, C., Alberio, J.W., Apuang, J., Cruz, J.S., Gomez, M.A., Molina, B., Jr., Tuala, L., Thio-ac, A., Jorda, R., Jr.: A smartphone-based skin disease classification using MobileNet CNN. Int. J. Adv. Trends in Comp. Sci. Eng. 8(5), 2632–2637 (2019)
7. Neema, M., Nair, A.S., Joy, A., Menon, A.P., Haris, A.: Skin lesion/cancer detection using deep learning. Int. J. Appl. Eng. Res. 15(1), 11–17 (2020)
8. Bandyopadhyay, S., Bhaumik, A., Poddar, S.: Skin disease detection: Machine learning vs deep learning. Preprints (2021)
9. Wei, L.S., Gan, Q., Ji, T.: Skin disease recognition method based on image color and texture features. Comput. Math. Methods Med. 2018, 1–10 (2018)
10. Reddy, S.V., Reddy, K.T., ValliKumari, V.: Optimization of deep learning using various optimizers, loss functions and dropout. International J. Recent Tech. Eng. (IJRTE) 7(4S2), 448–455 (2018)
11. Li, L.F., Wang, X., Hu, W.J., Xiong, N.N., Du, Y.X., Li, B.S.: Deep learning in skin disease image recognition: A review. IEEE Access 8, 208264–208280 (2020)
12. Albahar, M.A.: Skin lesion classification using convolutional neural network with novel regularizer. IEEE Access, 38306–38313 (2019)
13. James, A., Manjusha, J., Saravanan, C.: Malayalam handwritten character recognition using AlexNet based architecture. Indonesian J. Electrical Engineering Inform. (IJEEI) 6(4), 393–400 (2018)
Chapter 65
IoT-Based Saline Controlling System
Mittal N. Desai, Raj Davande, and K. R. Mahaadevan
Abstract With the increasing global population, the field of health care is being modernized with the help of technology. One such field is telemedicine. Telemedicine is vital for reducing the diagnosis time of the patient, and an efficient system is needed for the proper and timely care of patients. This paper proposes a remote IoT-based saline controlling system using Arduino Uno with which the doctor and staff can administer the saline flow of the patient remotely. It uses Blynk to communicate with other devices over the Internet. The proposed system can reduce the need for the physical presence of hospital staff and save lives.
Keywords Telemedicine · Saline controlling · IV automation · Arduino · Blynk
65.1 Introduction
The field of health care is evolving every day with new innovations and technologies. The Internet of Things (IoT) is one such technology that can play a vital role in the field. There is also an acute shortage of doctors for treating patients. IoT can become an important mediator between patients and doctors, reducing waiting and diagnosis time through automated systems and telemedicine [1]. In health care, intravenous control is a very common but important treatment process. If proper care is not taken, it can lead to adverse effects. Negligence, inattentiveness, or other factors may also lead to the reverse flow of blood from the veins toward the saline bottle.
M. N. Desai (B) · R. Davande · K. R. Mahaadevan
CMPICA, Charotar University of Science and Technology, Anand, India
e-mail: [email protected]
R. Davande
e-mail: [email protected]
K. R. Mahaadevan
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_65
M. N. Desai et al.
Existing Approaches
Various works have addressed saline monitoring. One uses a photodiode and LEDs [2] as sensors and detectors, together with a clamping mechanism to restrict the saline bottle if needed. The project in [3] is an automatic saline level controlling system based on the body mass index (BMI) of the patient, built with DAQ assistant and LabView. Here, the system only calculates the fluid rate based on BMI, and no control is given to the doctors or the nurses [4]. In [5], an IoT-based bottle level monitoring system is proposed that uses an ESP8266 12E NodeMCU to monitor the saline level and send alerts when it falls below the threshold. One can only receive alerts using this model. In [6], an impulse radio ultra-wideband (IR-UWB) radar is proposed for detecting and observing human targets based on the identification of vital signs, because it provides high distance resolution and a low risk of exposure to the human body. The system includes electrolyte bottle level (EBL) sensors, which act as level sensors to monitor the critical level of the electrolyte bottle and alert the concerned nurses in hospitals. The flow of fluid cannot be regulated in this model; at the critical level, the model can only stop the flow and generate an alert. In [7], the system is built using a load sensor and an ultra-low-power, low-cost ESP32 Wi-Fi system on chip (SoC) microcontroller, with the MQTT-S protocol for data transmission. Since the IR sensors need to be placed on the bottle, adding a new bottle is always a hassle, and the accuracy of the saline reading can decrease if the IR sensor is misplaced. In [8], a wearable medical device for electrocardiogram (ECG) and blood pressure monitoring is presented along with an automatic, low-cost saline level measurement system using an ATMEGA 328 microcontroller, a Bluetooth module, and IR sensors; with this model, one can only monitor the levels.
Our paper proposes a system where the doctor or the nurse can remotely view the saline level through an application and control the flow of saline at any given time, while making use of load sensors to detect the weight and evaluate it through Arduino. The flow is controlled through a dial-a-flow device attached to a servo motor.
65.2 Proposed System
A. System Components
• Arduino Uno R3
Arduino Uno R3, shown in Fig. 65.1, is a microcontroller board with 14 digital pins for input/output; it is also equipped with a USB port for transferring instructions. We use it for decision-making and for updating the IoT application (Blynk).
Fig. 65.1 Arduino Uno R3
• Load cell
A load cell, shown in Fig. 65.2, is a force transducer: a device that detects mechanical force and converts it into readable digital readings. Our model uses this device for its weight-detecting capability [9].
Fig. 65.2 Load cell (10 kg)
• GSM Module
This is an addition to the Arduino Uno board. The SIM 900 GSM module, shown in Fig. 65.3, is a hardware device that uses telephone technology to provide a data link to a remote network; one example application is a home security system [10]. We can add a SIM card to the module to establish a connection with the remote application, allowing us to give real-time updates as seen in modern health monitoring systems [11].
Fig. 65.3 SIM 900 GSM module
• 16 × 2 LCD
The 16 × 2 LCD, shown in Fig. 65.4, is a display module commonly integrated with microcontrollers; it can display two lines of 16 characters each.
Fig. 65.4 16 × 2 LCD display
• Servo Motor
A servo motor, shown in Fig. 65.5, is an actuator with precise control over its velocity, torque, and angle of rotation. The motor knows at all times the angle at which its output shaft is tilted. Our model uses this precision and its minute angular movements to rotate the dial-a-flow device.
Fig. 65.5 Servo motor
• IV set
This is a tubing set that provides the fastest way to infuse a fluid throughout the patient's body from a sterile vacuum bag or bottle. The tubing set also has a roller clamp to control the amount of flow through the tube. However, we use another device to control the flow of the fluid through the tubing (Fig. 65.6).
Fig. 65.6 Intravenous infusion set
• Dial-A-flow device
This is a device that fits onto an IV tube set and regulates the flow of the fluid inside the tube. With the marker and readings on the dial, one can precisely control the rate at which the fluid goes through the tubing with a simple rotation (Fig. 65.7).
Fig. 65.7 Dial-a-flow device
• Blynk
This is a platform on which one can easily develop a graphical interface by providing the proper addresses and widgets. With the help of those widgets, one can control a microcontroller, in our case the Arduino, over the Internet. Blynk has previously been used in other healthcare systems [12].
Fig. 65.8 Block diagram
B. Proposed Mechanism
The proposed system aims to reduce the need for doctors and nurses to be present at all times while allowing them to monitor and control the patient's intravenous therapy. The main purpose of the proposed system is to regulate the flow. To achieve this, we use a dial-a-flow device with a servo motor attached to its slider. The regulation level is set through Arduino programming. Blynk is used to send and receive data, as depicted in Fig. 65.8. The flow states are indicated on the LCD connected to the Arduino. If the saline bottle level drops below 30%, the doctors are notified, and once the level becomes critical (near zero), the supply is cut off using the dial-a-flow device. If the level is 50%, it is indicated with a green alert signaling normal flow. Figure 65.9 shows the flow of execution. First, we calibrate the load cell with the full saline bottle; we then calculate the percentage, and based on that value, we either cut the supply of fluid if the level is critical or continue in the normal case. Since the whole system is equipped with Blynk and a GSM module, the doctors can monitor saline levels in real time while also controlling the flow rate, as a servo motor is attached to the regulator.
C. Flowchart
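The first two steps of the flow of execution described above (calibrating with the full bottle, then computing the remaining percentage) can be sketched as follows. This is a minimal Python illustration rather than the authors' Arduino code; the function name, the optional empty-bottle weight, and the example weights are assumptions.

```python
def saline_level_percent(current_weight_g, full_weight_g, empty_weight_g=0.0):
    """Remaining saline as a percentage of the calibrated full bottle.

    full_weight_g is the load-cell reading taken during calibration with a
    full bottle; empty_weight_g (an assumption, not stated in the paper)
    lets the weight of the empty container be subtracted.
    """
    usable = full_weight_g - empty_weight_g
    if usable <= 0:
        raise ValueError("full weight must exceed empty weight")
    percent = 100.0 * (current_weight_g - empty_weight_g) / usable
    # Clamp so that sensor noise cannot report values outside 0..100.
    return max(0.0, min(100.0, percent))


# Hypothetical readings: 550 g when full, 50 g empty container, 300 g now.
print(saline_level_percent(300.0, 550.0, 50.0))
```

The resulting percentage is the value on which the later decision (continue the flow or cut the supply) would be based.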
D. Simulation
In Fig. 65.10, we can see the circuit connection of an Arduino and a servo motor. Here, two switches are used to replicate the two conditions reported by the load cell (critical and normal).
Fig. 65.9 Flowchart of execution
Fig. 65.10 Circuit demo at normal level
The green LED is on, indicating that the flow of fluid can continue, so the servo motor rests in its idle position. In Fig. 65.11, the LED is red, which means the load cell is detecting a weight under 8%; the Arduino then sends a notification to the application and cuts off the supply by turning the servo motor 90° and pinching the IV tube. Figure 65.12 shows the application view from the doctor's or the nurse's perspective, created using the Blynk app [13]. In the application, the doctor can see the patient number and the status of the saline fluid flow. The doctor also has a "flow regulator" slider with which the flow of the fluid can be altered at any time. There are three notification lights, "Normal," "Critical," and "Emergency," which are lit in the corresponding situations, as shown in Fig. 65.12. At the bottom, we can see the saline-level percentage, kept up to date by the Arduino and load cell.
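The decision logic exercised in the simulation can be summarized in a short sketch. The 30% notification threshold, the 8% critical threshold, and the 90° cut-off rotation come from the text; the state names, the idle angle of 0°, the LED colour for the intermediate state, and the `decide` function itself are illustrative assumptions written in Python, not the Arduino sketch used by the authors.

```python
from dataclasses import dataclass

NOTIFY_BELOW = 30.0  # percent; doctors are notified below this level (from the text)
CUTOFF_BELOW = 8.0   # percent; critical level used in the simulation (from the text)
IDLE_ANGLE = 0       # assumed servo rest position, in degrees
CUTOFF_ANGLE = 90    # rotation that pinches the IV tube (from the text)


@dataclass(frozen=True)
class Action:
    state: str        # hypothetical state label
    led: str          # colour of the indicator LED
    notify: bool      # whether a notification is sent to the application
    servo_angle: int  # commanded servo position, in degrees


def decide(level_percent):
    """Map a saline level percentage to an indicator and actuator action."""
    if level_percent < CUTOFF_BELOW:
        return Action("critical", "red", True, CUTOFF_ANGLE)
    if level_percent < NOTIFY_BELOW:
        # Assumption: the flow continues, but the staff are alerted.
        return Action("low", "green", True, IDLE_ANGLE)
    return Action("normal", "green", False, IDLE_ANGLE)


for level in (50.0, 20.0, 5.0):
    print(level, decide(level))
```

Keeping the thresholds as named constants makes it easy to adjust them per hospital policy without touching the decision function.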
Fig. 65.11 Circuit demo at critical level
65.3 Conclusion
In this paper, we have presented an effective and inexpensive saline bottle controlling system. The system can monitor as well as regulate the flow with the help of IoT, bearing in mind that not much work has been done on controlling the flow prior to this paper. Doctors and nurses can control the IV bottle's flow remotely. The system can reduce the need for physical presence at all times and save many lives. Such a system can be very useful in rural hospitals. As of now, each patient requires a SIM card for proper communication with the server, but with further development, each department of the hospital could be served by a single SIM card by segregating patients using software techniques. The system is simple to use and can therefore play a major role in rural areas requiring extensive travel.
Fig. 65.12 Application on android
References
1. Haleem, A., Javaid, M., Singh, R.P., Suman, R.: Telemedicine for healthcare: Capabilities, features, barriers, and applications. Sensors Int. 2 (2021)
2. More, A., Bhor, D., Tilak, M., Nagare, G.: IoT based smart saline bottle for healthcare. Int. J. Eng. Res. Tech. (IJERT) 10(6) (2021)
3. Narkhede, S.R., Sivakumar, R.: Automatic saline level controller and indicator. ICAICTSEE (2020)
4. Ingale, J., Sahare, S., Kanchane, P., Tonge, N., Shah, A., Banmare, A.: Automatic saline level monitoring system using IoT. Int. J. Creative Research Thoughts (IJCRT) 9(5), i515–i521 (2021). ISSN: 2320-2882
5. Gopal, R., Vimaladevi, M., Sowmitha, V.: IR-UWB based electrolyte bottle level for healthcare using IoT. Int. J. Aquatic Sci. 12(3) (2021)
6. Ghosh, D., et al.: Smart saline level monitoring system using ESP32 and MQTT-S. 2018 IEEE 20th International Conference on e-Health Networking, Applications and Services (Healthcom). IEEE (2018)
7. Landge, P.P.: Smart saline level monitoring and control system. Int. J. Res. Appl. Sci. Eng. Tech. 7(8), 898–901 (2019). https://doi.org/10.22214/ijraset.2019.8132
8. Kalaivani, P., et al.: Real time ECG and saline level monitoring system using Arduino UNO processor. Asian J. Appl. Sci. Tech. (AJAST) 1(2), 160–164 (2017)
9. Aravind, R., Arun Kumar, E., Harisudhan, R.K., Karan Raj, G., Udhayakumar, G.: Load cell based fuel level measurement using Arduino Uno microcontroller. Int. J. Adv. Res. Ideas Innovations in Tech. 3(3) (2018)
10. Chaudhuri, D.: GSM based home security system. Int. J. Eng. Tech. Res. (IJETR) 3(2), 38–40 (2015)
11. Digarse, P.W., Patil, S.L.: Arduino UNO and GSM based wireless health monitoring system for patients. 2017 International Conference on Intelligent Computing and Control Systems (ICICCS). IEEE (2017)
12. Hasan, D., Ismaeel, A.: Designing ECG monitoring healthcare system based on internet of things Blynk application. J. Appl. Sci. Tech. Trends 1(3), 106–111 (2020)
13. Todica, M.: Controlling Arduino board with smartphone and Blynk via internet (2016)
Chapter 66
In-situ Measurement in Water Quality Status: Udalka Uttarakhand, India
S. Harini, P. Varshini, S. K. Muthukumaaran, Santosh Chebolu, R. Aarthi, R. Saravanan, and A. S. Reshma
Abstract Water is a fundamental substance for the basic survival of living beings. Statistics reveal an alarming increase in the scarcity of water across the globe. This work investigated the status of water quality in Udalka village in the northern Indian state of Uttarakhand. The research was carried out in a rural village whose source of water is one of the tributaries of the river Ganges, a prime source for the regions in the upper half of India. The initial phase entailed the analysis of levels of contamination and steps to decrease their concentration. Water quality levels in the zones surrounding Udalka village helped in the identification of the pollutants. The study was followed by the design of a solution for improved potable water quality. The observations validated the rationale for an Internet of Things technology-based approach to addressing water quality. The work also focused on modern technology to ensure adoption by the villagers for the judicious use of water.
Keywords Water · Scarcity · Water quality level · Contamination · Internet of Things technology
S. Harini
Department of Electronics and Communication Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India
P. Varshini
Department of Civil Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India
S. K. Muthukumaaran · R. Aarthi
Department of Computer Science Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India
S. Chebolu · R. Saravanan
Department of Mechanical Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India
A. S. Reshma (B)
Amrita School for Sustainable Development, Amrita Vishwa Vidyapeetham, Coimbatore, India
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_66
S. Harini et al.
66.1 Introduction
Water is the most important component of life on Earth. The majority of the water on Earth is saltwater, leaving only 3% as freshwater, only 1% of which is available for human consumption. The quantity of potable water is limited; its availability has been gradually dropping, accompanied by a steady increase in the global population, which in turn increases contamination of the environment. According to recent studies, around 1.9 × 10^9 (1.9 billion) people worldwide rely on an unimproved or improved source of faecally contaminated water, primarily groundwater and rural piped sources [1]. Microbial contamination of water is common, with faecal contamination being the most prevalent pollutant in African and South Asian countries, per the report in [1]. According to the World Water survey, 31% of Indians have been consuming unsafe drinking water, with no healthier choices, and are therefore more vulnerable to water-borne diseases caused by pathogenic microorganisms (viruses, bacteria, parasites, and protozoans) present in contaminated water. While bacteriological quality contributes to acute health problems, contamination of water by chemicals leads to chronic health issues. Women and young girls are obliged to spend an average of 700 h per year carrying water home, with direct repercussions on their families' quality of life and well-being [2]. The primary parameters of water quality are odour, colour, taste, dissolved oxygen, turbidity, dissolved solids, and hardness [3]. Access to pure water is a global challenge that many underdeveloped and developing countries have yet to overcome. Poor sanitation and unhygienic lifestyles contributed to 4% of deaths in the global population. Among India's urban dwellers, 80% of rural India have access to improved drinking water sources per the 2014 report on global progress against the MDG on water and sanitation [1].
Water contamination can transmit diseases such as cholera, diarrhoea, typhoid, and dysentery. Social inequities within an area lead to the unequal distribution of water among people belonging to the same village or city [4]. Everyone has the right to an adequate supply of safe, accessible, and affordable water for personal and domestic use. Better water sources also mean less expenditure on health, as people fall sick less often, incur fewer treatment costs, and are better able to remain economically productive [5]. Inadequate water supply, uncovered sources of drinking water, open defecation near water sources, poor systems for human waste disposal, and lack of awareness among people are some of the factors that escalate contamination. Therefore, careful management of all water resources is essential to ensure affordable potable water quality. Under the aegis of the Amrita Vishwa Vidyapeetham Live-in Labs® initiative, this research was primarily focused on studying the water quality challenges in Udalka village in the northern Indian state of Uttarakhand [6]. The following section provides an overview of the study area, covering the major occupations, traditional practices, climatic changes, and the issues the villagers confront. The major findings and methods employed in the research are described in Sect. 3. Section 4 covers the design and development of solutions by the participants, followed by closing remarks, conclusion, and future considerations in Sect. 5.
66.2 Study Area
The investigation reported in this study was conducted in Udalka village, in the Uttarkashi district of the state of Uttarakhand. The village's 120 families reside in 65 houses. The main languages spoken by the Udalka residents are Gadwali and Hindi. Dunda, the nearest town to the village, is located about 6 km away. This town's facilities include schools, markets, public health centres, banks, and police stations. The main occupation of the village men is working as labourers in far-off places like Delhi. Women are engaged in agriculture, maintaining livestock, and household work. Some men join the military to serve the nation. Udalka village is located on the bank of the Singuni, which merges with the perennial river Bhagirathi, one of the source streams of the sacred Ganges River, one of the largest water resources of India. Sadly, the Ganges River is reputed to be one of the most polluted rivers in the world. The village has a tank situated at the top, from which most of the houses receive their water supply. Initial field visits revealed that people were highly vulnerable to consumption of unhygienic drinking water. Water travelling long distances inside the dense forest flows through contaminants such as animal carcasses and the excreta of animals and birds. The direct consumption of untreated water is harmful to human health. The concentration of contaminants in water is generally higher in summer than in winter. Children under the age of ten were affected by water-borne diseases mainly during the summer season.
66.3 Methodology and Results
In our study, participatory techniques such as participatory rural appraisal (PRA) and human centered design (HCD) were utilized to understand the problems faced by the villagers. PRA techniques were adopted to gather requirements from various stakeholders, to ease adoption and provide a better solution [7, 8]. This strategy was instrumental in the development of a user-friendly model that could be effectively utilized. The method embraced the villagers as decision-makers and co-participants in the creation of a knowledge-based solution for the proposed model. The model can be used in all stages of development in the design process. Employing this technique in the ideation or conceptual phases is essential to enable partnership with the user community and to acknowledge their contributions to idea generation, knowledge development, and concept development, in order to design pragmatic initiatives. This approach aided in building deep collaboration between the stakeholders, solution providers, and end-users. Using the same methodology, a micro water distribution system was built by researchers at Amrita Vishwa Vidyapeetham to ensure water accessibility for all members of the Udalka community [9].
PRA tools utilized in this study are detailed below:
1. Resource map: The Village Resource Map is a tool that compiles information on the major local resources, such as village infrastructure, available water sources, major crops cultivated, health centres, schools, and religious places [10]. During the initial days of the field visit, this tool was productive, enabling the research team to identify the various local resources available in the village.
2. Problem tree mapping: Problem tree analysis helps in understanding the key hardships endured by the villagers. Mapping provides details regarding the downstream causes and effects of the identified challenges [11]. Problem tree analysis was effective in organizing all the meaningful data collected in the village. The research team identified the following key issues prevalent in the village:
a. No access to clean and safe drinking water.
b. Villagers unaware of the importance of drinking clean water.
c. Susceptibility to health hazards and disease through consumption of unhygienic water.
3. Sketching: Following the activity, all the villagers involved in the discussions of problems and their solutions were asked to sketch their solutions. Group sketching enabled the research team to infer the depth of knowledge the villagers have about solutions to their challenges. Technological advancements in the domain of the selected solution were described to the villagers, based on a review of relevant data from published literature.
4. Awareness campaign: On the last day of the village visit, our team conducted meetings to familiarize the villagers with the importance of clean and safe water. Water purification experiments were performed the previous day to demonstrate the benefits of using a water filter.
5. Human centered design (HCD): HCD is an effective method for tackling challenges in which the end-user plays an equally important role in the design and development of a solution [12–14]. In this process, the participation of users in every step is assured, including interactive discussions on the solutions proposed for the identified problems. Major challenges faced by the villagers were documented, and additional details were collected with human centric tools such as participant observation, in-depth interviews, and field observations, as explained below.
6. Participant observation: Over the course of seven days, the villagers' activities were observed. These observations were structured using the AEIOU observation framework (activities, environment, interactions, objects, and users) [15].
As part of the field observation, the water tank situated at the top of the village was visited to review the condition and depth of water in the water tank, along with the sediments settled in the tank. The water tank had many settled particles at the bottom of the tank which included leaves, sticks, sand, and small insects. In-depth
interviews with the villagers were also conducted, which revealed the different water treatment methods used by the villagers. In addition, interviews with health workers in primary healthcare centres helped in collecting details of the villagers' adherence to hygienic lifestyles and of water-borne diseases.
66.4 Pathway for Development
66.4.1 Treatment System
Based on the existing water quality challenges in the village, a simple, affordable, smart water purification and quality monitoring system was proposed. The proposed solution has an alert feature that notifies the user of any variation in water quality. Four major steps are involved in the water treatment process, each step having a particular purpose indicated by an LED. Untreated raw water coming from the top of the village passes through the first stage, a high-density filter made of medium-sized wire mesh, which effectively removes large particles such as leaves, sticks, and large stones. In the second stage, the TDS of the water is monitored. If the TDS of the water is greater than 100 NTU, this is indicated by the third LED switching on, and water from the third stage is passed on to reverse osmosis (RO) filtration for further purification. The fourth LED indicates the output of the finally purified water after RO filtration. To analyse the efficiency of the proposed solution, a design prototype was developed, and water samples were collected before and after the treatments. The samples were tested to appraise the reduction in concentration of contaminants in the water samples at Udalka (Fig. 66.1). Table 66.1 shows the results of the samples before and after the filtration. This shows that the filtration process was effective in making the water clean and safe.
66.4.2 Pathway for Development: Quality Monitoring System
Analysing water quality data and performing quality checks are crucial in villages, as they tend to have an unregulated water supply across the seasons. Commercially available sensors could be used to monitor the above-mentioned parameters by placing them in the water. Sensing devices for each parameter can be incorporated, and their overall communication as a system can be achieved with the help of the Internet of Things (IoT). Furthermore, the end-user can be alerted through an LED, a buzzer, or a voice message in the local language. An IoT-enabled system ensures quality checks and a regulated supply of water throughout the region. This system can also be extended to places outside Udalka where distribution of clean, safe water is needed.
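As a rough illustration of the alerting step, the sketch below compares a set of sensor readings against acceptable ranges and returns alert messages that could drive an LED, buzzer, or voice message. The limits are taken from the acceptable-limit column of Table 66.1; the parameter keys, the `check_readings` function, and the sample readings are hypothetical, and no specific sensor or IoT library is assumed.

```python
# Acceptable ranges as (minimum, maximum); None means no bound on that side.
ACCEPTABLE_RANGE = {
    "turbidity_ntu": (None, 1.0),
    "ph": (6.5, 8.5),
    "tds_mg_per_l": (None, 500.0),
    "total_hardness_mg_per_l": (None, 200.0),
}


def check_readings(readings):
    """Return one alert message per reading outside its acceptable range."""
    alerts = []
    for name, value in readings.items():
        low, high = ACCEPTABLE_RANGE[name]
        if low is not None and value < low:
            alerts.append(f"{name} = {value} is below the minimum {low}")
        elif high is not None and value > high:
            alerts.append(f"{name} = {value} is above the maximum {high}")
    return alerts


# Hypothetical readings from the sensing devices.
sample = {"turbidity_ntu": 0.8, "ph": 8.9, "tds_mg_per_l": 420.0}
for message in check_readings(sample):
    print(message)
```

In a deployed system, each returned message would be mapped to the chosen alert channel; that mapping is left out here because the text does not specify it.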
Fig. 66.1 Steps involved in water treatment process
Table 66.1 Results of the samples
S. no  Test parameter  Unit   Acceptable limit  Before filtration  After filtration
1      Colour          Hazen  5                 1                  1
2      Odour           –      unobjectionable   unobjectionable    unobjectionable
3      Turbidity       NTU    1                 2.35               1.49
4      pH              –      6.50–8.50         8.67               8.47
5      TDS             mg/L   500               1234.58            856.75
6      Total Hardness  mg/L   200               187.9              165.67
7      Calcium         mg/L   75                37.717             35.3
8      Magnesium       mg/L   30                17                 35.3
9      Chloride        mg/L   250               17.9               17.5
10     Iron            mg/L   1                 1.55               17
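To make the before-and-after comparison concrete, the following sketch computes the percentage reduction for a few rows of Table 66.1 and checks the filtered value against the acceptable limit. The numbers are copied as printed in the table; the function and variable names are illustrative assumptions.

```python
def percent_reduction(before, after):
    """Percentage decrease from before to after (negative means an increase)."""
    if before == 0:
        raise ValueError("before must be non-zero")
    return 100.0 * (before - after) / before


# (parameter, acceptable upper limit, before filtration, after filtration)
ROWS = [
    ("Turbidity (NTU)", 1.0, 2.35, 1.49),
    ("TDS (mg/L)", 500.0, 1234.58, 856.75),
    ("Total hardness (mg/L)", 200.0, 187.9, 165.67),
]

for name, limit, before, after in ROWS:
    reduction = percent_reduction(before, after)
    status = "within limit" if after <= limit else "above limit"
    print(f"{name}: {reduction:.1f}% reduction, {status} after filtration")
```

For the TDS row, for example, this gives a reduction of roughly 30.6%, while the filtered value of 856.75 mg/L remains above the 500 mg/L limit listed in the table.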
66.5 Discussion and Conclusion
The data collection was conducted with the help of various tools, which helped in understanding the thoughts and knowledge of the villagers. Key results of this research clearly show that water quality was the most significant issue affecting the village's residents. Although significant technical improvements have been made, the majority of villages continue to use unproductive old methods and practices. Hence the importance of presenting a technological solution for the purification and rationing of drinking water. The study also sought to assess the readiness of end-users to migrate to new technologies to obtain better quality water in a region rich in water resources. The aim is to provide the villagers with pure drinking water after analysing the quality of the water brought from the source, and thus to avoid the spread of water-borne diseases and support healthy and safe living. Although water purifiers are available on the market, a major concern is that their function ends with the purification process. Integrating IoT with the purification system will therefore bridge the gaps that the existing solutions leave after filtration. This study can be developed further by designing and implementing the IoT model as the second stage of work. The model can be designed keeping in mind the essential parameters for water quality checks and rationalization, using microcontrollers such as Arduino or Raspberry Pi.
Acknowledgements This research was funded by the E4LIFE International Ph.D. Fellowship Program offered by Amrita Vishwa Vidyapeetham. We extend our sincere gratitude to the Amrita Live-in-Labs® academic program for providing all the support. The authors express their immense gratitude to Sri Mata Amritanandamayi Devi, Chancellor of Amrita Vishwa Vidyapeetham, who has inspired them in performing selfless service to society. The authors also thank Mrs. Subhadra and Ms. Neelam, the Village Coordinators, and all the stakeholders in the village for cooperating with this study and guiding us throughout the process, which made this work possible despite the hurdles faced at both ends. We also thank the staff members of the Amrita Self Reliant Villages (Amrita SeRVe) program.
Chapter 67
Implementing AI-Based Comprehensive Web Framework for Tourism Nada Rajguru, Harsh Shah, Jaynam Shah, Anagha Aher, and Nahid Shaikh
Abstract In today’s societies, business activity is geared toward competing effectively with other market contenders, not only through a physical presence but also through a virtual one. The smart tour recommender is a web-based framework that assists tourists with tour planning. Unlike other similar web-based systems, our system streamlines all the processes required for travel planning, making it easy and convenient to use. It provides a wide range of features, such as providing information about tourist attractions, recommending tours based on the user’s interests, searching for hotels or restaurants based on the user’s budget, booking accommodation, and providing users with a personalized itinerary. It focuses on making e-tourism easier and more convenient as more and more people use such travel websites to plan their trips.
Keywords Virtual presence · Travel planner · Recommending tours · User’s interests · Personalized itinerary · e-tourism
N. Rajguru (B) · H. Shah · J. Shah · A. Aher · N. Shaikh
A.P. Shah Institute of Technology, Mumbai University, Thane, India
e-mail: [email protected]
H. Shah e-mail: [email protected]
J. Shah e-mail: [email protected]
A. Aher e-mail: [email protected]
N. Shaikh e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_67
67.1 Introduction
The focus of this research is to create an AI-based web framework for tourism. The recommendation framework is a web-based system that can deliver a customized list of vacation sites, eateries, and accommodations depending on the traveler’s preferences. Conventional recommendation techniques such as content-based filtering and collaborative filtering are known to be advantageous in the tourism industry. Content-based recommendations are made from the information gathered about the user’s interests and preferences. As the user makes more use of the service, collaborative recommendations suggest items based on similar profiles. Furthermore, the design of a chatbot is provided, which uses artificial intelligence to give accurate responses to user questions and lets people communicate by text in order to organize visits and ask about interesting places worth visiting. The Flask development framework is used to build the system since it emphasizes rapid development and simple, pragmatic design. The system methodology is discussed comprehensively with the aim of explaining the functionalities of the system.
67.2 Literature Review
In relation to the tourism industry and recommendation, context can be defined as the characteristic information of an entity such as a user or an object. Contextual information can help personalize recommendations when the available information about the item or person is not sufficient. This method is beneficial in producing recommendations on tours, travel, and places, as suggested by Thomas and John [1]. It is also useful for recommending tourist sites according to the user’s location, mood, the climate of the user’s environment, etc. According to Adomavicius and Tuzhilin [2], the proper application of algorithms in recommendation systems is critical to delivering improved results. The dependability of a recommendation algorithm is determined by the qualities of its data. In some circumstances, an algorithm may generate superior results on one data set but fail to produce desirable results on another. A range of characteristics, such as the number of users, the number of items, the distribution of user ratings, and the influence of data sparsity, define an algorithm’s applicability [3]. Yoke Cheng and Noor Raihan describe how understanding web surfers’ behavior and preferences allows travel and tourism service providers to strategize their businesses effectively. They analyse the correlations and differences between the web browsing inclinations of individuals and of web users worldwide. The authors propose that travel and tourism service providers use a hybrid recommendation technique that combines demographic, content-based, and preference-based filtering [4].
Regardless of how valuable their functionalities are, existing travel applications lack one or more of the following: the ability to create user profiles (e.g., makemytrip.com), view places of interest near the user’s destination or current location (e.g., makemytrip.com, Expedia, and Trivago.in), calculate routes, or track the user’s travel history (e.g., Tripadvisor.in). Furthermore, none of them offers the option of setting aside personal time periods in the trip plan.
67.3 Proposed System
67.3.1 Technology Stack
Flask Dashboard. Flask Dashboard is a Python library that extends Flask-based Python applications with functionality that helps developers monitor performance and utilization. The dashboard and the web application are written in Python using Flask, which binds the application’s web services and adds extra routes to the service to make interaction efficient.
JavaScript Object Notation (JSON). JSON is a suitable data interchange format because it serializes structured data for transfer over a network and is supported by all modern languages. In this application, JSON is the format used to build the knowledge base. The JSON data is preprocessed and transformed before it is passed to the model, which needs a representation of the natural-language input, and it is used to evaluate the results of all models and choose the best-performing model for the chatbot on our dataset.
TensorFlow using Python. TensorFlow is an open-source software library for jobs that require a large amount of numerical processing. Compared with other machine learning libraries, TensorFlow has a faster compile time. It is compatible with CPUs, GPUs, and distributed processing, and its Python API is simpler to use than its C++ API.
Global Positioning System (GPS). The GPS in our system keeps track of the tourist’s location in real time. The web application keeps updating the location as the tourist moves, which makes the system location-aware. The GPS receiver provides position and time data to any device by using navigation satellites, 24 of which are operational and three of which serve as backups [5]. Trilateration is then used to estimate the user’s position, which is fed to the system to recommend nearby locations to visit.
MongoDB. MongoDB is an open-source document database that stores structured and unstructured data in a JSON-like format. All individual records are held as documents, which are groupings of fields with a dynamic schema. Because it does not require a fixed schema, MongoDB is more flexible than a relational SQL database. The document data model maps naturally onto the objects of modern programming languages, which makes it a powerful way to store and retrieve data.
Google Maps API. By using the Google Maps API, maps can be added to any web application. The API lets the system provide guidance based on Google Maps data. Google Maps servers handle all responses to map requests. To use the Google Maps API, the developer has to register the project on the Google Developer Console and obtain a Google API key, which is then used on the site. Without the key, Google Maps services cannot be accessed.
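As a rough illustration of how a record could be serialized as JSON for transfer and later stored as a document, the sketch below uses only Python’s standard `json` module. The record and its field names (`name`, `category`, `location`, `tags`) are hypothetical and are not taken from the system described in this chapter.

```python
import json

# Hypothetical place record; the field names are illustrative assumptions.
place = {
    "name": "Heritage Fort",
    "category": "architectural heritage",
    "location": {"lat": 19.2183, "lon": 72.9781},
    "tags": ["history", "outdoor"],
}


def to_json(record: dict) -> str:
    """Serialize a record to a JSON string for transfer or storage."""
    return json.dumps(record, sort_keys=True)


def from_json(text: str) -> dict:
    """Parse a JSON string back into a Python dictionary."""
    return json.loads(text)


payload = to_json(place)       # text that can be sent over the network
restored = from_json(payload)  # the same structure on the receiving side
```

Nested fields such as `location` survive the round trip unchanged, which is the property that makes the same structure usable both on the wire and in a document store.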
Google Places API. This API can be used to create a location-aware page. The API responds with local businesses and other places near the device’s location. To implement the Google Places API, the developer must register the project on the Google Developer Console. Additionally, in order to access the Google Places API, the developer must include Google Play services in their development project [6].
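The location-aware behaviour described above depends on finding places close to the tourist’s current position. The sketch below is a local stand-in for that step, not a call to the Google Places API: it assumes a small in-memory list of places (the `name`, `lat`, and `lon` keys are hypothetical) and ranks them by great-circle distance using the haversine formula.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearby(places, lat, lon, radius_km):
    """Return (distance_km, name) pairs within radius_km, nearest first."""
    hits = []
    for p in places:
        d = haversine_km(lat, lon, p["lat"], p["lon"])
        if d <= radius_km:
            hits.append((d, p["name"]))
    return sorted(hits)


# Hypothetical sample data for illustration only.
PLACES = [
    {"name": "Cafe A", "lat": 0.0, "lon": 0.01},
    {"name": "Museum B", "lat": 0.0, "lon": 1.0},
]
close_by = [name for _, name in nearby(PLACES, 0.0, 0.0, radius_km=5.0)]
```

In a deployed system the candidate list would come from the places service rather than from a hard-coded list; the distance ranking itself would stay the same.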
67.3.2 Architecture
Users can make use of the website’s features by logging in to the system after browsing the site. Only users who have registered for the website can log in. New users can sign up for an account and create a username, password, and other credentials, which are then recorded in the system database. After registering, a user may log in with the username and password chosen at sign-up. The login succeeds when the login information matches that stored in the database, and the user continues to the next page. After a successful login, the process of locating the best attraction for the user begins with the user providing tour information. This procedure takes into account the interests and preferences of the user. The category of tourist attraction that the tourist chooses to visit, such as indoor sites, outdoor activities, architectural heritage sites, environmental regions, and so on, is collected from the user. The user must choose two or three categories of interest. The recommendation system uses AI/ML to provide recommendations based on the data (interests and preferences) provided by the user. Its two main objectives are to provide users with generic travel plans and lodging options based on their preferences. The user is given a generic plan to which they may apply various filters to design the components of the tour that suit them best (Fig. 67.1). The filters include gross tour budget, area of stay, type of lodging, length of stay, hotel type, culinary preferences, and so forth. Following that, the customer may make a hotel reservation after deciding on a budget, length of stay, quality of stay, service, and food preference. The system allows users to personalize their plans and is appropriate for people of all ages and interests.
The users are given a wide variety of options that can be adjusted from one individual to the next. By selecting the cuisine they like, users can pick a restaurant conveniently and quickly. Next, they may narrow down the best restaurants by picking a restaurant category, such as fast food, vegetarian, non-vegetarian, Jain, and so on. Users may also pick a price range to spend at the restaurant, as well as the type of restaurant they want to visit, such as fine dining, casual dining, buffet, café, fast casual, and so on. The process for booking a hotel room is similar to that for choosing a restaurant. To begin, customers choose their desired budget for lodging. The user then chooses the type of room they want to stay in, such as single, double, deluxe, twin, or suite. The customer must then choose depending on availability, budget, luxury, cleanliness, pool, and laundry facilities, among other factors. Following this, the customer is given a unique, tailored trip plan, complete with tourist information, nearby eateries, and an itinerary.
Fig. 67.1 Proposed system architecture
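The registration and login steps of this architecture can be sketched as follows. This is a minimal sketch under stated assumptions: the chapter says only that a hashed password is kept in the database (Sect. 67.3.3), so the salted PBKDF2 scheme, the iteration count, and the in-memory `UserStore` class that stands in for the system database are all illustrative choices, not the authors’ implementation.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor; tune for the deployment


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a salted hash so the plain password is never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


class UserStore:
    """In-memory stand-in for the system database (hypothetical)."""

    def __init__(self):
        self._users = {}  # username -> (salt, password_hash)

    def sign_up(self, username: str, password: str) -> bool:
        """Register a new user; return False if the username is taken."""
        if username in self._users:
            return False
        salt = os.urandom(16)
        self._users[username] = (salt, hash_password(password, salt))
        return True

    def log_in(self, username: str, password: str) -> bool:
        """Return True only if the entered password matches the stored hash."""
        record = self._users.get(username)
        if record is None:
            return False
        salt, stored = record
        # Constant-time comparison avoids leaking how many bytes matched.
        return hmac.compare_digest(stored, hash_password(password, salt))
```

A call such as `UserStore().sign_up("asha", "s3cret")` followed by `log_in("asha", "s3cret")` mirrors the flow in which the user continues to the next page only when the credentials match the stored record.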
67.3.3 Principle Working of the System
AI-based Chatbot. A chatbot is an artificial intelligence (AI) program that can simulate a conversation with a user in natural human language through messaging channels such as websites, mobile applications, or messaging apps. This project incorporates an AI chatbot built with TensorFlow, a Python library. While using the website, users thus get a personalized automated support assistant to guide them through the system. The AI bot is trained using intents and entities to identify and understand the user’s request and to provide suitable trained replies. However, the AI chatbot can only be used for basic user queries. More complex user queries can be posted on the forum. The forum is a community of travel agents and designated forum administrators who assist users by resolving their questions (Fig. 67.2).
Fig. 67.2 Output generated from trained AI chatbot using TensorFlow
Sign up Process. If a user is new to the website, the user first needs to register on the website. In this process, the user is asked to enter basic information and to provide a username and a password for logging in to the website later. The information taken from the user is sent to the server database, where it is stored for future reference. To protect the user’s data, the password is stored in the database in hashed form. After the registration process is complete, the message “sign up successful” is displayed.
Login Process. Users can use the website’s features by logging in to the system after browsing through it. Users who have registered for the website can log in with the username and password chosen during sign-up. The login and password entered by the user are compared with the data saved in the database. Once the login details match the saved details, the login is successful and the user moves to the next page (Fig. 67.3).
Fig. 67.3 Login process
Client Data Input Process. After a successful login, the process of finding the best attraction for the user begins with collecting details of the tour from the user. It considers the user’s interests and choices for hotels, restaurants, and attraction sites. The data taken from users include the type of tourist attraction the user wishes to tour, e.g., outdoor activities, indoor sites, architectural heritage sites, environmental locations, etc. The user must select two or more types of preferred interests (Fig. 67.4).
Fig. 67.4 Client data input process
Recommendation System. In the content-based filtering technique, recommendations are made by analyzing the items’ attributes. It makes use of domain-dependent algorithms. The recommendations are derived from user profiles that are built from the characteristics of the users’ interactions with the system. To provide recommendations, content-based filtering algorithms apply various models, such as vector space models. Models such as neural networks and decision trees are also used to learn the similarities or relationships between different items. The user profiles are then set up from the terms related to the users’ interests, obtained from the different types of feedback. These terms are classified into two classes, positive and negative. Only the positive class is relevant to the user for a recommendation; the negative class is irrelevant [7]. The two principal functions of the recommendation system are to recommend generalized tour plans and hotel suggestions to users based on their preferences. The recommendation system operates using artificial intelligence/machine learning algorithms that use historical data as input to predict new output values. It produces a unique output from the data (interests and preferences) given as input by the user. Hence, a wide range of suggestions is generated from the different data collected about the user’s preferences [8] (Fig. 67.5).
Data Dynamic Updating. Adapting to changing information is an important function, so that the data in the system is updated in a timely and accurate manner as it interacts with the data model. Dynamic data provides functionality that helps in developing a data-driven application that performs create, read, update, and delete (CRUD) operations. It can change how users see and modify data fields. Data maintenance is regulated to preserve the integrity and authenticity of the system. Proper implementation of dynamic data collection and maintenance is therefore crucial to improving the quality of the system.
Filter and Reservation Process. Users can proceed with the generated tour itinerary simply by using the system’s tour, restaurant, and hotel recommendations, or they can add filters to the recommended itinerary to tailor the tour plan to their needs.
A generalized itinerary is generated, to which the user can add various filters to plan the aspects of the tour that suit them. The filters are based on gross tour budget, area of stay, type of accommodation, period of stay, type of hotel, food preference, etc. Subsequently, the user can proceed with a hotel reservation after determining the budget, time of stay, quality of stay, service, and choice of cuisine.
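The filtering step can be illustrated with a short sketch. It assumes a hypothetical `Hotel` record and sample data; the field names, hotel names, and prices are invented for illustration, and each filter is optional so that a user can apply any combination of them to the generated plan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hotel:
    """Hypothetical lodging record; all fields are illustrative assumptions."""
    name: str
    area: str
    room_type: str
    price_per_night: float
    cuisine: str


def apply_filters(hotels, nights, max_budget=None, area=None,
                  room_type=None, cuisine=None):
    """Keep hotels that pass every supplied filter (None means no filter)."""
    kept = []
    for h in hotels:
        if max_budget is not None and h.price_per_night * nights > max_budget:
            continue  # total stay would exceed the lodging budget
        if area is not None and h.area != area:
            continue
        if room_type is not None and h.room_type != room_type:
            continue
        if cuisine is not None and h.cuisine != cuisine:
            continue
        kept.append(h)
    # Cheapest options first, so the list is easy to scan.
    return sorted(kept, key=lambda h: h.price_per_night)


HOTELS = [
    Hotel("Lakeview Inn", "Thane", "double", 3000.0, "vegetarian"),
    Hotel("City Suites", "Mumbai", "suite", 9000.0, "non-vegetarian"),
    Hotel("Budget Stay", "Thane", "single", 1500.0, "vegetarian"),
]
within_budget = [h.name for h in apply_filters(HOTELS, nights=2, max_budget=7000)]
```

Treating every filter as optional keeps the generic plan as the default result and lets each added filter only narrow it, which matches the way the filters are described as refinements of the generated itinerary.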
Fig. 67.5 Content-based filtering recommender system [7]
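A vector space model of the kind mentioned for content-based filtering can be sketched in a few lines. The sketch below assumes that each attraction is described by a list of terms and that the user profile is the list of terms from the user’s positive feedback; it scores items by cosine similarity between term-frequency vectors. The attraction names and terms are hypothetical, and this is one simple instance of the technique rather than the model used by the authors.

```python
import math
from collections import Counter


def vectorize(terms):
    """Build a term-frequency vector from a list of terms."""
    return Counter(terms)


def cosine(u, v):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(profile_terms, items, top_n=3):
    """Rank items by similarity to the profile; drop items with no overlap."""
    profile = vectorize(profile_terms)
    scored = [(cosine(profile, vectorize(terms)), name)
              for name, terms in items.items()]
    scored = [(s, n) for s, n in scored if s > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_n]]


# Hypothetical attractions and a profile built from positive feedback.
ITEMS = {
    "Fort Walk": ["heritage", "outdoor", "history"],
    "Science Museum": ["indoor", "science"],
    "Lake Trail": ["outdoor", "nature"],
}
picks = recommend(["outdoor", "heritage"], ITEMS)
```

Items that share no terms with the positive profile receive a score of zero and are left out, which corresponds to treating only the positive class as relevant.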
67.3.4 Obstacles
Tour recommendation originates from the orienteering problem and its variations, which do not include customization for specific users. As a consequence, given identical starting and ending points of interest (POIs) and the same time budget as inputs, the same tour route is recommended to all users [9].
Orienteering Problem. In the orienteering problem (OP), participants visit checkpoints with predetermined scores and attempt to maximize their overall score within a certain time limit. Numerous studies in recent years have used the OP and its many variants for tour recommendation. Similarly, several online applications [10] have been built on the OP. We begin by describing the original OP [11, 12] and how it has been applied to the domain of tour recommendation. Many systems concentrate on specific cities, each of which has its own set of POIs. A tourist visiting a specific city has a certain time or distance budget and has to choose preferred starting and ending POIs. The budget signifies the amount of time the tourist is willing to spend on a tour or the distance the tourist is willing to travel. Similarly, the starting and ending POIs indicate the tourist’s preference to begin the trip near a specific spot (e.g., the tourist’s hotel) and complete the tour at a different point (e.g., near a restaurant). Thus, given a set of criteria such as a budget, a starting POI, and a destination POI, our major goal is to recommend a tour itinerary that maximizes a certain score while adhering to the budget, starting POI, and destination POI constraints.
Problem for Itinerary Mining. The itinerary mining problem (IMP) was introduced on the basis of the OP [13]; it tries to find an itinerary that maximizes POI popularity while keeping the touring time within a predetermined budget. The model is based on the number of visits by unique visitors, on transit times between POIs taken as the median transit time over all tourists, and on POI visit times derived from the visit times of all tourists. To solve the IMP, a recursive greedy algorithm [14] is used, which attempts to determine the itinerary’s middle node and then recursively solves the two halves.
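To make the budget and start/end constraints of the orienteering problem concrete, the sketch below builds a tour with a simple score-per-travel-time greedy heuristic. It is not the recursive greedy algorithm of [14] and gives no optimality guarantee; it assumes Euclidean distance as a stand-in for travel time, ignores time spent at each POI, and uses hypothetical POIs and scores.

```python
import math


def travel_time(a, b):
    """Euclidean distance used as a stand-in for travel time."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def greedy_tour(pois, start, end, budget):
    """Greedy heuristic for the orienteering problem.

    pois maps a name to (x, y, score); start and end are (x, y) points.
    Returns (route, total_score, time_used).
    """
    route, score, used = [], 0.0, 0.0
    current = start
    remaining = dict(pois)
    while remaining:
        best = None
        for name, (x, y, s) in remaining.items():
            leg = travel_time(current, (x, y))
            back = travel_time((x, y), end)
            if used + leg + back > budget:
                continue  # visiting this POI would leave no time to reach the end
            ratio = s / leg if leg > 0 else float("inf")
            if best is None or ratio > best[0]:
                best = (ratio, name, leg)
        if best is None:
            break  # nothing else fits in the remaining budget
        _, name, leg = best
        x, y, s = remaining.pop(name)
        route.append(name)
        score += s
        used += leg
        current = (x, y)
    used += travel_time(current, end)  # final leg to the destination
    return route, score, used


# Hypothetical POIs: name -> (x, y, score).
POIS = {"A": (1.0, 0.0, 5.0), "B": (2.0, 0.0, 4.0), "C": (10.0, 0.0, 50.0)}
route, total, used = greedy_tour(POIS, start=(0.0, 0.0), end=(3.0, 0.0), budget=4.0)
```

With these inputs the high-scoring POI `C` is skipped because reaching it would exceed the budget. The heuristic returns the same route for every user given the same inputs, which is the lack of personalization noted at the start of this section.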
67.4 Technical Methodology
67.4.1 Bias Removal
Method by Joshua C. [15]. We begin by transforming the ratings into a more normalized distribution using three forms of bias removal. The first form computes the average rating across all users and items. The second and third forms repeat the same process for individual users and for individual items, respectively. Because the sample of ratings for a given user or item may be small, Laplace smoothing is used to reduce the influence of outliers.
The first model, base bias, is a simple system that uses the current mean rating as the predicted rating. It is a straightforward model, but it can serve as a starting point for more sophisticated models and as a benchmark against which later improvements can be assessed. The second model, user bias, attempts to reduce the baseline model’s prediction error by recognizing that users may have intrinsic biases that are reflected in their ratings. Starting with the baseline mean, we calculate each individual user’s bias. The third model, item bias, recognizes that, just as each user has biases, each item (trip) may likewise have a general bias associated with it. Starting with the baseline mean, we calculate each individual item’s bias. The fourth model, user-to-item bias, and the fifth model, item-to-user bias, are nearly identical to each other. They begin by computing the residual bias that remains after removing either the user bias or the item bias, and the predicted ratings for the user-to-item and item-to-user models are then determined. These five models serve as baseline predictors that our subsequent models use as a stepping stone to better performance.
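The baseline predictors described above can be sketched as follows. The sketch assumes ratings are given as `(user, item, rating)` triples and uses a damping constant added to each count as a simple stand-in for the smoothing step; the exact smoothing used in [15] is not reproduced here, and the function names and sample ratings are hypothetical. It implements the user-to-item variant: the global mean, then each user’s bias, then each item’s residual bias.

```python
from collections import defaultdict


def mean_rating(ratings):
    """Global mean over (user, item, rating) triples: the base-bias model."""
    return sum(r for _, _, r in ratings) / len(ratings)


def damped_bias(ratings, key_index, offset, damping):
    """Smoothed mean residual per user (key_index=0) or item (key_index=1).

    offset(user, item) is the part of the rating already explained.
    """
    total, count = defaultdict(float), defaultdict(int)
    for row in ratings:
        user, item, r = row
        key = row[key_index]
        total[key] += r - offset(user, item)
        count[key] += 1
    # The damping term shrinks biases estimated from few ratings toward zero.
    return {k: total[k] / (count[k] + damping) for k in total}


def fit_user_to_item(ratings, damping=1.0):
    """Fit global mean, user bias, then item residual bias; return a predictor."""
    mu = mean_rating(ratings)
    b_user = damped_bias(ratings, 0, lambda u, i: mu, damping)
    b_item = damped_bias(ratings, 1, lambda u, i: mu + b_user[u], damping)

    def predict(user, item):
        # Unknown users or items fall back to a bias of zero.
        return mu + b_user.get(user, 0.0) + b_item.get(item, 0.0)

    return mu, b_user, b_item, predict


# Hypothetical trip ratings for illustration.
RATINGS = [("u1", "t1", 5), ("u1", "t2", 3), ("u2", "t1", 4), ("u2", "t2", 2)]
mu, b_user, b_item, predict = fit_user_to_item(RATINGS, damping=1.0)
```

The item-to-user variant follows by swapping the order of the two `damped_bias` calls, so that item biases are fitted first and user biases are fitted on the remaining residual.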
67.5 Conclusion
In this project, we have created an e-tourism website that allows users to plan their trip entirely by themselves, from researching places and finding restaurants and hotels suited to their budget and preferences to booking hotels and generating unique personalized tour plans. The system provides a streamlined approach to the entire tour planning process. The project helps users connect with the tourism community and get support when they have tour-related queries. It thereby improves efficiency, simplifies the process, and reduces the time needed to plan a tour.
67.6 Future Scope
As AI/ML is employed for analysis and solution generation, the framework will become quicker and more efficient. Once it has been used and managed domestically, the system may be extended to international tours all over the world. It will allow customers to reserve trains, automobiles, planes, and other kinds of transportation. The system can collaborate with hotels, restaurants, and other eateries to promote and highlight them on the site. If a recommendation system could be developed that considers only the visual content of a video, it could become a highly accurate recommender system. A more realistic step toward this is to develop a model that makes recommendations according to the visual content of short video clips instead of the whole video [16]. This could also ensure that the recommended list varies according to the latest trends in video content.
References
1. Barranco, M.J., Noguera, J.M., Castro, J., Martínez, L.: A context-aware mobile recommender system based on location and trajectory. In: Casillas, J., Martínez-López, F., Corchado Rodríguez, J. (eds) Management Intelligent Systems. Advances in Intelligent Systems and Computing, vol. 171. Springer, Berlin, Heidelberg (2012)
2. Adomavicius, G., Tuzhilin, A.: Context-aware recommender systems. In: Ricci, F., Rokach, L., Shapira, B., Kantor, P. (eds) Recommender Systems Handbook. Springer, Boston, MA (2011)
3. Pazzani, M.J., Billsus, D.: Content-based recommendation systems. In: Brusilovsky, P., Kobsa, A., Nejdl, W. (eds) The Adaptive Web. Lecture Notes in Computer Science, vol. 4321. Springer, Berlin, Heidelberg (2007)
4. Cheng, A.Y., Ab Hamid, N.R.: Behaviour and preferences in browsing the travel and tourism websites. In: 2011 IEEE Colloquium on Humanities, Science and Engineering (2011)
5. Fatima, R., Zarrin, I., Qadeer, M.A., Umar, M.S.: Mobile travel guide using image recognition and GPS/Geo tagging: A smart way to travel. In: 2016 Thirteenth International Conference on Wireless and Optical Communications Networks (WOCN) (2016)
6. Jafri, R., Alkhunji, A.S., Alhader, G.K., Alrabeiah, H.R., Alhammad, N.A., Alzahrani, S.K.: Smart travel planner: A mashup of travel-related web services. In: 2013 International Conference on Current Trends in Information Technology (CTIT) (2013)
7. Meteren, R., Someren, M.: Using content-based filtering for recommendation (2000)
8. Ekstrand, M.D., Riedl, J.T., Konstan, J.A.: Collaborative filtering recommender systems. Found. Trends Hum. Comput. Interact. 4(2), 81–173 (2011)
9. Lim, K.H., Wang, X., Chan, J., Karunasekera, S., Leckie, C., Chen, Y., Tan, C.L., Gao, F.Q., Wee, T.K.: PersTour: A personalized tour recommendation and planning system. In: Proc. of HT’16 (2016)
10. Vansteenwegen, P., Oudheusden, D.V.: The mobile tourist guide: An OR opportunity. OR Insight 20(3), 21–27 (2007)
11. Gunawan, A., Lau, H.C., Vansteenwegen, P.: Orienteering problem: A survey of recent variants, solution approaches and applications. Euro. J. Operational Res. 255(2), 315–332 (2016)
12. Vansteenwegen, P., Souffriau, W., Oudheusden, D.V.: The orienteering problem: A survey. Euro. J. Operational Res. 209(1), 1–10 (2011)
13. Lim, K.H., Chan, J., Karunasekera, S., Leckie, C.: Tour recommendation and trip planning using location-based social media: a survey. Knowl. Inf. Syst. 60 (2019). https://doi.org/10.1007/s10115-018-1297-4
14. Chekuri, C., Pal, M.: A recursive greedy algorithm for walks in directed graphs. In: Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science (FOCS ‘05). IEEE Computer Society, USA, pp. 245–253 (2005)
15. Stomberg, J.C.: A comparative study and evaluation of collaborative recommendation systems. Master’s Thesis, Michigan Technological University (2014)
16. Davidson, J., Liebald, B., Liu, J., Nandy, P., Van Vleet, T.: The YouTube video recommendation system. In: Proceedings of the fourth ACM conference on Recommender systems–RecSys (2010)
Chapter 68
Smart Water Resource Management by Analyzing the Soil Structure and Moisture Using Deep Learning
Sharfuddin Waseem Mohammed, Narasimha Reddy Soora, Niranjan Polala, and Sharia Saman
Abstract Resource management is one of the important steps in smart and precise agriculture systems to increase crop yield. Most existing technologies for smart water resource management rely on moisture sensors, which collect data and control the water resources in the agricultural field. Here, we propose a methodology in which data is collected with a moisture sensor and soil images from real-time video are analyzed to detect the moisture level for the sown crop, and a decision is then made whether to irrigate the crop. The VGG-19 model is used to classify the image type and the crop sown in order to decide the amount of water required for irrigation. The Kaggle soil structure dataset is used to train the model, and testing is performed on a proprietary dataset to evaluate the proposed model. Experimentation is conducted on low-computing hardware using a TensorFlow Lite model with transfer learning, and the proposed method achieves considerable performance.
Keywords Image classification · Computer vision · Resource management · Soil structure · Soil moisture
S. W. Mohammed (B) · N. R. Soora · N. Polala · S. Saman
Department of Computer Science and Engineering, Kakatiya Institute of Technology and Sciences, Warangal, Telangana, India
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_68
68.1 Introduction
Soil particles hold water through adhesion, or tension. The tension between water molecules and soil particles is measured with a tensiometer. Plant roots must overcome this tension to separate water molecules from the soil particles. Soil moisture sensors are used for good management of an irrigation system. To increase profitability and crop yield, a planned irrigation system is needed that provides knowledge of what is happening at the roots of the crops. Soil type recognition helps identify suitable crops, suitable types of fertilizer, etc. Agricultural irrigation is an important area in which to build systems with reasonable plans. Jingxin Yu et al. [1] used soil water content (SWC) to improve prediction accuracy at multiple depths. SWC data and meteorological data are combined, monitored, and fed to a BiLSTM and a ResNet for spatial as well as time-series feature extraction. The analysis was done in China (2016–2018) for seven maize stations. The data was trained and tested at depths of 20 cm, 30 cm, 40 cm, and 50 cm. The results were analyzed using mean square error and mean absolute error, and the proposed algorithm was compared with existing state-of-the-art moisture detection and recognition techniques to show its superior performance.
Fresh water is an essential requirement for human life. Because about two-thirds of global fresh water is used for agricultural irrigation, saving it is essential. Earlier agricultural irrigation systems were scheduled manually, but advances in sensor networks have modernized irrigation systems. Using sensors such as soil moisture profile sensors avoids unnecessary water wastage in irrigation. Because the sensors are used outdoors, they should be robust, work for a long time, and require little maintenance. Conventional water resource management operates on sensor data alone, whereas in our proposed method both sensor data and visual data are collected for decision making. Yanxiang Yang et al. [2] used soil moisture sensors for efficient use of water in agricultural irrigation. When expectations are high, sensor networks need to be more reliable. In that study, the authors implemented a reliable soil moisture detection system: fault detection is used to detect sensor problems, and a genetic algorithm is used for optimization. The proposed methodology improves the accuracy of fault detection and the reliability in terms of mean time. Soil moisture retrieval techniques are used to monitor soil moisture in real time using mission critical (MC) sensors.
Image retrieval is also used: images carrying similar information are retrieved from the available database by matching their features against those of the query image. Wang et al. [3] used a convolutional neural network (CNN) with time and frequency analysis in a soil moisture retrieval algorithm. Pretrained networks such as VGGNet and AlexNet classify the time-frequency (TF) distribution at different signal-to-noise ratio (SNR) values; AlexNet classifies the different TF patterns more accurately.

Maximizing crop yield also requires calculating the moisture content of the soil. Sobayo et al. [4] use soil temperature differences to estimate soil moisture accurately. Drones capture thermal images, which are processed with a CNN and a regression model. Soil moisture content can be calculated from the temperature values provided by the thermal IR images. On a test dataset, the model outperformed a basic deep neural network (DNN) in estimating soil moisture and soil type.

Important physical characteristics of soil include soil color, soil texture, and soil moisture content. From these characteristics, one can decide whether a given soil type needs irrigation. Xu et al. [5] studied soil moisture calculation using image-based deep learning with a CNN. The algorithm detects moisture automatically and recognizes whether the soil needs irrigation. Soil moisture is detected from training data passed through network layers such as convolution and pooling layers; 1200 images are used to train the network to obtain a stable structure for moisture and humidity.

68 Smart Water Resource Management by Analyzing the Soil Structure …
68.2 Literature Survey
The literature survey covers research articles published to date on soil type detection and soil moisture detection. Many existing approaches use basic machine learning techniques for soil type classification and moisture categorization.

The agricultural structure and land-use patterns in the Erhai Lake Basin have had a striking impact on its water environment. Li Chunmei et al. [6] identify the main contributing factors for crop species through gray correlation degree analysis and propose division management of the basin based on water conservation, so that agricultural development can improve while the water environment is protected.

The purpose of spatiotemporal soil moisture estimation for agricultural drought risk management is nowcasting and forecasting of agricultural drought. The inputs to the multiple-input time-delay neural network for spatiotemporal multi-depth soil moisture estimation are temporal rainfall, irrigation coverage, normalized difference vegetation index (NDVI), and bundle-based evapotranspiration data. The soil moisture estimation model provides nowcasting capability for water-stress management [7].

Soil testing is vital for increasing yield. Soil-based crop selection and fertilizer management systems have evolved considerably, along with a standardized approach to characterizing the composition of bare soil, including trace metalloids and mesological parameters. For this purpose, various sensors, such as soil, temperature, and camera sensors, are used to control and monitor the agricultural field. Water level sensors, GSM, and a controller are used to control the irrigation system [8].
68.3 Proposed Work
68.3.1 Overview
The proposed method consists of two modules, a recognition module and an activity module; this section explains both in detail with an architecture diagram.

The recognition module trains a VGG-19 model on a local host to identify soil types and structure, and generates a modified pretrained model file. Training uses a Kaggle dataset and testing uses a proprietary dataset; during testing, false-negative images are used for retraining to improve accuracy.

The activity module is an on-premises edge computing device with limited computing resources. The pretrained model from the recognition module is therefore deployed on the edge device for inference, where it validates on-premises images to support water resource management decisions. Any microcontroller with minimal hardware support can also serve as the edge computing device. If the system fails to identify the soil type, the recognition model is retrained on the false-negative results with new features. The inference model is compressed with the TensorFlow Lite framework, which allows the soil type to be identified on-premises. The overall process is shown in the proposed architecture in Fig. 68.1.

Fig. 68.1 Proposed system architecture
68.3.2 Dataset Details
The VGG-19 model is used as the backbone to identify soil structure and patterns, which supports decisions for water resource management.
Fig. 68.2 a Sample images for alluvial soil, b sample images for black soil, c sample images for red soil, and d sample images for clay soil
We used soil type data from Kaggle, which includes images of four soil types:
• Alluvial soil
• Black soil
• Red soil
• Clay soil
The dataset consists of four categorical classes, which makes the model easy to train. Each class is described below, and sample images showing the structure and pattern of each soil type are provided in Fig. 68.2.

Alluvial soil is soil deposited by surface water. It is mostly found in large flood regions and is more widespread than the other soil types. It forms through the transformation of rock, which takes almost a million years (Fig. 68.2a). Black soil is a mineral soil with a black surface that contains carbon (Fig. 68.2b). Red soil is found in areas with a moist climate and high temperature, and in warm places such as mixed forest; its color is yellowish brown (Fig. 68.2c). Clay soil is a combination of moisture, organic materials, chemical components, and living organisms. A soil's capacity for high crop yield depends on its texture, and clay soil has a good texture that supports higher yield (Fig. 68.2d).
Fig. 68.3 Basic neural network structure used for soil type and moisture recognition
All four soil types from the Kaggle dataset are shown above. The images have different features, which helps the VGG-19 model classify them.

Before adopting VGG-19, we also trained a sequential neural network with convolutional layers for soil type and moisture recognition, because the input data is heterogeneous: image data plus a moisture sensor value that gives the moisture in the soil. This earlier methodology is shown in Fig. 68.3. To handle the heterogeneous image and sensor data, we flattened the three color channel values of the soil images into a single dimension, i.e., a 1 × 3 vector, and combined it with the soil moisture values to determine the soil type across the classes. We collected data at different soil moisture levels and performed the training but were not able to achieve even minimal accuracy. As the number of training parameters grew with each layer, the model became complex and time consuming at each hidden layer, whereas the basic idea was to minimize the training parameters so that the inference model would not need any tiny machine learning framework. The basic experiment used the following sequential neural network architecture, which consists of three main layers:
1. Input layer
2. Hidden layer
3. Output layer
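The flattened-input experiment described above can be illustrated with a small sketch. The chapter does not say how the three color channels were reduced to a 1 × 3 vector, so this example assumes one plausible reading: each channel is averaged over the image, and the moisture reading is appended as a fourth value. The function names, the sample pixels, and the moisture value are hypothetical.

```python
def mean_rgb(pixels):
    """Reduce an image, given as a list of (R, G, B) tuples, to a 1 x 3 vector
    of per-channel means. Averaging is an assumption, not the authors' stated method."""
    if not pixels:
        raise ValueError("image has no pixels")
    return [sum(p[c] for p in pixels) / len(pixels) for c in range(3)]


def build_feature_vector(pixels, moisture):
    """Concatenate the 1 x 3 color vector with one soil moisture reading."""
    return mean_rgb(pixels) + [float(moisture)]


# Hypothetical 2 x 2 image (four pixels) and sensor reading, for illustration only.
image = [(120, 80, 40), (130, 90, 50), (110, 70, 30), (140, 100, 60)]
features = build_feature_vector(image, moisture=35)
print(features)
```

Under this reading the network receives only four numbers per sample, so the spatial texture of the soil image is not available to it.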
Fig. 68.4 VGG 19 model used for training and testing
The input layer supplies the input samples to the network. As shown in Fig. 68.3, it is attached to several hidden layers. A hidden layer may be a convolutional layer, a max pooling layer, a SoftMax layer, etc. All computation is done in these hidden layers. The output layer gives the output predicted by the network.

The VGG-19 model is used to train on the soil images and classify them into the soil type categories. Fig. 68.4 shows the architecture of the model. VGG-19 is a pretrained model available in the TensorFlow library. It is a deep learning network that analyzes large datasets or multiple input types quickly.

Algorithm used for Proposed Work
1. Prepare the database with different soil types and moisture levels.
2. Use all the dataset images for training with the VGG-19 neural network model.
3. Use selected images to validate the model.
4. Preprocess the image and load the trained neural network model.
5. Perform validation on the proprietary dataset collected in the field.
6. If the model fails to recognize the soil type along with the moisture data, retrain it using back propagation with the new images included.
7. Convert the trained model into a lightweight model using a tiny ML framework such as TensorFlow Lite, and use transfer learning to test the model on the edge computing device.
8. The inference model predicts the soil type on the edge computing device and decides the action (soil type / soil moisture level).
9. Post-processing: the motor is switched ON or OFF depending on the soil moisture level and soil type. Alerts can also be sent to farmers according to the controller decision.
The steps defined above are performed to analyze the soil structure and the soil moisture level.
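The decision and post-processing steps of the algorithm (steps 8 and 9) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the chapter publishes no thresholds, so the per-soil moisture thresholds are placeholders, and averaging the sensor readings is an assumption. The names `MOISTURE_THRESHOLDS`, `average_moisture`, and `decide_pump` are hypothetical.

```python
# Hypothetical moisture thresholds (percent) below which irrigation starts.
# The chapter does not publish thresholds; these values are placeholders.
MOISTURE_THRESHOLDS = {
    "alluvial": 30.0,
    "black": 35.0,
    "red": 25.0,
    "clay": 40.0,
}


def average_moisture(readings):
    """Average the readings from the moisture sensors (assumed aggregation)."""
    if not readings:
        raise ValueError("no sensor readings")
    return sum(readings) / len(readings)


def decide_pump(soil_type, readings):
    """Return (pump_on, message) for a predicted soil type and sensor readings.

    An unrecognized soil type (for example None) keeps the pump off and
    reports that the image should be queued for retraining, as in step 6.
    """
    if soil_type not in MOISTURE_THRESHOLDS:
        return False, "soil type not recognized; image queued for retraining"
    level = average_moisture(readings)
    pump_on = level < MOISTURE_THRESHOLDS[soil_type]
    state = "ON" if pump_on else "OFF"
    return pump_on, f"{soil_type} soil, moisture {level:.1f}%: motor {state}"


# Eight readings per call, matching the eight sensors listed in Table 68.1.
print(decide_pump("red", [20, 22, 19, 24, 21, 23, 20, 19]))
print(decide_pump("clay", [45, 47, 44, 46, 48, 45, 46, 47]))
print(decide_pump(None, [30] * 8))
```

The returned message could feed the LCD display or a farmer alert; keeping the decision in one pure function makes it easy to test without the hardware attached.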
68.4 Result Analysis
The hardware of the proposed system is shown in Fig. 68.5, and the hardware components and sensors used in the experimental work are listed in Table 68.1.
Fig. 68.5 Hardware kit implemented for proposed method
Table 68.1 Details of hardware used for experimentation
Device                 Quantity   Purpose
Raspberry Pi           1          Microcontroller
LCD display            1          To display the messages
Camera                 1          To acquire the images at instance of time
Soil moisture sensor   8          To collect the moisture data from different locations of soil
Water pump             1          To uplift the water for irrigation
Fig. 68.6 Screenshot of conducted experiment which identifies the moisture level-3
The proposed system is tested on the edge computing device described above, with minimal hardware support. An imaging sensor captures images over a period of time, and each image is validated using the pretrained model loaded on the Raspberry Pi. The soil type and the amount of moisture in the soil are analyzed, and a decision is made whether to switch the water pump ON or OFF for water management. The experiment is conducted in the field with different soil structures and moisture levels, and data is collected in the field for training the model. A decent improvement is observed with the VGG-19 model compared with the traditional sequential model (Fig. 68.6).

The implementation combines soil type recognition with soil moisture content detection. Image processing is used to detect the soil type, and sensors are used to find the moisture level in the soil. The information is then shown on the LCD display. The messages displayed on the LCD are as follows:
1. ‘Initializing system!’
2. ‘Importing of model done’
3. ‘Model loaded’
4. ‘Image is captured’
5. ‘Display moisture level and water level’
6. ‘Detected soil type’
7. ‘Detected moisture level’
8. ‘Motor ON/OFF condition based on moisture level and water level’
68.5 Conclusion
In the proposed methodology, the soil type and soil moisture are detected using the pretrained VGG-19 as the backbone model, which is more accurate than the traditional sequential and CNN models. A tiny ML framework, TensorFlow Lite, is used to convert the trained model into a lightweight model that can run inference on an edge computing device for decision making. The Kaggle soil structure dataset is used to train the model, and testing is performed on a proprietary dataset to evaluate the proposed model. Experimentation is conducted on low-computing hardware using the TensorFlow Lite model with transfer learning, and the proposed method achieved considerable performance.
68.6 Future Scope
In the future, we can implement an Android app that detects the soil type and moisture and helps farmers receive alerts and monitor the status of their farms. We can also advise the farmer on the health of the soil by analyzing the nitrogen and other mineral content and the need for fertilizers, which helps avoid unnecessary fertilizer use. Beyond soil type, we can also detect soil porosity, the mineral particles contained in the soil, its chemical components, pH level, etc., to improve farm yield.

Acknowledgements This work was financially assisted by Science for Equity Empowerment and Development (SEED), a statutory body of Department of Science & Technology (DST), Government of India, under Grant No. SP/YO/2019/974(G). The financial support is gratefully acknowledged.
References
1. Yu, J., et al.: A deep learning approach for multi-depth soil water content prediction in summer maize growth period. IEEE Access 8, 199097–199110 (2020). https://doi.org/10.1109/ACCESS.2020.3034984
2. Yang, Y., et al.: A reliable soil moisture sensing methodology for agricultural irrigation. 2017 IEEE International Symposium on Parallel and Distributed Processing with Applications and 2017 IEEE International Conference on Ubiquitous Computing and Communications (ISPA/IUCC) (2017)
3. Wang, T., Liang, J., Liu, X.: Soil moisture retrieval algorithm based on TFA and CNN. IEEE Access 7, 597–604 (2019). https://doi.org/10.1109/ACCESS.2018.2885565
4. Sobayo, R., et al.: Integration of convolutional neural network and thermal images into soil moisture estimation. 2018 1st International Conference on Data Intelligence and Security (2018)
5. Xu, J., Luo, Y., Zhang, K., Yu, H.: Experimental study on convolutional neural network-based soil moisture feature recognition. 2019 3rd International Conference on Electronic Information Technology and Computer Engineering (EITCE), pp. 597–600 (2019). https://doi.org/10.1109/EITCE47263.2019.9095171
6. Chunmei, L., et al.: Research on the soil and water conservation division management mode in Erhai Lake basin. 2011 Fourth International Conference on Intelligent Computation Technology and Automation (2011)
7. Kulaglic, A., et al.: Spatiotemporal soil moisture estimation for agricultural drought risk management. 2013 Second International Conference on Agro-Geoinformatics (Agro-Geoinformatics)
8. Ganesh Babu, R., et al.: Soil test based smart agriculture management system. 7th International Conference on Smart Structures and Systems (ICSSS) (2020)
Chapter 69
The Assessment of Challenges and Sustainable Method of Improving the Quality of Water and Sanitation at Deurbal, Chhattisgarh
K. S. Prajwal, B. Shivanath Nikhil, Pakhala Rohit Reddy, G. Karthik, P. Sai Kiran, V. Vignesh, and A. S. Reshma

Abstract Over the years, people around the globe have experienced aggravated threats from unacceptable changes in the physical, chemical, and biological characteristics of soil, water, and air. Rapid growth in industrialization and population, careless use of chemical fertilizers and pesticides, undesirable sand mining, and leaching of soil have contributed to the wider spatial distribution of unhealthy, polluted water resources. This has led to a large-scale proliferation of waterborne diseases across the third-world population. This research work, conducted by Live-in-Labs®, aims to understand the water quality challenges in a small village named Deurbal in Chhattisgarh, India, and their potential impact on the health of Deurbal residents. Primary data for the research was collected using participatory methods such as Human Centered Design and Participatory Rural Appraisal. The challenge identification efforts showed that the community faces severe challenges associated with poor drinking water quality. The results obtained through these tools strongly indicated the need for an enhanced, low-cost, community-operated, IoT-based water purification system to overcome the drinking water quality problems in the village.
K. S. Prajwal · G. Karthik · P. Sai Kiran · V. Vignesh
Department of Electrical and Electronics Engineering, Amrita Vishwa Vidyapeetham, Bengaluru, India
B. Shivanath Nikhil
Department of Electronics and Communication Engineering, Amrita Vishwa Vidyapeetham, Bengaluru, India
P. R. Reddy
Department of Electronics and Instrumentation Engineering, Amrita Vishwa Vidyapeetham, Bengaluru, India
A. S. Reshma (B)
Amrita School for Sustainable Development, Amrita Vishwa Vidyapeetham, Amritapuri, Bengaluru, India
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_69
Keywords Live-in-Labs® · Water quality · Purification · Waterborne diseases · Human centered design · Participatory rural appraisal · Rural · Chhattisgarh · Deurbal
69.1 Introduction
Driven by the basic survival needs of the global population and by industrialized economic development, the human race is affected by spatial and temporal variability in the water quality of surface and groundwater sources. Anthropogenic factors such as heedless land use, socio-economic development, and unrestrained population growth, together with climatic factors, have severe adverse health effects. A World Bank study reported that 21% of communicable diseases in India are waterborne [1]. An Asian Journal study indicated that around 844 million people around the globe lack access to basic drinking water [2]. The availability of clean, safe, potable water is essential for preventing disease and improving the quality of life. In many parts of the country, the available water is becoming undrinkable because of excessive residues of chemical and biological contaminants, metal ions, and pathogens [3]. Consumption of contaminated water can cause immune suppression, reproductive failure, gastrointestinal disease, and kidney and skin problems, owing to toxins released from industrial waste [4]. The mortality rate due to waterborne diseases is higher in rural areas, where safe drinking water is unavailable. In this study, conducted as part of the Live-in-Labs® initiative, the student-faculty research team focused on understanding the water quality problems in a village named Deurbal in Chhattisgarh, India [5, 6]. Based on qualitative analysis of the data obtained from Human Centered Design (HCD) and Participatory Rural Appraisal (PRA), the need for an enhanced, low-cost, community-operated, IoT-based water purification system has been substantiated and is detailed in this paper. The following section provides an overview of the study area, covering the major occupations, traditional practices, climatic changes, and issues confronted by the Deurbal community.
The methods employed in the study are described in Sect. 69.3 (Methodology). The findings and the proposed solution are detailed in Sect. 69.4, which is followed by the conclusion and future aspects.
69.2 Study Area
Deurbal is a village in Makdi Tehsil of Kondagaon district in Chhattisgarh, India (Fig. 69.1). Kondagaon, the nearest town, is approximately 25 km away. The village has about 196 houses and a population of 950 people, most of whom belong to scheduled tribes (ST). The entire village is divided into 5 wards, namely Belgaon para, Bhima para, Khas para, Dongli para, and Chingdabeda. The main source of water is groundwater. The languages spoken by the community are Halbi and Hindi. In Deurbal, the male literacy rate is 57.51% and the female literacy rate is 35.59%. The main sources of income are agriculture and the sale of products, such as non-timber forest products (NTFP), that grow abundantly in the nearby forest. The crops primarily cultivated by the community are mahua and maize. Tractors, bullock carts, jeeps, and bikes are the main means of transport. The village lacks basic sanitation facilities to meet the requirements of its 196 households. The government has built public toilets in the village, but the villagers do not use them and continue to defecate in the open because they are unaware of the consequences for human health. There are 18 hand pumps across the village, used for drinking and domestic purposes. There are 3 ponds in the village, one of which has dried up due to climatic variations.

Fig. 69.1 Study area: Deurbal village located in Makdi Tehsil of Kondagaon district in Chhattisgarh, India
69.3 Methodology
An ethnographic approach was followed in the study of Deurbal village [7]. HCD [8] and PRA are the key constituents of this approach. These tools helped find a viable solution in concert with the local residents. In these approaches, village participants play a vital role in the design and development of a solution and also help the team understand their daily experiences in the village. These methods called for the study team to live with the local villagers to gain a first-hand appraisal of their problems, understand their basic requirements, and develop suitable solutions.
69.3.1 Participatory Rural Appraisal
The following PRA tools were used in this study to understand the challenges faced by the villagers:

(A) Resource Mapping: A resource map is a tool used to gain a clear understanding of the village's resources, such as water bodies, health facilities, schools, and the availability of basic needs such as food, water, and electricity, together with their physical distribution [9]. The study team was divided into two halves: one half visited 3 wards of the village (Bhima para, Khas para, and Belgaon para), and the other half visited the other 2 wards (Dongli para and Chingdabeda). The resources mapped belonged to the following sectors: water, health, education, income generation, and waste management. Field observations helped the team collect information about the major resources available to the community.

(B) Seasons and Activities: This tool helped the team identify the activities performed by the local people during the various seasons of the year [10]. The data showed that paddy farming is carried out primarily in the rainy season, from June to September or mid-October, when water is plentiful. During the summer months, beginning in March, villagers sell Sal seeds from the forest; for the rest of the year they sell Mahua, which is used to make alcohol and energy. These investigations revealed a serious issue with water quality during the rainy season, followed by the spread of waterborne diseases: cholera, viral fever, dysentery, dengue, and malaria.

(C) Brainstorming: Brainstorming is a session in which members of the study team sit together to generate ideas for solving the observed problems [11]. Villagers are also invited to participate in these discussions, which lead to identification of the main challenges. Brainstorming sessions are conducted before the interviews to generate new questions that help in understanding the problems needing attention; all of these are then documented in the team's field journals.

(D) Problem Tree Mapping: Problem tree mapping helped the team recognize the day-to-day problems confronted by the people in the village. People reported that the water was impure and of poor quality, which led to a variety of waterborne diseases such as cholera, viral fever, dysentery, dengue, and malaria. There is no hospital or primary health center in Deurbal village, so a person suffering from these diseases finds it arduous to receive medical attention and prescribed medication in a timely manner. People reported that, because the toilets constructed in the houses are not usable, they are compelled to practice open defecation. In addition, the poor drainage system in the village is the main cause of stagnant water, which forms a breeding ground for a wide variety of mosquitoes and pathogens. The only sources of drinking water in the village are hand pumps, which are not maintained properly.
69.3.2 Human Centered Design
This study applied the HCD research methodology, a participatory design approach centered on the user's needs. HCD was used to capture user experiences, which must be considered when making decisions on the design of a sustainable solution. This strategy was used to ensure that the collaboratively designed solution gains community acceptance and is then implemented by the community. The methods used in this research are explained below:

(A) Participant Observation: The research team stayed in the village for seven days and observed the villagers' activities. Using the AEIOU observation framework [12], these observations identified the Activities, Environment, Interactions, Objects, and Users in the study area. This method highlighted that the villagers depend entirely on agriculture. The major crops grown are paddy, maize, and mahua. Open water sources are used for irrigation. Dried lands were observed during the summer season, indicating a shortage of rainwater, which was confirmed during the interview sessions. During the rainy season, villagers fall sick from consuming contaminated water; in summer, the water turns pale yellow and reddish, which clearly indicates the presence of chemical contaminants.

(B) Interview: Semi-structured interviews were conducted with both men and women above the age of 15 to understand the challenges they faced. In-depth interviews with the villagers gave a clear idea of the major resources available to them and the socio-economic conditions of the village. Interviews were also held with Panchayat members and the health department to gather information about water quality, available water sources in the village, water treatment methods adopted by the villagers, and common waterborne diseases.

(C) Personas and Scenarios: The team developed personas based on the interviews conducted with the participants. A persona [13] is a character that describes a particular group of people who share the same problem or the same needs. Personas also help in the ideation process, supporting the analysis of the problems faced by the villagers and their requirements. The variation in the color of water during storage, the difficulty of collecting water from distant places, and the proliferation of disease during the rainy season were the major problems reported by the residents of Deurbal. Scenarios detailed the actual situation the villagers are facing. People reported that the water collected from hand pumps turns yellow during storage; with no treatment facilities, they are forced to consume contaminated water.
69.4 Results and Discussion
This section is divided into two parts. The first part (Sect. 69.4.1) deals with the data collected through the PRA and HCD methodologies, and the second part (Sect. 69.4.2) covers the technological solution designed from those findings.
69.4.1 Study Outcomes
Close review and analysis of the data gathered through the PRA tools narrowed down the area of focus. These tools provided clear qualitative and quantitative insights into the information gathered. Deurbal village faces major health risks from untreated water and from pathogens in the groundwater table. The data collected by the team from different families helped estimate the amount of water a house requires per day for household activities such as washing utensils and clothes, cleaning, drinking, and bathing. The amount of water required by the entire population of the village is approximately 300 L/day. Vessels and pots of different sizes are used to carry water from the hand pumps. The study team's analysis determined that water collected from some of the hand pumps turned yellow after a few hours of storage, which could be due to a high iron content in the groundwater [14]. The increased incidence of skin allergies and cholera during the rainy season suggests contamination of the water by microorganisms such as coliforms and parasites [15].
69.4.2 Proposed Solution
After studying all the data collected in the village, the student research team brainstormed with the faculty and relevant experts on the most feasible way to solve the problem of water and sanitation in the village. The details of the device and the solution are explained below.

According to WHO and Indian drinking water standards, the concentrations of physical, chemical, and biological components have to be within the standard limits [15]. The Jivamritam filter unit, co-designed by the research team and the villagers at Deurbal for drinking water purification, is a low-cost filtration system capable of removing physical, chemical, and bacterial contaminants (Fig. 69.2). The primary goal of the Jivamritam initiative is to supply everyone with clean and safe drinking water. Mr. Ram Nath Kovind, India's President, launched the initiative on October 7th, 2017, and it has since deployed suitable filtration units in 250 areas based on the contaminants present in the drinking water [16].

The modular system consists of a dual-media filter filled with 4 types of sand, two micron filters with different pore sizes, and a UV sterilizer. The initial analysis made it evident that the water consumed by the villagers is contaminated by chemical and biological pollutants, and the Jivamritam filter is designed to remove both. The dual-media filter is capable of removing odor, turbidity, TDS, and metals such as iron [17]. The sand filter followed by the micron filters helps remove microbial contaminants as well as the residual micro-pollutants present in the water. The proposed design therefore uses two micron filters of different sizes, 5 microns followed by 1 micron, to remove the contaminants efficiently. According to the Indian standards, coliforms and Escherichia coli should be nil in 100 mL of the water sample [18]. To remove the bacteria present in the water, a UV sterilizer is built in after micron filtration. It inactivates the microorganisms that may remain after micron filtration, so that pure, safe drinking water is ensured for all consumers after the filtration process.

Along with purification, the system is automated to detect the water level, which makes it touchless. Especially in the COVID-19 pandemic situation, it is dangerous to use a hand pump shared by the whole community. The proposed design senses the water level in the container and, based on that level, switches on and off automatically according to user demand (Fig. 69.3). The figure shows the sensor design, which includes a motor that pumps the groundwater.
The villagers access the water from the tap of the tank by remote control through a smartphone. The water sensor detects the water flowing out of the tank and sends an analog signal to the microcontroller unit which
Fig. 69.2 Jivamritam filter unit deployed in the community
K. S. Prajwal et al.
Fig. 69.3 Schematic representation of Jivamritam filter unit
further processes the signal and converts it to a digital signal. This digital signal drives the LED, which glows green when the tap is open and turns off when the tap is closed. The water tank gateway was configured in wireless mode with the IP address 192.168.25.1 and a subnet mask of 255.255.255.0. It uses channel 6 of the 2.4 GHz frequency band. The wireless local area network is secured with Wi-Fi Protected Access II (WPA2), which uses the Advanced Encryption Standard (AES), a strong encryption protocol. Apart from purifying the water, the system also helps the community in other ways. Since it is deployed in a common location, a group of people from the community itself can be given responsibility for delivering water to the villagers and maintaining the system. Instead of spending money to buy water, the villagers can contribute a small amount that is used for system maintenance and the salaries of the people responsible. A similar micro water distribution system that ensured water accessibility for all people was built by researchers at Amrita Vishwa Vidyapeetham and implemented in other study areas [19].
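The gateway addressing described above can be checked with a little bitwise arithmetic. The following is a minimal sketch, not part of the deployed system: it derives the network address and the number of usable host addresses from the IP address and subnet mask quoted in the text. The names `Ipv4`, `networkAddress`, and `usableHosts` are hypothetical helpers introduced only for this illustration, and the mask is assumed to be a valid contiguous one.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper type: four octets of a dotted-quad IPv4 address.
using Ipv4 = std::array<std::uint8_t, 4>;

// The network address is the bitwise AND of the address and the mask.
Ipv4 networkAddress(const Ipv4& ip, const Ipv4& mask) {
    Ipv4 net{};
    for (std::size_t i = 0; i < net.size(); ++i) {
        net[i] = static_cast<std::uint8_t>(ip[i] & mask[i]);
    }
    return net;
}

// Usable hosts = 2^(host bits) - 2 (network and broadcast addresses excluded).
// Assumes a contiguous mask that leaves between 2 and 30 host bits.
int usableHosts(const Ipv4& mask) {
    int hostBits = 0;
    for (std::uint8_t octet : mask) {
        for (int bit = 0; bit < 8; ++bit) {
            if (((octet >> bit) & 1u) == 0) ++hostBits;
        }
    }
    assert(hostBits >= 2 && hostBits <= 30);
    return (1 << hostBits) - 2;
}
```

For the gateway in the text, `networkAddress({192, 168, 25, 1}, {255, 255, 255, 0})` gives 192.168.25.0, and `usableHosts({255, 255, 255, 0})` gives 254 addresses available for client devices on that network.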
69.5 Conclusion
This study helped the research team to comprehend the major challenges faced by the villagers at Deurbal: poor water quality and the associated waterborne diseases. The participatory approach used in this study made it easier to understand and tackle the community’s challenges. The variation in the color of water during storage and the spread of waterborne diseases testify to the presence of chemical and biological contaminants in the village’s water sources. To overcome this situation, a low-cost water purification system is much needed. Conventional methods followed by the villagers, such as boiling or filtering through cloth, cannot remove the micro-pollutants present in the water. The multilevel filtration system developed by the team helped remove chemical and microbial pollutants to provide safe drinking water to the community. The touchless automated system also helps to avoid the spread of other diseases. Since ownership and maintenance are undertaken by the community members themselves, the system also provides earning opportunities to the community members and reduces the cost of installing a clean drinking water purification system. The proposed solution helps to achieve the United Nations Sustainable Development Goals: (SDG 3) Good Health and Well-being and (SDG 6) Clean Water and Sanitation. As the proposed remedy is a modular design, the water purification system could also be replicated in other communities with similar water quality issues.
Acknowledgements This research was funded by Amrita Vishwa Vidyapeetham’s Live-in-Labs® program and the field implementation was funded by the UN-recognized NGO, MA Math. The authors express their immense gratitude to Sri Mata Amritanandamayi Devi, Chancellor of Amrita Vishwa Vidyapeetham, who has inspired them in performing selfless service to society.
The authors also thank the faculty members, staff, and students of Amrita Vishwa Vidyapeetham, the government officials of the district of Kondegaon, Deurbal, Chhattisgarh and staff of the Amrita Self Reliant Villages (Amrita SeRVe) program.
References
1. Karthick, B., Boominathan, M., Ali, S., Ramachandra, T.V.: Evaluation of the quality of drinking water in Kerala state, India. Asian J. Water Environ. Pollut. 7(4), 39–48 (2010)
2. Edition, F.: Guidelines for drinking-water quality. WHO Chronicle 38(4), 104–108 (2011)
3. Prasad, G., Reshma, A.S., Ramesh, M.V.: Assessment of drinking water quality on public health at Alappuzha district, southern Kerala, India. Mater. Today: Proc. 46, 3030–3036 (2021)
4. Ho, Y.C., Show, K.Y., Guo, X.X., Norli, I., Alkarkhi Abbas, F.M., Morad, N.: Industrial discharge and their effect to the environment. In: Industrial Waste, Intech, pp. 1–39 (2012)
5. Ramesh, M.V., Mohan, R., Menon, S.: Live-in-labs: rapid translational research and implementation-based program for rural development in India. In: 2016 IEEE Global Humanitarian Technology Conference (GHTC), pp. 164–171. IEEE (2016)
6. Varma, D.S., Nandanan, K., Vishakh Raja, P.C., Soundharajan, B., Pérez, M.L., Sidharth, K.A., Ramesh, M.V.: Participatory design approach to address water crisis in the village of Karkatta, Jharkhand, India. Technol. Forecast. Soc. Change 172, 121002 (2021)
7. Kadiveti, H., Eleshwaram, S., Mohan, R., Ariprasath, S., Nandanan, K., Divya Sharma, S.G., Siddharth, B.: Water management through integrated technologies, a sustainable approach for village Pandori, India. In: 2019 IEEE R10 Humanitarian Technology Conference (R10-HTC) (47129), pp. 180–185. IEEE (2019)
8. Chambers, R.: The origins and practice of participatory rural appraisal. World Dev. 22(7), 953–969 (1994)
9. Abhinaya, P.B., Adarsh, T., Vanga, P., Sivanesh, S., Vishnuvardhan, Y., Radhika, N., Reshma, A.S.: Case study on water management through sustainable smart irrigation. In: IOT with Smart Systems, pp. 569–578. Springer, Singapore (2022)
10. Chandra, G.: Participatory rural appraisal. In: Katiha, P.K., Vaas, K.K., Sharma, A.P., Bhaumik, U., Chandra, G. (eds) Issues and Tools for Social Science Research in Inland Fisheries. Central Inland Fisheries Research Institute, Barrackpore, Kolkata, India. Bulletin 163, 286–302 (2010)
11. Mohan, H.T., Nandanan, K., Mohan, R., Sadipe, O., Williams, I., Potocnik, T.: Case study on co-design methodology for improved cook stove solutions for rural community in India. In: 2019 IEEE R10 Humanitarian Technology Conference (R10-HTC) (47129), pp. 153–158. IEEE (2019)
12. Lee, M.J., Wang, Y., Been-Lirn Duh, H.: AR UX design: applying AEIOU to handheld augmented reality browser. In: 2012 IEEE International Symposium on Mixed and Augmented Reality-Arts, Media, and Humanities (ISMAR-AMH), pp. 99–100. IEEE (2012)
13. Shapiro, J.: Effect of yellow organic acids on iron and other metals in water. J. Am. Water Works Ass. 56(8), 1062–1082 (1964)
14. Nwabor, O.F., Nnamonu, E.I., Martins, P.E., Christiana, A.O.: Water and waterborne diseases: a review. Int. J. Trop. Dis. Health 1–14 (2015)
15. WHO, Geneva: Guidelines for drinking-water quality. World Health Organization 216, 303–304 (2011)
16. Ajith, V., Reshma, A.S., Mohan, R., Ramesh, M.V.: Empowering community in addressing drinking water challenges using participatory sustainable technological systems business models. Technol. Forecast. Soc. Change (In Press)
17. Ahammed, M.M., Meera, V.: Metal oxide/hydroxide-coated dual-media filter for simultaneous removal of bacteria and heavy metals from natural waters. J. Hazard. Mater. 181(1–3), 788–793 (2010)
18. Chan, C.L., Zalifah, M.K., Norrakiah, A.S.: Microbiological and physicochemical quality of drinking water. Malaysian J. Anal. Sci. 11(2), 414–420 (2007)
19. Ramesh, M.V., Mohan, R., Nitin Kumar, M., Brahmanandan, D., Prakash, C., Lalith, P., Ananth Kumar, M., Ramkrishnan, R.: Micro water distribution networks: a participatory method of sustainable water distribution in rural communities. In: 2016 IEEE Global Humanitarian Technology Conference (GHTC), pp. 797–804. IEEE (2016)
Chapter 70
Design of Social Distance Monitoring Approach Using Wearable Smart Tags in 5G IoT Environment During Pandemic Conditions
Fernando Molina-Granja, Raúl Lozada-Yánez, Fabricio Javier Santacruz-Sulca, Milton Paul López Ramos, G. D. Vignesh, and J. N. Swaminathan
Abstract Coronavirus disease 2019 (COVID-19) is an ongoing global pandemic that has infected several million people and is occasionally fatal. COVID-19 spreads from one person to another, and its spread can be limited by maintaining social distance. This paper proposes Internet of Things (IoT)-based smart wearable tags for social distance monitoring to prevent COVID-19. The advantage of the proposed system is that it alerts the person wearing the tag and also logs the data in the IoT channel for further analysis. The proposed system responds adequately, so that social distancing can be maintained and the pandemic can be prevented from spreading further. The analysis was carried out using an experimental setup, and ThingSpeak is used as the IoT channel for data analytics. The results show the effectiveness of the developed system in preventing pandemic spread through social distance monitoring.
Keywords COVID-19 · Internet of Things (IoT) · Smart wearable · Pandemic prevention · Social distance
F. Molina-Granja · M. P. L. Ramos Facultad de Ingeniería, Universidad Nacional de Chimborazo (UNACH), Riobamba, Ecuador R. Lozada-Yánez · F. J. Santacruz-Sulca Escuela Superior Politécnica de Chimborazo (ESPOCH), Faculty of Informatics and Electronics, Riobamba, Ecuador G. D. Vignesh (B) St. Joseph’s College of Engineering, OMR, Chennai, India e-mail: [email protected] J. N. Swaminathan QIS College of Engineering and Technology, Ongole, Andhra Pradesh 523272, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_70
70.1 Introduction
Novel Coronavirus Disease 2019 (COVID-19) is one of the gravest problems to have disrupted the world, leading to severe health issues and economic crises [1]. COVID-19 became a global pandemic that spreads from one infected person to another, causing severe health issues that are occasionally fatal [2]. At the present time, no successful immunization or specific drug for the treatment of COVID-19 has been found. Even so, candidate vaccines and some specific medications have been administered in the majority of countries [3]. The COVID-19 Omicron variant has infected many persons who have been vaccinated [4]. Hence, no available vaccination fully prevents the spread of COVID-19 [5], and limiting the spread depends on maintaining proper social distance and a hygienic lifestyle [6]. So far, social distancing has proven to be a viable practice for limiting the spread of COVID-19 [7]. Accordingly, it has prompted researchers and engineers to look for technological solutions to combat the spread of the infection. Further, it is strongly suggested by the WHO that a minimum distance of 2 m should be maintained to reduce the spread of COVID-19 [8–10].
Several approaches to monitoring and maintaining social distance have been discussed. In [11], CCTV cameras at public places or drone-based surveillance are used to monitor in real time whether people maintain social distance. Several mobile applications and IoT devices have also been developed recently to counter the spread of COVID-19. Ref. [12] discusses a compact, low-cost wearable electronic device prototype that uses the received signal strength of the Wi-Fi signals emitted by other wearables of the same type, estimates the proximity between the users, and issues a warning when the distance between them falls below a predefined threshold value. Ref. [13] proposes a wearable that includes a magnetic field-based proximity sensing system to monitor and maintain social distance between individuals. The authors also explore devices capable of detecting people approaching within a range of 1.5–2.0 m; the system was tested in a laboratory environment and becomes complex in real-time analysis. Ref. [14] proposes an IoT-based automated tracking system that detects and monitors social distancing using RFID tags. The system tracks persons with COVID-19 using a mobile application integrated with the RFID tag. When the RFID tag passes a mobile phone, the event is recorded, and the details collected by the phone are passed to the edge device for further processing. The system is capable of social distance monitoring, but it depends on its power sources. Ref. [15] proposes an IoT-based social distance monitoring system that uses mobile phone applications and a wearable device. The system comprises a set of contact-tracing applications that handle data collection and conversion. It records the social distance, warns the user when a contact is closer than 1 m, and synchronizes further information through the user's mobile phone.
IoT with wearable devices can play an important role in day-to-day human activity by simplifying complex tasks, and it also plays a major role in preventing COVID-19. This paper proposes Internet of Things (IoT)-based smart wearable tags for social distance monitoring to prevent COVID-19. The advantage of the proposed system is that it alerts the person wearing the tag and also logs the data in the IoT channel for further analysis. The proposed system responds adequately, so that social distancing can be maintained and the pandemic can be prevented from spreading further. The analysis was carried out using an experimental setup, and ThingSpeak is used as the IoT channel for data analytics. The results show the effectiveness of the developed system in preventing pandemic spread through social distance monitoring.
70.2 Social Distance Approach in Preventing COVID-19
This section examines the significance of social distancing and of a social distance monitoring system. Social distancing also includes restrictions such as curfews, limits on public gatherings, and travel restrictions to control the spread of COVID-19. It is otherwise known as a non-pharmaceutical infection-prevention measure, implemented to prevent the spread of COVID-19 by restricting contact with infected persons. Figure 70.1 depicts the significance of social distancing in avoiding COVID-19 transmission. It is recommended to stay 2 m from other persons to prevent virus transmission. Besides, the strategies for social distancing include cancelling large gatherings, such as sports events and concerts, as well as closing schools, bars, and
Fig. 70.1 Significance of social distancing
restaurants, and having individuals telecommute rather than work from an office. It also means limiting any interaction with anyone beyond the close family with whom you live. Any time you go out and are around others, you dramatically increase your exposure and the chance of transmitting COVID-19. Keep in mind that separating ourselves from others protects everyone, especially the more vulnerable members of society.
70.3 Materials and Methods
The sensors, modules, and methods used in the assessment are discussed in this section. The developed approach uses an Arduino Pro Mini module connected to the IoT through the ESP8266 Wi-Fi module.
70.3.1 Sensors and Modules
The sensors and modules considered in the study are listed in Table 70.1. An HC-SR04 ultrasonic distance sensor is used for distance measurement, and a NeoPixel Ring 12 is used for light-based warning. A piezo sound indicator is used for audible warning. Further, a 2000 mAh, 3.7 V battery supplies power to the developed model.
Table 70.1 Components considered in the study
Sl. No.   Components                   Remarks
1         Arduino Pro Mini             3.3 V powered by ATmega328
2         ESP8266                      Wi-Fi module
3         Ultrasonic distance sensor   Sensors for distance measurement
4         NeoPixel ring 12             Light indication purpose
5         Piezo                        Sound indication purpose
6         Battery                      2000 mAh, 3.7 V
Fig. 70.2 Social distance monitoring IoT domain model
70.3.2 IoT Domain Model for Social Distance Monitoring
In the developed IoT domain model, three layers interface with one another, as shown in Fig. 70.2. The first layer is the physical layer, where the sensor-based wearable tag houses the main board, the Wi-Fi module, the HC-SR04, the NeoPixel Ring 12, the piezo, and the battery. The second layer is the edge layer, which reads the data from the physical layer. The third is the dispatch or application layer, where the data that has been read is written using the API keys. In this study, ThingSpeak is used for IoT analytics. The ThingSpeak web analytics reads the data at 5 min intervals. It reads the required data of the predetermined region through the API and stores it in a MySQL database at the server for use in later computation. In this context, the IoT Edge is a service that builds on top of the IoT Hub and enables users to carry out edge computing. Edge computing means that data is analyzed on devices, that is, at the edge of the network, rather than in the cloud itself. With edge computing, you can avoid transferring raw data by doing data cleaning, aggregation, and analysis on the device itself and then sending only the insights gained to the cloud. This reduces bandwidth costs, shortens response times, and decreases traffic.
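The edge-processing idea at the end of this section (aggregate on the device, send only the insights) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the `WindowSummary` structure and the `summarize` function are hypothetical names, and the idea of reducing one reporting window of distance readings to three numbers is an assumption made for the example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical summary of one reporting window of distance readings.
struct WindowSummary {
    int samples;     // number of readings taken in the window
    int minCm;       // closest approach observed, in centimetres
    int violations;  // readings closer than the threshold
};

// Reduce the raw readings of a window to a compact summary, so that only
// three numbers (instead of every sample) need to be sent to the cloud.
WindowSummary summarize(const std::vector<int>& readingsCm, int thresholdCm) {
    assert(!readingsCm.empty());
    WindowSummary s{static_cast<int>(readingsCm.size()), readingsCm.front(), 0};
    for (int d : readingsCm) {
        if (d < s.minCm) s.minCm = d;
        if (d < thresholdCm) ++s.violations;
    }
    return s;
}
```

With the readings 300, 180, 140, 95, and 210 cm and an assumed threshold of 150 cm, the summary holds 5 samples, a closest approach of 95 cm, and 2 violations, which is the kind of reduced payload that keeps bandwidth use low.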
70.4 Proposed IoT-Based Smart Wearable Tags
The functional diagram of the developed IoT-based smart wearable tag is shown in Fig. 70.3. The sensors and modules act in the physical layer, the Wi-Fi module communicates with the edge layer, and data analytics is carried out in the application layer. In the proposed model, the user wearing the tag can move about in public with an essential protective kit. The HC-SR04 senses any contact when the user moves near another person. The ESP8266 must be configured with an active internet connection, through which it writes the data that has been read to the ThingSpeak IoT channel. All these components are mounted in a fabricated tag. When the tag is worn and the user comes
Fig. 70.3 Function of proposed IoT-based smart wearable tags for social distance monitoring
Fig. 70.4 Experiment setup
within 1.5 m of another user, the developed system alerts the user and signals them to maintain social distance. Furthermore, the data is read and stored in the ThingSpeak IoT data analytics tool for monitoring and control. The experimental setup of the developed system is shown in Fig. 70.4. In the study, an HC-SR04 ultrasonic distance sensor is used for distance measurement and a NeoPixel Ring 12 is used for light-based warning. A piezo sound indicator is used for audible warning. Further, a 2000 mAh, 3.7 V battery supplies power to the developed model. The connections are made as per the circuit shown in Fig. 70.3. The circuit can be embedded in a tiny plastic tag and starts to function when powered. The program is uploaded to the Pro Mini board with the help of an Arduino Uno. The algorithm used for the analysis is as follows.
Algorithm

// Cloud upload through the ESP8266 (steps as given in the original listing):
//   Configure the simulator Wi-Fi with its password
//   host = "api.thingspeak.com"
//   String uri = "/update?api_key=AOYEXXXXYYYY5555";
//   GET https://api.thingspeak.com/update?api_key=AOYEXXXXYYYY5555&field1=0

#include <Adafruit_NeoPixel.h>

int sensorPin = 0;      // declared in the original listing; not used below
int ledPin = 3;         // data pin of the NeoPixel ring
int ledNo = 12;         // number of LEDs on the ring
Adafruit_NeoPixel strip = Adafruit_NeoPixel(ledNo, ledPin, NEO_RGB + NEO_KHZ800);
int buzzerPin = 2;
int echoPin = 6;
int trigPin = 5;
int minDistance = 100;  // cm; at or below this all LEDs glow and the buzzer sounds
int maxDistance = 300;  // cm; at or above this a single LED glows

void setup()
{
  pinMode(buzzerPin, OUTPUT);
  pinMode(trigPin, OUTPUT);
  pinMode(echoPin, INPUT);
  Serial.begin(9600);
  strip.begin();
  // Start with every LED switched off
  for (int i = 0; i < ledNo; i++)
  {
    strip.setPixelColor(i, strip.Color(0, 0, 0));
  }
  strip.show();
}

void loop()
{
  int distance = calcDistance();
  //Serial.println(distance);
  // Closer objects light more LEDs: minDistance -> 12 LEDs, maxDistance -> 1 LED
  int ledsToGlow = map(distance, minDistance, maxDistance, ledNo, 1);
  //Serial.println(ledsToGlow);
  if (ledsToGlow == 12)   // comparison; the original listing used "=" here
  {
    digitalWrite(buzzerPin, HIGH);
  }
  else
  {
    digitalWrite(buzzerPin, LOW);
  }
  for (int i = 0; i < ledsToGlow; i++)
  {
    if (i < 4)
    {
      strip.setPixelColor(i, strip.Color(50, 0, 0)); //green, red, blue
    }
    else if (i >= 4 && i < 8)
    {
      strip.setPixelColor(i, strip.Color(50, 50, 0)); //green, red, blue
    }
    else if (i >= 8 && i < 12)
    {
      strip.setPixelColor(i, strip.Color(0, 50, 0)); //green, red, blue
    }
  }
  // Switch off the remaining LEDs
  for (int i = ledsToGlow; i < ledNo; i++)
  {
    strip.setPixelColor(i, strip.Color(0, 0, 0));
  }
  strip.show();
  delay(50);
}

int calcDistance()
{
  long distance, duration;
  // Send a 10 microsecond trigger pulse to the HC-SR04
  digitalWrite(trigPin, LOW);
  delayMicroseconds(2);
  digitalWrite(trigPin, HIGH);
  delayMicroseconds(10);
  digitalWrite(trigPin, LOW);
  // Echo pulse width in microseconds, converted to centimetres
  duration = pulseIn(echoPin, HIGH);
  distance = duration / 29 / 2;
  // Clamp the reading to the [minDistance, maxDistance] range
  if (distance >= maxDistance)
  {
    distance = maxDistance;
  }
  if (distance <= minDistance)
  {
    distance = minDistance;
  }
  return distance;
}
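The distance-to-LED logic of the listing can be checked off the board. The sketch below is a minimal host-side illustration, not the authors' firmware: `rescale` is a hypothetical stand-in that is assumed to reproduce the integer formula documented for Arduino's `map()`, and `echoToCm`, `ledsToGlow`, and `buzzerOn` are helper names introduced here. The constants are the 100 cm, 300 cm, and 12-LED values from the listing.

```cpp
#include <cassert>

const long kMinDistanceCm = 100;  // from the listing: minDistance
const long kMaxDistanceCm = 300;  // from the listing: maxDistance
const long kLedCount = 12;        // from the listing: ledNo

// Stand-in for Arduino's map(): linear rescaling in integer arithmetic,
// so fractional results are truncated toward zero.
long rescale(long x, long inMin, long inMax, long outMin, long outMax) {
    assert(inMax != inMin);
    return (x - inMin) * (outMax - outMin) / (inMax - inMin) + outMin;
}

// Echo pulse width in microseconds -> distance in cm, clamped as in
// calcDistance() (about 29 us per cm, halved for the round trip).
long echoToCm(long echoMicros) {
    long d = echoMicros / 29 / 2;
    if (d > kMaxDistanceCm) d = kMaxDistanceCm;
    if (d < kMinDistanceCm) d = kMinDistanceCm;
    return d;
}

// Number of LEDs lit for a clamped distance: nearer means more LEDs.
long ledsToGlow(long distanceCm) {
    return rescale(distanceCm, kMinDistanceCm, kMaxDistanceCm, kLedCount, 1);
}

// The buzzer sounds only when the whole ring is lit.
bool buzzerOn(long distanceCm) {
    return ledsToGlow(distanceCm) == kLedCount;
}
```

Under these assumptions an echo of 5800 microseconds corresponds to 100 cm, a reading of 200 cm lights 7 of the 12 LEDs, and the buzzer sounds only at the 100 cm clamp, which matches the text's description of the buzzer turning on at about 1 m.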
70.5 Proposed IoT-Based Smart Wearable Tags
An extensive analysis was carried out to verify the function of the proposed IoT-based smart wearable tag for social distance monitoring. The results obtained are depicted in Fig. 70.5. Figure 70.5a shows the normal distance of 2 m maintained between the subjects. When the subject moves closer to another person, with a gap of 1.8 m, the LED shows the range without any warning, as shown in Fig. 70.5b. When the gap to another person falls below 1.5 m, the LED shows the range with a warning, as shown in Fig. 70.5c. When the gap falls below 1 m, the LED shows the range with a warning and the piezo buzzer turns on, as shown in Fig. 70.5c. The developed module can thus help to maintain social distance and prevent the spread of COVID-19. Meanwhile, the data read by the module is transferred to the IoT domain using the ESP8266 Wi-Fi kit and can be visualized in the ThingSpeak IoT channel. The channel name and user name have been removed for privacy. Figure 70.6a shows the ThingSpeak IoT channel to which the data read by the developed model is written with the help of the write API key. Figure 70.6b shows the field 1 data of the channel, that is, the distance maintained, read at 5 min intervals. Further analysis is possible using the MATLAB integration in ThingSpeak to
Fig. 70.5 Obtained result for social distancing monitoring system
Fig. 70.6 ThingSpeak data analytics
explore more advanced options and analyses. The results show the effectiveness of the developed system in preventing pandemic spread through social distance monitoring.
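The channel update described above amounts to an HTTP GET on the `/update` path with the write API key and the `field1` value, as in the request line of the algorithm listing. The sketch below is a minimal illustration of how such a request path and the 5 min reporting interval might be handled. `buildUpdateUri` and `dueForUpload` are hypothetical helper names, the key shown in the usage note is the masked placeholder from the listing, and sending distance in centimetres as `field1` is an assumption.

```cpp
#include <cassert>
#include <string>

// Build the request path for a channel update: the write API key
// authorises the write, and field1 carries the measured distance.
std::string buildUpdateUri(const std::string& writeApiKey, int distanceCm) {
    assert(!writeApiKey.empty());
    return "/update?api_key=" + writeApiKey +
           "&field1=" + std::to_string(distanceCm);
}

// True when at least intervalMs milliseconds have passed since the last
// upload. Unsigned subtraction keeps the check valid across counter wrap.
bool dueForUpload(unsigned long nowMs, unsigned long lastMs,
                  unsigned long intervalMs) {
    return nowMs - lastMs >= intervalMs;
}
```

For example, `buildUpdateUri("AOYEXXXXYYYY5555", 120)` produces `/update?api_key=AOYEXXXXYYYY5555&field1=120`, and `dueForUpload(now, last, 300000UL)` gates the upload to one request per 5 min interval.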
70.6 Conclusion
Internet of Things (IoT)-based smart wearable tags for social distance monitoring to prevent COVID-19 have been developed in this study. The advantage of the developed
system is that it alerts the person wearing the tag-type smart wearable device and also logs the data in the IoT channel for further analysis. The developed system responds adequately, so that social distancing is maintained and the pandemic can be prevented from spreading further. The following observations were made in the analysis: the device responds with adequate accuracy, the data is stored in the ThingSpeak channel, and the device is comparatively cheap and easy to handle. The results show the effectiveness of the developed system in preventing pandemic spread through social distance monitoring.
References
1. Kumaran, M., Geetha, R., Antony, J., Vasagam, K.K., Anand, P.R., Ravisankar, T., Angel, J.R.J., De, D., Muralidhar, M., Patil, P.K., Vijayan, K.K.: Prospective impact of Corona virus disease (COVID-19) related lockdown on shrimp aquaculture sector in India–a sectoral assessment. Aquaculture 531, 735922 (2021)
2. Dubey, S., Biswas, P., Ghosh, R., Chatterjee, S., Dubey, M.J., Chatterjee, S., Lahiri, D., Lavie, C.J.: Psychosocial impact of COVID-19. Diab. Metab. Syndr. 14(5), 779–788 (2020)
3. Kumar, R., Al-Turjman, F., Anand, L., Kumar, A., Magesh, S., Vengatesan, K., Sitharthan, R., Rajesh, M.: Genomic sequence analysis of lung infections using artificial intelligence technique. Interdisc. Sci.: Comput. Life Sci. 13(2), 192–200 (2021)
4. Sitharthan, R., Rajesh, M., Madurakavi, K., Raglend, J., Kumar, R.: Assessing nitrogen dioxide (NO2) impact on health pre-and post-COVID-19 pandemic using IoT in India. Int. J. Pervasive Comput. Commun. (2020). https://doi.org/10.1108/IJPCC-08-2020-0115
5. Rajesh, M., Sitharthan, R.: Image fusion and enhancement based on energy of the pixel using deep convolutional neural network. Multimedia Tools Appl. 1–13 (2021)
6. Sitharthan, R., Rajesh, M.: Application of machine learning (ML) and internet of things (IoT) in healthcare to predict and tackle pandemic situation. Distrib. Parallel Databases 1–19 (2021)
7. Good, M.F., Hawkes, M.T.: The interaction of natural and vaccine-induced immunity with social distancing predicts the evolution of the COVID-19 pandemic. MBio 11(5), e02617-e2620 (2020)
8. Sheffi, Y.: The New (Ab) Normal: Reshaping Business and Supply Chain Strategy Beyond Covid-19. MIT CTL Media (2020)
9. Gammel, I., Wang, J. (eds.): Creative Resilience and COVID-19: Figuring the Everyday in a Pandemic. Routledge (2022)
10. Hashmi, A., Nayak, V., Singh, K.R., Jain, B., Baid, M., Alexis, F., Singh, A.K.: Potentialities of graphene and its allied derivatives to combat against SARS-CoV-2 infection. Mater. Today Adv. 100208 (2022)
11. Al-Sa’d, M., Kiranyaz, S., Ahmad, I., Sundell, C., Vakkuri, M., Gabbouj, M.: A social distance estimation and crowd monitoring system for surveillance cameras. Sensors 22(2), 418 (2022)
12. Cleofas, J.V., Rocha, I.C.N.: Demographic, gadget and internet profiles as determinants of disease and consequence related COVID-19 anxiety among Filipino college students. Educ. Inf. Technol. 26(6), 6771–6786 (2021)
13. Fatima, A., Ansari, S.K., Waqar, H., Jameel, A.: Determine augmented risk association between health problems and screen exposure to electronic gadgets in university students, Islamabad during COVID-19 pandemic. Innovation 2(2), 22–26 (2021)
14. Carreras Guzman, N.H., Mezovari, A.G.: Design of IoT-based cyber-physical systems: a driverless bulldozer prototype. Information 10(11), 343 (2019)
15. Alhmiedat, T., Aborokbah, M.: Social distance monitoring approach using wearable smart tags. Electronics 10(19), 2435 (2021)
Chapter 71
Smart Congestion Control and Path Scheduling in MPTCP Neha Rupesh Thakur and Ashwini S. Kunte
Abstract With the recent rise of mobile technology, new devices with a variety of connection ports have become more popular. Multiple communication interfaces can now be used over a single connection thanks to the multi-path transmission control protocol (MPTCP), which was developed to speed up Internet use. There are three main design aims for MPTCP congestion control algorithms: better performance, more fairness, and congestion balancing. The MPTCP congestion control algorithms now in use cannot achieve all of these design goals. Because it fails to make full use of the network, an MPTCP congestion control algorithm such as OLIA often results in poor performance. With the current Internet’s enormous volume of transient traffic, it is difficult to keep track of MPTCP congestion management techniques. MPTCP congestion control methods may benefit from being aware of current network delay conditions. The sub flows of an MPTCP connection are heterogeneous, and schedulers are employed to deal with this heterogeneity. The scheduler is therefore an important part of MPTCP. In this study, MPTCP congestion control and MPTCP schedulers are discussed.
Keywords Congestion control methods · MPTCP scheduler · LIA · OLIA · BALIA · TCP Reno · CUBIC TCP
N. R. Thakur (B) · A. S. Kunte EXTC, TSEC, Mumbai, Maharashtra, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_71
71.1 Introduction
The transport layer protocol TCP has been widely used for decades owing to its reliability and its fairness in competing with other Internet traffic [1]. With TCP, the primary goal is to maximize the use of available bandwidth [2]. Thanks to substantial improvements in wireless communication technology, mobile terminals with various communication interfaces may now simultaneously access several wired/wireless networks [3]. Because of this, TCP has developed from a single-path to a multi-path connection in order to keep up with the growing expansion of network capability. As a
step beyond single-path TCP, the IETF’s working group on multi-path TCP (MPTCP) put forward a proposal to implement it [4]. To preserve network stability and maximize bandwidth utilization, TCP uses congestion control algorithms to calculate the amount of in-flight data needed to govern data flow. Congestion control algorithms for MPTCP are likewise employed to ensure optimal use of the underlying network. When constructing an MPTCP congestion control algorithm, three main design objectives must be addressed. Enhanced efficiency: a multi-path flow should perform at least as well as a single-path flow would on the best of the available paths. This aim guarantees that there is an incentive for multi-path deployment. No negative impact on other flows: a multi-path flow should not use more capacity on any of its paths than a single-path flow using just that route would, as long as all of the paths are connected. Balanced congestion: subject to meeting the first two objectives, a multi-path flow should move as much traffic as possible away from its most congested paths.
In recent years, mobile devices have been equipped with a wide range of interfaces. For example, fourth generation (4G) Long-Term Evolution (LTE) and wireless LAN (WLAN) are common interfaces on most smartphones. The fifth generation (5G) mobile network is intended to use numerous communication routes offered by several network providers to give mobile terminals additional interfaces. Classic single-path TCP cannot manage many interfaces at once, since it makes a connection only between two IP addresses. Linux, Apple OS/iOS [5, 6], and Android [7] have all implemented MPTCP, an extension of TCP, in order to take advantage of the multiple-interface configuration. Conventional TCP programs may use MPTCP as if it were TCP, and multiple byte streams are made available through a variety of interfaces.
The Internet Engineering Task Force (IETF) specifies MPTCP in three RFC documents (requests for comments). RFC 6182 [7] lays out the architectural guidelines. RFC 6824 [8] specifies how MPTCP connections and sub flows (TCP connections that are part of an MPTCP connection) are maintained and how data is transmitted over an MPTCP connection. RFC 6356 [9] provides a congestion control strategy that couples the congestion control of the distinct sub flows. An essential consideration is that MPTCP congestion control relies on the self-regulation of individual sub flows. The RFC states that, on a congested link, an MPTCP stream’s throughput should not exceed that of a typical single-path TCP stream. The algorithm defined in RFC 6356, known as the linked increases algorithm (LIA), couples the congestion window increases of the separate sub flows. More aggressive algorithms, such as the opportunistic linked increases algorithm (OLIA), have also been suggested [10, 11]. Like TCP Reno, all of these techniques follow the additive increase and multiplicative decrease (AIMD) scheme: in congestion avoidance the congestion window grows by roughly one segment per round-trip time as new ACK segments arrive, and it is reduced multiplicatively when a loss is detected. CUBIC TCP [12] and Compound TCP [13] are examples of high-speed congestion control techniques in current operating systems. These methods increase the congestion window more aggressively than TCP Reno does. Consequently, the throughput of LIA and other MPTCP congestion control algorithms may be reduced when they coexist with such methods. A performance assessment based on
71 Smart Congestion Control and Path Scheduling in MPTCP
these concepts may be found in [14]. The authors carried out two kinds of experiments. In the first, the MPTCP congestion control algorithm is changed from LIA to TCP Reno or CUBIC TCP. In the second, performance is measured when a bottleneck link is shared by an MPTCP connection using LIA and a single-path TCP connection using TCP Reno or CUBIC TCP. Both experiments showed that LIA’s throughput was lower than that of TCP Reno and CUBIC TCP. It is possible that MPTCP throughput might be reduced by using Ethernet or WLAN connections, which restrict the number of experimental runs.
71.2 Background
71.2.1 MPTCP and TCP
MPTCP is layered on top of TCP, as seen in Fig. 71.1, and is designed to be transparent to conventional applications. An MPTCP connection comprises two or more ordinary TCP connections known as sub flows. New TCP options for MPTCP operations are used to manage an MPTCP connection and to transfer data over it. When the first sub flow is established, the MP_CAPABLE TCP option is carried in the SYN, SYN+ACK, and subsequent ACK segments. When subsequent sub flows are established, the MP_JOIN option is used to link the new TCP connections to the existing MPTCP connection.
Fig. 71.1 TCP merged in MPTCP
N. R. Thakur and A. S. Kunte
Fig. 71.2 MPTCP data sequence
Each sub flow in an MPTCP implementation carries enough control information for the data stream to be reassembled and delivered to the receiver-side application correctly and in sequence. An MPTCP connection’s data sequence number is independent of the sub flow level sequence number. The data and acknowledgment segments carry a data sequence signal (DSS) option, shown in Fig. 71.2. Depending on the option settings, the data sequence number and data ACK are either four or eight bytes long. The data sequence number is assigned byte by byte, much like the TCP sequence number, and the value carried in the option is the number allocated to the first byte sent in that TCP segment. The data sequence number, sub flow sequence number, and data-level length establish the mapping between the MPTCP connection level and the sub flow level. The data ACK behaves in the same way as a typical TCP cumulative ACK: it tells the sender the next data sequence number that the receiver expects.
MPTCP, a TCP extension, allows mobile devices to use multiple network interfaces simultaneously. The MPTCP design is shown in Fig. 71.3. The underlying principle of MPTCP is to split a single byte stream into several sub flows and send them across separate network paths [15]. MPTCP is commonly used to increase network performance and achieve resiliency in communication networks. The quality of a sub flow is assessed from the path it takes through the network, including signal convergence, loss rate, time in the queue, and the performance of the links it uses.
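The mapping between the sub flow level and the connection level described above can be illustrated with a small sketch. It is a simplified model under stated assumptions, not the wire format of the DSS option: the class name `DssMapping`, the function `reassemble`, and the example sequence numbers are hypothetical, sequence-number wraparound is ignored, and mappings are assumed not to overlap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DssMapping:
    """One data sequence mapping (simplified; names are illustrative)."""
    data_seq: int     # connection-level number of the first mapped byte
    subflow_seq: int  # sub flow level number of that same byte
    length: int       # data-level length of the mapping, in bytes

    def to_data_seq(self, subflow_seq: int) -> int:
        """Translate a sub flow sequence number to a data sequence number."""
        offset = subflow_seq - self.subflow_seq
        if not 0 <= offset < self.length:
            raise ValueError("byte is not covered by this mapping")
        return self.data_seq + offset


def reassemble(segments, initial_data_seq):
    """Rebuild the in-order byte stream from (mapping, payload) pairs.

    Returns the contiguous bytes and the data ACK, i.e. the next data
    sequence number that the receiver expects.
    """
    stream = bytearray()
    expected = initial_data_seq
    for mapping, payload in sorted(segments, key=lambda s: s[0].data_seq):
        if mapping.data_seq != expected:
            break  # a gap: later bytes must wait for the missing ones
        stream += payload
        expected += mapping.length
    return bytes(stream), expected


# Two sub flows each carry five bytes of one ten-byte stream.
segments = [
    (DssMapping(data_seq=1005, subflow_seq=1, length=5), b"WORLD"),  # sub flow B
    (DssMapping(data_seq=1000, subflow_seq=1, length=5), b"HELLO"),  # sub flow A
]
data, data_ack = reassemble(segments, initial_data_seq=1000)
print(data, data_ack)
```

Both sub flows start their own sequence space at 1, yet the receiver can order the payloads because each mapping names the connection-level position of its first byte.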
Fig. 71.3 Architectural parts of MPTCP
To control the distribution of data packets across the sub flows, a packet scheduler seeks the best scheduling plan for goals such as lowering transmission latency, reducing communication cost, and boosting network throughput. MPTCP performance is strongly affected by the scheduling procedure, which determines how many data packets to send on each sub flow. MPTCP scheduling optimization is based on the principle of sending data along the least congested path available, and multipath congestion control is used for this purpose. On any path or set of paths, a multipath connection should obtain no more capacity than a single-path TCP flow would. MPTCP scheduling has scalability, reliability, and latency issues in the context of IoT applications [16].
71.2.2 Congestion Control in MPTCP
LIA: In MPTCP, individual sub flows are in charge of managing their own congestion windows; the MPTCP connection as a whole does not maintain one. If the sub flows performed their congestion management independently, the throughput of an MPTCP connection would be larger than that of a single TCP connection sharing the same bottleneck link. Such behavior is unfair to classic TCP, according to RFC 6356. RFC 6356 therefore sets out three requirements for the congestion management of MPTCP connections, and meets all three by combining a coupled additive increase function with unmodified TCP decrease behavior in case of packet loss.
OLIA: Khalili et al. raised two issues with LIA. First, LIA is not Pareto optimal: using MPTCP may impair the throughput of other TCP connections while giving no advantage to its own throughput. Second, MPTCP may be too aggressive toward competing TCP connections. To address these issues, they proposed the opportunistic linked increases algorithm (OLIA) for MPTCP congestion management. Like LIA, OLIA behaves as TCP Reno does when a packet is lost.
BALIA: According to Peng et al., a further congestion management algorithm should be developed that strikes a compromise between TCP friendliness and efficiency while remaining responsive. Balanced linked adaptation (BALIA) is an AIMD congestion avoidance method that follows this approach. It is important to note that ssthresh is set to one packet, rather than two, when more than one sub flow is available.
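As an illustration of the coupled additive increase, the following sketch computes the per-ACK window increase in the form given in RFC 6356: the increase on sub flow i is the smaller of alpha * bytes_acked * MSS / cwnd_total and bytes_acked * MSS / cwnd_i, where alpha = cwnd_total * max(cwnd_i / rtt_i^2) / (sum(cwnd_i / rtt_i))^2. The function names and the example values are assumptions made for illustration; windows are in bytes and RTTs in seconds, and the loss response (plain TCP halving) is not shown.

```python
def lia_alpha(subflows):
    """Aggressiveness factor alpha for a list of (cwnd_bytes, rtt_seconds)."""
    total_cwnd = sum(cwnd for cwnd, _ in subflows)
    best = max(cwnd / (rtt * rtt) for cwnd, rtt in subflows)
    denom = sum(cwnd / rtt for cwnd, rtt in subflows) ** 2
    return total_cwnd * best / denom


def lia_increase(subflows, i, bytes_acked, mss):
    """Bytes to add to the window of sub flow i for one ACK (congestion avoidance)."""
    cwnd_i, _ = subflows[i]
    total_cwnd = sum(cwnd for cwnd, _ in subflows)
    coupled = lia_alpha(subflows) * bytes_acked * mss / total_cwnd
    uncoupled = bytes_acked * mss / cwnd_i  # what single-path TCP would add
    return min(coupled, uncoupled)          # never more aggressive than TCP


# Two identical sub flows: 10 segments of 1460 bytes each, 100 ms RTT.
MSS = 1460
flows = [(10 * MSS, 0.1), (10 * MSS, 0.1)]
print(lia_alpha(flows))
print(lia_increase(flows, 0, bytes_acked=MSS, mss=MSS))
```

For two identical sub flows alpha comes out at about 0.5, so each sub flow grows at roughly a quarter of the rate of an independent TCP flow with the same window, and the pair together behaves approximately like one TCP flow on the shared path.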
71.2.3 Heterogeneity Constraint in Transport Layer Protocol
MPTCP scheduling on heterogeneous networks, such as those seen in multimedia systems, is problematic. In 4G/LTE networks, the connection route employs huge buffers with extended fluctuating transmission, significant latency, and high packet loss rates. However, there are small delays and increased packet loss rates when using a Wi-Fi link. Buffer sizes at data receivers have an influence on network performance
as do queuing mechanisms, the number of flows sharing a connection, and many other factors. Because multimedia networks rely on Wi-Fi networks that share the wireless channel, problems such as network collisions and frequent handoffs arise. Multimedia applications have a greater need for dependable data transfer than other types of applications because they require higher throughput and lower RTT. Thus, a broad variety of MPTCP protocols and approaches are being employed to design congestion management systems that maximize network performance.
MPTCP faces various difficulties in sending packets at a high rate. Head-of-line (HoL) blocking is a performance-limiting condition that occurs when the first packet in a sequence of packets is held up. In the HoL blocking state, packets scheduled on the faster sub flow arrive at the receive buffer and then wait for packets scheduled on the slower path to arrive. This leads to two key issues: high receive buffer occupancy and out-of-order (OFO) packet delivery. OFO issues may also arise on sub flows with significant packet loss and drop rates; a possible remedy is to increase the waiting queue on the receiving side [17]. Buffer bloat is an issue when huge buffers are used, particularly on low-congestion paths, where packets sit in the queue for a long time. Delay-sensitive applications in the IoV and multimedia environments are especially affected by these challenges [18]. To summarize, in order to obtain an acceptable degree of MPTCP performance, the congestion control strategy and the packet scheduler must both be consistent with the needs of the network environment.
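The HoL blocking effect described above can be reproduced with a toy receive buffer. This is a minimal sketch under simplifying assumptions: packets carry consecutive integer sequence numbers starting at 0, the function name `deliver_in_order` is hypothetical, and the arrival times in milliseconds are invented for the example.

```python
def deliver_in_order(arrivals):
    """Simulate in-order delivery from a receive buffer.

    arrivals: list of (arrival_time, seq) pairs, one per packet.
    Returns (deliveries, peak_waiting), where deliveries holds
    (seq, arrival_time, delivery_time) tuples and peak_waiting is the
    largest number of packets left waiting out of order.
    """
    waiting = {}       # seq -> arrival time of packets not yet deliverable
    next_seq = 0       # next sequence number the application needs
    deliveries = []
    peak_waiting = 0
    for now, seq in sorted(arrivals):
        waiting[seq] = now
        # Hand over every packet that is now contiguous with the stream.
        while next_seq in waiting:
            deliveries.append((next_seq, waiting.pop(next_seq), now))
            next_seq += 1
        peak_waiting = max(peak_waiting, len(waiting))
    return deliveries, peak_waiting


# Packet 0 travels on a slow path (arrives at 100 ms); packets 1 to 3
# travel on a fast path and arrive early, but must wait for packet 0.
arrivals = [(100, 0), (10, 1), (11, 2), (12, 3)]
deliveries, peak_waiting = deliver_in_order(arrivals)
for seq, arrived, delivered in deliveries:
    print(seq, "waited", delivered - arrived, "ms")
print("peak out-of-order queue:", peak_waiting)
```

The three fast-path packets each wait about 90 ms in the buffer, which shows both issues at once: receive buffer occupancy grows, and the delay seen by the application is set by the slow path.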
MPTCP thus has the potential to solve last-mile connection difficulties and achieve the required network QoS. Pokhrel et al. [19] found a discrepancy in throughput between MPTCP and regular TCP over Wi-Fi networks in the last mile. Barré et al. [20] implemented MPTCP in the Linux kernel and provided a performance study demonstrating that coupled congestion management is fairer than the usual TCP approach. According to Pokhrel et al. [21], MPTCP with LIA suffers considerably in competition with TCP owing to reordering delays at the receiver. MPTCP scheduling approaches are discussed in the next section.
71.2.4 Path Scheduling in MPTCP
Multi-path TCP, a modified version of TCP, distributes data packets among various sub flows in order to maximize the use of available resources and increase throughput. To schedule successfully, an MPTCP scheduler should: (1) provide reliable scheduling in the face of network heterogeneity (such as differences in delay, capacity, or loss rate), (2) balance a variety of QoS constraints (such as maximizing throughput, reducing RTT,
and minimizing latency), and (3) be adaptive to network dynamicity (real-time network conditions). Consider, for instance, the wide variation in cellular and Wi-Fi network performance [22]. LIA is presently the most popular congestion control algorithm in open source implementations of MPTCP. Based on an end-to-end connection and numerous criteria (for example, fairness, friendliness, responsiveness, and congestion balance), the coupled congestion control changes the congestion window of each sub flow. Throughput may improve by 50% to 100% compared to normal TCP (using solely Wi-Fi) under varying Wi-Fi coverage. According to Khalili et al., LIA fails to provide both fairness and responsiveness and is not Pareto optimal. Their opportunistic linked increases algorithm (OLIA) is a window-based congestion management approach that couples the additive increases and uses unmodified TCP behavior in the case of a loss. The researchers showed that, with OLIA, the throughput of one flow cannot be raised without reducing the throughput of other flows or increasing the congestion cost. Peng et al. conducted an in-depth investigation of MPTCP congestion management and proposed balanced linked adaptation (BALIA) as a generalization of existing MPTCP algorithms. Their tests showed that BALIA balances TCP friendliness and responsiveness. Pokhrel et al. suggested an analytical technique to enhance the coupled congestion-based approach in terms of goodput, an application-level throughput measure that counts only useful data transfer. The two key components of the suggested method are controlling the receiver’s reordering delay and accounting for packet losses and transmission delays. The findings reveal an increase in throughput under the constraint of a low-variation data rate. MPTCP scheduling methods other than congestion window management are also discussed in the literature.
Round-trip time (RTT) is the time between transmitting a data packet with a given sequence number and receiving its acknowledgment [23]. The MinRTT scheduler is based on RTT: it first fills the CWND of the sub flow with the smallest RTT and then moves on to sub flows with progressively greater RTTs. Because MinRTT is concerned mainly with utilizing the congestion window (CWND) and does not estimate in advance the number of packets to be sent across the available paths, it suffers from HoL blocking in heterogeneous networks. In the BLEST scheduler of Ferlin et al. [24], packets are distributed according to the risk of HoL blocking. Large files may take longer to download if certain sub flows are left unused. Compared to the default MPTCP scheduler and other schedulers, BLEST increases application goodput while avoiding needless retransmissions. Lim et al. [25] questioned whether the default MPTCP path scheduler can provide applications with the optimal aggregate bandwidth, i.e., the total of the available bandwidths of all paths. For a real-time streaming application, their experimental findings reveal that the fast path is underutilized when paths are heterogeneous, which results in undesirable behavior. For streaming applications, their ECF scheduler consistently outperforms alternative techniques in the presence of path heterogeneity by making better use of all available paths. According to Guo et al. [26], HoL blocking is caused by a large buffer size when dealing with huge data files. They proposed DEMS (DEcoupled multi-path scheduler), which uses adaptive redundant transmission of small data chunks over several paths. The fundamental shortcoming of the
default MPTCP scheduler, according to Adarsh et al. [27], is that it only takes RTT into consideration. Other metrics, such as throughput and loss rate, should also be taken into account when dealing with heterogeneous sub flows. Pokhrel et al. [28] introduced a throughput-based MPTCP method for scheduling data packets across time-varying heterogeneous wireless channels. The method uses load balancing and forward error correction (FEC) to reduce congestion in IoV and other delay-sensitive networks. By integrating intelligent congestion management with load balancing in IoV networks, the suggested method outperforms the standard congestion control model. According to Pokhrel and Garg [29], optimization-based MPTCP approaches will not suffice for dynamic networks, given the intricacy of the model and the precision required for real-time scheduling. These concerns may be addressed using the suggested learning-based scheduling, which allows the source to learn from its own experience.
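The MinRTT behavior discussed in this section, filling the congestion window of the lowest-RTT sub flow before moving to slower ones, can be sketched as follows. This is an illustrative model rather than the Linux scheduler: the `Subflow` class, the function `min_rtt_schedule`, and the RTT and window values are assumptions, and acknowledgments that would free window space are not modeled.

```python
from dataclasses import dataclass


@dataclass
class Subflow:
    name: str
    srtt_ms: float      # smoothed RTT estimate
    cwnd: int           # congestion window, in packets
    in_flight: int = 0  # packets sent but not yet acknowledged

    def has_room(self) -> bool:
        return self.in_flight < self.cwnd


def min_rtt_schedule(subflows, n_packets):
    """Assign packets 0..n_packets-1 to sub flows, lowest RTT first.

    Returns a list of (packet, sub flow name); packets that find every
    window full are left unscheduled.
    """
    assignment = []
    for packet in range(n_packets):
        available = [s for s in subflows if s.has_room()]
        if not available:
            break  # all windows are full; wait for ACKs
        best = min(available, key=lambda s: s.srtt_ms)
        best.in_flight += 1
        assignment.append((packet, best.name))
    return assignment


paths = [Subflow("lte", srtt_ms=80, cwnd=3), Subflow("wifi", srtt_ms=20, cwnd=2)]
plan = min_rtt_schedule(paths, n_packets=6)
print(plan)
```

In this example packets 0 and 1 go to the 20 ms path and packets 2 to 4 to the 80 ms path, so later packets on the fast path can overtake earlier ones on the slow path, which is the source of the HoL blocking noted above.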
71.2.5 Automated Learning Schedulers
Heterogeneous paths, such as those present in cellular networks, are highly variable and unpredictable in terms of self-inflicted queuing delays and loss rate. Because of network instability and the need to reduce latency and loss rate while improving throughput, researchers are investigating adaptive scheduling solutions. Learning-based schedulers, which use machine learning and deep learning methods to analyze network behavior and develop effective scheduling strategies, have been suggested to accomplish these objectives. Pokhrel and Singh [30] developed a novel analytical model for evaluating coordinated TCP (C-TCP) flows for IoT applications over Wi-Fi networks. Chung et al. [31] developed MPTCP-ML, a machine learning approach to path management. The quality of active paths is assessed periodically using heuristic patterns; the quality measures include throughput, RTT, signal quality, and data rate, and a random forest model was used to uncover the patterns. According to the findings, prediction accuracy is quite high in a mobile setting. Beig et al. [32] employed a throughput-based learning system to find the appropriate signal quality rate depending on the interface type and transmitted file size. They looked at three methods of transferring data on mobile devices: LTE, Wi-Fi, and MPTCP. Compared to single-path TCP via LTE or Wi-Fi, the throughput of MPTCP is 10% better. However, the model does not take additional variables, such as loss rate and buffer size, into consideration. For long-lived (mainly persistent) MPTCP flows in heterogeneous networks (Wi-Fi and cellular), Pokhrel and Mandjes [21] developed a thorough approach for assessing performance. They took into account factors such as retransmission limits and buffer sizes when designing the network.
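To give a flavor of throughput-based learning, the sketch below uses a simple epsilon-greedy rule to learn which transfer option yields the highest observed throughput. It is a generic illustration and not the method of any of the cited works: the function `learn_best_interface`, the probe `measure_throughput`, and the throughput figures are all hypothetical, and the environment is assumed to be stationary.

```python
import random


def learn_best_interface(measure_throughput, interfaces, rounds=200,
                         epsilon=0.1, seed=0):
    """Epsilon-greedy estimate of the mean throughput of each option.

    measure_throughput(name) is a caller-supplied (hypothetical) probe
    that returns the throughput observed when sending on `name`.
    """
    rng = random.Random(seed)
    estimate = {name: 0.0 for name in interfaces}
    tries = {name: 0 for name in interfaces}
    for _ in range(rounds):
        untried = [name for name in interfaces if tries[name] == 0]
        if untried:
            choice = untried[0]                         # try each option once
        elif rng.random() < epsilon:
            choice = rng.choice(interfaces)             # explore
        else:
            choice = max(interfaces, key=estimate.get)  # exploit
        reward = measure_throughput(choice)
        tries[choice] += 1
        # Incremental sample average of the observed throughput.
        estimate[choice] += (reward - estimate[choice]) / tries[choice]
    return max(interfaces, key=estimate.get), estimate


# Invented, noise-free throughputs in Mbit/s, with MPTCP 10% above Wi-Fi.
observed = {"lte": 10.0, "wifi": 20.0, "mptcp": 22.0}
best, estimate = learn_best_interface(observed.get, ["lte", "wifi", "mptcp"])
print(best, estimate)
```

A real scheduler would need a state (for example signal quality or file size) and a way to track changing conditions; the point here is only the loop of measuring, updating an estimate, and choosing accordingly.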
The effect of heterogeneity variation on MPTCP performance was studied using a suggested learning method. MPTCP can achieve coexistence among heterogeneous networks in a best-effort way using dynamic sub flow
management (add and drop). This has a number of applications, such as in-city rail and drone convoy coordination. Chiariotti et al. [33] studied delay-sensitive MPTCP applications under a user-specified delay constraint. The authors proposed the latency-controlled end-to-end aggregation protocol (LEAP), a QoS-based MPTCP protocol. LEAP takes user-defined QoS constraints into account when it organizes and distributes traffic over several parallel paths. Zhang et al. [34] proposed ReLeS, in which deep reinforcement learning (DRL) is used to train a neural network (NN) to determine the optimal MPTCP distribution strategy. The reward function employed is somewhat complicated. Performance is evaluated using the following measures: the number of bytes received out of order (a sign of HoL blocking), the application’s goodput, and the download time. In scheduling and adaptive learning, ReLeS beats some of the most popular schedulers. The key issue is the complexity of the optimization function, which may lead to longer optimization times in complicated scheduling circumstances. Wu et al. presented Peekaboo, a learning-based MPQUIC scheduler that continually monitors the current dynamicity level of each path and selects the most suitable scheduling technique accordingly. The high-throughput paths are chosen using the reward function. The model is a hybrid of deterministic and adaptive models that can be used online. For simplicity, Peekaboo considers only paths that have adequate CWND. Peekaboo routinely exceeds or comes close to matching the performance of the best current schedulers. Pokhrel and Williamson [35] defined cooperative MPTCP utility functions using game theory.
MPTCP sub flow competition is modeled as a game with common coupled constraints, such that if one sub flow fails to fulfill the constraints it shares with the other sub flows, all other sub flows are affected. Out-of-order packet delivery and packet loss are regular occurrences in cooperative MPTCP sub flows. Network emulation trials indicate superior throughput and responsiveness compared to baseline solutions when dealing with diverse network paths. Huang et al. introduced a distributed DRL-based congestion management method for MPTCP. Because it eliminates HoL blocking in combination with the BLEST scheduler under complicated network conditions, the new MPTCP congestion management method outperforms older ones. C-TCP uses a mixed congestion management method [36] that takes both loss and delay into account. Pokhrel and Garg developed an IoT traffic scheduling framework based on the deep Q-network [37], a widely used benchmark for constructing and assessing deep reinforcement learning methods. By learning policies directly from large volumes of data using end-to-end reinforcement learning, deep Q-networks aim to reach human-level performance. According to Abbasloo et al., DRL approaches may be used to teach machines how to guide throughput-oriented TCP algorithms toward meeting applications’ required delays in highly dynamic networks such as cellular networks (e.g., 3G, 4G). While recent learning-based methods have tried to replace existing TCP schemes, their DeepCC aims to support and enhance existing TCP schemes rather than replace them. The purpose of DeepCC is to ensure that the average packet delay does not exceed the applications’ intended targets while
maintaining good throughput. Even though MPTCP delivers high-performance and dependable connections, security remains an issue when transmitting data over many network interfaces. For distributed on-vehicle machine learning computation and data sharing, Pokhrel and Choi [38] presented a blockchain-based federated learning (BFL) system that offers strong communication and data privacy while reducing delay. System delays may be reduced by adapting the communication and the block arrival rate to current network conditions. The task was decentralized in order to lessen the impact of centralized control on the blockchain system [39]. To ensure privacy while increasing IoV communication performance, the authors suggested federated learning [40]. As the amount and diversity of IoT big data grows, the TCP protocol may need to be accelerated to meet these challenges.
71.3 Related Work and Motivation
Based on these design aims, a number of researchers have proposed different MPTCP congestion management algorithms. TCP congestion control algorithms are most commonly loss-based, delay-based, or hybrid [41]. Loss-based TCP congestion control algorithms include RFC 793 [42], Tahoe [43], Reno [44], and NewReno [45], as well as binary increase congestion control (BIC) [46] and CUBIC [47]. Vegas [48], FAST [49], low-latency congestion control (Lola) [50], and Timely [51] are delay-based TCP congestion control algorithms. Hybrid TCP congestion management algorithms include Vegasreno [52], Compound [53], Fusion [54], Illinois [55], and bottleneck bandwidth and round-trip time (BBR) [56]. In RFC 6824 [57], MPTCP is categorized as a next-generation transport layer protocol. Three kinds of congestion management algorithms exist for MPTCP: (a) loss-based algorithms, such as the linked increases algorithm (LIA), the opportunistic linked increases algorithm (OLIA), balanced linked adaptation (BALIA), and dynamic LIA (D-LIA) [58]; (b) delay-based algorithms, such as wVegas [59] and coupled multipath BBR (C-MPBBR); and (c) hybrid algorithms. Widely accepted MPTCP congestion control algorithms are still lacking, and a lot of work is currently being done to address the need for an MPTCP congestion management algorithm that is both efficient and standard. Faster and more reliable Internet access has become essential in today’s world of long-distance business and live broadcasting. Furthermore, network conditions are continually changing as a result of the large amount of transient traffic. More than a dozen high-level research initiatives have been launched to solve this issue [60–65].
In loss-based MPTCP congestion management techniques, the transmission rate is controlled by the number of packets lost. That is, the MPTCP congestion control algorithms assume that the underlying bottleneck buffer is full anytime there is a loss of packets. They lower CWND to restrict data flow and empty the buffer in response to a crowded network. Several loss-based MPTCP congestion management
algorithms, including LIA, OLIA, BALIA, and D-LIA, have been suggested in response to MPTCP’s key design objectives. Raiciu et al. proposed LIA in order to achieve the goals of MPTCP congestion management. To increase efficiency and fairness, LIA shifts traffic from congested paths to less congested ones. However, LIA is unable to fully use the underlying network because of its emphasis on fairness. Analyzing LIA’s behavior, Khalili et al. found that it forces a trade-off between optimal congestion balancing and responsiveness. To avoid this trade-off, they suggested OLIA, which provides both. OLIA, however, also underutilizes the underlying network. Peng et al. found that OLIA might be unresponsive to changes in network conditions even though it was meant to offer optimal load balancing and responsiveness at the same time. They came up with BALIA, a modified algorithm, to overcome this issue. However, the problem of underuse of the underlying network remains in BALIA as well. As a consequence, network throughput is low with all of these previously suggested congestion management strategies, which considered only the mechanism of CWND growth. D-LIA [58], a loss-based congestion management method that dynamically adjusts the CWND reduction mechanism, was proposed as a solution to this problem. D-LIA adapts the CWND reduction factor applied on packet loss to the time between packet losses. D-LIA has the potential to outperform LIA, OLIA, and BALIA in terms of throughput. However, packet losses increased sharply because D-LIA could not react quickly enough to network congestion. In contrast, delay-based MPTCP congestion control algorithms use a proactive approach: they can prevent queue growth at bottleneck buffers.
Their major objectives are the lowest feasible round-trip time (RTT) and the maximum possible throughput. As a consequence, low-latency applications benefit greatly from their use. wVegas and C-MPBBR are widely used delay-based MPTCP congestion control methods. wVegas, a delay-based congestion management technique for MPTCP based on TCP Vegas, was introduced by Yu et al. They used packet queueing delay as a congestion indicator to accomplish fine-grained load balancing. This study emphasized the “congestion equality principle” rather than the three MPTCP design objectives. As a consequence, wVegas fails to satisfy the MPTCP design objectives. C-MPBBR, an MPTCP congestion control algorithm (CCA) based on single-path TCP BBR, was developed to meet the design objectives of MPTCP congestion control algorithms. High throughput, minimal latency, and better fairness were its priorities. However, some significant challenges of BBR’s single-path TCP congestion management technique remain unsolved. It is possible to achieve high throughput while still meeting the design objectives of MPTCP congestion control algorithms by combining loss-based and delay-based congestion control. To our knowledge, hybrid strategies for MPTCP congestion management have so far been little explored. High-speed and long-delay networks may benefit from Phuong’s MPTCP congestion management method, MCompound.
It is a multi-path implementation of the previously suggested Compound TCP for high-speed and long-delay networks. However, it ignores the MPTCP design objectives and, as a result, behaves unfairly toward single-path TCP traffic. So far, there has not been a standard MPTCP congestion control algorithm that satisfies all of the MPTCP congestion control algorithm design objectives.
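The contrast between loss-based and delay-based control in this section can be made concrete with the single-path TCP Vegas rule on which wVegas builds. The sketch below estimates how many packets the flow keeps queued in the network and adjusts the window before any loss occurs. It is a simplified, once-per-RTT illustration: the function name `vegas_adjust` and the thresholds `lower` and `upper` are assumptions, and the multipath weighting that wVegas adds is not reproduced here.

```python
def vegas_adjust(cwnd, base_rtt, rtt, lower=2.0, upper=4.0):
    """Return the new window (in packets) after one RTT, Vegas style.

    base_rtt is the smallest RTT seen so far (propagation delay estimate),
    rtt is the most recent measurement, both in seconds.
    """
    expected_rate = cwnd / base_rtt  # rate if no queueing occurred
    actual_rate = cwnd / rtt         # rate actually achieved
    # Estimated number of this flow's packets sitting in queues.
    backlog = (expected_rate - actual_rate) * base_rtt
    if backlog < lower:
        return cwnd + 1              # path underused: grow linearly
    if backlog > upper:
        return cwnd - 1              # queue building up: back off early
    return cwnd                      # within the target band: hold


print(vegas_adjust(20, base_rtt=0.100, rtt=0.100))  # no queueing delay
print(vegas_adjust(20, base_rtt=0.100, rtt=0.115))  # small backlog
print(vegas_adjust(20, base_rtt=0.100, rtt=0.200))  # large backlog
```

Because the window shrinks as soon as the RTT rises above the base RTT by enough to indicate a backlog, the sender reacts to queue growth rather than waiting for a buffer overflow, which is what distinguishes this family from the loss-based algorithms above.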
71.4 Conclusion and Future Work
IoT, IoV, multimedia, and satellite networks all involve new kinds of network applications and technologies that are progressing toward more complex structures and higher resource demands in computing and networking. IoT big data [66, 67] is a new workload paradigm resulting from the rapid growth of IoT devices. Volume, velocity, and variety are the three characteristics of the IoT big data model [68, 69]. Conventional MPTCP approaches offer rigid transmission strategies that ignore the huge amounts of predominantly high-speed data services, and so cannot address the high responsiveness and reliability requirements of IoT big data [70]. Pokhrel et al., for example, examined MPTCP for use in the IIoT area. The IIoT branch of the Internet of Things (IoT) [71] is deployed for high operational efficiency, intelligent monitoring and tracking applications, and predictive and preventive maintenance. Although MPTCP approaches for IoT networks, such as IoV, are presented in the literature as promising [72–85], large-scale data size and speed have not been extensively studied. This work examines MPTCP scheduling for IoT big data and how it may be optimized so that network speed and latency are maintained.
References
1. Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. J. Mol. Biol. 147, 195–197 (1981). https://doi.org/10.1016/0022-2836(81)90087-5
2. Ford, A., Raiciu, C., Handley, M., Bonaventure, O., Paasch, C.: [Online] (2013). Available: https://tools.ietf.org/html/rfc6824
3. Wu, J., Yuen, C., Cheng, B., Wang, M., Chen, J.: Streaming high-quality mobile video with multipath TCP in heterogeneous wireless networks. IEEE Trans. Mob. Comput. 15, 2345–2361 (2015)
4. Wischik, D., Raiciu, C., Greenhalgh, A., Handley, M.: Design
5. Staff, A.: [Online] (2019). Available: http://appleinsider.com/articles/13/09/20/apple-foundto-be-using-advanced-multipath-tcp-networking-in-ios-7
6. [Online] (2019). Available: https://multipathtcp.org/pmwiki.php/Users/Android
7. Ford, A., Raiciu, C., Handley, M., Barre, S., Iyengar, J.: Architectural guidelines for multipath TCP development. IETF RFC 6182 (2011)
8. Ford, A., Raiciu, C., Handley, M., Bonaventure, O.: TCP extensions for multipath operation with multiple addresses. IETF RFC 6824 (2013)
9. Raiciu, C., Handley, M., Wischik, D.: Coupled congestion control for multipath transport protocols. IETF RFC 6356 (2011)
10. Khalili, R., Gast, N., Popovic, M., Boudec, J.: MPTCP is not pareto-optimal: performance issues and a possible solution. IEEE/ACM Trans. Networking 21(5), 1651–1665 (2013)
11. Peng, Q., Walid, A., Hwang, J., Low, S.: Multipath TCP: analysis, design and implementation. IEEE/ACM Trans. Networking 24(1), 596–609 (2016)
12. Ha, S., Rhee, I., Xu, L.: CUBIC: a new TCP-friendly high-speed TCP variant. ACM SIGOPS Operating Syst. Rev. 42(5), 64–74 (2008)
13. Tan, K., Song, J., Zhang, Q., Sridharan, M.: A compound TCP approach for high-speed and long distance networks. IEEE INFOCOM 1–12 (2006)
14. Kato, T., Diwakar, A., Yamamoto, R., Ohzahata, S., Suzuki, N.: Performance evaluation of multipath TCP congestion control. ICN 2019: 18th International Conference on Networks, pp. 19–24 (2019)
15. Ford, A., Raiciu, C., Handley, M., Bonaventure, O., Paasch, C.: RFC 6824: TCP extensions for multipath operation with multiple addresses. Internet Eng. Task Force (2013)
16. Pokhrel, S.R., Ding, J., Park, J., Park, O.S., Choi, J.: Towards enabling critical mmtc: a review of urllc within mmtc. IEEE Access 8, 131796–131813 (2020)
17. Paasch, C., Ferlin, S., Alay, O., Bonaventure, O.: Experimental evaluation of multipath TCP schedulers. Proceedings of the 2014 ACM SIGCOMM Workshop on Capacity Sharing Workshop, pp. 27–32 (2014)
18. Abbasloo, S., Yen, C.Y., Chao, H.J.: Wanna make your TCP scheme great for cellular networks? Let machines do it for you! IEEE J. Sel. Areas Commun. 39, 265–279 (2020)
19. Pokhrel, S.R., Panda, M., Vu, H.L.: Fair coexistence of regular and multipath TCP over wireless last-miles. IEEE Trans. Mob. Comput. 18, 574–587 (2018)
20. Barré, S., Paasch, C., Bonaventure, O.: Multipath TCP: from theory to practice. International Conference on Research in Networking, pp. 444–457 (2011)
21. Pokhrel, S.R., Mandjes, M.: Improving multipath TCP performance over wifi and cellular networks: an analytical approach. IEEE Trans. Mob. Comput. 18, 2562–2576 (2018)
22. Sommers, J., Barford, P.: Cell versus wifi: on the performance of metro area mobile connections. Proceedings of the 2012 Internet Measurement Conference, pp. 301–314 (2012)
23. Postel, J.: Assigned numbers. RFC 790, USC/Information Sciences Institute (1981)
24. Ferlin, S., Alay, Ö., Mehani, O., Boreli, R.: Blest: blocking estimation-based MPTCP scheduler for heterogeneous networks. 2016 IFIP Networking Conference (IFIP Networking) and Workshops, pp. 431–439 (2016)
25. Lim, Y., Nahum, E.M., Towsley, D., Gibbens, R.J.: Ecf: an MPTCP path scheduler to manage heterogeneous paths. Proceedings of the 13th International Conference on Emerging Networking Experiments and Technologies, pp. 147–159 (2017)
26. Guo, Y.E., Nikravesh, A., Mao, Z.M., Qian, F., Sen, S.: Accelerating multipath transport through balanced subflow completion. Proceedings of the 23rd Annual International Conference on Mobile Computing and Networking, pp. 141–153 (2017)
27. Adarsh, V., Schmitt, P., Belding, E.: Mptcp performance over heterogenous subpaths. 2019 28th International Conference on Computer Communication and Networks (ICCCN), pp. 1–9 (2019)
28. Pokhrel, S.R., Choi, J.: Low-delay scheduling for internet of vehicles: load-balanced multipath communication with FEC. IEEE Trans. Commun. 67, 8489–8501 (2019)
29. Pokhrel, S.R., Garg, S.: Multipath communication with deep q-network for industry 4.0 automation and orchestration. IEEE Transactions on Industrial Informatics (2020)
30. Pokhrel, S.R., Singh, S.: Compound TCP performance for industry 4.0 wifi: a cognitive federated learning approach. IEEE Trans. Ind. Inf. 17, 2143–2151 (2020)
31. Chung, J., Han, D., Kim, J., Kim, C.K.: Machine learning based path management for mobile devices over mptcp. 2017 IEEE International Conference on Big Data and Smart Computing (BigComp), pp. 206–209 (2017)
32. Beig, E.F.G.M., Daneshjoo, P., Rezaei, S., Movassagh, A.A., Karimi, R., Qin, Y.: Mptcp throughput enhancement by q-learning for mobile devices. 2018 IEEE 20th International Conference on High Performance Computing and Communications; IEEE 16th International Conference on Smart City; IEEE 4th International Conference on Data Science and Systems (HPCC/SmartCity/DSS), pp. 1171–1176 (2018)
33. Chiariotti, F., Kucera, S., Zanella, A., Claussen, H.: Analysis and design of a latency control protocol for multi-path data delivery with pre-defined QoS guarantees. IEEE/ACM Trans. Networking 27, 1165–1178 (2019)
34. Zhang, H., Li, W., Gao, S., Wang, X., Ye, B.: Reles: a neural adaptive multipath scheduler based on deep reinforcement learning. IEEE INFOCOM 2019-IEEE Conference on Computer Communications, pp. 1648–1656 (2019)
35. Pokhrel, S.R., Williamson, C.: (2020)
36. Tan, K., Song, J., Zhang, Q., Sridharan, M.: A compound TCP approach for high-speed and long distance networks. Proceedings-IEEE INFOCOM (2006)
37. Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A.A., Veness, J., Bellemare, M.G., Graves, A., Riedmiller, M., Fidjeland, A.K., Ostrovski, G.: Human-level control through deep reinforcement learning. Nature 518, 529–533 (2015)
38. Pokhrel, S.R.: Federated learning meets blockchain at 6g edge: a drone-assisted networking for disaster response. Proceedings of the 2nd ACM MobiCom Workshop on Drone Assisted Wireless Communications for 5G and Beyond, pp. 49–54 (2020)
39. A decentralized federated learning approach for connected autonomous vehicles. 2020 IEEE Wireless Communications and Networking Conference Workshops, pp. 1–6 (2020)
40. Improving TCP performance over wifi for internet of vehicles: a federated learning approach. IEEE Trans. Veh. Technol. 69, 6798–6802 (2020)
41. Turkovic, B., Kuipers, F.A., Uhlig S.
42. Zaghal, R.Y., Khan, J.I.: [Online] (2021). Available: http://www.medianet.kent.edu/technical reports.html
43. Mathis, M., Mahdavi, J., Floyd, S., Romanow, A.: TCP Selective Acknowledgment Options (1996)
44. Allman, M., Paxson, V., Stevens, W.: TCP Congestion Control, vol. 5681, pp. 27–27. Available online (2021)
45. Floyd, S., Henderson, T., Gurtov, A.: pp. 27–27. [Online] (2021). Available: https://tools.ietf.org/html/rfc3782
46. Xu, L., Harfoush, K., Rhee, I.: Binary Increase Congestion Control (BIC) for Fast Long-Distance Networks.
In: Proceedings of the IEEE INFOCOM, pp. 2514–2524 (2004) 47. Ha, S., Rhee, I., Xu, L.: CUBIC: a new TCP-friendly high-speed TCP variant. ACM SIGOPS Oper. Syst. Rev. 42, 64–74 (2008) 48. Brakmo, L.S., Malley, S.W., Peterson, L.L.: Vegas, new techniques for congestion detection and avoidance. In: Proceedings of the Conference on Communications Architectures, Protocols and Applications, pp. 24–35 (1994) 49. Wang, J., Wen, J., Zhang, J., Han, Y.: TCP-FIT: an improved TCP congestion control algorithm and its performance. In: Proceedings of the 2011 IEEE INFOCOM, pp. 2894–2902 (2011) 50. Hock, M., Neumeister, F., Zitterbart, M., Bless, R.: Lola, congestion control for low latencies and high throughput. In: Proceedings of the 2017 IEEE 42nd Conference on Local Computer Networks (LCN), pp. 215–218 (2017) 51. Mittal, R., Lam, V.T., Dukkipati, N., Blem, E., Wassel, H., Ghobadi, M., Vahdat, A., Wang, Y., Wetherall, D., Zats, D.: TIMELY: RTT-based congestion control for the datacenter. ACM SIGCOMM Comput. Commun. Rev. 45, 537–550 (2015) 52. Fu, C.P., Liew, S.C.: Veno, TCP enhancement for transmission over wireless access networks. IEEE J. Sel. Areas Commun. 21, 216–228 (2003) 53. Song, K.T.J., Zhang, Q., Sridharan, M.: Compound TCP: a scalable and TCP-friendly congestion control for high-speed networks. In: Proceedings of the PFLDnet (2006) 54. Kaneko, K., Fujikawa, T., Su, Z., Katto, J.: TCP-fusion: a hybrid congestion control algorithm for high-speed networks. In: Proceedings of the PFLDnet, vol. 7, pp. 31–36 (2007) 55. Liu, S., Ba¸sar, T., Srikant, R.: TCP-illinois: a loss-and delay-based congestion control algorithm for high-speed networks. Perform. Eval. 65, 417–440 (2008) 56. Cardwell, N., Cheng, Y., Gunn, C.S., Yeganeh, S.H., Jacobson, V.: BBR: congestion-based congestion control. Commun. ACM 60, 58–66 (2017)
71 Smart Congestion Control and Path Scheduling in MPTCP
755
57. Noda, K., Ito, Y., Muraki, Y.: Study on congestion control of multipath TCP based on webQoE under heterogeneous environment. In: Proceedings of the IEEE 6th Global Conference on Consumer Electronics (GCCE), pp. 1–3 (2017) 58. Lubna, T., Mahmud, I., Cho, Y.-Z.D.-L.: pp. 263–268 59. Cao, Y., Xu, M., Fu, X.: Delay-based congestion control for multipath TCP. In: Proceedings of the 20th IEEE International Conference on Network Protocols (ICNP), pp. 1–10 (2012) 60. Tsiropoulou, E.E., Katsinis, G.K., Filios, A., Papavassiliou, S.: On the problem of optimal cell selection and uplink power control in open access multi-service two-tier femtocell networks. In: Proceedings of the International Conference on Ad-Hoc Networks and Wireless, pp. 114–127. Springer (2014) 61. Chao, L., Wu, C., Yoshinaga, T., Bao, W., Ji, Y.: A brief review of multipath TCP for vehicular networks. Sensors 21 (2021) 62. Lee, W., Lee, J.Y., Joo, H., Kim, H.: An MPTCP-based transmission scheme for improving the control stability of unmanned aerial vehicles. Sensors (2021) 63. DeepCC: Multi-agent deep reinforcement learning congestion control for multi-path TCP based on self-attention. IEEE Trans. Netw. Serv. Manag. (2021) 64. Wei, W., Xue, K., Han, J., Wei, D.S., Hong, P.: Shared bottleneck-based congestion control and packet scheduling for multipath TCP. IEEE/ACM Trans. Netw. 28, 653–666 (2020) 65. Mudassir, M.U., Baig, M.: HCCA, pp. 711–711 (2021) 66. Chen, M., Liu, Y., Mao, S.: (2014) 67. Ahmed, E., Yaqoob, I., Hashem, I.A.T., Shuja, J., Imran, M., Guizani, N., Bakhsh, S.T.: Recent advances and challenges in mobile big data. IEEE Commun. Mag. 56, 102–108 (2018) 68. Fang, H., Zhang, Z., Wang, C.J., Daneshmand, M., Wang, C., Wang, H. 69. A survey of big data research 29, 6–9 (2015) 70. Bansal, M., Chana, I., Clarke, S.: A survey on iot big data: current status, 13 v’s challenges, and future directions. ACM Comput. Surv. (CSUR) 53, 1–59 (2020) 71. 
Yu, C., Quan, W., Cheng, N., Chen, S., Zhang, H.: Coupled or uncoupled? Multipath TCP congestion control for high-speed railway networks. 2019 IEEE/CIC International Conference on Communications in China (ICCC), pp. 612–617 (2019) 72. Raiciu, C., Wischik, D., Handley, M.: pp. 27–27. [Online] (2009). Available: https://citeseerx. ist.psu.edu/viewdoc/download?doi=10.1.1.376.3473&rep=rep1&type=pdf 73. Paasch, C., Bonaventure, O.: Multipath TCP. Commun. ACM 57(4):51–57 (2014) 74. Floyd, S., Henderson, T., Gurtov, A.: The new Reno modification to TCP’s fast recovery algorithm. IETF RFC 3728 (2004) 75. Lim, Y., Chen, Y.C., Nahum, E.M., Towsley, D., Lee, K.W.: Cross-layer path management in multi-path transport protocol for mobile devices. IEEE INFOCOM 2014-IEEE Conference on Computer Communications, pp. 1815–1823 (2014) 76. Bae, S., Ban, D., Han, D., Kim, J., Lee, K., Lim, S., Park, W., Kim, C.K.: Streetsense: effect of bus wi-fi aps on pedestrian smartphone. Proceedings of the 2015 Internet Measurement Conference, pp. 347–353 (2015) 77. Scharf, M., Kiesel, S.: Nxg03-5: head-of-line blocking in TCP and SCTP: analysis and measurements. IEEE Globecom, pp. 1–5 (2006) 78. Wischik, D., et al.: NSDI’11: 8th USENIX Symposium on Networked Systems Design and Implementation (2011) 79. Kuhn, N., Lochin, E., Mifdaoui, A., Sarwar, G., Mehani, O., Boreli, R.: Daps: Intelligent delayaware packet scheduling for multipath transport. In: 2014 IEEE International Conference on Communications (ICC), pp. 1222–1227 (2014) 80. Liu, Y., Neri, A., Ruggeri, A., Vegni, A.M.: A MPTCP-based network architecture for intelligent train control and traffic management operations. IEEE Trans. Intell. Transp. Syst. 18, 2290– 2302 (2016) 81. Pokhrel, S.R., Jin, J., Vu, H.L.: Mobility-aware multipath communication for unmanned aerial surveillance systems. IEEE Trans. Veh. Technol. 68, 6088–6098 (2019) 82. 
Wu, H., Alay, Ö., Brunstrom, A., Ferlin, S., Caso, G.: Peekaboo: learning-based multipath scheduling for dynamic heterogeneous environments. IEEE J. Sel. Areas Commun. 38, 2295– 2310 (2020)
756
N. R. Thakur and A. S. Kunte
83. Ha, S., Rhee, I., Xu, L.: Cubic: a new TCP-friendly high-speed TCP variant. ACM SIGOPS Operating Syst. Rev. 42, 64–74 (2008) 84. Dong, M., Li, Q., Zarchy, D., Godfrey, P.B., Schapira, M.: PCC: re-architecting congestion control for consistent high performance. In: Proceedings of the 12th USENIX symposium on networked systems design and implementation (NSDI 15), pp. 395–408 (2015) 85. Sisinni, E., Saifullah, A., Han, S., Jennehag, U., Gidlund, M.: Industrial internet of things: challenges, opportunities, and directions. IEEE Trans. Industr. Inf. 14, 4724–4734 (2018)
Chapter 72
Internet of Things-Enabled Diabetic Retinopathy Classification from Fundus Images
Vinodkumar Bhutnal and Nageswara Rao Moparthi
Abstract The main aim of this research is to review the design of a consolidated framework for Internet of Medical Things (IoMT)-based diabetic retinopathy (DR) detection and severity analysis, together with its performance trade-offs. Automatic DR classification and severity analysis under IoMT is the main research problem of this work. Recent studies on DR classification and severity analysis show that early prediction of DR, together with its severity analysis, is a challenging research problem because of several concerns, such as fundus image modalities, image quality, computational efficiency, accurate region-of-interest localization, and overall accuracy. Achieving a trade-off among these challenges is a tedious task for researchers. Most works have used automatic deep learning mechanisms for DR classification. Very few attempts have been reported for automatic DR classification combined with disease severity analysis, and these are not sufficient for IoMT-enabled smart healthcare systems. This motivates us to attempt a novel IoMT-enabled automatic DR classification and severity analysis from input retinal fundus images of different modalities.
Keywords Internet of Things · Diabetic retinopathy · Fundus image · IoMT-based DR · Digital image processing
V. Bhutnal (✉) · N. R. Moparthi
Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram, AP, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_72
72.1 Introduction
Image processing is a form of processing in which the input is an image or video and the output is an image or a set of features or parameters related to that image; the term usually refers to digital image processing. In medical science, computer vision and image processing techniques play an important role in diabetes care and modern ophthalmology. Diabetes, often referred to as diabetes mellitus, is a group of metabolic disorders in which blood glucose remains too high over many years. If diabetes is left untreated, it causes serious complications such as eye, foot, heart, skin, hearing, and kidney problems, stroke, and diabetic neuropathy. Diabetic eye disease includes diabetic retinopathy, cataract, glaucoma, and diabetic macular edema. Diabetic retinopathy (DR) is an eye disease that affects the blood vessels in the retina, the light-sensitive tissue at the back of the eye. It is a common cause of vision loss among people with diabetes and one of the leading causes of vision impairment and blindness for people under the age of 70. The main cause of DR is an abnormal increase in the blood glucose level, which damages the vascular endothelium and increases retinal vessel permeability. The progression of DR can result in retinal detachment. DR patients are often unaware of any symptoms until visual impairment develops, by which point treatment is less effective. The earlier stages of DR can be treated with laser photocoagulation, which may prevent vision loss. Diabetic patients are therefore advised to undergo regular eye check-ups to check for the presence of DR. Abnormalities associated with retina-related diabetic disease include diabetic macular edema, age-related macular degeneration, cataract, conjunctivitis, and glaucoma. According to WHO statistics, 108 million people had diabetes in 1980, and the number rose rapidly to 422 million in 2014. In 2015, 1.6 million deaths were caused by diabetes and high blood glucose. WHO predicts that diabetes will be a major cause of death by 2030. Almost half of all deaths attributable to high blood glucose occur before the age of 70. To reduce this mortality, WHO recommends a healthy diet, regular physical activity, proper medication, and regular screening and testing for complications. In general, DR affects individuals who have had diabetes for more than 10 years.
Early detection of DR is a tedious process even for a well-trained clinician, which may result in delayed treatment, miscommunication, and so on. In India, 70% of the population lives in rural areas, and about 10,000 ophthalmologists are available to care for the whole population (i.e., a ratio of 1:100,000 people). Early detection of DR with automatic computer-aided diagnosis (CAD) systems applied to digital retinal fundus images helps to overcome these problems. The use of digital image processing technology in the medical field has become very popular for a wide range of applications. Digital imaging equipment is now used in almost all diagnosis- and treatment-related areas through e-healthcare systems. Today, eye care specialists use fundus cameras to capture images of the eye. Imaging the retinal fundus is a non-invasive process. Owing to the rapid progress of standards such as Internet of Things (IoT)-driven Healthcare 4.0, technology has received significant attention in smart healthcare systems. Healthcare 4.0 is a subclass of the emerging Industry 4.0 standard and supports periodic patient monitoring by healthcare organizations. At present, DR detection and grading methods are not sufficiently integrated with IoT systems. This research focuses on designing automatic DR detection with severity analysis connected to the Internet of Medical Things (IoMT).
72.2 Literature Review
Over the last decade, several studies have addressed DR detection and severity analysis using image processing, machine learning, and deep learning. We review some recent works in this section. In [1], the authors proposed the multi-scale attention network (MSA-Net) for DR classification. They applied an encoder network to embed the retina image in a high-level representational space, where a combination of mid- and high-level features was used to enrich the representation. A multi-scale feature pyramid was then included to describe the retinal structure at different locations. In [2], the authors focused on building a large dataset of retinal fundus images for DR classification. They constructed a large, fine-grained annotated DR dataset containing 2842 images: 1842 images with pixel-level DR-related lesion annotations and 1000 images with image-level labels graded by six board-certified ophthalmologists with intra-rater consistency. In [3], the authors proposed a convolutional neural network (CNN)-based method to fulfill a DR classification framework using en face optical coherence tomography (OCT) and OCT angiography (OCTA). A densely and continuously connected neural network with adaptive rate dropout (DcardNet) was designed for DR classification. In addition, adaptive label smoothing was proposed and used to suppress overfitting. In [4], the authors proposed a new multi-modal framework for vessel segmentation called ELEMENT (vessel segmentation using machine learning and connectivity). ELEMENT consists of feature extraction and pixel-based classification using region growing and machine learning. The proposed features capture complementary evidence based on grey-level and vessel connectivity properties. The latter information was seamlessly propagated through the pixels at the classification stage.
In [5], the authors proposed a hierarchical multi-task deep learning framework for simultaneous diagnosis of DR severity and DR-related features in fundus images. A hierarchical structure was introduced to incorporate the causal relationship between DR-related features and DR severity levels. In [6], the authors proposed a deep learning system, DeepDR, that can detect early-to-late stages of diabetic retinopathy. DeepDR was trained for real-time image quality assessment, lesion detection, and grading using 466,247 fundus images from 121,342 patients with diabetes. In [7], the authors proposed an improved loss function and three hybrid model structures, Hybrid-a, Hybrid-f, and Hybrid-c, to improve the performance of DR classification models. EfficientNetB4, EfficientNetB5, NASNetLarge, Xception, and InceptionResNetV2 CNNs were chosen as the base models. These base models were trained using the enhanced cross-entropy loss and the standard cross-entropy loss, respectively. In [8], the authors focused on a feature extraction technique that combines two feature extractors, speeded-up robust features and binary robust invariant scalable keypoints, to extract the relevant features from retinal fundus images. Selecting the top-ranked features with the MR-MR (maximum relevance minimum redundancy) feature selection and ranking method improves the efficiency of classification. In [9], the authors focused on automatic detection of DR and proposed a model for determining its progression/severity using fundus images. The method was developed so that DR can be detected effectively and efficiently before it damages the eye, without the presence of an ophthalmologist. In [10], the authors proposed a modified capsule network for the detection and classification of DR. Features were extracted from the fundus images using the convolution and primary capsule layers, and the class capsule layer and softmax layer were then used to estimate the probability that the image belongs to a specific class.
A. Extraction of Retinal Blood Vessels
Ophthalmologists can predict DR from the presence of lesions in the retina caused by the disease. This approach is constrained by the lack of expertise and equipment in some areas where the incidence of diabetes is high and DR detection and treatment are frequently required. As the number of people affected by diabetes grows, ophthalmologists are hurrying to prevent blindness, yet the infrastructure and specialists for DR are becoming increasingly insufficient. In [13], the author categorized segmentation methods as unsupervised or supervised. Supervised segmentation algorithms use prior knowledge about the ground truth of a training set of images, while unsupervised algorithms are trained online during segmentation. Among unsupervised methods, related works use algorithms such as matched filtering, vessel tracking, morphological transformations, shape models, the Laplacian operator, and discerning change approach. The proposed work is based on supervised learning algorithms: the network is trained on data collected from public datasets and hospitals, with labels such as normal, mild, moderate, and severe. The author of [14] proposes a supervised algorithm in which the network learns the features from a human expert. It also achieves a good AUC. The author introduced an independent training dataset and implemented neural network classifiers using a seven-feature set extracted from local parameters as well as a moment-invariants-based method.
B. Detection of Microaneurysms
The author of [17] applied the Radon transform and multi-overlapping windows to fluorescein angiogram (FA) images for the extraction of microaneurysms (MAs). The reported sensitivity and specificity are 94% and 75%, respectively. The author implemented a novel method for detecting lesions in order to diagnose DR. First, the optic nerve head was detected and masked. The pre-processed image was divided into sub-images, and the Radon transform was applied to each sub-image. Blood vessels and the optic nerve head were segmented, and MAs were detected by appropriate thresholding and the Radon transform.
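The sensitivity and specificity figures quoted above follow from the standard confusion-matrix definitions. The short sketch below illustrates the arithmetic only; the counts are hypothetical (they are not taken from [17]) and were chosen so that the ratios come out at 94% and 75%.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true lesions that were detected: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of non-lesions that were correctly rejected: TN / (TN + FP)."""
    return tn / (tn + fp)


# Hypothetical counts, for illustration only (not data from the cited study).
tp, fn = 94, 6    # lesions detected / lesions missed
tn, fp = 75, 25   # non-lesions rejected / false alarms

print(f"sensitivity = {sensitivity(tp, fn):.2f}")
print(f"specificity = {specificity(tn, fp):.2f}")
```

A detector with these figures misses few lesions but raises a false alarm on a quarter of the non-lesion candidates, which is why later works add a separate step to remove spurious candidates.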
The authors of [18] addressed a method for detecting microaneurysms using directional cross-section profiles centered on the local maximum pixels of the pre-processed image. Peak detection is applied to each profile, and properties such as the size, height, and shape of the peak are calculated. Naive Bayes classification is used to eliminate spurious candidates. The method achieves higher sensitivity at low false positive rates. The authors of [19] investigated improving the detection of MAs based on contextual information and an adaptive weighting approach. An ensemble-based method assigns weights to the lesions detected by the ensemble members based on their contrast and spatial location.
C. Review Related to Fundus Images
The authors of [23] offered an improved approach for diagnosing vision loss from retinal images based on the recently released pre-trained EfficientNet models, together with a comparison of several neural network models. Their EfficientNetB5-based model outperforms CNN and ResNet50 models on benchmark datasets of retina images obtained via fundus photography under varying imaging conditions. The authors of [24] presented a study aimed at protecting patients from vision loss. The process starts with the devices themselves, which securely transmit data through the IoT platform and ensure a common language so that mobile apps can interoperate. This step regularly collects a large amount of data from the device and stores it in a secure database. In [25], a grading study was also performed to compare the proposed approach's performance with that of general ophthalmologists with varying levels of experience. Compared with conventional deep learning-based approaches, the results suggest that the proposed methodology can improve performance for both DR severity diagnosis and DR-related feature detection. For detecting DR severity levels, it performs comparably to general ophthalmologists with five years of experience, and for referable DR detection, it performs comparably to general ophthalmologists with 10 years of experience.
72.3 Research Gap Analysis
From the related works reviewed, the following challenges in vessel segmentation and in the detection of microaneurysms, hemorrhages, and exudates are noted:
• Detecting small MAs and thin vessels in low-contrast images is difficult.
• Automatic DR screening systems depend only on vessel detection and do not consider hemorrhage detection.
• Exudates are detected with high complexity, and existing methods fail to extract the real features of the exudates. Exudates and the optic disk are not clearly differentiated.
• In exudate detection, optic disk borders are blurred, so finding and extracting the optic disk is very tedious.
• Most researchers have applied neural network and statistical classifiers for DR screening; these classifiers require a large number of iterations and long computation times.
• Many processing techniques have been developed for the segmentation of dark and bright lesions and for the grading of diabetic retinopathy, but they do not combine all the lesions together. The grading systems in existing work achieve low sensitivity and accuracy.
• Because of the large and growing number of DR patients, experts cannot deliver good results quickly, particularly in rural areas.
Therefore, the objectives for overcoming the above research gaps are as follows:
• To formulate the definition of the Internet of Medical Things for an automatic diabetic retinopathy diagnosis system.
• To design a novel deep learning model for robust and efficient diabetic retinopathy detection.
• To propose a consolidated model for automatic diabetic retinopathy detection and severity analysis.
• To model, simulate, and evaluate the proposed models using publicly available research datasets.
72.4 Methodology
To address the research gaps and satisfy the objectives stated above, we propose the following methodologies, which are briefly presented in this section.
1. Defining the IoMT-enabled DR CAD system: The first aim of this research is to present the IoMT-enabled DR classification and severity analysis CAD system. In this contribution, we theoretically design an IoMT system that consists of different things and their roles in automatic DR detection. The system covers periodic retinal fundus image collection, transmission of the images to the healthcare system, application of CAD, and patient monitoring according to the detection outcome.
2. Automatic DR Classification: The CAD system is applied to the input retinal fundus image for disease detection. In this contribution, we attempt to design an accurate and computationally efficient CAD system using deep learning, without any manual processing.
3. Automatic DR Classification and Severity Analysis: Severity analysis is an important task in DR treatment. DR classification alone does not reveal the severity of the disease; therefore, grading is required for accurate treatment. In this phase, we attempt to extract and select appropriate features that are directly related to the diseased regions for grading classification. The disease severity is then defined according to the ranges of these features.
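The three steps above can be sketched as a small pipeline: a sample arrives from an IoMT device, a classifier decides whether DR is present, and a grading rule maps a lesion-related feature to a severity level. This is only an illustrative sketch under stated assumptions. The names `FundusSample`, `classify_dr`, `grade_severity`, and `monitor`, the `lesion_score` feature, and all threshold values are hypothetical placeholders rather than the authors' design; in a real system `classify_dr` would be a trained deep learning model operating on the image itself.

```python
from dataclasses import dataclass

# The five DR categories reported in the literature reviewed in this chapter.
GRADES = ["no DR", "mild", "moderate", "severe", "proliferative"]

# Hypothetical upper bounds on the lesion score for every grade except the last.
THRESHOLDS = [0.10, 0.30, 0.55, 0.80]


@dataclass
class FundusSample:
    patient_id: str
    lesion_score: float  # hypothetical feature in [0, 1] extracted from the fundus image


def classify_dr(sample: FundusSample) -> bool:
    """Placeholder for the CAD classifier: DR is flagged once the score reaches the first bound."""
    return sample.lesion_score >= THRESHOLDS[0]


def grade_severity(sample: FundusSample) -> str:
    """Map the feature value to a severity grade according to its range."""
    for grade, upper in zip(GRADES, THRESHOLDS):
        if sample.lesion_score < upper:
            return grade
    return GRADES[-1]


def monitor(sample: FundusSample) -> str:
    """Combine detection and grading into the message returned for patient monitoring."""
    if not classify_dr(sample):
        return f"{sample.patient_id}: no DR detected, continue periodic screening"
    return f"{sample.patient_id}: DR detected, severity = {grade_severity(sample)}"


# Hypothetical samples, as if collected periodically from IoMT devices.
for s in [FundusSample("p001", 0.05), FundusSample("p002", 0.42), FundusSample("p003", 0.91)]:
    print(monitor(s))
```

Keeping detection and grading as separate functions mirrors the split between steps 2 and 3, so either stage can be replaced without touching the monitoring logic.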
72.5 Conclusion and Future Work
This research examines a computer-based diagnosis approach that can assist ophthalmologists in diabetic retinopathy screening by recognizing early indications of diabetic retinopathy in retinal fundus images. We found that the literature divides DR into five categories: no DR, mild, moderate, severe, and proliferative. The major goal of this chapter is to review previous research on retinopathy classification using fundus images. We also found that several works have been carried out on IoT platforms. The findings of this study reveal a number of research gaps in automated diabetic retinopathy detection and classification, and these gaps motivate the future directions of this work.
References
1. Al-Antary, M.T., Arafa, Y.: Multi-scale attention network for diabetic retinopathy classification. IEEE Access 9, 54190–54200 (2021). https://doi.org/10.1109/access.2021.3070685
2. Zhou, Y., Wang, B., Huang, L., Cui, S., Shao, L.: A benchmark for studying diabetic retinopathy: segmentation, grading, and transferability. IEEE Trans. Med. Imaging 40(3), 818–828 (2021). https://doi.org/10.1109/tmi.2020.3037771
3. Zang, P., Gao, L., Hormel, T.T., Wang, J., You, Q., Hwang, T.S., Jia, Y.: DcardNet: diabetic retinopathy classification at multiple levels based on structural and angiographic optical coherence tomography. IEEE Trans. Biomed. Eng. 1–1 (2020). https://doi.org/10.1109/tbme.2020.3027231
4. Rodrigues, E., Conci, A., Liatsis, P.: ELEMENT: multi-modal retinal vessel segmentation based on a coupled region growing and machine learning approach. IEEE J. Biomed. Health Inform. 1–1 (2020). https://doi.org/10.1109/jbhi.2020.2999257
5. Wang, J., Bai, Y., Xia, B.: Simultaneous diagnosis of severity and features of diabetic retinopathy in fundus photography using deep learning. IEEE J. Biomed. Health Inform. 1–1. https://doi.org/10.1109/jbhi.2020.3012547
6. Dai, L., Wu, L., Li, H., et al.: A deep learning system for detecting diabetic retinopathy across the disease spectrum. Nat. Commun. 12, 3242 (2021). https://doi.org/10.1038/s41467-021-23458-5
7. Liu, H., Yue, K., Cheng, S., Pan, C., Sun, J., Li, W.: Hybrid model structure for diabetic retinopathy classification. J. Healthc. Eng. 2020, 1–9 (2020). https://doi.org/10.1155/2020/8840174
8. Gayathri, S., Gopi, V.P., Palanisamy, P.: Automated classification of diabetic retinopathy through reliable feature selection. Phys. Eng. Sci. Med. (2020). https://doi.org/10.1007/s13246-020-00890-3
9. Saman, G., Gohar, N., Noor, S., Shahnaz, A., Idress, S., Jehan, N., Rashid, R., Khattak, S.S.: Automatic detection and severity classification of diabetic retinopathy. Multimedia Tools Appl. (2020). https://doi.org/10.1007/s11042-020-09118-8
10. Kalyani, G., Janakiramaiah, B., Karuna, A., Prasad, L.V.N.: Diabetic retinopathy detection and classification using capsule networks. Complex Intell. Syst. (2021). https://doi.org/10.1007/s40747-021-00318-9
Chapter 73
Open Research Issues of Battery Usage for Electric Vehicles
Hema Gaikwad, Harshvardhan Gaikwad, and Jatinderkumar R. Saini
Abstract Electric vehicles (EVs), widely seen as the technology of the future, face a major hindrance: the large size of the battery. The battery pack accounts for almost half of the cost of an EV. The lithium-ion batteries of EVs are made up of cells arranged in a particular pattern, and if even a single cell is damaged, the entire battery pack has to be replaced. Instead of using a single large battery pack, the pack can be divided into several modules for more efficient use. This prevents the entire battery from being lost when only a few cells are damaged, which is more economically viable because replacing the entire pack is very costly. Wastage is reduced, so pollution from chemical waste will come down considerably, which matters given that EVs are being adopted at a phenomenal rate. A modular battery also lets customers choose how many modules to use: they can run the EV on some modules while keeping others on charge. Battery swapping is easier in a modular system because the lighter modules are easier to carry.
Keywords Electric vehicles · Modular battery · Less waste · Battery swapping
73.1 Introduction
As the world moves toward clean energy, the transportation sector, a major contributor to air pollution, will witness radical changes. With oil fields drying up, humankind is shifting toward non-conventional sources of energy. In this field, electric cars are a path-breaking development. Electric cars, as the name suggests, are powered by electricity supplied by a battery.
H. Gaikwad · J. R. Saini (✉)
Symbiosis Institute of Computer Studies and Research, Symbiosis International (Deemed) University, Pune, India
H. Gaikwad
Pandit Deendayal Energy University, Gandhinagar, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_73
H. Gaikwad et al.
Fig. 73.1 Graph of plug-in vehicle sales in the world (Source https://www.ev-volumes.com/wp-content/uploads/2021/08/WW-K-6-2021.png)
For decades, internal combustion (IC) engines have been the main machines used to run vehicles. Fuel is burnt in the IC engine, which produces smoke. In electric vehicles, induction motors replace IC engines. They have minuscule emissions, are lightweight, and have far fewer moving parts than IC engines. All these features make electric vehicles a promising technology. Figure 73.1 shows the penetration of various types of electric vehicles, which is minuscule compared with that of IC engine vehicles. However, the basic problem with induction motors is that they need an external source of energy, which has to be supplied by batteries. Powering a car requires a tremendous amount of energy, and thus the batteries used are very large and heavy. Decreasing the weight of the vehicle is necessary in order to improve the mileage. Heavy weight is the primary reason why electric vehicles are costly and do not run at very high speeds. Fuel-based vehicles are refueled within a couple of minutes, but electric batteries take from 5 to 8 h for a single full charge. Figure 73.2 shows the country-wise distribution of slow chargers. The fast-charger infrastructure is much smaller than the slow-charger infrastructure. Because they are costly, their market penetration is very low in middle- and low-income countries. All these challenges have to be overcome in order to make electric vehicles competitive in the market. Figure 73.3 shows statistics on the sale of electric vehicles in India over the past few years. There is a steady increase in sales, which is a positive sign for a developing economy like India. The rest of this chapter is organized as follows: Sect. 73.2 describes related work, Sect. 73.3 discusses the methodology, Sect. 73.4 describes the solution, and Sect. 73.5 presents the conclusion and future work.
73 Open Research Issues of Battery Usage for Electric Vehicles
Fig. 73.2 Private EV slow chargers by country (Source https://www.ev-volumes.com/wp-content/uploads/2021/08/WW-K-6-2021.png)
Fig. 73.3 Category-wise EVs sold in India (Source https://cef.ceew.in/solutions-factory/tool/electric-mobility)
73.2 Related Work
Karimi and Li [1] stated that thermal management of the EV battery can have a major impact on its performance and life cycle. Basic principles of heat transfer are applied to judge the performance, temperature distribution, and other characteristics of Li-ion batteries. Kim et al. [2] mentioned that battery management systems (BMSes) must deal with the concerns that affect their efficiency and reliability in order to meet rising demand and make EVs less expensive and safer. Because temperature affects the electrical states that a BMS tracks, thermal management is critical to addressing these concerns. The authors showed how to use thermal controls to create efficient and dependable BMSes. Wang et al. [3] discussed the newly generated parameters from an electrochemical model. Their fully coupled model calculates heat generation, whereas the decoupled model uses empirical equations based on experimental data. Qian et al. [4] used a liquid cooling approach based on a mini-channel cold plate to explore the thermal performance of Li-ion battery packs and constructed a three-dimensional numerical model. The impacts of the number of channels, flow direction, inlet mass flow rate, and channel width on the battery pack’s thermal behavior were investigated. The cooling efficiency was evaluated using the maximum temperature and temperature uniformity. Li et al. [5] presented an optimized air-based Battery Thermal Management System (BTMS) for EVs. The thermal performance was assessed using the criteria of temperature uniformity in the battery module and pressure drop. Kim et al. [6] stated the importance of maintaining an optimal temperature range for lithium-ion batteries, as they are sensitive to temperature and fail to function properly under any deviation from the stipulated range. Akinlabi and Solyali [7] stated that a BTMS is perhaps the most important component of an EV, as it ensures that lithium-ion batteries work safely and consistently. They also claimed that lithium-ion batteries (LiBs) are one of the most suitable power alternatives for an EV drivetrain. Xiong et al. [8] stated that cells are the building blocks of the battery pack and that if even a single cell misbehaves, the operation of the entire battery pack is disrupted, which can lead to failure of the EV. Yue et al. [9] stated that because of their high energy density, extended lifespan, and great stability, Li-ion batteries have been widely employed to power EVs. However, the safety and performance of EVs are jeopardized by battery heat build-up, which could be reduced by a built-in water spray mechanism. Table 73.1 summarizes the issues related to battery thermal management systems reported from 2013 to 2021.

Table 73.1 Battery thermal management issues
Sr. No. | Year | Author | Description
1 | 2013 | Karimi and Li [1] | Addressed concerns related to thermal management in EVs
2 | 2014 | Kim et al. [2] | Discussed how to use thermal controls to create efficient and dependable BMSes
3 | 2016 | Wang et al. [3] | Discussed different problems and solutions of thermal management and thermal models
4 | 2016 | Qian et al. [4] | Discussed a liquid cooling approach for battery thermal management systems
5 | 2019 | Li et al. [5] | Discussed an air-based battery thermal management system
6 | 2019 | Kim et al. [6] | Exhaustive analysis of all types of BTMS
7 | 2020 | Akinlabi and Solyali [7] | Discussed that lithium-ion batteries (LiBs) are one of the most suitable power alternatives for an EV drivetrain
8 | 2020 | Xiong et al. [8] | Discussed many types of faults, such as those of sensors, actuators, insulation, and thermal management systems
9 | 2021 | Yue et al. [9] | Discussed a hybrid battery temperature management system

Sturk et al. [10] studied the consequences of a crash and other adverse conditions on the Li-ion battery pack of an EV in order to arrive at safer practices. Larsson et al. [11] also discussed safety issues in Li-ion batteries and focused on issues such as overcharging, propane fire tests, and short circuits. Feng et al. [12] presented a thorough examination of the thermal runaway process of commercial lithium-ion batteries for electric vehicles. The abusive circumstances that may lead to thermal runaway were outlined based on typical accidents. Kong et al. [13] also focused on the problem of thermal runaway, flame suppressors, safety valves, and separators.
The cell chemistry related to the generation, accumulation, and dissipation of heat was elaborated by Zhang et al. [14]. Feng et al. [15] presented results for establishing benchmarks that characterize the thermal runaway behavior of commercial Li-ion batteries. Wang et al. [16] proposed an approach to evaluate the safety of EV charging stations, even when they are powered using renewable sources of energy. Sun et al. [17] focused on the latest fire safety issues in EVs caused by thermal runaway and fire in Li-ion batteries. Duh et al. [18] stated that understanding the benefits and drawbacks of each cathode chemistry provides a good foundation for improving the safety measures of 18650 LIBs. Table 73.2 summarizes the thermal analysis of Li-ion batteries and the associated issues.

King et al. [19] used a car sharing concept to investigate a solution to the customer’s range anxiety problem. When compared with existing levels of subsidies to EV producers, the cost of this strategy was shown to be modest. This cost could be further decreased by utilizing hourly car sharing and low-priced vehicles. Jung et al. [20] showed that highlighting, rather than hiding, the uncertainty associated with a range estimate can be beneficial. While a single number is quick to read and understand, designers must consider the impact that hidden uncertainty may have on the user’s experience and behavior, particularly in critical situations. Rauh et al. [21] suggested a conceptual model that appears to be useful for improving understanding of the range anxiety phenomenon, although further extensive examination is required.
Table 73.2 Analytical report of Li-ion batteries and the issues
Sr. No. | Year | Author | Description
1 | 2015 | Sturk et al. [10] | Studied the consequences of a crash and other non-conducive condition on a Li-ion battery pack of an EV
2 | 2016 | Larsson et al. [11] | Discussed safety issues in Li-ion batteries
3 | 2018 | Feng et al. [12] | Thorough examination of the commercial lithium-ion battery’s thermal runaway process for electric vehicles
4 | 2018 | Kong et al. [13] | Focuses on the problem of thermal runaway, its mechanism and the steps that could be taken to tackle it
5 | 2018 | Zhang et al. [14] | Addressed safety issues related to Li-ion batteries
6 | 2019 | Feng et al. [15] | Thermal analysis was used to study thermal runaway and it was found that it occurs at three distinctive temperatures
7 | 2019 | Wang et al. [16] | Discussed development of safety standards related to charging of EVs
8 | 2020 | Sun et al. [17] | Discussed fire safety issues due to thermal runaway and fire in the Li-ion batteries
9 | 2021 | Duh et al. [18] | Discussed improving the 18650 LIB safety measures
Guo et al. [22] stated that establishing a large charging infrastructure network based on the psychology and behavior of the customer is important for the adoption of EVs. Gelmanova et al. [23] performed a study on the advantages and disadvantages of the adoption of electric vehicles. The average monthly cost of owning an electric vehicle was estimated alongside the efficiency of the vehicle. Noel et al. [24] stated that ‘range anxiety’, described as a consumer’s psychological worry in response to an electric vehicle’s limited range, has been labeled and presented as one of the most serious impediments to mass adoption. As a result, academia, government, and even industry have concentrated their efforts on overcoming the range anxiety barrier in order to hasten adoption. Pevec et al. [25] stated that the trends in the electromobility business, increased research activities connected to alternatively powered cars, and growing environmental concerns all point to a required and unavoidable shift from internal combustion engine technology to electric vehicles. In another work, the same researchers [26] stated that EVs have many advantages, such as a very low operating cost and zero greenhouse gas emissions. This makes them a clean technology if they are charged using renewable energy sources like biomass, solar power, and wind energy. The driver’s major concern is range anxiety, which also discourages potential customers from pursuing the EV option. Table 73.3 summarizes the range anxiety issues discussed from 2015 to 2020.

Table 73.3 Range anxiety issues for EVs
Sr. No. | Year | Author | Description
1 | 2015 | King et al. [19] | Discussed a car sharing solution to the problem of range anxiety
2 | 2015 | Jung et al. [20] | Discussed the issue of hiding the range anxiety problems from the customers
3 | 2015 | Rauh et al. [21] | Addressed range anxiety related issues in BEVs
4 | 2018 | Guo et al. [22] | Discussed battery charging station location, range anxiety and distance deviations
5 | 2018 | Gelmanova et al. [23] | Discussed advantages and disadvantages of the adoption of EVs
6 | 2019 | Noel et al. [24] | Discussed the psychological impact of range anxiety on the driver’s mind
7 | 2019 | Pevec et al. [25] | Assessed potential EV owners’ perceptions of range anxiety
8 | 2020 | Pevec et al. [26] | Discussed range anxiety, which causes problems in sales of EVs

Gross and Clark [27] presented a battery aging model and stated that understanding the environmental impacts on the operation and life of batteries is critical for designing a thermal management system for a vehicle with a very high voltage battery. Carter et al. [28] presented a unique energy control technique for a battery/supercapacitor vehicle that is designed to be customizable to achieve various goals. Supercapacitors were shown to be successful at lowering peak battery currents, but their benefits in terms of range extension were limited. Neubauer and Wood [29] stated that BEVs are the future because they have the potential to cut down on both oil imports and greenhouse gas emissions, but their usage is constrained by factors such as the lack of charging infrastructure. Jeong et al. [30] stated that inductive charging electric vehicles use wireless power transfer technology to charge their batteries. Stationary and dynamic charging EVs are the two types of wireless charging EVs. Stationary charging EVs have to be charged while they are parked, whereas dynamic charging EVs can be charged even when the vehicle is in motion. Several studies have shown that one of the prime advantages of dynamic charging is that small and light batteries can be used, because this model allows frequent recharging and its chargers are easy to install. Al-karakchi et al. [31] stated that the most expensive component of an electric car is the lithium-ion battery. Efforts to improve battery performance by extending battery life and lowering the cost of ownership help to promote EVs in the market. Their solution focuses on formulating an ideal charge profile for EV batteries based on the time that the battery takes to get fully charged and the condition of the battery. Akar et al. [32] presented a fuzzy logic-based energy management strategy (EMS) for a battery/ultra-capacitor hybrid energy storage system (HESS). Andwari et al. [33] discussed that as concerns over oil depletion and supply continue to grow, and as Europe grapples with the effects of climate change brought on by greenhouse gas emissions, it is increasingly exploring alternatives to traditional road transportation technology. BEVs are seen as a promising technology that could lead to the decarbonization of the light-duty vehicle fleet and oil independence. Schoch et al. [34] stated that EV users favor charging their vehicles frequently, even when there is no need to do so (AFAP). The authors showed that a charging method that minimizes battery deterioration (OPT) helps to extend the life of the battery by two to two and a half times. Beaudet et al. [35] stated that there is an urgent need for solutions that enhance battery life and decrease battery cost. The authors drew on a large and authentic database from projects that are still underway in countries like Canada. Table 73.4 summarizes battery life issues.

Table 73.4 Battery life issues
Sr. No. | Year | Author | Description
1 | 2011 | Gross and Clark [27] | Discussed real-world environmental issues related to battery thermal management systems
2 | 2012 | Carter et al. [28] | Presented a unique energy control technique for a battery
3 | 2014 | Neubauer and Wood [29] | Described the Battery Lifetime Analysis and Simulation Tool for Vehicles (BLAST-V)
4 | 2015 | Jeong et al. [30] | Wireless charging or inductive charging electric vehicle
5 | 2015 | Al-karakchi et al. [31] | Discussed ways to improve battery performance by extending the battery’s life and increasing its holding capacity
6 | 2016 | Akar et al. [32] | Presented a fuzzy logic-based energy management strategy for a battery
7 | 2017 | Andwari et al. [33] | Assessed the technological preparedness of many aspects of BEV technology
8 | 2018 | Schoch et al. [34] | Discussed the psychological effect of a discharging vehicle on the mind of the driver
9 | 2020 | Beaudet et al. [35] | Discussed projects on solutions for enhancing battery life and decreasing its cost

Botsford and Szczepanek [36] lamented the scarcity of EV chargers in the world. Level 3 chargers would be used by the comparatively richer population or by those who are in a hurry. The authors focused on defining fast and slow chargers, the infrastructural requirements, and the impact on the grid of the increased demand for electricity. Fast chargers are those that provide enough energy to let the EV travel a distance of a hundred miles in a single charge cycle. San Román et al. [37] identified several regulatory issues that plug-in electric vehicles (PEVs) face when they are integrated with the electricity sector. Schroeder and Traber [38] argued that only a large-scale roll-out of EVs will bring a boom in the fast charging technology business. Zhang et al. [39] investigated charging infrastructure needs from the perspectives of PEV operating costs and BEV practicality. An ideal pricing method based on 24-h travel patterns was presented to reduce operating costs. Mak et al. [40] noted two important aspects of planning the deployment of public chargers: the first is the battery inventory for swapping stations and the second is volatility in the adoption rate of EVs. Dong et al. [41] investigated the influence of deploying public charging infrastructure at different levels on lowering the range anxiety of BEVs. They argued that more strategically positioned public charging stations could help BEV drivers reduce range anxiety. Greene et al. [42] derived a model based on the benefits that arise because BEVs are able to travel longer distances and PHEVs use electricity in place of gasoline. According to Das et al. [43], as EV technology, charging infrastructure, and grid integration facilities improve, the popularity of EVs is expected to rise in the coming few years. Their paper covers every aspect of electric vehicle charging and grid interconnection architecture. Table 73.5 describes the business model and infrastructural issues for EVs.

Table 73.5 Business model and infrastructural issues for EVs
Sr. No. | Year | Author | Description
1 | 2009 | Botsford and Szczepanek [36] | Defined fast and slow chargers and infrastructural requirements
2 | 2011 | San Román et al. [37] | Discussed regulatory issues for plug-in electric vehicle integration in the electricity sector
3 | 2012 | Schroeder and Traber [38] | Described the fast charger technology business model used in Germany
4 | 2013 | Zhang et al. [39] | To cut operating expenses, an optimum pricing system based on 24-h travel patterns is proposed
5 | 2013 | Mak et al. [40] | For the exchanging station location problem, two distributionally robust optimization models were developed
6 | 2014 | Dong et al. [41] | Discussed the influence of varying deployment levels of public charging infrastructure on lowering BEV range anxiety
7 | 2020 | Greene et al. [42] | Discussed a method for assessing the value of public charging infrastructure for plug-in electric vehicles
8 | 2020 | Das et al. [43] | Covers every aspect of electric vehicle charging and grid interconnection architecture
73.3 Methodology
The internal structure of electric cars differs from that of diesel- or petrol-powered cars, and the core components of the two kinds of vehicle are in stark contrast. The induction motor is the heart of an electric vehicle. A three-phase alternating current is supplied to the motor to rotate it. Induction motors have a simple structure compared with internal combustion engines, which have many moving parts that in turn hamper their efficiency. Being lightweight, induction motors are a good replacement for IC engines in an electric vehicle.

The battery pack is the powerhouse of an electric vehicle and is thus a crucial component. It is also the heaviest and the costliest component. The battery pack is generally placed in the lower part of the car, which also helps to manage the center of gravity of the vehicle. However, this reduces the ground clearance of the vehicle, so the fear of the battery being damaged on a speed breaker always persists. Thus, in some vehicles, the battery is kept at the rear. In most electric vehicles, the battery constitutes almost half the total cost of the car. The battery pack used in electric cars is not a single battery. Instead, the manufacturers arrange hundreds of cells in a particular pattern to get the desired output. Figure 73.4 shows a battery pack that consists of many cells. The battery pack structure also includes a cooling mechanism with pipes and ducts. The cooling mechanism can be either liquid-based or air-based; the most popular is the liquid-based one.

The large size of the battery causes some difficulties in the electric car. Two types of chargers are used: fast chargers and slow chargers. Fast chargers are capable of charging the battery within half an hour, but they are costly to set up and degrade the battery’s life. Car manufacturers are promoting slow chargers because their infrastructure can be set up with ease. If there is a steady increase in charging station infrastructure, more and more people will be motivated to purchase electric vehicles, and thus sales will rise. Slow chargers take more than three to four hours to charge a battery pack fully. Meanwhile, refueling a traditional fuel-powered car hardly takes five minutes. Thus, it is necessary to reduce the charging time if electric cars are to be promoted among the masses.
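The remark above that hundreds of cells are arranged in a particular pattern to get the desired output can be made concrete with the usual series/parallel relations: cells in series add voltage, and cells in parallel add capacity. The following is a minimal sketch in Python. The cell ratings and the 96-by-10 layout are illustrative assumptions and are not taken from this chapter, and the names `Cell` and `pack_ratings` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Nominal ratings of one cell (illustrative values, not from the chapter)."""
    voltage_v: float    # nominal voltage in volts
    capacity_ah: float  # nominal capacity in ampere-hours


def pack_ratings(cell: Cell, series: int, parallel: int) -> dict:
    """Nominal ratings of a pack built from `series` groups of `parallel` cells."""
    if series < 1 or parallel < 1:
        raise ValueError("series and parallel counts must be positive")
    voltage_v = cell.voltage_v * series          # series connection adds voltage
    capacity_ah = cell.capacity_ah * parallel    # parallel connection adds capacity
    energy_kwh = voltage_v * capacity_ah / 1000  # Wh converted to kWh
    return {
        "cells": series * parallel,
        "voltage_v": voltage_v,
        "capacity_ah": capacity_ah,
        "energy_kwh": energy_kwh,
    }


# Hypothetical layout: 96 series groups, each with 10 cells in parallel.
example = pack_ratings(Cell(voltage_v=3.6, capacity_ah=3.0), series=96, parallel=10)
print(example)
```

The sketch ignores wiring losses, cell-to-cell variation, and the usable depth of discharge, so the resulting figures should be read as nominal upper bounds rather than as the ratings of any real pack.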
Several researchers have developed different kinds of batteries, but cost reduction still remains a major unsolved problem. Another major challenge is that, because of the large size of electric car batteries, overheating is a major concern. There have been plenty of cases where car batteries have exploded or burnt abruptly. Li-ion batteries suffer thermal runaway if overcharged continuously for a long period of time. It has been noted that extinguishing an electric vehicle fire is difficult because the fire keeps reigniting. Copious amounts of water or dry graphite are needed to extinguish such a fire, and not many fire stations in India have them. Table 73.6 shows the details of EV fire accidents in Norway from 2006 to 2016.

Fig. 73.4 A battery pack (multiple cells arranged in a calculated pattern)

Table 73.6 EV fire accidents in Norway
Company name | Vehicle fire incidents | EV fire incidents (Total %)
X (2006 to 2016) | 567 | 27 (=5%)
Y (2014 to 2016) | 499 | 13 (=2%)
Z (2016) | 386 | 9 (=2%)
Source (https://www.researchgate.net/figure/Statistics-on-EVfire-incidents-received-from-Norwegian-insurance-companiesStatistics_tbl10_336640117)
73.4 Solutions
By dividing the battery pack into three parts, many of these problems can be solved. Figure 73.5 shows the three modules of the battery pack. The three modules can be charged simultaneously, and thus the charging time can be decreased by up to a factor of three. It is not necessary to use all three modules every time. If the consumer wishes to travel a short distance, only one or two modules can be used. Thus, the weight of the battery pack will decrease and the efficiency will be enhanced. If the battery is segmented into three parts and a fault is encountered in any one of the modules, the whole battery pack will not have to be changed. This matters because replacing the faulty module is far cheaper than replacing the entire battery pack. When a battery pack catches fire, not all the cells are damaged, but removing the usable cells is a tough job, so these cells have to be dumped. In a modular battery pack, however, the fire-resistant material of each module’s covering will protect the other modules even if a particular module catches fire.

Fig. 73.5 The three modules of the battery pack
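The charging-time and module-selection arguments of this section can be illustrated with a short calculation. The sketch below, in Python, rests on strong simplifying assumptions that are not stated in the chapter: charging is at constant power with no losses or tapering, the modules are equal in size, each module has its own charger of the same power as the single-pack charger, and range scales linearly with the number of modules carried. All numbers and function names are hypothetical.

```python
import math


def charge_time_h(capacity_kwh: float, charger_kw: float) -> float:
    """Idealized charge time: stored energy divided by constant charger power."""
    return capacity_kwh / charger_kw


def modular_charge_time_h(pack_kwh: float, modules: int, charger_kw: float) -> float:
    """Charge time when every equal-sized module has its own charger."""
    return charge_time_h(pack_kwh / modules, charger_kw)


def modules_for_trip(trip_km: float, range_per_module_km: float, modules: int) -> int:
    """Smallest number of modules that covers the trip.

    Raises ValueError if even the full pack cannot cover the distance.
    """
    needed = max(1, math.ceil(trip_km / range_per_module_km))
    if needed > modules:
        raise ValueError("trip exceeds the range of the full pack")
    return needed


# Hypothetical figures: 60 kWh pack, 3 modules, 10 kW per charger, 100 km per module.
single = charge_time_h(60.0, 10.0)
modular = modular_charge_time_h(60.0, 3, 10.0)
print(single / modular)                   # speed-up from charging modules in parallel
print(modules_for_trip(120.0, 100.0, 3))  # modules to carry for a 120 km trip
```

Under these assumptions the speed-up equals the number of modules, which is where the factor of three comes from. If the three modules have to share one charger of the original power, the total charging time does not change, so the benefit depends on the available charging infrastructure.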
73.5 Conclusion and Future Work
Awareness of climate change and global warming is increasing rapidly, which is why governments are making policies based on the principle of zero carbon emissions. Vehicles on the road are a major contributor to environmental pollution. Thus, there is an urgent need to introduce EVs in the market and make sure they are adopted by the masses. The most important part of an EV is its battery pack. But as a single large battery pack, it faces several issues related to the battery thermal management system, thermal analysis of Li-ion batteries, range anxiety, battery life, and business model and infrastructure. Segmenting a battery pack into three modules will solve many problems that current EVs face. In case of fire in the battery pack, only one or two modules will be damaged and not the entire battery. Currently, replacing the battery pack is costly, but replacing modules will be significantly cheaper. The modular battery will give the driver a choice based on the distance to be traveled. If a shorter distance is to be traveled, a module can be removed so that the weight of the car is reduced and better efficiency is achieved. Therefore, modular and segmented batteries address most of the issues faced by EVs. This gives psychological relief to the driver and thus promotes the acceptance of EVs. As future work, the authors would like to examine the remaining open issues and provide optimal solutions to society.
References
1. Karimi, G., Li, X.: Thermal management of lithium-ion batteries for electric vehicles. Int. J. Energy Res. 37(1), 13–24 (2013)
2. Kim, E., Shin, K.G., Lee, J.: Real-time battery thermal management for electric vehicles. In: ACM/IEEE International Conference on Cyber-Physical Systems (ICCPS), pp. 72–83 (2014)
3. Wang, Q., Jiang, B., Li, B., Yan, Y.: A critical review of thermal management models and solutions of lithium-ion batteries for the development of pure electric vehicles. Renew. Sustain. Energy Rev. 64, 106–128 (2016)
4. Qian, Z., Li, Y., Rao, Z.: Thermal performance of lithium-ion battery thermal management system by using mini-channel cooling. Energy Convers. Manage. 126, 622–631 (2016)
5. Li, M., Liu, Y., Wang, X., Zhang, J.: Modeling and optimization of an enhanced battery thermal management system in electric vehicles. Front. Mech. Eng. 14(1), 65–75 (2019)
6. Kim, J., Oh, J., Lee, H.: Review on battery thermal management system for electric vehicles. Appl. Therm. Eng. 149, 192–212 (2019)
7. Akinlabi, A.A.H., Solyali, D.: Configuration, design, and optimization of air-cooled battery thermal management system for electric vehicles: a review. Renew. Sustain. Energy Rev. 125, 109815 (2020)
8. Xiong, R., Sun, W., Yu, Q., Sun, F.: Research progress, challenges and prospects of fault diagnosis on battery system of electric vehicles. Appl. Energy 279, 115855 (2020)
9. Yue, Q.L., He, C.X., Jiang, H.R., Wu, M.C., Zhao, T.S.: A hybrid battery thermal management system for electric vehicles under dynamic working conditions. Int. J. Heat Mass Transf. 164, 120528 (2021)
10. Sturk, D., Hoffmann, L., Tidblad, A.A.: Fire tests on e-vehicle battery cells and packs. Traffic Inj. Prev. 16(sup1), S159–S164 (2015)
11. Larsson, F., Andersson, P., Mellander, B.E.: Lithium-ion battery aspects on fires in electrified vehicles on the basis of experimental abuse tests. Batteries 2(2), 9 (2016)
12. Feng, X., Ouyang, M., Liu, X., Lu, L., Xia, Y., He, X.: Thermal runaway mechanism of lithium ion battery for electric vehicles: a review. Energy Storage Mater. 10, 246–267 (2018)
13. Kong, L., Li, C., Jiang, J., Pecht, M.G.: Li-ion battery fire hazards and safety strategies. Energies 11(9), 2191 (2018)
14. Zhang, J., Zhang, L., Sun, F., Wang, Z.: An overview on thermal safety issues of lithium-ion batteries for electric vehicle application. IEEE Access 6, 23848–23863 (2018)
15. Feng, X., Zheng, S., Ren, D., He, X., Wang, L., Liu, X., Ouyang, M.: Key characteristics for thermal runaway of Li-ion batteries. Energy Procedia 158, 4684–4689 (2019)
16. Wang, B., Dehghanian, P., Wang, S., Mitolo, M.: Electrical safety considerations in large-scale electric vehicle charging stations. IEEE Trans. Ind. Appl. 55(6), 6603–6612 (2019)
17. Sun, P., Bisschop, R., Niu, H., Huang, X.: A review of battery fires in electric vehicles. Fire Technol. 1–50 (2020)
18. Duh, Y.S., Sun, Y., Lin, X., Zheng, J., Wang, M., Wang, Y., Yu, G.: Characterization on thermal runaway of commercial 18650 lithium-ion batteries used in electric vehicles: a review. J. Energy Storage 41, 102888 (2021)
19. King, C., Griggs, W., Wirth, F., Quinn, K., Shorten, R.: Alleviating a form of electric vehicle range anxiety through on-demand vehicle access. Int. J. Control 88(4), 717–728 (2015)
20. Jung, M.F., Sirkin, D., Gür, T.M., Steinert, M.: Displayed uncertainty improves driving experience and behavior: the case of range anxiety in an electric car. In: Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, pp. 2201–2210 (2015)
21. Rauh, N., Franke, T., Krems, J.F.: Understanding the impact of electric vehicle driving experience on range anxiety. Hum. Factors 57(1), 177–187 (2015)
22. Guo, F., Yang, J., Lu, J.: The battery charging station location problem: impact of users’ range anxiety and distance convenience. Transp. Res. Part E: Logistics Transp. Rev. 114, 1–18 (2018)
23. Gelmanova, Z.S., Zhabalova, G.G., Sivyakova, G.A., Lelikova, O.N., Onishchenko, O.N., Smailova, A.A., Kamarova.: Electric cars. Advantages and disadvantages. J. Phys. Conf. Ser. 1015(5), 052029 (2018)
24. Noel, L., de Rubens, G.Z., Sovacool, B.K., Kester, J.: Fear and loathing of electric vehicles: the reactionary rhetoric of range anxiety. Energy Res. Soc. Sci. 48, 96–107 (2019)
25. Pevec, D., Babic, J., Carvalho, A., Ghiassi-Farrokhfal, Y., Ketter, W., Podobnik, V.: Electric vehicle range anxiety: an obstacle for the personal transportation (r)evolution? In: Proceedings of 4th International Conference on Smart and Sustainable Technologies, SpliTech, pp. 1–8. IEEE (2019)
26. Pevec, D., Babic, J., Carvalho, A., Ghiassi-Farrokhfal, Y., Ketter, W., Podobnik, V.: A survey-based assessment of how existing and potential electric vehicle owners perceive range anxiety. J. Clean. Prod. 276, 122779 (2020)
27. Gross, O., Clark, S.: Optimizing electric vehicle battery life through battery thermal management. SAE Int. J. Engines 4(1), 1928–1943 (2011)
28. Carter, R., Cruden, A., Hall, P.J.: Optimizing for efficiency or battery life in a battery/super capacitor electric vehicle. IEEE Trans. Veh. Technol. 61(4), 1526–1533 (2012)
29. Neubauer, J., Wood, E.: The impact of range anxiety and home, workplace, and public charging infrastructure on simulated battery electric vehicle lifetime utility. J. Power Sources 257, 12–20 (2014)
30. Jeong, S., Jang, Y.J., Kum, D.: Economic analysis of the dynamic charging electric vehicle. IEEE Trans. Power Electron. 30(11), 6368–6377 (2015)
31. Al-karakchi, A.A.A., Lacey, G., Putrus, G.: A method of electric vehicle charging to improve battery life. In: Proceedings of 50th International Universities Power Engineering Conference (UPEC), pp. 1–3. IEEE (2015)
32. Akar, F., Tavlasoglu, Y., Vural, B.: An energy management strategy for a concept battery/ultracapacitor electric vehicle with improved battery life. IEEE Trans. Transp. Electrification 3(1), 191–200 (2016)
33. Andwari, A.M., Pesiridis, A., Rajoo, S., Martinez-Botas, R., Esfahanian, V.: A review of battery electric vehicle technology and readiness levels. Renew. Sustain. Energy Rev. 78, 414–430 (2017)
34. Schoch, J., Gaerttner, J., Schuller, A., Setzer, T.: Enhancing electric vehicle sustainability through battery life optimal charging. Transp. Res. Part B: Methodological 112, 1–18 (2018)
35. Beaudet, A., Larouche, F., Amouzegar, K., Bouchard, P., Zaghib, K.: Key challenges and opportunities for recycling electric vehicle battery materials. Sustainability 12(14) (2020)
36. Botsford, C., Szczepanek, A.: Fast charging vs. slow charging: pros and cons for the new age of electric vehicles. In: Proceedings of International Battery Hybrid Fuel Cell Electric Vehicle Symposium, pp. 1–9 (2009)
37. San Román, T.G., Momber, I., Abbad, M.R., Miralles, Á.S.: Regulatory framework and business models for charging plug-in electric vehicles: infrastructure, agents, and commercial relationships. Energy Policy 39(10), 6360–6375 (2011)
38. Schroeder, A., Traber, T.: The economics of fast charging infrastructure for electric vehicles. Energy Policy 43, 136–144 (2012)
39. Zhang, L., Brown, T., Samuelsen, S.: Evaluation of charging infrastructure requirements and operating costs for plug-in electric vehicles. J. Power Sources 240, 515–524 (2013)
40. Mak, H.Y., Rong, Y., Shen, Z.J.M.: Infrastructure planning for electric vehicles with battery swapping. Manage. Sci. 59(7), 1557–1575 (2013)
41. Dong, J., Liu, C., Lin, Z.: Charging infrastructure planning for promoting battery electric vehicles: an activity-based approach using multiday travel data. Transp. Res. Part C: Emerg. Technol. 38, 44–55 (2014)
42. Greene, D.L., Kontou, E., Borlaug, B., Brooker, A., Muratori, M.: Public charging infrastructure for plug-in electric vehicles: what is it worth? Transp. Res. Part D: Transp. Environ. 78, 102182 (2020)
43. Das, H.S., Rahman, M.M., Li, S., Tan, C.W.: Electric vehicles standards, charging infrastructure, and impact on grid integration: a technological review. Renew. Sustain. Energy Rev. 120, 109618 (2020)
Chapter 74
Comparative Cost Analysis of On-Chain and Off-Chain Immutable Data Storage Using Blockchain for Healthcare Data Babita Yadav and Sachin Gupta
Abstract While blockchain technology continues to make giant strides in both cryptocurrencies and everyday use cases, gas costs keep rising as demand increases. The problem is compounded by the growing volume of data that must be stored immutably, which is itself very expensive in terms of gas. In this paper we analyze the costs associated with blockchain storage and present a comparative analysis of how the cost of storing healthcare data on immutable media varies between on-chain and off-chain blockchain solutions.
Keywords Gas costs · Immutable data stores · Storage cost · Off-chain and on-chain solutions
B. Yadav · S. Gupta (B), SoET, MVN University, Palwal, Haryana, India, e-mail: [email protected]; B. Yadav, e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_74
74.1 Introduction
Blockchain is the new buzzword, hailed as a disruption in the field of medical record storage. In recent years we have witnessed a wide range of applications using this technology in various use cases. As a disruptive technology, blockchain offers a way of building more secure [1] and controlled applications. It denotes a shared immutable record in the form of a chain of transactions, with the blocks held together by cryptographic hashes. In theory, once data is written no one can delete or change it; in practice, because records cannot be deleted, the associated data piles up. Consequently, the issue of scalability arises in almost every discussion about blockchain. Storing data permanently on a blockchain is very costly, and some applications, such as health care, are data intensive by virtue of the volume and size of the patient records to be stored. It is very expensive and time consuming to store these copious amounts of health-related data records
on public or consortium blockchains. Storing complete data on-chain is not a wise decision, as it has been established that on-chain storage is neither scalable nor efficient. It is also very expensive to store any data beyond the core ledger and related hashes, because costs are added with each transaction and a punitive fee is also applied to each read. With the technology spectrum changing rapidly, databases are also changing from traditional, purely centralized storage to distributed storage that serves analytical engines [2]. The data to be stored is no longer only structured; it may be structured, semi-structured, or unstructured. New-generation data stores are no longer limited to storing data; they are expected to support analytics over a variety of disparate data from sources including social media, device networks, Internet of Things (IoT) devices, and sensors, and immutability is desirable alongside the velocity and cloud-based nature of these applications. Some applications require data to be stored for a particular life span; for example, the health care [3] data of an individual should ideally be stored immutably from birth until death. The key to disrupting healthcare data storage lies in alternative ways of storing data on blockchain. In the patient-centric environment offered by a blockchain-based storage deployment, the data owner is the key decision maker regarding the data, which translates to better security, control over data, and privacy [4] for the stakeholders. The major question remains how ready the health industry is to adopt blockchain storage solutions while keeping the expenses manageable. The key contribution of this paper is a comparative analysis of how data storage problems can be addressed using off-chain and on-chain storage models, with a healthcare case study.
The scalability requirements of the healthcare case study have been analyzed with the help of off-chain storage scalability [5] solutions. The remainder of the paper is structured as follows: Sect. 74.2 briefly discusses related work in the blockchain storage domain, Sect. 74.3 describes the comparative cost analysis methodology, Sect. 74.4 presents the results of the comparison, and Sect. 74.5 gives concluding remarks.
74.2 Related Work
In [1], a blockchain-based distributed data storage system with keyword search was introduced. Data is uploaded to cloud nodes in encrypted form, and cryptographic techniques ensure data availability. The paper also discussed how access permission is granted by the data owner. A comprehensive use case of IPFS in blockchain storage is discussed in [2], which describes how IPFS uses content-based addresses for low-cost storage and retrieval. Because of this content-based addressing, indexing and search operations in the blockchain become faster. The research in IPFS is ongoing
with dynamic content. A survey of methods for storing healthcare data on decentralized media is given in [6]. A recent work [7] gives a comprehensive discussion of various off-chain storage mechanisms, with a comparative analysis against on-chain methods in terms of both suitability and computational speed-up. The authors in [8] argued that open data sharing that preserves privacy under privacy protection acts is at the core of various applications being developed. When data is uploaded to cloud storage, users lose ownership of their own data. The paper combines blockchain, IPFS, and a ciphertext-based encryption policy, and proposes a solution for giving users fine-grained access control. In healthcare, one major problem relates to interoperability. In [9] the authors discussed patient-driven interoperability, in which data exchange is both patient driven and patient centric. The paper discusses various mechanisms that can facilitate this transition using blockchain, such as digital access rules, data aggregation, preservation of patient identity, and data liquidity. The blockchain bloat problem is discussed in [10]: while each blockchain node stores some of its own data, it must also store the transactions of other nodes for verification. Storage volumes keep growing with each additional transaction, and a distributed form of network-coded storage was suggested as a solution. In [11], IPFS-based storage for blockchain was proposed. Whenever a transaction is submitted, the transaction data is stored on IPFS, while the returned hash is stored on-chain. With this scheme, the compression ratio reached 0.0817 when applied to Bitcoin. The analysis shows the performance in terms of security and synchronization, and the speed of new nodes also increased.
The authors in [12] presented a comparative analysis of cloud-based storage vis-à-vis blockchain storage, discussing four blockchain-based storage systems. Storj, which uses proof-of-redundancy, is a frontrunner off-chain solution. In Storj, each file is split into pieces called shards, which are sent over the network. The Storj files are then replicated at three places. The paper also discussed the bloating problem; the solution given was to store only encrypted metadata on the blockchain.
74.3 Methodology
74.3.1 Understanding Blockchain Distinction by Accessibility
Permissionless, or public, blockchains such as Bitcoin are completely decentralized, open, and available for anyone to join. Most use cases of permissionless blockchains are cryptocurrencies. This blockchain is slowest
among all types, as verifying data across so many participants is very tedious. It is less vulnerable to attack and control by dominant actors, where an actor can be any participant. Permissioned, or private, blockchains run on a private network that is not open to all; one can join the network only after obtaining permission from a stakeholder. Trust is weaker than in a public blockchain because validation is done by the owner or a central node. A hybrid blockchain, by definition, combines the best of both permissioned and permissionless mechanisms. Use cases in this domain require organizations to partition data and transactions by sensitivity, keeping sensitive parts in the permissioned part of the chain while storing the connections in a publicly accessible part of the hybrid chain. A consortium blockchain is by and large the same as a fully controlled private version of blockchain technology [13].
74.3.2 Understanding Blockchain Distinction by Storage
Blockchain data storage can follow two approaches, depending on whether the actual data is located off the blockchain (off-chain) or on the blockchain itself (on-chain). In the on-chain approach, popular with completely public implementations, the entire user data is stored on the blockchain. The approach is beneficial in the event of an attack: complete decentralization results in massive duplication, so the data can possibly be recovered by synchronizing from other peer nodes [14]. This security comes at the huge price of replicating full nodes across participants, which is an expensive alternative. Researchers believe that scalability is the main issue with blockchain, since on-chain storage suffers from the blockchain bloating problem as well as very high fees for storing data. To solve this problem, a second approach came into existence: off-chain storage [15]. In this approach, the users' data is not stored on-chain; instead it is stored somewhere else, which may be on the cloud [16]. Only the metadata of the user's data is stored on the blockchain. The resulting data storage is considered by far the most cost-efficient of the solutions proposed.
74.3.2.1 Blockchain On-Chain Storage
We may visualize any blockchain as a list of connected blocks, joined together using cryptographic mechanisms. Each block keeps a unique signature of the preceding block, together with timestamp information and the actual transaction data. The database is designed to be write-once and read-only thereafter. The write-once, never-delete policy ensures that immutability is preserved. Usually, transactional data is stored in a decentralized ledger, which requires 1 KB or less for a single transaction. It is, however, very expensive to store regular healthcare data such as patient prescriptions, X-rays, CT scans, and patient sensor data. It is
Table 74.1 Parameters for comparative cost analysis
| Blockchain | Avg block size | Avg block time | Avg transaction fees | Transactions per day |
|---|---|---|---|---|
| Bitcoin | 1.068 MB | 10.75 min | 156.18 USD/TXN | 207,116.00 |
| Ethereum | 0.077 MB | 0.22 min | 1.601 USD/TXN | 1.123 M |
not that this data cannot be stored on a blockchain, but the price of storing data on a blockchain ledger is very high, which prevents current applications from transitioning to blockchain-based applications. As an illustration, we take the example of the Ethereum blockchain to understand the economics of blockchain [17]. Equation (74.1) gives the general formula for the cost of storing data on-chain, where $G_p$ is the gas price, $G_l$ is the gas limit, and $T_c$ is the total cost:
$$T_c = G_p \times G_l \tag{74.1}$$
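As a minimal sketch of how Eq. (74.1) can be evaluated, the snippet below multiplies a gas price by a gas amount and converts the result from gwei to ETH and then to USD. The function names and the example inputs (40 gwei, 21,000 gas, 3,000 USD per ETH) are illustrative placeholders and are not figures taken from this paper.

```python
GWEI_PER_ETH = 10**9  # 1 ETH = 1,000,000,000 gwei


def total_cost_eth(gas_price_gwei: float, gas_limit: int) -> float:
    """Eq. (74.1): T_c = G_p * G_l, with G_p given in gwei and T_c returned in ETH."""
    return gas_price_gwei * gas_limit / GWEI_PER_ETH


def total_cost_usd(gas_price_gwei: float, gas_limit: int, eth_usd: float) -> float:
    """Convert the ETH cost to USD using an exchange rate supplied by the caller."""
    return total_cost_eth(gas_price_gwei, gas_limit) * eth_usd


# Placeholder inputs chosen only for illustration (assumptions, not paper data).
example_eth = total_cost_eth(gas_price_gwei=40, gas_limit=21_000)
example_usd = total_cost_usd(gas_price_gwei=40, gas_limit=21_000, eth_usd=3_000)
print(f"{example_eth:.5f} ETH, about {example_usd:.2f} USD")
```

Keeping the gas price, gas amount, and exchange rate as separate parameters makes it easy to re-run the calculation when any of the three changes, which they do frequently on a public network.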
To understand the equation, note that the Ethereum blockchain requires fuel to function, in the form of a unit called gas. All transactions, including transferring ether, creating and executing smart contracts, and any other activity that requires computation, incur a transaction execution fee. Every transaction has a variable gas fee that depends on the computational complexity of the operations executed in that transaction. In simple words, the gas cost increases with the complexity of the problem. As of March 7th, 2022, the gas price is 51 gwei, or approximately 0.00000005 ETH (1 ETH = 3,517.6794 USD); storing each 256-bit data item costs 20,000 gas, equivalent to 0.0904 USD [18]. The cost of storage per word is constant for the store operation. The Ethereum yellow paper [17] gives the cost of every operation, which can be used to calculate the cost of various operations. The above Ethereum cost is used for the calculations in the remainder of the paper. For the comparative cost analysis, we use the Bitcoin and Ethereum transaction statistics [18] shown in Table 74.1. We also assume that the data to be stored belongs to a single patient, with an average of 200 transactions per year and 2 KB stored per transaction on average. We further assume that the smart contract size for both use cases is 5 KB, incurred as a one-time storage cost.
$$\text{Total Storage (Annualized)} = 200 \times 2\ \text{KB} = 400\ \text{KB (variable)} \tag{74.2}$$
74.3.2.2 Blockchain Off-Chain Storage Options
We shall consider storing the data pertaining to a patient record as assumed in Eq. (74.2), with the assumption that this data may need to be changed or deleted when
required, and hence cannot be kept on the blockchain. This calls for an off-chain storage solution. Like on-chain mechanisms, off-chain storage offers various options, such as centralized, decentralized, and distributed storage, so the choice of off-chain storage depends on the application domain. A hash of the data is generated and stored on the blockchain. The actual data item is never stored on the blockchain itself; rather, it is placed in the cloud or in near-cloud storage. In summary, all data is stored as it would be in a conventional application. The only difference is that the hash kept on-chain makes it possible to confirm whether the data kept off-chain has changed.
$$\text{Cost of off-chain storage} = \text{off-chain storage cost} + \text{on-chain transaction cost} + \text{on-chain access cost} + \text{smart contract storage cost} + \text{smart contract execution cost} \tag{74.3}$$
74.3.2.3 IPFS as Off-Chain Storage Option
IPFS stands for InterPlanetary File System, a new-generation decentralized storage system that powers Web 3.0. It works on a collection of nodes running IPFS software to store files; however, to keep storage manageable, files are garbage collected on a regular basis. To keep data that must be permanent from being garbage collected, IPFS offers the option of pinning the data, either on a local node or on a cloud-based service. The local node must be kept running if the data is pinned on it, while cloud services such as Pinata may charge a premium for maintaining the data. With IPFS, only the hashes of the files are stored on the blockchain. These hashes can then be used to locate the actual file. An illustration of IPFS-based storage for healthcare data is shown in Fig. 74.1. Using Eq. (74.3), the cost of storage on an IPFS solution, with smart contract verification of the data and the cost of pinning, can be calculated as in Eq. (74.4):
$$\text{Cost of off-chain storage} = \text{off-chain storage cost} + \text{on-chain transaction cost} + \text{smart contract storage cost} + \text{smart contract execution cost [18]} + \text{pinning cost} \tag{74.4}$$
where pinning on IPFS is free up to 1 GB and costs 0.15 USD/GB per month beyond that (extra pinning cost). This cost is comparatively negligible, so we do not add it to the calculations here.
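The hash-anchoring idea described in this section can be sketched in a few lines: the record lives in an off-chain store keyed by its SHA-256 digest, and only the 32-byte digest is appended to a ledger. The classes `OffChainStore` and `Ledger` and the function `verify` are hypothetical in-memory stand-ins introduced for illustration; they are not the IPFS or Ethereum APIs, and real IPFS content identifiers use a multihash encoding and file chunking that this sketch omits.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string (32 bytes)."""
    return hashlib.sha256(data).hexdigest()


class OffChainStore:
    """Hypothetical stand-in for off-chain storage such as an IPFS node."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        key = content_hash(data)  # content-based address
        self._blobs[key] = data
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]


class Ledger:
    """Hypothetical append-only list standing in for on-chain storage."""

    def __init__(self) -> None:
        self._entries: list[tuple[str, str]] = []

    def append(self, record_id: str, digest: str) -> None:
        self._entries.append((record_id, digest))

    def latest(self, record_id: str) -> str:
        for rid, digest in reversed(self._entries):
            if rid == record_id:
                return digest
        raise KeyError(record_id)


def verify(ledger: Ledger, store: OffChainStore, record_id: str) -> bool:
    """Check that the off-chain data still matches the digest kept on the ledger."""
    digest = ledger.latest(record_id)
    return content_hash(store.get(digest)) == digest


store, ledger = OffChainStore(), Ledger()
digest = store.put(b"patient-42: prescription v1")  # illustrative record
ledger.append("patient-42", digest)
print(verify(ledger, store, "patient-42"))
```

Because only the digest is anchored, the on-chain footprint stays constant regardless of the record size, which is the property the cost comparison relies on.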
Fig. 74.1 IPFS based off-chain storage in blockchain
74.4 Results
A comparative cost analysis for the data storage requirements of Eq. (74.2) has been carried out for both use cases, on-chain storage and off-chain storage using IPFS; the results are shown in Table 74.2 and Fig. 74.2.
Table 74.2 Comparative cost analysis of on-chain versus off-chain storage
| Data to be stored | On-chain data storage cost | Off-chain data storage cost (IPFS)^a |
|---|---|---|
| 1 KB | 103.76 USD | 2.59 USD |
| 2 KB | 165.89 USD | 2.59 USD |
| 5 KB | 414.74 USD | 2.59 USD |
| 10 KB | 829.49 USD | 2.59 USD |
| 50 KB | 4147.47 USD | 2.59 USD |
| 100 KB | 8294.94 USD | 2.59 USD |
| 200 KB | 16,589.88 USD | 2.59 USD |
| 400 KB | 33,179.77 USD | 2.59 USD |
| 1000 KB | 82,949.44 USD | 2.59 USD |
^a Considering the fixed size of the hash (32 bytes) generated by IPFS using the SHA256 algorithm
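The scaling behaviour behind Table 74.2 can be sketched under simple assumptions: on-chain data is charged per 32-byte (256-bit) word, whereas the off-chain option stores a single 32-byte hash whatever the record size. The helper names, the choice of 1 KB = 1024 bytes, and the per-word price used in the loop are assumptions made for illustration; contract deployment, contract execution, and pinning costs are left out, so the sketch is not intended to reproduce the exact figures in the table.

```python
import math

WORD_BYTES = 32  # one 256-bit storage word


def words_needed(size_bytes: int) -> int:
    """Number of 32-byte words required to hold size_bytes of data."""
    return math.ceil(size_bytes / WORD_BYTES)


def on_chain_cost(size_bytes: int, cost_per_word: float) -> float:
    """Cost grows linearly with the number of words written on-chain."""
    return words_needed(size_bytes) * cost_per_word


def off_chain_cost(cost_per_word: float, hash_bytes: int = 32) -> float:
    """Only the fixed-size hash is written on-chain, so the cost is constant."""
    return words_needed(hash_bytes) * cost_per_word


# Placeholder per-word price in USD (an assumption for this sketch).
PLACEHOLDER_COST_PER_WORD_USD = 2.59

for kb in (1, 10, 400):
    size = kb * 1024  # assuming 1 KB = 1024 bytes
    on = on_chain_cost(size, PLACEHOLDER_COST_PER_WORD_USD)
    off = off_chain_cost(PLACEHOLDER_COST_PER_WORD_USD)
    print(f"{kb} KB: on-chain {on:.2f} USD, off-chain {off:.2f} USD")
```

Under these assumptions the ratio between the two options equals the number of words in the record, which is why the gap widens linearly with data size.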
Fig. 74.2 Comparative cost comparison chart
74.5 Conclusions and Future Work
The results clearly indicate that the savings in gas costs are huge when working with off-chain storage mechanisms such as IPFS. The paper has not considered the cost of running IPFS verification through a smart contract, which is necessary in the case of off-chain storage. Including this computation adds the cost of storing the smart contract on-chain and the cost of executing the smart contract for verification each time the data is accessed. Accounting for this cost would require a break-even analysis of the minimum amount of data that must be stored off-chain to offset the overheads. The present paper has discussed only IPFS, and it would be interesting to see how other storage services, such as Storj, fare in comparison. New "zero gas fee" blockchains such as Bitgert [19], which are compatible with Ethereum, could also be explored in a comparative cost analysis.
References
1. Do, H.G., Ng, W.K.: Blockchain-based system for secure data storage with private keyword search. In: Proceedings of the 2017 IEEE 13th World Congress on Services, SERVICES 2017, pp. 90–93 (2017). https://doi.org/10.1109/SERVICES.2017.23
2. Kumar, S., Bharti, A.K., Amin, R.: Decentralized secure storage of medical records using Blockchain and IPFS: a comparative analysis with future directions. Secur. Priv. 4(5) (2021). https://doi.org/10.1002/spy2.162
3. Dimitrov, D.V.: Blockchain applications for healthcare data management. Healthc. Inform. Res. 25(1), 51–56 (2019). https://doi.org/10.4258/hir.2019.25.1.51
4. Henry, R., Herzberg, A., Kate, A.: Blockchain access privacy: challenges and directions. IEEE Secur. Priv. 16(4), 38–45 (2018). https://doi.org/10.1109/MSP.2018.3111245
5. Khan, D., Jung, L.T., Hashmani, M.A.: Systematic literature review of challenges in blockchain scalability. Appl. Sci. (Switzerland) 11(20) (2021). https://doi.org/10.3390/app11209372
6. Cyran, M.A.: Blockchain as a foundation for sharing healthcare data. Blockchain Health Today (2018). https://doi.org/10.30953/bhty.v1.13
7. Eberhardt, J., Heiss, J.: Off-chaining Models and Approaches to Off-chain Computations, pp. 7–12. ISBN: 9781450361101
8. Gao, H., Ma, Z., Luo, S., Xu, Y., Wu, Z.: BSSPD: a blockchain-based security sharing scheme for personal data with fine-grained access control. Wirel. Commun. Mob. Comput. 2021 (2021). https://doi.org/10.1155/2021/6658920
9. Gordon, W.J., Catalini, C.: Blockchain technology for healthcare: facilitating the transition to patient-driven interoperability. Comput. Struct. Biotechnol. J. 16, 224–230 (2018). https://doi.org/10.1016/j.csbj.2018.06.003
10. Dai, M., Zhang, S., Wang, H., Jin, S.: A low storage room requirement framework for distributed ledger in blockchain. IEEE Access 6(c), 22970–22975 (2018). https://doi.org/10.1109/ACCESS.2018.2814624
11. Zheng, Q., Li, Y., Chen, P., Dong, X.: An innovative IPFS-based storage model for blockchain. In: Proceedings of the 2018 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2018, pp. 704–708 (2019). https://doi.org/10.1109/WI.2018.000-8
12. Zahed Benisi, N., Aminian, M., Javadi, B.: Blockchain-based decentralized storage networks: a survey. J. Network Comput. Appl. 162 (2020). https://doi.org/10.1016/j.jnca.2020.102656
13. Siyal, A.A., Junejo, A.Z., Zawish, M., Ahmed, K., Khalil, A., Soursou, G.: Applications of blockchain technology in medicine and healthcare: challenges and future perspectives. Cryptography 3(1), 3 (2019). https://doi.org/10.3390/cryptography3010003
14. Ko, T., Lee, J., Ryu, D.: Blockchain technology and manufacturing industry: real-time transparency and cost savings. Sustainability (Switzerland) 10(11) (2018). https://doi.org/10.3390/su10114274
15. Zhang, C., Xu, C., Wang, H., Xu, J., Choi, B.: Authenticated keyword search in scalable hybrid-storage blockchains. In: Proceedings of the 37th IEEE International Conference on Data Engineering, pp. 996–1007 (2021)
16. Shah, M., Shaikh, M., Mishra, V., Tuscano, G.: Decentralized cloud storage using blockchain. In: Proceedings of the 4th International Conference on Trends in Electronics and Informatics, ICOEI 2020, pp. 384–389 (2020). https://doi.org/10.1109/ICOEI48184.2020.9143004
17. Wood, G.: Ethereum: a secure decentralized generalized transaction ledger. Ethereum.github.io, 2014. [Online]. Available: https://ethereum.github.io/yellowpaper/paper.pdf. [Accessed: 05 March 2022]
18. https://ycharts.com/indicators/reports
19. https://bitgert.com
20. https://medium.com/the-capital/how-much-does-it-cost-to-deploy-a-smart-contract-on-ethereum-11bcd64da1
21. https://tradeblock.com/blog/analysis-of-bitcoin-transaction-size-trends
Chapter 75
Analysis of Various Toxic Gas Levels Using 5G ML-IoT for Air Quality Monitoring and Forecasting
Sumeet Gupta, Paruchuri Chandra Babu Naidu, Vasudevan Kuppan, M. Alagumeenaakshi, R. Niruban, and J. N. Swaminathan
Abstract Air pollution is one of the most pressing problems facing the world today. It is a major cause of global warming and causes various health issues in living organisms. Growing digital technology can help to monitor air pollution and may point to ways of preventing it. This paper presents an Artificial Intelligence (AI)-based, Machine Learning (ML)-empowered Internet of Things (IoT) technology for air quality monitoring and forecasting. The proposed technology measures Carbon Monoxide (CO), Sulfur Dioxide (SO2), Nitrogen Dioxide (NO2), Ozone (O3), and Particulate Matter (PM) levels in the air. It uses ML techniques to estimate the air quality index and to forecast it. Using the forecast data, suitable policies can be framed to reduce air pollution. The air quality index obtained is displayed as a color bar graph, where the color indicates the level of the air quality index. The measurements are fed directly to a cloud server through IoT, and forecasting is carried out using the ML technique. The results show the level of air pollution and how hazardous it is, helping people to know the level of air pollution and adopt sustainable development.
Keywords Internet of Things · Artificial intelligence · Machine learning · Air quality
S. Gupta (B), University of Petroleum and Energy Studies, Dehradun, India, e-mail: [email protected]
P. C. B. Naidu, V R Siddhartha Engineering College, Kanuru, Vijayawada, Andhra Pradesh, India
V. Kuppan · J. N. Swaminathan, QIS College of Engineering and Technology, Ongole, Andhra Pradesh 523272, India
M. Alagumeenaakshi, Kumaraguru College of Technology, Coimbatore, India
R. Niruban, St. Joseph's College of Engineering, OMR, Chennai, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6_75
75.1 Introduction
The growing population has increased the demand for energy and transport, without which day-to-day life could not function. Increased energy demand (met by energy sectors burning coal in thermal plants) and road transport are the main sources of air pollution [1]. Air pollutants such as Carbon Monoxide (CO), Sulfur Dioxide (SO2), Nitrogen Dioxide (NO2), Ozone (O3), and Particulate Matter (PM) are major contributors to global warming and also cause various health issues in living organisms [2, 3]. Research has also identified that people with lung diseases and respiratory problems who are exposed to air pollution for long durations have a 27% chance of fatality [4–6]. Air pollution also increases the risk that a healthy person develops lung diseases and respiratory problems when exposed to it for long durations [7]. These harmful gases come from vehicles, industrial smoke, and natural sources, and in India are commonly caused by the combustion of coal in power plants [8]. Studies have also identified that inhaling air pollutants such as CO, SO2, NO2, and O3 carries a higher risk of initiating or worsening asthma and can provoke chronic illnesses such as Chronic Obstructive Pulmonary Disease (COPD) and emphysema, which can progress to lung cancer and cardiovascular diseases and finally prove fatal [9]. A recent meta-analysis identified that patients with COPD have a higher chance of COVID-19 infection [10]. Transportation and road traffic cause additional air pollution [11–15] in rural and urban areas. Emission estimation models have been developed to assess the contribution of road traffic to air pollution, and much research has been conducted to develop models that predict and evaluate the contribution of road traffic and transportation to air pollution.
This paper presents an AI-based, ML-empowered IoT technology for air quality monitoring and forecasting. The proposed technology measures Carbon Monoxide (CO), Sulfur Dioxide (SO2), Nitrogen Dioxide (NO2), Ozone (O3), and Particulate Matter (PM) levels in the air. It uses ML techniques to estimate the air quality index and to forecast it. Using the forecast data, suitable policies can be framed to reduce air pollution. The air quality index obtained is displayed as a color bar graph, where the color indicates the level of the air quality index. The measurements are fed directly to a cloud server through IoT, and forecasting is carried out using the ML technique. The results show the level of air pollution and how hazardous it is, helping people to know the level of air pollution and adopt sustainable development.
Table 75.1 Sensors used to measure gas in the assessment
| Name | Formula | Sensor |
|---|---|---|
| Nitrogen dioxide | NO2 | 110-508-ND |
| Carbon monoxide | CO | 1684-1001-ND |
| Carbon dioxide | CO2 | SCD-30 |
| Ozone | O3 | 1684-1043-ND |
75.2 Materials and Analysis Methods
The materials and methods used in the investigation are discussed in this section. Sensors were used to measure the air pollutant gases.
75.2.1 Sensors
The sensors used in the analysis are listed in Table 75.1. The parameters measured in the study are Carbon Monoxide (CO), Sulfur Dioxide (SO2), Nitrogen Dioxide (NO2), Ozone (O3), and Particulate Matter (PM). Nitrogen Dioxide (NO2) is a brown or orange gas; the 110-508-ND sensor is used to measure it. Carbon Monoxide (CO) is a colorless and odorless gas, and Sulfur Dioxide (SO2) is colorless with a pungent odor; the 1684-1001-ND sensor is used to measure CO, CO2, and SO2. Ozone (O3) is very reactive; the 1684-1043-ND sensor is used to measure it.
75.2.2 Need for Assessing NO2, CO, SO2, and O3 Gas Status in India
The surface-level NO2, CO, SO2, and O3 concentrations were collected from the Copernicus Sentinel-5P satellite. Sentinel-5 Precursor is a space mission developed by the European Space Agency (ESA) to monitor air quality worldwide. For the analysis, the average NO2 concentration over India is depicted in Fig. 75.1a; NO2 concentrations of 30–42 µmol/m² are observed in the major cities of India. Figure 75.1b depicts the CO concentration over India, which is about 57 to 93 ppb (parts per billion) in major cities. This is very high and highly dangerous for living organisms. Figure 75.1c depicts the SO2 level in India. SO2 levels in India are observed to be above the recommended threshold, at an average of 100 µg/m³. Exposure to high levels of SO2 can cause numerous health issues. Figure 75.1d depicts the O3 concentration globally. The recommendation is a limit of only 0.1 ppm
Fig. 75.1 Assessing NO2, CO, SO2, and O3 gas status in India
O3 gas exposure. Humans exposed to O3 at 5 ppm or higher are at high risk of health issues, which can sometimes lead to death.
75.2.3 Model Layers for the Proposed ML Empowered IoT-Based Air Quality Monitoring and Forecasting Techniques
The proposed ML-empowered IoT-based air quality monitoring and forecasting technique has three model layers: the physical layer, the network layer, and the application layer, as shown in Fig. 75.2. In IoT, the layers make up the architecture, acting as a medium between the computerized system and the real world. The physical layer plays a vital role as the medium that converts analog signals into digital signals and digital signals into analog signals. It consists of sensors, actuators, modules, and devices. The sensors measure real-world data and pass it to the module. The actuators control and perform any physical action required. Meanwhile, the modules and devices control both sensors and actuators based on signals or commands. The second layer is the network layer, which acts as the gateway from the modules and devices to the application layer. The network layer consists of a Local Area Network (LAN) or Wide Area Network (WAN) with Wi-Fi, Bluetooth, Near Field Communication, ZigBee, and cellular connectivity. In this layer, the data is also processed and aggregated to gather business intelligence. Here IoT systems connect to middleware or software that can interpret the data more precisely. The
Fig. 75.2 Developed IoT model layers for the ML empowered IoT-based air quality monitoring and forecasting techniques
final layer is the application layer, which includes a cloud server with various analytical tools, such as business decision-making software, device control and monitoring tools, Machine Learning and Artificial Intelligence tools, and a mobile application for further interaction. In the developed model, ML and AI tools are used for further analysis of the data. Every IoT system is built with specific objectives and targets to match business requirements. At present, most IoT applications operate at varying levels of complexity and use a large number of technology stacks, each performing specific tasks to deliver real-time solutions.
75.3 Proposed Methodology and Its Function

The function of the proposed ML empowered IoT-based air quality monitoring and forecasting technique is shown in Fig. 75.3. In the developed methodology, the experimental setup consists of the sensors discussed in Sect. 75.2 and an Arduino Uno (ATmega328) with an ESP Wi-Fi module. These instruments act as the physical layer, which reads the gas data from the real world and converts them to digital data through the module. The sensors use one-way communication, only to read the data. The actuators, such as the switch used in the study, use two-way communication: they transmit signals to the module and receive signals from it. The digital signal processed by the module is transferred to the cloud server of the application layer, passing through the LAN or WAN in the network layer. The transferred data is encrypted and password protected. The cloud platform used is ThingSpeak, powered by MATLAB. In the
Fig. 75.3 Function of the proposed methodology
ThingSpeak, a public channel must be created with as many fields as the data to be written requires. During operation, the Arduino Uno sends data to the cloud platform through the Wi-Fi module. ThingSpeak is an open-source IoT application platform and API for storing and retrieving data from things using the HTTP protocol over a WAN or LAN. The monitoring results can be viewed in graphical form on the page provided by ThingSpeak. Furthermore, the data recorded in ThingSpeak can be processed further as per the workflow for ML analysis. In the workflow, the historical data is first collected and analyzed. Next, using those data as fitting data, a NARX neural network is created with the available input data, as shown in Fig. 75.4. The NARX network works well for near-term predictions, and the fitting data network is suitable for long-term prediction. Inputs to the neural networks include historical and currently read gas values. In this analysis, the values are predicted using Neural Net Time Series from the NN toolbox of the MATLAB/ThingSpeak data analytics tool. Finally, the values obtained are visualized using the data analytics tool.
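As a rough sketch of how a set of readings could reach the channel over HTTP, the snippet below builds a request URL for ThingSpeak's update endpoint. The field assignment, the placeholder key, and the sample values are assumptions for illustration only; the paper does not state which field carries which gas.

```python
from urllib.parse import urlencode

THINGSPEAK_UPDATE_URL = "https://api.thingspeak.com/update"

# Hypothetical field assignment; the paper does not specify this mapping.
FIELD_MAP = {"CO": 1, "SO2": 2, "NO2": 3, "O3": 4, "PM2.5": 5}


def build_update_url(write_api_key: str, readings: dict) -> str:
    """Return the URL that writes one entry with the given readings to a channel."""
    params = {"api_key": write_api_key}
    for gas, value in readings.items():
        params[f"field{FIELD_MAP[gas]}"] = value
    return f"{THINGSPEAK_UPDATE_URL}?{urlencode(params)}"


# Placeholder key and made-up sample values, not measured data.
url = build_update_url("YOUR_WRITE_KEY", {"CO": 410, "O3": 35})
print(url)
# https://api.thingspeak.com/update?api_key=YOUR_WRITE_KEY&field1=410&field4=35
```

Issuing an HTTP GET on the resulting URL (for example with `urllib.request.urlopen`) would write one entry to the channel; the firmware on the Arduino would presumably issue an equivalent request through the Wi-Fi module.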
75.4 Results and Discussion

An extensive analysis has been carried out using the developed model. In the analysis, the digital signal processed by the module is transferred to the cloud server of the application layer. The signals pass through the LAN or WAN in the network layer and are read in the ThingSpeak cloud, powered by MATLAB, as shown in Fig. 75.5. In the channel, a total of 8 fields have been created to read the sensor values. Figure 75.6 depicts the SO2 value obtained in the investigation. The graph shows the SO2 value present in the atmosphere, measured in ppb. Figure 75.7 depicts the CO2 value obtained in the investigation. The graph shows the CO2 value present in the atmosphere, measured in ppb, and it can be observed that the value crosses the average value. Figure 75.8 depicts the O3 value
Fig. 75.4 NARX neural network prediction
obtained in the investigation. The graph shows the O3 value present in the atmosphere, measured in ppb; the measured ozone value is within the limit, i.e., < 70 ppb. Figure 75.9 depicts the NO2 value obtained in the investigation. The graph shows the NO2 value present in the atmosphere, and it can be observed that the value is higher than on usual days due to high traffic and industrial activity. Figure 75.10 depicts the CO value obtained in the investigation. The graph shows the CO value present in the atmosphere, measured in ppb, and it can be observed that the value crosses the average value. Figure 75.11 depicts the PM2.5 value obtained in the investigation. The graph shows the PM2.5 value present in the atmosphere. The value of PM2.5 should be below 12 µg/m3. As per Table 75.2, the PM2.5 average value is rounded off to obtain the air quality index. When the air quality value falls between 0 and 150, the air quality is good. When it falls between 150 and 500, the air quality is moderate. When it falls between 501 and 550, the air quality is unhealthy. Further, when it falls between 551 and 1000, the air quality is totally unhealthy. In the assessment, the air quality is found to be moderate, and it crossed into the unhealthy range once due to high traffic and industrial smoke. The ML-based NARX training is carried out in MATLAB using the data read from the ThingSpeak channel, as shown in Fig. 75.12. The corresponding predicted data for Carbon Monoxide (CO), Sulfur Dioxide (SO2), Nitrogen Dioxide (NO2), and Ozone (O3) for the upcoming years is shown in Fig. 75.13. From the analysis
Fig. 75.5 ThingSpeak channel data
it has been determined that the air quality will deteriorate by 8% to 10% in the upcoming decade. Through the analysis of the predicted values, necessary steps can be taken to reduce the negative impact on air quality caused by Carbon Monoxide (CO), Sulfur Dioxide (SO2), Nitrogen Dioxide (NO2), and Ozone (O3) gases.
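The structure of the NARX prediction used above can be illustrated with a short sketch. A NARX model predicts y(t) from the previous d outputs and the previous d exogenous inputs, and multi-step forecasts feed each prediction back in as an input. The authors trained their network with the MATLAB toolbox; the code below does not reproduce that training. It only shows the lagged input structure and the closed-loop forecast, with `toy_model` as a hypothetical stand-in for a trained network and all numbers made up for illustration.

```python
import math


def make_training_pairs(y, x, delays):
    """Build (regressors, target) pairs: [y(t-1..t-d), x(t-1..t-d)] -> y(t)."""
    pairs = []
    for t in range(delays, len(y)):
        past_y = [y[t - k] for k in range(1, delays + 1)]
        past_x = [x[t - k] for k in range(1, delays + 1)]
        pairs.append((past_y + past_x, y[t]))
    return pairs


def closed_loop_forecast(model, y_hist, x_hist, x_future, delays):
    """Forecast len(x_future) steps, feeding each prediction back as an input.

    x_future[i] is the exogenous value at the i-th forecast time; it becomes
    an input for the following step.
    """
    y, x = list(y_hist), list(x_hist)
    forecasts = []
    for x_next in x_future:
        regressors = [y[-k] for k in range(1, delays + 1)]
        regressors += [x[-k] for k in range(1, delays + 1)]
        y_pred = model(regressors)
        forecasts.append(y_pred)
        y.append(y_pred)   # feedback of the predicted output
        x.append(x_next)
    return forecasts


def toy_model(r):
    # Hypothetical stand-in for a trained network with delays = 2:
    # r = [y(t-1), y(t-2), x(t-1), x(t-2)]
    return 0.5 * r[0] + 0.1 * r[1] + 0.2 * math.tanh(r[2])


# Made-up series, not measured data.
pairs = make_training_pairs([1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0], delays=2)
forecast = closed_loop_forecast(toy_model, [0.4, 0.5], [1.0, 0.8], [0.9, 0.7, 0.6], delays=2)
print(pairs[0])       # ([2.0, 1.0, 20.0, 10.0], 3.0)
print(len(forecast))  # 3
```

Because every forecast step reuses earlier predictions, errors can accumulate over long horizons, which is consistent with the remark in Sect. 75.3 that the NARX network suits near-term prediction.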
75.5 Conclusion

An Artificial Intelligence (AI)-based Machine Learning (ML) empowered Internet of Things (IoT) technology for air quality monitoring and forecasting has
Fig. 75.6 Measured SO2 value
been developed and studied in this paper. The developed technology measures Carbon Monoxide (CO), Sulfur Dioxide (SO2), Nitrogen Dioxide (NO2), Ozone (O3), and Particulate Matter (PM) levels in the air. Further, the proposed technology uses intelligent ML techniques to estimate the air quality index and to forecast it. From the obtained results, it is identified that PM2.5 is moderate and a few gases are at hazardous levels. Further, the forecasting analysis carried out with the ML technique indicates that the air quality will deteriorate by 8% to 10% in the upcoming decade. The results show the level of air pollution and how hazardous it is, and they help people understand that level and adopt substantial measures to reduce air pollution.
Fig. 75.7 Measured CO2 value
Fig. 75.8 Measured O3 value
Fig. 75.9 Measured NO2 value
Fig. 75.10 Measured CO value
Fig. 75.11 Estimated PM 2.5 value using ML analysis
Table 75.2 Air quality index
PM2.5 (µg/m3)    Air quality value
0–12.0           0–150, excellent
12.1–35.4        150–500, acceptable
35.5–55.4        501–550, unhealthy for people with breathing disorders
55.5–150.4       Unhealthy for all living beings
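The lookup implied by Table 75.2 can be sketched as a small function. The band limits and labels are transcribed from the table; rounding the concentration to one decimal place (to bridge the gaps between bands such as 12.0 and 12.1) and the message for values above the last band are assumptions, and the function name is hypothetical.

```python
# Upper limits (µg/m3) and labels transcribed from Table 75.2.
PM25_BANDS = [
    (12.0, "excellent"),
    (35.4, "acceptable"),
    (55.4, "unhealthy for people with breathing disorders"),
    (150.4, "unhealthy for all living beings"),
]


def pm25_category(concentration: float) -> str:
    """Map a PM2.5 concentration in µg/m3 to its Table 75.2 category."""
    if concentration < 0:
        raise ValueError("concentration must be non-negative")
    value = round(concentration, 1)  # assumed 0.1 µg/m3 resolution, as in the table
    for upper, label in PM25_BANDS:
        if value <= upper:
            return label
    return "above the range covered by Table 75.2"


print(pm25_category(8.0))   # excellent
print(pm25_category(40.0))  # unhealthy for people with breathing disorders
```

Storing the bands as an ordered list of upper limits keeps the function short and makes it easy to adjust if a different breakpoint table is adopted.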
Fig. 75.12 Average error in NARX iteration
Fig. 75.13 Predicted air pollution value using ML analytics
Author Index
A Aarthi, R., 689 Aditya Agarwal, 111 Aishwarya Krishnamurthy, 523 Akhil Reddy, M., 535 Alagumeenaakshi, M., 789 Anagha Aher, 697 Anantshesh Katti, 191 Anchuri Lokeshwar, 351 Anita, H. B., 199 Ankita Gandhi, 417 Anshita Malviya, 239 Archana Bhusri, 627 Archana Singh, 585 Archit Tiwari, 11 Arpan Desai, 251 Ashna Shah, 57 Ashwin Makwana, 381 Ashwini S. Kunte, 741 Asmita Nirmal, 101
B Babita Yadav, 779 Bharathi, M. A., 49 Bhargav Vyas, 559 Bhavesh Chauhan, 363 Boomika, G., 285
C Chandi Priya Reddy, G., 659 Chandni Upadhyaya, 251 Chilupuri Supriya, 351
D Debalina Barik, 653 Deepak Jayaswal, 101 Deepak Kumar, 397 Devender Kumar Beniwal, 397 Devesh Kumar Srivastava, 27 Dhirendra Kumar Shukla, 411 Diane Gan, 547 Dilbag Singh, 121 Dilip G. Khairnar, 19 Dip Patel, 559 Divyanshu Sinha, 363 Dixit, A., 503
F Fabricio Javier Santacruz-Sulca, 723 Febin Antony, 199 Fernando Molina-Granja, 723
G Gargi Bhakta, 627 Gargi Patel, 495 Georg Gutjahr, 209 Girubalini, S., 267 Gresha Bhatia, 523 Guruprasad Deshpande, 495
H Haftom Gebregziabher, 593 Hard Parikh, 603 Hari Mohan Pandey, 585 Harini, S., 689 Harsh Mody, 217
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 J. Choudrie et al. (eds.), IOT with Smart Systems, Smart Innovation, Systems and Technologies 312, https://doi.org/10.1007/978-981-19-3575-6
Harsh Parikh, 217 Harsh Shah, 697 Harshvardhan Gaikwad, 765 Hema Gaikwad, 765 Heta Dasondi, 437 Himabindu, K., 659 Himanshu Behra, 523
I Ikjot Khurana, 523 Ismail Mohideen, S., 411
J Jagruti N. Patel, 91 Jatinderkumar R. Saini, 329, 341, 765 Jayakrishnan, R., 457 Jayesh Kolhe, 495 Jaykumar Dave, 643 Jaynam Shah, 697 Jha, C. K., 155 Jincy A. George, 199 Juneja, P. K., 503 Jyoti Chaudhary, 155 Jyotsna, C., 181
K Kalpdrum Passi, 167 Kannan, M., 375 Kanti Singh Sangher, 585 Karthika, I., 285 Karthik, G., 721 Karuna Pandith, 613 Kaushiki Nagpure, 485 Kaustubh Sawant, 259 Ketaki Despande, 429 Killol Pandya, 251 Kode Sai Vikshit, 209 Kumar, P., 509 Kumkum Saxena, 217 Kuna Harsha Kumar, 447
L Lakshmi Kalyani, 585 Lakshmi Prabha, B. G., 267 Li-Wei Lin, 67
M Mahaadevan, K. R., 669 Mandala Nischitha, 351
Mandeep Kaur, 469 Mani, R. S., 627 Manish Kumar Abhishek, 317 Manojkumar Shahu, 57 Mansi Gupta, 81 Mantoo, A. K., 627 Meghana Kumar, K. J., 49 Meghna B. Patel, 91, 437 Menaka, R., 375 Milton Paul López Ramos, 723 Mitali Acharya, 417 Mittal Desai, 559 Mittal N. Desai, 669 Mohamed Ali Kazi, 547 Mohamed Ismail, A., 411 Mohammad Wazid, 37 Monga Suhasini, 121 Mridul Sharma, 469 Mullapudi Mani Shankar, 209 Muthukumaaran, S. K., 689
N Nada Rajguru, 697 Nageswara Rao Moparthi, 757 Nahid Shaikh, 697 Nangunuri Shiva Krishna, 535 Narasimha Reddy Soora, 709 Naseela Pervez, 111 Neela Madheswari, A., 375 Neeraj Patil, 217 Neha Rupesh Thakur, 741 Nidhi Patel, 57 Nikita Solanki, 57 Niranajan Chiplunkar, 613 Niranjan Polala, 709 Niruban, R., 789 Nisha, R., 285 Nudurupati Prudhvi Raju, 181
P Pakhala Rohit Reddy, 721 Pandey, J. P., 363 Paruchuri Chandra Babu Naidu, 789 Pavithra, G., 299 Pavithra, L., 299 Prachi Desai, 417 Pradeep, M., 447 Praful Gharpure, 73 Prafulla B. Bafna, 329, 341 Prajwal, K. S., 721 Pranitha, B., 267
Prasanna, A., 411 Prashant Chintal, 593 Pratik Phadke, 485 Preethi, B., 299 Premkumar, M., 411 Prince Hirapar, 559 Priya, L., 509 Priyank Thakkar, 133 Priya, P., 267 Priyanka Nair, 27
R Rajan, P. K., 485 Rajanala Malavika, 351 Rajanidi Ganesh Phanindra, 181 Rajani, P. K., 495 Rajat Pandey, 251 Raj Davande, 559, 669 Rajendra Kumar Dwivedi, 229, 239 Rajeswara Rao, D., 317 Rakesh Kumar, 229 Ramya Rajamanickam, 627 Ramya, R. S., 603 Raúl Lozada-Yánez, 723 Ravi Kumar, 627 Reshma, A. S., 689, 721 Roheet Bhatnagar, 27 Ronak B. Patel, 91 Ronakkumar Patel, 133
S Sachin Gupta, 779 Sai Kiran, P., 721 Saini, P., 503 Sameer Nanivadekar, 259 Sampada Nemade, 429 Sangita Patil, 485 Santosh Chebolu, 689 Sarath Kumar, 341 Saravanan, R., 689 Sarvesh Waghmare, 1 Satyen M. Parikh, 91, 437 Selvarathi, C., 447 Shaikh Shahin Sirajuddin, 19 Shalini Goel, 11 Shalini, M., 285 Shankara Nand Mishra, 627 Shanta Phani, 653 Sharfuddin Waseem Mohammed, 709 Sharia Saman, 709 Shashank Barthwal, 37
Shashi Mehrotra, 311 Shivanath Nikhil, B., 721 Shraddha V. Thakkar, 643 Shristi Rauniyar, 569 Shubham Khairnar, 259 Shubhangi Gupta, 81 Shwetali Jadhav, 429 Singh, D. P., 37 Sneha Thombre, 429 Sonal Jain, 259 Sridevi, S., 457 Srigayathri, M., 267 Srivarshini, S. P., 285 Steve Woodhead, 547 Subrahmanyam, K., 317 Sujasre, J. R., 299 Sujatha, C. N., 659 Sujay Kalakala, 167 Sumana, M., 191 Sumanth, M., 311 Sumathi, 613 Sumeet Gupta, 789 Sumit Pundir, 37 Sunil Jardosh, 381 Sunori, S. K., 503 Surendra Shetty, 613 Suresh Sankaranarayanan, 111 Sutirtha Kumar Guha, 653 Swaminathan, J. N., 723, 789 Swapnil Agrawal, 569 Swarad Hajarnis, Cdt., 259 Swathi, S., 659 Swetha Pesaru, 535 Syed Zishan Ali, 81
T Tandon, V. N., 627 Taranjeet Singh, 627 Thania Vivek, 181 Trishla Kumari, 229 Trushit Upadhyaya, 251 Tushar Champaneria, 381 Tushar Kotkar, 485
U Umamaheswari, A., 375 Upesh Patel, 251
V Vaibhav Vyas, 155 Vaidehi Bhagwat, 523
Vamsi Krishna Munjuluri, V. S., 209 Vanniyammai, R., 299 Varshini, P., 689 Varshney, N., 503 Vasudevan Kuppan, 789 Venugopal, K.R., 603 Vignesh, G. D., 723 Vignesh, V., 721 Vijayakumar, R., 375 Vikrant Shaga, 593 Vineet Kumar, 397 Vinodh Kumar, S., 509 Vinodkumar Bhutnal, 757 Vipasha Sharma, 81 Vishakha Chivate, 429 Vivek Gupta, 627
W Waiel Tinwala, 569
X Xuan-Gang, 67
Y Yashvi Soni, 57 Yerrolla Chanti, 351 Yogeshwar Kosta, 251 Yu Liu, 143