Smart Innovation, Systems and Technologies 282
Vikrant Bhateja Suresh Chandra Satapathy Carlos M. Travieso-Gonzalez T. Adilakshmi Editors
Smart Intelligent Computing and Applications, Volume 1 Proceedings of Fifth International Conference on Smart Computing and Informatics (SCI 2021)
Smart Innovation, Systems and Technologies Volume 282
Series Editors Robert J. Howlett, Bournemouth University and KES International, Shoreham-by-Sea, UK Lakhmi C. Jain, KES International, Shoreham-by-Sea, UK
The Smart Innovation, Systems and Technologies book series encompasses the topics of knowledge, intelligence, innovation and sustainability. The aim of the series is to make available a platform for the publication of books on all aspects of single and multi-disciplinary research on these themes in order to make the latest results available in a readily-accessible form. Volumes on interdisciplinary research combining two or more of these areas is particularly sought. The series covers systems and paradigms that employ knowledge and intelligence in a broad sense. Its scope is systems having embedded knowledge and intelligence, which may be applied to the solution of world problems in industry, the environment and the community. It also focusses on the knowledge-transfer methodologies and innovation strategies employed to make this happen effectively. The combination of intelligent systems tools and a broad range of applications introduces a need for a synergy of disciplines from science, technology, business and the humanities. The series will include conference proceedings, edited collections, monographs, handbooks, reference books, and other relevant types of book in areas of science and technology where smart systems and technologies can offer innovative solutions. High quality content is an essential feature for all book proposals accepted for the series. It is expected that editors of all accepted volumes will ensure that contributions are subjected to an appropriate level of reviewing process and adhere to KES quality principles. Indexed by SCOPUS, EI Compendex, INSPEC, WTI Frankfurt eG, zbMATH, Japanese Science and Technology Agency (JST), SCImago, DBLP. All books published in the series are submitted for consideration in Web of Science.
More information about this series at https://link.springer.com/bookseries/8767
Vikrant Bhateja · Suresh Chandra Satapathy · Carlos M. Travieso-Gonzalez · T. Adilakshmi Editors
Smart Intelligent Computing and Applications, Volume 1 Proceedings of Fifth International Conference on Smart Computing and Informatics (SCI 2021)
Editors Vikrant Bhateja Department of Electronics and Communication Engineering Shri Ramswaroop Memorial Group of Professional Colleges (SRMGPC) Lucknow, Uttar Pradesh, India Dr. A.P.J. Abdul Kalam Technical University Lucknow, Uttar Pradesh, India Carlos M. Travieso-Gonzalez Signals and Communications Department University of Las Palmas de Gran Canaria Las Palmas de Gran Canaria, Spain
Suresh Chandra Satapathy School of Computer Engineering Kalinga Institute of Industrial Technology (KIIT) Bhubaneswar, Odisha, India T. Adilakshmi Department of Computer Science and Engineering Vasavi College of Engineering Hyderabad, Telangana, India
ISSN 2190-3018 ISSN 2190-3026 (electronic) Smart Innovation, Systems and Technologies ISBN 978-981-16-9668-8 ISBN 978-981-16-9669-5 (eBook) https://doi.org/10.1007/978-981-16-9669-5 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022, corrected publication 2022 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
Organizing Team: SCI-2021
Chief Patrons Sri M. Krishna Murthy, Secretary, VAE Sri P. Balaji, CEO, VCE
Patron Dr. S. V. Ramana, Principal, VCE
Honorary Chair Dr. Lakhmi Jain, Australia
General Chairs Dr. Margarita N. Favorskaya, Reshetnev Siberian State University of Science and Technology, Krasnoyarsk, Russia Dr. Suresh Chandra Satapathy, KIIT DU, Bhubaneswar, Odisha, India
Program Chair Dr. T. Adilakshmi, Professor and HOD, Department of CSE, Vasavi College of Engineering, Hyderabad, Telangana, India
Publication Chairs Dr. Carlos M. Travieso-Gonzalez, Professor and Head of Signals and Communications Department (DSC), IDeTIC, University of Las Palmas de Gran Canaria (ULPGC), Spain Dr. Nagaratna P. Hegde, Professor, Department of CSE, Vasavi College of Engineering, Hyderabad, Telangana, India Dr. Vikrant Bhateja, Department of ECE, Shri Ramswaroop Memorial Group of Professional Colleges (SRMGPC), Lucknow, Uttar Pradesh, India
Editorial Board Dr. Carlos M. Travieso-Gonzalez, Professor & Head of Signals and Communications Department (DSC), IDeTIC, University of Las Palmas de Gran Canaria (ULPGC), Spain Dr. Margarita N. Favorskaya, Reshetnev Siberian State University of Science and Technology, Krasnoyarsk, Russia Dr. Suresh Chandra Satapathy, School of Computer Engineering, Kalinga Institute of Industrial Technology (KIIT), Bhubaneswar, Odisha, India Dr. T. Adilakshmi, Department of CSE, Vasavi College of Engineering, Hyderabad, Telangana, India Dr. Vikrant Bhateja, Department of ECE, Shri Ramswaroop Memorial Group of Professional Colleges (SRMGPC), Lucknow, Uttar Pradesh, India
Publicity Chairs Mr. S. Vinay Kumar, Assistant Professor, Department of CSE, Vasavi College of Engineering, Hyderabad, Telangana, India Mr. M. Sashi Kumar, Assistant Professor, Department of CSE, Vasavi College of Engineering, Hyderabad, Telangana, India
Technical Program Committee Dr. Badrinath G. Srinivas, Amazon Development Centre, Hyderabad, TS, India Dr. A. Bapi Raju, IIIT Hyderabad, Hyderabad, TS, India Dr. G. Vijaya Kumari, JNTUH, Hyderabad, TS, India Dr. K Shyamala, OU, Hyderabad, TS, India Dr. K. Sampath, TCS, Hyderabad, TS, India Dr. M. M. Gore, MNNIT, Allahabad, UP, India Dr. Naveen Sivadasan, TCS Innovation Labs, Hyderabad, TS, India Dr. P. Radha Krishna, NIT Warangal, Warangal, TS, India Dr. Rajendra Hegadi, IIIT Dharwad, Karnataka, India Dr. Ravindra S. Hegadi, Central University of Karnataka, Karnataka, India Dr. S. M. Hegde, NIT Surathkal, Karnataka, India Dr. S. Sameen Fatima, Anurag University, Hyderabad, TS, India Dr. Siddu. P. Algur, Vijayanagara Sri Krishnadevaraya University (VSKU), Bellary, Karnataka, India Dr. S. Ramachandram, Anurag University, Hyderabad, TS, India Dr. Sourav Mukhopadhyay, IIT Kharagpur, West Bengal, India
International Advisory Committee Dr. Amira Ashour, Tanta University, Egypt Dr. Aynur Unal, Stanford University, USA Dr. A. Govardhan, JNTU Hyderabad, Hyderabad, TS, India Dr. Akshay Sadananda Uppinakuu Pai, University of Copenhagen, Denmark Dr. Alok Aggarwal, UPES, Dehradun, India Dr. Anuja Arora, Jaypee Institute of Information Technology, Noida, India Dr. Ayush Goyal, Texas A&M University, Kingsville, Texas Dr. Banshidhar Majhi, IIITDM Kancheepuram, Tamil Nadu, India Dr. B. Sujatha, Osmania University, Hyderabad, TS, India Dr. Chintan Bhatt, Chandubhai Patel Institute of Technology, Gujarat, India Dr. D. Ravi, IDRBT, Hyderabad, TS, India Dr. Divakar Yadav, MMMUT, Gorakhpur, India Dr. D. V. L. N. Somayajulu, IIIT, Kurnool, AP, India Dr. J. V. R. Murthy, JNTU Kakinada, AP, India Dr. K. C. Santosh, The University of South Dakota, South Dakota Dr. Kailash C. Patidar, University of the Western Cape, Cape Town, South Africa Dr. Kuda Nageswar Rao, Andhra University, Visakhapatnam, AP, India Dr. Le Hoang Son, Vietnam National University, Hanoi, Vietnam Dr. M. A. Hameed, Osmania University, Hyderabad, TS, India Dr. M. Ramakrishna Murthy, ANITS, Visakhapatnam, AP, India Dr. Munesh Chana Trivedi, ABES Engineering College, Ghaziabad, UP, India Dr. Naeem Hanoon, Universiti Teknologi Mara, Shah Alam, Malaysia
Dr. P. V. Sudha, Osmania University, Hyderabad, TS, India Dr. R. B. V. Subramaanyam, NITW, Warangal, TS, India Dr. Rammohan, Kyungpook National University, South Korea Dr. Roman Senkerik, Tomas Bata University in Zlin, Czech Republic Dr. S. G. Sanjeevi, Professor, NITW, Warangal, TS, India Dr. Sanjay Sengupta, CSIR, New Delhi, India Dr. Siba Udgata, HCU, Hyderabad, TS, India Dr. Sobhan Babu, IIT Hyderabad, TS, India Dr. Suberna Kumar, MVGR, Vizayanagaram, AP, India Dr. Vimal Kumar, The University of Waikato, New Zealand Dr. Yu-Dong Zhang, University of Leicester, UK
Organizing Committee Dr. D. Baswaraj, Professor, CSE, VCE Dr. K. Srinivas, Assoc. Professor, CSE, VCE Dr. V. Sireesha, Assistant Professor, CSE, VCE Mr. S. Vinay Kumar, Assistant Professor, CSE, VCE Mr. M. Sashi Kumar, Assistant Professor, CSE, VCE Ms. M. Sunitha Reddy, Assistant Professor, CSE, VCE Mr. R. Sateesh Kumar, Assistant Professor, CSE, VCE Ms. T. Nishitha, Assistant Professor, CSE, VCE
Publicity Committee Ms. B. Syamala, Assistant Professor, CSE, VCE Mr. C. Gireesh, Assistant Professor, CSE, VCE Ms. T. Jalaja, Assistant Professor, CSE, VCE Mr. I. Navakanth, Assistant Professor, CSE, VCE Ms. S. Komal Kaur, Assistant Professor, CSE, VCE Mr. T. Saikanth, Assistant Professor, CSE, VCE Ms. K. Mamatha, Assistant Professor, CSE, VCE Mr. P. Narasiah, Assistant Professor, CSE, VCE
Website Committee Mr. S. Vinay Kumar, Assistant Professor, CSE, VCE Mr. M. S. V. Sashi Kumar, Assistant Professor, CSE, VCE Mr. Krishnam Raju Relangi, Web Designer, CC, VCE
Special Sessions Data Analysis of Expert System Based Models Using Machine Learning Dr. Anand Kumar Pandey, Associate Professor, Department of Computer Science and Applications, ITM University, Gwalior, MP, India Ms. Rashmi Pandey, Assistant Professor, Department of Computer Science, ITM Group of Institutions, Gwalior, MP, India
Blockchain 4.0: Artificial Intelligence and Industrial Internet of Things Paradigm Dr. Sandeep Kumar Panda, IcfaiTech (Faculty of Science and Technology), ICFAI Foundation for Higher Education, Hyderabad, Telangana, India Dr. Ajay Kumar Jena, School of Computer Engineering, KIIT deemed to be University, Bhubaneswar, Odisha, India Dr. D. Chandrasekhar Rao, Department of Information Technology, Veer Surendra Sai University of Technology, Burla, India
Interdisciplinary Data Issues: Opportunities and Challenges with Big Data analysis Dr. Rahul Deo Sah, Dr. Shyama Prasad Mukherjee University, Ranchi, India Dr. Mukesh Tiwari, Sri Satya Sai University of Technology and Medical Sciences, MP, India
Technical Session Chairs Dr. S. K. Panda, ICFAI Foundation of Higher Education, Hyderabad Dr. Chakravarthy VVSSS, Raghu Institute of Technology, Vizag, AP Dr. Chirag Arora, KIET, Ghaziabad, UP Dr. V. Sireesha, VCE, Hyderabad Dr. Srinivas Kaparthi, VCE, Hyderabad Dr. D. Baswaraj, VCE, Hyderabad Dr. E. Shailaja, VCE, Hyderabad
Preface
This volume contains the papers that were presented at the 5th International Conference on Smart Computing and Informatics (SCI-2021) organized by the Department of Computer Science and Engineering, Vasavi College of Engineering (Autonomous), Ibrahimbagh, Hyderabad, Telangana, during September 17–18, 2021. It provided a great platform for researchers from across the world to report, deliberate, and review the latest progress in cutting-edge research pertaining to smart computing and its applications to various engineering fields. The response to SCI-2021 was overwhelming, with a good number of submissions from different areas relating to artificial intelligence, machine learning, cognitive computing, computational intelligence, and their applications in the main tracks. After a rigorous peer review with the help of technical program committee members and external reviewers, only quality papers were accepted for publication in this volume. Several special sessions were offered by eminent professors on cutting-edge technologies such as data analysis of expert system-based models using machine learning; Blockchain 4.0: artificial intelligence and industrial Internet of Things paradigm; and interdisciplinary data issues: opportunities and challenges with big data analysis. Eminent researchers and academicians delivered talks addressing the participants in their respective fields of proficiency. Our thanks are due to Prof. Dr. Carlos M. Travieso-González, Professor and Head of Signals and Communications Department, Institute for Technological Development and Innovation in Communications (IDeTIC), University of Las Palmas de Gran Canaria (ULPGC), Spain, Shri. Balpreet Singh, Director of Engineering, Intel IOTG, Hyderabad, and Mr. Rahul Ghali, Senior Manager, Accenture, India, for delivering keynote addresses for the benefit of the participants. We would like to express our appreciation to the members of the technical program committee for their support and cooperation in this publication. We are also thankful to the team from Springer for providing a meticulous service for the timely production of this volume. Our heartfelt thanks to Shri. M. Krishna Murthy, Secretary, VAE, Sri. P. Balaji, CEO, VCE, and Dr. S. V. Ramana, Principal, VCE, for extending support to conduct this conference in Vasavi College of Engineering. Profound thanks to Prof. Lakhmi
C. Jain, Australia, for his continuous guidance and support from the beginning of the conference. Without his support, we could never have executed such a mega event. Special thanks to all the guests who honored us with their presence on the inaugural day of the conference. Our thanks to all the special session chairs, track managers, and reviewers for their excellent support. Our special thanks to all the authors who submitted papers and to all the attendees for their contributions and fruitful discussions that made this conference a great success. Lucknow, India Bhubaneswar, India Las Palmas de Gran Canaria, Spain Hyderabad, India December 2021
Vikrant Bhateja Suresh Chandra Satapathy Carlos M. Travieso-Gonzalez T. Adilakshmi
Contents
1 An Analytical Study of COVID-19 Dataset Using Graph-Based Clustering Algorithms . . . . . . 1
Mamata Das, P. J. A. Alphonse, and K. Selvakumar
2 Intrusion Detection Using Support Vector Machine and Artificial Neural Network . . . . . . 17
Gandhe Srivani and Srinivasu Badugu
3 Development of Different Word Vectors and Testing Using Text Classification Algorithms for Telugu . . . . . . 33
Guna Santhoshi and Srinivasu Badugu
4 Framework for Diabetic Retinopathy Classification . . . . . . 47
Sravya Madala, Vani K. Suvarna, and Pranathi Jalapally
5 Online Malayalam Script Assortment and Preprocessing for Building Recommender Systems . . . . . . 57
V. K. Muneer, K. P. Mohamed Basheer, K. T. Rizwana, and Abdul Muhaimin
6 Improved Multi-modal Image Registration Using Geometric Edge-Oriented Histogram Feature Descriptor: G-EOH . . . . . . 67
B. Sirisha, B. Sandhya, and J. Prasanna Kumar
7 Reddit Sentiments Effects on Stock Market Prices . . . . . . 75
Arnav Machavarapu
8 Speech-Based Human Emotion Recognition Using CNN and LSTM Model Approach . . . . . . 85
Kotha Manohar and E. Logashanmugam
9 Recognizing the Faces from Variety of Poses and Illumination . . . . . . 95
T. Shreekumar, N. V. Sunitha, K. Suma, Sukhwinder Sharma, and Puneet Mittal
10 Experimental Analysis of Cold Chamber with Phase Change Materials for Agriculture Products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103 G. Bhaskara Rao and A. Parthiban 11 Comparison of H-Based Vertical-Axis Wind Turbine Blades of NACA Series with CFD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117 Mohd Hasham Ali, Syed Nawazish Mehdi, and M. T. Naik 12 Design and Synthesis of Random Number Generator Using LFSR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131 K. Rajkumar, P. Anuradha, Rajeshwarrao Arabelli, and J. Vasavi 13 QCA-Based Error Detection Circuit for Nanocommunication Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141 P. Anuradha, K. Rajkumar, Rajeshwar Rao Arabelli, and R. Shareena 14 Evaluating Performance on Covid-19 Tweet Sentiment Analysis Outbreak Using Support Vector Machine . . . . . . . . . . . . . . . 151 M. Shanmuga Sundari, Pusarla Samyuktha, Alluri Kranthi, and Suparna Das 15 Minimum Simplex Nonlinear Nonnegative Matrix Factorization for Hyperspectral Unmixing . . . . . . . . . . . . . . . . . . . . . . . 161 K. Priya and K. K. Rajkumar 16 Prediction and Analysis of Vitamin D Deficiency Using Machine Learning Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177 Mohammad Ulfath and R. Pallavi Reddy 17 Student Performance Prediction Using Classification Models . . . . . . 187 Shrey Agarwal, Yashaswi Upmon, Riyan Pahuja, Ganesh Bhandarkar, and Suresh Chandra Satapathy 18 Estimating Driver Attentiveness Through Head Pose Using Hybrid Geometric-Based Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197 H. D. Vankayalapati, K. R. Anne, and S. Sri Harsha 19 IIWSCOA-Based DCNN: Improved Invasive Weed Sine Cosine Optimization Algorithm for Early Detection of Myocardial Infarction Using Deep Convolution Neural Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205 Shridevi Soma and Shamal Bulbule 20 Student’s Academic Performance Prediction Using Ensemble Methods Through Educational Data Mining . . . . . . . . . . . . . . . . . . . . . 215 Sk. Vaheed, R. Pratap Singh, Padmalaya Nayak, and Ch. Mallikarjuna Rao
21 A Flexible Accession on Brain Tumour Detection and Classification Using VGG16 Model . . . . . . . . . . . . . . . . . . . . . . . . . 225 V. Ramya Manaswi and B. Sankarababu 22 Check Certificate—A Certificate Verification Platform for Students and Organizations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239 M. D. N. Akash, Golla Bharadwaj Sai, Madhira Venkata Sai Yeshwanth Reddy, and M. A. Jabbar 23 Secure Cluster-Based Routing Using Modified Spider Monkey Optimization for Wireless Sensor Networks . . . . . . . . . . . . . . . . . . . . . . 247 M. Supriya and T. Adilakshmi 24 ChefAI Text to Instructional Visualization Using Amazon Web Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257 Sangeeta Gupta, Saif Ali Athyaab, and J. Harsh Raj 25 Identification of Predominant Genes that Causes Autism Using MLP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269 Anitta Joseph and P. K. Nizar Banu 26 Detecting Impersonators in Examination Halls Using AI . . . . . . . . . . 281 A. Vishal, T. Nitish Reddy, P. Prahasit Reddy, and S. Shitharth 27 Telugu Text Classification Using Supervised Machine Learning Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293 G. V. Subba Raju, Srinivasu Badugu, and Varayogula Akhila 28 Application of Hybrid MLP-GWO for Monthly Rainfall Forecasting in Cachar, Assam: A Case Study . . . . . . . . . . . . . . . . . . . . 307 Abinash Sahoo and Dillip Kumar Ghose 29 Temperature Prediction Using Hybrid MLP-GOA Algorithm in Keonjhar, Odisha: A Case Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319 Sandeep Samantaray, Abinash Sahoo, and Deba Prakash Sathpathy 30 Addressing Longtail Problem using Adaptive Clustering for Music Recommendation System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331 M. Sunitha, T. Adilakshmi, G. Ravi Teja, and Ayush Noel 31 A Proficient GK-KMA Based Segmentation and Lung Nodule Detection in CT Images Using PTRNN . . . . . . . . . . . . . . . . . . . . . . . . . . 339 Vijay Kumar Gugulothu and Savadam Balaji 32 Advertisement Click Fraud Detection Using Machine Learning Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 353 Bhargavi Mikkili and Suhasini Sodagudi 33 Wind Power Prediction Using Time Series Analysis Models . . . . . . . 363 Bhavitha Katari and Sita Kumari Kotha
34 Classification of Astronomical Objects using KNN Algorithm . . . . . 377 Mariyam Ashai, Rhea Gautam Mukherjee, Sanjana P. Mundharikar, Vinayak Dev Kuanr, and R. Harikrishnan 35 Efficient Analogy-based Software Effort Estimation using ANOVA Convolutional Neural Network in Software Project Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 389 K. Harish Kumar and K. Srinivas 36 Hand Written Devanagari Script Short Scale Character Recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 401 Kachapuram BasavaRaju and Y. RamaDevi 37 Artificial Intelligence for On-Site Detection of Invasive Rugose Spiralling Whitefly in Coconut Plantation . . . . . . . . . . . . . . . . . . . . . . . 413 M. Kalpana and K. Senguttuvan 38 Prediction of Heart Disease Using Optimized Convolution Neural Networks (CNNs) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 421 R. Sateesh Kumar, S. Sameen Fatima, and M. Navya 39 Sentiment Analysis using COVID-19 Twitter Data . . . . . . . . . . . . . . . . 431 Nagaratna P. Hegde, V. Sireesha, K. Gnyanee, and G. P. Hegde 40 Speech Mentor for Visually Impaired People . . . . . . . . . . . . . . . . . . . . . 441 P. G. L. Sahithi, V. Bhavana, K. ShushmaSri, K. Jhansi, and Ch. Raga Madhuri 41 An Extended Scheduling of Mobile Cloud using MBFD and SVM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 451 Rutika M. Modh, Meghna B. Patel, and Jagruti N. Patel 42 Performance Enhancement of DYMO Routing Protocol in MANETs Using Machine Learning Technique . . . . . . . . . . . . . . . . . 463 P. M. Manohar and D. V. Divakara Rao 43 Context Dependency with User Action in Recommendation System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 471 Arati R. Deshpande and M. Emmanuel 44 Industrial Automation by Development of Novel Scheduling Algorithm for Industrial IoT: IIoT Re-birth Out of Covid-19 Scenario . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 481 Sujit N. Deshpande and Rashmi M. Jogdand 45 Speech@SCIS: Annotated Indian Video Dataset for Speech-Face Cross Modal Research . . . . . . . . . . . . . . . . . . . . . . . . . . 493 Shankhanil Ghosh, Naagamani Molakathaala, Chhanda Saha, Rittam Das, and Souvik Ghosh
46 Task Scheduling in Cloud Using Improved ANT Colony Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 505 Shyam Sunder Pabboju and T. Adilakshmi 47 Technological Breakthroughs in Dentistry: A Paradigm Shift Towards a Smart Future . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 517 Anjana Raut, Swati Samantaray, and P. Arun Kumar 48 A Study on Smart Agriculture Using Various Sensors and Agrobot: A Case Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 531 Shraban Kumar Apat, Jyotirmaya Mishra, K. Srujan Raju, and Neelamadhab Padhy 49 Task Scheduling in Cloud Computing Using PSO Algorithm . . . . . . 541 Sriperambuduri Vinay Kumar, M. Nagaratna, and Lakshmi Harika Marrivada 50 A Peer-to-Peer Approach for Extending Wireless Network Base for Managing IoT Edge Devices Off-Gateway Range . . . . . . . . . 551 Ramadevi Yellasiri, Sujanavan Tiruvayipati, Sridevi Tumula, and Khooturu Koutilya Reddy 51 Analysis of Different Methodologies for Sentiment in Hindi Language . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 561 Rohith Reddy Byreddy, Saketh Malladi, B. V. S. S. Srikanth, and Venkataramana Battula 52 Analysis of Efficient Handover CAC Schemes for Handoff and New Calls in 3GPP LTE and LTEA Systems . . . . . . . . . . . . . . . . . 569 Pallavi Biradar, Mohammed Bakhar, and Shweta Patil 53 Artificial Intelligence Framework for Skin Cancer Detection . . . . . . 579 K. Mohana Lakshmi and Suneetha Rikhari 54 Telugu Text Summarization Using HS and GA Particle Swarm Optimization Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 589 M. Varaprasad Rao, A. V. Krishna Prasasd, A. Anusha, and K. Srujan Raju 55 A Dynamic Model and Algorithm for Real-Time Traffic Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 599 M. N. V. M. Sai Teja, N. Lasya Sree, L. Harshitha, P. Venkata Bhargav, Nuthanakanti Bhaskar, and V. Dinesh Reddy 56 A Selection-Based Framework for Building and Validating Regression Model for COVID-19 Information Management . . . . . . . 611 Pravinkumar B. Landge, Dhiraj V. Bhise, Kapil Kumar Nagwanshi, Raj Kumar Patra, and Santosh R. Durugkar
57 Fingerprint Liveliness Detection to Mitigate Spoofing Attacks Using Generative Networks in Biometric System . . . . . . . . . . . . . . . . . 623 Akanksha Gupta, Rajesh Mahule, Raj Kumar Patra, Krishan Gopal Saraswat, and Mozammil Akhtar 58 Pulmonary Nodule Detection Using Laplacian of Gaussian and Deep Convolutional Neural Network . . . . . . . . . . . . . . . . . . . . . . . . 633 Nuthanakanti Bhaskar and T. S. Ganashree Correction to: A Peer-to-Peer Approach for Extending Wireless Network Base for Managing IoT Edge Devices Off-Gateway Range . . . . Ramadevi Yellasiri, Sujanavan Tiruvayipati, Sridevi Tumula, and Khooturu Koutilya Reddy
C1
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 649
About the Editors
Vikrant Bhateja is associate professor in Department of Electronics & Communication Engineering (ECE), Shri Ramswaroop Memorial College of Engineering and Management (SRMCEM), Lucknow (Affiliated to AKTU) and also the Dean (Academics) in the same college. His areas of research include digital image and video processing, computer vision, medical imaging, machine learning, pattern analysis and recognition. He has around 160 quality publications in various international journals and conference proceedings. He is associate editor of IJSE and IJACI. He has edited more than 30 volumes of conference proceedings with Springer Nature and is presently EiC of IGI Global: IJNCR journal. Suresh Chandra Satapathy is a Ph.D in Computer Science, currently working as Professor and at KIIT (Deemed to be University), Bhubaneshwar, Odisha, India. He held the position of the National Chairman Div-V (Educational and Research) of Computer Society of India and is also a senior Member of IEEE. He has been instrumental in organizing more than 20 International Conferences in India as Organizing Chair and edited more than 30 Book Volumes from Springer LNCS, AISC, LNEE and SIST Series as Corresponding Editor. He is quite active in research in the areas of Swarm Intelligence, Machine Learning, Data Mining. He has developed a new optimization algorithm known as Social Group Optimization (SGO) published in Springer Journal. He has delivered number of Keynote address and Tutorials in his areas of expertise in various events in India. He has more than 100 publications in reputed journals and conference proceedings. Dr. Suresh is in Editorial board of IGI Global, Inderscience, Growing Science journals and also Guest Editor for Arabian Journal of Science and Engineering published by Springer. Carlos M. Travieso-Gonzalez received the M.Sc. degree in 1997 in Telecommunication Engineering at Polytechnic University of Catalonia (UPC), Spain; and Ph.D. degree in 2002 at University of Las Palmas de Gran Canaria (ULPGC-Spain). He is Full Professor on Signal Processing and Pattern Recognition and Head of Signals and Communications Department at ULPGC; teaching from 2001 on subjects on signal processing and learning theory. His research lines are biometrics, biomedical signals xix
and images, data mining, classification system, signal and image processing, machine learning, and environmental intelligence. He has researched in 50 International and Spanish Research Projects, some of them as head researcher. He has 440 papers published in international journals and conferences. He has published 7 patents in Spanish Patent and Trademark Office. He is evaluator of project proposals for European Union (H2020), Medical Research Council (MRC—UK), Spanish Government (ANECA—Spain), Research National Agency (ANR—France), DAAD (Germany), and other Institutions. He has been General Chair in 16 international conferences, mainly sponsored by IEEE and ACM. He is Associate Editor in CIN and Entropy (Q2-WoS journals). He has been awarded in “Catedra Telefonica” Awards in editions 2017, 2018 and 2019 on Knowledge Transfer Modality. T. Adilakshmi is currently working as Professor and Head of the Department, Vasavi College of Engineering. She completed her Bachelor of Engineering from Vasavi College of Engineering, Osmania University, in the year 1986, and did her Master of Technology in CSE from Manipal Institute of Technology, Mangalore, in 1993. She received Ph.D. from Hyderabad Central University (HCU) in 2006 in the area of Artificial Intelligence. Her research interests include data mining, image processing, artificial intelligence, machine learning, computer networks and cloud computing. She has 23 journal publications to her credit and presented 28 papers at international and national conferences. She has been recognized as a research supervisor by Osmania University (OU) and Jawaharlal Nehru Technological University (JNTU). Two research scholars were awarded Ph.D. under her supervision, and she is currently supervising 11 Ph.D. students.
Chapter 1
An Analytical Study of COVID-19 Dataset Using Graph-Based Clustering Algorithms Mamata Das , P. J. A. Alphonse, and K. Selvakumar
Abstract COrona VIrus Disease, abbreviated as COVID-19, is a novel disease caused by a virus that was first identified in Wuhan, China, in December 2019, and this deadly disease has now spread all over the world. According to the World Health Organization (WHO), a total of 3,124,905 people died from 2019 to April 2021. Many methods, AI-based techniques, and machine learning algorithms have been researched and are being used to save people from this pandemic. The SARS-CoV and the 2019-nCoV (SARS-CoV-2) viruses invade our bodies, causing some differences in the structure of cell proteins. Protein–protein interaction (PPI) is an essential process in our cells and plays a very important role in the development of medicines and gives ideas about the disease. In this study, we performed clustering on PPI networks generated from 92 genes of the COVID-19 dataset. We have used three graph-based clustering algorithms to give intuition to the analysis of clusters.
1.1 Introduction Clustering is a task where similar objects are grouped into the same groups and other objects into different groups. In statistical data analysis, clustering is used as a common technique in different fields. Some uses are in machine learning, computer graphics, bioinformatics, information retrieval, image analysis, and pattern recognition. There are many clustering algorithms that assume the data points come from a Gaussian distribution, like k-means [7, 14]. To get a high-quality result, we need some prior knowledge about the clustering, such as a parameter or threshold value, the initial number of clusters, etc. To solve this problem, we can think of algorithms that belong to graph theory, i.e., graph-based clustering algorithms, where the problem is modeled as an undirected graph. In graph-based clustering, data are transformed into a graph representation [12]. Data points are the vertices of the graph to be clustered. M. Das · P. J. A. Alphonse · K. Selvakumar (B) NIT Trichy, Trichy, Tamil Nadu 620015, India e-mail: [email protected] P. J. A. Alphonse e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 V. Bhateja et al. (eds.), Smart Intelligent Computing and Applications, Volume 1, Smart Innovation, Systems and Technologies 282, https://doi.org/10.1007/978-981-16-9669-5_1
The edges between the data points or the node are weighted by their similarity. Usually, the model does not require any prior knowledge; however, users provide some parameters with value. Graph-based clustering methods are broadly used in biomedical and biological studies to determine the relationship among the objects [17]. We have used three interesting and efficient graph-based clustering algorithms in our experiment, namely Markov clustering algorithm (MCL) as algorithm one ( A1 ), regularized Markov clustering algorithm (RMCL) as algorithm two (A2 ), and MCL with the variable inflation rate Algorithm three ( A3 ). We have used both COVID-19 real data as well as synthetic datasets to evaluate and compare different graph-based algorithms and analyses. The result of the analysis of all the clusters is shown in Sect. 1.4. It shows that the performance of A1 is adequate and A2 performs better than A3 . The rest of this paper is indexed as follows. Section 1.2 presents an overview of the literature survey of previous works on graph-based clustering. Section 1.3 presents the materials and methods. Section 1.4 shows the performance of experimental results. Section 1.5 presents discussions and conclusions.
1.2 Related Work Two novel graph attacks are presented in the network-level graph-based detection system [2]. They have highlighted their work in the adversarial machine learning area where there are not many graph-based clustering techniques out there yet. Three clustering methods have been used, namely the community detection method, spectral methods, and node2vec method, and showed the selection criteria of hyperparameters for each of the three graph methods. The result of the study indicates that a real-world system that uses several popular graph-based modeling techniques can break by generic attacks and the cost of the attacks is minimum. A comparative study has been proposed on Markov chain correlation and shows that their method defeats the K-means clustering method [3]. They have taken gene expression data for experiments and used Dunn index for evaluation purposes. Authors have proposed a beautiful study of the PPI network on candidate genes of the Schizophrenia dataset [11]. They have been implemented and simulated RMCL graph-based clustering algorithm, and the result is compared with the MCL algorithm on the same parameters. An efficient method is proposed by Enright et al. [5] called TRIBE-MCL. It is based on Markov chain theory [4] and applied on protein sequence like a way, protein represented as a node and edge of the graph containing a weight is scientifically computed similarity values. Authors have used stepping-stone type RMCL on Japanese associative concept dictionary and got a satisfactory level of performance than the Markov clustering algorithms generated network [8]. They have summarized the problems of MCL
algorithms and proposed a stepping-stone type algorithm of RMCL algorithms as an extension of the MCL algorithm. The paper [10] concerns Vec2GC, a clustering algorithm to represent text. The approach is density-based which is works on a document or terms. They have used graphs with weight on edges and apply community detection on the objects. Applications of ant lion optimization (ALO) and cuckoo search (CS) have been discussed on protein–protein interaction for graph-based clustering. The author has used regularized Markov clustering method on SARS-CoV-2 and the humans dataset. The results indicate that CS-RMCL interactions are more stable than ALO-RMCL interactions [11]. The paper [1] proposed a way to prevent COVID-19 by predicting in which area COVID-19 can spread next based on geographical distance using graph-based clustering. Here, distance threshold is used to represent the connected graph, administrative as nodes, and geographic distance as a weight of the edges. Some analysis of graph-based clustering method and performance is found in [6, 16].
1.3 Materials and Methods 1.3.1 Data Collection We have used Covid-19 datasets in our experiment and the dataset downloaded from the Website Universal Protein Resource Knowledgebase (UniProtKB) which is freely available. This latest COVID-19 dataset can also be accessed from the link: ftp://ftp.uniprot.org/pub/databases/uniprot/pre_release/. These datasets have been collected from a database in uncompressed excel format and have 92 genes(Human (64), SARS-CoV (15), and 2019-nCoV, SARS-CoV-2 (13)). We perform clustering analysis on PPI networks using three graph-based clustering algorithms. We have used STRING to construct a PPI network, a well-known functional protein association network. The gene names of datasets are shown below. Homo Sapiens (Human) NRP1 NRP VEGF165R, TMPRSS2 PRSS10,TLR3, SGTA SGT SGT1, TOMM70 KIAA0719 TOM70 TOMM70A, SNAP29, HLAB HLAB, APOE, CD74 DHLAG, HLA-A HLAA, S100A8 CAGA CFAG MRP8, IL6 IFNB2, CTSL CTSL1, IL6R, HMGB1 HMG1, FURIN FUR PACE PCSK3, HLA-E HLA-6.2 HLAE, IFNAR1 IFNAR, SMPD1 ASM, ITGAL CD11A, KLRC1 NKG2A, CIITA MHC2TA, PHB PHB1, BSG UNQ6505/PRO21383, IL6ST, IFNAR2 IFNABR, IFNARB, VPS41, RAB7A RAB7, KPNA2 RCH1 SRP1, STX17, PPIA CYPA, EEF1A1 EEF1A EF1A LENG7, SMAD3 MADH3, BST2, KLRD1 CD94, IRF5,IRF3, IL17A CTLA8 IL17, LY6E 9804 RIGE SCA2 TSA1, HIF1A BHLHE78 MOP1 PASD8, TMEM41B KIAA0033, TICAM1 PRVTIRB TRIF, MPP5 PALS1, IL17RC UNQ6118/PRO20040/PRO38901, TPCN2 TPC2, DDX1, IRF7, DHX58 D11LGP2E LGP2, IL17RA IL17R, VPS39 KIAA0770 TLP VAM6, IL17F, PHB2 BAP REA, VAMP8, ACE2 UNQ868/PRO1885, IFIH1 MDA5 RH116, NLRP1
CARD7 DEFCAP KIAA0926 NAC NALP1, TMPRSS4 TMPRSS3 UNQ776/ PRO1570, ARL8B ARL10C GIE1, TBK1 NAK, PIKFYVE KIAA0981 PIP5K3. Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV) 1a, rep 1a-1b, S 2, N 9a, M 5, 3a, 7a, 9b, E sM 4, 3b, 6, 7b, ORF14, 8b, 8a. Severe Acute Respiratory Syndrome Coronavirus 2 (2019-NCoV, SARS-CoV-2) S 2, 3a, N, rep 1a-1b, E 4, M, 7a, 8, 6, 9b, 7b, ORF10 orf10, 9c.
1.3.2 Execution Environments We have implemented our experimental execution on a Lenovo ThinkPad E14 Ultrabook running the Windows 10 Professional 64-bit operating system and 10th Generation Intel Core i7-10510U Processor. The clock speed of the processor is 1.8 GHz with 16G bytes DDR4 memory size. The code has been executed in Python programming language (Version 3.6) in the Jupyter Notebook of Conda environment.
1.3.3 Graph-Based Clustering Method We have used three graph-based clustering algorithms in our experiment, namely Markov clustering algorithm, regularized Markov clustering algorithm, and MCL with the variable inflation rate. MCL algorithm proposed by Stijn Van Dongen in 2000 [15] is well known as an effectual algorithm in graph-based clustering. This algorithm is very famous in bioinformatics to cluster the protein sequence data as well as gene data. In our methodology, we give an undirected graph as an input which is constructed from the PPI network, with expansion parameter = 6 for a random walk and inflation parameter = 3 for probability modification. The algorithm got the sub-cluster as output after going to the convergence stage. One downside of the MCL algorithm is extraneous clusters in the final output which may give us an impracticable solution. To improve this situation, RMCL algorithm is used. RMCL is the modification of MCL and developed by Parthasarathy and Satuluri in 2009 [13]. The main three steps are pruning, inflation, and regularization. The value of the inflation parameter is given from the outside to get good results. Here, the inflation factor value is fixed. We have applied RMCL to PPI networks of COVID19 candidate genes data. In order to overcome the limation of the fixed inflation parameter value, MCL with the variable inflation rate is introduced. The idea behind the variable inflation value is to modify the similarity values in the columns of the similarity matrix [9] and to get clusters with high quality.
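To make the clustering step concrete, the following minimal sketch (not the authors' exact implementation) shows the expansion/inflation loop of the Markov clustering algorithm A1 on a dense adjacency matrix, using the expansion and inflation parameters quoted in this section; pruning and the sparse-matrix handling used in the real experiments are omitted.

```python
import numpy as np

def mcl(adjacency, expansion=6, inflation=3, max_iter=100, tol=1e-6):
    """Minimal Markov clustering sketch on an undirected graph's adjacency matrix."""
    n = adjacency.shape[0]
    M = adjacency.astype(float) + np.eye(n)   # add self-loops
    M = M / M.sum(axis=0)                     # make columns stochastic
    for _ in range(max_iter):
        prev = M.copy()
        M = np.linalg.matrix_power(M, expansion)  # expansion: simulate random walk
        M = M ** inflation                        # inflation: strengthen strong flows
        M = M / M.sum(axis=0)                     # re-normalize columns
        if np.abs(M - prev).max() < tol:          # converged (matrix nearly idempotent)
            break
    # Rows that retain mass act as attractors; their supports give the clusters.
    clusters = {frozenset(np.nonzero(row > 1e-8)[0]) for row in M if row.max() > 1e-8}
    return [sorted(c) for c in clusters]
```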
Fig. 1.1 Random network G3
Fig. 1.2 Random network G4
1.3.4 Protein–Protein Interaction Network The protein–protein interaction (PPI) network has been established to predict and analyze the function of protein in terms of physical interaction. The PPI network is a model that represents graphical connectivity between proteins. Though all the proteins connected with each other in the network, there may also be some isolated components. We can present it as a graph where proteins are represented as nodes and interactions are represented as an edge. We have used search tool for the retrieval of interacting genes/proteins (STRING) to construct a PPI network. Cytoscape platform has been used to visualize the networks. Figures 1.3 and 1.4 show the PPI graph G 1 (V1 , E 1 ) and G 2 (V2 , E 2 ) obtained from the COVID-19 dataset using the STRING tool.
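As an illustration of how such a PPI network can be assembled programmatically before clustering, the sketch below builds a weighted graph from a STRING-style interaction export using NetworkX; the file name and column names are assumptions for illustration and are not taken from the chapter.

```python
import networkx as nx
import pandas as pd

# Assumed STRING export: a tab-separated file with columns
# "protein1", "protein2", and "combined_score" (file name is hypothetical).
edges = pd.read_csv("string_interactions.tsv", sep="\t")

G = nx.Graph()
for _, row in edges.iterrows():
    # Edge weight = STRING combined confidence score for the interaction.
    G.add_edge(row["protein1"], row["protein2"], weight=row["combined_score"])

print(G.number_of_nodes(), G.number_of_edges())
A = nx.to_numpy_array(G)   # adjacency matrix handed to the clustering step
```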
Fig. 1.3 PPI network G1
Fig. 1.4 PPI network G 2
1.4 Results and Analysis We have analyzed our results and organized them (only two PPI networks and two randomly generated graphs) by the figure from Figs. 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 1.10, 1.11, 1.12, 1.13, 1.14, 1.15, 1.16, 1.17, 1.18, 1.19, 1.20, 1.21, 1.22, 1.23, 1.24, 1.25, 1.26, 1.27 and 1.28. We have constructed PPI networks with vertex range 150 to 250 with non-uniform increment and five randomly generated graphs with vertex range 150 to 250 with a uniform increment of 25 of the dataset. Figures 1.3 and 1.4 show the PPI graph G 1 (V1 , E 1 ), G 2 (V2 , E 2 ), and Figs. 1.1 and 1.2 represent the random graph G 3 (V3 , E 3 ), G 4 (V4 , E 4 ) with V1 = 204, V2 = 254, V3 = 200 and V4 = 250. Output clusters of graphs are shown in Figs. 1.5, 1.6, 1.7, 1.8, 1.9, 1.10, 1.11, 1.12, 1.13, 1.14, 1.15 and 1.16. The iteration count versus execution time of four graphs (G 1 , G 2 , G 3 , and G 4 ) is shown in Figs. 1.18, 1.19, 1.20, 1.21, 1.22, 1.23 and 1.24. We have used the sparse matrix to do the experiment. The density of the matrix has been calculated in every iteration of the experiment. Figures 1.21, 1.22, 1.23 and 1.24 indicate the sparseness of the graph (G 1 , G 2 , G 3 , and G 4 ). We can see the histogram of G 3 and G 4 in Figs. 1.25 and 1.26. We have validated our clustering using the Dunn index (DI), and the quality of the clustering is very magnificent. DI obtained from the PPI network and the random graphs visualized in Figs. 1.27 and 1.28.
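Since the density/sparseness of the matrix is tracked at every iteration, a small helper of the following form (an illustrative sketch, not the authors' code) is enough for that measurement.

```python
import numpy as np

def sparsity_percent(M, eps=1e-12):
    """Share of (near-)zero entries in the current clustering matrix, in percent."""
    M = np.asarray(M)
    return 100.0 * np.sum(np.abs(M) <= eps) / M.size
```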
Fig. 1.5 G1 on A1
Fig. 1.6 G1 on A2
Fig. 1.7 G1 on A3
Fig. 1.8 G2 on A1
Fig. 1.9 G2 on A2
Fig. 1.10 G2 on A3
Fig. 1.11 G3 on A1
Fig. 1.12 G3 on A2
Fig. 1.13 G3 on A3
Fig. 1.14 G4 on A1
Fig. 1.15 G4 on A2
Fig. 1.16 G4 on A3
Fig. 1.17 E of G1 (y-axis: t, x-axis: i)
A higher DI indicates a better cluster. The Dunn index is defined in terms of inter-cluster distance and intra-cluster distance: a good clustering has minimum intra-cluster distance and maximum inter-cluster distance. The analysis shows that the performance of A1 is good enough and that A2 performs better than A3. In the randomly generated geometric networks, random points are generated and placed in a two-dimensional unit cube. We have taken the distance threshold (radius) as 0.3 and the distance metric as 2.
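A minimal sketch of the Dunn index computation described here (assuming at least two clusters; not the authors' exact code) is given below.

```python
import numpy as np
from scipy.spatial.distance import cdist

def dunn_index(points, labels):
    """Dunn index: min inter-cluster distance / max intra-cluster diameter (higher is better)."""
    points = np.asarray(points)
    labels = np.asarray(labels)
    clusters = [points[labels == c] for c in np.unique(labels)]
    # Largest distance between two points of the same cluster (intra-cluster diameter).
    max_diam = max(cdist(c, c).max() for c in clusters)
    # Smallest distance between points belonging to two different clusters.
    min_sep = min(cdist(clusters[i], clusters[j]).min()
                  for i in range(len(clusters))
                  for j in range(i + 1, len(clusters)))
    return min_sep / max_diam
```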
Fig. 1.18 E of G2 (y-axis: t, x-axis: i)
Fig. 1.19 E of G3 (y-axis: t, x-axis: i; legend: A1, A2, A3)
Fig. 1.20 E of G4 (y-axis: t, x-axis: i)
Fig. 1.21 S of G1 (y-axis: Sp %, x-axis: i; legend: A1, A2, A3)
Fig. 1.22 S of G2 (y-axis: Sp %, x-axis: i)
Fig. 1.23 S of G3 (y-axis: Sp %, x-axis: i)
Fig. 1.24 S of G4 (y-axis: Sp %, x-axis: i; legend: A1, A2, A3)
Fig. 1.25 H of G4 on A3 (y-axis: nc, x-axis: f)
1.5 Discussion and Conclusions In this paper, we propose graph-based clustering with the help of the MCL, RMCL, and MCL-with-variable-inflation-rate algorithms. We have validated our clustering using DI, and the quality of the clustering is very good. To evaluate our clustering algorithm, the DI metric is interpreted in terms of intra-cluster and inter-cluster distances. The DI values in the results section show that the performance of A1 is up to the mark, and the performance of A2 is superior to A3. The study suggests that PPI analysis of the COVID-19 candidate genes is extremely crucial for understanding this human disease.
Fig. 1.26 H of G4 on A3 (y-axis: nc, x-axis: f)
Fig. 1.27 DI on random (y-axis: ri, x-axis: i)
Fig. 1.28 DI on PPI (y-axis: ri, x-axis: i; legend: A1, A2, A3)
References 1. Behera, V.N.J., Ranjan, A., Reza, M.: Graph Based Clustering Algorithm for Social Community Transmission Prediction of COVID-19. arXiv preprint (2020) 2. Chen, Y., Nadji, Y., Monrose, F., Perdisci, R., Antonakakis, M., Vasiloglou, N.: Practical attacks against graph-based clustering. In: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security (2017) 3. Deng, Y., Chokalingam, V., Zhang, C.: Markov chain correlation based clustering of gene expression data. In: International Conference on Information Technology: Coding and Computing (ITCC’05), vol. 2, pp. 750–755 (2005) 4. Enright, A.J., Ouzounis, C.A.: BioLayout an automatic graph layout algorithm for similarity visualization. Bioinformatics 17(9), 853–854 (2001) 5. Enright, A.J., Dongen, S.V., Ouzounis, C.A.: An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res. 30(7), 1575–1584 (2002) 6. Foggia, P., Percannella, G., Sansone, C.: A graph-based clustering method and its applications. In: International Symposium on Brain. Springer, Berlin (2007) 7. Jain, A.K., Dubes, R.C.: Algorithms for clustering data. Prentice-Hall Inc. (1988) 8. Jung, J., Miyake, M., Akama, H.: Recurrent Markov cluster (RMCL) algorithm for the refinement of the semantic network. In: LREC. pp. 1428–1431 (2006) 9. MedvÃs, L., Szilágyi, L., Szilágyi, S.M.: A modified Markov clustering approach for protein sequence clustering. In: IAPR International Conference on Pattern Recognition in Bioinformatics, pp. 110–120. Springer, Berlin (2008) 10. Rao, R., Chakraborty, M.: Vec2GC—A Graph Based Clustering Method for Text Representations (2021) 11. Rizki, A., Bustamam, A., Sarwinda, D.: Applications of cuckoo search and ant lion optimization for analyzing protein-protein interaction through regularized Markov clustering on coronavirus. J. Phys.: Conf. Ser. (2021) 12. Roy, S.G., Chakrabarti, A.: Chapter 11—A novel graph clustering algorithm based on discretetime quantum random walk. In: Quantum Inspired Computational Intelligence, pp. 361–389. Morgan Kaufmann (2017) 13. Satuluriand, V., Parthasarathy, S.: Markov clustering of protein interaction networks with improved balance and scalability. In: First ACM International Conference on Bioinformatics and Computational Biology, pp. 247–256 (2010) 14. Selvakumar, K., Ramesh, L.S., Kanna, A.: Enhanced K-means clustering algorithm for evolving user groups. Indian J. Sci. Technol. 8(24) (2015) 15. van Dongen, S.: A cluster algorithm for graphs. Inf. Syst. (2000) 16. Wilschut, T., Etman, L.F.P., Rooda, J.E., Adan, I.J.B.F.: Multilevel flow-based Markov clustering for design structure matrices. ASME J. Mech. Des. 12(139) (2017) 17. Zhang, Y., Ouyang, Z., Zhao, H.: A statistical framework for integration through graphical models with application to cancer genomics. Ann. Appl. Stat. (2017)
Chapter 2
Intrusion Detection Using Support Vector Machine and Artificial Neural Network Gandhe Srivani and Srinivasu Badugu
Abstract With the rapid development of information technology in the past two decades, computer networks are widely used in industry, business, and various fields of human life. There are many types of attacks threatening the availability, integrity, and confidentiality of computer networks. An intrusion detection system therefore acts as a defensive tool to detect network-based attacks on the Web, but it is still immature in monitoring and identifying attacks, and its performance remains limited. Many techniques based on machine learning approaches have been developed. In this work, we propose an intrusion detection system using three machine learning algorithms (support vector machine, logistic regression, and artificial neural network) and evaluate their accuracy. The dataset used is the NSL-KDD dataset, which contains 25,193 instances and 42 attributes, in which one attribute is the class label, a categorical attribute whose value is either "normal" or "anomaly." The experimental results on this dataset show that FFANN-IDS achieves a high accuracy of 99% both with and without PCA, which is the best accuracy compared to the other algorithms.
2.1 Introduction There has been a remarkable rise in the use of Web-based applications over the last decade. Hacking incidents are increasing day by day as technology rolls out. An IDS is a type of security tool that monitors network traffic and scans the system for suspicious activities and alerts the system or network administrator [1]. IDS can be classified into network-based intrusion detection system and host-based intrusion detection system [2]. Network intrusion detection system (NIDS) [3] is one common type of IDS to analyze network traffic analyzing for suspicious activity, e.g., denial of services attacks [4]. Host-based intrusion detection systems (HIDS) [5] analyzes network traffic and system-specific settings such as software calls, local security policy, local G. Srivani · S. Badugu (B) Stanley College of Engineering and Technology for Women, Hyderabad, Telangana, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 V. Bhateja et al. (eds.), Smart Intelligent Computing and Applications, Volume 1, Smart Innovation, Systems and Technologies 282, https://doi.org/10.1007/978-981-16-9669-5_2
17
18
G. Srivani and S. Badugu
log audits. It is most commonly deployed on critical hosts as publicly accessible servers and servers containing sensitive information.
2.1.1 Intrusion Detection Approaches Intrusion detection methods generally classified in following categories: Misuse-based or signature-based IDs Signature-based detection can be used by IDS, relying on known traffic data to analyze potentially unwanted traffic. Anomalybased IDs that looks at network traffic patterns and detects whether data is valid, incorrect or any abnormality. It is mainly used for detecting unrelated traffic that is not specifically known and is called anomaly-based detection IDS [6], and it will detect that an Internet Protocol (IP) packet is malformed.
2.2 Related Work Based on Intrusion Detection System This section provides an account of previous studies on feature selection methods in general as well as intrusion detection system using ML and DL techniques. Buczak et al. [7] discussed about “A survey of data mining and Machine learning methods for cyber security intrusion.” This paper focuses on network flow (NetFlow) data and points out that the packet processing may not be possible at the streaming speeds due to the amount of traffic. Teodoro et al. [8] focus on anomaly-based network intrusion techniques. The authors present statistical, knowledge-based and machine learning approaches, but their study does not present a full set of state-of-the-art machine learning methods. This chronicle focused survey of machine learning (ML) and data mining (DM) methods for cyber analytics in support of intrusion detection. Ahsan et al. [9] discussed a paper “A Novel 2-stage deep learning model for efficient network intrusion detection” and developed a novel random neural network (RNN) for real-time cognitive system design requirements. They integrated the RNN with a genetic algorithm to achieve real-time decision-making. This paper discusses a novel two-stage deep learning (TSDL) model, first stage termed initial decision stage to allocate normal or abnormal status of network patterns and second detecting which type of attack or normal attack using KDD99 and UNSW-NB15 datasets. Jingping [10] introduced “Anomaly-Based network intrusion detection using SVM” and did an intensive research on the application of SVM in network intrusion detection and evaluated the importance of each attribute in Transmission Control Protocol (TCP) session to SVM classification based on KDD. This paper proposes an anomaly-based SVM detection scheme by extracting and optimizing the training features. It trains the SVM with Kullback–Leibler (KL) divergence and cross-correlation calculated by the control and data planes traffic. The best OSR is (99.47%). When taking both OSR and TPR into consideration, which results in a lower OSR (99.39%) than 99.47%. Hinton [11] proposed a paper “A Deep learning method with filter based feature engineering
2 Intrusion Detection Using Support Vector Machine and Artificial …
19
for wireless intrusion detection system”, and it is an advanced sub-field of ML that simplifies the modeling of various complex concepts and relationships using multiple levels of representation [12]. DL has achieved a great amount of success in fields such as language identification, image processing and pharmaceutical research [13, 14]. This paper deals with deep learning (DL) using feedforward deep neural network (FFDNN) using NSL-KDD dataset, and they are compared with different algorithms techniques and prove FFDNN-IDS has achieved greater accuracy.
2.3 Proposed System In this chapter, we will see the introduction of the system architecture that provides an overview of entire system architecture modules (Figs. 2.1 and 2.2). Fig. 2.1 Step-by-step approach of proposed system
20
G. Srivani and S. Badugu
Fig. 2.2 Flowchart of Intrusion detection system
2.3.1 Dataset Description Dataset I collected for the proposed system is NSL-KDD [15–17] dataset which is a benchmark of KDDcup99 dataset. Tavallaee et al. proposed NSL-KDD an improved version of the KDD99 [18]. I extracted this dataset collection from the Kaggle.com repository. When we deep look into my dataset which is NSL-KDD dataset which consists of 42 attributes which related to network patterns, out of which all 42 features values are not numerical. The attributes named (1) protocol_type, (2) service, (3) flag are non-numeric. For such that these attributes values can be converted into
Table 2.1 Description of the NSL-KDD dataset

Dataset name | Number of attributes | Number of records | Number of classes
NSL-KDD      | 42                   | 25,192            | 2 (normal or anomaly)

Table 2.2 Distribution of the class attribute

S. no. | Class name | Total number of records | Percentage
1      | Normal     | 13,449                  | 53.4
2      | Anomaly    | 11,743                  | 46.6
Since the machine learning models used in this approach cannot deal with non-numeric values directly, these three attributes are converted to numerical form during preprocessing.
2.4 Implementation
This section describes the overall structure of the implementation of our approach, the NSL-KDD training and testing data required to build the proposed system, and an overview of the working process.
2.4.1 Dataset
The NSL-KDD dataset contains 42 features, including the class label, which is categorical (either "normal" or "anomaly"). Of the remaining attributes, three are non-numeric and 38 are numeric. The three non-numeric attributes are "protocol_type" with 3 values, "service" with 63 values and "flag" with 11 values. These three attributes undergo preprocessing because the machine learning models used here cannot take categorical data directly. Tables 2.1 and 2.2 describe the values of the attributes that undergo preprocessing.
2.4.2 Label Encoding
The non-numeric attributes (f2 "protocol_type", f3 "service", f4 "flag") are label encoded, i.e., their text labels are converted into numeric, machine-readable form. Label encoding maps each distinct value of a non-numeric column to a number; for example, the values of "protocol_type" are mapped to 0, 1 and 2 as shown in Tables 2.3 and 2.4.
Table 2.3 Values of the attribute protocol_type

Protocol_type: TCP, UDP, ICMP

Table 2.4 Conversion of the categorical attribute to numerical values using label encoding

Protocol_type (text) | Protocol_type (numerical)
ICMP                 | 0
TCP                  | 1
UDP                  | 2
Label encoding thus converts the non-numeric columns to numbers by assigning an integer to each value inside the attribute, as shown above. In the same way, the string values of the f3 "service" attribute (1–63) can also be converted into numerical values using the label encoding technique.
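As an illustration of this step, a minimal scikit-learn sketch is given below; the file name and column names are assumptions based on the attribute names described above, not the authors' exact code.

```python
# Hypothetical sketch: label-encoding the three non-numeric NSL-KDD attributes.
import pandas as pd
from sklearn.preprocessing import LabelEncoder

df = pd.read_csv("nsl_kdd_train.csv")          # assumed file name
for col in ["protocol_type", "service", "flag"]:
    df[col] = LabelEncoder().fit_transform(df[col])
# Values are assigned in sorted order, e.g. icmp -> 0, tcp -> 1, udp -> 2 (Table 2.4)
```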
2.4.3 Data Normalization Using the Min–Max Approach
It is important to transform and normalize the features because, looking closely at the values of each attribute, they are very unevenly distributed. To bring all attribute values onto a uniform scale, we apply normalization; here we use the min–max normalization technique, so that all values fall into the range (0, 1). Table 2.5 shows the value ranges of representative features before and after normalization.
Table 2.5 Feature value ranges before and after min–max normalization

S. no. | Feature name       | Range before (train and test) | After
1      | Duration           | 0–54,451                      | (0, 1)
2      | Src bytes          | 0–1,379,963,888               | (0, 1)
3      | Dst bytes          | 0–309,937,401                 | (0, 1)
4      | Hot                | 0–101                         | (0, 1)
5      | Num compromised    | 0–7479                        | (0, 1)
6      | Num root           | 0–7468                        | (0, 1)
7      | Num file creations | 0–100                         | (0, 1)
8      | Count              | 0–511                         | (0, 1)
9      | Srv count          | 0–511                         | (0, 1)
10     | Dst host count     | 0–255                         | (0, 1)
11     | Dst host srv count | 0–255                         | (0, 1)
v' = ((v − min_A) / (max_A − min_A)) × (new_max_A − new_min_A) + new_min_A    (2.1)
where v is the original attribute value and v' is the new value in the specified range [new_min_A, new_max_A]. The benefit of this technique is that all values are confined within a fixed range.
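A minimal sketch of this step with scikit-learn's MinMaxScaler is given below; the list of numeric columns is illustrative, not the full set of 38 numeric attributes.

```python
# Hedged sketch: scaling the numeric NSL-KDD features into [0, 1], as in Eq. (2.1).
from sklearn.preprocessing import MinMaxScaler

numeric_cols = ["duration", "src_bytes", "dst_bytes", "hot", "count"]  # illustrative subset
df[numeric_cols] = MinMaxScaler(feature_range=(0, 1)).fit_transform(df[numeric_cols])
```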
2.4.4 Dimensionality Reduction Using Principal Component Analysis
The flowchart in Fig. 2.3 describes the steps involved in principal component analysis for finding the correlation between the attributes. After dimensionality reduction, the number of features is reduced from 42 to 18 (Table 2.6).
Fig. 2.3 Flowchart of the principal component analysis
Table 2.6 Dataset description before and after PCA

            | Number of features | Number of records
Without PCA | 42                 | 25,192
With PCA    | 18                 | 25,192
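As a rough illustration of the reduction described in this section, a hedged scikit-learn sketch is shown below; the label column name is an assumption.

```python
# Hedged sketch: PCA reduction of the encoded, normalized features to 18 components.
from sklearn.decomposition import PCA

X = df.drop(columns=["class"])   # assumed name of the normal/anomaly label column
y = df["class"]
X_pca = PCA(n_components=18).fit_transform(X)
print(X_pca.shape)               # expected: (25192, 18), as in Table 2.6
```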
2.4.5 Splitting the Training Data and Testing Data
The data is normally split between training and test sets in ratios of 80–20%, 70–30% or 60–40%. We start by installing the required packages and then import them in the program: pandas is used for importing the dataset and sklearn provides the train_test_split() function.
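A minimal sketch of producing the splits listed in Table 2.7 is given below; it assumes the pandas/scikit-learn objects introduced in the earlier sketches.

```python
# Hedged sketch: the 80:20, 70:30 and 60:40 train-test splits of Table 2.7.
from sklearn.model_selection import train_test_split

for test_size in (0.2, 0.3, 0.4):
    X_train, X_test, y_train, y_test = train_test_split(
        X_pca, y, test_size=test_size, random_state=42)
    print(test_size, X_train.shape[0], X_test.shape[0])
```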
2.4.6 Support Vector Machine
In machine learning, support vector machines (SVMs, also called support vector networks) [19–21] are supervised learning models that analyze data for binary classification.
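A minimal sketch of the linear-kernel SVM classifier used here is shown below; hyperparameters other than the kernel are left at scikit-learn defaults and are assumptions.

```python
# Hedged sketch: linear-kernel SVM for the binary normal/anomaly decision.
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

svm = SVC(kernel="linear").fit(X_train, y_train)
print("SVM test accuracy:", accuracy_score(y_test, svm.predict(X_test)))
```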
2.5 Results
In this section, we evaluate the intrusion detection accuracy of the proposed approach using the support vector machine, logistic regression and an artificial neural network. We also report the training and test accuracy of the three machine learning models with and without PCA. The final results are obtained by comparing the three models and are presented visually through graphs, tables and figures with detailed descriptions.
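The chapter does not spell out the architecture of the feedforward network evaluated below, so the following Keras sketch is only a hedged illustration: a single hidden layer is assumed, with the ReLU and sigmoid activations named in the conclusion, and the label is assumed encoded as 0/1.

```python
# Hedged sketch of a feedforward ANN (architecture assumed, not the authors' exact network).
from tensorflow import keras

ann = keras.Sequential([
    keras.layers.Dense(32, activation="relu", input_shape=(X_train.shape[1],)),
    keras.layers.Dense(1, activation="sigmoid"),
])
ann.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
ann.fit(X_train, y_train, epochs=100, batch_size=64,
        validation_data=(X_test, y_test))       # epochs 100/150/500 as in Tables 2.10-2.11
```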
2.5.1 Performance Metrics of Machine Learning
A good intrusion detection system has a high accuracy and a high detection rate.

True positive rate (TPR) = TP / (TP + FN)    (2.2)
Precision = TP / (TP + FP)    (2.3)
Recall = TP / (TP + FN)    (2.4)
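These metrics can be computed directly with scikit-learn, as in the hedged sketch below; the anomaly class is assumed to be encoded as 1.

```python
# Hedged sketch: computing Eqs. (2.2)-(2.4) for the trained SVM.
from sklearn.metrics import accuracy_score, precision_score, recall_score

y_pred = svm.predict(X_test)
print("Accuracy :", accuracy_score(y_test, y_pred))
print("Precision:", precision_score(y_test, y_pred, pos_label=1))
print("Recall   :", recall_score(y_test, y_pred, pos_label=1))
```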
Table 2.7 Data divided into training and test sets using different ratios

S. no. | Ratio | Training data | Testing data
1      | 80:20 | 20,153        | 5039
2      | 70:30 | 17,634        | 7558
3      | 60:40 | 15,115        | 10,077
2.5.2 Performance Evaluations See Table 2.7.
2.5.3 Tables and Graphical Representation of Results
We built the SVM with a linear kernel for 80% training/20% testing, 70%/30% and 60%/40% splits, both with and without PCA. The results are shown in Tables 2.8, 2.9, 2.10, 2.11, 2.12 and 2.13. Table 2.8 shows the accuracy of the support vector machine with and without PCA for the split sizes 20,153:5039, 17,634:7558 and 15,115:10,077.
Observations: Graph 2.1 compares the performance accuracy achieved by the support vector machine with and without PCA. The X-axis represents the test size of the SVM model and the Y-axis the accuracy score. The graph gives a detailed visual representation of the proposed algorithm with and without the PCA dimensionality reduction for each splitting ratio (20,153:5039, 17,634:7558, 15,115:10,077).
Table 2.9 shows the accuracy of logistic regression with and without PCA for the same split sizes.
Table 2.8 Test accuracy of the support vector machine with and without PCA

Test size            | SVM accuracy without PCA | SVM accuracy with PCA
0.2 (5039 records)   | 0.957                    | 0.958
0.3 (7558 records)   | 0.953                    | 0.956
0.4 (10,077 records) | 0.953                    | 0.956

Table 2.9 Test accuracy of logistic regression with and without PCA

Test size            | LR without PCA | LR with PCA
0.2 (5039 records)   | 0.9526         | 0.9593
0.3 (7558 records)   | 0.9516         | 0.9561
0.4 (10,077 records) | 0.9525         | 0.9559
Table 2.10 Accuracy of the artificial neural network without PCA for each splitting ratio and number of epochs

Epochs | Test size | Training accuracy | Testing accuracy
100    | 0.2       | 0.9940            | 0.9897
100    | 0.3       | 0.9916            | 0.9873
100    | 0.4       | 0.9942            | 0.9912
150    | 0.2       | 0.9943            | 0.9938
150    | 0.3       | 0.9954            | 0.9930
150    | 0.4       | 0.9951            | 0.9907
500    | 0.2       | 0.9964            | 0.9911
500    | 0.3       | 0.9964            | 0.9911
500    | 0.4       | 0.9965            | 0.9920
Table 2.11 Accuracy of the artificial neural network with PCA for each splitting ratio and number of epochs

Epochs | Test size | Training accuracy | Testing accuracy
100    | 0.2       | 0.9889            | 0.9879
100    | 0.3       | 0.9881            | 0.9877
100    | 0.4       | 0.9895            | 0.9859
150    | 0.2       | 0.9876            | 0.9863
150    | 0.3       | 0.9904            | 0.9901
150    | 0.4       | 0.9880            | 0.9849
500    | 0.2       | 0.9936            | 0.9897
500    | 0.3       | 0.9929            | 0.9889
500    | 0.4       | 0.9941            | 0.9892
Table 2.12 Accuracy of SVM, logistic regression and ANN with PCA for each splitting ratio (epochs 100, 150, 500)

SVM    | LR     | ANN training accuracy | ANN testing accuracy
0.9577 | 0.9526 | 0.9876                | 0.9863
0.9532 | 0.9516 | 0.9904                | 0.9901
0.953  | 0.9525 | 0.9880                | 0.9849
Table 2.13 Accuracy of SVM, logistic regression and ANN without PCA

SVM    | LR     | ANN training accuracy | ANN testing accuracy
0.9585 | 0.9585 | 0.9959                | 0.9911
0.9566 | 0.9561 | 0.9964                | 0.9929
0.9564 | 0.9559 | 0.9965                | 0.9920
Graph 2.1 Accuracy of the support vector machine with and without PCA (X-axis: test size 0.2, 0.3, 0.4; Y-axis: accuracy)
Observations: Graph 2.2 compares the performance accuracy achieved by logistic regression with and without PCA. The X-axis represents the test size and the Y-axis the accuracy score of the logistic regression model. The graph gives a detailed visual representation of the algorithm with and without PCA for each splitting ratio (20,153:5039, 17,634:7558, 15,115:10,077). Table 2.10 shows that the artificial neural network reaches an accuracy of about 99%, both with and without PCA, for every splitting ratio (20,153:5039, 17,634:7558, 15,115:10,077) and for 100, 150 and 500 epochs; furthermore, changing the split has little effect on the performance.
Graph 2.2 Accuracy of logistic regression with and without PCA (X-axis: test size; Y-axis: accuracy)
Graph 2.3 Accuracy of the neural network with and without PCA (X-axis: epochs 100, 150, 500; Y-axis: accuracy)
Observations: Graph 2.3 compares the performance accuracy achieved by the neural network with and without PCA. The X-axis represents the number of epochs of the neural network model and the Y-axis the accuracy score. The graph gives a detailed visual representation of the artificial neural network with and without PCA for each splitting ratio (20,153:5039, 17,634:7558, 15,115:10,077) and for 100, 150 and 500 epochs.
Observations: Graph 2.4 compares the performance accuracy achieved by SVM, logistic regression and the ANN with PCA. The X-axis represents the test size of the models and the Y-axis the overall accuracy score. The graph gives a detailed visual representation of the artificial neural network, SVM and logistic regression with PCA for each splitting ratio (20,153:5039, 17,634:7558, 15,115:10,077) and for 100, 150 and 500 epochs.
Observations: Graph 2.5 compares the performance accuracy achieved by SVM, logistic regression and the ANN without PCA. The X-axis represents the test size of the models and the Y-axis the overall accuracy score. The graph gives a detailed visual representation of the artificial neural network, SVM and logistic regression without PCA for each splitting ratio (20,153:5039, 17,634:7558, 15,115:10,077) and for 100, 150 and 500 epochs.
Graph 2.4 Accuracy of SVM, logistic regression and ANN with PCA for each splitting ratio (epochs 100, 150, 500)
Graph 2.5 Accuracy of SVM, logistic regression and ANN without PCA
2.6 Conclusion and Future Work
We have successfully implemented and tested three machine learning models. In this research work, we proposed intrusion detection using three machine learning algorithms, support vector machine, artificial neural network and logistic regression, on the NSL-KDD dataset of 25,192 records. The proposed method detects the normal and anomalous patterns present in the NSL-KDD dataset quickly and efficiently. The support vector machine with a linear kernel gives 95% accuracy with and without PCA, and logistic regression also gives 95% accuracy with and without PCA,
while the feedforward artificial neural network, using sigmoid and ReLU as activation functions at 100, 150 and 500 epochs, gives 99% accuracy without PCA (42 attributes) and 99% accuracy with PCA (18 attributes) for every splitting ratio. Finally, we conclude that, of the three algorithms, the artificial neural network gives the best accuracy of 99% both with and without PCA.
Future Work: In the future, we will probably see more and more forensic teams involved in cyber incidents performing in-depth analysis of events suspected to be intrusions. In addition, AI algorithms will evolve to help security products continuously learn attacks and their behaviors, make connections between suspicious events and predict the future evolution of an attack, which would help to reduce the false alarm rate in intrusion detection systems.
References 1. Sperotto, A., Schaffrath, G., Sadre, R., Morariu, C., Pras, A., Stiller, B.: An overview of IP flow-based intrusion detection. IEEE Commun. Surv. Tutorials 12(3), 343–356 (2010) 2. Garcia-Teodoro, P., Diaz-Verdejo, J., Maciá-Fernández, G., Vázquez, E.: Anomaly-based network intrusion detection: techniques, systems and challenges. Comput. Secur. 28(1), 18–28 (2009) 3. Ahsan, A., Larijani, H., Ahmadinia, A.: Random neural network based novel decision making framework for optimized and autonomous power control in LTE uplink system. Phys. Commun. 19, 106–117 (2016) 4. Jingping, J., Kehua, C., Jia, C., Dengwen, Z., Wei, M.: Detection and recognition of atomic evasions against network intrusion detection/prevention systems. IEEE Access 7, 87816–87826 (2019) 5. LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436–444 (2015) 6. Shone, N., Ngoc, T.N., Phai, V.D., Shi, Q.: A deep learning approach to network intrusion detection. IEEE Trans. Emerg. Topics Comput. Intell. 2, 41–50 (2018) 7. Buczak, Anna L., and Erhan Guven. A survey of data mining and machine learning methods for cyber security intrusion detection. IEEE Communications surveys & tutorials 18, no. 2: 1153-1176 (2015) 8. Agatonovic-Kustrin, S., Beresford, R.: Basic concepts of artificial neural network (ANN) modeling and its application in pharmaceutical research. J. Pharm. Biomed. Anal. 22(5), 717–727 (2000) 9. Tavallaee, M., Bagheri, E., Lu, W., Ghorbani, A.A.: A detailed analysis of the KDD CUP 99 data set. In: IEEE Symposium on Computational Intelligence for Security and Defense Applications, CISDA, pp. 1–6. IEEE (2009) 10. Torkaman, A., Javadzadeh, G., Bahrololum, M.: A hybrid intelligent hids model using two-layer genetic algorithm and neural network. In: 2013 5th Conference on Information and Knowledge Technology (IKT), pp. 92–96. IEEE (2013) 11. Vasilomanolakis, E., Karuppayah, S., Muhlhauser, M., Fischer, M.: Taxonomy and survey of collaborative intrusion detection. ACM Comput. Surv. (CSUR) 47(4), 55 (2015) 12. Puzis, R., Klippel, M.D., Elovici, Y., Dolev, S.: Optimization of NIDS placement for protection of intercommunicating critical infrastructures. In: Intelligence and Security Informatics, pp. 191–203. Springer (2008)
13. Tan, Z., Jamdagni, A., He, X., Nanda, P., Liu, R.P.: A system for denial-of-service attack detection based on multivariate correlation analysis. IEEE Trans. Parallel Distrib. Syst. 25(2), 447–456 (2014) 14. Mishra, P., Varadharajan, V., Tupakula, U., Pilli, E.S.: A detailed investigation and analysis of using machine learning techniques for intrusion detection. IEEE Commun. Surv. Tutorials (2018) 15. Yin, C., Zhu, Y., Fei, J., He, X.: A deep learning approach for intrusion detection using recurrent neural networks. IEEE Access 5, 21954–21961 (2017) 16. Javaid, A., Niyaz, Q., Sun, W., Alam, M.: A deep learning approach for network intrusion detection system. In: Proceedings of the 9th EAI International Conference on Bio-inspired Information and Communications Technologies (formerly BIONETICS), pp. 21–26. ICST (Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering) (2016) 17. Kim, J., Kim, J., Thu, H.L.T., Kim, H.: Long short term memory recurrent neural network classifier for intrusion detection. In: 2016 International Conference on Platform Technology and Service (PlatCon), pp. 1–5. IEEE (2016) 18. Kayaalp, M., Schmitt, T., Nomani, J., Ponomarev, D., Ghazaleh, N.A.: Signature-based protection from code reuse attacks. IEEE Trans. Comput. 64(2), 533–546 (2015) 19. Kuang, F., Xu, W., Zhang, S.: A novel hybrid KPCA and SVM with GA model for intrusion detection. Appl. Soft Comput. J. 18, 178–184 (2014) 20. Fatima, S., Srinivasu, B.: Text Document categorization using support vector machine. Int. Res. J. Eng. Technol. (IRJET) 4(2), 141–147 (2017) 21. Badugu, S., Kolikipogu, R.: Supervised machine learning approach for identification of malicious URLs. In: International Conference on Advances in Computational Intelligence and Informatics, Springer, Singapore (2019)
Chapter 3
Development of Different Word Vectors and Testing Using Text Classification Algorithms for Telugu Guna Santhoshi and Srinivasu Badugu
Abstract Word embedding methods represent words numerically. Text data cannot be processed directly by machine learning or deep learning algorithms, which handle numerical data efficiently, so word embedding techniques are needed to transform the text into numerical form. One hot encoding vectors of real-valued numbers are simple and easy to generate, and researchers now widely use Word2vec for the semantic representation of words. In the literature review, we found that there are fewer tools and resources available for Indian languages than for European languages. We therefore construct word embeddings (vectors) using one hot encoding and the Word2vec strategy, and in this paper we evaluate these vectors using supervised machine learning algorithms for sentiment classification. We follow a two-step approach: the first step is to generate a vocabulary from a news corpus and to create word vectors using various word embedding methods; the second step is to validate the vector quality using machine learning algorithms. After preprocessing the collected corpus, we obtained 178,210 types and 929,594 tokens, so the size of our vocabulary is 178,210 unique words. We used a labeled corpus, i.e., movie review sentences, together with this vocabulary to develop sentence vectors. Using the one hot encoding and Word2vec vector models, we translated sentences into vectors; once the labeled sentences were translated into vectors, three machine learning algorithms were trained and evaluated, and we finally compared the outcomes.
3.1 Introduction
Word embedding is a learned representation of text in which words with similar meanings have similar representations.
G. Santhoshi, G. Narayanamma Institute of Technology and Science (for Women), Hyderabad, Telangana, India, e-mail: [email protected]
S. Badugu (B), Stanley College of Engineering and Technology for Women, Hyderabad, Telangana, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022; V. Bhateja et al. (eds.), Smart Intelligent Computing and Applications, Volume 1, Smart Innovation, Systems and Technologies 282, https://doi.org/10.1007/978-981-16-9669-5_3
It is one of the approaches to representing words and documents and may be considered one of the key breakthroughs of deep learning for challenging NLP problems [1]. To process natural language text with machine learning we must convert the text into numerical form, because machine learning algorithms are incapable of processing raw text and require numbers as input to perform any sort of job; word embedding techniques provide this conversion. Word embeddings are currently considered one of the successful applications of unsupervised learning: they do not require any annotated corpora, and they use a lower-dimensional space while preserving semantic relationships [2]. The idea of this paper is to build and test different word embedding vectors with different machine learning (ML) algorithms; we test two embedding vectors with three machine learning algorithms for the Telugu language. For the word embeddings, we focus on how to design and develop the word vector model and apply it to an Indian language. We build word embeddings (vectors) using one hot encoding and the Word2vec approach, and then develop and test machine learning models for labeled sentences using these pre-built word vector models. Telugu is a Dravidian language spoken in the Indian states of Andhra Pradesh and Telangana and in the union territory of Puducherry (Yanam) by the Telugu people. It stands alongside Hindi and Bengali as one of the few languages with primary official language status in more than one Indian state [3]. There are also significant Telugu-speaking linguistic minorities in the states of Orissa, Karnataka, Tamil Nadu, Kerala and Maharashtra, and it is one of six languages designated a classical language of India by the country's government [4]. The Telugu script is an abugida consisting of 60 symbols: 16 vowels, 3 vowel modifiers and 41 consonants. Telugu is a morphologically complex language [5]. The main problem with Telugu is that, being a regional language, labeled training data is hard to find.
3.2 Literature Survey
Some of the papers we referred to are summarized below. Aditi Chaudhary et al. proposed a framework for adapting word embeddings to new languages with morphological and phonological subword representations. They addressed a different task, focusing instead on the similarity of the surface forms, phonology or morphology of the two transfer languages [6], and explored two simple methods for cross-lingual transfer, both task independent, which use transfer learning to leverage subword information from resource-rich languages, especially through phonological and morphological representations. CT-Joint [7] and CT-FineTune [8] do not require morphological analyzers, but they found that even a morphological analyzer built in 2–3 weeks can boost performance. On word embeddings for morphologically complex languages, Bengio et al. [9] proposed the feedforward neural net language model (NNLM), which uses word vectors as its parameters. The network itself models the language: when fed with N words (where N is a fixed, chosen number), it produces a probability distribution over all words of the language, giving for each word the probability of appearing
after the N given words. Part of the neural network is a shared matrix of word vectors, and the neural net language model consists of input, projection, hidden and output layers. Mikolov et al. [2, 10] implemented two models in Word2Vec, the continuous bag-of-words model (CBOW) and the continuous skip-gram model; both are based on the neural net language model and consist of input, projection and output layers. In "Tag Me a Label with Multi-arm: Active Learning for Telugu Sentiment Analysis", Sricharan et al. [11] proposed research on Telugu sentiment analysis focused on (i) analysis of resource-poor languages and their labeling techniques, (ii) active learning and different query selection strategies, and (iii) sentiment analysis for regional languages. In "Bengali word embeddings in solving document classification problem", word vectors, or so-called distributed representations, which have a long history starting perhaps from the work in [12], were generated from the last five years of Bengali newspaper data using the neural-network-based language processing model Word2vec, and the vector representations of Bengali words were used. The authors observed that, without any preprocessing steps, they obtained efficient results by taking a large data set, and in future the approach will be useful for other classification problems such as POS tagging, NER and sentiment analysis in Bengali. In "Sentiment analysis of citations using Word2vec", Mikolov et al. [2] introduced the Word2vec technique, which obtains word vectors by training on a text corpus. The idea of Word2vec (word embeddings) originated from the concept of the distributed representation of words [13], and a common method to derive the vectors is the neural probabilistic language model [9]. Word embeddings proved to be an effective representation for sentiment analysis and text classification. Sadeghian and Sharafat [14] extended word embeddings to sentence embeddings by averaging the word vectors in a sentiment review statement; their results showed that word embeddings outperformed the bag-of-words model in sentiment classification. The authors reported citation sentiment classification results based on word embeddings: the binary classification shows that Word2vec is a promising tool for distinguishing positive and negative citations, although hand-crafted features performed better for the overall classification. On Word2Vec using character n-grams, Bojanowski et al. [15] learned character n-gram representations to supplement word vector accuracy for five different languages and to maintain the relation between words based on their structure. Alexandrescu and Kirchhoff [16] introduced factored neural language models, where words are represented as sets of features. Evaluation results indicate that the n-gram-enhanced skip-gram models performed slightly better than the regular skip-gram in the word similarity and word analogy tasks; however, it is worth noting that the skip-gram results are lower than the benchmark reported by Bojanowski et al. [15].
Fig. 3.1 Proposed system architecture
3.3 Proposed System Architecture
The proposed system architecture covers the corpus, preprocessing, vocabulary selection, word vectors, and training and testing of the vectors with machine learning algorithms. First, the corpus is gathered from an online newspaper; then preprocessing steps such as removing unnecessary punctuation marks are applied using the Unicode representations of the noise characters. After cleaning, the corpus yields the word frequencies and tokens. The vocabulary is selected based on the minimum and maximum frequency of occurrence of the words. We use labeled sentences, i.e., movie reviews, for sentiment classification. One hot encoding vectors are developed from the labeled sentences and the selected vocabulary, and Word2vec vectors are developed from the labeled sentences and the token sequences. After the vectors are developed, they are split into training and testing parts; we build the model using the fit function and test it using the predict function. In this paper, we do not use stemming or lemmatization for morphology (Fig. 3.1).
3.4 Implementation
The project was implemented in Python 3.7, which can be downloaded online, and uses the Word2vec model.
3.4.1 Pseudo Code for Corpus Preprocessing

Pseudo code for the proposed word embedding verification model:

INPUT: corpus D, labeled sentences S
OUTPUT: vocabulary V, word vectors WV, sentence vectors SV, accuracy A

// Preprocessing
1. For each document Di in D:
       D'i <- tokenize and clean Di
   V <- vocabulary selected from all D'i
2. For each sentence Si in S:
       S'i <- tokenize and clean Si

// One-hot-encoding vectors
3. For each S'i, build its sentence vector vi over the vocabulary V.

// Word2vec model
4. WV <- Word2vec(V, size, type)

// Sentence vectors from word vectors
5. For each sentence S'i in S:
       SVi <- 0
       For each word w in S'i:
           if w exists in WV then SVi <- SVi + WV[w]
3.5 Results
Our first contribution is the collected Telugu corpus of 4517 text files, containing 179,765 tokens in 6.46 MB (6,782,479 bytes), gathered from an online newspaper. In preprocessing, without using any external tool, we cleaned the corpus by removing noisy characters, identified through their Unicode representations, such as commas, full stops, colons, exclamation marks, left and right parentheses, quotation marks, extra spaces and apostrophes. Our second contribution is the vectorization part: for Telugu text no off-the-shelf tools are available for generating vectors, so this must be done manually. For example, to create a sentence vector we add all of its word vectors, which yields vectors for the sentences. For text classification we take the labeled sentences and assign each sentence a polarity: label 1 for positive sentences, label 2 for negative sentences and label 0 for neutral sentences.
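As an illustration of the manual sentence-vector construction described above, a minimal hedged sketch is given below; the variable names are assumptions, not the authors' code.

```python
# Hedged sketch: bag-of-words (summed one-hot) sentence vector over the
# selected vocabulary; 'vocabulary' is assumed to be the word list built earlier.
import numpy as np

vocab_index = {word: i for i, word in enumerate(vocabulary)}

def sentence_vector(tokens, vocab_index):
    vec = np.zeros(len(vocab_index))
    for word in tokens:
        if word in vocab_index:          # out-of-vocabulary words are skipped
            vec[vocab_index[word]] += 1.0
    return vec
```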
3.5.1 Testing the Classification Accuracy Using One Hot Encoding Vectors
Here we test the classification accuracy using one hot encoding vectors. Out of 1081 sentences, 756 are used for training and 325 for testing. The one hot encoding model with the naive Bayes algorithm gives the best accuracy rate compared to the support vector machine and logistic regression. Table 3.1 shows the confusion matrix, i.e., the number of instances of each actual class against the predicted class; it is used to calculate the precision, recall, F1-score and accuracy of the classification algorithms. Table 3.2 reports observations of the one hot encoding model with the machine learning algorithms: sentence vectors are generated from the vocabulary and the quality of the vectors is tested with the machine learning algorithms. In one hot encoding we concentrate mainly on the index values; words with the maximum and minimum frequencies of occurrence are ignored, and words with an average frequency of occurrence are selected by applying a frequency range.
Table 3.1 Confusion matrix for the support vector machine (rows: actual class, columns: predicted class)

Actual \ Predicted | Class 0 | Class 1 | Class 2 | Total
Class 0            | 59      | 19      | 18      | 96
Class 1            | 29      | 94      | 23      | 146
Class 2            | 29      | 20      | 34      | 83
Total              | 117     | 133     | 75      | 325
Table 3.2 Observations of the one hot encoding vectors with machine learning algorithms

S. no. | Length of vocabulary | Naive Bayes accuracy (%) | Logistic regression accuracy (%) | SVM accuracy (%)
1      | 12,771               | 61                       | 58                               | 50
2      | 12,806               | 60                       | 58                               | 51
3      | 18,040               | 61                       | 60                               | 50
4      | 24,994               | 60                       | 58                               | 53
5      | 25,029               | 58                       | 59                               | 54
6      | 42,388               | 62                       | 60                               | 55
7      | 42,423               | 61                       | 60                               | 57
8      | 178,173              | 63                       | 62                               | 58
9      | 178,208              | 62                       | 62                               | 59
After the vocabulary is selected, the vectors are trained and tested with the classification algorithms. Among these classification algorithms, naive Bayes gives the best accuracy rate compared to logistic regression and the support vector machine (Fig. 3.2).
Fig. 3.2 Graph of one hot encoding vector with LR, NB and SVM test-size as 0.3 (test data = 325, train data = 756 out of 1081)
3.5.2 Testing the Classification Accuracy Using Word2vec Vectors
To test the classification accuracy using Word2vec vectors, we set vector size = 100, window size = 3, 5, 7 or 9, and sg = 0 or 1 (sg = 0 corresponds to the continuous bag-of-words model and sg = 1 to the skip-gram model). With these settings we test the word similarity as well as the accuracy of the machine learning algorithms. We take the 1081 Telugu sentences from the movie review data set: with a test size of 0.4 the test data contains 433 sentences and the training data 648, with 0.3 the split is 325/756, and with 0.2 it is 217/864. Of the two machine learning algorithms compared, logistic regression gives the best accuracy rate. Table 3.3 reports the observations for skip-gram with the machine learning algorithms; the purpose is to obtain context word vectors with skip-gram and, for fixed window and test sizes, to test the vector quality with the classification algorithms. We observe that the larger the window size, the higher the accuracy; window size 9 with test sizes 0.4 and 0.3 gives the best accuracy (Figs. 3.3, 3.4 and 3.5). Table 3.4 reports the observations for CBOW, which concentrates on predicting the target word from the source context words. Here too, the larger the window size, the higher the accuracy; window sizes 5, 7 and 9 with test sizes 0.4, 0.3 and 0.2 give the best accuracy (Figs. 3.6, 3.7 and 3.8).
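A minimal hedged sketch of training such models with gensim is shown below; the corpus variable is an assumption, and in gensim versions before 4.0 the parameter is named size rather than vector_size.

```python
# Hedged sketch: Word2vec training with the settings described above.
import numpy as np
from gensim.models import Word2Vec

# tokenized_corpus: list of token lists built from the Telugu news corpus (assumed name)
model = Word2Vec(sentences=tokenized_corpus, vector_size=100, window=5,
                 sg=1, min_count=1)          # sg=1 -> skip-gram, sg=0 -> CBOW

def w2v_sentence_vector(tokens, model):
    # sentence vector as the sum of its word vectors
    vecs = [model.wv[w] for w in tokens if w in model.wv]
    return np.sum(vecs, axis=0) if vecs else np.zeros(model.vector_size)
```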
Table 3.3 Observations of skip-gram with ML algorithms

S. no. | Window size | Test size | Logistic regression accuracy (%) | SVM accuracy (%)
1      | 3           | 0.4       | 50.00                            | 43.00
2      | 5           | 0.4       | 53.00                            | 45.00
3      | 7           | 0.4       | 53.00                            | 47.00
4      | 9           | 0.4       | 56.00                            | 51.00
5      | 3           | 0.3       | 51.00                            | 45.00
6      | 5           | 0.3       | 52.00                            | 45.00
7      | 7           | 0.3       | 53.00                            | 47.00
8      | 9           | 0.3       | 56.00                            | 49.00
9      | 3           | 0.2       | 49.00                            | 45.00
10     | 5           | 0.2       | 51.00                            | 41.00
11     | 7           | 0.2       | 54.00                            | 47.00
12     | 9           | 0.2       | 51.00                            | 47.00
Fig. 3.3 Graph of skip-gram with machine learning algorithms test-size as 0.4(test data = 433, train data = 648 out of 1081)
Fig. 3.4 Graph of skip-gram with machine learning algorithms with test-size as 0.3(test data = 325, train data = 756 out of 1081)
3.6 Conclusion
In one hot encoding vectors we concentrate mainly on index values, whereas in Word2vec vectors our focus is on context words. By comparing these two kinds of vectors with the
Fig. 3.5 Graph of skip-gram with machine learning algorithms with test-size as 0.2(test data = 217, train data = 864 out of 1081)
Table 3.4 Observations of CBOW with ML algorithms

S. no. | Window size | Test size | Logistic regression accuracy (%) | SVM accuracy (%)
1      | 3           | 0.4       | 46.00                            | 41.00
2      | 5           | 0.4       | 47.00                            | 44.00
3      | 7           | 0.4       | 45.00                            | 44.00
4      | 9           | 0.4       | 46.00                            | 45.00
5      | 3           | 0.3       | 43.00                            | 42.00
6      | 5           | 0.3       | 45.00                            | 43.00
7      | 7           | 0.3       | 47.00                            | 44.00
8      | 9           | 0.3       | 45.00                            | 44.00
9      | 3           | 0.2       | 41.00                            | 43.00
10     | 5           | 0.2       | 44.00                            | 45.00
11     | 7           | 0.2       | 44.00                            | 45.00
12     | 9           | 0.2       | 45.00                            | 43.00
machine learning algorithms, our observation is that for small content and a limited vocabulary, one hot encoding vectors are good, while for huge data (100 M) and dense vectors, Word2vec vectors are better; Word2vec also works at the word level. Word2vec gives the best results because it captures semantic relationships: Word2vec vectors are low-dimensional and dense, whereas one hot encoding vectors are high-dimensional and sparse. We did not use any stop-word removal, stemming or lemmatization techniques for morphological and phonological words. As future scope, from the perspective of technology and language, an Akshara-based model with LSTM and CNN can be used to test the performance of the word vectors along with the machine learning algorithms.
Fig. 3.6 Graph of CBOW with machine learning algorithms with test-size as 0.4(test data = 433, train data = 648 out of 1081)
Fig. 3.7 Graph of CBOW with machine learning algorithms with test-size as 0.3(test data = 325, train data = 756 out of 1081)
Fig. 3.8 Graph of CBOW with machine learning algorithms with test-size as 0.2(test data = 217, train data = 864 out of 1081)
References 1. Goldberg, Y.: Neural network methods for natural language processing. Synth. Lect. Human Lang. Technol. (2017) 2. Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013a) 3. Murthy, K.N., Badugu, S.: Roman transliteration of Indic scripts. In: 10th International Conference on Computer Applications, University of Computer Studies, Yangon, Myanmar (28–29 February 2012) (2012) 4. Gordon, R.: Ethnologue: Languages of the World, 15th edn. SIL International, Dallas, TX (2005) 5. Srinivasu, B., Manivannan, R.: Computational morphology for Telugu. J. Comput. Theor. Nanosci. 15(6–7), 2373–2378 (2018) 6. Chaudhary., Aditi., Chunting, Z., Lori, L., Graham, N., David, R., Mortensen., Jaime, G., Carbonell.: "Adapting word embeddings to new languages with morphological and phonological subword representations." arXiv preprint arXiv:1808.09500 (2018) 7. Duong, L.: Learning cross lingual word embeddings without bilingual corpora (2016) 8. Zoph, B.: Transfer learning for low resource neural machine translation (2016) 9. Bengio, Y., Ducharme, R., Vincent, P., Jauvin, C.: A neural probabilistic language model. J. Mach. Learn. Res. 3, 1137–1155 (2003) 10. Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems vol. 26, pp. 3111–3119 (2013) 11. Mukku., Sandeep, S., Radhika, M.: "Actsa: Annotated corpus for telugu sentiment analysis." In Proceedings of the First Workshop on Building Linguistically Generalizable NLP Systems. 54–58 (2017) 12. Collobert, R., et al.: Natural language processing. J. Mach. Learn. Res. (2011) 13. Weston, J., Ratle, F., Mobahi, H., Collobert, R.: Deep learning via semi-supervised embedding. In: Neural networks: Tricks of the Trade, pp. 639–655. Springer, Berlin, Heidelberg (2012) 14. Sadeghian., Amir., Ali, R.S.: "Bag of words meets bags of popcorn." (2015) 15. Bojanowski., Piotr., Edouard, G., Armand, J., Tomas, M.: "Enriching word vectors with subword information." Transactions of the association for computational linguistics. 5,135–146 (2017)
16. Alexandrescu., Andrei., Katrin, K.: "Factored neural language models." In Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers. pp. 1–4 (2006)
Chapter 4
Framework for Diabetic Retinopathy Classification Sravya Madala, Vani K. Suvarna, and Pranathi Jalapally
Abstract Diabetic retinopathy (DR) is a severe complication of the eyes found in diabetic patients. It is the main cause of loss of vision among them, and it progresses with the duration of the disease. The diagnosis is commonly made by an ophthalmologist, but the huge number of patients, especially in rural areas served by a limited number of ophthalmologists, means a large number of images must be screened and reviewed to properly diagnose the disease, and when the number of out-patients is high the doctor is inclined to spend less time on each patient. Automatic detection of diabetic retinopathy from very large collections of retinal images helps the ophthalmologist to treat affected patients and reduce the chances of vision loss, which motivates developing automated diagnosis software based on the latest deep learning techniques. We have developed a framework for classifying the features in the fundus images of the IDRiD, Messidor and Diaret_db0 data sets based on disease criticality. The GUI takes the retinal image data set as input, preprocesses the images and performs feature extraction. The extracted retinal features are helpful for the accurate diagnosis of DR; based on these features, we can classify the criticality level of the disease for each image.
4.1 Introduction
Diabetic retinopathy, also termed diabetic eye disease, is the medical condition in which the retina is affected by diabetes mellitus. DR is one of the side effects of diabetes and may lead to blindness. The retina is the light-sensitive tissue lining the back of the eye; it converts every light ray reaching the eye into signals that the neurons in the brain can decipher, creating visual images similar to how the eye sees. Diabetic retinopathy damages the blood vessels and veins within the retinal tissue, causing them to leak fluid and produce vision defects [1].
S. Madala (B) · V. K. Suvarna · P. Jalapally, Department of Computer Science and Engineering, Velagapudi Ramakrishna Siddhartha Engineering College, Vijayawada, Andhra Pradesh, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022; V. Bhateja et al. (eds.), Smart Intelligent Computing and Applications, Volume 1, Smart Innovation, Systems and Technologies 282, https://doi.org/10.1007/978-981-16-9669-5_4
Fig. 4.1 Retinal image
Diabetes is caused by a rise in blood sugar levels, and a large number of people are affected by it across the world. The disease has many side effects on the patient's body, and one of the body parts most affected is the retina of the eye; vision defects or blindness are often caused by diabetes [2]. According to a survey conducted by the World Diabetes Foundation, by 2025 over 43.9 billion people across the globe will be affected by this disease. The use of image processing techniques in the clinical field has become popular for various applications. Manual diagnosis of diabetic retinopathy is always a time-consuming and resource-demanding task for the doctor. For effective diagnosis and treatment of DR, automatically detecting the lesions and abnormalities in the fundus image can help ophthalmologists [3]. This idea leads to the development of a GUI framework for classifying different diabetic retinopathy features, which will help detect the symptoms of diabetic retinopathy in the early stages of the disease. Figure 4.1 shows the significant characteristics of diabetic retinopathy in the fundus image; a typical fundus image contains the optic disc, exudates, micro-aneurysms and the retina.
4.2 Literature Survey
Several methods have been tried for the detection of DR in its early stage as technology has advanced. Various algorithms were developed to detect the optic disc, including morphological approaches [4]. El Abbadi and Al-Saadi developed an automated algorithm that detects the optic disc with mathematical formulae [4]. Zhuo et al. proposed extracting a region of interest (ROI) around the optic disc, which takes less time to process than the whole image [5]. Basit and Fraz proposed a method that showed considerable improvement over existing methods in terms of boundary extraction and detection of the optic disc [6]. A wavelet-based template matching algorithm was developed by M. Tamilarasi for detecting micro-aneurysms. Mangrulkar and Vishakha Fursule proposed scanning the retina in the early stages of testing the eye [7]; in their experiments they compute the AVR ratio between the pre-processed and post-processed images. Lokuarachchi et al. proposed a systematic framework for the automatic diagnosis of exudates; in their methodology the location of the optic disc is identified more accurately, which is the key factor for the extraction of exudates [8].
4.3 Methodology
Accurate and automatic assessment of retinal images is considered a powerful tool for the diagnosis of retinal disorders, for example diabetic retinopathy and hypertension. In retinal images, the vessel tree structure should contain information about the exact thickness of the blood vessels, and by tracking the vessel tree the optic disc can be identified. Micro-aneurysms in the retinal image start from the optic disc and spread across the eye, so the eye disease can be detected by identifying the noticeable changes in the micro-aneurysms over time. The proposed methodology for detecting the various features in the retinal image is shown in Fig. 4.2. Step 1: Pre-processing techniques are used to remove noise from the data and modify the image so that features can be detected more accurately. Step 2: Image enhancement techniques are used to improve the graphical quality of medical images with low luminance. Step 3: In the extraction process, morphological operations such as opening, closing, erosion and dilation are used to detect features like the optic disc, micro-aneurysms and exudates.
Fig. 4.2 Methodology
Step 4: After extracting the features, we use classification techniques to identify the DR and to classify the images based on the criticality of the disease.
4.3.1 Pre-processing
Image processing is divided into digital image processing and analogue image processing. In digital image processing, computer algorithms perform pre-processing on digital images, and many algorithms can be applied to the input images. The goal of digital image processing is to improve the image features by reducing unwanted distortions and/or enhancing important image features. In the green channel of the retinal image, the contrast between the haemorrhages, blood vessels and exudates is best discerned, and the retinal features are distinguished more easily than in the red and blue channels. Therefore, to extract the retinal features more clearly, the green channel of the retinal image is used for further processing.
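A minimal hedged sketch of this channel selection with OpenCV is shown below; the file name is a placeholder.

```python
# Hedged sketch: load a fundus image and keep the green channel,
# where vessels and lesions show the best contrast.
import cv2

img = cv2.imread("fundus.jpg")        # OpenCV loads images in BGR order
green = img[:, :, 1]                  # index 1 = green channel
```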
4.3.2 Micro-aneurysms Identification
Because of the natural shape of the retina, retinal images have heterogeneous lighting patterns. To remove this heterogeneity in illumination, the retinal images are first filtered; noise filtering is an essential part of any blood vessel extraction method if clean outputs are required. The noise removal techniques used in this process are CLAHE, the mean-C filter, the median filter and the Gaussian filter. After the image is cleaned, morphological operations such as erosion, dilation, black top-hat, white top-hat, opening, closing, skeletonization and convex hull are applied to process the image, and algorithms are then applied to obtain a clear and proper view of the blood vessels. The flowchart for detecting micro-aneurysms is shown in Fig. 4.3. Algorithms such as the max-flow algorithm, the Laplacian algorithm and flood-filling algorithms are used in the process. After applying different filters, morphological operations and algorithms, and different combinations of them, the output image with the extracted blood vessel tree is obtained, as shown in Fig. 4.4.
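A hedged sketch of the filtering and morphology stage is given below; the kernel sizes and clip limit are illustrative values, not the authors' exact settings.

```python
# Hedged sketch: CLAHE enhancement, median filtering and a morphological
# closing on the green channel, followed by a rough vessel map.
import cv2

clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
enhanced = clahe.apply(green)
denoised = cv2.medianBlur(enhanced, 5)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7))
closed = cv2.morphologyEx(denoised, cv2.MORPH_CLOSE, kernel)
vessels = cv2.subtract(closed, denoised)            # closing residue highlights thin dark vessels
_, vessel_mask = cv2.threshold(vessels, 10, 255, cv2.THRESH_BINARY)
```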
4.3.3 Optic Disc Identification
To detect the optic disc, we use the methodology shown in Fig. 4.5. First, a region of interest (ROI) is extracted from the image before processing, because fundus images of the retina are high resolution and large in size.
Fig. 4.3 Micro-aneurysms detection
Fig. 4.4 Identified micro-aneurysms
The location of the highest-intensity pixels is obtained by applying minMaxLoc, an image processing function that returns the positions and values of the minimum and maximum intensities [6]. A region of 120 pixels is derived around the position of the pixel with the highest intensity, and the red channel is obtained from this ROI. Gaussian blur is applied to the resulting image, which is then pre-processed using the image's histogram and several morphological operations.
Fig. 4.5 Optic disc detection
To remove noise and detect edges, the Canny edge detection algorithm is applied to the pre-processed ROI, and the circular Hough transform (CHT) is then used on this edge map. It identifies the optic disc, which is roughly circular in shape, and masks the area as seen in Fig. 4.6. The ROI coordinates are mapped back to the coordinates of the original image; the picture with the masked optic disc is shown in Fig. 4.6. Finally, the image is free of the optic disc, which allows several retinal elements to be isolated for diabetic retinopathy identification.
Fig. 4.6 Identified optic disc
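A hedged OpenCV sketch of these optic-disc steps is given below; the ROI size, blur kernels and Hough parameters are illustrative, and OpenCV's HoughCircles applies its own internal Canny stage (param1 is its upper threshold).

```python
# Hedged sketch: brightest-point ROI, blur, Canny edges, circular Hough
# transform, and masking of the detected optic disc.
import cv2
import numpy as np

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, _, _, (x, y) = cv2.minMaxLoc(cv2.GaussianBlur(gray, (25, 25), 0))  # brightest point
y0, x0 = max(y - 120, 0), max(x - 120, 0)
roi = img[y0:y + 120, x0:x + 120]                 # 120-pixel region of interest

red = cv2.GaussianBlur(roi[:, :, 2], (9, 9), 0)   # red channel of the ROI
edges = cv2.Canny(red, 30, 60)                    # edge map, as described in the text
circles = cv2.HoughCircles(red, cv2.HOUGH_GRADIENT, dp=1, minDist=200,
                           param1=60, param2=20, minRadius=30, maxRadius=110)
masked = img.copy()
if circles is not None:
    cx, cy, r = np.round(circles[0, 0]).astype(int)
    cv2.circle(masked, (x0 + cx, y0 + cy), r, (0, 0, 0), -1)   # mask out the disc
```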
Fig. 4.7 Identified exudates
4.3.4 Exudates Identification
To detect the exudates, we first take the retinal image as input and remove the optic disc from it [3]. We then apply thresholding to the retinal image: threshold values are computed for the green channel and the red channel, the two thresholded results are combined, and the combination is applied to the original image. Finally, the output is masked onto the original image to show the exudates, as in Fig. 4.7.
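A hedged sketch of this step is given below; the threshold values and the way the two channel maps are combined (intersection here) are assumptions.

```python
# Hedged sketch: threshold the green and red channels of the disc-masked
# image and overlay the bright (candidate exudate) regions on the original.
import cv2

g = masked[:, :, 1]
r = masked[:, :, 2]
_, g_thr = cv2.threshold(g, 200, 255, cv2.THRESH_BINARY)
_, r_thr = cv2.threshold(r, 200, 255, cv2.THRESH_BINARY)
exudate_mask = cv2.bitwise_and(g_thr, r_thr)                  # bright in both channels
exudates_view = cv2.bitwise_and(img, img, mask=exudate_mask)  # exudates shown on the original
```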
4.3.5 Framework
We developed a GUI framework for diabetic retinopathy feature extraction, shown in Fig. 4.8. First, the user selects either an input image or an input folder, together with an output folder, and then chooses optic disc detection or micro-aneurysm detection. After the option is selected, the images are processed in the background and the output is written to the selected output folder. To detect the exudates, the optic disc must first be removed from the image, so after the optic disc is detected the exudates are obtained from the retinal image; this whole process is carried out internally without an extra click.
4.4 Results The algorithm is tested for the data sets IDRiD, Messidor, Diaret_db1. In IDRiD data set, out of 1113 samples, optic disc was correctly predicted in 1031 samples, microaneurysms were correctly predicted in 1086 samples, and exudates were correctly predicted in 610 samples. In Messidor data set, out of 1200 samples, optic disc was detected in 1116 of them, and in 1183 images, micro-aneurysms were detected
Fig. 4.8 Framework
correctly. In Diaret_db1 data set, out of 89 samples, optic disc was detected in 77 of them, in 85 images, micro-aneurysms were detected correctly, and in 40 images, exudates are extracted properly as shown in Fig. 4.9. The GUI developed works more efficiently for the single image and the folder of images.
Fig. 4.9 Result analysis
4.5 Conclusion
Diabetic retinopathy is the most frequent type of diabetic eye disease. It generally affects individuals who have had diabetes for a long time. If diabetic retinopathy is detected early, the vision loss caused by it can be reduced significantly. Exudates and micro-aneurysms are primary signs of diabetic retinopathy, so early detection of diabetic retinopathy features can inherently reduce the risk of blindness. The GUI proposed in this paper helps to identify diabetic retinopathy easily and accurately.
References 1. Shirbahadurkar, S.D., Mane, V.M., Jadhav, D.V.: Early stage detection of diabetic retinopathy using an optimal feature set. In: Third International Symposium on Signal Processing and Intelligent Recognition Systems (SIRS’17), MIT Manipal (2017) 2. Fenner, B.J., Wong, R.L.M., Lam, W.-C.: Advances in retinal imaging and applications in diabetic retinopathy screening: a review. Ophthalmol. Ther. (2018) 3. Biran, A., Raahemfar, K.: Automatic method for exudates and hemorrhages detection from fundus retinal images. Int. J. Comput. Inf. Eng. 10(9) (2016) 4. El abbadi, N.K., Al-saadi, E.: Automated detection of exudates in retinal images. IJCSI Int. J. Comput. Sci. Issues 10(2) 1, 1–6 (2013) 5. Zhuo, Z., et al.: Optic disc region of interest localization in fundus image for glaucoma detection in ARGALI. In: Industrial Electronics and Applications, pp.1686–1689. IEEE (2010) 6. Basit, A., Farz, M.M.: Optic disc detection and boundary extraction in retinal images. Appl. Opt. 54(11) (2015) 7. Mangrulkar, R.S., Vishakhafursule: A review on artery vein classification of retinal images to identify diabetes using graph based approach. Int. J. Pure Appl. Res. Eng. Technol. 4(9), 288–293 (2016) 8. Lokuarachchi, D., Gunarathna, K., Muthumal, L., Gamage, T.: Automated detection of exudates in retinal images. In: 15th International Colloquium on Signal Processing and Its Applications (CSPA), pp. 43–47. IEEE (2019)
Chapter 5
Online Malayalam Script Assortment and Preprocessing for Building Recommender Systems
V. K. Muneer, K. P. Mohamed Basheer, K. T. Rizwana, and Abdul Muhaimin
Abstract An immense amount of research is happening in natural language processing across the globe, and the availability of data is one of the major issues faced by researchers working in different languages worldwide. Huge volumes of data are produced every second and are available on various servers and social networking sites. Though there are several methods to access data from these online repositories, the accessibility is limited and predefined, which may not be sufficient for researchers and academia. This paper discusses methodologies for developing a customized data scraping algorithm for travel-related data from Facebook.com written in the Malayalam language, one of the prominent Dravidian languages, spoken in the Indian state of Kerala. The paper also covers the preprocessing of the scraped data with the help of NLP techniques. Multiple requests are raised to the server to fetch scattered and independent raw data from Facebook, and the algorithm succeeded in scraping sufficient information about post details and user details.
5.1 Introduction
With the rapid advancements in technology, people and devices are so well connected that social media data has become one of the widest, deepest and richest sources of information. People use online sites and forums to upload their photographs, opinions, activities and reviews about anything. In one survey, about 73% of online adults in the USA used social networks, and there are more than 2 billion accounts that generate [1] a huge amount of online media information. Data from social media is synthesized and widely used for business development, marketing, advertisements and recommendations. Information taken from social media is considered incredibly important for market study and research, as it reflects the current trends, behavior and attitudes of users. It also contains historical details, a variety of information sources,
V. K. Muneer (B) · K. P. Mohamed Basheer · K. T. Rizwana · A. Muhaimin, Research Department of Computer Science, Sullamussalam Science College, University of Calicut, Calicut, Kerala, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022; V. Bhateja et al. (eds.), Smart Intelligent Computing and Applications, Volume 1, Smart Innovation, Systems and Technologies 282, https://doi.org/10.1007/978-981-16-9669-5_5
domains and locations. Among the many social media sites, Facebook is large enough to claim that, if Facebook were a country, it would be the third largest, next to China and India [2]. Collecting data from this huge information repository is considered one of the most crucial steps in research, data processing and decision making, since the clarity of the data defines the precision of the result. There are three main ways to automate data collection. The first is through Web services, where legal data access tools and technologies are provided by the native site; the second is a customized tool that accesses the API; and the third is scraping [3] raw content from social media through code. There are several libraries in languages such as Python, R and SAS for communicating with APIs; Tweepy and twitteR are two packages for fetching Twitter data with Python and R, respectively [4]. Web scraping is the process of collecting information from the Internet and storing it on back-end storage media for further retrieval and assessment. Scraping can be executed either manually or automatically by a computer program or Web crawler. The two consecutive steps of Web scraping are obtaining the required resources from a targeted Web site with the help of HTTP requests, and extracting sufficient details from the site, which then undergo parsing, reformatting and organization into a structured form [5]. In the first step, the resources can be HTML, XML, images, audio or JSON, extracted with tools and libraries such as requests, puppeteer and selenium; parsing of the raw data can be done with a tool like Beautiful Soup [6]. An API provided by Facebook itself helps to fetch a few preset data items in a prescribed fashion, but this FB Graph API [7] does not provide extensive customization for collecting the entire details of user posts along with the individual preference set of the user. Apart from this, some of the best free automatic Web scraping tools on the market are Octoparse [8], Dexi.io, Outwit Hub, Scrapinghub and Parsehub [9]. Tools such as the Rvest, RSelenium, rtweet, streamR and Rfacebook packages allow researchers to collect tweets and posts filtered by keywords, location and language in real time and to scrape public Facebook pages, including likes and comments [10]. For confidential information such as personal messages, emails or private social media posts, the consent of the users involved is needed before making observations on their information [11].
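As a minimal illustration of those two scraping steps (an HTTP request followed by HTML parsing), a hedged sketch is given below; the URL is a placeholder, and scraping real social-media pages generally requires authenticated, policy-compliant access.

```python
# Hedged sketch: fetch a page with an HTTP request, then parse it with Beautiful Soup.
import requests
from bs4 import BeautifulSoup

response = requests.get("https://example.com/travel-posts", timeout=10)  # placeholder URL
soup = BeautifulSoup(response.text, "html.parser")
posts = [p.get_text(strip=True) for p in soup.find_all("p")]             # raw paragraph text
```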
5.2 Literature Review Abu Kausar et al. discuss that scraping needs a proper Web crawler. An automated computer program that navigates from one Web page to another to search and download online documents is termed a Web crawler. General purpose crawling, focused crawling, and distributed crawling are the three knowledge-based crawling techniques [12]. The study conducted by Luciana in 2019 discussed the advantages and disadvantages of the rate of information searching and spreading in social
media. They explained the standard methodology to create an app for fetching details officially from Facebook and Instagram using their APIs: the application is registered at https://developers.facebook.com/ to acquire the Application ID (AppID) and Application Secret (AppSecret). The confidentiality of the app is ensured by an encryption mechanism using the AppSecret [13]. Ananthan et al. conducted an extensive analysis of recommender systems in social media with the help of 61 high-quality selected papers published in SCOPUS and Web of Science between 2011 and 2015 to examine and understand the trend of the recommender system. The work focused on well-known online datasets like MovieLense, IMDB, Wikipedia, Stack Overflow, and Facebook [14]. In the paper [15], Ilham Safeek proposes an appropriate career path to the user based on analyzing the posts, interactions, and sentiments they shared on Facebook and other social media. He et al. investigate the significance of explicit social relations in recommender systems (RS) and design a better recommender algorithm by examining how the preferences and reviews of people correlate with those of their friends [16]. Vashisht and Thakur conducted a study to execute an emotional analysis based on emoticons in the comments and status updates shared on social media [17]. Tiwari et al. discussed the possibility of improving the accuracy of recommendation by utilizing the data generated in social media, specifically tweets from Twitter [18]. Stefan and team concluded that the social media analytics process involves four distinct steps: data discovery, collection, preparation, and analysis [19]. A detailed study by Salloum et al. examined techniques in text mining from social media sites like Facebook and Twitter [20]. Remmiya Devi et al. in 2016 studied harvesting useful information from text written on Malayalam social media Web sites, extracting entities and processing the task using the structured skip-gram model [21]. They used FIRE2015 as the dataset for natural language entity extraction and claimed 89.83% overall accuracy. Named entity recognition [22], morphology analysis, and parts of speech tagging [23] are a few growing research areas in Malayalam language processing which can turn out significant outcomes in grammar management and linguistics. Hovy and Lin classified text summarization into extractive summarization and abstractive summarization [24]. The process of text summarization in NLP has developed much faster with the introduction of BERT algorithms. Liu et al. produced a document-level encoder based on BERT which could articulate the overall meaning of the document and figure out interpretations for its sentences, and they produced an excellent performance in both extractive and abstractive summarization [25, 26]. To deal with summarization of Malayalam text, Kanitha et al. proposed a graph theoretic style to produce an exact outline for Malayalam texts [27]. Kabeer and Idicula in 2014 proposed a statistical sentence scoring technique and a semantic graph-based technique for text summarization [28]. Krishnaprasad et al. proposed an algorithm for generic summarization of Malayalam text in a single document by generating ranks for each word in the document and then extracting the top N ranked sentences, arranged in chronological order, for an extractive summarization [29]. Pandian discussed the concept of natural language understanding of
Malayalam Language [30] to process texts taken from various sources to perform preprocessing and additional functionalities.
5.3 Dataset Facebook is one of the biggest data providers in social media. There are umpteen groups and pages for different categories in which people of similar interests take part. In the proposed work, we have chosen the largest Malayalam travel group, named "Sanchari," which means "Traveler" in English. The URL is www.facebook.com/groups/teamsanchari. The strength of the Facebook group is 6.9 lakh members as of 11-11-2020. Members who know the Malayalam language publish their travel experiences and reviews about different locations of the globe, written in the Malayalam language. The group contains more than 50,000 detailed reviews, and each of them gets hundreds of comments. We have taken 12,500 posts from different users, along with personal preferences, public check-ins, likes, education details, and so on.
5.4 Methodology Facebook provides its graph API explorer to fetch details from pages, groups, and public spaces. This is a general method to collect information quickly, as the native API is most compatible with the FB engine. It is essential to log in to developers.facebook.com to create an app, which is the foundation of the graph API. Once logged in, Facebook will provide a temporary token and secret code, which are the keys to perform query processing and data retrieval. To carry out our research work, we need to collect travel posts and associated data in a customized manner, more deeply than the graph API allows. In that scenario, the traditional Facebook API may not be of much use. Hence, we developed a custom tool intended to scrape the essential details from Facebook. We used JSON, Node JS, and other scripting tools to accomplish the task (Fig. 5.1). Facebook also provides an analytical tool for group admins, named Insight [31], which gives comprehensive detail about a particular group or page as a dashboard [32]. Statistical analysis can be utilized to check user engagement and identify other measures of group activity. The problem with this facility is the lack of extensive customization of data retrieval; only predefined modes of operation can be performed (Fig. 5.2). The functionality of this scraping algorithm relies on three sequential steps. Targeting the posts returns the message written in Malayalam, posting time, count of reactions, count of comments, number of shares, and profile_url of the author for each post. We did not focus on collecting the comments on the posts for the time being. With the help of this profile_url, the second step focuses more on the public personal details like education, hometown, marital and family details, age, and their check-in
Fig. 5.1 Facebook API for data retrieval
Fig. 5.2 Facebook insight of FB group
details. All these raw details are fetched in a JSON format which is then flattened to an Excel sheet which is the third stage. The stages involved in scraping Facebook groups are given below:
1. Login to Facebook Wall/Group/Page with credentials
2. Identifying DOM and targeting DOM elements from "raw" data of Facebook
3. Collecting individual posts, message, count of likes, comments, and reactions
4. Collecting personal preferences and check-in history
5. Converting JSON to Excel sheet (a sketch of this step follows the list)
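As a minimal illustration of the final step, the sketch below flattens the scraped JSON into an Excel sheet with pandas; the file names are hypothetical, and pd.json_normalize is assumed to be sufficient for the mostly flat post records produced by the earlier steps.

import json
import pandas as pd

# hypothetical output file of the collection steps above
with open("scraped_posts.json", encoding="utf-8") as f:
    posts = json.load(f)

df = pd.json_normalize(posts)                     # flatten nested JSON fields into columns
df.to_excel("scraped_posts.xlsx", index=False)    # writing .xlsx needs an engine such as openpyxl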
5.4.1 Challenges in Facebook Scraping Since the scraping is customized, targeting an HTML element in real time and fetching the data within the tag was a complicated task. Facebook uses an algorithm to generate dynamic names for these tags, which makes the DOM structure complex. A few of the challenges faced during the customized FB scraping were:
(a) Client-side rendering
(b) Memory leaks
(c) Facebook bot detection and data usage
(d) Performance optimization tasks
Client-Side Rendering Facebook uses a client-side rendering technology, ReactJS. The workaround for this is to load the React webpage in a browser and wait for it to complete rendering. Selenium and puppeteer are the two libraries used for this. These libraries manipulate the browser in headless mode; thus, we can wait for the React page to complete rendering and execute basic JavaScript scraping using the document querySelector API in the headless browser context. Memory Leaks Facebook scraping is done by scrolling the FB page continuously until the desired number of posts is fetched. As we scroll down, new DIV elements are added to the DOM, which causes memory leaks. Scraped DIVs are marked using class names and removed from the DOM to fix these leaks. Even then, memory piles up as we fetch data, due to other images and files being loaded in the background. Facebook Bot Detection Facebook uses bot detection techniques to detect and ban scrapers. Our main challenge was to bypass this bot detection. Randomizing page visits and requests, randomizing the time intervals between these requests and page visits, and setting random user-agent strings are the fixes for this. Reduction of Bandwidth and Data Usage Each time a post is loaded, the images and videos in that post are also loaded, which uses a lot of bandwidth. So, to reduce data usage, requests for certain media types (images, media, videos) are filtered and blocked. Performance Optimization Tasks The scraping is done in three steps. In the first step, basic data about group posts is collected using the mobile version of FB. Personal data and preferences are fetched in the React version of FB in the second stage.
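The sketch below illustrates these workarounds in Python with Selenium (the authors used Node JS with puppeteer as well as Selenium); the group URL, the article selector, and the "scraped" marker class are illustrative only, since Facebook's real DOM uses dynamically generated names, and disabling image loading is shown as one way to limit bandwidth.

import random
import time
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument("--headless")
# block image loading to reduce bandwidth while scrolling
options.add_experimental_option("prefs", {"profile.managed_default_content_settings.images": 2})

driver = webdriver.Chrome(options=options)
driver.get("https://m.facebook.com/groups/teamsanchari")   # mobile version used for basic post data

for _ in range(100):                                        # scroll until enough posts are collected
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(random.uniform(2, 8))                        # random delays to look less like a bot
    posts = driver.execute_script(
        "return Array.from(document.querySelectorAll('article:not(.scraped)'))"
        ".map(a => { a.classList.add('scraped'); return a.innerText; });"
    )
    # ... store `posts` ..., then drop already-scraped articles to limit memory growth
    driver.execute_script("document.querySelectorAll('article.scraped').forEach(a => a.remove());")

driver.quit()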
5.5 Experimental Results The algorithm is built on top of two phases. In the first phase, it fetches all the assigned details from any Facebook group whose URL is given. The fetched raw HTML data/JSON scripts are then converted to Excel spreadsheets
or CSV files. All these tasks are done through the three submodules of the first phase. The second phase is responsible for natural language processing of the Malayalam text taken from the Facebook group, performing text preprocessing. The algorithm succeeded in extracting data from facebook.com in the first phase, and the extracted text could be successfully preprocessed with the help of natural language processing techniques to develop a model for a recommender system.
5.5.1 Facebook Group Extraction Step 1: Basic posts scraping Here, every article element which contains a post with the given properties is selected into an array. By looping over this array, we can scrape each post in detail. The native graph API provided by sites like Facebook or Instagram can fetch details much faster than customized Web scraping. However, by adopting advanced tools and optimized code, the algorithm could fetch 2500 post details per hour; running the code for 5 h in total, 12,500 posts with details were scraped. More posts can be fetched by adding random delays between each fetch to avoid rate limiting and by fixing some memory leaks. Parent–child relations in the DOM are used since exact CSS selectors are not available on the page. The above data is filtered and saved in JSON format for the next step of processing (Fig. 5.3). Step 2: Personal data scraping This step takes a lot of time compared to step one because of the random time delays between page visits. Eight pages are visited for each post author, with a 2 to 8 s delay between each page visit. These time intervals are tweakable, and we found 2 to 8 s to be optimal. For a single post author's personal data, the minimum time is 2 * 8 = 16 s and the maximum time is 8 * 8 = 64 s. Including delays while scrolling for scraping check-ins, on average, one post author's details are fetched in one minute. The number of posts processed in one day is approximately 1500. These results can be improved by tweaking time delays and providing good Internet speeds (Table 5.1).
Fig. 5.3 Post details fetched in Step 1
Table 5.1 Summarized details of iterative scraping of fields from FB group

Parameters                       Description                                          Iteration phase
Message                          Full length of travelogue in Malayalam               First iteration
Time                             Uploaded time                                        First iteration
post_url                         Web URL of the post                                  First iteration
profile_url                      Unique ID of user profile                            First iteration
total_reactions                  Total reactions on post                              First iteration
Comments                         Number of comments                                   First iteration
Shares                           Number of shares                                     First iteration
about_work_and_education         Details of profession and academics                  Second iteration
about_places                     Home town and places lived in                        Second iteration
about_contact_and_basic_info     Telephone and email details                          Second iteration
about_family_and_relationship    Marital status and family details                    Second iteration
about_details                    Personal additional details                          Second iteration
about_life_events                Important milestones in life, notable achievements   Second iteration
Check-ins                        Visited locations and check-ins                      Third iteration
5.5.2 Preprocessing of Malayalam Text The messages (travelogues) taken from the Facebook group are lengthy descriptions of travel experiences written in the Malayalam language. The main challenge is to extract the most important content from the text; the essence of each post should be fewer than ten words. An NLP algorithm for text summarization of Malayalam has been adopted for this task, and a customized Python package, Root_Pack, has been used [33]. Given a sample inflected Malayalam word, the root extractor is invoked as follows:

import root_pack
root_pack.root("...")   # "..." stands for the inflected Malayalam word

The extractor will find out the root word, in this case "അവർ".
Utilizing the root extractor along with conventional text preprocessing tools helps to reduce long passages to their three or four most significant words. Long passages are summarized into short ones by considering the most repeated and relevant information. The summarized passage is then treated with further operations such as POS tagging, named entity recognition, syntax checking and morphological analysis. The extracted essence is in the form of quantifiable tokens of the passage and is passed to the training algorithm for recommendation.
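A minimal sketch of this reduction step is shown below; it assumes only the root_pack.root(word) call shown above, uses naive whitespace tokenization, and omits the stop-word removal, POS tagging and other operations mentioned here.

from collections import Counter

import root_pack

def passage_keywords(passage, top_n=4):
    # map each (whitespace-separated) token to its root word
    roots = [root_pack.root(token) for token in passage.split()]
    # keep the few most frequent roots as the essence of the passage
    return [word for word, _ in Counter(roots).most_common(top_n)]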
5.6 Conclusion Social media is one of the biggest data sources, where people post their photographs, check-ins, polls, opinions, activities, and updates. In this work, a data mining algorithm has been developed to scrape travel posts from a prominent Malayalam Facebook travel group named "Sanchari." The algorithm is designed to overcome a few of the limitations of existing traditional data scraping methods; the conventional graph API provided by developers.facebook.com does not support fetching some of the details from the feeds and walls of users. During the execution of the algorithm, there were a couple of exceptions thrown by the code, the browser, and server requests. Corrective measures were taken to handle these errors, and optimized methods were adopted to improve execution speed and scraping accuracy, optimize performance, and avoid memory leakages. The raw JSON data generated by the scraping tool is converted to Excel sheets, which are then submitted for cleaning and parsing. In this study, we collected complete details of 12,500 Facebook posts, public personal information of 3781 travelers, and details about 84,463 check-in spots around the globe. Another task discussed in this paper is the preprocessing of Malayalam text, the "message" entry of each post. Collaborative filtering and a cosine similarity mechanism are used to further process the text to supply it to a machine learning-based recommender model.
References 1. Lenhart, A., Purcell, K., Smith, A., Zickuhr, K.: Social media and mobile internet use among teens and young adults (2010) 2. Saravanakumar, M., Sugantha Lakshmi, T.: Social media marketing. Life Sci. J. 9(4), 4444– 4451 (2012). ISSN: 1097-8135 3. Batrinca, B., Treleaven, P.C.: Social media analytics: a survey of techniques, tools and platforms, 2014. AI Soc. 30, 89–116 (2015). https://doi.org/10.1007/s00146-014-0549-4 4. Singh, K., Shakya, H.K., Biswas, B.: Clustering of people in social network based on textual similarity. Perspect. Sci. 8, 570–573 (2016). https://doi.org/10.1016/j.pisc.2016.06.023 5. Zhao, B.: Web scraping. In: Schintler, L.A., McNeely C.L. (eds.) Encyclopedia of Big Data, Springer International Publishing AG (outside the USA) (2017). https://doi.org/10.1007/9783-319-32001-4_483-1 6. Zheng, C., et al.: A study of web information extraction technology based on beautiful soup. J. Comput. 10 (2015). https://doi.org/10.17706/jcp.10.6.381-387 7. van Dam, J.-W., van de Velden, M.: Online profiling and clustering of Facebook users. Decis. Support Syst. 70(2015), 60–72 (2014). 0167-9236/© https://doi.org/10.1016/j.dss.2014.12.001 8. Ahamad, D., Mahmoud, A.: strategy and implementation of web mining tools. Int. J. Innov. Res. Adv. Eng. (IJIRAE) 4(12) (2014). ISSN: 2349-2163 9. Milev, P.: Conceptual approach for development of web scraping application for tracking information. Econ. Altern. 3, 475–485 (2017) 10. Rieder, B.: Studying Facebook via data extraction, WebSci ’13. In: Proceedings of the 5th Annual ACM Web Science Conference, pp. 346–355 (2013). https://doi.org/10.1145/2464464. 2464475
11. Rifea, S.C., et al.: Participant recruitment and data collection through Facebook: the role of personality factors. Int. J. Soc. Res. Methodol. (2014). https://doi.org/10.1080/13645579.2014. 957069 12. Abu Kausar, Md., Dhaka, V.S., Singh, S.K.: Web crawler: a review. Int. J. Comput. Appl. 63(2) (2013). https://doi.org/10.5120/10440-5125 13. Dewi, L.C., Meiliana, Chandra, A.: Social media web scraping using social media developers API and regex. Proc. Comput. Sci. 157, 444–449 (2019) 14. Anandhan, A., et al.: Social media recommender systems: review and open research issues. IEEE Access (2018). https://doi.org/10.1109/ACCESS.2018.2810062 15. Safeek, I., Kalideen, M.R.: Preprocessing on Facebook data for sentiment analysis. In: Proceedings of 7th International Symposium, SEUSL, 7th and 8th December 2017 16. He, J., Chu, W.W.: A social network-based recommender system (SNRS) (2010). https://doi. org/10.1007/978-1-4419-6287-4_4 17. Vashisht, G., Thakur, S.: Facebook as a corpus for emoticons-based sentiment analysis. Int. J. Emerg. Technol. Adv. Eng. 4(5) (2014). ISSN 2250-2459 18. Tiwari, S., et al.: Implicit preference discovery for biography recommender system using twitter. Int. Conf. Comput. Intell. Data Sci. (2019). https://doi.org/10.1016/j.procs.2020.03.352 19. Stieglitz, S., et al.: Social media analytics—challenges in topic discovery, data collection, and data preparation. Int. J. Inf. Manage. 39, 156–168 (2018). https://doi.org/10.1016/j.ijinfomgt. 2017.12.002 20. Salloum, S.A., Al-Emran, M., Monem, A.A., Shaalan, K.: A survey of text mining in social media: Facebook and twitter perspectives. Adv. Sci. Technol. Eng. Syst. J. 2(1), 127–133 (2017) 21. Remmiya Devi, G., et al.: Entity extraction for Malayalam social media text using structured skip-gram based embedding features from unlabeled data. Proc. Comput. Sci. 93, 547–553 (2016). https://doi.org/10.1016/j.procs.2016.07.276 22. Ajees, A.P., Idicula, S.M.: A named entity recognition system for Malayalam using neural networks. Proc. Comput. Sci. 143, 962–969 (2018) 23. Anisha Aziz, T., Sunitha, C.: A hybrid parts of speech tagger for Malayalam language. In: 2015 International Conference on Advances in Computing, Communications and Informatics (ICACCI), pp. 1502–1507. Kochi (2015). https://doi.org/10.1109/ICACCI.2015.7275825 24. Hovy, E., Lin, C.-Y.: Automated text summarization in SUMMARIST. In Advances in Automatic Text Summarization (1999) 25. Liu, Y., Lapata, M.: Text summarization with pretrained encoders. arXiv:1908.08345v2 [cs.CL] 5 Sep 2019 26. Miller, D.: Leveraging BERT for extractive text summarization on lectures (2019). arXiv:1906. 04165 27. Kanitha, D.K., et al.: Malayalam text summarization using graph based method. Int. J. Comput. Sci. Inf. Technol. 9(2), 40–44 (2018). ISSN: 0975-9646 28. Kabeer, R., Idicula S.M.: Text summarization for Malayalam documents—an experience. In: International Conference on Data Science and Engineering (ICDSE) (2014) 978-1-4799-54612114/$31.00 @2014 IEEE 29. Krishnaprasad, P., Sooryanarayanan, A., Ramanujan, R.: Malayalam text summarization: an extractive approach. In: International Conference on Next Generation Intelligent Systems (ICNGIS) (2016). 978-1-5090-0870-4/16/$31.00 ©2016 IEEE 30. Pandian, S.: Natural language understanding of Malayalam language. Int. J. Comput. Sci. Eng. 7, 133–138 (2019). https://doi.org/10.26438/ijcse/v7si8.133138 31. Houk, K.M., Thornhill, K.: Using Facebook page insights data to determine posting best practices in an academic health sciences library. J. 
Web Librarianship 7, 372–388 (2013). https:// doi.org/10.1080/19322909.2013.837346 32. Houk, K.M., Thornhill, K.: Using Facebook page insights data to determine posting best practices in an academic health sciences library. J. Web Librarianship (2013). https://doi.org/10. 1080/19322909.2013.837346 33. https://pypi.org/project/root-pack/
Chapter 6
Improved Multi-modal Image Registration Using Geometric Edge-Oriented Histogram Feature Descriptor: G-EOH B. Sirisha , B. Sandhya , and J. Prasanna Kumar Abstract Feature descriptors intended for multi-modal images, in particular the log-Gabor histogram, edge-oriented histogram and phase congruency edge-oriented histogram descriptors, can handle the nonlinear intensity deformation between infrared and visual images but fail to handle geometric deformations like rotation and scale. This paper presents a robust feature descriptor algorithm for the task of registering corresponding feature points between multi-modal images with nonlinear geometric and intensity deformations. In this approach, an M ∗ N image region around the detected feature point is re-sized using the scale σi and rotated by an angle θi . The scale and orientation information are extracted from the feature detector corresponding to that feature point. From this preprocessed image region, a histogram of local and global contour orientations is acquired to describe the spatial distribution of edges. The proposed geometric EOH algorithm adapts several steps of the EOH to multi-modal images. We study the performance enhancement brought by this new algorithm in comparison with state-of-the-art algorithms. We present an application of G-EOH for the registration of LWIR-VS images in different configurations, especially with geometric and photometric deformations.
6.1 Introduction Image registration aims to precisely register two or several images of the corresponding scene, acquired at varied times, with different or same imaging devices at similar or dissimilar views [1]. Several image processing and computer vision applications like change detection, image fusion, object detection, image reconstruction and image tracking and navigation rely on image registration [2]. The accuracy of the above-mentioned applications heavily relies on precise registration. The images used for registration are termed as the reference and sensed images in remote sensing, the target and source image in computer vision and the fixed and moving image in medical imaging. Registering images acquired with a wavelength between 0.3 µm and 0.7 µm B. Sirisha (B) · B. Sandhya · J. P. Kumar Maturi Venkata Subba Rao (MVSR) Engineering College, Hyderabad, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 V. Bhateja et al. (eds.), Smart Intelligent Computing and Applications, Volume 1, Smart Innovation, Systems and Technologies 282, https://doi.org/10.1007/978-981-16-9669-5_6
in the electromagnetic spectrum (visible light spectrum) is a straightforward problem and has been addressed by researchers over the past decades [3]. It is observed that LWIR provides extraordinary visibility of terrestrial objects [4]. Images acquired using optical sensors in the visible spectrum provide radiation and reflection information, whereas infrared sensors provide temperature information [5]. IR sensor and visible sensor provide complementary information in varying situations such as day and night. Therefore, the image data acquired from infrared and visible cameras is effective in varied medical, remote sensing and computer vision applications such as video surveillance, change detection, object fusion, object/ image tracking and navigation. Registering images acquired using different modalities (infrared and visible) is called multi-modal image registration [6].
6.2 Related Work Conventional feature descriptors and their adaptations based on image gradient and intensity have received strong focus from researchers for their promising performance in describing features of images acquired in the visible spectrum [7, 8]. Contemporary studies have shown that the performance of conventional feature descriptors declines as spectral bands depart from the visible light spectrum [9, 10]. Due to the nonlinear relationship between the pixel intensities and the minimalistic image details/texture, corner and region/blob-based feature point descriptors are hardly preferred for multi-modal images. Thus, edge point distribution and intensity-based feature descriptors should be adapted. Using histograms to represent the distribution of edges in an infrared image makes it possible to describe the directionality and frequency of its illumination changes. Aguilera et al. [11] proposed the edge-oriented histogram (EOH) to enhance the accuracy of key-point matching for multi-modal images, which exploits only edge points in local windows rather than all pixels. The edge histogram descriptor uses 80 histogram bins to describe the distribution of local edges, which is not enough to represent an image's global features [11]. Won, Chee Sun et al. tried to address this setback of EHD by combining the global, semi-global and local edge histograms to describe the distribution of edges in an image [12]. Aguilera et al. proposed the LGHD (log-Gabor histogram) feature descriptor for multi-modal and multi-spectral image matching [13]. Mouats et al. proposed the phase congruency and edge-oriented histogram (PCEHD) feature descriptor [14], which fuses spatial and frequency information from EHD and the magnitude of the log-Gabor (LG) coefficients. Aouf and Mouats proposed the PCLGM feature descriptor, which utilizes phase congruency (PC) to determine corner and edge features; the detected features are described using log-Gabor histograms [15]. In the modified PCLGM feature descriptor, the original image is combined with a preprocessed (smoothed) image; phase congruency is used to detect the corners, and log-Gabor histograms are used for feature description. In the Distinct MPCLGM algorithm, distinct wavelengths and log-Gabor filters are used for extracting features in the visible and infrared images [16]. Descriptors based on the edge-oriented histogram achieved an enhanced feature matching performance on multi-
modal images than the scale invariant feature transform, but they hardly assign an orientation to the detected feature point. This limits the application of classical descriptors to mono-modal scale, viewpoint and translation invariant feature matching problems. To adapt EOH to dealing with rotation, Yong Li and Xiang Shi assign an orientation to feature points [17] for descriptor computation. Though this algorithm and its adaptations work well on VS-IR images, it does not perform well in the presence of rotational and scale changes between the images. Hence, to address the geometric deformation between the images, we propose the G-EOH feature descriptor algorithm, which uses the feature detector's scale and orientation parameters to construct the feature descriptor values. The proposed feature descriptor is robust and invariant to scale and rotation deformation between LWIR-VS images. The rest of this paper is organized as follows: Sect. 6.3 presents a detailed introduction of the proposed G-EOH feature descriptor and investigates the possibilities offered by this new algorithm for LWIR-VS and LWIR-LWIR image registration. Experimental validations, performance and conclusions are presented in Sects. 6.4 and 6.5.
6.3 Multi-modal Image Registration Using G-EOH The multi-modal feature-based image registration pipeline has three main stages. In Stage 1, feature points of the input images (VS and LWIR) are identified through a scale space built on the basis of multi-scaled Gaussian filtering. The difference of Gaussians (DoG) is computed to identify potential points, and the feature points are determined as local maxima and minima across the DoG. The feature points are obtained by setting the SIFT-like detector with the following parameters: sigma = 1.2, threshold = 40, number of octaves: 4, number of octave layers: 3, contrast threshold: 0.04, edge threshold: 10 and Gaussian sigma: 1.6. In Stage 2, each feature point detected in both VS and LWIR images is represented by a vector (xi, yi, σi, θi), where xi, yi correspond to the feature point location and σi and θi are the scale and orientation of the detected feature point. The steps for computing the proposed geometric EOH descriptor are as follows. Step 1. Image Region Pre-processing: To incorporate geometric invariance, a patch of size M ∗ N is extracted around each feature point. The extracted M ∗ N patch is re-sized using the scale σi and rotated by the angle θi obtained from the feature detector for that feature point. From the rotated and re-sized output patch, an image region of size M ∗ N is cropped from the center. Step 2. Image Region Partitioning: The preprocessed M ∗ N image of Step 1 is partitioned into sixteen (4 * 4) non-overlapping subregions. The size of each subregion is (M/4) ∗ (N/4). The image is re-sized so that M/4 and N/4 become integers. Each subregion is further partitioned into small 3 ∗ 3 pixel blocks for extracting local edge orientation.
Step 3. Extracting Local Edge Orientation: For each (M/4) ∗ (N/4) subregion, a five-point histogram bin is initialized as HBin = [H, V, 45D, 135D, NEO]. Here, H—horizontal edge orientation, V—vertical edge orientation, 45D—diagonal edge 45◦ orientation, 135D—diagonal edge 135◦ orientation and NEO—(isotropic) no edge orientation. Initially, all the bin counts are set to zero, HBin = [0, 0, 0, 0, 0]. To capture the local edge orientation from each 3 ∗ 3 pixel block, five 3 ∗ 3 Scharr operators are employed. These operators generate images highlighting the vertical, horizontal, diagonal (45◦ and 135◦) and no-edge-orientation gradient edges using first-order derivatives. It is observed that Scharr edge operators attempt to obtain ideal rotational invariance. Each local edge operator (LEO) is applied on a 3 ∗ 3 block, represented as LEO_type = |Σ_k b_k · e_k|, where b_k represents the 3 ∗ 3 image block region, e_k represents the 3 ∗ 3 Scharr edge operator, and the sum runs over the block elements. The five values EO_H, EO_V, EO_45D, EO_135D, EO_NEO are obtained. The maximum of these five values is compared with a threshold (T) to get the dominant edge orientation: EO_Dominant = Max(EO_H, EO_V, EO_45D, EO_135D, EO_NEO) > T. The EO_Dominant will be equal to one of these five orientations (the maximum), and the count of the corresponding bin is then increased by one; this is repeated for all 3 ∗ 3 blocks in one (M/4) ∗ (N/4) subregion. For one (M/4) ∗ (N/4) subregion, we get the complete bin values, expressed as HBin1 = [B_H1, B_V1, B_45D1, B_135D1, B_NEO1]. The above operations are repeated for all sixteen (M/4) ∗ (N/4) image subregions to get all corresponding 16 HBins. Sixteen histogram bins (HBins), one per subregion and each with the five members HBin = [H, V, 45D, 135D, NEO], are computed. All these histogram bins are arranged in a matrix. The global feature is extracted by taking the mean of this matrix of HBins; the global feature bin is represented by GBin, GBin = Mean(Total HBins). This global feature is combined with the computed local features. The computed local feature bins are arranged side by side along with the global feature bin to make a geometric edge-oriented histogram (G-EOH) vector of size eighty-five (85). Figure 6.1 shows the flow diagram of the geometric edge-oriented histogram (G-EOH) feature descriptor.

G-EOH = [HB[1], HB[2], HB[3], . . ., HB[16], GBin]    (6.1)
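The following NumPy sketch mirrors the steps above for a pre-processed patch; the five 3 ∗ 3 kernels are illustrative stand-ins (the exact Scharr variants used by the authors are not reproduced here), and the threshold value is an assumption.

import numpy as np

# Illustrative 3x3 edge kernels: placeholders for the five Scharr-type operators
# (horizontal, vertical, 45-degree, 135-degree and isotropic/no-edge).
KERNELS = {
    "H":    np.array([[-3, -10, -3], [0, 0, 0], [3, 10, 3]], dtype=float),
    "V":    np.array([[-3, 0, 3], [-10, 0, 10], [-3, 0, 3]], dtype=float),
    "45D":  np.array([[0, 3, 10], [-3, 0, 3], [-10, -3, 0]], dtype=float),
    "135D": np.array([[10, 3, 0], [3, 0, -3], [0, -3, -10]], dtype=float),
    "NEO":  np.array([[2, -1, -1], [-1, 2, -1], [-1, -1, 2]], dtype=float),
}

def geoh_descriptor(patch, threshold=10.0):
    """85-dimensional G-EOH style vector from a pre-processed (re-sized, rotated) M x N patch."""
    M, N = patch.shape
    sub_h, sub_w = M // 4, N // 4                  # sixteen (M/4) x (N/4) sub-regions
    hbins = []
    for i in range(4):
        for j in range(4):
            sub = patch[i * sub_h:(i + 1) * sub_h, j * sub_w:(j + 1) * sub_w]
            bins = np.zeros(5)
            for r in range(0, sub_h - 2, 3):       # non-overlapping 3 x 3 blocks
                for c in range(0, sub_w - 2, 3):
                    block = sub[r:r + 3, c:c + 3]
                    responses = [abs((block * k).sum()) for k in KERNELS.values()]
                    if max(responses) > threshold:             # keep only a dominant orientation
                        bins[int(np.argmax(responses))] += 1
            hbins.append(bins)
    hbins = np.array(hbins)                        # 16 local histograms of 5 bins each
    gbin = hbins.mean(axis=0)                      # global bin = mean of the local bins
    return np.concatenate([hbins.ravel(), gbin])   # 80 local values + 5 global = 85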
The choice of the appropriate image patch size (IP) for extracting features is a prime factor. In this paper, image patch sizes of 40, 60 and 80 are considered. Figure 6.2 shows the importance of preprocessing the feature point region: the number of true positive matches increases after preprocessing the image region. In Stage 3, NNR feature matching is used to find the correspondences between the pair of images (nearest neighbor ratio threshold: 1.1). Descriptor vectors are matched using the nearest neighbor ratio (NNR) matching technique with the Manhattan distance. If the computed ratio of the adjacent neighbor distances is higher than the predetermined threshold (1.1), the feature pair is considered to be matched (a corresponding feature pair). To improve matching robustness, feature matching is computed twice, from reference to sensed and from sensed to reference images. A feature
Fig. 6.1 G-EOH (geometric edge-oriented histogram) descriptor for visual and LWIR images
Fig. 6.2 Figure at the right shows 299 true positive matches obtained, if the image region is re-sized using the scale σi and rotated by an angle θi . Figure at the left shows two true positive matches obtained, if the image region is not preprocessed
correspondence is accepted only if it exists in both cases. Finally, in Stage 4, true correspondences are selected from all the correspondences obtained in Stage 3 using the random sample consensus (RANSAC) algorithm.
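A simplified sketch of this cross-checked NNR matching is given below, assuming SciPy for the Manhattan (cityblock) distance; the RANSAC filtering of Stage 4 is not shown.

import numpy as np
from scipy.spatial.distance import cdist

def mutual_nnr_matches(desc_ref, desc_sen, ratio=1.1):
    # pairwise Manhattan distances between reference and sensed descriptors
    d = cdist(desc_ref, desc_sen, metric="cityblock")
    matches = []
    for i in range(d.shape[0]):
        order = np.argsort(d[i])
        nearest, second = order[0], order[1]
        # keep the pair only if the second-best match is clearly worse (NNR > threshold)
        if d[i, second] > ratio * d[i, nearest]:
            # cross check: i must also be the best reference match for that sensed point
            if np.argmin(d[:, nearest]) == i:
                matches.append((i, int(nearest)))
    return matches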
6.4 Experimental Results Visual and LWIR image registration using the proposed approach has been objectively and subjectively evaluated. Various comparisons are made using standard feature-based image registration on datasets I and II with feature descriptors like EHD, LGHD and PCEHD. Evaluation is done by varying the image patch size in the G-EOH descriptor. The visible image serves as the reference image and the LWIR image as the sensed image. Evaluation parameters: the following four measures are used for evaluation: (1) ground truth error (GTE), (2) true positives (TP), (3) inlier ratio (IR) and (4) registered images (RI).
Table 6.1 Comparison of feature descriptors EHD, PCEHD, LGHD and G-EOH for varied window size

Evaluation   Image patch size   EHD      PCEHD    LGHD     G-EOH
IR           IP-40              0.159    0.137    0.216    0.323
             IP-60              0.105    0.133    0.224    0.318
             IP-80              0.137    0.123    0.227    0.312
RI           IP-40              14       20       16       28
             IP-60              21       15       15       24
             IP-80              23       24       29       25
AGTE         IP-40              12.072   9.063    7.471    6.851
             IP-60              13.01    9.767    9.767    9.001
             IP-80              14.348   10.89    10.89    9.009
It is noted from Table 6.1 that the average inlier ratio attained using the G-EOH descriptor with image patch or window size IP-80 is 0.312, which is greater than the average inlier ratio attained using EHD (0.1159), PCEHD (0.137) and LGHD (0.227). It is also noted that the inlier ratio for the G-EOH descriptor with image patch or window size IP-40, 0.323, is high and stable across photometric and geometric deformations within the LWIR and VS image pairs. It is noted from Table 6.1 that the overall number of LWIR and VS image pairs registered with AGTE < 10 using the G-EOH descriptor with image patch IP-40 is 28, which is greater than the overall number of LWIR and VS image pairs registered/aligned using EHD (24), PCEHD (31) and LGHD (48). Average ground truth error directly reflects the quality of the feature point transformation between LWIR and VS image pairs. It is perceived from Table 6.1 that the average ground truth error (AGTE) attained for the G-EOH descriptor with image patch IP-40 is 6.851, which is lower than the AGTE attained using EHD (12.072), PCEHD (9.063) and LGHD (7.471). The choice of the appropriate image patch size for every feature point is a significant element of the G-EOH descriptor computation. Table 6.1 shows that the error is low and the number of registered images is high for IP-40, compared to IP-60 and IP-80. It is observed from Table 6.2 that an extremely small or big image patch size would increase the false matching of feature points. Hence, IP-40 is fixed for extracting the features. Using SIFT as the feature detector and the proposed G-EOH as the feature descriptor with image patch size 40, 53 visual and infrared image pairs with viewpoint, rotation, blur, down sampling, noise and scale deformations are registered. The average feature point error is computed with the ground truth provided by the dataset. Table 6.2 shows the evaluation of multi-modal image registration using G-EOH as the feature descriptor for 53 images of various deformations, A∗—Blur(10), B∗—Down sampling(3), C∗—Noise(10), D∗—Rotation(18), E∗—Scale(6), F∗—Viewpoint(6), AGTE∗—Avg ground truth error, RI∗—Registered images, TP∗—True positive matches. It is
Table 6.2 Evaluation of proposed feature descriptor G-EOH for 53 images of various deformations from dataset-II

Feature descriptor   Evaluation   A(10)   B(3)    C(10)   D(18)    E(6)    F(6)
EHD                  AGTE         13.45   15.32   12.44   15.52    16.81   13.2
                     RI           3       3       6       3        2       4
                     TP           62      14      361     10       13      12
PCEHD                AGTE         11.5    14.33   10.49   14.285   14.86   11.99
                     RI           5       3       7       4        3       5
                     TP           19      13      73.06   15       16      13.56
LGHD                 AGTE         9.49    13.32   8.54    12.375   12.85   10.14
                     RI           7       3       8       4        3       5
                     TP           21      16      89.26   18       21      17
G-EOH                AGTE         7.48    12.31   6.59    10.465   10.84   8.29
                     RI           8       3       8       12       5       5
                     TP           26.99   25.9    98.26   23       24      22.65
noted from Table 6.2 that the number of true positive matches obtained using G-EOH is higher for all the deformations; as a result, more LWIR and VS image pairs are registered with lower ground truth error compared to EHD, PCEHD and LGHD.
6.5 Conclusion Feature-based multi-modal image registration is a challenging task, owing to the discrepancies in image characteristics. Popularly used conventional feature descriptors based on gradient and intensity do not address nonlinear intensity deformations. Contemporary descriptors based on edge orientations, log-Gabor filters and phase congruency can handle the nonlinear intensity changes but fail to handle geometric deformations like scale and rotation. To incorporate geometric invariance, the extracted M ∗ N image patch is re-sized and rotated by the scale σi and orientation θi obtained from the feature detector, corresponding to the feature point. Evaluation results indicate that the assigned scale and orientation can considerably enhance the registration performance on geometrically deformed visual and long wave infrared images. Furthermore, the proposed edge-oriented histogram variant employs the Scharr filter response to extract local and global gradient orientation. The registration performance of the proposed descriptor is compared with the log-Gabor and phase congruency-based descriptors through objective measures estimated from generated ground truth. Results show that the G-EOH feature descriptor considerably increases the registration performance.
References 1. Zitov, B., Flusser, J.: Image registration methods: a survey. Image Vis. Comput. 21(11):977– 1000 (2003) 2. Gade, R., Moeslund, T.B.: Thermal cameras and applications: a survey. Machine vision and applications 25(1), 245–262 (2014). Jan 3. Li, Y., Zou, J., Jing, J., Jin, H., Hang, Y.: Establish keypoint matches on multispectral images utilizing descriptor and global information over entire image. Infrared Phys. Technol. 76, 01 (2016) 4. Cai, G.-R., Jodoin, P.-M., Li, S.-Z., Wu, Y.-D., Su, S.-Z., Huang, Z.K.: Perspective-sift: an efficient tool for 339 low-altitude remote sensing image registration. Sig. Process. 93 (2013) 5. Shengfeng, H., Rynson, L.: Saliency detection with flash and no-flash image pairs. In: Proceedings of European Conference on Computer Vision, pp. 110–124 (2014) 6. Shen, X., Xu, L., Zhang, Q., Jia, J.: Multi-modal and multi-spectral registration for natural images. In: ECCV, Zurich, pp. 309–324 (2014) 7. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60, 91–110 (2004) 8. Mikolajczyk, C.S.: Scale and affine invariant interest point detectors. Int. J. Comput. Vis. 60(1), 63–86 (2004) 9. Leng, C., Zhang, H., Li, B., Cai, G., Pei, Z., He, L.: Local feature descriptor for image matching: a survey. IEEE Access 7, 6424–6434 (2019). https://doi.org/10.1109/ACCESS. 2018.2888856 10. Kumar, R.M.: A survey on image feature descriptors. Int. J. Comput. Sci. Inf. Technol. 5(6), 7668–7673 (2014) 11. Aguilera, C., Barrera, F., Lumbreras, F., Sappa, A.D., Toledo, R.: Multispectral image feature points. Sensors 12, 12661–12672 (2012) 12. Won, C.S., Aizawa, K., Nakamura, Y., Satoh, S.: Feature extraction and evaluation using edge histogram descriptor in MPEG-7. In: Advances in Multimedia Information Processing— PCM, pp: 583–590 (2005) 13. Aguilera, C.A., Sappa, A.D., Toledo, R.: LGHD: A feature descriptor for matching across non-linear intensity variations. In: IEEE International Conference on Image Processing (ICIP) 2015, 178–181 (2015). https://doi.org/10.1109/ICIP.2015.7350783 14. Mouats, T., Aouf, N.: Multimodal stereo correspondence based on phase congruency and edge histogram descriptor. In: Proceedings: International Conference on Information Fusion, Istanbul, Turkey, vol. 912, pp. 1981–1987, July 2013 15. Mouats, T., Aouf, N., Sappa, A.D., Aguilera, C., Toledo, R.: Multispectral stereo odometry. IEEE Trans. Intell. Transp. Syst. 16, 1210–1224 (2015) 16. Liu, X., Li, J.B., Pan, J.S.: Feature point matching based on distinct wavelength phase congruency and Log-Gabor filters in infrared and visible images. Sensors (Basel) 19(19), 4244 (2019). https://doi.org/10.3390/s19194244. PMID: 31569596; PMCID: PMC6806253 17. Yong, L., Shi, X., Wei, L., Zou, J., Chen, F.: Assigning main orientation to an EOH descriptor on multispectral images. Sensors 15, 15595–15610 (2015)
Chapter 7
Reddit Sentiments Effects on Stock Market Prices Arnav Machavarapu
Abstract The market capitalization of GameStop (GME) was listed to be 918 million dollars at the beginning of 2020, increasing 25 fold to 23 billion dollars by the end of January 2021. Similar shifts in market capitalization were seen in other publicly traded companies as well. In this study, sentiment data from r/WallStreetBets discussion board and stock price data over time are garnered. This information is utilized to train a longitudinal long short-term memory model (LSTM) to predict the stock prices of GME and AMC Entertainment Holdings (AMC). Using LSTM architecture, three models are developed with separate input features: previous day’s close price, sentiment data only, and a model with both sets of data. It is observed that sentiment data alone can be predictive of stock prices. However, the models containing solely close price or close price and sentiment data perform significantly better in terms of validation loss, and an average difference in stock price prediction during the validation set of $X and $Y for GME and AMC, respectively. These results show that if institutional investors had included sentiment data in their longitudinal models for predicting AMC and GME stock prices, they might have been able to avoid the short squeeze event.
7.1 Introduction In early 2021, a short squeeze, or an increase in price due to an increase in shorts, took place. This happened with numerous stocks, most notably Gamestop (GME) and AMC Entertainment Holdings (AMC). Many institutional investors including hedge funds shorted GME due to its lackluster financials and sales in 2020. Retail investors organizing via social media in particular the r/WallStreetBets subreddit, a discussion board where members can discuss stock market, studied publicly available information on a high proportion of short positions held by institutional investors, and decided to buy GME stock, forcing the hedge funds to cover their short positions at high prices. Institutional investors failed to properly predict changing market A. Machavarapu (B) Westwood High School, Austin, TX, USA © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 V. Bhateja et al. (eds.), Smart Intelligent Computing and Applications, Volume 1, Smart Innovation, Systems and Technologies 282, https://doi.org/10.1007/978-981-16-9669-5_7
conditions arising from retail investors and trading sentiments being exchanged on a public forum. Institutional investors taking these short positions did not properly identify, study, and integrate social data into their trading models. Their failure was due to their lack of understanding of the importance of social media information in influencing market conditions and their inability to react quickly to social media sentiments, which change over time. My study addresses both points by using long short-term memory (LSTM) models trained on longitudinal social media information pre-processed via sentiment analysis. As a result of the inability to integrate social media information into trading strategies, Melvin Capital (an investment management company) ended Jan 2021 with a 53% loss on investments as the GME stock price increased over 108% in five consecutive days of trading [1]. Subsequently, many claimed that WallStreetBets was a primary factor in driving GME prices up. This was a key reason for my research: to explore this claim with a model utilizing sentiment analysis and an LSTM model. The importance of this short squeeze comes from the fact that chatters coming together on Reddit led to a price increase of 750% for GME from Jan 27, 2020 to Jan 27, 2021 [2]. One trader turned a $50,000 investment in GME into $22,000,000 through the power of WallStreetBets Redditors. This shows how online gatherings and chat boards can sway stock prices and affect the stock market long-term, leading to the possible development of new trading strategies and approaches. My project involves taking post data from WallStreetBets and utilizing the stock price data of AMC and GME. Using the Reddit data, I have applied sentiment analysis to the titles and body text of the posts, and used the previous day's closing price and percentage change in price combined with the sentiment analysis to predict the next day's percentage change in price using an LSTM model. Broadly, my project aims to use these two sets of data and the accuracy of the model's predictions to determine whether GME and AMC prices truly did correlate to the sentiments of Reddit posts.
7.2 Related Work LSTM models are extremely effective for modeling the trajectories of things over time, or sequence data [3]. The success of the LSTM model in this literature in comparison to the autoregressive integrated moving average (ARIMA) and generalized autoregressive conditional heteroskedasticity (GARCH) models inspired me to use a similar method in my project as it can be used to accurately capture the time-sequence data that is being input (sentiment data), and predict the volatility of GME and AMC. Bollen et al. [4] claimed that posts on Twitter could predict the Dow Jones Industrial Average (measured the performance of 30 notable stocks) with nearly 88% accuracy. This indicates that social media has had success in being a predictor of stock prices through the Dow Jones Industrial Average, further showing that Reddit post and title sentiment data could have an impact on GME and AMC prices. Nann [5] illustrated how company StockPulse worked with NASDAQ to incorporate social media sentiments into their market surveillance. This means that
a well-known stock exchange has used sentiments in its market tracking, which supports the question my report seeks to answer: whether such sentiments can be used to predict prices.
7.3 Methods To determine if Reddit sentiment data over time is a predictive factor of AMC and GME stock prices, I utilized two methods—sentiment analysis and LSTM models. I generated sentiment information from WallStreetBets by applying sentiment analysis, then separated the posts by day and calculated the mean of positive, neutral, and negative sentiments per day. This created a quantified metric of the aggregate sentiment of authors on WallStreetBets over time. I theorized that this metric, and the trajectory of this metric over time as modeled by an LSTM, would be an important factor influencing GME and AMC stock prices. I used a supervised machine learning model to predict their stock prices; specifically, the labeled training data consisted of positive, neutral, and negative sentiment values as well as the previous day's closing price of the stocks.
7.3.1 Dataset I downloaded a dataset consisting of 44,597 Reddit posts from WallStreetBets from Jan 28, 2021 to Apr 8, 2021 [6]. Each of these posts had four key features: title, body text, number of up votes, and timestamp. I applied sentiment analysis to this dataset to generate average sentiment for each day. I integrated daily sentiment information with 49 days of stock price data of GME Historical Data [7] and 36 days of AMC Historical Data [8]. I then calculated percentage change in daily stock price and appended it to existing dataset. The purpose of my model was to predict percentage changes in day to day stock price using information from the previous three days. The final datasets used in training of the three models included each day's average positive, neutral, and negative sentiment values for Reddit posts' body and titles, close price, and percentage change in close price (Table 7.1).

Table 7.1 Sample representation of sentiment data by day (average sentiment data)

Date     Posts#    Positive sentiment       Negative sentiment       Neutral sentiment
                   Title       Body         Title       Body         Title      Body
Jan-28   1197      0.082546    0.055070     0.0878162   0.0522297    0.824632   0.461604
Jan-29   15,694    0.079729    0.054809     0.0883269   0.0523046    0.82895    0.463294
Jan-30   1424      0.097926    0.053763     0.0791819   0.0369593    0.820786   0.363622
I separated the data into training and validation sets as I needed an independent source of data to test my model performance, as it would not be feasible to test model performance on data used to train the model. I used an 80/20 split on total amount of days of data. For GME, my dataset of sentiment and close price had 49 days of data. Therefore, 80/20 split resulted in 39 days of training data and 10 days of validation data. For AMC, my dataset of sentiment and close price had 36 days of data, and 80/20 split resulted in 29 days of training data and 7 days of validation data.
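A minimal sketch of this chronological split is shown below; daily_data stands for the assumed merged per-day table, and no shuffling is applied, so the validation days follow the training days.

# assumed: daily_data is the merged per-day sentiment/price table, sorted by date
n_days = len(daily_data)            # e.g. 49 for GME, 36 for AMC
split = round(n_days * 0.8)         # 80/20 split by day (39/10 for GME, 29/7 for AMC)
train_days = daily_data[:split]
validation_days = daily_data[split:]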
7.3.2 Sentiment Analysis I applied sentiment analysis to the data I took from WallStreetBets from Jan 28, 2021 to Apr 8, 2021 [6]. After applying sentiment analysis, I attained a numerical value for the positive, neutral, and negative sentiments of the title and body of each post. I also noted the volume of posts on WallStreetBets for each day to attain a better understanding of how many posts were made and how this could have affected prices of GME and AMC [9]. The sentiment analysis model that I used in my project was developed by researchers based on a list of lexical features which were each assigned a score depicting sentiment intensity. Using these features, they had considered five rules that embody traditional grammatical rules and conventions to adjust the sentiment intensity scores accordingly.
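The description above matches a VADER-style analyzer; a sketch of the per-day aggregation, assuming the vaderSentiment package and a pandas DataFrame with date, title, and body columns, is given below.

import pandas as pd
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()

def daily_sentiment(posts):
    # posts: DataFrame with 'date', 'title' and 'body' columns (assumed layout)
    for field in ("title", "body"):
        scores = posts[field].fillna("").map(analyzer.polarity_scores)
        for key in ("pos", "neu", "neg"):
            posts[f"{field}_{key}"] = scores.map(lambda s, k=key: s[k])
    # mean positive / neutral / negative sentiment of titles and bodies per day
    return posts.groupby("date").mean(numeric_only=True)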
7.3.3 Long Short-Term Memory Model (LSTM) LSTM is a type of supervised machine learning and a recurrent neural network (RNN) which can learn the dependence or order in a prediction problem over time [10]. Olah [11] explained how LSTMs were especially capable of learning long-term dependencies, or patterns over time-series data, and were specifically made to avoid problems regarding data over time. LSTMs have been used in stock price prediction on numerous occasions. Stated by Phi [12], LSTMs use a forget gate and a sigmoid function, to obtain a numerical value to determine whether or not to forget or keep the data (value closer to zero means to forget and closer to 1 means to keep). The data then moves to the input gate, determining what data must be updated through importance. This is done through a sigmoid function, and assigns a numerical value between zero and one, where zero means not important and one means important. The hidden state and input get put into the tangent function to regulate the network. Lastly, there is the cell state and output gate. The cell state simply drops the values in the cell state if the values get multiplied by a number close to zero and updates the values which the LSTM finds important. The output gate determines the next hidden layer, which is mathematically determined by putting the previous hidden layer and current input into a sigmoid function and putting the new cell state into a tangent function. We then multiply the outputs, thus giving the new hidden layer (Fig. 7.1).
Fig. 7.1 Diagram depicting how LSTM works [11] this diagram represents the input, hidden layer, sigmoid and tangent functions, as well as the cell state and output gate
I used LSTM as it is well-suited to form predictions over time-series data, can detect patterns, and was designed to model long-range dependencies more precisely. Gated recurrent units (GRU) do not have an output gate, and ARIMA can yield better results in short-term forecasting, but LSTM provides more options for fine-tuning. Zou and Qu [3] applied LSTMs in stock prediction and quantitative trading, determining that attention-LSTM better captured long-term factors in time-series data. Attention-LSTM used levels of importance to train a model to predict prices, and it outperformed other models such as a time-series ARIMA. Lu et al. [13] compared an LSTM to ARIMA, GRU, and a dual-stage attention-based RNN, and showed that LSTM can consistently outperform these other models. My LSTM model used input features of solely close price, solely sentiment data, or a combination of both to predict close prices for the following day. The sequence length of my LSTM was three days, meaning that the predicted close price would be influenced by the data of the previous three days in the dataset. I used a mean squared error (MSE) loss function to track the accuracy of the model and an Adam optimizer for stochastic gradient descent to achieve more accurate predictions.
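A minimal Keras sketch of such a model is given below; the layer width, feature count, and training settings are assumptions, and only the three-day sequence length, the MSE loss, and the Adam optimizer come from the description above.

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

SEQ_LEN = 3       # three days of history per prediction
N_FEATURES = 7    # assumed: six daily sentiment averages + previous close price

model = keras.Sequential([
    layers.LSTM(32, input_shape=(SEQ_LEN, N_FEATURES)),
    layers.Dense(1),              # next day's close price (or percentage change)
])
model.compile(optimizer="adam", loss="mse")

# X_train: (samples, 3, N_FEATURES), y_train: (samples,); analogous arrays for validation
# model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=200, batch_size=8)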
7.4 Results I made three models to determine if WallStreetBets posts’ sentiments were predictive of the stock prices of GME and AMC, and to prove whether there was a correlation between these two factors. I trained three separate models—one on sentiment data only, one on close price only, and a final model containing both sentiment and close price data. The sentiment data only model was trained to determine if the post sentiments correlated to price data, and to what extent this conclusion was true. The close
Fig. 7.2 Prices predicted based on LSTM model for AMC (left) and GME (right) prices with just sentiments. These two graphs depict the predicted and actual close price of AMC and GME where the vertical lines depict the start of the validation training set
price model and the model combining close price and sentiment were trained to establish whether incorporating WallStreetBets sentiment data improved the predictions of my model in comparison to solely close price model.
7.4.1 A Model Based on Solely Sentiment Data Can Be Predictive of Close Price This model was trained on solely the sentiment data of titles and body text of posts from WallStreetBets from Jan 28, 2021 to Apr 8, 2021. As shown by Figs. 7.3 and 7.4, depicting training and validation loss of the models for GME and AMC prices, both losses decreased over the training iterations, indicating that the error was decreasing. The MSE value for training loss started at 0.2 and ended at 0.1, while validation loss started at 0.16 and ended at 0.04, meaning that as the model was being trained, the loss (the penalty for bad predictions) decreased and predictions became more accurate. Figure 7.2 shows the predicted prices made by the LSTM model using only sentiment data, suggesting that sentiment data was indicative of close price, as the predictions after the validation split are off by $0.5 for AMC and around $10 for GME.
7.4.2 Training on Only Close Price Yields Accurate Predictions Using the same LSTM model as used in prior model with only previous day’s closing price as the sole feature, the model yielded an accurate prediction of next day’s prices for both GME and AMC as shown by Figs. 7.3 and 7.4. In comparison to the model using only sentiment data, this model with close price had severely less error by
Fig. 7.3 Training and validation of LSTM for predicting GME prices. These two graphs depict the training (left) and validation (right) loss of the three models for prediction of GME prices
Fig. 7.4 Training and validation of LSTM for predicting AMC prices. These two graphs depict the training (left) and validation (right) loss the three models for prediction of AMC prices
This shows that close price alone is a better predictive factor than sentiment data alone from WallStreetBets. MSE values are ~0.06 for training loss and ~0.03 for validation loss, showing that the error of the close-price model was low. In Figs. 7.5 and 7.6, the predicted GME prices are similar to the actual close price, within about $5 above or below at the start of the validation data. For AMC, after the start of the validation data, the close-price-only predictions were consistently below the actual closing price by about $1.50. This shows that the close-price predictions were similar to the actual close price, meaning my LSTM model is moderately accurate even without sentiment data.
7.4.3 Using Sentiment and Close Price Data Results in Marginal Improvement or Worse Performance
Using the same LSTM model as in the prior two models, but this time with both sentiment and close price data, yielded semi-accurate predictions for both GME and AMC prices in the plots of close price over time.
Fig. 7.5 This plot depicts the three models’ predictions for GME prices over time based on LSTM. Vertical line at x = 39 represents the split from training to validation data
Fig. 7.6 This plot depicts the three models’ predictions for AMC prices over time based on LSTM. Vertical line at x = 29 represents the split from training to validation data
In terms of GME prices, this model had nearly identical performance to the close-price-only model in both training and validation loss. It had significantly lower loss than the sentiment-only model, by about 0.03 MSE according to Fig. 7.3, meaning its predictions were more accurate over time. In Fig. 7.5, the sentiment-and-close-price model tracked the actual close price closely, with predictions between $5
and $10 above or below the actual close price. However, when it came to predicting AMC prices using sentiments and close price, training and validation loss was significantly higher than for the model with just close price, by about 0.03, and was similar to the loss of the model that used only sentiment data as the input feature. In the predicted price plot in Fig. 7.6, the model using both input features performed on par with the close-price model (within about $1 above or below). Although training and validation loss was higher for this model than for the model with solely close price, the predicted price plots made the models appear roughly equal in accuracy, hinting at potential overtraining on the AMC data.
7.5 Discussion
Figures 7.3 through 7.6 depict the training and validation loss for GME and AMC and the predicted prices of the three models in comparison to the actual close price. For GME, the sentiment-only model's loss was low, but its MSE was not comparable in accuracy to the models containing solely price data or a combination of the two, as the MSE loss of the sentiment-only model was ~0.04 higher than that of the other two models. However, the model containing a combination of the two was on par in accuracy with the model using solely close price, as the MSE loss was nearly the same and the plots depicted nearly equal accuracy for the two models. This shows that sentiments from WallStreetBets helped predict the price of GME, especially when the volume of posts was high. The AMC models, however, led to different conclusions. MSE loss for the models that included sentiment data was higher than for the close-price-only model by around 0.02–0.05. Although the plots look accurate, this can be attributed to overtraining, or to the MSE loss not being high enough for the predictions to become completely inaccurate. This hints at WallStreetBets not being as influential on AMC stock prices as it was on GME stock prices from Jan 28, 2021 to Apr 8, 2021. These findings can be corroborated by other research in the field, specifically Ranco et al. [14], which found that Twitter sentiment correlated with abnormal returns during peaks of Twitter activity; that is, as the volume of posts on Twitter increased, returns increased as well. Ranco et al. [14] also found that sentiment peaks on Twitter indicated the direction in which the stock price would trend. Future expansions include testing the same theory about social media sentiments and their correlation with stock prices on other platforms such as Instagram, Twitter, or Facebook. The findings from future work could be used to corroborate the results in this study regarding WallStreetBets and GME. These LSTM models can also be used to model and predict prices for other stocks affected by short squeezes in the same period, such as Nokia and BlackBerry.

Acknowledgements I would like to thank Dr. Parsa Akbari, University of Cambridge, for the guidance and encouragement during this research.
References

1. Murphy, M.: GameStop stock surges to highest point since January, market cap tops $17 billion. MarketWatch. https://marketwatch.com/story/gamestop-stock-surges-to-highest-point-since-january-market-cap-tops-17-billion-11615337376 (2021). Last accessed 22 May 2021
2. Wolff-Mann, E.: 'Fighting 100 mini Mike Tysons': the powerful influence of Reddit Trade. Yahoo Finance. https://finance.yahoo.com/news/fighting-100-mini-mike-tysons-the-powerful-influence-of-reddit-trade-141009102.html (2021). Last accessed 22 May 2021
3. Zou, Z., Qu, Z.: Using LSTM in Stock Prediction and Quantitative Trading (CS230). Stanford University, Stanford, United States (2020)
4. Bollen, J., Mao, H., Zeng, X.: Twitter mood predicts the stock market. J. Comput. Sci. 2(1), 1–8 (2011)
5. Nann, S.: How does social media influence financial markets? Nasdaq. https://nasdaq.com/articles/how-does-social-media-influence-financial-markets-2019-10-14 (2019). Last accessed 22 May 2021
6. Preda, G.: Reddit WallStreetBets Posts. Retrieved February 2021 from kaggle.com/gpreda/reddit-wallstreetsbets-posts (2021). Last accessed 22 May 2021
7. GameStop Corp. (GME) Historical Data: Yahoo Finance. https://finance.yahoo.com/quote/GME/history/. Last accessed 22 May 2021
8. AMC Entertainment Holdings, Inc. (AMC) Historical Data: Yahoo Finance. https://finance.yahoo.com/quote/amc/history/. Last accessed 22 May 2021
9. Hutto, C.J., Gilbert, E.: VADER: a parsimonious rule-based model for sentiment analysis of social media text. In: Eighth International AAAI Conference on Weblogs and Social Media, vol. 8(1) (2014)
10. Brownlee, J.: A gentle introduction to long short-term memory networks by the experts. Machine Learning Mastery. https://machinelearningmastery.com/gentle-introduction-long-short-term-memory-networks-experts/ (2017). Last accessed 22 May 2021
11. Olah, C.: Understanding LSTM networks. Colah's Blog. https://colah.github.io/posts/2015-08-Understanding-LSTMs/ (2015). Last accessed 22 May 2021
12. Phi, M.: Illustrated guide to LSTM's and GRU's: a step by step explanation. Towards Data Science. https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21 (2018). Last accessed 22 May 2021
13. Lu, A., Zeyu, W., Huanzhong, X.: Stock Price Prediction with Deep Learning Framework (CS230). Stanford University, Stanford, United States (2018)
14. Ranco, G., Aleksovski, D., Caldarelli, G., Grčar, M., Mozetič, I.: The effects of Twitter sentiment on stock price returns. PLoS ONE 10(9), e0138441 (2015)
Chapter 8
Speech-Based Human Emotion Recognition Using CNN and LSTM Model Approach Kotha Manohar and E. Logashanmugam
Abstract Emotion identification is an interdisciplinary research topic that has received much attention from scholars in recent years. Automatic emotion recognition aims to provide an interface between machines and people. It remains a difficult problem, especially when emotion recognition is performed using speech. Numerous significant research works have addressed emotion sensing using speech signals. The essential challenges of emotion recognition are choosing the emotion recognition corpora, finding the various features related to speech, and selecting appropriate classification models. In this work, we utilize Mel Frequency Cepstral Coefficients (MFCC) as features extracted from speech and a convolutional neural network (CNN) with a long short-term memory (LSTM)-based methodology for classification purposes. We chose the Berlin Emotional Speech dataset (EmoDB) for classification. From the outcomes, it is clear that the proposed CNN-LSTM-based SER system has outperformed other existing works, with an accuracy of 85% across all types of emotions.
8.1 Introduction
Speech signals play an essential part in communicating the feelings of the speaker. Perceiving emotions from speech signals has been a focus of ongoing study, since it paves the way for AI frameworks to be built. Emotion recognition has been used in several domains, such as artificial intelligence and pattern recognition, to construct machine–human connections. The emotional feelings in the speech signal depend on the speaker's style; additionally, the presence of long speech segments containing a single emotion makes other emotions harder to distinguish. As a result, it is critical to look at the emotions in the audio signals over various frame lengths. Speech emotion recognition (SER) is a technique for detecting emotions in audio signals. SER recognizes the emotional condition of the speaker by perceiving the feelings in the speech. Incorporating SER into artificial intelligence applications makes human–machine communication more practical. Recognizing the appropriate emotions present in
the speech by the emotion recognition framework helps speech analysis frameworks, and consequently applications such as intelligence and surveillance benefit from SER. The speech signal used for emotion recognition contains an assortment of utterances, and every utterance in the speech signal carries an emotion. So, the emotion recognizer has to identify the emotions present in every expression of the speech. Additionally, the emotional content of speech differs depending on lifestyle and the surrounding environment [1]. Extracting from the speech the elements that indicate the speaker's emotion is one of the essential phases in emotion recognition. The use of hand-crafted features has turned out to be inadequate for determining the emotions in speech signals; as a result, enhanced characteristics must be used for feature extraction. Researchers have contributed substantially to the advancement of SER by presenting different feature selection methods. The features are of central importance in emotion recognition, since the features for different emotions differ from one another. Different features, like linear predictor coefficients (LPC), linear predictor cepstral coefficients (LPCC), and Mel Frequency Cepstral Coefficients (MFCC), contribute fundamentally to emotion recognition, and many deep learning methods are available for classification. Emotion recognition frameworks can be categorized into categorical systems and dimensional systems. The first type identifies the feelings in the speech as sad, happy, stressed, neutral, or angry, whereas in the second type the feelings are perceived in terms of valence and arousal. Also, it is hard to select the emotion-relevant characteristics from the many available features. Utilizing features identified with explicit emotions or with the language may improve the classifier's performance [2–4]. The choice of features and the database size play a significant part in the emotion recognition model. The fundamental steps in an emotion recognition framework are as follows: an emotional speech corpus is chosen or created; then, emotion-specific features are extracted from those speech samples; lastly, a classification model is utilized to identify the speaker's emotion. The real challenge of identifying emotions from speech is that each utterance is quite long. Hence, the MFCC feature extraction runs in a sliding-window fashion: it places a 25 ms frame on the speech signal and computes 13 cepstral coefficients from each frame, and those coefficients are utilized as features [5]. Depending on the length of the speech signal, MFCC returns a different number of frames. Thus, every speech signal yields a different number of features, which is not satisfactory. To make all the speech signals of equal length, we have applied some preprocessing techniques. We have utilized the CNN-LSTM classification model for the classification of emotions. Generally, CNNs are utilized on two-dimensional inputs; a spectrogram image formed from an audio signal has been used as the CNN input in several studies. In our work, the input to the CNN is a one-dimensional feature space with 39 features per frame. We also feed the CNN's output into an LSTM memory architecture [6, 7]. As the dataset, we used the Berlin Database of Emotional Speech (EmoDB) [8], which contains 535 utterances spoken by ten different actors.
There are seven emotional states included in this speech database: anger, happiness, anxiety/fear, boredom, sadness, neutral, and disgust [8–10].
Deng et al. proposed an SER method in which the extracted features were arranged with the DTPM strategy. It used an SVM classifier for emotion classification, which achieves a better emotion classification rate but lags when recognizing continuously varying emotions [11]. Badshah et al. presented a DCNN and kernel-based SER model, which uses pooling operators and square kernels to train the convolutional network. This model uses spectrograms of the input speech signal to identify the emotions. Because this model was trained using only minimally labeled data, training with such rectangular kernels does not ensure correct emotion identification [12]. The paper is structured as follows: Sect. 8.1 introduces the emotion detection system and contemporary research that has made significant contributions to speech signal emotion recognition. The MFCC model of speech signal feature extraction is explained in Sect. 8.2.1. The suggested CNN-LSTM-based architecture for SER is explained in Sect. 8.2.2. The proposed method's outcomes are described in Sect. 8.3, and the paper's conclusion is summarized in Sect. 8.4.
8.2 Materials and Methods

8.2.1 MFCC Model

MFCC feature extraction is depicted in block diagram form in Fig. 8.1; it is used to extract the features required to identify emotions in the SER system. MFCCs are calculated based on the characteristics of human hearing. In the MFCC technique, two kinds of filters are utilized: some filters are spaced linearly below 1 kHz, and the others are positioned logarithmically above 1 kHz. The feature extraction approach used by MFCC is broken down into a few steps, which are detailed here.
A. Preprocessing
We apply certain preprocessing procedures to the dataset before using MFCC. All of the speech files are in the .wav format with a sampling frequency of 16 k samples, and we first determine the intensity of each file.
Fig. 8.1 Sequential steps in MFCC
We then take a weighted mean according to the durations of the speech files and equalize the recordings by zero-padding the shorter files up to this mean length, so that all speech files now have the same length.
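A rough illustration of this length-equalization step is given below (a sketch only; the exact handling of recordings longer than the mean is not spelled out in the text, so truncation to the mean length is assumed here):

import numpy as np
import librosa

def load_equal_length(paths, sr=16000):
    """Load .wav files at 16 kHz and make all signals the same length."""
    signals = [librosa.load(p, sr=sr)[0] for p in paths]
    target_len = int(np.mean([len(s) for s in signals]))  # mean duration, as described
    equalized = [
        np.pad(s[:target_len], (0, max(0, target_len - len(s))))  # pad short, trim long
        for s in signals
    ]
    return np.stack(equalized), sr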
B. Pre-emphasis
Pre-emphasis is applied to boost the signal's intensity. During this phase, the speech signal is passed through a filter that increases the energy of the signal's higher-frequency components; this added high-frequency energy carries further information.
C. Framing
The speech signal is divided into 20–40 ms frames in this step. Because the duration of human speech varies, this step is necessary to ensure that the analyzed segments are of consistent size. Although the speech signal is not stationary (i.e., its frequency content shifts over time), it behaves as approximately stationary over such a short period.
D. Windowing
The windowing step comes after framing, and it reduces the discontinuities of the voice signal at the beginning and end of each frame. The frame is advanced by 10 ms in this step, which means that each frame contains some of the previous frame's content.
E. FFT
The FFT is utilized to produce the frequency spectrum of each frame. All the frames of the speech signal are transformed from the time domain to the frequency domain by the FFT. It is used to discover the frequencies present in a specific frame [13].
F. Mel scale filter bank
Each frame of the speech signal is passed through a set of 20–30 triangular filters. This set of filters determines the quantity of energy in a given frame. Equation (8.1) describes the numerical relationship between the ordinary frequency X (in Hz) and the Mel scale value Y:

Y = 2595 · log10(1 + X/700)    (8.1)
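For reference, Eq. (8.1) and its inverse are straightforward to code; the function names below are illustrative only.

import math

def hz_to_mel(f_hz: float) -> float:
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel: float) -> float:
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

print(hz_to_mel(1000.0))   # ≈ 1000 mel near the 1 kHz reference point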
G. Log function
After determining the energy in each filter of the frame's filter bank, a logarithmic function is applied. This is motivated by human hearing, which does not perceive loudness on a linear scale: at large volumes of sound, the human ear is unable to distinguish large differences in energy. Taking the log of the energies mimics this characteristic of human hearing.
H. DCT
The DCT is computed over these log energies in the last stage. We used 25 ms frames with a sliding step of 10 ms and 26 filters. We computed 13 MFCC features from each frame and also calculated the energy of each frame. After obtaining the 13 MFCC features, we estimated thirteen velocity and thirteen acceleration components by taking the derivatives of the MFCC and energy values [14].
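The chapter does not name a toolkit, so the sketch below uses librosa as an assumed stand-in to produce the 39-dimensional frame features mentioned earlier: 13 MFCCs per 25 ms frame with a 10 ms hop and 26 Mel filters, plus their first (velocity) and second (acceleration) derivatives.

import numpy as np
import librosa

def mfcc_39(path: str, sr: int = 16000) -> np.ndarray:
    y, sr = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(
        y=y, sr=sr, n_mfcc=13, n_mels=26,
        n_fft=int(0.025 * sr),         # 25 ms analysis frame
        hop_length=int(0.010 * sr),    # 10 ms sliding step
    )
    delta = librosa.feature.delta(mfcc)            # velocity coefficients
    delta2 = librosa.feature.delta(mfcc, order=2)  # acceleration coefficients
    return np.vstack([mfcc, delta, delta2]).T      # shape: (frames, 39)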
8.2.2 Emotion Classifier
A classification framework categorizes every speech signal frame into an appropriate emotion class according to the features extracted from the speech frames. Various classification techniques are available for emotion identification. Here, we utilized a one-dimensional CNN with an LSTM model for categorization.
8.2.2.1 CNN
A CNN comprises several convolution layers. A rectified linear unit (ReLU) or sigmoid function is utilized as the activation function. The nodes of the input, hidden, and output layers are connected serially. Convolution operations on the input and hidden layers produce the output of the CNN: every input speech frame is convolved with the filters of the different layers, and the results are combined to get the final output. One of the essential layers in the network is the pooling layer, which subsamples the output of a specific filter. Max pooling is a typical pooling method: the maximum value is chosen from every filter response. The pooling layer decreases the dimension of the input information; it can also act on a window rather than on the entire matrix. After obtaining the final output of the CNN, a model such as an RNN, LSTM, or multi-layer perceptron (MLP) can be utilized for training. Figure 8.2 depicts the classifier model employed in our work.
8.2.2.2 LSTM
The final output of the network is processed by the LSTM. When classifying the emotions of speech signals, recurrent neural network (RNN)-based model designs have given better results. RNNs use the temporal relationships present in a dataset, so they are helpful for emotion recognition from speech. The performance of RNNs is outstanding, but they suffer from the vanishing gradient problem, because an RNN unrolls with time according to the input sequence length; consequently, an RNN does not work well for long-duration input speech signals. This problem can be avoided by always keeping the error term at unity, a mechanism called the constant error carousel (CEC), which is implemented by replacing the RNN cell with the LSTM cell. The point of this newly added node is to realize the CEC with the assistance of the various gate units shown in Fig. 8.3.
Fig. 8.2 CNN classifier model
Fig. 8.3 LSTM memory cell
Memory cells are the nodes of the LSTM architecture. By enabling and disabling the gates, data can be added to or removed from the cell state. This gives the system the capacity to retain or forget information about states that appeared long ago. The cell state and the output at the Kth time instant are denoted C_K and H_K. The forget gate layer resets the cell state. The input gate layer chooses how much of the new information will influence the present cell state. The output gate layer chooses how much of the cell state will influence the rest of the network. In this way, the LSTM eliminates the issue of the vanishing error gradient, which enables us to use the RNN-LSTM for training on longer-duration signals.
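For completeness, the gate mechanism of Fig. 8.3 corresponds to the standard LSTM update equations (the usual textbook formulation, not specific to this paper), written with the C_K/H_K notation used above, x_K denoting the input at step K, σ the sigmoid function, and ⊙ element-wise multiplication:

f_K = σ(W_f · [H_(K−1), x_K] + b_f)        (forget gate)
i_K = σ(W_i · [H_(K−1), x_K] + b_i)        (input gate)
C̃_K = tanh(W_C · [H_(K−1), x_K] + b_C)     (candidate cell state)
C_K = f_K ⊙ C_(K−1) + i_K ⊙ C̃_K            (cell state update)
o_K = σ(W_o · [H_(K−1), x_K] + b_o)        (output gate)
H_K = o_K ⊙ tanh(C_K)                      (cell output)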
8.3 Results and Discussion
Initially, we separated the entire dataset into two groups of 80 and 20%: 80% of the dataset was used to train the system and 20% was used to test it. The MFCC features with velocity and acceleration coefficients are calculated for each record of the training and test datasets, and these MFCC features are fed into the CNN as input. We used a CNN with three layers having 32, 16, and 8 filters, set 500 epochs, and utilized the "adadelta" optimizer and the ReLU activation function in our network. The LSTM network has two hidden layers with 50 and 20 nodes in the first and second layers and uses the "softmax" activation function at the output. We also utilized cross-entropy for the loss calculation. After 500 epochs, we achieved 90% training accuracy and 85% test accuracy. The error and accuracy versus epochs are shown graphically in Figs. 8.4 and 8.5, respectively.
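The exact network definition is not given in the chapter; the following is a sketch that assembles the reported hyperparameters (three Conv1D layers with 32/16/8 filters, ReLU, "adadelta", LSTM layers with 50 and 20 units, softmax over the seven EmoDB emotions, cross-entropy loss). Kernel sizes, pooling, and the number of frames per utterance are assumptions.

import tensorflow as tf

N_FRAMES, N_FEATURES, N_CLASSES = 300, 39, 7   # frames per utterance is an assumed value

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(N_FRAMES, N_FEATURES)),
    tf.keras.layers.Conv1D(32, 5, activation="relu", padding="same"),
    tf.keras.layers.MaxPooling1D(2),
    tf.keras.layers.Conv1D(16, 5, activation="relu", padding="same"),
    tf.keras.layers.MaxPooling1D(2),
    tf.keras.layers.Conv1D(8, 5, activation="relu", padding="same"),
    tf.keras.layers.LSTM(50, return_sequences=True),
    tf.keras.layers.LSTM(20),
    tf.keras.layers.Dense(N_CLASSES, activation="softmax"),
])
model.compile(optimizer="adadelta", loss="categorical_crossentropy", metrics=["accuracy"])
# model.fit(X_train, y_train, epochs=500, validation_data=(X_test, y_test))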
Fig. 8.4 Error versus epochs
Fig. 8.5 Accuracy versus epochs
Table 8.1 Confusion matrix of test data

Emotion     Anger  Sadness  Neutral  Boredom  Happiness  Disgust  Fear
Anger        20      0        0        0         1          0       0
Sadness       0     10        0        1         0          0       0
Neutral       0      0        8        4         0          0       0
Boredom       0      0        1       12         0          0       1
Happiness     0      0        0        0        10          1       3
Disgust       0      0        0        0         0          7       2
Fear          0      0        2        0         1          0      10
The confusion matrix for the test data is presented in Table 8.1. The diagonal values of this matrix are the correctly recognized emotions. For the majority of the emotions, our network can recognize the appropriate emotion with a high degree of precision.
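A quick way to read Table 8.1 is to compute the per-class recall (the diagonal entry divided by the row sum); the values below are simply transcribed from the table.

import numpy as np

labels = ["Anger", "Sadness", "Neutral", "Boredom", "Happiness", "Disgust", "Fear"]
cm = np.array([
    [20, 0, 0, 0, 1, 0, 0],
    [0, 10, 0, 1, 0, 0, 0],
    [0, 0, 8, 4, 0, 0, 0],
    [0, 0, 1, 12, 0, 0, 1],
    [0, 0, 0, 0, 10, 1, 3],
    [0, 0, 0, 0, 0, 7, 2],
    [0, 0, 2, 0, 1, 0, 10],
])
per_class_recall = cm.diagonal() / cm.sum(axis=1)
for name, r in zip(labels, per_class_recall):
    print(f"{name:10s} {r:.2f}")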
8.4 Conclusion
Utilizing MFCC features extracted from the input speech signal together with a CNN-LSTM-based classification model for identifying emotion from speech is one of the best approaches for a conventional emotion recognition framework. Even though the size of the dataset is not large, our proposed model's performance is sufficiently promising. Standardizing the input data or using a bidirectional LSTM rather than a plain LSTM could lead to better results. Likewise, the emotion recognition system will deliver better results if a more extensive dataset is provided to the system.
References

1. Mannepalli, K., Sastry, P.N., Suman, M.: Emotion recognition in speech signals using an optimization-based multi-SVNN classifier. J. King Saud Univ. Comput. Inf. Sci. (2018). ISSN 1319-1578. https://doi.org/10.1016/j.jksuci.2018.11.012
2. Issa, D., Fatih Demirci, M., Yazici, A.: Speech emotion recognition with deep convolutional neural networks. Biomed. Signal Process. Control 59, 101894 (2020). ISSN 1746-8094
3. Lim, W., Jang, D., Lee, T.: Speech emotion recognition using convolutional and recurrent neural networks. In: 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA), pp. 1–4 (2016). https://doi.org/10.1109/APSIPA.2016.7820699
4. Manohar, K., Sravani, K., Ponnapalli, V.A.S.: An investigation on Scilab software for the design of transform techniques and digital filters. In: 2021 International Conference on Computer Communication and Informatics (ICCCI), pp. 1–5 (2021). https://doi.org/10.1109/ICCCI50826.2021.9402694
5. Manohar, K., Irfan, S., Sravani, K.: Object recognition with improved features extracted from deep convolution networks. Int. J. Eng. Technol. 7 (2018)
6. Elbarougy, R., Akagi, M.: Cross-lingual speech emotion recognition system based on a three-layer model for human perception. In: 2013 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, pp. 1–10 (2013). https://doi.org/10.1109/APSIPA.2013.6694137
7. Anagnostopoulos, C.N., Iliou, T., Giannoukos, I.: Features and classifiers for emotion recognition from speech: a survey from 2000 to 2011. Artif. Intell. Rev. 43, 155–177 (2015). https://doi.org/10.1007/s10462-012-9368-5
8. Basu, S., Chakraborty, J., Aftabuddin, M.: Emotion recognition from speech using convolutional neural network with recurrent neural network architecture. In: 2017 2nd International Conference on Communication and Electronics Systems (ICCES), pp. 333–336 (2017). https://doi.org/10.1109/CESYS.2017.8321292
9. http://emodb.bilderbar.info/docu/#docu
10. Likitha, M.S., Gupta, S.R.R., Hasitha, K., Raju, A.U.: Speech based human emotion recognition using MFCC. In: 2017 International Conference on Wireless Communications, Signal Processing and Networking (WiSPNET), pp. 2257–2260 (2017). https://doi.org/10.1109/WiSPNET.2017.8300161
11. Deng, J., Xu, X., Zhang, Z., Fruhholz, S., Schuller, B.: Semi-supervised autoencoders for speech emotion recognition. IEEE/ACM Trans. Audio Speech Lang. Process. (2017)
12. Badshah, A.M., Rahim, N., Ullah, N., Ahmad, J., Muhammad, K., Lee, M.Y., Kwon, S., Baik, S.W.: Deep features-based speech emotion recognition for smart affective services. Multimed. Tools Appl. 1–19 (2017)
13. Prasad, K.M.V.V., Suresh, H.N.: Integrated framework to study efficient spectral estimation techniques for assessing spectral efficiency analysis. Int. J. Adv. Comput. Sci. Appl. (IJACSA) 8(10) (2017)
14. Prasad, K.M.V.V., Suresh, H.N.: An efficient parametric model-based framework for recursive frequency/spectrum estimation of nonstationary signal. Int. J. Eng. Technol. 7(4.6) (2018)
Chapter 9
Recognizing the Faces from Variety of Poses and Illumination T. Shreekumar, N. V. Sunitha, K. Suma, Sukhwinder Sharma, and Puneet Mittal
Abstract The advancement of face recognition systems is still restricted by the conditions imposed by many real applications, even though current face recognition systems have attained a moderate level of maturity. A facial recognition system is a technology capable of matching a human face from a digital image or a video frame against a database of faces. The proposed work introduces a novel approach for eliminating the illumination problem from facial images using the 4-patch Local Binary Pattern (4-patch-LBP); a Convolutional Neural Network (CNN) is then used to recognize the faces. The method showed an average recognition performance of 92.3% and a maximum performance of 96% on the YouTube Faces (YTF) database.
9.1 Introduction
Recognizing the human face is one of the most active topics in image processing research because of its applications in security, human–machine interaction, biometric identification, etc. [1–3]. It is very difficult to recognize faces that are degraded by noise and blur. One of the major problems in FR is image blur: images lose much of the information they contain due to the blurring, and identifying faces from blurred images is very difficult. One solution to this problem is restoring the image by developing an efficient de-blurring method; removing the blur from the face images may improve recognition performance. Noise in an image is undesirable variation in brightness levels and color. It conceals the required information in the image, and digital images are usually affected by noise [4]. The accuracy of most face recognition techniques drops for noisy images. Before applying a recognition technique, various methods need to be used to remove the noise from the image [5]. One approach is to transfer the image to a separate domain where the noise can be separated easily. An alternative approach
is to take the image statistics directly from the image domain. These two methods can give better images, but edge information may be lost. Hence there are methods, like fuzzy local binary patterns, to directly recognize the face from noisy images [6]. Another common problem faced by recognition systems is image blur: the facial appearance of two or more persons may look the same due to the presence of blur [7]. The reasons for blurred images can be environmental disturbance, object movement, camera shake, or an out-of-focus camera. When the image is captured by the sensor, a blurred image may result from fast movement of the sensor or of the object being captured [8]. The sharpness of a body in a photo depends on its distance from the focal plane of the camera: the best sharpness is obtained when the body is exactly on the focal plane, and the image looks increasingly blurred as the distance between the object and the focal plane increases [9]. When a photo of a scene with multiple objects is captured, the camera may have focused on only one object, and the remaining objects may look blurred because they are out of focus. Noise and blur degrade the performance of the face recognition system by hiding the facial features that need to be recognized. Therefore, techniques are required to deal with these two factors and restore the quality of the face. In this work, a Convolutional Neural Network is used to remove noise and blur from the test image.
9.2 Literature Survey
The work in [8] proposed different techniques for illumination normalization in face recognition. In that work, comparison and analysis of five different preprocessing techniques are carried out using Euclidean and cosine distances. DCT, GIC, and different histogram remapping techniques are used as normalization techniques. Among these techniques, GIC gave the best result when used with Euclidean distance for the considered databases, but its performance is not up to the mark for the CAS-PEAL database. The work in [9] analyzed the face recognition rate using LDA, PCA, NCC, and SVM on the ORL database. Here, NCC and SVM are the classifiers, trained with both negative and positive samples of each image. Experiments are carried out on different combinations of these methods, like PCA and LDA, PCA combined with LDA and NCC, and PCA combined with LDA and SVM. Redundant information is removed by applying geometric normalization to the image, and features are extracted from this normalized image using PCA and LDA. From the results of the experiments, it is clear that the combination of PCA, LDA, and SVM gives the best recognition rate. The work in [3] introduced an automated face recognition system that is able to recognize a person from an image that was not used in the training. PCA is used for dimensionality reduction and LDA for feature extraction. The analysis and evaluation of the system are done through several classification algorithms like NN, SVM,
ANFIS, and K-NN. The results show that the combination of LDA and PCA for feature extraction with an SVM classifier gives a good recognition rate of 96%. The paper [10] proposed a model called SeqFace that uses a deep neural network to learn discriminative features. The performance of FR has increased considerably in recent years using deep learning methods, and many datasets are available to train these deep learning networks; to achieve state-of-the-art efficiency, high-quality datasets are necessary but expensive. SeqFace also provides a face dataset which includes a large number of face sequences collected from videos. To improve the discriminative power of deep face features, a new loss function called discriminative sequence agent (DSA) loss and a label smoothing regularization (LSR) are proposed using the video sequences. This method was tested on Labeled Faces in the Wild (LFW) and YouTube Faces (YTF) with a single ResNet and achieved accuracies of 98.12% and 99.03%, respectively. On synthetic noise, deep learning-based de-noising methods achieve the best results, but they are not suitable for the realistic noise of corrupted photographs; real-world noisy face images are as challenging as images with complicated synthetic noise. This issue is addressed in [11] by proposing NERNet, a novel network that includes noise estimation and removal modules. The noise level map is automatically estimated by the noise estimation module, which comprises a pyramid feature fusion block and a symmetric dilated block. With the help of the estimated noise level map, the removal module focuses on removing the noise from the noisy input. This method performs better on both realistic and synthetic noise; experiments using three realistic and two synthetic noise datasets show that NERNet achieves competitive results compared to state-of-the-art methods. In [12], the relation between video frames is used for video de-blurring. The authors set up a convolutional recurrent neural network architecture which uses inter-frame information: information from past frames is passed to the current frame in the form of the hidden state. An iterative hidden-state update scheme within a single inter-frame time step is employed to fit the hidden state to the target frame, and intra-frame iteration is done in the same form. The intra-frame recurrence is inspected by varying the RNN compositions, and experiments show that restoration accuracy is improved by the intra-frame recurrence scheme. Each model is trained with a predefined intra-frame iteration number, with the number of internal iterations chosen randomly and decided by a gating number. The RNN cells are regularized to improve performance. This method removes blur in video frames effectively and is fast and accurate compared to other methods.
9.3 Face Recognition Using Convolutional Neural Network
Video-based recognition performance is normally affected by image quality, pose variation, brightness variation, partial occlusions, noise, and blur in the image/video. In a video-based recognition system, two critical issues must be addressed.
Fig. 9.1 Overall block diagram of the proposed face recognition model (video data → frames → 4-Patch-LBP feature extraction → CNN training and testing → recognition)
These are: (i) feature extraction and representation, and (ii) developing a robust model to identify the face from the represented features. A CNN model is devised to identify the faces. Normally a deep CNN-based system is used to perform the FR task, but some results indicate that CNN performance can be increased by training the model with extracted features [3]. Hence, in the proposed method, illumination-robust 4-Patch LBP features are used to train the model. The experiments clarify that the
computation speed and the statistical performance are better compared to a CNN-based FR where the inputs to the CNN are images instead of image features. Figure 9.1 shows the overall structure of the system.
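As a simplified illustration of this feature-then-classifier idea, the sketch below computes a standard LBP histogram per face with scikit-image; note this is only a stand-in for the Four-Patch LBP variant actually employed in the paper, whose exact implementation is not given here.

import numpy as np
from skimage.feature import local_binary_pattern

def lbp_histogram(gray_face: np.ndarray, points: int = 8, radius: int = 1) -> np.ndarray:
    """gray_face: 2-D uint8 array of a cropped, grayscale face image."""
    codes = local_binary_pattern(gray_face, P=points, R=radius, method="uniform")
    hist, _ = np.histogram(codes, bins=points + 2, range=(0, points + 2), density=True)
    return hist.astype("float32")   # compact, illumination-robust descriptor

# Each frame's histogram (or a stack of them over a video clip) becomes the CNN
# input instead of the raw image, which is what keeps training and inference fast.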
9.4 Experimental Results
In this part, the experimental results are presented in detail. We used a 3.2 GHz Intel Core i5 processor and MATLAB version 7.2 for the experiments. The result analysis is carried out with the YouTube Video Faces database (YTF); Table 9.1 summarizes the data used in our experiment. For the experiment, four-fold datasets are used, each containing 10 images, all obtained from the YTF face set. A maximum result of 96.00% was obtained during the experiment, as tabulated in Table 9.2, which also shows an average accuracy of 92.3%. A further experiment was conducted to record the recognition speed: five separate runs were carried out and the recognition speed is recorded in Table 9.3.

Table 9.1 The databases and the data samples

Datasets            Identities  Images  Images/identity  Image size  Image type
YouTube Face (YTF)  50          700     14               320 × 240   JPEG
Table 9.2 Recognition result with YTF Video Face

Statistical measures  Dataset1  Dataset2  Dataset3  Dataset4  Average
Recall                96.00     94.00     92.00     88.00     91.5
Precision             96.00     94.00     92.16     88.00     91.6
Accuracy              96.00     94.00     93.00     88.00     92.75
Table 9.3 Computation speed in seconds

Experiment No.  LBP-CNN
1               0.0450
2               0.0664
3               0.0645
4               0.0676
5               0.0697
Fig. 9.2 Accuracy analysis (accuracy: Proposed 96%, LBP-CNN 95%, AAM 90%, CNN 92%)
9.4.1 Comparison with State-of-the-Art Methods
The proposed method is compared with LBP-CNN [13], Active Appearance Model (AAM)-based face recognition [14], and an image-based CNN method. The image-based CNN is implemented for comparison by selecting images from YTF; it takes high-resolution images as input to the training model, and the model extracts features in its three hidden layers. The proposed method instead uses four-patch LBP features to train the model. Figure 9.2 shows the result of the experiment. From the result, it is clear that the proposed system obtained a maximum accuracy of 96%, AAM [14] obtained 90%, LBP-CNN [13] obtained 95%, and the image-based CNN model obtained 92%. The proposed method's accuracy of 96% is higher than that of all the selected models.
9.5 Conclusion
For illumination normalization, techniques based on the local binary pattern are very effective; hence a new version of LBP, i.e., 4-Patch-LBP, is adopted in our experiment for face feature extraction. A deep learning method, i.e., a CNN-based recognition technique, is effective for accurate prediction, and CNN-based prediction is more effective when the inputs are features. Using a combination of 4-Patch-LBP and CNN, we obtained an accuracy of 96% on the YTF face set.
References

1. Shreekumar, T., Karunakara, K.: Face pose and illumination normalization for unconstraint face recognition from direct interview videos. Int. J. Recent Technol. Eng. (TM) 7(6S4), 59–68 (2019). ISSN: 2277-3878 (Online)
2. Shreekumar, T., Karunakara, K.: Face pose and blur normalization for unconstraint face recognition from video/still images. Int. J. Innov. Comput. Appl. (in press). ISSN: 1751-648X
3. Shreekumar, T., Karunakara, K.: Identifying the faces from poor quality image/video. Int. J. Innov. Technol. Explor. Eng. (IJITEE) 8(12), 1346–1353 (2019). ISSN: 2278-3075
4. Shreekumar, T., Karunakara, K.: A video face recognition system with aid of support vector machine and particle swarm optimization (PSO-SVM). J. Adv. Res. Dyn. Control Syst. (JARDCS) 10, 496–507 (2018)
5. Chai, X., Shan, S., Chen, X., Gao, W.: Locally linear regression for pose-invariant face recognition. IEEE Trans. Image Process. 16(7), 1716–1725 (2007). https://doi.org/10.1109/TIP.2007.899195
6. Hosgurmath, S., Mallappa, V.V.: Grey wolf optimizer with linear collaborative discriminant regression classification based face recognition. Int. J. Intell. Eng. Syst. 12(2), 202–210 (2019)
7. Shermina, J.: Impact of locally linear regression and Fisher linear discriminant analysis in pose invariant face recognition. Int. J. Comput. Sci. Netw. Secur. 10(10), 106–110 (2010)
8. Dhamija, J., Choudhury, T., Kumar, P., Rathore, Y.S.: An advancement towards efficient face recognition using live video. In: 2017 3rd International Conference on Computational Intelligence and Networks (CINE), Odisha, pp. 53–56 (2017). https://doi.org/10.1109/CINE.2017.21
9. Mantoro, T., Ayu, M.A., Suhendi: Multi-faces recognition process using haar cascades and eigenface methods. In: 2018 6th International Conference on Multimedia Computing and Systems (ICMCS), Rabat, pp. 1–5 (2018). https://doi.org/10.1109/ICMCS.2018.8525935
10. Siddiqui, M.Y.F., Sukesha: Face recognition using original and symmetrical face images. In: 2015 1st International Conference on Next Generation Computing Technologies (NGCT), Dehradun, pp. 898–902 (2015). https://doi.org/10.1109/NGCT.2015.7375249
11. Petpairote, C., Madarasmi, S., Chamnongthai, K.: A pose and expression face recognition method using transformation based on single face neutral reference. In: 2017 Global Wireless Summit (GWS), Cape Town, pp. 123–126 (2017). https://doi.org/10.1109/GWS.2017.8300485
12. Chen, Z., Xu, T., Han, Z.: Occluded face recognition based on the improved SVM and block weighted LBP. In: 2011 International Conference on Image Analysis and Signal Processing, Hubei, pp. 118–122 (2011). https://doi.org/10.1109/IASP.2011.6109010
13. Zhang, H., Qu, Z., Yuan, L., Li, G.: A face recognition method based on LBP feature for CNN. IEEE Xplore (2017)
14. Prasanna, K.M., Rai, C.S.: A new approach for face recognition from video sequence. In: 2018 2nd International Conference on Inventive Systems and Control (ICISC), Coimbatore, pp. 89–95 (2018). https://doi.org/10.1109/ICISC.2018.8398969
Chapter 10
Experimental Analysis of Cold Chamber with Phase Change Materials for Agriculture Products G. Bhaskara Rao and A. Parthiban
Abstract Fruits and vegetables are very healthy products with considerable value for people's health. They are also very perishable and therefore readily spoiled, resulting in a decrease in quality and in food waste. Over the years, cold chain solutions have been used to decrease the loss of quality of crops and fruits in the food supply chain. Nevertheless, high losses (50%) continue to occur when these fresh agricultural products are packaged, pre-cooled, distributed, and stored. Storage of these products requires chambers with a stable temperature. This study first describes the application of a PCM layer on the outside of a refrigerated cold chamber, in comparison with a standard envelope, to lower and shift the cooling energy demand. To this end, a numerical and experimental study methodology was used to evaluate the suggested technology; calculation results were compared with experimental values to validate the mathematical model. The implementation of a PCM air heat exchanger at the cold room evaporator was also investigated experimentally for refrigerated storage. Nano-enhanced PCMs, namely 40% tetra n-butyl ammonium bromide + 2% borax, 45% tetra n-butyl ammonium bromide + 1% borax, and RT21 paraffin, were used in the investigation to check the temperature variation in the cold chamber.
10.1 Introduction
The primary energy issue is directly or indirectly caused by refrigeration and air-conditioning systems, as their use in the household, commercial, and transportation sectors is growing rapidly. Currently, power interruptions are most commonly caused by accidents or by demand-side management schemes that shift energy consumption so that high electrical costs during peak periods are avoided (electric load shifting). Most frozen and refrigerated foods are sensitive to variations in temperature. The heat penetrating the walls contributes significantly to the heat load of a cold plant. The cooling system removes this heat, but when power fails the stored product is no longer cooled. Thermal energy storage (TES) can use
phase change materials for heat and cold storage at shifted times. A phase change material (PCM) melts within a small temperature range and absorbs significant quantities of energy during the phase transition, limiting the rise of the internal temperature towards the ambient one. With an appropriate melting temperature, a PCM may be used as added heat capacity to preserve the appropriate internal temperature during power loss. In load-shedding applications, PCM can also be used to shift energy consumption to an optimal time. In recent years, cold thermal energy storage (CTES) systems have gained interest. A cold store is a specific sort of room which maintains a very low temperature with machines and precision equipment. The geographic location of India and its vast array of soils produce a range of fruits and vegetables, like apples, grapes, oranges, potatoes, chilies, and ginger. Marine goods are also produced in large quantities owing to the large coastal areas. Present production of fruit and vegetables is over 100 million tons and, with population growth and demand, the volume of perishable commodities grows every year. Cold storage is the main infrastructural component for such commodities. Besides regulating the market price and enabling supply on demand and on schedule, cold storage provides other benefits to both farmers and consumers. During shipments, the temperature and relative humidity at various places in the cold chain system often vary widely. The diverse nature of refrigeration systems, food characteristics, and containers can lead to significant variations in the approach airspeed for different fruits and vegetables.
10.2 Literature Review
Etheridge et al. [1] designed a heat pipe/PCM cooling module to reduce air conditioning in building applications. The heat pipes, acting as coolers, transfer thermal energy to a PCM that is frozen during the night and can be melted during the day. Rifat et al. [2] designed a novel thermoelectric cooling module with heat pipes and PCM. Two cases were investigated: in the first, the cold side of the thermoelectric device was fitted with a heat sink, whereas in the second a PCM was employed instead of a heat sink. In both scenarios, a finned heat pipe was used as the heat sink on the hot side of the thermoelectric unit. Al-Maghalseh and Mahkamov [3] focused on heat transfer enhancement techniques in PCM heat-storage modules. Fins and porous materials are described as means of improving heat transfer in PCMs. With these techniques, a larger module is required to provide the same quantity of stored heat, which is the main disadvantage of such systems. Jaguemont et al. [4] investigated the potential of PCMs for automotive applications. Their analysis suggested that, provided the thermal conductivity of these materials is significantly improved, PCMs are a cost-effective and simple thermal solution in automotive systems.
Huang et al. [5] focused on the potential of micro-encapsulated PCM in heat-storage systems. They showed that the spherical structure of micro-encapsulated PCM increases the surface-to-volume ratio, which can increase the PCM heat transfer rates. Kamel et al. [6] reviewed integrated solar heat pump modules. They noted that the design of the control unit is a major problem when combining an air heat pump with a photovoltaic/thermal module, since it must facilitate optimal operation of the module under various conditions. Pesaran et al. [7] studied many models for simulating adsorption heat pumps, noting that these heat pumps use environmentally friendly fluids and can use low-grade waste heat as their primary drive. Byrne and Ghoubali [8] evaluated the potential of air heat pumps to deliver cooling and heating simultaneously. The simultaneous mode produces warm water using thermal energy taken from cold water, which becomes colder; in this case there are two simultaneous amounts of thermal energy that can be utilized, one for heating and one for cooling. Rashidi et al. [9] noted that the objective of coupling a PCM with a heat pipe is to combine the good thermal conductivity of the heat pipe with the large latent heat capacity of the PCM. PCMs can balance the difference between supply and demand for heat pumps and reduce the storage tank size of the heat pump module. Han et al. [10] evaluated the potential of pulsating heat pipes. Their review demonstrated that the thermal efficiency of pulsating heat pipes is significantly affected by the internal diameter, weight, number of circulations, load ratio, and angle of tilt. Wu et al. [11] evaluated the potential of heat pipe systems used in building heat pump modules. The use of heat pipes in ground heat pump modules has the potential to increase the temperature difference between the geothermal and evaporation elements of the ground heat pump and to reduce the energy consumed by the circulation pump. Shafeian et al. [12] focused on the potential of heat pipe solar collectors. They determined that improving the efficiency of solar thermal heat pipes is very important and that, to achieve this objective, novel fluids such as various nanofluids have been used. Valipour et al. [13] simulated forced convection around a porous cylinder. Chen et al. [14] examined the efficiency of using PCM as grout for a vertical U-pipe heat exchanger used in a ground heat pump module. They utilized the least-squares methodology to determine the average Nusselt number. They observed that PCM grouts with modest thermal conductivity values significantly lower the module performance, whereas PCM grouts with high thermal conductivity, equivalent to that of a conventional grout, can improve the module's performance and operational stability. Bottarelli et al. [15] used PCM in a ground-source heat pump. The PCM was either mixed directly with the backfill material near the heat exchanger or held in an enclosing shell directly linked to the heat exchanger. Nahor et al. used the standard k–ε model to determine the air velocity magnitude in the empty and filled space, whereas Delele et al. used a number of turbulence models for the prediction of internal air distribution (k–ε, RNG k–ε, realizable k–ε, and k–ω Shear Stress Transport—SST). Liu et al. concluded that the k–ω SST model is optimal for air velocity prediction compared with other models (RKE and RSM), owing to its capacity to better describe swirling flow.
Mathematical simulation has become accepted as a successful approach, thanks to improved processing power and efficiency and the availability of affordable software packages. Mathematical modelling has been utilized
in the storage-life industry for the optimization and development of equipment and operational strategy, and its use has expanded considerably in the recent decade. Erriguible et al. studied water evaporation in porous media by means of CFD tools, using a convective mechanism and solving the Navier–Stokes equations, and obtained comprehensive profiles of pressure, temperature, mass flow, and heat for convection drying models in permeable material.
10.2.1 Governing Equations

The presence of complicated wall geometries and the high air speeds of the axial flow generated by the axial fan produce turbulent conditions around the containers. Based on these flow conditions, Eqs. (10.1), (10.2), and (10.3) are applied to conserve mass, momentum, and scalar quantities in the system:

∂ρ/∂t + ∇ · (ρU) = 0    (10.1)

∂(ρU)/∂t + ∇ · (ρU ⊗ U) = ∇ · (σ − ρ u′ ⊗ u′) + S_M    (10.2)

∂(ρφ)/∂t + ∇ · (ρUφ) = ∇ · (D_φ ∇φ − ρ u′φ′) + S_φ    (10.3)

where U is the mean velocity, u′ the fluctuating velocity, σ the stress tensor, φ a transported scalar with diffusivity D_φ, and S_M and S_φ are source terms.
10.2.2 Scope of Work
India is one of the world's largest producers of fruit and vegetables, but their availability is substantially lower due to post-harvest losses, which represent approximately 25–30% of production. Even though farmers risk growing valuable fruit and vegetables year after year, they stay poor. Setting up cold storage/cold room facilities helps them to remove the danger of distress sales. Fruit and vegetables account for 18–20% of our agricultural produce every year throughout the country.
10.2.3 Objectives

• To check the isotropic convection heat transfer in the cold room.
• To check the thermal contour variation with different phase changing materials.
• To validate the experimental comparison of heat transfer maintained in the cold room.
10.3 Cold Room Modelling
Most fruits and vegetables are at ambient temperature after harvesting. Pre-cooling after harvest removes the field heat rapidly, thus allowing longer storage periods. The ripening agent, treatments for normal fruits, and ripening chamber data are as follows:

Ripening agent: ethylene @ 100 ppm
Temperature range for PCM: 20–24 °C
Relative humidity: 80, 85, 90 and 95%
Capacity of chamber: 4 tons (4000 kg)
Chamber volume: 36.6 m³
Chamber size: 3.6 × 3.0 × 3.6 m (12 × 10 × 12 ft)
Cooling load: 2 TR

Dimensions of fruit crate:
Length = 53 cm = 0.53 m, Width = 30.5 cm = 0.305 m, Height = 28.5 cm = 0.285 m
Weight of crate with no load = 1.68 kg; weight of crate with 50 mangoes = 20.35 kg
Distance from wall to chilling tank = 120 cm; distance between crates = 30 cm (30 × 5 = 150 cm)
Distance from wall to side gap of chilling tank = 67.5 cm
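A quick numerical reading of this specification is sketched below; the only value not taken from the list above is the standard conversion 1 TR ≈ 3.517 kW.

length, width, height = 3.6, 3.0, 3.6           # chamber dimensions, m
volume = length * width * height                # geometric chamber volume, m^3
cooling_load_kw = 2 * 3.517                      # 2 TR expressed in kW
capacity_kg = 4000                               # 4 t of produce
print(f"volume = {volume:.1f} m^3, cooling load = {cooling_load_kw:.1f} kW")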
10.4 Methodology
In particular, we analyse the temperature variations over time during the cold charging and discharging of the phase change materials. We also analyse the physical and chemical properties of the phase change materials listed previously. In the design calculation of the water-freezing cold storage tank, the heat storage characteristics and the application of phase change chemicals will be applied for energy saving. The precise procedure for this experiment is as follows.
10.4.1 Preparation of PCM with Nano
PCM encapsulation is a suitable solution for heat transfer enhancement and prevents the PCM from mixing with the working fluid. The PCM containment used for embedding should have strength, flexibility, resistance to corrosion, and thermal stability; sufficient surface area for thermal transmission and structural stability should also be ensured. Macro-encapsulation, micro-encapsulation, and nano-encapsulation
Fig. 10.1 Scheme for preparation of HY + borax and HyN PCM
are the main types of encapsulation technologies. PCM filled into blocks, pouches, spherical capsules, etc., at macro-scale in metallic or polymer film is called macro-encapsulation (Fig. 10.1). Experimental results show that low refrigerant temperatures and high fluid velocity increase chamber efficiency. Micro-encapsulation and nano-encapsulation refer, respectively, to the filling of PCM into micro- and nano-scale polymer capsules. Allouche et al. performed a performance investigation of micro-encapsulated PCM for air-conditioning (Fig. 10.2).
Fig. 10.2 a The microstructure of PCM after addition of 2% borax, b SEM images of nanoparticles after dispersion, c PCM placement design
The addition of a borax hybrid at low concentration with the ammonium bromide was also explored to improve the input quality of the PCM. The influence of compound nucleating agents (nano-α-borax, Na2[B4O5(OH)4]) on the supercooling degree of the Na2SO4·10H2O–Na2HPO4·12H2O eutectic salt hydrate has been investigated by means of multifactor orthogonal tests. Nano-α-Na2[B4O5(OH)4]·8H2O can act as a good thermally conductive filler and a high-efficiency nucleating agent: the supercooling degree of Na2SO4·10H2O–Na2HPO4·12H2O is lowered from 7.8 to 1.6 °C, and the thermal conductivity is raised by 61.3% with 4.5% nano-α-Na2[B4O5(OH)4]·8H2O and 1.0 wt% borax. Li et al. added g-Na2[B4O5(OH)4]·8H2O to CaCl2·6H2O to maintain or enhance the thermal conductivity and minimize the supercooling of CaCl2·6H2O.
10.4.2 Phase Change Materials
S. No. | PCM | Melting point (°C) | Heat of fusion (kJ/kg)
1 | 40% tetra n-butyl ammonium bromide + 2% borax (H1 + borax) | 9.3 | 114
2 | 45% tetra n-butyl ammonium bromide + 1% borax (H1) | 12.5 | 195.5
3 | RT-21-paraffin (R21) | 21 | 134
4 | Salt hydrate-S21 | 21 | 171
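The heats of fusion in this table translate directly into latent storage capacity. The following sketch is an illustrative Python calculation only; the 10 kg PCM mass is an assumed example value, not a quantity reported in this work.

# Illustrative sketch: latent-heat storage per PCM from the table above (Q = m * h_fus).
pcms = {
    "H1 + 2% borax":    {"melt_C": 9.3,  "fusion_kJ_per_kg": 114.0},
    "H1 + 1% borax":    {"melt_C": 12.5, "fusion_kJ_per_kg": 195.5},
    "RT-21 paraffin":   {"melt_C": 21.0, "fusion_kJ_per_kg": 134.0},
    "Salt hydrate S21": {"melt_C": 21.0, "fusion_kJ_per_kg": 171.0},
}
mass_kg = 10.0  # assumed charge of encapsulated PCM
for name, p in pcms.items():
    stored_kJ = mass_kg * p["fusion_kJ_per_kg"]
    print(f"{name:16s} melts at {p['melt_C']:4.1f} °C and stores ~{stored_kJ:6.1f} kJ of latent heat")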
Experimental work on the cold chamber was carried out in phases with different fruit inputs and maturing treatments. To study and utilize nanoparticle production for various applications, characterization of the nanomaterials is necessary; it also gives the average nanoparticle size and geometric aspect. The effect of adding nanoparticles on the base material is monitored by differential scanning calorimetry (DSC). However, all the PCM balls melted at almost the same time by the end of this experiment. Pure PCM requires the maximum melting time; in these experiments, the melting time is reduced by adding
Fig. 10.3 Monitoring system developed: a reference cold room, b cold room with PCM-added layer (in green); refrigeration unit evaporator (1) and condenser (2)
nanoparticles of Na2[B4O5(OH)4]·8H2O. This observation indicates that the melting rate of the pure PCM increases when Na2[B4O5(OH)4]·8H2O nanoparticles are added (Fig. 10.3). The sensor locations in the reference and PCM cold rooms are listed below.

Sensor | Reference cold room location | PCM cold room location
R1 | Compartment air temperature at 1.8 m | Compartment air temperature at 1.8 m
R2 | Compartment air temperature at 1.5 m | Compartment air temperature at 1.5 m
R3 | Compartment air temperature at 1 m | Compartment air temperature at 1 m
R4 | South external surface temperature | South external surface temperature
R5 | South internal surface temperature | South internal surface temperature between PCM layer and PE foam
R6 | East external surface temperature | South internal surface temperature
R7 | East internal surface temperature | East external surface temperature
R8 | West external surface temperature | East internal surface temperature between PCM layer and PE foam
R9 | West internal surface temperature | East internal surface temperature
R11 | Horizontal external surface temperature | West external surface temperature
R12 | Horizontal internal surface temperature | West internal surface temperature between PCM layer and PE foam
R13 | N/A | West internal surface temperature
R14 | N/A | Horizontal external surface temperature
R15 | N/A | Horizontal internal surface temperature between PCM layer and PE foam
R16 | N/A | Horizontal internal surface temperature
Q1 | Heat flux, south region | Heat flux, south region
Fig. 10.4 Cold chamber PCM input method for practical work
10.4.3 Practical Data Approach with PCM in Chamber
Most significantly, the temperature difference between the external and internal sides of the envelope is used to reduce the maximum incoming heat flow. In addition, the PCM absorbs heat when the outside cover surface is exposed to the maximum heat flow, which occurs over the hours of highest solar radiation. When the temperature declines below the melting point, the latent heat trapped by the PCM can be discharged. This storage and release cycle usually causes a phase shift of the heat flux compared with a typical structure, with a decreased peak cooling load moved from daytime to night-time. Experimental work was carried out for two cases of different fruits in the chamber, which in practice ripen at 20–24 °C. Maintaining the temperature in a periodic way of response by changing the PCM is the primary objective. In total, four types of PCM were selected for the experimental work; two of these PCMs had nanoparticles added to them to improve efficiency. The chamber was observed for 4–6 days under two different fruit ripening conditions without damage to the fruit properties. The PCMs were compared with water to check the optimal feasibility of the experimental work (Fig. 10.4).
10.5 Results and Discussions In the present study, sensors fitted for digital observation were used to estimate the phase transformation at a given point in time and to identify the solid–liquid interface of the PCM inside the chamber. The PCM phase front coordinates must be known in order to build a solid model of the solid–liquid interface. Different system tests were performed to observe the heat transfer improvement obtained by adding nanoparticles. The melting behaviour of pure PCM and nano-mixed PCM was also studied. The trials are categorized as pure PCM, forward and backward, and basic PCM with nanoparticles (Graph 10.1).
Graph 10.1 Temperature changing of phase change materials in input process
Results at the cold chamber input show that the thermophysical properties of the nanocomposites can be improved by the use of nanoparticles, although a decrease in the heat capacity of the nanocomposites was observed with increasing nanoparticle concentration. These composite samples depend on conditions such as the purpose of use, the weight fraction of nanoparticles, the variation of heat capacity, melting points and thermal conductivity. The final input observations show that increasing the nanoparticle concentration gives a convenient regular interval in time to stabilize the inlet temperature and thereby maintain the temperature in the cold chamber (Graph 10.2). From the results obtained at discharge from the chamber, the sustained discharge temperature decreased with the increase of nanoparticle concentration. Thermal stabilization in the chamber was observed with the five different PCMs; in the recorded data, the largest deviation was between the hybrid PCM and the paraffin, with an approximate difference of 13 °C observed at the outlet. The temperature changes in the cold chamber with mean values have been observed and compared in Graph 10.3.
Graph 10.2 Temperature change of phase change materials in the discharging process
Graph 10.3 Temperature changing of cold chamber in discharging process
According to the results in the cold chamber, the aim is to maintain a temperature lower than the ripening temperature so that the procurement time can be increased; examining the PCMs one by one, paraffin RT-21 maintained 28 °C in the chamber even though it shows a lower value at the inlet and carries a good amount of heat at discharge. Salt hydrate and cold water are both in the ripening zone, at a temperature between 20 and 25 °C, so the factor of safety against ripening is not appreciable. The addition of 1% nano-borax to the ammonium bromide gave a better result, but it also lies in the ripening temperature zone of the case study. Finally, temperatures lower than the ripening temperature can extend the ripening time, which is useful for storage below the ripening temperature. The PCM with 2% borax addition gave the lowest temperature in the cold chamber, maintaining 16 °C. By observing the PCM materials, the temperatures are noted as below. Table 10.1 shows the minimum and maximum conditions of all five PCMs and the chamber stabilization mean at every 2 h interval.

Table 10.1 Chamber stabilization at every 2 h interval of time
S. No. | PCM | Inlet T (°C) | Discharge T (°C) | Chamber average T (°C)
1 | (H1 + 2% borax) | 31 to −5 | 12 | 15.8
2 | (HV + 1% borax) | 31 to 2 | 20 | 21.5
3 | RT-21-paraffin | 31 to −12 | 28 | 28
4 | Salt hydrate-S21 | 31 to −13 | 26 | 26
5 | Water | 31 to 3 | 23 | 25
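A compact way to read Table 10.1 against the 20–24 °C ripening window discussed above is sketched below; this is an illustrative Python check written for this discussion, and the classification thresholds simply restate the ripening window from the text.

# Illustrative sketch: screening the Table 10.1 chamber averages against the ripening window.
RIPENING_LOW_C, RIPENING_HIGH_C = 20.0, 24.0
chamber_avg_C = {
    "H1 + 2% borax": 15.8,
    "HV + 1% borax": 21.5,
    "RT-21-paraffin": 28.0,
    "Salt hydrate-S21": 26.0,
    "Water": 25.0,
}
for pcm, t in sorted(chamber_avg_C.items(), key=lambda kv: kv[1]):
    if t < RIPENING_LOW_C:
        verdict = "below ripening zone (delays ripening)"
    elif t <= RIPENING_HIGH_C:
        verdict = "inside ripening zone"
    else:
        verdict = "above ripening zone"
    print(f"{pcm:17s} {t:5.1f} °C -> {verdict}")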
10.6 Conclusions • Synthesis of nanoparticle additions to the regular PCMs is reported and their properties are noted; an experimental case study on ripening delay was observed with the investigated PCMs as the primary objective. • The average temperature in the chamber, with mean temperatures for fully loaded crates, was evaluated using digital temperature sensors; these are also placed at the inlet and discharge to check the thermal stabilization of the cold chamber. • By adding nanoparticles to the ammonium bromide PCM, the cold chamber temperature was observed for 4–6 days; the experiment was carried out for 20 days with five phase change materials to obtain a comparative analysis in the chamber. • A normal ethylene input as ripening agent was used in all case studies to observe the delay factor and procurement after ripening of fruits. More simulation observations are needed for different products to validate the nano combination for agricultural products.
References 1. Etheridge, D., Murphy, K., Reay, D.: A PCM/heat pipe cooling system for reducing air conditioning in buildings: review of options and report on field tests. Build. Serv. Eng. Res. Technol. 27, 27–39 (2006) 2. Rifat, S.B., Omer, S.A., Ma, X.: A novel thermoelectric refrigeration system employing heat pipes and a phase change material: an experimental investigation. Renew. Energy 23, 313–323 (2001) 3. Al-Maghalseh, M., Mahkamov, K.: Methods of heat transfer intensification in PCM thermal storage systems: review paper. Renew. Sustain. Energy Rev. 92, 62–94 (2018) 4. Jaguemont, J., Omar, N., Van den Bossche, P., Mierlo, J.: Phase change materials (PCM) for automotive applications: a review. Appl. Therm. Eng. 132, 308–320 (2018) 5. Huang, X., Zhu, C., Lin, Y., Fang, G.: Thermal properties and applications of microencapsulated PCM for thermal energy storage: a review. Appl. Therm. Eng. 147, 841–855 (2019) 6. Kamel, R.S., Fung, A.S., Dash, P.R.H.: Solar systems and their integration with heat pumps: a review. Energy Build. 87, 395–412 (2015) 7. Pesaran, A., Lee, H., Hwang, Y., Radermacher, R., Chun, H.H.: Review article: numerical simulation of adsorption heat pumps. Energy 100, 310–320 (2016) 8. Byrne, P., Ghoubali, R.: Exergy analysis of heat pumps for simultaneous heating and cooling. Appl. Therm. Eng. 149, 414–424 (2019) 9. Rashidi S, Shamsabadi H, Esfahani JA, Harmand S (2019) A review on potentials of coupling PCM storage modules to heat pipes and heat pumps. J. Therm. Anal. Calorim. https://doi.org/ 10.1007/s10973-019-08930-1 10. Han, X., Wang, X., Zheng, H., Xu, X., Chen, G.: Review of the development of pulsating heat pipe for heat dissipation. Renew. Sustain. Energy Rev. 59, 692–709 (2016) 11. Wu, S., Dai, Y., Li, X., Oppong, F., Xu, C.: A review of ground-source heat pump systems with heat pipes for energy efficiency in buildings. Energy Procedia 152, 413–418 (2018) 12. Shafeian, A., Khiadani, M., Nosrati, A.: A review of latest developments, progress, and applications of heat pipe solar collectors. Renew. Sustain. Energy Rev. 95, 273–304 (2018)
13. Valipour, M.S., Rashidi, S., Masoodi, R.: Magnetohydrodynamics flow and heat transfer around a solid cylinder wrapped with a porous ring. ASME J. Heat Transf. 136, 062601 (2014) 14. Chen, F., Mao, J., Chen, S., Li, C., Hou, P., Liao, L.: Efficiency analysis of utilizing phase change materials as grout for a vertical U-tube heat exchanger coupled ground source heat pump system. Appl. Therm. Eng. 130, 698–709 (2018) 15. Bottarelli, M., Georgiev, A., Aydin, A.A., Su, Y., Yousif, C.: Ground source heat pumps using phase change materials ground-source heat pumps using phase change materials. In: Conference: European Geothermal Congress, Pisa (2013)
Chapter 11
Comparison of H-Based Vertical-Axis Wind Turbine Blades of NACA Series with CFD Mohd Hasham Ali, Syed Nawazish Mehdi, and M. T. Naik
Abstract The use of small wind turbines to generate electricity is becoming increasingly common around the world. This energy source is still understudied compared with medium and large wind turbines. The model presented here has three fixed straight blades; in the future, this model will have motorized blades. In addition, lift and drag coefficients for the applied airfoils were calculated using the same numerical model as for the vertical-axis wind turbine (VAWT). When using a Reynolds number of 2.9 million, the obtained values of the lift and drag force coefficients match XFOIL's and the experiment's predictions throughout a wide range of angles of attack. As a result, this impeller is attractive for further investigation due to its maximum rotor power coefficient of 0.5. According to this research, if this rotor is to be used with fixed blades, the NACA 4418 airfoil should be used instead of the original NACA 0015 and NACA 0018. The angle of attack is incremented through 0°, 5°, 10° and 15°, and the input velocity is varied over 5, 5.5, 6, 6.5, 7, 7.5 and 8 m/s. As a result, the pressure distributions for all input parameters have been observed. ANSYS Fluent was used to analyse the aerodynamic performance of all airfoils.
11.1 Introduction In many European nations, the percentage of renewable energy in electricity generation has grown dramatically [1–6]. It has long been believed that wind energy systems are among the most cost-effective of all the already exploited renewable energy sources, which has led to an increase in wind energy investment over the past decade. There are three primary forms of contemporary VAWT: Savonius [7, 8], Darrieus [9, 10] and H-rotor [11]. The Savonius rotor is a drag-type wind turbine. It can self-start and has a strong torque, but it runs at a low tip speed ratio
M. H. Ali (B) · M. T. Naik Department of Mechanical Engineering, JNTUH, Hyderabad, India
S. N. Mehdi Department of Mechanical Engineering, Lords Institute of Engineering and Technology, Hyderabad, India
(TSR). VAWT stands for vertical-axis wind turbine. One type of Darrieus VAWT is curved-bladed (or egg-shaped), while the other is straight-bladed. Among straight-bladed Darrieus VAWTs, the H-rotor is the most commonly used design; it was given that name on account of its straight blades and straight supporting arms. Lift-type wind turbines can run at high TSRs, but they generally have one problem: the ability to self-start [12].
11.2 Literature Beri and Yao [5] show that a ratio of domain width to rotor diameter of 6 is sufficient for accurately estimating the rotor power coefficient. According to Mohamad et al. (2011), a ratio of computational domain width to Savonius rotor diameter of 10 yields adequately accurate rotor torque results. Rogowski (2018) reached a similar conclusion while analysing the influence of the square computational domain size for a Darrieus rotor. Because of this, the ratio of domain width to rotor diameter in this research was chosen to be 10 as a starting point for the analysis; the rotor's centre of rotation was placed 5D from the inlet [3]. Following Ferreira et al. (2009) and Castelli et al. [7], in this work the distance between the centre of the rotating rotor and the outlet is equal to 25D, significantly larger than in the case of Trivella and Castelli (2014) or Castelli et al. [8]. Moving meshes are required to solve this problem. This is not a simple task and requires a specific technique: the domain area surrounding the revolving rotor must be separated using this technique, often known as the "sliding mesh" approach. The common edge of both areas must be located far enough from the airfoil edges to account for probable numerical errors [4].
11.3 Methodology The blade geometries of the different NACA profile series are shown in Figs. 11.1 and 11.2. The low-velocity profiles on the two geometries should be observed.
Fig. 11.1 0 series NACA blade profile geometry
Fig. 11.2 18 series blade profile geometry
Essentially, three blade profiles were observed at different pressure angles and at different velocities to obtain the optimum blade profile for further investigations using CFD. Blade profiles: NACA series 0015, 0018, 4418; velocity variants (m/s): 5, 5.5, 6, 6.5, 7, 7.5, 8; angle of increment: 0°, 5°, 10°, 15°. As the vertical-axis wind turbine rotates, it creates aerodynamic loads that induce localized retardation of the main flow in the area. As the rotor's azimuth changes, so do the forces that act on it with respect to the wind. "Azimuth zero" in this context denotes that the blade chord is parallel to the primary flow direction, and therefore the blade moves against the wind. To reflect aerodynamic loads, most blades have coefficients that are defined as shown in Figs. 11.3 and 11.4.
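The three sections compared here are standard NACA 4-digit profiles, so their geometry follows the published 4-digit thickness and camber equations. The sketch below is an illustrative Python reconstruction of those coordinates; it is not the authors' geometry or meshing tool, and it adds the thickness normal to the chord as a thin-airfoil simplification.

# Sketch: NACA 4-digit section coordinates for the profiles compared in this study.
import numpy as np

def naca4(code: str, n: int = 100):
    """Return (x, y_upper, y_lower) for a 4-digit NACA section with unit chord."""
    m = int(code[0]) / 100.0      # maximum camber
    p = int(code[1]) / 10.0       # camber position
    t = int(code[2:]) / 100.0     # maximum thickness
    x = 0.5 * (1 - np.cos(np.linspace(0, np.pi, n)))   # cosine spacing
    yt = 5 * t * (0.2969 * np.sqrt(x) - 0.1260 * x - 0.3516 * x**2
                  + 0.2843 * x**3 - 0.1036 * x**4)     # closed trailing edge
    yc = np.zeros_like(x)
    if m > 0:
        front = x < p
        yc[front] = m / p**2 * (2 * p * x[front] - x[front]**2)
        yc[~front] = m / (1 - p)**2 * ((1 - 2 * p) + 2 * p * x[~front] - x[~front]**2)
    return x, yc + yt, yc - yt    # thickness added normal to the chord (simplified)

for profile in ("0015", "0018", "4418"):
    x, yu, yl = naca4(profile)
    print(profile, "max thickness ≈", round(float((yu - yl).max()), 3), "of chord")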
Fig. 11.3 Domain modelling
Fig. 11.4 Cavity domain modelling of NACA 0015, NACA 0018, NACA 4418
Fig. 11.5 Pressure distribution of NACA 0015 at 0° at minimum and maximum velocity conditions of 5 and 8 m/s
Rectangular elements were employed along the boundaries of the airfoils, while triangular elements were used for the unstructured mesh. The global grid settings are the same for all of the numerical models examined in this study. There were 200 equal-length segments on the blades. For example, for TSR = 2 and TSR = 6, the structured mesh has 60 layers with a growth rate of 1.12, and the first layer at the edges of the airfoil is 5.3 × 10⁻⁶ m thick, which results in an average wall y+ of 0.50. For mesh elements that are not structured, the growth rate is 1.04. Figure 11.5 shows the finished grid, which has 234,840 elements and 136,059 nodes in total. The domains are compared in two stages: attack angle versus lift/drag and velocity versus lift/drag.
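A first-layer thickness of a few micrometres for a wall y+ near 0.5 is consistent with the usual flat-plate estimate used when building such meshes. The sketch below illustrates that estimate; the chord length and air properties are assumed example values, not data taken from this chapter.

# Sketch: first-cell height for a target wall y+ from the flat-plate skin-friction estimate.
import math

rho, mu = 1.225, 1.81e-5      # air density [kg/m^3] and dynamic viscosity [Pa s] (assumed)
U = 8.0                        # freestream velocity [m/s], upper end of the tested range
chord = 0.2                    # assumed blade chord [m]
y_plus_target = 0.5

Re = rho * U * chord / mu
cf = 0.026 / Re**(1.0 / 7.0)               # empirical flat-plate skin-friction coefficient
tau_w = 0.5 * cf * rho * U**2              # wall shear stress
u_tau = math.sqrt(tau_w / rho)             # friction velocity
y1 = y_plus_target * mu / (rho * u_tau)    # first-cell height [m]
print(f"Re = {Re:.3e}, first-cell height ≈ {y1:.2e} m")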
11.4 Results and Discussion The analysis was run in ANSYS Workbench to study the optimum condition of the air attack angle on the blade surface. Pressure contours were analysed for all conditions of different velocities with respect to attack angle. Four attack angles, in increments of 5°, were analysed with seven different velocities in increments of 0.5 m/s. Distributed pressure contours are plotted at the minimum and maximum angles and at the minimum and maximum velocities as a representative part of the work. The CL/CD (coefficient of lift/coefficient of drag) values were also noted for better comparison of the profiles. By observing the graphs, not much variation is found at the minimum angle of attack even though the velocity increased, and a slight variation is observed at the middle and end of the blade with increasing velocity (Fig. 11.6). Increasing the attack angle produces more variation at all levels (bottom, middle and top); the top area deflects most when compared with the other two. The applied pressure gradually increases from bottom to top in the 0015 series. Pressure deflection is found more at the fixed end and is normal at the tail end of the profile (Fig. 11.7). More uniform pressure contours are observed for the 0018 series above, compared with the 0015 profile, with maximum contour pressures at the tail part of the blade (Fig. 11.8).
Fig. 11.6 Pressure distribution of NACA 0015 at 15° at minimum and maximum velocity conditions of 5 and 8 m/s
Fig. 11.7 Pressure distribution of NACA 0018 at 0° at minimum and maximum velocity conditions of 5 and 8 m/s
Fig. 11.8 Pressure distribution of NACA 0018 at 15° at minimum and maximum velocity conditions of 5 and 8 m/s
Slight variations are found in the pressure contours of NACA 0018 when compared with 0015; highly twisted pressure zones are not found in the overall profile, and the pressure coefficient is found to be a little more than a value between 0 and 0.2 in the tail area (Fig. 11.9). Pressure contours are found over the entire profile of the twisted 4418, and the pressure increases with the increase of velocity in the middle area of the profile. Drastic differences are found in the case of the twisted blade with the increase of length (Fig. 11.10).
Fig. 11.9 Pressure distribution of NACA 4418 at 0° at minimum and maximum velocity conditions of 5 and 8 m/s
Fig. 11.10 Pressure distribution of NACA 4418 at 15° at minimum and maximum velocity conditions of 5 and 8 m/s
By observing the above contours, it is found that similar pressure contours occur with the increment of velocity but change with the attack angle on the profile. With the noted values of all contours, a comparative analysis is done for all profiles with respect to angle and velocity.
11.4.1 Comparative Analysis of CL/CD
CL/CD is an important parameter for airfoil efficiency; the lift-to-drag ratio needs to be compared in all cases with all parameters (Fig. 11.11). The blade profiles were compared for their responses at low velocity and all attack angles because of the significant variation found between the twisted and flat airfoils: the CL/CD values of the flat profiles start from 0.46 and drop continuously to −7.63, whereas 4418 starts from 5.12 and falls to −3.48; the ratio is almost the same within the 18-series profiles, with a small variation when compared with 0015 (Fig. 11.12). After increasing the velocity by one step, an equal ratio is again found for 4418, as for 0018, while an uneven loop is found in 0015 compared with the 18 series. The values observed for the 00 series run from 0.46 to −8.70, and for 4418 from 6.32 to −3.3. The ratio increases in clear intervals for 4418 when compared with the other two (Fig. 11.13).
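For reference, the CL and CD values compared in this section are obtained by normalizing the lift and drag forces reported by the solver with the dynamic pressure and a reference area. The sketch below illustrates that post-processing step; the density, reference area and force values are assumed example numbers, not results from this study.

# Sketch: forming CL, CD and the CL/CD ratio from solver force output.
def force_coefficient(force_N, rho=1.225, velocity=5.0, ref_area=0.2):
    q = 0.5 * rho * velocity**2          # dynamic pressure [Pa]
    return force_N / (q * ref_area)

lift_N, drag_N = 1.2, 0.25               # hypothetical solver outputs
cl = force_coefficient(lift_N)
cd = force_coefficient(drag_N)
print(f"CL = {cl:.3f}, CD = {cd:.3f}, CL/CD = {cl / cd:.2f}")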
Fig. 11.11 Comparison of airfoils—angle of attack versus CL/CD at 5.0 m/s
Fig. 11.12 Comparison of airfoils—angle of attack versus CL/CD at 5.5 m/s
Fig. 11.13 Comparison of airfoils—angle of attack versus CL/CD at 6.0 m/s
The positive ratio increases drastically in 4418 with the increase of velocity, but equal intervals are found at all attack angles. No such variation in ratio is found for the 00-series profiles between 10° and 15° angles of attack. Observed values for this velocity range from 0.47 to −8.82 for 0015 and 0018, and from 6.8 to −3.23 for 4418 between 0° and 15° (Fig. 11.14).
Fig. 11.14 Comparison of air foils—angle of attack versus CL/CD at 6.5 m/s
Fig. 11.15 Comparison of airfoils—angle of attack versus CL/CD at 7.0 m/s
At a velocity of 6.5 m/s, similar ratios can be observed for all profiles up to 10°; beyond that attack angle there is no such difference in the non-twisted profiles, but at the same time there is a deviation in the twisted profile. The values observed are 0.49 to −8.8 for 0015 and 0018, whereas the twisted 4418 runs from 6.79 to −3.3 (Fig. 11.15). A small increment for 0018 is found compared with the 0015 airfoil at 7 m/s velocity; as usual, no difference in ratio is found for 4418 when compared with the above velocities. Values range from 0.51 to −8.9 for the 00 series and from 6.88 to −3.41 for the twisted profile (Fig. 11.16). Larger differences are found between 0015 and 0018 as the attack angle is increased at a velocity of 7.5 m/s. Regular intervals are found for the 4418 twisted airfoil, with an interval ratio of 2.1, whereas 3.1, 2 and 1.1 are found unevenly for the 00 series. Values lie between 0.51 to −9.09 and 6.99 to −3.42 (Fig. 11.17). In the present work, the responses observed at the maximum velocity input of 8 m/s are considered like the above responses; the deviation of the values does not differ much. The values range between 0.53 to −9.09 for 0015 and 0018 and 7.15 to −3.42 for 4418, respectively. There is not much difference between 10° and 15° attack angle, but a deviation is observed in 4418 with the increment of attack angle. A gradual increment of the expansion ratios between 0° and 15° is observed for the profiles NACA 0015, 0018 and 4418 at the maximum values (Fig. 11.18). As per the graphical interpretation at the primary angle, there is no such difference found in the flat airfoils 0015 and 0018. The values increase with the increase of velocity in the twisted profile. The ultimate difference is found in the NACA 4418 airfoil (Fig. 11.19).
Fig. 11.16 Comparison of airfoils—angle of attack versus CL/CD at 7.5 m/s
Fig. 11.17 Comparison of airfoils—angle of attack versus CL/CD at 8 m/s
Fig. 11.18 Comparison of airfoil velocity versus CL/CD at 0°
Fig. 11.19 Comparison of airfoil velocity versus CL/CD at 5°
Fig. 11.20 Comparison of airfoil velocity versus CL/CD at 10°
Angle inclination with respect to the velocity increment gives lower ratios, but stability is found in the 4418 airfoil. Two different observations are made: the ratios decrease with the increase of velocity for the flat airfoils and increase for the twisted airfoil. Compared to NACA 0015 and 0018, the efficiency increment is greater in the twisted profile NACA 4418 (Fig. 11.20). A stabilized load at an attack angle of 10° is found on the NACA 4418 airfoil profile, apart from which a dramatic decrement of the ratio is found in the flat blades 0015 and 0018. Similarities are found in both flat profiles, while the twisted 4418 gives a good result of equal ratios at all velocities (Graph 11.1). A promising ratio for all stages of velocity variation is found at the attack angle of 15° for the airfoil profile NACA 4418, and no exceptional change is found in its CL/CD ratio. A drastic decrement with a consistent variation with velocity is found for the airfoil profiles NACA 0015 and 0018.
11.5 Conclusions Blades of the 4-digit NACA series were compared, as part of this research, with twisted and flat profiled airfoils in a CFD environment. Important observations were noted over many permutations as part of the work. As per the observations, the efficiency of the airfoil increases with stabilized CL/CD ratios, which were found to be good for the twisted airfoil NACA 4418. In the comparison between the flat blade and the twisted blade, a notable difference is found: the twisted aerofoil gives a good result at the maximum angle of attack, i.e.
Graph 11.1 Comparison of airfoil velocity versus CL/CD at 15°
15°. The increment of air velocity shows an effective variation of the CL/CD ratio in the flat blade, which is inversely proportional to the attack angle. For the twisted blade, the results with the increase of air velocity and attack angle are directly proportional to CL/CD. In conclusion, it is recommended that twisted profiles with attack angles incremented between 10° and 15° give good results. More observations with a practical approach are needed to validate the twist angle for better enhancement.
References 1. Anderson, J.W., Brulle, R.V., Bircheld, E.B., Duwe, W.D.: McDonnell “40-kW Giromill wind system”: phase I—design and analysis. Volume II. Technical report. McDonnell Aircraft Company (1979) 2. Apelfröjd, S., Eriksson, S., Bernhoff, H.: A review of research on large scale modern vertical axis wind turbines at Uppsala University. Energies 9 (2016). https://doi.org/10.3390/en9070570 3. Balduzzi, F., Bianchini, A., Maleci, R., Ferrara, G., Ferrari, L.: Critical issues in the CFD simulation of Darrieus wind turbines. Renew. Energy 85, 419–435 (2016). https://doi.org/10. 1016/j.renene.2015.06.048 4. Bangga, G., Hutomo, G., Wiranegara, R., Sasongko, H.: Numerical study on a single bladed vertical axis wind turbine under dynamic stall. J. Mech. Sci. Technol. 31, 261–267 (2017). https://doi.org/10.1007/s12206-016-1228-9 5. Beri, H., Yao, Y.: Effect of camber airfoil on self starting of vertical axis wind turbine. J. Environ. Sci. Technol. 4, 302–312 (2011). https://doi.org/10.3923/jest.2011.302.312 6. Bianchini, A., Balduzzi, F., Rainbird, J.M., Peiro, J., Graham, J.M.R., Ferrara, G., Ferrari, L.: An experimental and numerical assessment of airfoil polars for use in Darrieus wind turbines— part I: flow curvature effects. J. Eng. Gas Turb. Power 138, 032602-1–10 (2015). https://doi. org/10.1115/GT2015-42284
7. Castelli, M.R., Ardizzon, G., Battisti, L., Benini, E., Pavesi, G.: Modeling strategy and numerical validation for Darrieus vertical axis micro-wind turbine. In: Proceedings of the ASME 2010 International Mechanical Engineering Congress & Exposition, IMECE2010-39548, Canada (2010) 8. Castelli, M.R., Englaro, A., Benini, E.: The Darrieus wind turbine: proposal for a new performance prediction model based on CFD. Energy 36, 4919–4934 (2011). https://doi.org/10.1016/ j.energy.2011.05.036 9. Elsakka, M.M., Ingham, D.D., Ma, L., Pourkashanian, M.: CFD analysis of the angle of attack for a vertical axis wind turbine blade. Energy Convers. Manag. 182, 154–165 (2019). https:// doi.org/10.1016/j.enconman.2018.12.054 10. Hoerner, S., Abbaszadeh, S., Maître, T., Cleynen, O., Thévenin, D.: Characteristics of the fluid–structure interaction within Darrieus water turbines with highly flexible blades. J. Fluids Struct. 88, 13–30 (2019). https://doi.org/10.1016/j.jfluidstructs.2019.04.011 11. Ivanov, T.D., Simonovi´c, A.M., Svorcan, J.S., Pekovi´c, O.M.: VAWT optimization using genetic algorithm and CST airfoil parameterization. FME Trans. 45, 26–32 (2017). https://doi.org/10. 5937/fmet1701026I 12. Lichota, P., Agudelo, D.: A priori model inclusion in the multisine maneuver design. In: 17th IEE Carpathian Control Conference, Tatranska Lomnica, Slovakia (2016). https://doi.org/10. 1109/CarpathianCC.2016.7501138
Chapter 12
Design and Synthesis of Random Number Generator Using LFSR K. Rajkumar, P. Anuradha, Rajeshwarrao Arabelli, and J. Vasavi
Abstract Nowadays, every integrated circuit needs a high level of security during data transmission between various blocks and while storing data into memories. For this purpose, built-in self-test based random number systems play the key role in the conventional way. The random numbers are generated by using pseudo-random patterns. However, they consume higher energy with less random patterns. Thus, to overcome this drawback, it is better to generate the random numbers by using true random number generator (TRNG) frameworks. In the existing literature, many works are focused on the implementation of TRNGs using CMOS technology. But they consume a large number of transistors, so the area and power consumption increase. The LFSR-TRNG functions based on switchable ring oscillators (SCRO) with beat frequency detection. The simulations are implemented using the Xilinx ISE package, and the qualitative analysis of area and power shows that the LFSR-TRNG technique outperforms the state-of-the-art approaches.
12.1 Introduction A random number generator is meant to generate a sequence of numbers without any specific pattern. In the current era, VLSI-based RNG applications [1] are increasing rapidly day by day in human life thanks to their wide use in different kinds of domains such as energy metering systems,
K. Rajkumar · P. Anuradha Department of Electronics and Communication Engineering, SR University, Warangal, Telangana, India
R. Arabelli Center for Embedded Systems and IoT, SR University, Warangal, Telangana, India
J. Vasavi (B) Department of Electronics and Communication Engineering, SR Engineering College, Warangal, Telangana, India e-mail: [email protected]
medical applications, commercial applications, industrial machinery monitoring, and SCADA applications. A secure system needs random numbers at numerous stages [2], such as random key generation, initialization vectors, and random nonces. Randomness in the numbers generated by a random number generator is crucial to ensure privacy, anonymity, and unpredictability. The security of most cryptographic algorithms using a random number generator is based on the assumption that it is not possible for an unauthorized user to predict the random sequence. True random number generators and pseudo-random number generators are the two basic types of random number generators. A TRNG [2] uses some natural phenomenon, whereas a pseudo-random number generator uses a function, an initial state and a seed to generate a long random sequence of numbers. Therefore, the cryptographic strength of a pseudo-random number generator depends on the secrecy of the initial state and the seed. An identical initial state and seed cause a pseudo-random generator to produce an identical sequence. In order to improve security, the random number generator should be designed in such a way that it is not possible to predict the initial state and the seed of the generated sequence. Data communication is the most important aspect of our way of life. Cryptography-based encryption is the technique used for providing secure communication of data between one end and the other. These cryptographic algorithms are essentially used for military and business purposes. The secret data messages or calls of higher government officers and politicians may be trapped by intermediate hackers to carry out malpractices against government activities, so these secret data messages ought to be made safer. Data is continuous streaming data consisting of a large number of bits with both negative and positive bit values. A data signal is defined as a special kind of message signal; however, it can have different properties than a typical message signal. A data signal is narrowband, whereas a message is a broader-band signal. A data signal is represented in two forms, analog and digital. In order to provide secure communication between one end and the other, encryption of the data messages is the only way to secure the communication from tapping. To overcome these drawbacks, the LFSR-based technique is implemented as follows:
• All elementary gates and fundamental building blocks, such as SCRO, D-FF and CB4, are implemented using LFSR technology.
• Then, by utilizing the concept of SCRO with beat frequency detection, the LFSR-based TRNG is generated with high-probability random numbers.
The rest of the paper is organized as follows: Sect. 12.2 deals with the detailed analysis of the various state-of-the-art approaches and their drawbacks. Section 12.3 deals with the implementation details of the LFSR technique with its detailed operation. Section 12.4 deals with the simulations of the LFSR-based TRNG technique and comparison with the state-of-the-art approaches. Finally, Sect. 12.5 deals with the conclusion and potential future studies.
12.2 Literature Survey Several works have discussed Viterbi decoder [3] realization in a VLSI environment. In [4], the authors discussed a VLSI method for area analysis of a hard/soft-decision-based decoder. They describe further calculations that can be measured using a new ACS unit. In [5], the authors proposed a low-area decoder with the transpose algorithm for the trellis approach and trellis modulation. A low-power method is GDI, since a decoder has high power dissipation in trellis methods [6]. In addition to power dissipation, the authors have studied speed and delay evaluation in [7] and hardware efficiency and area utilization in [8]. All these works are focused on developing a TRNG communication system by using conventional CMOS technology-based basic gates; as the trellis method is the major approach for decoding, reversible logic is preferable for reducing the number of paths and the path delays, as it offers low quantum cost, low area, less power consumption, and fewer delays compared to other approaches. In [9], the authors give a new design method for a TRNG decoder with no input carry and one ancillary input bit. The authors have examined new QR carry adder designs with no ancillary input bit that give improved delay. The reversible decoder in the existing literature is evaluated by garbage outputs, total reversible logic used, quantum cost, and delay. In [10], the authors described concepts of the convolutional decoder and designed and implemented a cost-efficient, fault-tolerant, reversible decoder. In this work, more garbage outputs were compensated with fewer operations. The authors concluded that the decoder performs all the logical operations better than existing methods, but not the arithmetic operations. In [11], the authors addressed the concept that a TRNG function can be reversible if every vector produces an equal number of outputs. In this work, a high-speed turbo decoder design is presented by making use of control signals and RNG multiplier units. With this design, the authors found that the GDI design is more effective when garbage outputs and constant inputs are considered.
12.3 Proposed Method 12.3.1 Linear Feedback Shift Register (LFSR) A linear feedback shift register is a chain of flip-flops connected in sequence, where the output of the preceding flip-flop is the input of the current flip-flop. It is built by combining XOR gates in the feedback of the flip-flop chain. The LFSR's initial value is known as the seed value, a mixture of 1s and 0s. The seed value should be selected so that it induces low power dissipation, as the seed value determines the subsequent random values. Because of its finite number of possible states, the LFSR eventually enters a repetitive cycle. An LFSR is a kind of shift register that shifts the stored values from left to right. The generation of random numbers [12] is due to the XOR feedback inputs of the LFSR circuit (Fig. 12.1).
Fig. 12.1 XOR-based LFSR architecture
The LFSR passes through the maximum number of states only if the taps are selected properly. The seed value initializes the LFSR. For n stages, 2ⁿ − 1 states are possible. Depending on the current state, the sequence of values is produced by the register. The LFSR produces a random sequence of bits. Whenever it is clocked, the signal is forwarded toward the next MSB bit through the register. There is a feedback input from predefined register stages to the left through the XOR gate. In an XOR-based LFSR, the all-zero state is not reachable. LFSR counters are very efficient because they do not produce carry signals.
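A compact software model helps make this behaviour concrete. The sketch below is an illustrative Python model of a Fibonacci-style XOR-feedback LFSR; the 8-bit width and the tap set (a known maximal-length polynomial, x^8 + x^6 + x^5 + x^4 + 1) are example choices and not necessarily the register used in this design.

# Behavioural sketch of an XOR-feedback LFSR (software model, not the synthesized RTL).
def lfsr_bits(seed: int, taps=(8, 6, 5, 4), width: int = 8):
    state = seed & ((1 << width) - 1)
    assert state != 0, "an all-zero seed locks an XOR-based LFSR"
    while True:
        fb = 0
        for t in taps:                    # XOR of the tapped stages
            fb ^= (state >> (t - 1)) & 1
        state = ((state << 1) | fb) & ((1 << width) - 1)   # shift and insert feedback
        yield state & 1                   # emit the newly generated bit

gen = lfsr_bits(seed=0b1010_1100)
print("first 16 pseudo-random bits:", [next(gen) for _ in range(16)])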
12.3.2 LFSR-TRNG Frameworks The detailed architecture of the LFSR-TRNG is presented in Fig. 12.2. Each and every block in Fig. 12.2 was implemented by exploiting the properties of the LFSR, and especially the CB4 is designed using the LFSR approach. It means
Fig. 12.2 Proposed TRNG block diagram
SCRO, 2-to-1 multiplexer, phase detector, reset-set (RS) latch, and 4-bit binary counter (CB4) were developed using the LFSR gates. The detailed procedure of the LFSR-TRNG method is as follows:
Step 1: By initializing the channel enable (CH ENA) to active low, the circuit starts functioning. The CH ENA input and the phase detector output are applied as inputs to the control enable block. Here, if any one of the signals is triggered to zero level, then the output becomes logic high based on the functionality of the LFSR-NAND gate.
Step 2: The control enable output is applied as the selection input to the two individual multiplexers MUX1 and MUX2, respectively. Here, the multiplexers are used to select the optimal path of the SCROs.
• If selection = 0, then MUX2 selects Path I1 (the delayed output from SCRO2); otherwise, it directly selects the SCRO2 output.
• If selection = 1, then MUX1 selects Path I1 (the delayed output from SCRO1); otherwise, it directly selects the SCRO1 output.
The control enable is also applied as one of the inputs to the XOR gate of the LFSR-SCRO.
Step 3: The major functionality of the TRNG depends on the beat frequency of the SCROs. Here, the SCRO is a digital oscillator, which is used to generate the various frequencies.
• The input clock and control enable signals (fg1, fg2) are applied as the major inputs to SCRO1 and SCRO2.
• The output of each SCRO is applied as input to the SCRO again through the MUX in positive feedback; thus, there are fewer errors in the oscillations and the output frequencies are generated accurately.
• The frequency of SCRO1 is not the same as the frequency of SCRO2. If both frequencies were the same, there would be a lower probability of generating random sequences. For this purpose, a 3-input XOR gate is used in the SCRO1 block, whereas a 2-input XOR gate is used in the SCRO2 block.
Step 4: In order to calculate the phase difference between the two frequencies, the outputs of both SCROs are applied as inputs to the D flip-flop. Here, the D-FF is used as the phase detector and calculates the phase difference between the two signals.
• The clock signal is applied as the clear (CLR) input to the D-FF; thus, on every positive clock edge the data stored in the D-FF is cleared.
• The SCRO1 output is applied as the data input, and the SCRO2 output is applied as the clock input to the D-FF. Thus, with respect to the SCRO2 clock triggering, the SCRO1 data is monitored and the result is output as the phase detected output.
• This phase detected output is applied as input to the control enable block as mentioned in Step 1.
• By this feedback mechanism, the SCRO1 and SCRO2 clock output frequencies are adjusted without any phase and frequency mismatches.
Step 5: The controlled SCRO1 and SCRO2 clock output frequencies are applied as inputs to the set-reset (RS) latch. It triggers the output and generates the final enable signal for the CB4.
Step 6: The RS latch enable signal is applied as input to the CB4; as the CB4 is a counter, it generates the output sequences randomly. The randomization depends mainly on the triggering of the SR latch output set and reset conditions.
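The beat-frequency idea behind Steps 1–6 can be illustrated with a simple software model: the two SCROs have slightly different nominal periods, timing jitter perturbs the phase slip that accumulates every cycle, and the counter value captured when the phase detector fires gives the random output. The sketch below is a conceptual Python model with assumed period and jitter values; it is not the synthesized RTL.

# Conceptual model of beat-frequency TRNG sampling (assumed example periods and jitter).
import random

def trng_sample(t1_ns=1.00, t2_ns=1.03, jitter_ns=0.01, counter_bits=4):
    """Count cycles until the phase slip between the SCROs reaches one full period."""
    slip = 0.0
    count = 0
    while slip < t1_ns:                                         # beat event flagged by the D-FF
        slip += (t2_ns - t1_ns) + random.gauss(0.0, jitter_ns)  # per-cycle drift + jitter
        count += 1
    return count & ((1 << counter_bits) - 1)                    # CB4 keeps only the 4 LSBs

random.seed()                                                   # seed from OS entropy for the demo
print("random nibbles:", [trng_sample() for _ in range(8)])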
12.4 Simulation Results All the LFSR-TRNG designs have been designed using Hspice software; this software tool provides the two categories of outputs named as simulation and synthesis. The simulation results give the detailed analysis of LFSR-TRNG design with respect to inputs, output byte level combinations. Through simulation analysis of accuracy of the encoding, decoding process is estimated easily by applying the different combination inputs and by monitoring various outputs. Through the synthesis results, the utilization of area with respect to the transistor count will be achieved. And also, time summary with respect to various path delays will be obtained and power summary generated using the static and dynamic power consumed. Figure 12.3 indicates the simulation outcome of proposed method. Here, clock (clk), enable (en), reset, drp, and address (add) are the input signal. For each address, a new random number is generated in non-deterministic manner and resulted in the output signal out. So, the proposed method gives the effective outputs as it utilizes the LFSR. Figures 12.4 and 12.5 indicate the total power utilized by the proposed method. The total power consumed by the proposed method is 14.14 mw.
Fig. 12.3 Simulation waveform
Fig. 12.4 Power report
Fig. 12.5 On-chip power summary
Figure 12.6 indicates the total time consumed by the proposed method. The total path delay in the proposed method is 4.55 ns. Figure 12.7 indicates the total area consumed by the proposed method. The total area consumed in terms of slice registers is 40, and only 36 look-up tables are utilized by the proposed method. The combination of slice registers and LUTs gives the LUT-FF pairs, and only 30 LUT-FFs are used. From Table 12.1, it is clearly seen that the proposed methodology provides better performance in terms of area, power, and delay as compared to the existing methodology.
Fig. 12.6 On-chip delay summary
Fig. 12.7 Device utilization summary
Table 12.1 Comparison table
Parameter | Existing | Proposed
Slice registers | 125 | 40
LUTs | 149 | 36
LUT-FFs | 173 | 30
Time consumed | 23.72 ns | 4.55 ns
Power consumed | 36.47 mW | 14.14 mW
12.5 Conclusion This work mainly focused on the implementation of a TRNG using LFSR-based technology. For implementing the TRNG, an SCRO-based mechanism has been customized with beat frequency detection concepts. Thus, the probability of occurrence of random numbers is increased, and the oscillation frequency is also improved with reduced error rates. Finally, the output of the TRNG was applied as input to the encoder-decoder based mechanism for secure transmission of random numbers over various communication channels. The simulation results using the Xilinx ISE software show that the LFSR-based TRNG technology provides better outcomes compared to the state-of-the-art approaches. The work can be extended to implement real-time secured protocols, such as public key cryptography (e.g., RSA, ECC, and HECC) mechanisms, with the LFSR-TRNG outputs applied as both the public and private keys.
References 1. Jiang, H., et al.: A novel true random number generator based on a stochastic diffusive memristor. Nat. Commun. 8(1), 1–9 (2017) 2. Tuna, M., et al.: Hyperjerk multiscroll oscillators with megastability: analysis, FPGA implementation and a novel ANN-ring-based true random number generator. AEU-Int. J. Electron. Commun. 112, 152941 (2019) 3. Karakaya, B., Çelik, V., Gülten, A.: Chaotic cellular neural network-based true random number generator. Int. J. Circ. Theor. Appl. 45(11), 1885–1897 (2017)
4. Brown, J., et al.: A low-power and high-speed true random number generator using generated RTN. In: 2018 IEEE Symposium on VLSI Technology. IEEE (2018) 5. Satpathy, S.K., et al.: An all-digital unified physically unclonable function and true random number generator featuring self-calibrating hierarchical von Neumann extraction in 14-nm tri-gate CMOS. IEEE J. Solid-State Circ. 54(4), 1074–1085 (2019) 6. Lee, H., et al.: Design of high-throughput and low-power true random number generator utilizing perpendicularly magnetized voltage-controlled magnetic tunnel junction. AIP Adv. 7(5), 055934 (2017) 7. Jerry, M., et al.: Stochastic insulator-to-metal phase transition-based true random number generator. IEEE Electron Device Lett. 39(1), 139–142 (2017) 8. Kaya, T.: A true random number generator based on a Chua and RO-PUF: design, implementation and statistical analysis. Analog Integr. Circ. Sig. Process 102(2), 415–426 (2020) 9. Kim, E., Lee, M., Kim, J.-J.: 8 Mb/s 28 Mb/mJ robust true-random-number generator in 65nm CMOS based on differential ring oscillator with feedback resistors. In: 2017 IEEE International Solid-State Circuits Conference (ISSCC). IEEE (2017) 10. Wieczorek, P.Z., Gołofit, K.: True random number generator based on flip-flop resolve time instability boosted by random chaotic source. IEEE Trans. Circ. Syst. I: Regul. Pap. 65(4), 1279–1292 (2017) 11. Arslan Tuncer, S., Kaya, T.: True random number generation from bioelectrical and physical signals. Comput. Math. Methods Med. 2018, 1–11 (2018) 12. Koyuncu, I., Özcerit, A.T.: The design and realization of a new high speed FPGA-based chaotic true random number generator. Comput. Electr. Eng. 58, 203–214 (2017)
Chapter 13
QCA-Based Error Detection Circuit for Nanocommunication Network P. Anuradha, K. Rajkumar, Rajeshwar Rao Arabelli, and R. Shareena
Abstract Data transmission and reception without errors are essential in real-time communication systems. However, data is affected by many errors due to the noise present in the channel. Various conventional approaches have been developed to remove this noise and error, but they fail to provide maximum error correction capability with fewer hardware resources and consume higher area, power, and delay. Thus, quantum dot cellular automata (QCA) technology has been considered for the implementation of the nanocommunication network. Initially, this article deals with the implementation of a QCA-based three-bit even parity generator and four-bit even parity checker. Further, a new QCA-based convolutional encoder and decoder is developed by using the proposed parity generator and checker. Thus, more errors are detected and corrected by using the proposed nanocommunication system. The system is implemented using Verilog-HDL code in the Xilinx ISE 14.2 environment, and the simulation and synthesis results show that the proposed method consumes less area, power, and delay as compared to the conventional approaches.
13.1 Introduction The research starts with the concept of QCA [1]-based reversibility, where, at the end of the computation and through the inverse of the transition function, the system can perform reverse computation. The majority circuit can create a unique output from every input, i.e., there is a one-to-one correspondence between input and output vectors [2]. Thus, in a MAJ,
P. Anuradha · K. Rajkumar Department of Electronics and Communication Engineering, SR University, Warangal, Telangana, India
R. R. Arabelli Center for Embedded Systems and IoT, SR University, Warangal, Telangana, India
R. Shareena (B) Department of Electronics and Communication Engineering, S. R. Engineering College, Warangal, Telangana, India e-mail: [email protected]
the outputs will be equal to the inputs. Basically, for a reversible circuit (RC), conventional NOT gates are used. For designing an RC with the assistance of reversible gates (RG), some points are to be considered [3]: in QCA, fan-out is not permitted; in QCA, loops are not permitted; and in QCA one more element is considered, which is more important than the total number of gates utilized, namely the garbage outputs. The unutilized outputs from an RC/RG are called "garbage" [4]. Reversible logic imposes many design constraints that should be either ensured or optimized when realizing a specific Boolean function. In a QCA circuit, the numbers of inputs and outputs must be equal. Every input pattern [5] must have a unique output pattern. Each output is used just once, i.e., fan-out is not permitted. The QCA computation must be acyclic and carried out in a majority manner, which is possible only if the system comprises few quantum cells. A circuit is a "majority" circuit when the input vector [6] is uniquely recovered from the output vector, with a one-to-one correspondence between the input and output units. QCA is a promising design paradigm that offers an efficient way to design computers with low power dissipation, and it can improve the standard of computing systems. Thus, the major contributions of this paper are as follows:
• This article deals with the implementation of a QCA-based parity generator and parity checker using majority logic gates.
• Further, a new QCA-based convolutional encoder and decoder is developed by using the proposed parity generator and checker.
• Thus, a much more efficient nanocommunication system is implemented with high error correction rates and reduced hardware resource utilization.
The rest of the paper is organized as follows: Sect. 13.2 deals with the properties of QCA and majority logic gates. Section 13.3 deals with the implementation details of the proposed method. Section 13.4 deals with the simulation and synthesis results and comparison with state-of-the-art approaches. Finally, Sect. 13.5 concludes the paper with possible future enhancements.
13.2 Majority Logic Gates Majority logic gates are developed with the pass transistor technology using quantum dot cellular automata. The majority logic gates are preferable to design the chips because they exhibit the following properties.
13.2.1 Properties of Majority Logic Gates
Property 1: bidirectional property. Using this property in a chip-level implementation, both input and output pins are interchangeable. Thus, by the use of the bidirectional property, the path delay will be
Fig. 13.1 Majority logic gate, a QCA layout, b Symbol, c Truth table
reduced, as inputs and outputs can be interchanged and logic optimization is also achieved.
Property 2: fan-in/fan-out capacity. Every majority logic gate supports the fan-in and fan-out property because the number of input pins and the number of output pins need not be equal. So, the load on the chip is reduced effectively even if the numbers of inputs and outputs are mismatched.
Property 3: number of operations. Majority logic gates support N logical operations based on the number of input-output pins. For example, the Feynman gate acts as both a buffer and an XOR operation. Basic gates, in contrast, are dedicated to only one operation.
Property 4: quantum cost. The quantum cost required for the majority logic gate is very low compared to basic gates.
Figure 13.1 presents the QCA layout, symbol, and truth table of the majority logic gate, respectively. If any one of the inputs of the majority logic gate is zero, then it acts as an AND operation. If any one of the inputs of the majority logic gate is one, then it acts as an OR operation.
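The sketch below is a small Python illustration of these AND/OR reductions and of an XOR composed from them with an inverter; it models the Boolean behaviour only, not the QCA cell layout of Fig. 13.1.

# Sketch: three-input majority function and the gates derived from it.
def maj(a: int, b: int, c: int) -> int:
    return (a & b) | (b & c) | (a & c)

def qca_and(a, b):  return maj(a, b, 0)   # one input fixed at logic 0
def qca_or(a, b):   return maj(a, b, 1)   # one input fixed at logic 1

def qca_xor(a, b):
    # XOR built from majority-based AND/OR plus inversion (a QCA inverter)
    return qca_or(qca_and(a, 1 - b), qca_and(1 - a, b))

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "AND:", qca_and(a, b), "OR:", qca_or(a, b), "XOR:", qca_xor(a, b))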
13.3 Proposed Method This section deals with the implementation of QCA-based parity generator and parity checker. Further, a new QCA-based convolutional encoder and decoder is developed by using the proposed parity generator and checker. Thus, the much more efficient nanocommunication system will be implemented with high error correction rates with reduced hardware resource utilization.
13.3.1 Parity Generator and Checker
Figure 13.2 represents the nanocommunication network developed by using a three-bit even parity generator and a four-bit even parity checker. The detailed operation of this approach is as follows:
Step 1: Consider the input bits A, B, and C and apply them to the parity generator. Here, the majority logic gates are arranged to form the XOR gate. As mentioned in Sect. 13.2, AND and OR gates are formed using majority logic gates and combined to develop the XOR operation, as highlighted in the green boxes. Initially, a bit-wise QCA-XOR is performed between A and B; the outcome is then QCA-XORed with input C, which results in the parity bit Pb as the final output.
Pb = A ⊕ B ⊕ C    (13.1)
Step 2: The parity bit is transmitted into the channel. In the channel, various types of noise are added to the encoded data.
Step 3: The parity bit is received and applied as input to the parity checker. Here, the receiver consists of three XOR gates. Initially, individual bit-wise XOR operations are performed between the A, B and C, Pb input combinations, and the outcomes are X1 and X2, respectively. Finally, a bit-wise XOR operation is performed between X1 and X2, and the outcome is the parity check bit PC.
Step 4: If the received parity check bit is zero, it means no error has occurred in the communication system. If the received parity check bit is one, then an error has occurred.
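The following sketch is a behavioural Python model of Steps 1–4 (software only, not the QCA layout of Fig. 13.2); the plain xor helper stands in for the majority-gate XOR.

# Behavioural model: 3-bit even-parity generation and 4-bit parity checking.
def xor(*bits):
    out = 0
    for b in bits:
        out ^= b
    return out

def parity_generate(a, b, c):
    return xor(a, b, c)                  # Pb = A xor B xor C, Eq. (13.1)

def parity_check(a, b, c, pb):
    return xor(xor(a, b), xor(c, pb))    # PC = 0 means no error detected

a, b, c = 1, 0, 1
pb = parity_generate(a, b, c)
print("no error :", parity_check(a, b, c, pb))        # -> 0
print("bit flip :", parity_check(a ^ 1, b, c, pb))    # -> 1, error detected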
Fig. 13.2 Nanocommunication network using parity generator and checker
Fig. 13.3 Nanocommunication system using convolution codes
13.3.2 Nanocommunication System Using Convolution Codes
The above parity generation and parity checking operations are capable only of detecting whether an error is present or not. Thus, this section deals with the implementation of a nanocommunication system using convolution codes built from the above parity generator and checker. Figure 13.3 presents the block diagram of the convolution-based nanocommunication system.
Convolution encoder: Generally, real-time communication systems are used to decode the convolutionally encoded operands. Here, D0, D1, D2, and D3 are the input data sources, and the corresponding outputs are Out0 to Out6. For generating the outputs, the convolution encoder utilizes the generator matrix G, which consists of an identity matrix and parity symbols. The parity symbols P1, P2, P3 are user-defined, and the connections between majority gates are made accordingly. The output encoded frame format is
out = [P1, P2, P3, D0, D1, D2, D3]    (13.3)
P1 = QCA_XOR(D0, D1, D2)    (13.4)
P2 = QCA_XOR(D0, D1, D3)    (13.5)
P3 = QCA_XOR(D0, D2, D3)    (13.6)
Channel: After successful completion of the encoding operation, the encoded codewords are transmitted into the channel. Generally, the channel contains Gaussian noise, random noise, and AWGN. The encoded codewords are affected by this noise; thus, errors are added. Therefore, the major task of the decoder is to remove the errors from the encoded data rather than merely decoding it.
Convolution decoder: Convolution decoders are made up of three important segments: syndrome computation (SC), error analysis block (EAB), and error correction unit (ECU). The SC creates the path to the data levels matching the paired encoded trellis operands; thus, the syndrome of those particular inputs is calculated and applied to the EAB
Each bit position is compared with the parity check matrix, and the compared results are stored in the register unit R; the stored values are then processed further for error-location detection using a feedback mechanism. The error locations are identified using the branch units generated in the SC block, and after the error locations are found, the ECU is used to correct the affected codewords: the path metric of the identified error path is recalculated and the error corrected, so that the final decoded data is generated.
Syndrome check: The SC block of the convolution decoder checks the error status. It monitors every codeword; if an error is present, it identifies the type of noise added and alerts the EAB block, and if there is no error, the SC block simply decodes the codewords using the branch metrics. These branch metrics are formed from multiple combinations of the parity symbols, ordered from low to high probability. The syndrome is calculated using the parity checking operation shown in Fig. 13.2: if the syndrome is zero, no error is present in the received data; if the syndrome is nonzero, errors are present.
Error analysis block: After the error and noise type are identified in the SC unit, the error must be located within the codeword. This error-location identification is done by the EAB unit. Conventionally, the EAB uses a trellis methodology to identify the error location, but since majority logic is used here, an approximation over the SC syndromes is implemented instead. The coefficients are compared by taking the minimum of each pair of SC metrics, and the compared coefficients are then applied to reversible QCA-XOR gates for approximate addition.
Error correction unit: The final error-location identification and error correction are done in the ECU block, where every path is monitored against the approximation coefficients generated in the EAB block. The ECU unit is reconfigurable because the error pattern changes for different types of noise; according to the noise, the ECU reconfigures itself.
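One way to make the syndrome-check and correction steps concrete in software is shown below for the frame format of Eqs. (13.3)–(13.6); this is a behavioural sketch only, and the trellis paths, the EAB approximation, and the majority-gate hardware are abstracted away.

```python
# Behavioural sketch of syndrome checking and single-bit correction for the
# frame [P1, P2, P3, D0, D1, D2, D3] defined earlier.
def syndrome(frame):
    p1, p2, p3, d0, d1, d2, d3 = frame
    return (p1 ^ d0 ^ d1 ^ d2,   # recomputed vs. received P1
            p2 ^ d0 ^ d1 ^ d3,   # recomputed vs. received P2
            p3 ^ d0 ^ d2 ^ d3)   # recomputed vs. received P3

# every single-bit error position gives a distinct nonzero syndrome
SYNDROME_TO_POS = {(1, 0, 0): 0, (0, 1, 0): 1, (0, 0, 1): 2,
                   (1, 1, 1): 3, (1, 1, 0): 4, (1, 0, 1): 5, (0, 1, 1): 6}

def correct(frame):
    s = syndrome(frame)
    if s == (0, 0, 0):
        return list(frame)              # no error detected
    fixed = list(frame)
    fixed[SYNDROME_TO_POS[s]] ^= 1      # flip the located bit
    return fixed
```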
13.4 Simulation Results
All the proposed designs were programmed and designed using the Xilinx ISE software; this tool provides two categories of output, namely simulation and synthesis. The simulation results give a detailed analysis of the proposed design with respect to its input and output bit-level combinations: by applying different input combinations and monitoring the various outputs, the functional accuracy of the design is estimated easily. The synthesis results report the area utilization with respect to programmable logic blocks (PLBs) and look-up tables (LUTs), along with a timing summary
with respect to the various path delays and a power summary generated from the static and dynamic power consumed. Figure 13.4 shows the operation of the parity generator; it has A, B, and C as its inputs and out as its output. The even parity is calculated over A, B, and C, and the result is stored in out. Figure 13.5 shows the operation of the parity checker; it takes the A, B, and C pins as its inputs and produces out as its output. The even parity check is calculated over these pins, and the result is stored in out (Fig. 13.6). Here, data is the original input data, ES is the manual error-syndrome input, and enc is the final encoded operand; YC is the decoded error-free output data, so it is the same as the input data (Fig. 13.7). Here, data is the original input data and out is the final encoded operand (Fig. 13.8). Here, In is the encoded operand input data and S holds the final SC error coefficients (Fig. 13.9). Here, S is the SC error-coefficient input, and E0, E1, and E2 are the final EAB prioritized error coefficients (Fig. 13.10). Here, In is the encoded operand input data, and E0, E1, and E2 are the EAB prioritized error-coefficient inputs; YC is the decoded error-free output data, so it is the same as the input data. From Table 13.1, it is observed that the proposed design achieves enhanced performance and consumes fewer resource blocks, with low area, power, and delay.
Fig. 13.4 Parity generator output
Fig. 13.5 Parity checker output
148
Fig. 13.6 Convolution communication system
Fig. 13.7 Encoder output
Fig. 13.8 SC output
Fig. 13.9 EAB output
Fig. 13.10 ECU output
P. Anuradha et al.
13 QCA-Based Error Detection Circuit for Nanocommunication Network
149
Table 13.1 Comparison of various decoders

Parameter           | Convolutional decoder [1] | Turbo decoder [2] | Linear block decoder [3] | Proposed convolution decoder
Time delay (ns)     | 7.281                     | 10.3              | 4.29                     | 2.298
Power utilized (µW) | 0.322                     | 1.32              | 1.84                     | 0.165
Look-up tables      | 107                       | 193               | 134                      | 23
Slice registers     | 112                       | 83                | 78                       | 13
13.5 Conclusion
In this manuscript, three-bit parity generation and four-bit parity checking operations were first developed using QCA-based majority logic gates. However, that architecture can only detect whether an error is present. A new convolution-based communication system was therefore developed using majority logic for multibit error detection and correction, so that the number of quantum-cell elements is optimized without affecting the original functionality. To improve the reconfigurability of the ECU block and provide well-organized path metrics for error detection and correction, the proposed convolution decoder design is adapted to singular input patterns. The convolution decoder was realized with insignificant PLB overhead compared with a CMOS design of a convolution decoder.
References
1. Naveen, K.B., Puneeth, G.S., Sree Rangaraju, M.N.: Low power convolution decoder design based on majority logic gates. In: 2017 4th International Conference on Electronics and Communication Systems (ICECS). IEEE (2017)
2. Habib, I., Paker, Ö., Sawitzki, S.: Design space exploration of hard-decision convolution decoding: algorithm and VLSI implementation. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 18(5), 794–807 (2010)
3. He, J., Liu, H., Wang, Z., Huang, X., Zhang, K.: High-speed low-power convolution decoder design for TCM decoders. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 20(4), 755–759 (2012)
4. Nargis, J., Vaithiyanathan, D., Seshasayanan, R.: Design of high speed low power convolution decoder for TCM system. In: Proceedings of International Conference on Information Communication and Embedded Systems, Chennai, India, pp. 185–190 (2013)
5. Azhar, M.W., Själander, M., Ali, H., Vijayashekar, A., Hoang, T.T., Ansari, K.K., Larsson-Edefors, P.: Convolution accelerator for embedded processor datapaths. In: Proceedings of IEEE 23rd International Conference on Application-Specific Systems, Architectures, and Processors, Delft, Netherlands, pp. 133–140 (2012)
6. Karim, M.U., Khan, M.U.K., Khawaja, Y.M.: An area reduced, speed optimized implementation of convolution decoder. In: Proceedings of International Conference on Computer Networks and Information Technology, Abbottabad, Pakistan, pp. 93–98 (2011)
Chapter 14
Evaluating Performance on Covid-19 Tweet Sentiment Analysis Outbreak Using Support Vector Machine
M. Shanmuga Sundari, Pusarla Samyuktha, Alluri Kranthi, and Suparna Das
Abstract Sentiment analysis is a machine learning process that analyzes text and reports whether it is positive or negative. Once the machine is trained on the emotions expressed in text, it can automatically understand new text and predict its sentiment. Sentiment analysis is an information extraction task whose result is based on the emotions in users' writing, such as positive and negative thoughts and feelings, with the words categorized as positive or negative. Natural language processing (NLP) is an emerging field in machine learning that enables hybrid applications in daily life; for example, keywords taken from text can be passed to intelligent learning, and the output of the NLP algorithm enables sentiment analysis to report on daily activities. In this paper, we extracted Covid-19 tweets from social media and performed sentiment analysis using a support vector machine (SVM). We trained the system using a sentiment model and identified the emotions in the Covid-19 tweets. Based on the trained system, we classified the emotions in Covid-19 tweet messages as negative, positive, or neutral.
M. Shanmuga Sundari (B) · A. Kranthi · S. Das BVRIT Hyderabad College of Engineering for Women, Hyderabad, India e-mail: [email protected] A. Kranthi e-mail: [email protected] S. Das e-mail: [email protected] P. Samyuktha Vasavi College of Engineering, Hyderabad, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 V. Bhateja et al. (eds.), Smart Intelligent Computing and Applications, Volume 1, Smart Innovation, Systems and Technologies 282, https://doi.org/10.1007/978-981-16-9669-5_14
14.1 Introduction
In the difficult times of a pandemic, social media helps people in many ways and plays a vital role. Covid-19 [1] information provides details about the total number of affected people based on information gathered from social media. This information not only makes people take steps to safeguard themselves but also provides a path for helping hands to reach those in need. Covid-19 tweets [2] in social media support business growth at all times. Unlike in usual times, more people rely on social media messages for their needs. Information about who is supplying basic necessities such as food and medicine reaches people quickly through social media. With so much information at hand, we are now able to decide properly, and this information saves people from falling into the wrong hands. Positive Covid-19 tweets in social media strengthen people's minds by sharing positive messages. Even the negative messages shared can make people think positively: for example, when we see a lot of death news, we start to feel blessed to be alive, and that positive thinking may sow a seed for an attitude change in people's lives. As another example, in a pandemic situation like this it would be very difficult to know about virtual classes conducted in various places without social media. Social media is a powerful vehicle for carrying information to people, and it showcases their emotions, words, anger, and desperation. People use social media regularly to post incidents every day. Social media can guide people in the correct or the wrong way: based on the number of messages, the spread can have a large impact, and that impact may circulate positive or negative thoughts. Many times, it misleads people. Nowadays, Covid-19-related information is circulating in social media, and the cause of many of these messages is that people are in panic. In this paper, we focus on Twitter social media data and analyze the sentiment in the text. The messages are categorized into positive or negative impression words. This research starts by collecting the dataset for Covid-19. Our dataset contains all the tweets from January 2020 to April 2020, and we also considered retweeted user messages. The SVM algorithm [3] is used to find the sentiment hidden in the dataset. In Covid-19 tweets, words such as "vaccine," "vaccinated," and "shot" themselves have a connotation toward positive or negative polarity [4].
14.2 Proposed System
We collected a number of tweets posted on social media. The dataset contains many emotions, symbols, and other unwanted sentences. A model was created that understands whether a word in a tweet expresses an emotion [5] or not. Sentiment analysis [6] is the best way to determine the emotions in the text of the tweets. Figure 14.1 shows the model for sentiment analysis using supervised learning classification: the dataset is split into training and test sets for the classifier, the analysis is done using these sets, and the accuracy [7] of the model is then determined.
Fig. 14.1 Proposed system for Covid-19 tweet
Fig. 14.2 Implementation steps in Covid-19 tweet sentiment analysis
The collected data is reduced after preprocessing and then undergoes sentiment analysis using the support vector machine algorithm. In this way, the text is analyzed effectively and the accuracy of the sentiment analysis is obtained.
14.3 Implementation
Sentiment analysis is a most important element of NLP. The implementation steps shown in Fig. 14.2 are followed to realize the sentiment analysis concept. In this paper, we use Covid-19 tweets to understand the sentiments hidden in the text messages and predict the emotions in the text in terms of positive and negative words. The steps to complete this analysis are:
• Dataset collection and preprocessing of the data in the dataset
• Splitting the dataset into multiple tokens from the text
• Recognizing the problem statement for sentiment analysis
• Identifying the algorithm for sentiment analysis
• Word extraction and applying the algorithm
• Evaluating the accuracy of the sentiment analysis.
14.4 Support Vector Machine (SVM)
The support vector machine is a supervised learning algorithm. It is highly preferred because it produces significant accuracy with less computation power. SVM can be used for both regression [8] and classification tasks.
Fig. 14.3 Dataset sample
SVM is used to find a hyperplane over multiple features that is used to classify the data points. The SVM kernel is a function that transforms a lower-dimensional input into a higher-dimensional space [9]; the complex data is transformed based on the label or output, so SVM converts a non-separable problem into a separable one. SVM maximizes the margin between the features, which makes it effective in high-dimensional spaces, and the model is useful when there are more samples than dimensions. SVM has a few disadvantages: it needs more time to train on a large dataset, and it does not provide probability estimates. The basic steps involved in SVM, sketched in the code below, are:
• Import the dataset
• Identify the attributes
• Clean the data
• Split the data and labels
• Form the training and testing data
• Train the SVM algorithm
• Make predictions.
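A minimal scikit-learn sketch of these steps is shown below. The file name and the column names "text" and "sentiment" are illustrative assumptions rather than the exact fields of the dataset in Fig. 14.3.

```python
# Illustrative SVM workflow for tweet sentiment classification (scikit-learn).
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report

df = pd.read_csv("covid19_tweets.csv")                 # import the dataset
X_text, y = df["text"].astype(str), df["sentiment"]    # attributes and labels

# turn raw tweet text into numeric features
X = TfidfVectorizer(stop_words="english", max_features=5000).fit_transform(X_text)

# split data and labels into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

clf = SVC(kernel="linear")     # train the SVM algorithm
clf.fit(X_train, y_train)

pred = clf.predict(X_test)     # make predictions
print(accuracy_score(y_test, pred))
print(classification_report(y_test, pred))
```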
14.4.1 Collection of Dataset
Figure 14.3 shows the details of the collected dataset. The Covid-19 dataset contains 49,605 tuples and 16 attributes. The collected attributes must be checked for importance before being used for sentiment analysis. The dataset can then be used for the implementation, proceeding further with tokenization [10] of the text.
14.4.2 Dataset Preprocessing
Data preparation [11] can be done using the sklearn packages in Python. In the Covid-19 dataset, the unwanted null columns and other attributes without text messages are removed. Symbols are also removed from the dataset to reduce the noise and to allow better model training.
Fig. 14.4 Preprocessed dataset
The sample output after preprocessing of the data is given in Fig. 14.4; here, the text has been extracted and a sentiment column with the emotion labels has been added for the next level of implementation.
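A minimal preprocessing sketch along these lines is given below; the column names and the exact cleaning rules are illustrative assumptions rather than the authors' exact pipeline.

```python
# Drop empty columns and rows without text, strip symbols, and encode labels.
import re
import pandas as pd
from sklearn.preprocessing import LabelEncoder

df = pd.read_csv("covid19_tweets.csv")
df = df.dropna(axis=1, how="all").dropna(subset=["text"])   # remove null columns / empty tweets

def clean(text: str) -> str:
    text = re.sub(r"http\S+|@\w+|#", " ", text)    # strip URLs, mentions, '#'
    text = re.sub(r"[^A-Za-z\s]", " ", text)       # remove symbols and emojis
    return re.sub(r"\s+", " ", text).strip().lower()

df["clean_text"] = df["text"].astype(str).apply(clean)

# encode the sentiment labels (negative / neutral / positive) as integers
df["label"] = LabelEncoder().fit_transform(df["sentiment"])
```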
14.4.3 Performance Analysis of the Covid-19 Dataset
The dataset sentiment analysis is based on the Covid-19 tweets in terms of emotions, for example positive, negative, and neutral. Figure 14.5 shows the bar chart for the Covid-19 tweet dataset values. The Covid-19 dataset was divided into training and testing sets, and the accuracy was calculated for the testing set using the trained model. The evaluated train and test accuracy values are given below:
Train accuracy: 0.9961648
Test accuracy: 0.8403960
Fig. 14.5 Dataset emotions chart
Fig. 14.6 Classification matrix
14.4.4 Performance Matrix
The performance in terms of accuracy is determined using the confusion matrix [12]. A confusion matrix is a way of explaining the positive and negative prediction [13] values obtained from the collection of text. Four different values are determined using the formulas below [14], based on true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN):
• Accuracy = (TP + TN)/(TP + FP + TN + FN)
• Sensitivity = TP/(TP + FN)
• Specificity = TN/(FP + TN)
• Precision = TP/(TP + FP).
Figure 14.6 shows the classification report [15] for the negative, positive, and neutral values. The emotion values are captured using the confusion matrix formulas to derive precision, recall, F1-score, and support. The confusion matrix values are then calculated from these formulas; the matrix derived for the Covid-19 tweets is shown below, and from it the accuracy of the Covid-19 tweet classification was calculated. Confusion matrix:
1568   377    41
 167  2473   108
 115   288  1730
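As a quick check, the reported test accuracy can be recovered from this matrix as the sum of the diagonal (correctly classified tweets) divided by the total.

```python
# Accuracy from the confusion matrix above: trace / total ≈ 0.8403960.
import numpy as np

cm = np.array([[1568,  377,   41],
               [ 167, 2473,  108],
               [ 115,  288, 1730]])

print(round(np.trace(cm) / cm.sum(), 7))
```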
14.4.5 Model Evaluation
In Fig. 14.7, an axes subplot [16] shows the values derived for the Covid-19 tweets. The x-axis and y-axis values are based on the tweets and the users' favorite tweets. The values are plotted using the axes subplot class, with the related values taken from the Covid-19 dataset.
Fig. 14.7 Axes chart for Covid-19 tweet dataset
Fig. 14.8 Tweets sample from India
The sentiment model displays the Covid-19 tweet emotions and text coming from different countries [17]. Figure 14.8 shows the Covid-19 tweet emotions and text that mostly originated in India.
14.4.6 Covid-19 Sentiment Analysis
The sentiment analysis segregates the text in the Covid-19 tweets. The following values are derived from the sentiment analysis for the emotions negative, positive, and neutral: the sentiment analysis model [18] identifies 8399 negative emotion tweets, 7004 positive emotion tweets, and 29,992 neutral emotion tweets.

Sentiment   | Tweets
0 negative  | 8399
1 neutral   | 29,992
2 positive  | 7004

Fig. 14.9 Covid-19 sentiment analysis
Figure 14.9 shows the bar chart [19] of the final tweet counts obtained from the sentiment analysis of the Covid-19 tweets. The chart shows that the neutral tweets outnumber the negative and positive emotions, that is, people are tweeting neutral messages more than the other emotions. From this result, the paper concludes that neutral tweets are spreading more in social media than the other emotions.
14.5 Conclusion
Sentiment analysis is a valuable tool for text analysis. We analyzed the sentiment of Covid-19 tweets from different countries: we studied the tweet words and performed sentiment analysis on the current tweet dataset, using a support vector machine to classify the tweet sentiment. In this paper, we evaluated the Covid-19 tweets using a support vector machine and found the accuracy of the sentiment analysis. We evaluated the emotions from the tweet dataset and determined the number of negative, positive, and neutral messages. In the future, we can identify other emotions in the dataset and use different algorithms for the sentiment analysis.
References 1. Yin, H., Yang, S., Li, J.: Detecting topic and sentiment dynamics due to Covid-19 pandemic using social media. In: International Conference on Advanced Data Mining and Applications, pp. 610–623. Springer, Cham (2020) 2. Manguri, K.H., Ramadhan, R.N., Amin, P.R.M.: Twitter sentiment analysis on worldwide COVID-19 outbreaks. Kurdistan J. Appl. Res. 54–65 (2020)
3. Satu, M.S., Khan, M.I., Mahmud, M., Uddin, S., Summers, M.A., Quinn, J.M., Moni, M.A.: TClustVID: a novel machine learning classification model to investigate topics and sentiment in COVID-19 Tweets. Knowl.-Based Syst. 226 107126 (2021) 4. Gruzd, A., Mai, P.: Going viral: how a single tweet spawned a COVID-19 conspiracy theory on Twitter. Big Data Soc. 7(2), 2053951720938405 (2020) 5. Alkhalifa, R., Yoong, T., Kochkina, E., Zubiaga, A., Liakata, M.: QMUL-SDS at CheckThat! 2020: Determining COVID-19 Tweet Check-Worthiness Using an Enhanced CT-BERT with Numeric Expressions (2020). arXiv preprint arXiv:2008.13160 6. Alam, F., Dalvi, F., Shaar, S., Durrani, N., Mubarak, H., Nikolov, A., Nakov, P.: Fighting the COVID-19 infodemic in social media: a holistic perspective and a call to arms (2020). arXiv preprint arXiv:2007.07996 7. Long, Z., Alharthi, R., El Saddik, A.: NeedFull—a tweet analysis platform to study human needs during the COVID-19 Pandemic in New York State. IEEE Access 8, 136046–136055 (2020) 8. Sharma, K., Seo, S., Meng, C., Rambhatla, S., Liu, Y.: Covid-19 on Social Media: Analyzing Misinformation in Twitter Conversations (2020). arXiv e-prints, arXiv-2003. 9. Ghosh, P., Schwartz, G., Narouze, S.: Twitter as a powerful tool for communication between pain physicians during COVID-19 pandemic. Reg. Anesth. Pain Med. 46(2), 187–188 (2021) 10. Sundari, M.S., Nayak, R.K.: Process mining in healthcare systems: a critical review and its future. Int. J. Emerg. Trends Eng. Res. 8(9). ISSN 2347-3983 11. Singh, L., Bansal, S., Bode, L., Budak, C., Chi, G., Kawintiranon, K., Wang, Y.: A First Look at COVID-19 Information and Misinformation Sharing on Twitter (2020). arXiv preprint arXiv: 2003.13907 12. Banda, J.M., Tekumalla, R., Wang, G., Yu, J., Liu, T., Ding, Y., Chowell, G.: A LargeScale COVID-19 Twitter Chatter Dataset for Open Scientific Research—An International Collaboration (2020). arXiv preprint arXiv:2004.03688 13. Nannapaneni, L., Rao, M.V.G.: Control of indirect matrix converter by using improved SVM method. Bull. Electr. Eng. Inform. 4(1), 26–31 (2015) 14. Reddy, R.R., Ramadevi, Y., Sunitha, K.V.N.: Enhanced anomaly detection using ensemble support vector machine. In: 2017 International Conference on Big Data Analytics and Computational Intelligence (ICBDAC), pp. 107–111. IEEE (2017, March) 15. Chaganti, S.Y., Nanda, I., Pandi, K.R., Prudhvith, T.G., Kumar, N.: Image classification using SVM and CNN. In: 2020 International Conference on Computer Science, Engineering and Applications (ICCSEA), pp. 1–5. IEEE (2020, March) 16. Sundari, M.S., Nayak, R.K.: Master card anomaly detection using random forest and support vector machine algorithms. J. Critic. Rev. 7(9), (2020). ISSN-2394-5125 17. Shoemaker, L., Hall, L.O.: Anomaly detection using ensembles. In: International Workshop on Multiple Classifier Systems, pp. 6–15. Springer, Berlin, Heidelberg (2011, June) 18. Laskari, N.K., Sanampudi, S.K.: TWINA at SemEval-2017 task 4: Twitter sentiment analysis with ensemble gradient boost tree classifier. In: Proceedings of the 11th international workshop on semantic evaluation (SemEval-2017), pp. 659–663 (2017, August) 19. Nayak, R.K., Tripathy, R., Mishra, D., Burugari, V.K., Selvaraj, P., Sethy, A., Jena, B.: Indian stock market prediction based on rough set and support vector machine approach. In: Intelligent and Cloud Computing, pp. 345–355. Springer, Singapore (2021)
Chapter 15
Minimum Simplex Nonlinear Nonnegative Matrix Factorization for Hyperspectral Unmixing K. Priya and K. K. Rajkumar
Abstract Hyperspectral unmixing is one of the most promising technologies for estimating the endmembers and abundances from a hyperspectral image, improving the spatial and spectral quality of the image. Over the past decades, much of the literature has introduced spectral unmixing methods based on the Linear Mixing Model (LMM). At present, nonnegative matrix factorization (NMF) is widely used within the LMM because it estimates the endmembers and abundances simultaneously, but it considers only the linear features of the images. To overcome these limitations, we propose a Minimum Simplex Nonlinear Nonnegative Matrix Factorization for hyperspectral unmixing. This model helps us acquire spectral and spatial data from the hyperspectral image. The proposed method takes the nonlinearity in the image into account and thus obtains high-quality spatial data, and to improve the spectral quality, a simplex minimum volume constraint is added to the method. We also measured the strength and superiority of the proposed model on two different public datasets, where our algorithm shows outstanding performance compared with all other baseline methods. Further, an experiment is conducted by varying the number of endmembers, and it is concluded that with fewer than five endmembers, high-quality spectral and spatial data are obtained simultaneously.
15.1 Introduction Spectral imaging is one of the fastest-growing technology due to its wide range of applications such as object classification, target identification, change detection, and object recognition so on. In hyperspectral imaging, each pixel in the images is a mixture of several kinds of materials of the scene and it gives a complex structure to the hyperspectral images. Due to this complex structure, it is practically difficult for the in-depth or detailed analysis of HSI using conventional imaging systems. To overcome this limitation, advanced imaging technology such as Hyperspectral Unmixing (HU) can be effectively utilized for analyzing minute details of the pixel K. Priya (B) · K. K. Rajkumar Department of Information Technology, Kannur University, Kannur, Kerala, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 V. Bhateja et al. (eds.), Smart Intelligent Computing and Applications, Volume 1, Smart Innovation, Systems and Technologies 282, https://doi.org/10.1007/978-981-16-9669-5_15
information in the image. The HU mainly consists of three main steps for extracting the information in the images such as selecting desired number of endmembers from the scene, extraction of the spectral information for the selected endmembers, and finally estimating the fractional abundances for endmembers [1]. The hyperspectral unmixing technique can be classified into two fundamental mixing models: linear and nonlinear models. Linear mixing models (LMM) consider only the macroscopic scale of mixing, which means that only a single level interaction of the incident light on the material is taken into account. In nonlinear mixing model (NLMM) mixing is more complex because it occurs at the microscopic or multiple scattering of material in the image pixel. Therefore due to its simplicity and ease of use LMM becomes more commonly used in unmixing methods [2]. An LMMbased NMF method, which is an attractive and widely used model for hyperspectral unmixing process [3]. Many literature proposed constrained NMF by incorporating some additive terms to the original NMF [4]. But most of these existing NMF based unmixing algorithms impose constraints on any one of the two matrices namely in the endmember or the abundance matrix but does not consider the nonlinearity in the image [5]. In this paper, we are going to introduce a new blind unmixing method called Minimum Simplex Nonlinear Nonnegative Matrix Factorization. The minimum simplex information is based on the spectral data of HSI image. This information is acquired from the vertices of a simplex. A simplex is a hypercube which is formed from the data points of the image called endmembers. This endmember determines the volume of the simplex which is one of the most important prior considerations in the analysis of spectral image. In order to control the structural information and to gain better performance on unmixing process, it is necessary to choose the simplex with possible minimum volume. Here we use centroid-based minimum volume regularizer that makes the selected endmember comes closer to the center point of the simplex denoted by μ. Thus minimizing the volume of the simplex obtained through the endmembers [6]. In this work, other than these constraints, we also address the nonlinearity effects in the spatial data. The nonlinear effects account for the residual noise, spectral and spatial variability occurring in the specific locations of the image. These nonlinear effects are also named as outlier or residual data [7]. The remaining parts of this paper are ordered as follows. In Sect. 15.2 a review is done about various hyperspectral unmixing work that is related to our paper and identified the proper research gap in the literature. Section 15.3 present the details of selected dataset and quality measures used in this paper for evaluating the performance of the proposed algorithm. Then in Sects. 15.4 and 15.5 formulate and implement the proposed unmixing algorithm. Finally, results and discussion are done in Sect. 15.6 and in Sect. 15.7 conclusion and future enhancement of the work are explained.
15.2 Review of Literature In recent literature, many LMM-based NMF unmixing algorithm have been proposed. In the article [6], Zhuang et al., proposed a nonnegative matrix factorization-quadratic minimum volume-based (NMF-QMV), which imposed a minimum volume (MV) constraints to the unmixing algorithm. This MV constraints minimize the volume of simplex thus reducing the non-convexity problem as well as computational complexity during the optimization process. In the article [8], Yuan et al. proposed a hyperspectral image unmixing method named an improved collaborative non-negative matrix factorization and total variation algorithm (ICoNMF-TV) is proposed to enhance the hyper-spectral image. In this method, authors introduced total variation (TV) regularizer to the abundance matrix to provide piecewise smoothness between adjacent pixels. Therefore, this unmixing method enhances the spatial quality and also promote the performance and efficiency of hyperspectral unmixing algorithm. In the article [9], Zhou et al., proposed NMF unmixing method based on spatial information by considering the original image into subspace structure. This subspace structure helps to identify the materials that are globally distributed on different parts of the image. Finally, the sparse-NMF framework was incorporated with this method for considering the amount of sparsity in the abundance matrix. Therefore, this method improved the spatial quality as well as the performance of unmixing algorithm. In the article [10], He et al., introduced a total variation regularized reweighted sparse NMF (TV-RSNMF) unmixing method. This method incorporates a reweighted sparse unmixing method which encourages more sparsity of the abundance map than the existing L1 norm method. The total variation (TV) regularizer is also embedded into this method which provides piecewise smoothness as well as denoising the abundance map. Thus, TV-RSNMF provides good visual quality to the image. In the article [11], Qu et al., proposed unmixing algorithm namely multiplepriors ensemble constrained NMF (MPEC-NMF). In this work, NMF unmixing method combines both the geometrical and statistical prior of HSI. In geometric prior concepts, the endmember matrix is imposed with minimum volume constraints to strengthen the quality of spectral data. The statistical prior impose total variation constraints to provide a spatial smoothness as well as the sparsity constraints to accounts for the number of zero or null values in the abundance map. These two statistical priors are most important constraints applied in the abundance matrix. That means, this method incorporated all the important constraints which enhance the spatial as well as spectral data of the HSI. But all these methods do not consider the nonlinearity effects in the image. As a result of simplicity in LMM method, most of the existing unmixing algorithms use LMM based NMF for hyperspectral unmixing process. Moreover, LMM based NMF unmixing method provides a good approximation in many fundamental observations of the image. This behavior makes the LMM a useful technique for many applications. But all these existing LMM-based unmixing algorithms have
been developed did not consider the basic nonlinearity in the image. In some situations, such as, sand-like scenes, incident light scattered and absorbed through multiple materials present in each pixel which may highly result in nonlinear effects. In such a special situation, LMM may be inaccurate to handle the image unmixing process due to the presence of nonlinearity or outlier effects in the image. Therefore, it is necessary to consider nonlinear data during the spectral unmixing process [7]. In this work, we introduce a nonnegative matrix factorization (NMF) based LMM method by incorporating the nonlinear effects described above. Our method is built on the standard LMM, with some regularization terms that enhance the accuracy of both spectral and spatial dimensions along with the nonlinear effects of the image.
15.3 Dataset and Quality Measures
The efficiency and effectiveness of the proposed algorithm are evaluated using two hyperspectral data sets. The Washington DC Mall image is a well-known dataset containing 1278 × 307 pixels and 191 bands in the 0.4–2.5 μm spectral range [12]. The AVIRIS Indian Pines image consists of 512 × 614 pixels and 192 bands with wavelengths ranging from 0.4 to 2.5 μm [13]. Five of the most common quality measures, namely the Spectral Angle Mapper (SAM), Signal-to-Reconstruction Error (SRE), Root-Mean-Square Error (RMSE), Peak Signal-to-Noise Ratio (PSNR), and Universal Image Quality Index (UIQI), are used to determine the performance of our algorithm [14]. The mathematical modeling and significance of these measures are explained below.
A. SAM measures the spectral difference between the estimated spectra Ê and the reference spectra E as follows:
SAM(E, Ê) = (1/n) Σ_{j=1}^{n} arccos( (E_j^T · Ê_j) / (||E_j||_2 · ||Ê_j||_2) )    (15.1)
B. SRE measures the quality of the estimated abundances Â_i against the reference abundances A_i as follows:
SRE = 10 log_10 [ ( (1/n) Σ_{i=1}^{n} ||A_i||_2^2 ) / ( (1/n) Σ_{i=1}^{n} ||A_i − Â_i||_2^2 ) ]    (15.2)
C. RMSE measures the spatial quality between the reference abundances A and the estimated abundances Â and is defined as:
RMSE(A, Â) = sqrt( (1/(λ_h · n · m)) ||A − Â||_F^2 )    (15.3)
D. PSNR measures the reconstruction quality of the estimated abundance data band-wise as follows:
PSNR = (1/λ_h) Σ_{l=1}^{λ_h} PSNR_l    (15.4)
where PSNR_l is defined as:
PSNR_l = 10 · log_10 ( max(A^l)^2 / ( ||A^l − Â^l||^2 / P ) )    (15.5)
E. UIQI determines the similarity between the original and the estimated images A^l and Â^l by calculating the average correlation between both images as:
Q(A^l, Â^l) = ( σ_{A^l Â^l} / (σ_{A^l} · σ_{Â^l}) ) · ( 2 μ_{A^l} μ_{Â^l} / (μ_{A^l}^2 + μ_{Â^l}^2) ) · ( 2 σ_{A^l} σ_{Â^l} / (σ_{A^l}^2 + σ_{Â^l}^2) )    (15.6)
UIQI(A, Â) = (1/λ_h) Σ_{l=1}^{λ_h} Q(A^l, Â^l)    (15.7)
where μ_{A^l} and μ_{Â^l} denote the mean vectors, σ_{A^l} and σ_{Â^l} denote the variances, and σ_{A^l Â^l} is the covariance of the two images, respectively.
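A few of these measures are straightforward to transcribe into numpy, as sketched below; the matrix shapes and variable names are illustrative assumptions.

```python
# Illustrative numpy versions of SAM (Eq. 15.1), RMSE (Eq. 15.3) and SRE
# (Eq. 15.2), assuming E, E_hat are L x p endmember matrices and A, A_hat
# are p x N abundance maps.
import numpy as np

def sam(E, E_hat):
    num = np.sum(E * E_hat, axis=0)
    den = np.linalg.norm(E, axis=0) * np.linalg.norm(E_hat, axis=0)
    return np.mean(np.arccos(np.clip(num / den, -1.0, 1.0)))

def rmse(A, A_hat):
    return np.sqrt(np.mean((A - A_hat) ** 2))

def sre(A, A_hat):
    return 10 * np.log10(np.mean(np.sum(A ** 2, axis=0)) /
                         np.mean(np.sum((A - A_hat) ** 2, axis=0)))
```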
15.4 Proposed Method In the proposed model, we are going to explore both spatial and spectral quality of the image by considering the nonlinearity and minimum volume constraint to the simplex formed from the endmembers in spectral domain. These constraints help us to attain a narrow solution space for the NMF problem. The robust nonnegative matrix factorization algorithm (r-NMF) [15] is applied to describe the outlier residual term that can capture the nonlinear effects in the image. In this paper NMF based unmixing algorithm uses the multiplicative update algorithm for updating the basic terms and use β-divergence which minimize the objective function of NMF at each iteration until it reaches the pre-determined ε (epsilon) value [7]. A.
Linear Mixing Model (LMM)
In paper [1] proposed, LMM, which is one of the conventional and effectively used methods in hyperspectral unmixing. This linear mixture model only assumes the macroscopic level of information about the pixel. That means LMM considers only the reflecting endmembers present within a pixel. LMM first estimate the endmember based on the scene of interest, then decompose the input matrix into pure spectral
signature and their fractional abundance corresponding to the estimated endmember [16]. Let the observed hyperspectral image be Y = [y1 , . . . , yk , . . . , y N ] ∈ RL×N with L bands and N pixels, then assume that p is the number of endmembers to estimated. The endmember matrix are represented as E = [e1 , . . . , e P ] ∈ R L× p and its corresponding abundance matrix are represented as A = [a1 , . . . , ak , . . . , a N ] ∈ R p×N . The matrix R = [r1 , . . . , rk , . . . , r N ] ∈ R L×N is the corresponding residual matrix that accounts for some noise, variability, and errors, etc. In general, R may be considered as zero or close to zero. With these notations, the LMM can be modeled, based on pixel-wise yk ∈ RL×1 , as: yk = Eak + rk ,
(15.8)
The matrices representation of Fig. 15.1 can be presented as: Y = EA+ R
(15.9)
The LMM-based unmixing approaches are generally divided into geometrical and statistical ways. The geometrical based unmixing approach includes two steps, first extract the endmembers (spectral signature) and then estimate the abundances for the extracted endmembers [17]. Whereas, in statistical-based unmixing algorithm, both endmembers and its abundances are estimated at the same time without any purest pixel assumption. NMF are the typical statistical-based unmixing algorithm which factorize the high dimensional data into two nonnegative matrices simultaneously without pure pixel assumption by satisfying both sum-to-one (ASC) and nonnegativity (ANC) constraints. Therefore, hyperspectral unmixing (HU) can be formulated well in NMF based statistical approach [18].
Fig. 15.1 An iterative representation of minimum volume simplex
B.
NMF
NMF factorize the input matrix Y into two nonnegative matrices named endmember matrix E and abundance matrix A. In the initialization step, set the number of endmember p for the input matrix Y ∈ RL×N and then calculate the initial endmember matrix E ∈ RL×p by Automatic Target Generation Process (ATGP), which provides a purest pixel assumption. After the endmember estimation step, fractional abundance A ∈ Rp×N corresponding to these endmembers are calculated. These initial abundances are estimated using Fully Constrained Least Square method (FCLS) that satisfy both sum-to-one and non-negativity constraints [19]. In general, NMF method can be formulated as follows, Y ≈ EA
(15.10)
Equation (15.10) implies that the input matrix Y is decomposed into two nonnegative matrix endmember E and abundance A simultaneously. NMF based unmixing minimize the difference between Y and EA by performing the matrix decomposition iteratively until it meets the convergence condition. Then, minimum distance of Eq. (15.10) can be represented as follows: min f (E, A) =
(1/2) ||Y − EA||_F^2, subject to E, A ≥ 0
(15.11)
where, E and A are endmember and abundance with nonnegative values and · 2F is the Frobenius norm [20]. C.
NMF with Outlier Term
In general, the LMM method does not consider specific heterogeneous regions. This limitation of LMM unmixing algorithms found difficulty for the estimation of endmembers and abundances fractions accurately. From this premise, in paper [7], proposed a new NMF based LMM model by considering the above scenarios, so-called robust nonnegative matrix factorization (r-NMF). This r-NMF model decomposes the input matrix Y as follows, Y ≈ EA+ R
(15.12)
where Y ∈ RL×N is an input matrix, E and A represents the endmember and the abundance matrix. R is an outlier term that account for nonlinearity effects such as residual noise, spectral and spatial variability, etc. The approximation symbol (≈) in Eq. (15.12) indicates the minimum dissimilarity measure between the input and factorized matrix. So, Eq. (15.12) can be reformulated as, D(Y |M A + R) Which is equals to,
(15.13)
D(Y |E A + R) = ||Y − E A − R||2F
(15.14)
then Eq. (15.14) can be rewrite as, min f (E, A, R) = D(Y |E A + R) st, E, A, R ≥ 0, D.
(15.15)
NMF with Minimum Volume (MV) Constraint
A significant geometrical constraint considered during hyperspectral unmixing process is convex or simplex minimum volume (MV) prior. This MV regularizer measures the volume of the simplex or convex hull whose vertices are represented by the endmembers selected for unmixing. This work proposes a centroid-based minimum volume simplex. This method shrinks the volume of simplex or convex hull by pulling the endmember (vertex of simplex) towards the centroid μ (center of simplex) [21]. The NMF method with minimum volume based constraints for Eq. (15.15) can be represented as, min f (E, A, R) = D(Y |E A + R) + αMV (E) MV (E) =
p
ei − μ22
(15.16)
(15.17)
i=1
The centroid μ (center of mass) of simplex is estimated as follows: μ=
p 1 ei p i=1
(15.18)
where, p denotes the number of endmembers and the ei represent the mean value of each row in the endmember matrix E. The MV-based NMF algorithm provides high-fidelity to the spectral signatures and thus helps to reduce the computational complexity in practical applications. Figure 15.1 shows the iterative process to represent the simplex with minimum volume (MV). In the k + 1th iterative step, it select a minimum volume simplex for the hyperspectral image Y h .
15.5 Algorithm Implementation In this paper, we implemented an algorithm that accounts both the nonlinearity effects of the image along with spatial and spectral quality. Then our proposed model can be represented as following objective function:
15 Minimum Simplex Nonlinear Nonnegative Matrix Factorization …
min f (E, A, R) = D(Y |E A + R) + αMV (E) subjected to E ≥ 0
169
(15.19)
The matrix R is an outlier function that accounts for the non-linearity effects such as residual noise and other anomalies due to spatial and spectral variability. The parameter α controls the simplex with minimum volume. By solving the Eq. (15.19), we will obtain an optimum minimization solution to unmixing problems with high robustness to the signal-to-noise ratio. In the optimization step, both E, A and R are alternately updated until the cost function reaches the convergence condition. Various literature have been already proposed and many algorithms for updating and solving the NMF. Among them multiplicative update (MU) algorithms [22] are commonly used for the solution of nonnegative matrix factorization as well as many variants of NMF method. This is because of its ease of implementation and good approximation results. Following the multiplicative update rule for NMF proposed by Juan in [22], they derive update rules for E, A and R as follows: E ← E◦
Y AT E A AT
(15.20)
A ← A◦
ET Y ET E A
(15.21)
Y EA
(15.22)
R ← R◦
The endmember matrix E, abundance matrix A and outlier matrix R are updated iteratively with this multiplicative update rule until it reaches the convergence condition. Here, the T indicates the transpose of the matrix. The convergence condition is measured as the change in ratio of cost function f must be below the given threshold ε, t f − f t+1 ],” “VidToken1,” “VidToken2,” “VidToken3,” “SEP”}. The model is able to accommodate our changes as our video tokens have been vector quantized from the I3D and k-means models. As such, the BERT model is a multi-layer bidirectional transformer encoder. It is one of the most sophisticated NLP algorithms due to its large pre-training on two main tasks—mask language modeling (Fill in the blanks) and next-sentence prediction (Question–answer) (Fig. 24.3).
24.3.2 Implementation 24.3.2.1
Training Process
The video must be transformed to spatiotemporal features during the feature extraction process. The S3D model is what we use. TensorFlow is used to build the models, and a vanilla synchronous SGD algorithm with a momentum of 0.9 is used to optimize them. In the next phase, we pass the features through to a hierarchial minibatch k-means clustering algorithm. This allows to form centroids which represent the essence of the cluster. We have set the hyperparameters as d = 1 (hierarchy) and the number of clusters as 5. These numbers can be fixed by visually inspecting the coherence of the clusters. The output as centroids goes to the punctuator model. The problem of punctuation restoration is defined as a sequence labeling task with four target classes: EMPTY, COMMA, PERIOD, and QUESTION. We do not include other punctuation marks as their frequency is very low in both datasets. For this reason, we apply a conversion in cases where it is semantically reasonable: We convert exclamation marks and
264
S. Gupta et al.
semicolons to periods and colons and quotation marks to commas. We remove double and intra-word hyphens, however, if they are encapsulated between white spaces, we convert them to commas. Other punctuation marks are disregarded during the experiments. As tokenizers occasionally split words to multiple tokens, we apply masking on tokens, which do not mark a word ending. Finally, these tokens are now interpolated among other text tokens from the filtered captions for next-sentence prediction or mask-prediction task. We employ a global set of 28 steps in the training process.
24.3.2.2
AWS and Google Colab Deployment
While Google Colab is one of the best free resources online, it lacks persistent storage. We started working on a notebook using GPU in Google Colab and left it running for an hour for observation purposes. When we examined the activity of the notebook on which we were working, we discovered that it had disconnected from the VM, and when we attempted to reconnect, we received an error message. Yes, you can only use one GPU with 12 GB of memory, but the TPU has 110 GB of high bandwidth memory. So, for large datasets, it is not feasible. A few of our earlier experiments have been carried out by mounting the Google Drive into the Colab notebook, but in the long term, it is not economical. It was finally incurred that an AWS instance would be the best option. The Amazon deep learning AMI (Ubuntu 18.04) was selected. It was also observed that the costs varied with the region selected. Following Amazon’s recent expansion in India, Mumbai is also a datacenter site. This lead to a steady decrease in cost for the instance selected. The AMI driver enables cloud computing on NVIDIA T4 GPUs. The new G4 instance type features the NVIDIA T4 GPU and supports this driver on Ubuntu 18.04. Although we concede that AWS Sagemaker would have been the easiest of options, it was not the most economical. So, the final cost of our EC2 instance was around $0.2 per hour. After choosing the appropriate instance, the next step was to generate a PEM key. PEM is a container file format often used to store cryptographic keys; in our case, the credentials encode by any state-of-the-art encryption algorithm for a SSH login. It was initially agreed to access the EC2 instance via GUI. But, there was a surprising large latency of about 2–3 s through various RDP clients like TigerVNC, RealVNC, Remmina, TeamViewer, and so on. Hence, it was deduced that a CLI would be the most time friendly.
24.3.3 Results Analysis Using the standard metrics, it was inferred to use text-based scores. These scores help us in getting the loss metric. Logarithmic loss functions were used to confine the results within a range. The loss curve is as plotted in Fig. 24.4. It is observed
24 ChefAI Text to Instructional Visualization …
265
Fig. 24.4 Loss curve of the model over 28 global training steps
Fig. 24.5 Learning rate evolution over the training process
that the loss is steadily decreasing over the training cycle. It drops from 9.5 to about 6.5. The learning rate is automatically adjusted by the SGD algorithm; it drops from 1e−5 to 2e−6 by the last training steps. Some of the outputs of the model are also displayed. The temporal actions can be seen in Fig. 24.5. The pictures on the left constitute the “Making of a smoothie” recipe process. Note how the mixing action is correctly shown. Similarly, the right side of the image constitutes “Mexican Rice” which correctly shows the action of “adding oil to a hot pan” followed by “adding chopped onions to pan” (Fig. 24.6).
24.4 Conclusion There exist generative as well as transformational methods for instructional visualization. In our approach, we have worked with three types of models. In the preliminary stage, a video-captioning model was employed to understand the domain and
266
S. Gupta et al.
Fig. 24.6 Outputs obtained for training the model over Smoothie and Mexican Rice recipes
work around with the data available. We built upon this by steering away with some generative models like LDVDGAN, MoCoGAN, and CGAN. The training being complex on our machines yielded blurry images as outputs even after 200 epochs as the model was not learning. This was the time we decided to pivot our project from being largely a video-based to video-text approach. This yielded appreciable results on a small scale on Google Colab and AWS; we aim to build upon by training the model on videos in the magnitudes of thousands. The challenges faced shall be the scalability of the model and proper evaluation of the model.
References 1. Aldausari, N., et al.: Video generative adversarial networks: a review (2020). arXiv [cs.CV]. Available at: http://arxiv.org/abs/2011.02250 2. Arasanipalai, A.U.: Generative adversarial networks—the story so far, Floydhub.com. FloydHub Blog (2019). Available at: https://blog.floydhub.com/gans-story-so-far/ 3. Richard, A., Gall, J.: Temporal action detection using a statistical language model. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE (2016) 4. Gupta, S., Godavarti, R.: IoT data management using cloud computing and big data technologies. Int. J. Softw. Innov. 8(4), 50–58 (2020) 5. Mansimov, E., et al.: Generating images from captions with attention (2015). arXiv [cs.LG]. Available at: http://arxiv.org/abs/1511.02793 6. Liu, P., et al.: KTAN: knowledge transfer adversarial network. In: 2020 International Joint Conference on Neural Networks (IJCNN). IEEE (2020) 7. Zhu, Y., et al.: A comprehensive study of deep video action recognition (2020). arXiv [cs.CV]. Available at: http://arxiv.org/abs/2012.06567 8. Pan, Y. et al.: To create what you tell: generating videos from captions. In: Proceedings of the 2017 ACM on Multimedia Conference—MM’17. ACM Press, New York, USA (2017) 9. Sener, F., Yao, A.: Zero-shot anticipation for instructional activities. In: 2019 IEEE/CVF International Conference on Computer Vision (ICCV). IEEE (2019)
24 ChefAI Text to Instructional Visualization …
267
10. Kahembwe, E., Ramamoorthy, S.: Lower dimensional kernels for video discriminators. Neural Netw. Official J. Int. Neural Netw. Soc. 132, 506–520 (2020) 11. Dong, H., et al.: I2T2I: Learning text to image synthesis with textual data augmentation. In: 2017 IEEE International Conference on Image Processing (ICIP). IEEE (2017) 12. Mittal, G., Marwah, T., Balasubramanian, V.N.: Sync-DRAW: automatic video generation using deep recurrent attentive architectures. In: Proceedings of the 25th ACM International Conference on Multimedia. ACM, New York, USA (2017) 13. Nagy, A., Bial, B., Ács, J.: Automatic punctuation restoration with BERT models (2021). arXiv [cs.CL]. Available at: http://arxiv.org/abs/2101.07343 14. Ryoo, M.S.: Human activity prediction: early recognition of ongoing activities from streaming videos. In: 2011 International Conference on Computer Vision. IEEE (2011) 15. Gupta, S., Gugulothu, N.: Secure NoSQL for the social networking and E-commerce based bigdata applications deployed in cloud. Int. J. Cloud Appl. Comput. 8(2), 113–129 (2018) 16. Devlin, J., et al.: BERT: pre-training of deep bidirectional transformers for language understanding (2018). arXiv [cs.CL]. Available at: http://arxiv.org/abs/1810.04805 17. Ramesh, A., et al.: Zero-shot text-to-image generation (2021). arXiv [cs.CV]. Available at: http://arxiv.org/abs/2102.12092 18. Salvador, A., et al.: Inverse cooking: recipe generation from food images. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE (2019) 19. Wang, Y.: A mathematical introduction to generative Adversarial Nets (GAN) (2020). arXiv [cs.LG]. Available at: http://arxiv.org/abs/2009.00169 20. Xie, S., et al.: Rethinking spatiotemporal feature learning: speed-accuracy trade-offs in video classification (2017). arXiv [cs.CV]. Available at: http://arxiv.org/abs/1712.04851
Chapter 25
Identification of Predominant Genes that Causes Autism Using MLP Anitta Joseph and P. K. Nizar Banu
Abstract Autism or autism spectrum disorder (ASD) is a developmental disorder comprising a group of psychiatric conditions originating in childhood that involve serious impairment in different areas. This paper aims to detect the principal genes which cause autism. Those genes are identified using a multi-layer perceptron network with sigmoid as an activation function. The multi-layer perceptron model selected sixteen genes through different feature selection techniques and also identified a combination of genes that caused the disease. From the background study, it is observed that CAPS2 and ANKUB1 are the major disease-causing genes but the accuracy of the model is less. The selected 16 genes along with CAPS2 and ANKUB1 produce more accuracy than the existing model which proved 95% prediction rate. The analysis of the proposed model shows that the combination of the predicted genes along with CAPS2 and ANKUB1 will help to identify autism at an early stage.
25.1 Introduction Autism or autism spectrum disorder (ASD) is a disorder associated with a broad range of conditions characterized by challenges with repetitive behaviors, social skills, speech, and nonverbal communication. Even though the cause of the disease is still mysterious and unknown, studies made throughout these years suggest that the origin of the disease can be concluded to be a complex mode of inheritance of genetic basis. An autism-affected person can be diagnosed with the disorder if they possess some ailment in the communication process and also correlative social communication. Such people exhibit restricted repetitive and stereotyped patterns of behaviors or interests prior to the age of 3 years. A higher functioning form of Autism was found in a certain group of children known as Asperger Syndrome. A. Joseph · P. K. Nizar Banu (B) Department of Computer Science, CHRIST (Deemed to be University), Bangalore, India e-mail: [email protected] A. Joseph e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 V. Bhateja et al. (eds.), Smart Intelligent Computing and Applications, Volume 1, Smart Innovation, Systems and Technologies 282, https://doi.org/10.1007/978-981-16-9669-5_25
ASD is more likely to be found in males than in females, with a ratio of 4:1, respectively. There has been an increasing trend worldwide in the rates of autism and its related disorders, which can be approximated as rising from 4 per 10,000 to 6 per 1000 children, and the rate of occurrence of the disease has been increasing tremendously over the years. At present, there is no cure for the core symptoms of autism; however, several groups of medications, including atypical neuroleptics, have been used to treat associated behaviors. The motive behind this study is that the actual cause of autism is still unknown. This paper starts with a brief introduction to ASD; Sect. 25.2 discusses the background study, Sect. 25.3 presents the methodology and model information, Sect. 25.4 presents the data analysis and experimental results, and Sect. 25.5 concludes the paper.
25.2 Background Study The degree of autism can range from mild to severe. Logistic regression analysis was used to examine the associations between risk of ASD and parental age (Reichenberg et al. 2006). A similar study was conducted on all singleton children born at KP from January 1, 1995, to December 31, 1999 [1]. Which concluded that there is no association between maternal age and offspring being autistic where as there is an association between paternal age and offspring being autistic. Self-efficacy is defined as an individual’s perceived ability to do things for themselves and be successful in their life. Shattuck, et al. [2] talks about the personal experiences of college students on the autism spectrum by examining disability identification and self-efficacy. Memari et al. [3] is basically emphasizing the importance of physical education among ASD children. Feng et al. [4] deals with the concept of using robots as a therapy tool for autistic children. This paper focuses on predicting autism during the prenatal stage itself. A similar study was conducted on all singleton children born at KP from January 1, 1995, to December 31, 1999. These children were compared with all 132,251-remaining singleton KP births. Relative risks (RRs) are estimated using proportional hazards regression models (Croen et al. 2007).
25.3 Methodology The paper mainly aims to develop a method to predict the chances of autism in the prenatal stage based on hereditary factors. The primary objective of this study is to detect the most predominant genes which cause autism, and the secondary objective is to determine the combination of a subset of genes causing the disease using a multi-layer perceptron neural network model.
25.3.1 Model Building A neural network model is used to find the relevant genes. The model is a multi-layer perceptron network, shown in Fig. 25.1, with ReLU as an activation function [5]. A perceptron is a very simple learning machine. It can take in a few inputs, each of which has a weight to signify how important it is, and generate an output decision of "0" or "1". However, when combined with many other perceptrons, it forms an artificial neural network [5]. A multilayer perceptron (MLP) is a perceptron that teams up with additional perceptrons, stacked in several layers, to solve complex problems, as shown in Fig. 25.1. An activation function is used to get the output of a node; it is also known as a transfer function [5]. ReLU (rectified linear unit) is a nonlinear activation function whose output lies between zero and positive values: for any negative input the function returns zero, and for any positive input it returns the value itself [6]. Adam is used as the optimization solver for the neural network algorithm (Oracle 2021). Adam is well suited for problems that are large in terms of data, parameters, or both (Oracle 2021); it is computationally efficient and requires only a little memory. The network is trained for 500 epochs (Fig. 25.2).
25.4 Data Analysis and Experimental Results Figure 25.2 represents the entire workflow of the study, starting from data collection to result analysis, which is explained in detail in the coming sections.
Fig. 25.1 Multi perceptron network
Fig. 25.2 Workflow of the proposed model
25.4.1 Phase I: Dataset Collection/Description and Pre-processing. The dataset is collected from the NCBI repository https://www.ncbi.nlm.nih.gov/sites/GDSbrowser?acc=GDS4431, which contains 146 observations (samples) and 54,613 genes (features). The observations are divided into two classes, a control class containing 69 observations and an autism class containing 77 observations [7]. After collecting the data, its dimension is reduced to 9454 genes by removing genes whose variance is more than 15% using the median ratio criterion and less than 15% using the mean ratio criterion. The reason for adopting this method is that, for higher-variance data, the median is a better criterion than the mean according to many researchers. After removing the irrelevant genes, the data is normalized. The data does not have a class label, so one is added using the label encoder technique as classes 0 and 1, where class zero is the non-autistic class and class one is the autistic class.
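As a rough illustration of this pre-processing phase, the variance-based gene filtering, normalization, and class labelling could be sketched as below; the file name, the quantile-based stand-in for the median/mean ratio criteria, and the scaler choice are assumptions rather than the paper's exact implementation.

```python
# Hedged sketch of Phase I: variance filtering, normalization, and labelling.
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# rows = 146 samples, columns = 54,613 gene probes (hypothetical file name)
expression = pd.read_csv("GDS4431_expression.csv", index_col=0)

# Keep genes whose variance lies inside a chosen band; the quantile cut-offs
# here stand in for the median/mean ratio criteria described in the text.
variances = expression.var(axis=0)
low, high = variances.quantile(0.15), variances.quantile(0.85)
filtered = expression.loc[:, (variances >= low) & (variances <= high)]

# Normalize each gene to [0, 1]
scaled = pd.DataFrame(MinMaxScaler().fit_transform(filtered),
                      index=filtered.index, columns=filtered.columns)

# Add the class label: 0 = non-autistic (69 controls), 1 = autistic (77 samples)
labels = np.array([0] * 69 + [1] * 77)
```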
25.4.2 Phase II: Model Building A multi-layer perceptron neural network model is used to train on the dataset. As a first step, the model is trained using all 9454 genes. Later, different feature selection techniques such as forward selection, feature importance, k-best feature selection, random forest, and Lasso were applied, which identified the best 16 genes. After selecting the best 16 genes, the model is trained and tested with them. The multi-layer perceptron model consists of 16 inputs and 3 hidden layers (shown in Fig. 25.1). Each hidden layer has 16 neurons. The initial weight specified is 0.1 for all neurons. Since it is a binary classification problem, the activation function used here is ReLU (rectified linear unit); this function outputs zero if the input is negative, or the input itself if it is positive. After training the model with the different gene subsets selected through the various feature selection techniques, the genes selected through the forward selection method gave the lowest error rate, 0.0018, and better accuracy; hence those 16 genes are considered the predominant genes.
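A minimal Keras sketch of the network just described (16 inputs, three hidden layers of 16 neurons each, ReLU activations, Adam optimizer, 500 epochs) is given below; the sigmoid output follows the abstract, and anything not stated in the text is an assumption.

```python
# Hedged sketch of the 16-input MLP used for the autism/non-autism decision.
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(16,)),              # the 16 selected genes
    layers.Dense(16, activation="relu"),
    layers.Dense(16, activation="relu"),
    layers.Dense(16, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # 0 = non-autistic, 1 = autistic
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(X_train, y_train, epochs=500, validation_split=0.2)
```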
Steps: the entire data is passed through the various feature selection methods; the selected genes are passed to the MLP model; the model is trained and tested; and the model with the lowest error rate and highest accuracy is selected. Table 25.1 shows the genes selected through the various feature selection methods. The steps of forward feature selection are shown graphically in Fig. 25.3. Figure 25.4 shows the first decision tree for the autism dataset: if the target value is less than or equal to a threshold, the sample is classified into the autistic class; if it is greater, it is classified into the non-autistic class. Lasso (Least Absolute Shrinkage and Selection Operator) is a powerful method that performs two main tasks: regularization and feature selection [9]. K-best [10]: SelectKBest is a univariate feature selection technique that selects features according to the k highest scores; it keeps the features with higher scores and removes the remaining ones [10]. Feature importance: the feature importance of each attribute of the dataset is calculated using the feature importance property of the model [11]. Table 25.2 gives the number of genes selected through the various methods. Using the different feature selection techniques (accuracies in Table 25.3), it is evident that the 16 features selected through forward selection gave the best accuracy. The error rate is high when the entire gene set is used, i.e., 0.7863, because the number of genes is far larger than the number of samples. From Tables 25.4 and 25.5 it is evident that precision, recall, and F1-score are highest for the model whose genes were selected through the forward selection technique. After feature selection this problem disappears and the error rate is reduced to 0.0018, which is very low, although most of the feature selection techniques had overfitting problems due to high variability and variance. From Fig. 25.5 it is visible that after a certain epoch the training and validation accuracy stop improving, hence the model has reached the best result it can attain. The features selected through forward selection gave the best output with an error rate of 0.0018, as shown in Fig. 25.5. Hameed et al. [7] used a combination of statistical filters and a GBPSO-SVM algorithm to select the best 10 genes related to autism. The results indicate that CAPS2 and ANKUB1 together increased the accuracy of the model to 92 and 95 for the non-autistic and autistic classes respectively, whereas using either of the genes individually decreases the accuracy of the model. Table 25.6 presents the comparative study.
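For illustration, the feature selection methods named above have scikit-learn counterparts; the sketch below shows SelectKBest, Lasso, and random forest importances, with k = 16 and the other parameter values as placeholders rather than the settings used in the paper.

```python
# Hedged sketch of selecting 16 candidate genes with three of the methods above.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import Lasso
from sklearn.ensemble import RandomForestClassifier

def top_genes(X, y, gene_names, k=16):
    # gene_names is assumed to be a NumPy array of probe identifiers
    kbest = SelectKBest(f_classif, k=k).fit(X, y)           # univariate k-best
    kbest_genes = gene_names[kbest.get_support()]

    lasso = Lasso(alpha=0.01).fit(X, y)                      # non-zero coefficients survive
    lasso_genes = gene_names[np.abs(lasso.coef_) > 0][:k]

    rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
    rf_genes = gene_names[np.argsort(rf.feature_importances_)[::-1][:k]]
    return kbest_genes, lasso_genes, rf_genes
```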
25.5 Conclusion From the above study we can say that the sixteen genes selected through the forward selection method provided good accuracy in identifying the 'Autistic' and 'Normal' classes respectively. The 16 genes, along with the combination of CAPS2 and ANKUB1 suggested by previous research, together increased the accuracy of the model. The forward selection method selected LOC283075, which was selected as the third most important gene by Hameed et al. [7], so we can conclude that
Table 25.1 Name of genes selected through various feature selection methods

S. No | Method | Selected features
1 | Random forest | FAHD2CP, PRDM10, UBAP1.1, GIGYF2, EVPLL, LOC101926921, SHISA7, MIR6800, ZMYM3, RSPRY1, GNMT, RPL38.1, THAP1, SLC25A5AS1, AGAP1, VEGFC
2 | Lasso regression | PPBP, MALAT1, MMP9, IGK.2, IGLC1, CXCL8, NCF1, NCF1C, HLA-DQB1, HBB.2, HBB.1, LOC100130872.1, CMTM2, TALDO1, MSRB1, HBA2.4
3 | Forward selection | H1FOO, NFE4, LOC283075, ZBTB43, HIST2H4B, NR6A1.1, CLEC1B, RNA45S5, ROBO2, BF511763, FBXO4, OR3A3, C4orf46, GOPC, KIN, BF056251
4 | K-best | LOC100130872.1, IL32, PPBP, OSBPL8, MALAT1, HLA-DQB1, TLR2, IFITM3, HBA2.4, CXCL8, CMTM2, PLAC8, RYBP, RPS6, TYROBP, SNORD14D.2
5 | Feature importance | REEP3, PIGM, RSF1.2, TMTC2.1, FAM224A, MIR646HG, EVPLL, BE467566, ZNF341, LOC374443, AW296081, 242390_at, N79601, NFAM1.1, AI467945, 243701_at
Fig. 25.3 Forward selection technique
Fig. 25.4 First decision tree for autism dataset [8]

Table 25.2 Number of features identified through various feature selection methods
S. No | Feature selection method | No. of selected genes
1 | Random forest | 16
2 | Lasso regression | 18
3 | Forward selection | 16
4 | K-best | 10
5 | Feature importance | 16
Table 25.3 Accuracy table of MLP model for various feature selection methods
S. No | Feature selection method | Accuracy: autistic | Accuracy: non-autistic
1 | Random forest | 55 | 54
2 | Lasso regression | 66 | 66
3 | Forward selection | 90 | 89
4 | K-best | 62 | 50
5 | Feature importance | 60 | 61
Table 25.4 Precision and recall table of MLP model for various feature selection methods
S. No | Feature selection method | Precision: autistic | Precision: non-autistic | Recall: autistic | Recall: non-autistic
1 | Random forest | 59 | 50 | 54 | 55
2 | Lasso regression | 68 | 63 | 72 | 60
3 | Forward selection | 91 | 82 | 83 | 90
4 | K-best | 42 | 70 | 50 | 58
5 | Feature importance | 67 | 54 | 50 | 70

Table 25.5 F1-score table of MLP model for various feature selection methods
S. No | Feature selection method | F1-score: autistic | F1-score: non-autistic
1 | Random forest | 57 | 52
2 | Lasso regression | 69 | 62
3 | Forward selection | 87 | 86
4 | K-best | 62 | 50
5 | Feature importance | 57 | 61
Fig. 25.5 Accuracy versus error plot
Table 25.6 Comparative study result (A = autistic class, N = non-autistic class)
S. No | Method | Accuracy A/N | Precision A/N | Recall A/N | F1-score A/N
1 | Proposed system selected genes + existing system selected genes | 89/93 | 92/95 | 96/90 | 94/92
2 | Proposed system selected genes + existing system selected feature ANKUB1 | 86/85 | 90/89 | 84/89 | 88/84
3 | Proposed system selected genes + existing system selected feature CAPS2 | 87/87 | 91/92 | 83/90 | 87/86
4 | Proposed system selected genes + existing system selected features ANKUB1 + CAPS2 | 93/90 | 85/94 | 96/80 | 90/86
this gene is also highly related to autism disorder. All the feature selection methods applied above selected at least one gene from the 'LOC' family, which leads to the conclusion that genes of this family are also correlated with the disorder. The selected 16 genes together with CAPS2 and ANKUB1 gave the best model, with higher accuracy than the existing model, at 95%. A limitation of the study is that, because we focused on accuracy, we considered only 18 genes out of the 9454 genes. Soft computing or optimization models can be applied for further reduction in the number of genes. Although the MLP achieved 95% accuracy, it can be further improved by fine-tuning the parameters with respect to the genes.
References 1. Saxena, A., Chahrour, M.: Autism spectrum disorder. In: Genomic and Precision Medicine, pp. 301–316. Academic Press (2017) 2. Shattuck, P.T., Steinberg, J., Yu, J., Wei, X., Cooper, B.P., Newman, L., Roux, A.M.: Disability identification and self-efficacy among college students on the autism spectrum. Autism Res. Treat. 2014 (2017) 3. Memari, A.H., Panahi, N., Ranjbar, E., Moshayedi, P., Shafiei, M., Kordi, R., Ziaee, V.: Children with autism spectrum disorder and patterns of participation in daily physical and play activities. Neurol. Res. Int. 2015 (2015) 4. Feng, Y., Jia, Q., Chu, M., Wei, W.: Engagement evaluation for autism intervention by robots based on dynamic Bayesian network and expert elicitation. IEEE Access 5, 19494–19504 (2017) 5. Brownlee, J.: How to choose an activation function for deep learning. Machine Learning Mastery (2021). Retrieved from http://machinelearningmastery.com/choose-an-activation-fun ction-for-deep-learning/ 6. Dansbecker.: Rectified linear units (ReLU) in deep learning (2018). Kaggle, Kaggle. Retrieved from http://www.kaggle.com/dansbecker/rectified-linear-units-relu-in-deep-lea rning. Learning database new features. Moved (2021). Retrieved from http://docs.oracle. com/en/database/oracle/oracle-database/21/nfcon/adam-optimization-solver-for-the-neuralnetwork-algorithm-274058466.html 7. Hameed, S.S., Hassan, R., Muhammad, F.F.: Selection and classification of gene expression in autism disorder: use of a combination of statistical filters and a GBPSO-SVM algorithm. PloS one, 12(11), e0187371 (2017) 8. Ververidis, D., Kotropoulos, C.: Fast and accurate sequential floating forward feature selection with the Bayes classifier applied to speech emotion recognition. Signal Process. 88(12), 2956– 2970 (2008) 9. Fonti, V.: Feature Selection Using Lasso, pp. 1–26. Victoria University, Amsterdam (2017) 10. Liang, R.: Feature selection using python for classification problem. Medium, toward data science (2019). Retrieved from http://towardsdatascience.com/feature-selection-using-pythonfor-classification-problem-b5f00a1c7028 11. Shaikh, R.: Medium (2018). Retrieved from http://towardsdatascience.com/feature-selectiontechniques-in-machine-learning-with-python-f24e7da3f36e 12. Reichenberg, A., Gross, R. Weiser, M., Bresnahan, M., Silverman, J., Harlap, S., Rabinowitz, J., Shulman, C., Malaspina, D., Lubin, G., Knobler, H.Y., Davidson, M., Susser, E.: “Advancing Paternal Age and Autism,” Archives of General Psychiatry, vol. 63, no. 9, p. 1026, (2006) 13. Croen, L. A., Najjar, D.V, Fireman, B., Grether, J.K.: “Maternal and paternal age and risk of autism spectrum disorders,” Arch. Pediatr. Adolesc. Med., vol. 161(4), pp. 334–340, (2007)
14. Adam Optimization Solver for the Neural Network Algorithm Retrieved from https://docs.ora cle.com/en/database/oracle/oracledatabase/21/nfcon/adam-optimization-solver-for-the-neu ral-network-algorithm-274058466.html
Chapter 26
Detecting Impersonators in Examination Halls Using AI A. Vishal, T. Nitish Reddy, P. Prahasit Reddy, and S. Shitharth
Abstract Detecting impersonators in examination halls is very significant for conducting examinations fairly. Recently there have been occasions of potential impersonators taking tests instead of the intended person. To overcome this issue, we require an efficient method with less manpower. With the advancement of machine learning and AI technology, we can overcome this issue. In this project, we develop an AI system in which images of students are saved and a model is built using a transfer learning process to get accurate results. If the student is an authorized one, the system shows the hall ticket number and name of the student; otherwise, an unknown tag is displayed.
26.1 Introduction As we all know, India is a developing country and the youth of the nation play a crucial role in the country's development, so it is very important to conduct exams fairly and honestly. Examinations are conducted to measure the knowledge of a student, and the result of an examination plays an important role in a student's professional career. So, the examinations should be conducted fairly. Conducting an examination is a hectic task for an educational institute, from distributing the hall tickets to verifying them at the time of examination. In real time, manual checks of hall tickets happen to verify whether the student is certified or an impersonator. Manual checks need more human power, and we cannot get maximum accuracy, as there have been instances of students who cheated in exams at a very crucial stage where a minor mistake can be costly. To avoid this we have proposed an AI system that detects impersonators in examination halls.
A. Vishal (B) · T. Nitish Reddy · P. Prahasit Reddy · S. Shitharth Department of Computer science and Engineering, Vardhaman College of Engineering and Technology, Hyderabad, Telangana, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 V. Bhateja et al. (eds.), Smart Intelligent Computing and Applications, Volume 1, Smart Innovation, Systems and Technologies 282, https://doi.org/10.1007/978-981-16-9669-5_26
26.2 Related Work In early 1994, Vaillant et al. [1] implemented neural networks for detecting faces. They formed a CNN to detect whether there is a face in an image window and explored the entire image at all locations. In 2002, Garcia et al. developed a system to detect semi-frontal human faces [2]; their system can also detect the face of a person in a fluctuating image in any environment. In another study, the creators [3] utilize Raspberry Pi, a Haar filter, and OpenCV for continuous face identification; the classifier utilized there is a Haar classifier. By checking head positions, SimpleCV and OpenCV are utilized for face identification, and the AdaBoost algorithm is used for face detection. The flow of this face detection method is as follows: starting with the webcam, capturing and storing pictures in the database, recognizing the face and confirming the face identity against the existing database, processing the detected face image (i.e., feature extraction), finding a match of the face in the existing database, and showing the corresponding result when the identity has been verified. A video frame running in a loop gathers the pictures from the camera; those pictures are stored in temporary storage and then used for grouping and identification in the execution of this strategy (Kumbhar 2017).
26.3 Existing System In the existing system, students are identified by the details present in their hall tickets to check whether the student is the intended person or not. Sometimes, because of being late to the examination hall, students rush into the test centers or test rooms, so identifying each of them becomes difficult. Hence, many of them enter the test room without being authenticated, so there are more chances of impersonators entering the examination hall without being verified. There have been instances of hall ticket morphing where manual checks could not play a vital role, which compromises the purpose of conducting an examination honestly. Recently, there were cases of students who did not get caught during the manual authentication of hall tickets. So there is a need for a system to be implemented to untangle this issue.
26.4 Proposed System To overcome the problem faced in the existing model, we propose an AI system for detecting impersonators in the examination hall. In the proposed system first,
images of each student are collected using the haarcascade_frontalface classifier [4, 5], which is used to crop the faces of the students and makes training our model easier. Each student is given a unique value at the time of creation of the hall tickets. Each dataset comprises 100 images of every student, and image augmentation [6] is then used to increase the number of images so that the accuracy of the model increases. We use the transfer learning concept, in which we take a pre-trained model that has already been trained on a large dataset (the ImageNet dataset). In our system, we use the mobilenet_v2 model [7] (a pre-trained model trained on millions of images), train it with our dataset using a CNN [8], and save the model in the system. We also use hyperparameter tuning [9, 10] to obtain the best-optimized model, which is used as the final AI system.
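As a rough illustration of this collection step, cropping the faces with OpenCV's haarcascade_frontalface classifier might look like the sketch below; the file paths, the 224 x 224 size, and the detection parameters are assumptions.

```python
# Hedged sketch of building the per-student face dataset with OpenCV.
import cv2, os

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def crop_faces(image_path, out_dir, size=(224, 224)):
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    os.makedirs(out_dir, exist_ok=True)
    for i, (x, y, w, h) in enumerate(faces):
        face = cv2.resize(img[y:y + h, x:x + w], size)   # cropped, resized face
        cv2.imwrite(os.path.join(out_dir, f"face_{i}.jpg"), face)
```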
26.4.1 Convolutional Neural Network CNN [8] is one of the deep learning techniques in which input images are moved through a sequence of layers known as a covnet, as shown in Fig. 26.1. Every layer transforms one volume to another through a differentiable function. A covnet consists of the following layers. Input layer: it contains raw images in the form of an image array; here we take an image array of size 224*224*3. Convolutional layers with filters: in this layer, we take image arrays as input and apply a dot product with filters to generate the output volume; this layer finds the outlines of the objects. Activation function layer: the output of the convolutional layer is taken as input and an element-wise activation function is applied; it is used to activate or deactivate a neuron, i.e., a pixel. We took the Rectified Linear Unit (ReLU) as the activation function because it does not activate all the neurons at a time. This layer will activate the
Fig. 26.1 Working of a covnet
Fig. 26.2 Implementation of MobileNet_V2 using CNN
neurons detected in the convolutional layer and deactivate the neurons that are not detected in the convolutional layer. Pooling layer: this layer reduces the size of the volume, making the computation faster by reducing memory, and prevents overfitting. Fully connected layer: it takes input from the previous layer, computes class scores, and flattens the volume into a one-dimensional array whose size is equal to the number of classes. Later, a Softmax classifier is applied to classify the object with scores between 0 and 1. In real time, a camera will be placed at the entrance of every examination hall. Whenever a student enters the classroom, our model detects the face of the student through the camera and compares it with the dataset. Then an output value is generated; if it is a known value, the name and hall ticket number of the student are displayed, otherwise an unknown tag is displayed (Fig. 26.2).
26.4.2 Transfer Learning Transfer learning is a method where a model is trained for one particular task and is used for other similar tasks as a starting point. It is a process that uses a pre-trained model that was trained for a dataset and is used in predicting another dataset. The advantage of using Transfer learning is that it reduces the training time of a model as well as the errors associated with the prediction (Fig. 26.3).
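A hedged sketch of such a transfer learning setup with Keras is shown below: a frozen MobileNetV2 base pre-trained on ImageNet plus a new classification head with one output per enrolled student; the head size and training settings are assumptions.

```python
# Hedged sketch of the MobileNetV2 transfer learning model.
from tensorflow import keras
from tensorflow.keras import layers

num_students = 3  # e.g. the three students used in the experiments

base = keras.applications.MobileNetV2(input_shape=(224, 224, 3),
                                      include_top=False, weights="imagenet")
base.trainable = False                    # keep the pre-trained ImageNet features

model = keras.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(128, activation="relu"),
    layers.Dense(num_students, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```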
26.4.3 Hyperparameter Tuning Hyperparameter tuning is performed using Keras Tuner [11], which treats the search as a meta-optimization task: each trial involves training a model with an internal optimization method for a specific hyperparameter setting.
Fig. 26.3 Working of transfer learning
Fig. 26.4 Implementation of hyperparameter tuning
The best hyperparameter setting is the outcome of hyperparameter tuning as shown in Fig. 26.4, and the best model parameter setting is the outcome of model training. The hyperparameter tuner outputs the setting which yields the best performing model after evaluating a number of hyperparameter settings.
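For illustration, a Keras Tuner search over a couple of hyperparameters might be set up as in the sketch below; the searched ranges and the tuner type are assumptions, not the settings used in the project.

```python
# Hedged sketch of hyperparameter tuning with Keras Tuner.
import keras_tuner as kt
from tensorflow import keras
from tensorflow.keras import layers

def build_model(hp):
    base = keras.applications.MobileNetV2(input_shape=(224, 224, 3),
                                          include_top=False, weights="imagenet")
    base.trainable = False
    model = keras.Sequential([
        base,
        layers.GlobalAveragePooling2D(),
        layers.Dense(hp.Int("units", 64, 256, step=64), activation="relu"),
        layers.Dense(3, activation="softmax"),
    ])
    lr = hp.Choice("learning_rate", [1e-2, 1e-3, 1e-4])
    model.compile(optimizer=keras.optimizers.Adam(lr),
                  loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    return model

tuner = kt.RandomSearch(build_model, objective="val_accuracy", max_trials=5)
# tuner.search(train_ds, validation_data=val_ds, epochs=10)
# best_model = tuner.get_best_models(1)[0]
```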
26.5 Results and Discussion Validation: the proposed approach has been evaluated by measuring the precision, recall, and accuracy metrics of the face detection model and our MobileNetV2 classifier.
Fig. 26.5 Training loss versus validation loss
26.5.1 Training Loss Versus Validation Loss Loss is the result of a poor forecast; its value reflects a model's performance. Our model attained a loss of 3.2509e−08, which is a tiny value. The blue line in Fig. 26.5 indicates training loss, whereas the yellow line depicts validation loss. As shown in Fig. 26.5, the difference between validation loss and training loss is minimal, indicating that our model has been properly trained. The training loss was initially 0.001, but as the training progressed, it was reduced to 0. Later on, the training loss rose slightly, which could be due to image quality or an image change. Finally, because the model had already been trained on most of the images in the dataset, the training loss remained close to zero.
26.5.2 Training Accuracy Versus Validation Accuracy Accuracy is one of the measures used to evaluate the algorithm's performance. In the figure below, the blue line shows the training accuracy, whereas the yellow line indicates validation accuracy. As shown in Fig. 26.6, as the number of epochs increases, the accuracy of the model increases from 0.998 to 1. The gap between training and validation accuracy is very small; hence, we can say that our model is neither overfitted nor underfitted.
Fig. 26.6 Training accuracy versus value accuracy
26.5.3 Verification Verification can be done in two ways: directly with photos or through a video stream. For verification, we sent the student’s photographs to the model, which verifies them.
26.5.3.1 Verification of Authorized Image
As shown in Fig. 26.7, we gave the student’s photographs that are present in the data set to the model, and we were correct in our prediction.
26.5.3.2 Verification of Unauthorized Image
We gave the model a student photograph that is not present in the dataset, and our prediction was correct, as shown in Fig. 26.8.
26.5.3.3 Verification of Authorized Image Through Video Stream
In verification, we have taken input from the video stream and for the authorized students, our model predicted correctly as shown in Fig. 26.9.
Fig. 26.7 Authorized student after verification procedure
Fig. 26.8 Unauthorized student after verification procedure
26.5.3.4 Verification of Unauthorized Image Through Video Stream
In verification, as shown in Fig. 26.10, we took input from the video stream, and for unauthorized students, i.e., those not present in the dataset, our model predicted correctly.
Fig. 26.9 Verification of known person in video stream
26.5.4 Confusion Matrix The confusion matrix shown in Fig. 26.11 compares the actual and predicted values. Our model predicted 1478 'Nitish' images, 1479 'Prahasit' images, and 1479 'Vishal' images, and there are no wrong predictions.
26.6 Conclusion and Future Scope 26.6.1 Conclusion In recent times, because of an upsurge in cases of impersonators in examinations, there is a need to develop a system that can detect impersonators and prevent them from taking examinations. Hence, we have developed this model to reduce malpractices and improve the quality of examinations. In this project, we adopted the mobilenet_v2 model, which gives greater efficiency than the other models.
Fig. 26.10 Verification of unknown person in video stream
26.6.2 Future Scope In the near future, with this model, we will try to reduce the manpower needed for cross-checking hall tickets. In our present system, the model identifies impersonators and a person is required to alert the higher authorities. In the future, our model will alert the chief examiner directly when an impersonator is found.
Fig. 26.11 Confusion Matrix
References 1. Vaillant, R., Monrocq, C., Le Cun, Y.: Original approach for the localisation of objects in images. In: IEE Proceedings Vision, Image and Signal Processing (1994) 2. Garcia, C., Delakis, M.: A neural architecture for fast and robust face detection. In: Proceedings of the 16th International Conference on Pattern Recognition, 2002 (2002) 3. Kumbhar, P.Y., Dhere, S.: Utilize Raspberry Pi, Haar Filter, and OpenCV for continuous face identification (2017) 4. Selvarajan, S., Shaik, M., Ameerjohn, S., Kannan, S.: Mining of intrusion attack in SCADA network using clustering and genetically seeded flora based optimal classification algorithm. Inf. Secur. IET. 14(1), 1–11 (2019) 5. Shitharth, S., Prince Winston, D.: An enhanced optimization algorithm for intrusion detection in SCADA network. J. Comput. Secur. 70, 16–26 (2017) 6. Image augmentation using keras: https://blog.keras.io/building-powerful-image-classificationmodels-using-very-little-data.html 7. MobileNetV2: https://towardsdatascience.com/review-mobilenetv2-light-weightmodelimage-classification-8febb490e61c 8. CNN: https://medium.com/@RaghavPrabhu/understanding-ofconvolutional-neural-networkcnn-deep-learning-99760835f148 9. Shitharth, S., Winston, D.P.: A new probabilistic relevancy classification (PRC) based intrusion detection system (IDS) for SCADA network. J. Electr. Eng. 16(3), 278–288 (2016) 10. Sangeetha, K., Venkatesan, S., Shitharth, S.: Security appraisal conducted on real time SCADA dataset using cyber analytic tools. Solid State Technol. 63 (1), 1479–1491 (2020)
11. Keras-tuner: https://www.tensorflow.org/tutorials/keras/keras_tuner 12. Real Time Face Detection and Tracking Using OpenCV. https://ijrest.net/downloads/volume4/issue-4/pid-ijrest-44201715.pdf 13. TransferLearning: https://machinelearningmastery.com/how-to-usetransfer-learning-whendeveloping-convolutional-neural-networkmodels/ 14. Hyperparameter Tunning: https://towardsdatascience.com/hyperparameter-tuning-c5619e 7e6624 15. Keras: https://keras.io/api/applications/
Chapter 27
Telugu Text Classification Using Supervised Machine Learning Algorithm G. V. Subba Raju, Srinivasu Badugu, and Varayogula Akhila
Abstract We live in a world where knowledge is extremely valuable, and the amount of information available in text documents has grown to the point, where it is difficult to find those that are important to us. As a result, language-based classification of text documents is important. Telugu is one of the morphologically rich Dravidian languages. Since there are many Telugu documents available on the Internet, it is important to organize the data by automatically assigning a collection of documents into predefined labels based on their content using modern techniques. On the basis of the Telugu corpus, we proposed Telugu text document classification using a variety of machine learning algorithms and feature extraction techniques. We gathered 1990 documents from an online newspaper, divided into three categories: cinema (467), sports (839), and politics (684). In this paper, we used the N-gram feature extraction method to apply the naive Bayes (NB) classifier and the one-hot encoding vectorization method to apply multinomial naive Bayes (MNB), support vector machine(SVM), and logistic regression (LR). We used 1990 documents to extract uni-gram and bi-gram features and 120 unseen documents to test a naive Bayes classifier for the n-gram approach, and we got 99% accuracy in uni-gram and 97% accuracy in bi-gram. We used 1375 documents (70%) for training and 597 documents (30%) for testing to construct a one-hot encoding vector (based on the size of the vocabulary). For classification, we used the multinomial naive Bayes, support vector machine, and logistic regression algorithms. MNB provides 98% accuracy, SVM provides 99% accuracy, and logistic regression provides 98% accuracy.
27.1 Introduction Text classification is the process of assigning tags or categories to text based on its content. It plays a fundamental role in natural language processing (NLP), with applications ranging from sentiment analysis to topic identification, spam detection, and purpose detection [1]. Text can be a tremendously rich source of information G. V. S. Raju · S. Badugu (B) · V. Akhila Stanley College of Engineering and Technology for Women, Hyderabad, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 V. Bhateja et al. (eds.), Smart Intelligent Computing and Applications, Volume 1, Smart Innovation, Systems and Technologies 282, https://doi.org/10.1007/978-981-16-9669-5_27
[2], but extracting insights from it can be challenging and time-consuming due to its unstructured nature. Businesses use text classification to quickly and cost-effectively arrange text in order to improve decision-making and automate processes [3]. Telugu is one of the morphologically [4, 5] rich Dravidian languages. It, like other languages, has polysemous terms that have different meanings depending on the context. There are no suitable resources for Telugu text processing [6, 7]. We will not be able to use current text processing software directly. Some methods may be used with minor changes.
27.2 Related Work We talked about document classification in Indian languages in this section. There was plenty of work for English, and there was also a lot of research on Indian languages [8]. Patil et al. [9] used SVM with TF IDF features for Marathi, while Patil et al. [10] used KNN, naive Bayes, centroid-based with dictionary features. For Bangla document categorization, Ankit et al. [11] applied term association and term aggregation, whereas Dhar et al. [12] adopted distance-based techniques. For Tamil, Rajan et al. [13] used vector space and an ANN model. Murthy et al. [14] classified Telugu documents using different term weights. Murthy [2] classified Telugu NEWS articles using naive Bayes. Jayashree [15] utilized bag of words to classify Kannada texts, while Deepamala [16] applied several preprocessing approaches to classify Kannada Web-pages. Sarmah [17] used Assamese Word-net for Assamese. Naidhi [18] adopted ontology for Punjabi. Ali [19] applied SVM for Urdu. For Indian languages, Raghuveer [20] and Swamy [21] implemented machine learning methods. Tummalapalli [22] focused on languages with a lot of morphology.
27.3 Proposed Approach On the Telugu corpus, we presented Telugu text document classification utilizing several machine learning algorithms and feature extraction techniques. We used the n-gram technique to apply the naive Bayes classifier. The vectorization methodology was utilised with multinomial naive Bayes, support vector machine, and logistic regression. We adopted uni-gram and bi-gram in the n-gram, and one-hot encoding in the vectorization model (Fig. 27.1).
27.4 Implementation For implementing this project, we have used python 3.7 and we have implemented the vectorization and naive Bayes classification algorithm from scratch without use
Fig. 27.1 System flow diagram
of any predefined packages. For this project, we collected the corpus from online Telugu newspapers. The collected corpus is divided into cinema, sports, and politics; our corpus statistics are shown in Table 27.1. In the implementation process, the first step is preprocessing, next is feature selection using different feature selection approaches, and using the selected features we create document vectors with the help of one-hot encoding. Finally, we apply different classification algorithms on the created vectors and compare the results. The next sections explain each module clearly. Figure 27.2 indicates the preprocessing steps. The first step in preprocessing is tokenization: we tokenized our documents using space as the delimiter, and based on spaces, we divided each document into tokens.

Table 27.1 Statistics from the corpus
Class type | #Documents
Cinema | 467
Sports | 839
Politics | 684
Fig. 27.2 Flow chart for preprocessing
Next, we applied some corpus cleaning techniques using patterns with the help of regular expressions. Table 27.2 lists some of the Telugu Unicode characters removed as noise.

Table 27.2 Telugu noise character Unicode values
S. No. | Character | Unicode
1 | Comma | u002C
2 | Full stop | u002E
3 | Left parenthesis | u0028
4 | Right parenthesis | u0029
5 | Exclamation mark | u0021
6 | Quotation mark | u0022
7 | Apostrophe | u0027
8 | Space | u0020
9 | Colon | u003A
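A small sketch of the tokenization and regular-expression cleaning described above is given below, assuming each document is a raw Telugu string; the punctuation set mirrors Table 27.2.

```python
# Hedged sketch of preprocessing: noise removal and space-based tokenization.
import re

NOISE = "\u002C\u002E\u0028\u0029\u0021\u0022\u0027\u003A"   # , . ( ) ! " ' :
NOISE_RE = re.compile("[" + re.escape(NOISE) + "]")

def preprocess(document):
    cleaned = NOISE_RE.sub(" ", document)    # strip the noise characters
    return cleaned.split()                   # tokens delimited by spaces

# Example: preprocess("raamuDu baDiki veLLaaDu.") -> ['raamuDu', 'baDiki', 'veLLaaDu']
```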
After preprocessing, we created uni-grams, bi-grams, and a vocabulary based on term frequency. Using the uni-grams and bi-grams, we estimated the prior probability and likelihood for each feature, and from these we estimated the posterior probability using the naive Bayes algorithm. Flow diagrams for feature selection are shown in Figs. 27.3 and 27.4. Table 27.3 illustrates an example of n-gram feature selection, and Table 27.4 gives the n-gram statistics. There are 1990 documents, 343,195 word tokens, and 66,158 different word types (unique words) in our database. The naive Bayes workflow is depicted in Fig. 27.5. For vectorization, we used a one-hot encoding approach: we selected words based on their occurrence in the corpus, and this set is called our vocabulary. The size of the vocabulary is 5380 words/types, based on word types occurring in the corpus between 10 and 500 times, so the length of each vector (its number of features) is 5380, and it is a sparse vector. We converted each document into a document vector. After
Fig. 27.3 Work flow of n-gram-based feature selection
Fig. 27.4 Flow chart of one-hot encoding
Table 27.3 Uni-gram and bi-gram examples
Documents: D1: raamuDu baDiki veLLaaDu (Rama went to school); D2: raamuDu iMTiki veLLaaDu (Rama went to home)
Uni-gram example: raamuDu (Rama); baDiki (to school); veLLaaDu (went); iMTiki (to home); veLLaaDu (went)
Bi-gram example: * raamuDu (Rama); raamuDu (Rama), baDiki (to school); baDiki (to school), veLLaaDu (went); raamuDu (Rama), iMTiki (to home); iMTiki (to home), veLLaaDu (went); veLLaaDu (went) *
In the n-gram technique, an asterisk denotes an empty word.
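The n-gram counting and the class priors and likelihoods used by the naive Bayes step can be sketched as below; this is a minimal illustration with add-one smoothing and simple data structures, which are assumptions rather than the exact implementation.

```python
# Hedged sketch of uni-gram/bi-gram features and naive Bayes classification.
from collections import Counter
import math

def ngrams(tokens, n):
    padded = ["*"] * (n - 1) + tokens + ["*"] * (n - 1)     # '*' marks empty words
    return [tuple(padded[i:i + n]) for i in range(len(padded) - n + 1)]

def train_nb(docs_by_class, n=1):
    priors, counts, totals = {}, {}, {}
    ndocs = sum(len(d) for d in docs_by_class.values())
    for label, docs in docs_by_class.items():
        priors[label] = len(docs) / ndocs                    # prior probability
        counts[label] = Counter(g for tok in docs for g in ngrams(tok, n))
        totals[label] = sum(counts[label].values())
    return priors, counts, totals

def classify(tokens, priors, counts, totals, n=1):
    vocab = len(set().union(*counts.values()))
    def score(label):                                        # log posterior
        s = math.log(priors[label])
        for g in ngrams(tokens, n):
            s += math.log((counts[label][g] + 1) / (totals[label] + vocab))
        return s
    return max(priors, key=score)
```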
Table 27.4 N-gram statistics on corpus
S. No. | Class name | #Files | #Uni-gram tokens | #Uni-gram types | #Bi-gram tokens | #Bi-gram types
1 | Cinema | 467 | 9,875,649 | 21,146 | 29,151,541 | 62,423
2 | Sports | 684 | 20,812,068 | 30,427 | 65,683,836 | 96,029
3 | Politics | 839 | 241,548,120 | 28,790 | 87,232,508 | 103,972
vectorization, we tested machine learning algorithms like SVM, NB, and logistic regression on document vectors for classification. Figure 27.6 depicts the entire classification technique utilizing the one-hot encoding vectorization method.
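A minimal sketch of the one-hot document vectors and the three classifiers named above, using scikit-learn, is given below; the 70:30 split follows the text, while the classifier settings and helper names are assumptions.

```python
# Hedged sketch of one-hot vectorization and MNB/SVM/LR classification.
import numpy as np
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def one_hot(tokens, vocab_index):
    vec = np.zeros(len(vocab_index))
    for t in tokens:
        if t in vocab_index:
            vec[vocab_index[t]] = 1            # presence of a vocabulary word
    return vec

def run(documents, labels, vocabulary):
    vocab_index = {w: i for i, w in enumerate(vocabulary)}
    X = np.array([one_hot(doc, vocab_index) for doc in documents])
    X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.3, random_state=0)
    for name, clf in [("MNB", MultinomialNB()),
                      ("SVM", SVC(kernel="linear")),
                      ("LR", LogisticRegression(max_iter=1000))]:
        clf.fit(X_tr, y_tr)
        print(name, clf.score(X_te, y_te))
```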
27.5 Result Analysis The outputs of the classifiers are tabulated, compared, and presented in this section using confusion matrices and graphs. First, we tested the naive Bayes classification algorithm using uni-gram and bi-gram features. The sizes of the training and test document sets are shown in Table 27.5, which also shows that the proportion of training documents is 24% cinema, 42% sports, and 34% politics. From the training documents, we retrieved uni-grams and bi-grams, and we used 120 previously unseen documents (40 per class) for testing. After conducting the experiment on the test documents, we tabulated the results using confusion matrices and classification performance measurements. The tables below show the results of the naive Bayes classifier using uni-gram and bi-gram features (Table 27.6). Table 27.7 and Fig. 27.7 show the precision, recall, and F1-score results of the uni-gram and bi-gram features using the naive Bayes classifier.
Fig. 27.5 Work flow of naïve Bayes classifier
Finally, we observed that the uni-gram results are higher than the bi-gram results because the number of bi-gram features is very large compared to uni-grams, so selecting the best features from the bi-grams could give better results. For vectorization, we selected different vocabulary sizes, since the performance of the classification algorithms depends on the vocabulary size. First, we selected 8026 words as the vocabulary, created document vectors based on it, and tested the different algorithms. We split the documents into training and testing sets with a 70:30 ratio; Table 27.8 shows the statistics. We used the multinomial naive Bayes (MNB), support vector machine (SVM) with a linear kernel, and logistic regression (LR) algorithms for classification. The results of the classification algorithms with the two vocabulary sizes are shown in Tables 27.9, 27.10, 27.11, and 27.12. With the 5380-word vocabulary, MNB's results are superior to those of SVM and LR (Figs. 27.8 and 27.9), whereas with the larger 8026-word vocabulary SVM's results are higher than those of MNB and LR; as the vocabulary size increased, SVM's performance was good, while with the smaller vocabulary MNB performed well. Finally, we compared the algorithms with the different features: Table 27.13 and Fig. 27.10 show the final results of the classification algorithms with different features.
Fig. 27.6 Flow of classification algorithm

Table 27.5 Statistics on training and testing documents
S. No. | Type | #Documents in training | #Documents in testing
1 | Cinema | 467 | 40
2 | Sports | 839 | 40
3 | Politics | 684 | 40
Total | | 1990 | 120
Table 27.6 Uni-gram and bi-gram confusion matrix
Uni-gram confusion matrix:
 | C | S | P | Sum
Cinema (C) | 40 | 0 | 0 | 40
Sports (S) | 0 | 40 | 0 | 40
Politics (P) | 1 | 0 | 39 | 40
Sum | 41 | 40 | 39 | 120

Bi-gram confusion matrix:
 | C | S | P | Sum
Cinema (C) | 40 | 0 | 0 | 40
Sports (S) | 1 | 37 | 2 | 40
Politics (P) | 0 | 0 | 40 | 40
Sum | 41 | 37 | 42 | 120
Table 27.7 Naïve Bayes accuracy using uni-gram and bi-gram
 | Precision (%) | Recall (%) | F1-score (%)
Uni-gram | 99 | 99 | 98
Bi-gram | 97 | 97 | 96
Fig. 27.7 Naïve Bayes classifier accuracy using n-gram features

Table 27.8 Training and testing data statistics
#Documents | Size of the vocabulary | #Training documents | #Testing documents
1990 | 5380 | 1393 | 597
1990 | 8026 | 1393 | 597
Table 27.9 Classification result with 5380 vocabulary size
MNB confusion matrix:
 | C | S | P | Sum
Cinema (C) | 125 | 0 | 1 | 126
Sports (S) | 1 | 248 | 4 | 218
Politics (P) | 2 | 0 | 216 | 253
Sum | 128 | 221 | 248 | 597

SVM confusion matrix:
 | C | S | P | Sum
Cinema (C) | 121 | 0 | 5 | 126
Sports (S) | 0 | 248 | 5 | 253
Politics (P) | 11 | 3 | 204 | 253
Sum | 127 | 218 | 252 | 597

LR confusion matrix:
 | C | S | P | Sum
Cinema (C) | 124 | 1 | 1 | 126
Sports (S) | 0 | 248 | 5 | 218
Politics (P) | 4 | 2 | 212 | 253
Sum | 127 | 218 | 251 | 597
Table 27.10 Accuracy of machine learning models using one-hot encoding vectorization with 5380 vocabulary size
 | Precision (%) | Recall (%) | F1-score (%)
MNB | 98 | 99 | 99
SVM | 95 | 96 | 96
LR | 98 | 98 | 98
Table 27.11 Confusion matrix for classification algorithms with 8026 vocabulary size MNB confusion matrix C Cinema (C)
S
P
Sum
C
0
126
125
0
217
1
218
1
5
247
253
1
222
248
597
127
126
0
Sports (S)
0
Politics (P)
1 127
Sum
SVM confusion matrix S
P
LR confusion matrix
Sum
C
1
126
125
S 0
P
213
4
218
1
5
247
253
1
218
252
597
127
Sum 1
126
215
2
218
4
248
253
218
251
597
Table 27.12 Accuracy of machine learning models using one-hot encoding vectorization with 8026 vocabulary size
 | Precision (%) | Recall (%) | F1-score (%)
MNB | 98 | 98 | 98
SVM | 99 | 99 | 99
LR | 98 | 98 | 98
Fig. 27.8 One-hot encoding with 5380 feature words improves the accuracy of classification algorithms
27.6 Conclusion and Future Work We have successfully implemented and tested classification algorithms using different document features. The system has the capability to classify a given new document into a predefined category. We were able to achieve satisfactory results based
Fig. 27.9 One-hot encoding with 8026 features words improves the accuracy of classification systems
Table 27.13 Classification result based on word-based features and one-hot encoding
 | NB-Uni-gram (%) | NB-Bi-gram (%) | MNB (%) | SVM (%) | LR (%)
Accuracy | 99 | 97 | 98 | 99 | 98
Classification Algorithms
Fig. 27.10 Classification result based on word-based features and one-hot-encoding
on our training data. For the n-gram approach, we used 1990 documents for extracting uni-grams and bi-grams and tested on 120 unseen documents using the naive Bayes classifier; we got the best accuracy with uni-grams. We created one-hot encoding vectors (based on the size of the vocabulary) using 1375 documents (70%) for training and 597 documents (30%) for testing, and then applied multinomial naive Bayes, support vector machine, and logistic regression. Based on the above results, we observed that the uni-gram approach using the naive Bayes classifier gives the highest accuracy (99%), and one-hot encoding using SVM gives the highest accuracy (99%) with the 8026-word vocabulary. We conclude that the vector model is good when the corpus is very large and the word-based (n-gram) model is good for a small corpus, but the vocabulary must be balanced in both cases. Our future work includes exploring other classification algorithms on
a much more diverse dataset with different machine learning techniques. Boosting may also be considered for improving the performance further.
References 1. Kaur, J., Saini, J.R.: A study of text classification natural language processing algorithms for Indian languages. VNSGU J. Sci. Technol. 4(1), 162–167 (2015) 2. Murthy, K.N.: Automatic categorization of Telugu news articles. Department of Computer and Information Sciences (2003) 3. Jayashree, R., Srikanta, M.K.: An analysis of sentence level text classification for the Kannada language. In: 2011 International Conference of Soft Computing and Pattern Recognition (SoCPaR). IEEE (2011) 4. Badugu, S.: Morphology based POS tagging on Telugu. Int. J. Comput. Sci. Issues (IJCSI) 11(1), 181 (2014) 5. Srinivasu, B., Manivannan, R.: Computational morphology for Telugu. J. Comput. Theor. Nanosci. 15(6–7), 2373–2378 (2018) 6. Badugu, S.: Telugu movie review sentiment analysis using natural language processing approach. In: Data Engineering and Communication Technology. Springer, Singapore, pp. 685– 695 (2020) 7. Rao, P.V.P.: Recall oriented approaches for improved Indian language information access. Language Technologies Research Centre International Institute of Information Technology Hyderabad-500 32 (2009) 8. Islam, M., Jubayer, F.E.M., Ahmed, S.I.: A comparative study on different types of approaches to Bengali document categorization. arXiv preprint arXiv:1701.08694 (2017) 9. Patil, J.J., Bogiri, N.: Automatic text categorization: Marathi documents. In: 2015 International Conference on Energy Systems and Applications, pp. 689–694. IEEE (2015) 10. Patil, M., Game, P.: Comparison of Marathi text classifiers. Int. J. Inf. Technol. 4(1), 11 (2014) 11. Dhar, A., et al.: Performance of classifiers in Bangla text categorization. In: 2018 International Conference on Innovations in Science, Engineering and Technology (ICISET). IEEE (2018) 12. Dhar, A., Dash, N., Roy, K.: Classification of text documents through distance measurement: an experiment with multi-domain Bangla text documents. In: 2017 3rd International Conference on Advances in Computing, Communication and Automation (ICACCA) (Fall). IEEE (2017) 13. Rajan, K., et al.: Automatic classification of Tamil documents using vector space model and artificial neural network. Expert Syst. Appl. 36(8), 10914–10918 (2009) 14. Murthy, V.G., et al.: A comparative study on term weighting methods for automated Telugu text categorization with effective classifiers. Int. J. Data Mining Knowl. Manage. Process 3(6), 95 (2013) 15. Jayashree, R., Murthy K.S.: An analysis of sentence level text classification for the Kannada language. In: 2011 International Conference of Soft Computing and Pattern Recognition (SoCPaR). IEEE (2011) 16. Deepamala, N., Ramakanth Kumar, P.: Text classification of Kannada webpages using various pre-processing agents. In: Recent Advances in Intelligent Informatics. Springer, Cham, pp. 235–243 (2014) 17. Sarmah, J., Saharia, N., Sarma, S.K.: A novel approach for document classification using Assamese wordnet. In: 6th International Global Wordnet Conference (2012) 18. Nidhi, V.G.: Domain based classification of Punjabi text documents. In: Proceedings of COLING (2012) 19. Ali, A.R., Maliha I.: Urdu text classification. In: Proceedings of the 7th International Conference on Frontiers of Information Technology (2009) 20. Raghuveer, K., Murthy, K.N.: Text categorization in Indian languages using machine learning approaches. In: IICAI (2007)
21. Swamy, M.N., Hanumanthappa, M., Jyothi, N.M.: Indian language text representation and categorization using supervised learning algorithm. In: 2014 International Conference on Intelligent Computing Applications. IEEE (2014) 22. Tummalapalli, M., Chinnakotla, M., Mamidi, R.: Towards better sentence classification for morphologically rich languages. In: Proceedings of the International Conference on Computational Linguistics and Intelligent Text Processing (2018)
Chapter 28
Application of Hybrid MLP-GWO for Monthly Rainfall Forecasting in Cachar, Assam: A Case Study Abinash Sahoo and Dillip Kumar Ghose
Abstract Rainfall being one of the key components of the hydrological cycle contributes significantly to assessing flood and drought events. Forecasting rainfall events is vital in field of hydrology and meteorology. A wide range of practical problems has been resolved utilizing multilayer perceptron (MLP). Optimization algorithms assist neural networks in selecting appropriate weights and obtains accurate results. In this study, grey wolves optimization (GWO) meta-heuristic algorithm is used for training MLP for improving accurateness of rainfall forecasting in one rain-gauge station (Silchar) of Cachar district, Assam, India. Performance of hybrid MLP-GWO algorithm is assessed against conventional MLP model using root mean square error (RMSE), coefficient of determination (R2 ) and Nash–Sutcliffe efficiency (NSE). Input parameters such as monthly average temperature, relative humidity, and rainfall data are considered for a time period of 1980–2019 for rainfall forecasting. Results showed that MLP-GWO3 with R2 —0.9816, RMSE—38.54, and NSE—0.985 presented the most accurate forecasting in Silchar station. Final results specified that GWO algorithm improved the accurateness of standalone MLP model and can be suggested for forecasting monthly rainfall.
28.1 Introduction Rainfall is a vital component of the hydrological cycle playing a substantial part in meeting water requirements. Hence it makes extremely essential for having a comprehensive water resource management strategy. So as to develop a comprehensive and feasible strategy, one must have an appropriate understanding of the future. Necessity of rainfall forecasting is undeniable, and therefore, investigators have been trying of making progress in forecasting frameworks. Generally, rainfall is influenced by several environmental aspects, like temperature, relative humidity, and prevailing wind speed. Such complicated physical mechanisms make rainfall forecasting very challenging. Subsequently, because of complex and dynamic variations within the A. Sahoo (B) · D. K. Ghose Department of Civil Engineering, National Institute of Technology, Silchar, Assam, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 V. Bhateja et al. (eds.), Smart Intelligent Computing and Applications, Volume 1, Smart Innovation, Systems and Technologies 282, https://doi.org/10.1007/978-981-16-9669-5_28
atmosphere and the real-time necessity to forecast rainfall, a high-precision and largescale forecasting model is required possessing a bigger task in the field of hydrology and meteorology. In recent decades, the application of artificial intelligence which includes artificial neural networks (ANNs), fuzzy logic [1, 2] and meta-heuristics [3–5], have become common in management of complex real-life problems. ANNs are the most effective learning tools presently accessible to estimate unknown functions [6–8]. The most widely and best-known utilized topology is MLP. El-Shafie et al. [9] applied MLP, input delay NN and radial basis function network to forecast rainfall at River Klang, Malaysia. Abbot and Marohasy [10] used ANN models for forecasting rainfall on a monthly and seasonal basis in Queensland, Australia. Mekanik et al. [11] applied multiple regression and MLP for forecasting seasonal precipitation in Victoria, Australia. Samantaray et al. [8] applied different ANNbased techniques for studying rainfall forecasts in Bolangir district, Odisha, India. Liu et al. [12] studied and applied many algorithms related to NNs for predicting precipitation in Darjeeling, West Bengal, India. Zhang et al. [13] implemented MLP and support vector machine (SVM) to predict maximum rainfall in non-monsoon seasons and annually. Several investigators have proposed optimization algorithms for training MLPs [14, 15]. However, conventional methods commonly face complications to solve optimization problems in real world, likely requiring a large extent of memory, large computational time, generating poor quality solutions and sometimes becoming stuck in local optima. For overcoming these complications, meta-heuristic algorithms have been increasingly used for training MLP like genetic algorithm (GA) [16], particle swarm optimization (PSO) [17], bat algorithm (BA) [18] and ant colony optimization (ACO) [19]. In this paper, robust GWO algorithm is applied to train MLP network. Jaddi and Abdullah [20] studied the potential of integrated MLP-KA (kidney-inspired algorithm) model in rainfall forecasting in Selangor state of Malaysia. Claywell et al. [21] used simple MLP, ANFIS, and an integrated MLP-GWO model for predicting solar diffuse fraction (DF) in Almeria, Spain. Findings from their study indicated that MLP-GWO model gave high performance in training as well as testing periods followed by simple ANFIS and MLP models. Maroufpoor et al. [22] utilized GWO algorithm by improving ANN model to estimate reference evapotranspiration on daily basis and their outcomes were very encouraging. The key objective of the current investigation is to explore applicability of hybrid MLP-GWO algorithm to model monthly rainfall pattern at Silchar area. This is the first time implementation of MLP-GWO for monthly rainfall forecasting and at this specific area which defines novelty of the present study.
28.2 Study Area Silchar city lies between 24°22′ N and 25°8′ N latitudes and 92°24′ E and 93°15′ E longitudes. It is situated 35 m above mean sea level in the southernmost portion of Assam. The city consists of an alluvial flat plain with streams, isolated small hills,
Fig. 28.1 Location of selected rain-gauge station
and swamps marking its landscape. The River Barak is the major river flowing through it, with the Ghagra being the other major river. Silchar has a borderline tropical monsoon climate, with the wet season beginning early and the monsoon arriving during April. Silchar has very humid and hot weather for seven months of the year, with heavy thunderstorms. The proposed study area is presented in Fig. 28.1.
28.3 Methodology 28.3.1 MLP An ANN comprises simple connections and neurons that process information to find a relation between input and output, inspired by the biological activity of the human brain. MLP is the ANN structure most commonly used by hydrologists, consisting of three neuronal layers, namely the input, hidden, and output layers (Fig. 28.2). Here, input data is received through the input layer for further processing. Hidden layers play an essential role in an MLP network as they provide nonlinearity between the input and output
Fig. 28.2 Architecture of MLP network
datasets. By increasing the number of neurons or hidden layers, more complex problems can be solved. The desired output of the model is obtained in the output layer. Mathematically, the network can be expressed as follows:

$Y_t = f_2\left[\sum_{j=1}^{J} w_j\, f_1\left(\sum_{i=1}^{I} w_i x_i\right)\right]$   (28.1)
where $x_i$ is the input to the network, $Y_t$ the network output, $w_i$ the weight between nodes of the input and hidden layers, and $w_j$ the weight between nodes of the hidden and output layers, in respective order; $f_2$ and $f_1$ are the activation functions for the output and hidden layers, respectively.
28.3.2 GWO Mirjalili et al. [23] introduced GWO, inspired by the hunting behavior of grey wolves in nature. Wolves are fundamentally categorized into four classes as per the social hierarchy, namely alpha (α), beta (β), delta (δ), and omega (ω), to simulate the leadership of the wolf hierarchy.
Encircling prey: the encircling process is represented by the following equations:

$\vec{X}(t+1) = \vec{X}_P(t) - \vec{A}\cdot\vec{D}$   (28.2)
$\vec{D} = \left|\vec{C}\cdot\vec{X}_P(t) - \vec{X}(t)\right|$   (28.3)

where t is the present iteration, $\vec{A}$ and $\vec{C}$ are coefficient vectors, $\vec{X}$ is the location of a grey wolf, and $\vec{X}_P$ is the location of the prey. $\vec{A}$ and $\vec{C}$ are computed by:

$\vec{A} = 2\vec{a}\cdot\vec{r}_1 - \vec{a}$   (28.4)
$\vec{C} = 2\cdot\vec{r}_2$   (28.5)
→ C = 2.− r2
(28.5)
Hunting: obtained first three preeminent solutions are stored and induce other searching agents to adjust their locations. Following equations are used to update position of grey wolves → − → − − − →− → → → − → − → − → − Dα = C1 . X α (t) − X ; Dβ = C2 . X β (t) − X ; Dδ = C1 . X δ (t) − X
(28.6)
− → − → → − → − → → − → − → → − → − − → − − → − X 1 = X α (t) − A1 ∗ Dα ; X 2 = X β (t) − A2 ∗ Dβ ; X 3 = X δ (t) − A3 ∗ Dδ (28.7) → − → − → − X1 + X2 + X3 X (t + 1) = 3
(28.8)
MLP-GWO flowchart is given in Fig. 28.3.
28.3.3 Evaluating Constraint Rainfall (Pt ), temperature (T t ) and relative humidity (H t ) data were collected from IMD Pune, for a period of 40 years (1980–2019). 75% of collected data (1980–2009) were used to train the model whereas rest 25% (2010–2019) to test proposed models. Following quantitative standards are used for validating and measuring accurateness of prediction models:
RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(R_{obs}^{i} - R_{comp}^{i}\right)^{2}}    (28.9)

R^{2} = \left(\frac{\sum_{i=1}^{n}\left(R_{comp}^{i} - \bar{R}_{comp}\right)\left(R_{obs}^{i} - \bar{R}_{obs}\right)}{\sqrt{\sum_{i=1}^{n}\left(R_{comp}^{i} - \bar{R}_{comp}\right)^{2}\sum_{i=1}^{n}\left(R_{obs}^{i} - \bar{R}_{obs}\right)^{2}}}\right)^{2}    (28.10)

NSE = 1 - \frac{\sum_{i=1}^{n}\left(R_{obs}^{i} - R_{comp}^{i}\right)^{2}}{\sum_{i=1}^{n}\left(R_{obs}^{i} - \bar{R}_{obs}\right)^{2}}    (28.11)
Fig. 28.3 Flowchart of MLP-GWO algorithm

where R_{comp}^{i} is the predicted value, R_{obs}^{i} the observed value, \bar{R}_{comp} the mean predicted value, and \bar{R}_{obs} the mean observed value. The higher the values of R2 and NSE, and the smaller the value of RMSE, the better the forecasts.
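The three criteria of Eqs. (28.9)–(28.11) can be computed directly; the short NumPy sketch below is an illustration, with function names and the example values being ours rather than the paper's.

```python
import numpy as np

def rmse(obs, comp):
    return float(np.sqrt(np.mean((np.asarray(obs) - np.asarray(comp)) ** 2)))      # Eq. (28.9)

def r2(obs, comp):
    obs, comp = np.asarray(obs), np.asarray(comp)
    num = np.sum((comp - comp.mean()) * (obs - obs.mean()))
    den = np.sqrt(np.sum((comp - comp.mean()) ** 2) * np.sum((obs - obs.mean()) ** 2))
    return float((num / den) ** 2)                                                  # Eq. (28.10)

def nse(obs, comp):
    obs, comp = np.asarray(obs), np.asarray(comp)
    return float(1 - np.sum((obs - comp) ** 2) / np.sum((obs - obs.mean()) ** 2))   # Eq. (28.11)

observed = [310.0, 120.0, 45.0, 260.0]   # illustrative monthly rainfall (mm)
forecast = [295.0, 131.0, 52.0, 248.0]
print(rmse(observed, forecast), r2(observed, forecast), nse(observed, forecast))
```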
28.4 Results and Discussion

The main aim of the current study is to explore the applicability of the hybrid MLP-GWO algorithm for rainfall forecasting. The proposed hybrid MLP-GWO is compared with a conventional MLP model. Three input scenarios combine three different inputs, as given in Table 28.1. For monthly rainfall forecasting at Silchar station, the statistical analysis of the results obtained using the MLP-GWO and MLP models is presented in Table 28.2. In the training phase, among the MLP models, MLP3 with RMSE = 234.85 mm, R2 = 0.9246 and NSE = 0.9354, and in the testing phase with RMSE = 195.61 mm, R2 = 0.9183 and NSE = 0.9302, produced the most accurate monthly rainfall forecasts. Among the MLP-GWO hybrid models, MLP-GWO3 attained RMSE = 43.64 mm, R2 = 0.9869 and NSE = 0.9897 in the training period and RMSE = 38.54 mm, R2 = 0.9816 and NSE = 0.985 in the testing period, performing better than the conventional MLP model. Further evaluations of the MLP and MLP-GWO models for rainfall forecasting at Silchar were investigated. Scatter plots of observed vs. forecasted rainfall using the MLP3 and MLP-GWO3 models are presented in Fig. 28.4, showing the performance of the MLP and MLP-GWO models in capturing the actual rainfall. As shown in this figure, MLP-GWO (MLP-GWO3) provides better rainfall forecasts compared with the other applied models. A comparison of the MLP and MLP-GWO simulations with respect to the observed rainfall is shown in Fig. 28.5. Comparing the actual values (magenta) and model outputs (olive) in Fig. 28.5 illustrates the ability of the methods to learn the data

Table 28.1 Modeling input combination structure
| Model description |          | Output variable | Input combinations |
| MLP               | MLP-GWO  |                 |                    |
| MLP1              | MLP-GWO1 | Pt              | Pt                 |
| MLP2              | MLP-GWO2 | Pt              | Pt, Tt             |
| MLP3              | MLP-GWO3 | Pt              | Pt, Tt, Ht         |
Table 28.2 Performance of model

| Technique | Input     | Training phase           | Testing phase            |
|           |           | RMSE    R2      NSE      | RMSE    R2      NSE      |
| MLP       | MLP1      | 282.13  0.9082  0.9216   | 249.87  0.896   0.9197   |
|           | MLP2      | 253.42  0.9127  0.9283   | 215.36  0.9075  0.9238   |
|           | MLP3      | 234.85  0.9246  0.9354   | 195.61  0.9183  0.9302   |
| MLP-GWO   | MLP-GWO-1 | 95.37   0.9764  0.9792   | 87.41   0.9702  0.9773   |
|           | MLP-GWO-2 | 74.58   0.9813  0.9845   | 61.79   0.9768  0.9819   |
|           | MLP-GWO-3 | 43.64   0.9869  0.9897   | 38.54   0.9816  0.985    |
Fig. 28.4 Scatter plot of observed vs. forecasted rainfall during testing phase
Fig. 28.5 Forecasting results of rainfall at Silchar station using, a MLP and, b MLP-GWO
Fig. 28.6 Violin plots of models for rainfall forecasting
for rainfall forecasting. In general, forecasts made by MLP deviate more from the observed rainfall than the predictions of MLP-GWO; forecasts made by the hybrid model are found to be very close to the observed rainfall values. Another method utilized for comparing the distributions of the applied models is the violin plot. It is like a box plot with a more thorough description and a more noticeable presentation of the differences between the underlying distributions. In Fig. 28.6, the violin plot shows the observed values along with the models' forecasts; the distribution of the observed data is less skewed, and among the applied models, specifically the MLP-GWO model, the estimated distribution is similar to the actual values, whereas the MLP model produced forecasts with more skewness. These forecasting results can be helpful in planning urban and water resource management. On the whole, it can be observed from Figs. 28.4, 28.5 and 28.6 that the proposed robust MLP-GWO method performs better than MLP for the selected gauging station.
28.5 Conclusion

The present work investigated the capability of a hybrid MLP-GWO model for rainfall forecasting at the Silchar gauging station of Cachar district, Assam, India. The performance of the proposed hybrid model was compared against a conventional MLP model based on different statistical indicators. Analysis of the results indicated that the newly proposed MLP-GWO algorithm is able to forecast rainfall accurately, with R2 = 0.9816, RMSE = 38.54 mm and NSE = 0.985 for the MLP-GWO3 model. The reason is the incorporation by GWO of a global search alongside the local search in the training phase, which promotes wider exploration and expands the search range inside the search space. The present research revealed that the proposed method is capable of forecasting rainfall events and can be recommended as an alternative tool.
References 1. Mohanta, N.R., Biswal, P., Kumari, S.S., Samantaray, S., Sahoo, A.: Estimation of sediment load using adaptive neuro-fuzzy inference system at Indus River Basin, India. In: Intelligent Data Engineering and Analytics, pp. 427–434. Springer, Singapore (2021) 2. Sahoo, A., Samantaray, S., Bankuru, S., Ghose, D.K.: Prediction of flood using adaptive neurofuzzy inference systems: a case study. In: Smart Intelligent Computing and Applications, pp. 733–739. Springer, Singapore (2020) 3. Sahoo, A., Samantaray, S., Ghose, D.K.: Prediction of flood in Barak river using hybrid machine learning approaches: a case study. J. Geol. Soc. India 97(2), 186–198 (2021) 4. Samantaray, S., Sahoo, A., Ghose, D.K.: Assessment of sediment load concentration using SVM, SVM-FFA and PSR-SVM-FFA in Arid Watershed, India: a case study. KSCE J. Civ. Eng. 24(6), 1944–1957 (2020) 5. Samantaray, S., Tripathy, O., Sahoo, A., Ghose, D.K.: Rainfall forecasting through ANN and SVM in Bolangir Watershed, India. In: Smart intelligent computing and applications, pp. 767– 774. Springer, Singapore (2020) 6. Jimmy, S.R., Sahoo, A., Samantaray, S., Ghose, D.K.: Prophecy of runoff in a river basin using various neural networks. In: Communication Software and Networks, pp. 709–718. Springer, Singapore (2021) 7. Sahoo, A., Samantaray, S., Ghose, D.K.: Stream flow forecasting in Mahanadi River basin using artificial neural networks. Procedia Comput. Sci. 157, 168–174 (2019) 8. Samantaray, S., Sahoo, A., Ghose, D.K.: Assessment of groundwater potential using neural network: A Case Study. In: International Conference on Intelligent Computing and Communication, pp. 655–664. Springer, Singapore (2019) 9. El-Shafie, A., Noureldin, A., Taha, M., Hussain, A., Mukhlisin, M.: Dynamic versus static neural network model for rainfall forecasting at Klang River Basin, Malaysia. Hydrol. Earth Syst. Sci. 16(4), 1151–1169 (2012) 10. Abbot, J., Marohasy, J.: Application of artificial neural networks to rainfall forecasting in Queensland, Australia. Advances in Atmospheric Sciences 29(4), 717–730 (2012) 11. Mekanik, F., Imteaz, M.A., Gato-Trinidad, S., Elmahdi, A.: Multiple regression and artificial neural network for long-term rainfall forecasting using large scale climate modes. J. Hydrol. 503, 11–21 (2013) 12. Liu, Q., Zou, Y., Liu, X., Linge, N.: A survey on rainfall forecasting using artificial neural network. Int. J. Embedded Syst. 11(2), 240–249 (2019) 13. Zhang, P., Jia, Y., Gao, J., Song, W., Leung, H.: Short-term rainfall forecasting using multi-layer perceptron. IEEE Trans. Big Data 6(1), 93–106 (2018) 14. Nasseri, M., Asghari, K., Abedini, M.J.: Optimized scenario for rainfall forecasting using genetic algorithm coupled with artificial neural network. Expert Syst. Appl. 35(3), 1415–1421 (2008) 15. Tripathy, M., Maheshwari, R.P., Verma, H.K.: Power transformer differential protection based on optimal probabilistic neural network. IEEE Trans. Power Deliv. 25(1), 102–112 (2009) 16. Jaddi, N.S., Abdullah, S., Hamdan, A.R.: A solution representation of genetic algorithm for neural network weights and structure. Inf. Process. Lett. 116(1), 22–25 (2016) 17. Aladag, C.H., Yolcu, U., Egrioglu, E.: A new multiplicative seasonal neural network model based on particle swarm optimization. Neural Process. Lett. 37(3), 251–262 (2013) 18. Jaddi, N.S., Abdullah, S., Hamdan, A.R.: Multi-population cooperative bat algorithm-based optimization of artificial neural network model. Inf. Sci. 294, 628–644 (2015) 19. 
Salama, K.M., Abdelbar, A.M.: Learning neural network structures with ant colony algorithms. Swarm Intell. 9(4), 229–265 (2015) 20. Jaddi, N.S., Abdullah, S.: Optimization of neural network using kidney-inspired algorithm with control of filtration rate and chaotic map for real-world rainfall forecasting. Eng. Appl. Artif. Intell. 67, 246–259 (2018)
21. Claywell, R., Nadai, L., Felde, I., Ardabili, S., Mosavi, A.: Adaptive neuro-fuzzy inference system and a multilayer perceptron model trained with grey wolf optimizer for predicting solar diffuse fraction. Entropy 22(11), 1192 (2020) 22. Maroufpoor, S., Bozorg-Haddad, O., Maroufpoor, E.: Reference evapotranspiration estimating based on optimal input combination and hybrid artificial intelligent model: hybridization of artificial neural network with grey wolf optimizer algorithm. J. Hydrol. 588, 125060 (2020) 23. Mirjalili, S., Mirjalili, S.M., Lewis, A.: Grey wolf optimizer. Adv. Eng. Softw. 69, 46–61 (2014)
Chapter 29
Temperature Prediction Using Hybrid MLP-GOA Algorithm in Keonjhar, Odisha: A Case Study Sandeep Samantaray, Abinash Sahoo, and Deba Prakash Sathpathy
Abstract Temporal variations of global atmospheric temperature have been utilized as an important sign of climatic change; hence, making reliable temperature forecasts forms the basis of proper environmental strategies. This paper established an optimization model integrating a multilayer perceptron with the grasshopper optimisation algorithm (MLP-GOA) for predicting temperature values at the Anandapur and Champua stations of Keonjhar district, Odisha, India. Models were established utilizing five input variables of monthly temperature corresponding to T t−1, T t−2, T t−3, T t−4 and T t−5, respectively, considering data from 1982 to 2020. 75% of the temperature data was utilized to train and the remaining 25% to test the proposed models. The ability of the hybrid MLP-GOA model in predicting monthly temperature is assessed by comparing the obtained results with those of a conventional MLP model. Three statistical indices, namely root mean squared error (RMSE), coefficient of determination (R2) and Willmott Index (WI), were used for performance assessment. The results show that the optimal MLP-GOA model, with RMSE = 5.003, R2 = 0.9543 and WI = 0.9618 at Anandapur station and RMSE = 5.187, R2 = 0.9537 and WI = 0.9603 at Champua station, performed well in simulation and forecasting of the temperature time series and outperformed the traditional neural network model.
29.1 Introduction

The effect of climatic change on water resources and hydrology has turned out to be a significant hydrological concern. Recently, global warming has become a big cause of worry. Therefore, correct prediction of temperature values is valuable to improve environmental conditions and guide public life. Physical models used for predicting temperature are very complicated because they rely on
mathematical expressions, typically necessitating a set of preliminary as well as boundary conditions. Such data may not be available in many regional or remote areas, particularly in emerging nations where instrumental networks have not been established for budgetary and logistic reasons. Therefore, machine learning models, which depend exclusively on historical observations for generating predictions, can help modelers overcome these challenges and thus give a feasible substitute for forecasting temperature. Interest in the use of ANNs for developing hydrological prediction models has increased in recent years due to changing climatic patterns across the world [1–6]. An SVM-based prediction model was developed for making temperature predictions [7, 8], and the obtained results were compared with predictions made by an MLP model; findings revealed that the proposed method could forecast temperature accurately. Perera et al. [9] compared different ANN models for predicting minimum and maximum atmospheric temperature in Tabuk, Saudi Arabia. However, because of several difficulties faced during their application, such as selection of the proper size and structure of the network and the computational time, meta-heuristic optimization algorithms have been broadly combined with conventional neural networks for predicting different hydrological variables. Among many nature-inspired algorithms, GOA is a recently introduced algorithm that has been effectively employed in several hydrological studies [10–12]. Samadianfard et al. [13] aimed at estimating the monthly soil temperature of Adana (Turkey) using a hybrid MLP-FFA (firefly algorithm) model. Graf et al. [14] proposed a wavelet-based ANN model (W-ANN) for forecasting the water temperature of the Warta River in Poland. Moayedi et al. [15] used hybrid ANN-GOA and ANN-HHO (Harris hawks optimization) meta-heuristic techniques for predicting the soil compression coefficient. Naganna et al. [16] proposed MLP-based hybrid optimization models for accurately estimating dew point temperature at two locations in India; results revealed that the applied hybrid MLP models showed superior estimation accuracy. Ewees et al. [17] applied an MLP-GOA model for forecasting volatility in iron ore pricing. Ghaleb et al. [18] proposed a newly developed MLP-GOA model for spam detection; the obtained results suggest that the MLP-GOA model showed better performance than other techniques. The present work aims at predicting monthly temperature values at two different locations in India using the novel MLP-GOA approach, which is the first application of the developed model to temperature prediction.
29.2 Study Area

Keonjhar district is bounded by Jharkhand's Singhbhum district in the north, Bhadrak and Mayurbhanj districts in the east, Jajpur district in the south, and Sundargarh and Dhenkanal districts in the west. It covers a geographical area of 8303 km2, lying between 21° 1′ N and 22° 10′ N latitudes and 85° 11′ E to 86° 22′ E longitudes. The climate is characterized by a harshly hot summer with high humidity commencing from March, with
Fig. 29.1 Location of selected rain-gauge station
a maximum temperature of 38 °C in the month of May. The temperature in December is the lowest at 11 °C. In the present study, the Anandapur and Champua stations are considered for temperature forecasting (Fig. 29.1).
29.3 Methodology

29.3.1 MLP

The multilayer perceptron (MLP) [19, 20] is one of the most common types of feedforward NNs, minimizing error functions on the basis of the performance of certain synaptic
Fig. 29.2 Architecture of MLP network
weights. Based on the learning approach for estimating the target variable, the synaptic weight values are determined from specified training information, i.e., input–output data. This is usually done layer by layer through backpropagation of error signals. The MLP comprises three layers, namely the input, hidden, and output layers. The input and output layers consist of computational nodes. Generally, at layer k+1, the output function f^{(k+1)} can be computed using the following equation:

f^{(k+1)} = \partial\left(w^{k} a^{k} + b^{k}\right)    (29.1)
where ∂ denotes a nonlinear activation operation, and b^k, a^k and w^k are the bias, input, and weights of layer k (Fig. 29.2).
29.3.2 GOA

Saremi et al. [21] introduced the GOA meta-heuristic algorithm, inspired by the foraging and swarming behavior of grasshoppers. This technique is based on exploitation and exploration phases to find food. In GOA, food sources are the preeminent locations of the grasshoppers in the swarm, while the grasshoppers are the search agents. The swarming behavior of grasshoppers is given in mathematical terms by Saremi et al. [21]:

X_i^d = C\left(\sum_{\substack{j=1 \\ j \neq i}}^{N} C\, \frac{ub_d - lb_d}{2}\, S\left(\left|x_j^d - x_i^d\right|\right) \frac{x_j - x_i}{d_{ij}}\right) + \hat{T}_d    (29.2)
where X_i and X_j are the positions of the ith and jth grasshoppers, respectively; d_{ij} is the distance between the ith and jth grasshoppers; lb_d and ub_d are the lower and upper bounds in the Dth dimension; \hat{T}_d is the value of the Dth dimension in the best solution found so far; N is the number of grasshoppers; and C is a coefficient to diminish the repulsion, attraction, and comfort zones. The social forces are calculated using the S function, given by:
S(r) = f\, e^{-r/l} - e^{-r}    (29.3)
where f is the intensity of attraction and l is the attractive length scale. The flowchart of MLP-GOA is presented in Fig. 29.3.
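The following NumPy sketch illustrates Eqs. (29.2) and (29.3) for one GOA update; the values of f, l, the shrinking coefficient c and the bounds are typical defaults assumed here for illustration, not settings reported by the authors.

```python
import numpy as np

def s_func(r, f=0.5, l=1.5):
    """Social force, Eq. (29.3): S(r) = f*exp(-r/l) - exp(-r)."""
    return f * np.exp(-r / l) - np.exp(-r)

def goa_step(X, target, c, lb, ub, eps=1e-12):
    """One grasshopper position update following Eq. (29.2).
    X: (N, D) positions, target: (D,) best solution found so far."""
    N, D = X.shape
    X_new = np.empty_like(X)
    for i in range(N):
        total = np.zeros(D)
        for j in range(N):
            if i == j:
                continue
            dist = np.linalg.norm(X[j] - X[i]) + eps
            unit = (X[j] - X[i]) / dist
            total += c * (ub - lb) / 2.0 * s_func(np.abs(X[j] - X[i])) * unit
        X_new[i] = np.clip(c * total + target, lb, ub)
    return X_new

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(10, 4))   # 10 grasshoppers encoding 4 MLP parameters each
best = X[0].copy()                      # placeholder for the best solution so far
X = goa_step(X, best, c=0.5, lb=-1.0, ub=1.0)
print(X.shape)
```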
29.3.3 Evaluating Constraint

The statistical indicators used to measure the accuracy of the prediction models comprise RMSE, R2 and WI, expressed in Eqs. (29.4)–(29.6):
RMSE = \sqrt{\frac{1}{N}\sum_{k=1}^{N}\left(y_c - y_o\right)^{2}}    (29.4)

R^{2} = \left(\frac{\sum_{k=1}^{N}\left(y_o - \bar{y}_o\right)\left(y_c - \bar{y}_c\right)}{\sqrt{\sum_{k=1}^{N}\left(y_o - \bar{y}_o\right)^{2}\sum_{k=1}^{N}\left(y_c - \bar{y}_c\right)^{2}}}\right)^{2}    (29.5)

WI = 1 - \frac{\sum_{k=1}^{N}\left(y_o - y_c\right)^{2}}{\sum_{k=1}^{N}\left(\left|y_c - \bar{y}_o\right| + \left|y_o - \bar{y}_o\right|\right)^{2}}    (29.6)
where y_c is the predicted value, y_o the observed value, \bar{y}_c the mean of the predicted data, and \bar{y}_o the mean of the observed data.
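For completeness, a short sketch of the Willmott Index of Eq. (29.6) is given below (RMSE and R2 follow the same pattern as in the previous chapter); the function name and example values are illustrative.

```python
import numpy as np

def willmott_index(y_obs, y_pred):
    """Willmott Index, Eq. (29.6)."""
    y_obs, y_pred = np.asarray(y_obs, float), np.asarray(y_pred, float)
    num = np.sum((y_obs - y_pred) ** 2)
    den = np.sum((np.abs(y_pred - y_obs.mean()) + np.abs(y_obs - y_obs.mean())) ** 2)
    return float(1.0 - num / den)

print(willmott_index([31.2, 18.4, 25.0], [30.1, 19.0, 24.2]))  # illustrative temperatures (°C)
```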
Fig. 29.3 Flowchart of MLP-GOA algorithm
29.4 Results and Discussion

The performance of the robust MLP-GOA and conventional MLP prediction models is assessed in this section based on statistical and graphical evaluation of the observed and predicted temperature data at the Anandapur and Champua gauge stations. The prediction
results of the proposed models in the training (model development) and testing (model evaluation) phases are presented in Table 29.1 in terms of RMSE, WI and R2. The standard criterion for the best performing model during the training and testing stages is that R2 and WI must be maximum whereas RMSE must be minimum. The appraisal of the MLP-GOA and MLP models during the training and testing phases for the selected gauging sites is shown in Fig. 29.4 in the form of scatter plots. The best values of R2 for the MLP and MLP-GOA models are 0.9228 and 0.9772, respectively, for Anandapur. Likewise, for Champua the best value of R2 is 0.9217 and 0.9761 for the MLP and MLP-GOA approaches, respectively. The time series plots of the observed and predicted temperature values at the two stations by the MLP and integrated MLP-GOA models are shown in Fig. 29.5. Results illustrate that the estimated peak temperature is 30.577 °C and 32.885 °C for MLP and MLP-GOA, respectively, against the actual peak of 34.20 °C for Anandapur station. The approximated peak temperature is 32.409 and 34.528 °C for MLP and MLP-GOA against the actual peak of 35.49 °C for Champua. Analysis of the prediction results for both stations, as presented in Fig. 29.5, clearly illustrates that the MLP-GOA model is able to estimate temperature values more accurately than the traditional MLP model. In addition, boxplots of observed versus predicted temperature are demonstrated in Fig. 29.6, giving a more noticeable presentation and a more comprehensive depiction of the differences between the underlying distributions. The boxplots in Fig. 29.6 reveal that the distributions of the observed values at the Anandapur and Champua stations are less skewed to the right and that, among the models utilized, the distribution of the MLP-GOA model in particular is similar to the true values. Consistent with the previous outcomes, these figures make it clearly evident that MLP-GOA is superior to the MLP model. In conclusion, we affirm that the robust MLP-GOA can be suggested as an appropriate tool for temperature prediction at various stations with an acceptable degree of statistical accuracy.
29.5 Conclusion

The feasibility of the GOA meta-heuristic algorithm in combination with MLP was investigated for predicting temperature at two selected stations in Keonjhar district of Odisha, India. Findings show that MLP-GOA obtained better prediction results than MLP at both stations, with RMSE = 5.003, R2 = 0.9543 and WI = 0.9618 at Anandapur station and RMSE = 5.187, R2 = 0.9537 and WI = 0.9603 at Champua station. Indeed, the GOA algorithm was used for optimizing the biases and computational weights of the MLP. Furthermore, the models are based on real measured data; therefore, they are applicable in real-world cases. The proposed robust model can also be employed for temperature prediction in other parts of the world to further assess its feasibility. In future works, the accuracy of temperature prediction can be improved by combining the model with other newly developed algorithms.
Table 29.1 Performance of model at Anandapur and Champua gauge station

| Stations  | Models    | Input combination            | Training phase         | Testing phase          |
|           |           |                              | RMSE    R2      WI     | RMSE    R2      WI     |
| Anandapur | MLP-1     | Tt−1                         | 20.77   0.9114  0.9188 | 30.23   0.8889  0.8951 |
|           | MLP-2     | Tt−1, Tt−2                   | 16.943  0.9146  0.9217 | 28.112  0.8911  0.8976 |
|           | MLP-3     | Tt−1, Tt−2, Tt−3             | 14.006  0.9178  0.9249 | 26.669  0.8926  0.8999 |
|           | MLP-4     | Tt−1, Tt−2, Tt−3, Tt−4       | 11.398  0.9205  0.9273 | 24.859  0.8967  0.9032 |
|           | MLP-5     | Tt−1, Tt−2, Tt−3, Tt−4, Tt−5 | 9.003   0.9228  0.9299 | 22.184  0.9012  0.9098 |
|           | MLP-GOA-1 | Tt−1                         | 3.836   0.9695  0.9769 | 8.132   0.9488  0.954  |
|           | MLP-GOA-2 | Tt−1, Tt−2                   | 2.641   0.9713  0.9781 | 7.902   0.9502  0.9568 |
|           | MLP-GOA-3 | Tt−1, Tt−2, Tt−3             | 1.709   0.9726  0.9806 | 6.93    0.9511  0.958  |
|           | MLP-GOA-4 | Tt−1, Tt−2, Tt−3, Tt−4       | 1.46    0.9752  0.9818 | 5.896   0.9528  0.9591 |
|           | MLP-GOA-5 | Tt−1, Tt−2, Tt−3, Tt−4, Tt−5 | 1.183   0.9772  0.9846 | 5.003   0.9543  0.9618 |
| Champua   | MLP-1     | Tt−1                         | 21.736  0.9097  0.9165 | 31.997  0.8876  0.8935 |
|           | MLP-2     | Tt−1, Tt−2                   | 18.392  0.9139  0.9201 | 29.84   0.8903  0.8964 |
|           | MLP-3     | Tt−1, Tt−2, Tt−3             | 15.221  0.9164  0.9234 | 27.207  0.8923  0.8987 |
|           | MLP-4     | Tt−1, Tt−2, Tt−3, Tt−4       | 13.38   0.9199  0.9266 | 25.73   0.8942  0.9021 |
|           | MLP-5     | Tt−1, Tt−2, Tt−3, Tt−4, Tt−5 | 9.446   0.9217  0.9285 | 23.004  0.8984  0.9055 |
|           | MLP-GOA-1 | Tt−1                         | 4.566   0.9681  0.9745 | 8.389   0.9473  0.9538 |
|           | MLP-GOA-2 | Tt−1, Tt−2                   | 2.978   0.9704  0.9776 | 8.114   0.9495  0.9559 |
|           | MLP-GOA-3 | Tt−1, Tt−2, Tt−3             | 1.99    0.9718  0.9788 | 7.458   0.9504  0.9577 |
|           | MLP-GOA-4 | Tt−1, Tt−2, Tt−3, Tt−4       | 1.533   0.9736  0.9811 | 6.221   0.9521  0.9589 |
|           | MLP-GOA-5 | Tt−1, Tt−2, Tt−3, Tt−4, Tt−5 | 1.374   0.9761  0.9824 | 5.187   0.9537  0.9603 |
Fig. 29.4 Scatterplot of observed-predicted temperature for Anandapur and Champua station (testing)
Fig. 29.5 Time-series plot of temperature at Anandapur and Champua Station
Fig. 29.6 Box plot of actual and predicted temperature values
References 1. Jimmy, S.R., Sahoo, A., Samantaray, S., Ghose, D.K.: Prophecy of Runoff in a River Basin Using Various Neural Networks. In Communication Software and Networks (pp. 709–718). Springer, Singapore (2021) 2. Mohanta, N.R., Biswal, P., Kumari, S.S., Samantaray, S., Sahoo, A.: Estimation of sediment load using adaptive neuro-fuzzy inference system at Indus River Basin, India. In: Intelligent Data Engineering and Analytics, pp. 427–434. Springer, Singapore (2021) 3. Sahoo, A., Samantaray, S., Bankuru, S., Ghose, D.K.: Prediction of flood using adaptive neuro-fuzzy inference systems: a case study. In Smart Intelligent Computing and Applications, pp. 733–739. Springer, Singapore (2020) 4. Samantaray, S., Sahoo, A., Ghose, D.K.: Assessment of runoff via precipitation using neural networks: watershed modelling for developing environment in arid region. Pertanika J. Sci. Technol. 27(4), 2245–2263 (2019) 5. Samantaray, S., Ghose, D.K.: Modelling runoff in a river basin, India: an integration for developing un-gauged catchment. Int. J. Hydrol. Sci. Technol. 10(3), 248–266 (2020) 6. Sridharam, S., Sahoo, A., Samantaray, S., Ghose, D.K.: Estimation of water table depth using wavelet-ANFIS: a case study. In: Communication Software and Networks, pp. 747–754. Springer, Singapore (2021) 7. Pérez-Vega, A., Travieso, C.M., Hernández-Travieso, J.G., Alonso, J.B., Dutta, M.K., Singh, A.: Forecast of temperature using support vector machines. In: 2016 International Conference on Computing, Communication and Automation (ICCCA), pp. 388–392. IEEE (2016) 8. Radhika, Y., Shashi, M.: Atmospheric temperature prediction using support vector machines. Int. J. Comput. Theory Eng. 1(1), 55 (2009) 9. Perera, A., Azamathulla, H., Upaka, R.: Comparison of different artificial neural network (ANN) training algorithms to predict the atmospheric temperature in Tabuk, Saudi Arabia. Mausam 71(2), 233–244 (2020) 10. Alizadeh, Z., Shourian, M., Yaseen, Z.M.: Simulating monthly streamflow using a hybrid feature selection approach integrated with an intelligence model. Hydrol. Sci. J. 65(8), 1374– 1384 (2020) 11. Khalifeh, S., Esmaili, K., Khodashenas, S., Akbarifard, S.: Data on optimization of the nonlinear Muskingum flood routing in Kardeh River using Goa algorithm. Data in Brief 30, 105398 (2020) 12. Tao, H., Al-Bedyry, N.K., Khedher, K.M., Shahid, S., Yaseen, Z.M.: River water level prediction in coastal catchment using hybridized relevance vector machine model with improved grasshopper optimization. J. Hydrol. 126477 (2021)
13. Samadianfard, S., Ghorbani, M.A., Mohammadi, B.: Forecasting soil temperature at multipledepth with a hybrid artificial neural network model coupled-hybrid firefly optimizer algorithm. Inf. Process. Agricult. 5(4), 465–476 (2018) 14. Graf, R., Zhu, S., Sivakumar, B.: Forecasting river water temperature time series using a wavelet–neural network hybrid modelling approach. J. Hydrol. 578, 124115 (2019) 15. Moayedi, H., Gör, M., Lyu, Z., Bui, D.T.: Herding Behaviors of grasshopper and Harris hawk for hybridizing the neural network in predicting the soil compression coefficient. Measurement, 152, 107389 (2020) 16. Naganna, S.R., Deka, P.C., Ghorbani, M.A., Biazar, S.M., Al-Ansari, N., Yaseen, Z.M.: Dew point temperature estimation: application of artificial intelligence model integrated with natureinspired optimization algorithms. Water, 11(4), 742 (2021) 17. Ewees, A.A., Abd Elaziz, M., Alameer, Z., Ye, H., Jianhua, Z.: Improving multilayer perceptron neural network using chaotic grasshopper optimization algorithm to forecast iron ore price volatility. Resour. Policy, 65, 101555 (2020) 18. Ghaleb, S.A., Mohamad, M., Abdullah, E.F.H.S., Ghanem, W.A.: Spam classification based on supervised learning using grasshopper optimization algorithm and artificial neural network. In: International Conference on Advances in Cyber Security, pp. 420–434. Springer, Singapore (2020) 19. Hornik, K., Stinchcombe, M., White, H.: Multilayer feedforward networks are universal approximators. Neural Netw. 2(5), 359–366 (1989) 20. Yuan, C., Moayedi, H.: Evaluation and comparison of the advanced metaheuristic and conventional machine learning methods for the prediction of landslide occurrence. Eng. Comput. 1–11 (2019) 21. Saremi, S., Mirjalili, S., Lewis, A.: Grasshopper optimisation algorithm: theory and application. Adv. Eng. Software, 105, 30–47 (2017)
Chapter 30
Addressing Longtail Problem using Adaptive Clustering for Music Recommendation System M. Sunitha, T. Adilakshmi, G. Ravi Teja, and Ayush Noel
Abstract Music recommendation system (MRS) is an information filtering tool used to handle huge amount of digital music available through online platforms. Collaborative filtering (CF) is one of the most commonly used method in MRS. CF methods are efficient in recommending popular songs but fail to add the songs not very popular. This paper aims at addressing the less popular songs also known as longtail songs. Adaptive clustering method is proposed in this paper to add longtail songs for recommendation. The proposed method is compared with user-based and item-based CF models to identify longtail songs. Tail-P is the measure proposed and compared for adaptive clustering method with user-based and item-based CF models. Results prove that adaptive clustering has performed better compared to CF models in case of longtail songs identification.
30.1 Introduction

Because of the information overload problem in the music industry, recommendation systems are designed to filter content from huge digital libraries and provide music interesting to users without human intervention. Recommendation systems are playing an important role in fields such as movie recommendation, book recommendation, and news article recommendation. The collaborative filtering (CF) approach is the most commonly used method in recommendation systems. The basic idea of CF is that if users agreed in the past on some items, then they might also agree on different items in the future. Even though the CF method is very popular and effective, it suffers from limitations such as scalability, cold-start, and longtail. The research work in this paper aims to address the longtail problem. Longtail is a term used in different businesses such as mass media, micro-finance, and social networks for economic model building, in use by statisticians since 1946. The longtail term refers to the list of unique items which are not very popular and has gained importance in
recent times. Anderson [1, 6] elaborated the concept of the longtail and divided items into two separate parts, the head and the tail. The head contains the popular items, which one can find using basic popularity-based recommendation systems. The tail of the curve, as shown in Fig. 30.2, is characterized by the list of products having fewer ratings/sales. In 2007, 844 million digital tracks were sold; only 1% of all digital tracks were popular and accounted for 80% of all track sales, as shown in Fig. 30.1. Also, 1,000 albums accounted for 50% of all album sales, and 450,344 of the 570,000 albums sold were purchased less than 100 times. Music consumption is biased towards a few mainstream artists. The longtail allows service providers such as Spotify, Apple Music, Pandora, etc., to realize significant profit by recommending small volumes of hard-to-find songs to users instead of always recommending a few popular items. The popularity of an item is found from the total number of users' play counts [5]. The rest of the paper is organized as follows. Section 30.2 describes related work, Sect. 30.3 showcases the proposed adaptive clustering-based method and its results, and Sect. 30.4 presents the conclusion and future scope. References are listed at the end of the chapter.

Fig. 30.1 Top-10 artists of Last.fm in the year 2007, July
Fig. 30.2 Dividing products into head and tail by fitting longtail
30.2 Related Work

Collaborative filtering (CF) is the most fundamental and most popularly used model for music recommendation systems. The underlying principle of CF-based methods is that if two users had similar taste in the past, i.e., they liked a few of the same songs, then they may also like a similar kind of songs in the future. Collaborative filtering is further classified into three subcategories: memory-based, model-based, and hybrid collaborative filtering. Memory-based collaborative filtering provides suggestions based on the nearest neighbours of users/items by using the entire collection of previous user-item ratings. Memory-based CF is implemented in two different ways, user-based CF and item-based CF [4]: nearest neighbours are found for a given user or item and recommendations are provided from those neighbours, the former being known as user-based CF. Even though the CF approach is simple to implement, it suffers from a few major issues such as popularity bias, cold-start, and scalability. In CF-based recommendation systems, popular music gets many ratings, but the music in the longtail [3] can rarely get any. As a result, collaborative filtering mainly recommends popular music to the listeners. Although recommending popular items is reliable, it is still risky since the user rarely gets pleasantly surprised. The longtail problem has been addressed by researchers using each item (EI) and total clustering (TC) methods [8]. The longtail problem persists in the EI method due to a lack of data for the model to learn enough to give good predictions. The TC method adds information to head items and does not add any additional information about the tail items; adding information to head items increases the size of the data without increasing the novelty of items in the tail.
30.3 Addressing Longtail Problem with Adaptive Clustering Longtail problem in music recommendation systems is addressed in this paper by using adaptive clustering method. The steps in the proposed system are explained in the upcoming sub-sections.
30.3.1 Identifying Longtail Data

To carry out this research work, data are obtained from users' implicit feedback and represented by a user-song frequency matrix. After pre-processing the user logs, the number of songs obtained is 14,458. Songs in the music recommendation system are modelled using the model given by Kilkki [2]. Only a few songs out of the 14,458 items are
Fig. 30.3 Showing the longtail data fitting with F(x)
very popular and mostly recommended by CF-based recommendation systems. To find the songs that are not very popular but might be interesting to the users, longtail songs need to be separated from popular songs. This is achieved by modelling the average song frequency data using Eq. (30.1) proposed by Kilkki. The graph of F against the average frequency of songs is shown in Fig. 30.3 for sample data.

F(x) = \frac{\beta}{\left(\frac{Q_2}{x}\right)^{\alpha} + 1}    (30.1)
where F(x) denotes the portion of data under the frequency x, β is the total volume of data, α decides the shape of the longtail curve with 0 ≤ α ≤ 1, and Q_2 represents the 50th percentile (median) of the data. Once the songs' average frequency data are modelled with F(x), the songs are divided into three parts, head, mid, and tail, indicating the three parts of the curve. The boundaries that separate the head, mid, and tail parts are given by Eqs. (30.2) and (30.3), respectively:

Boundary(Head → Mid) = Q_2^{2/3}    (30.2)

Boundary(Mid → Tail) = Q_2^{4/3}    (30.3)
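A small sketch of how Eqs. (30.2)–(30.3) can be used to split songs into head, mid and tail is given below. It applies the boundaries on the popularity-rank axis, with Q2 taken as the rank at which half of the total play volume is reached; this rank-based reading of the model, the synthetic play counts and the function names are our assumptions for illustration.

```python
import numpy as np

def split_head_mid_tail(play_counts):
    """Split items into head/mid/tail using Kilkki-style boundaries on the rank axis."""
    counts = np.sort(np.asarray(play_counts, float))[::-1]   # most popular first
    cum_share = np.cumsum(counts) / counts.sum()
    q2 = int(np.searchsorted(cum_share, 0.5)) + 1             # rank reaching 50% of volume (Q2)
    head_end = q2 ** (2.0 / 3.0)                               # Eq. (30.2)
    mid_end = q2 ** (4.0 / 3.0)                                # Eq. (30.3)
    ranks = np.arange(1, len(counts) + 1)
    head = counts[ranks <= head_end]
    mid = counts[(ranks > head_end) & (ranks <= mid_end)]
    tail = counts[ranks > mid_end]
    return head, mid, tail

plays = np.random.default_rng(0).zipf(2.0, size=5000)          # synthetic long-tailed play counts
head, mid, tail = split_head_mid_tail(plays)
print(len(head), len(mid), len(tail))
```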
30.3.2 Applying Adaptive Clustering

Adaptive clustering is used to handle the songs at the tail. After dividing the songs into head, mid, and tail based on Kilkki's model, adaptive clustering is applied to the songs in the mid and tail sets. The basic idea of adaptive clustering is that songs having a sufficiently good number of listeners are considered individually, while less popular songs are clustered using the K-means algorithm. The adaptive clustering algorithm is defined as shown in Algorithm 3.1.

Algorithm 3.1 Adaptive-K-means-Item-CF()
Input: Songs partitioned into head, mid, tail; user-song matrix.
Output: Song clusters at mid and tail.
Method:
Begin
1. Let head, mid, and tail represent the sets of songs defined according to Eqs. (30.2) and (30.3)
2. Let Mid = {SM1, SM2, ..., SMl} be the songs belonging to mid
3. Let Tail = {ST1, ST2, ..., STk} be the songs belonging to tail
4. Sum-freq-mid = 0
5. For i in 1 to l
6.   Sum-freq-mid += frequency(SMi)
7. Threshold-mid = Sum-freq-mid / l
8. Sum-freq-tail = 0
9. For j in 1 to k
10.   Sum-freq-tail += frequency(STj)
11. Threshold-tail = Sum-freq-tail / k
12. Find the songs from mid such that frequency(SMi) < Threshold-mid. Label these songs as LT-mid and the remaining songs as P-mid
13. Find the songs from tail such that frequency(STj) < Threshold-tail. Label these songs as LT-tail and the remaining songs as P-tail
14. Songs from P-mid and P-tail are not grouped with other songs; create clusters with each individual song
15. Songs in LT-mid and LT-tail are clustered using the K-means algorithm
16. Return the clusters resulting from steps 14 and 15
End.
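A compact Python sketch of Algorithm 3.1 for one region (mid or tail) is shown below; it uses scikit-learn's KMeans for step 15, and the data structures (a dict of song frequencies and song feature vectors), the choice of K and the example values are assumptions for illustration, not the authors' implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

def adaptive_clusters(songs, freqs, vectors, k=10, seed=0):
    """songs: song ids in one region (mid or tail); freqs: song -> play frequency;
    vectors: song -> feature vector. Popular songs become singleton clusters;
    the remaining longtail songs are grouped with K-means."""
    threshold = np.mean([freqs[s] for s in songs])             # steps 4-11
    popular = [s for s in songs if freqs[s] >= threshold]      # P-mid / P-tail
    longtail = [s for s in songs if freqs[s] < threshold]      # LT-mid / LT-tail
    clusters = [[s] for s in popular]                          # step 14: singleton clusters
    if longtail:                                               # step 15: K-means on the rest
        X = np.array([vectors[s] for s in longtail])
        labels = KMeans(n_clusters=min(k, len(longtail)), n_init=10,
                        random_state=seed).fit_predict(X)
        for c in range(labels.max() + 1):
            clusters.append([s for s, lab in zip(longtail, labels) if lab == c])
    return clusters                                            # step 16

# Tiny illustrative example: 6 songs with play counts and 2-D "profiles"
rng = np.random.default_rng(0)
songs = [f"s{i}" for i in range(6)]
freqs = {s: f for s, f in zip(songs, [50, 3, 2, 40, 4, 1])}
vectors = {s: rng.random(2) for s in songs}
print(adaptive_clusters(songs, freqs, vectors, k=2))
```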
30.3.3 Recommendation Based on Adaptive Clusters

The music recommendation system addressing the longtail problem with adaptive clustering is evaluated by recommending songs to test users. In this research work, we considered the listening history of 200 users. 60 users (30%) are randomly selected as
test users. For each test user, the algorithm defined in Algorithm 3.2 is applied to obtain the recommendation vector.

Algorithm 3.2 Recommendation-Adaptive-clustering()
Input: Clusters at mid and tail; user-song matrix.
Output: Recommendation vector for test users.
Method:
1. Let Ut = {U1, U2, ..., U60} be the list of test users
2. Let Ut-head be the list of songs of a test user from the head list
3. For each song Si ∈ Ut-head, repeat steps 4 to 7
4. Find the nearest cluster at mid from Si
5. Add all the songs from the mapped mid cluster to the recommendation vector
6. For the mapped mid cluster, identify the nearest song cluster at tail
7. Add all the songs from the mapped tail cluster to the recommendation vector
End.
30.3.4 Evaluation of the Proposed Recommendation System

The proposed recommendation system is evaluated using precision, which indicates the preciseness of the proposed system. Since this paper addresses the longtail issue, the tail precision of the proposed system is computed as defined in Eq. (30.4). The adaptive clustering-based recommendation system for the longtail problem is compared with collaborative filtering recommendation for identifying tail songs. The proposed method performed better than the collaborative filtering-based recommendation systems in terms of Tail-P, as shown in Fig. 30.4.
Fig. 30.4 showing the comparison of adaptive clustering with user-based and item-based CF for longtail songs
Table 30.1 Tail-P comparison of adaptive clustering with collaborative methods

| No of neighbours / method | Adaptive clustering (Tail-P) | UCC-W-KNN (Tail-P) | ICC-KNN (Tail-P) |
| NN = 1                    | 0.317                        | 0.117              | 0.219            |
| NN = 2                    | 0.341                        | 0.112              | 0.243            |
| NN = 3                    | 0.343                        | 0.118              | 0.271            |
| NN = 4                    | 0.328                        | 0.113              | 0.258            |
| NN = 5                    | 0.325                        | 0.108              | 0.241            |
\text{Tail-P} = \frac{\text{Tail(TP)}}{\text{Total number of songs recommended from the tail}}    (30.4)
where Tail(TP) represents the number of true positives (TPs) from the tail out of the total true positives for a test user. In Table 30.1, UCC-W-KNN is the user-based CF model proposed in [9], and ICC-KNN is the item-based model with the highest precision proposed in [10]. The best user-based and item-based collaborative filtering models are compared with the adaptive clustering method for addressing longtail songs using the Tail-P measure. The results obtained for the adaptive clustering-based music recommendation system are compared with the user-based and item-based CF models in Table 30.1. The results suggest that the adaptive clustering-based method is capable of identifying longtail songs for recommendation, as shown in Fig. 30.4. NN = 3, i.e., using 3 nearest neighbours, gives the best tail precision compared with the other values.
30.4 Conclusion and Future Scope

This paper proposed an adaptive clustering-based method to identify longtail songs. Initially, songs from the user implicit feedback are separated into three sets, head, mid, and tail; Kilkki's model is applied to identify the tail songs. Adaptive clustering is applied to the songs in the tail which are not very popular. During the recommendation process, tail songs identified as less popular are added to the recommendation vector along with K similar songs present in the mapped cluster. Songs having enough frequency are handled individually. The proposed system is compared with collaborative filtering-based recommendation. The adaptive clustering-based method performed better, with good tail precision compared with the collaborative filtering-based recommendation methods. Music is very versatile in nature, and longtail is one of the issues faced by CF-based recommendation systems. Cold-start and scalability are the other major challenges faced by CF. A recommendation system addressing the new-user and new-song cold-start problems may be considered as a future direction for research.
References 1. Anderson, C.: The long tail. Why the Future of Business Is Selling Less of More. Hyperion, New York, NY (2006) 2. Kilkki, K.: A practical model for analyzing long tails. First Monday 12(5) (2007) 3. Yin, H., Cui, B., Li, J., Yao, J., Chen, C.: Challenging the long tail recommendation, The 38th International Conference on Very Large Data Bases, August 27th - 31st 2012, Istanbul, Turkey. Proceedings of the VLDB Endowment, vol. 5, No. 9, pp. 896–907 4. Celma, Ò.: Music recommendation and discovery—the long tail, long fail, and long play, the digital music space. Springer, Berlin, Germany (2010) 5. Schedl, M., Knees, P., Gouyon, F.: New paths in music recommender systems research, Proceedings of the 11th ACM conference on recommender systems (RecSys 2017), Como, Italy (2017) 6. Sundaresan, N.: Recommender systems at the long tail. Fifth ACM Conference on Recommender Systems, pp. 1–5 (2011) 7. Park, Y.-J., Tuzhilin, A.: “The long tail of recommender systems and how to leverage it.” In Proceedings of the 2008 ACM Conference on Recommender Systems, ser. RecSys ’08. New York, NY, USA, ACM, pp. 11–18 (2008) 8. Park, Y.J.: The adaptive clustering method for the long tail problem of recommender systems. IEEE Trans. Knowl. Data Eng. 25(8), 1904–1915 (2013) 9. Sunitha, M., Adilakshmi, T., Sreenivasa Rao, M.: Enhancing user-centric clustering (UCC) model with Weighted KNN for music recommendation system. JoICS 11(1) 10. Sunitha, M., Adilakshmi, T., Sreenivasa Rao, M.: Enhancing item-centric clustering with nearest neighbors for music recommendation system. J. Sci. Comput., UGC CARE, ISSN1524–2560, (Jan 2021)
Chapter 31
A Proficient GK-KMA Based Segmentation and Lung Nodule Detection in CT Images Using PTRNN Vijay Kumar Gugulothu
and Savadam Balaji
Abstract One among the prevalent diseases and the foremost causes of cancerassociated death worldwide is Lung Cancer (LC). The prognosis can be probably enhanced, and many lives can be saved every year if the Lung Nodules (LN) analysis is done early. Prevailing LN detection methods was complex as it shows extensive variation in appearance as well as spatial distribution. Here, a new Deep Learning (DL) approach, Parameter Tuned Recurrent Neural Network (PTRNN), is proposed aimed at LN detection. Preprocessing, lung segmentation, detection of nodule candidates, deep Feature Extraction (FE), together with classification are the ‘5’ phases of proposed LN detection. Primarily, from the openly available data sources, the CT lung images are amassed. Then, those data are preprocessed via carrying out filtering and contrast enhancement. Next, a Gaussian Kernel-based K-means algorithm (GKKMA) segments the lung areas from the preprocessed image. Subsequently, nodule candidates are detected amongst every available object in the lung areas through a morphological process. After that, utilizing Residual Networks (ResNet 152), features are extracted as of the candidate nodules. Lastly, for nodule classification, the feature vectors (extracted ones) are inputted to PTRNN. In PTRNN, Cauchy Mutated Whale Optimization Algorithm (CMWOA) tune and optimize the weight and bias parameters of RNN. The outcomes exhibited that the PTRNN superiorly performs nodule detection with a high detection rate when weighed against prevailing classifiers.
V. K. Gugulothu (B) · S. Balaji Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Deemed To Be University, Hyderabad 500075, Telangana, India S. Balaji e-mail: [email protected] V. K. Gugulothu Head of Computer Science, Department of Computer Science & Engineering, Government Polytechnic College, Masab Tank, Hyderabad 500004, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 V. Bhateja et al. (eds.), Smart Intelligent Computing and Applications, Volume 1, Smart Innovation, Systems and Technologies 282, https://doi.org/10.1007/978-981-16-9669-5_31
31.1 Introduction On the earth, one among the most serious cancers is the LC. Every year, the total demises have gradually augmented [1]. Globally, LC is the commonest one in men; whereas, it stands as the ‘4th’ most often diagnosed cancer as well as the ‘2nd’ foremost cause of demise due to cancer amongst women [2]. To efficiently trounce this burden, detection and also treatment at an initial stage is needed [3]. Computed Tomography (CT) imaging has turned into a standard modality aimed at detecting and assessing LC with its development [4]. Lately, CT techniques were implemented to screen for LC detection in high-risk populace [5]. The LN, at the initial stage, is still in a treatable stage. If it could be successfully detected, then the survival pace is high [6]. The misjudgment together with correct contour can well be lessened by the candidate nodule detection. Thus, the object can well be illustrated during the screening that is executed on the feature value extraction [7]. Benign or mass is the "2" forms wherein the LN is described generally [8]. The nodules detection is the chief intricacy in the automated screening with CT for LC [9]. Nodules’ segmentation in conjunction with the detection has become the main challenge since LN is characterized via their densities, location, size together with inner content [10]. Most preceding research has concentrated on the early LC detection utilizing the texture-centered elucidation of chest CT [11]. On CT (chest) scans, LN typically alludes to a “spot” of below ‘3’cm in diameter [12]. Nevertheless, it will be functional for radiologists in decisionmaking if the lung area in tandem with LN is accurately segmented [13]. Recently, image processing is extensively utilized in medical fields [14]. Convolutional Neural Networks (CNN) is used by such technique, and it had attained promising outcomes in initial detection [15]. Nevertheless, the key challenges in LN detection are the higher level of similarity betwixt the LN and their adjacent tissues [16] and the broad disparity in appearance as well as a spatial distribution [17]. DL, as a significant branch of machine learning, has rapidly developed recently, and it attains better outcomes [18]. Thus, an effectual LN detection technique as of CT images is proposed utilizing GK-KMA for segmentation as well as PTRNN for LN detection. Whale Optimization Algorithm: suggests a novel nature-inspired meta-heuristic optimization algorithm, called Whale Optimization Algorithm, which mimics the social performance of humpback whales. The algorithm is motivated by the bubblenet hunting strategy. WOA is tested with various mathematical optimization problems and six essential design problems. Optimization results prove that the WOA algorithm is very competitive compared to the state-of-art meta-heuristic algorithms as well as conservative methods. The work is pre-arranged as: In Sect. 31.2, the literature survey is rendered; in Sect. 31.3, the proposed work is elaborated; the result and its discussion are offered in Sect. 31.4, and lastly, the whole work is concluded in Sect. 31.5.
31.2 Literature Survey Xufeng Huang et al. [19]. rendered a diagnosis technique centered upon Deep Transfers-CNN (DTCNN) together with Extreme Learning Machines (ELM). It amalgamated the synergy of ‘2’ algorithms to cope with benign in tandem with malignant nodules classification. For extracting higher-level features of LN, an optimum DTCNN was adopted. It was trained in the ImageNet in advance. Next, for classifying benign as well as malignant LN, an ELM was developed further. The outcomes exhibited that the DTCNN-ELM attained the performance with 94.57% accuracy, 93.69% sensitivity, 95.15%specificity, together with 94.94% Area Under Curve (AUC). Nevertheless, ELM has some downsides, say over-fitting issues. Alberto Rey et al. [20] offered a CAD that utilized a hybridized design for the investigation of medical images in tandem with soft computing aimed at LN detection. Preprocessing, Regions of Interests (ROI) identification, formation of VOI (volume interests), together with ROI classifications were the main phases. Detection automation was a stage to diminish the ROI False Positives (FP), and also an algorithm to put up the VOI for ameliorating the location along with better detection. The system attained 82% sensitivity and FP of 7.3 per study. Nevertheless, the SVM utilization in the hybrid technique was not appropriate for big datasets. Additionally, it was not extremely well when the data-set has more noise. Xuechen Li et al. [21] recommended solitary feature-centered LN detection. For extracting the texture aspects, a stationary wavelet transform together with convergences index filter was engaged. For forming a white nodules-likeness map, AdaBoost was employed. For assessing the isolation level of candidates, a solo feature was stated. The isolation degree in cooperation with the white nodules-likeness was utilized as the last assessment of LN candidates. Once the FP was ‘2’ and ‘5’, correspondingly, above 80 and 93 (%) of LN on the JSRT were detected. Nevertheless, the standard wavelet transform-centered FE had the downside of poor directionality, shift sensitivity, together with devoid of phase details. Imdad Ali et al. [22] recommended a transferable textured CNN for ameliorating the categorization of LN on CT. An Energy Layer (EL) was included that extracted texture features as of the convolution layer. In the place of a pooling layer, the design encompassed merely ‘3’ convolutional layers together with ‘1’ EL. Aimed at automatic FE and categorization of LN to be malignant or benign, the suggested CNN encompassed ‘9’ layers. 96.69% ± 0.72% accuracy score with just a 3.30% ± 0.72% error rate was obtained. Nevertheless, for processing and training the neural network, a CNN needed a huge dataset. Qinghai Zhang and Xiaojing Kong [23] recommended an effectual identification system aimed at LN centered upon the multiple-scene DL method via the vesselness filter. For ameliorating the radiologist’s awareness in the ‘4’-stage nodules discovery, a ‘4’-channel CNN was designed through integrating ‘2’ image scenes. This could be implemented in ‘2’ disparate classes. It was efficient in augmenting the accuracy as well as considerably lessened FP in a huge data on the LN detection. Nonetheless, the vesselness filter held the downside of bad response uniformity on vascular structures together with vessel suppression.
31.3 Proposed Method The foremost cause of demise due to cancer worldwide is the LC. At the initial stages, LC patients don’t experience any symptoms. Automatic LN detection has huge implications for treating LC and augmenting patient survival rates. Here, PTRNN is proposed aimed at LN detection. (a) Preprocessing, (b) lung segmentation, (c) detection of nodule candidate, (d) deep FE, and (e) classification are the stages that are involved in the proposed LN detection. Primarily, as of the openly accessible data sources, the CT lung images are amassed. Next, the gathered images are preprocessed, wherein (i) Median Filter (MF) is employed for eliminating the noise as well as (ii) Adaptive Histogram Equalizations (AHE) is executed to ameliorate the contrast. After pre-processing, GK-KMA segments the lung areas. Then, with the aid of morphological procedure, nodule candidate detection is performed amongst every available object in the lung area. After that, the ResNet 152 extracts the deep features as of the candidate nodule. Finally, the PTRNN classifies the features (extracted ones). The classification outcomes encompass (i) nodule and (ii) non-nodule classes of data. The PTRNN block diagram is publicized in Fig. 31.1.
31.3.1 Pre-processing Primarily, the inputted CT lung images are amassed as of the openly accessible data sources. After that, they are pre-processed via performing two techniques, i.e., MF for noise removal and AHE for contrast enhancement, which is elucidated below,
31.3.1.1
Median Filter
The MF is employed to eliminate noise as of the images. It is executed by checking pixel by pixel throughout the image and substituting every value with the adjacent
Fig. 31.1 Block diagram of proposed PTRNN
pixels’ median value. It preserves or enhances the edges whilst removing salt and pepper noise. As the intensity value is regarded from the actual image, the image’s brightness is not reduced while eliminating the noise.
31.3.1.2
AHE
AHE is essentially employed to ameliorate the images’ contrast. It is performed through transforming every pixel with a transformation function derived as of a neighborhood area. It is appropriate to ameliorate the local contrast and also augment the edges’ definitions in an image’s every area.
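As an illustration of this pre-processing stage, the sketch below applies a median filter followed by adaptive histogram equalization using SciPy and scikit-image; the kernel size, the clip limit and the synthetic input are illustrative assumptions, not values reported in the chapter.

```python
import numpy as np
from scipy.ndimage import median_filter
from skimage import exposure

def preprocess_ct_slice(img):
    """Median filtering for salt-and-pepper noise, then adaptive histogram
    equalization for local contrast enhancement of a 2-D CT slice."""
    img = img.astype(np.float64)
    img = (img - img.min()) / (np.ptp(img) + 1e-12)      # normalize to [0, 1]
    denoised = median_filter(img, size=3)                 # 3x3 median filter
    enhanced = exposure.equalize_adapthist(denoised, clip_limit=0.03)
    return enhanced

slice_demo = np.random.default_rng(0).random((64, 64))    # stand-in for a CT slice
print(preprocess_ct_slice(slice_demo).shape)
```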
31.3.2 Segmentation Using GK-KMA

Here, the GK-KMA segments the lung in the pre-processed image. For lung area identification, the image is segmented into a number of regions. In the common K-means Algorithm (KMA), the cluster centroids are initialized randomly inside the data space, and every data point is allotted to its nearest centroid by gauging the distance. The Euclidean distance normally employed for this computation is only appropriate for grouping clusters with plain geometry and is inappropriate for managing the complicated, non-Euclidean structure of the input data. A Gaussian Kernel (GK) induced norm is employed in place of the Euclidean norm to overcome this issue, since the kernel function implicitly maps the data into a high-dimensional feature space. This GK-centered KMA for lung area segmentation is termed the GK-KMA. The GK-KMA steps are stated as follows.

Step 1. Consider the input image (X) with p×q resolution; the image is clustered into n clusters, where c_n signifies a cluster center and I(p, q) denotes the input pixels to be clustered. The GK distance (G) between a cluster center and every pixel is given by:

G = 2\{1 - K[I(p, q), c_n]\}    (31.1)
wherein, K [•] signifies the kernel function that can well be denoted as,
K(p, q) = \exp\left(\frac{-\|p - q\|^{2}}{(2\sigma)^{2}}\right)    (31.2)
wherein, σ implies the free parameter. And so, every pixel of the image is allotted to its nearest center based on G.

Step 2. The cluster centers are updated with Eq. (31.3):
c_n = \frac{1}{n}\sum_{p \in c_n}\sum_{q \in c_n} I(p, q)    (31.3)
The process will keep on updating the cluster’s centers and grouping them into the new cluster. The loop will end once the cluster’s center becomes fixed. Next, the pixels of every cluster are reshaped into the image that holds the lung region. Consequently, the image (clustered) can well be signified as C; as of this, the nodules can well be additionally detected.
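The kernel-induced distance of Eqs. (31.1)–(31.3) can be sketched as below; this simplified illustration clusters pixel intensities only, and the number of clusters, σ and the iteration limit are assumptions rather than the chapter's settings.

```python
import numpy as np

def gk_kmeans(pixels, n_clusters=3, sigma=0.5, iters=50, seed=0):
    """Gaussian-kernel K-means on a 1-D array of pixel intensities.
    Distance follows G = 2*(1 - K(x, c)) with K a Gaussian kernel, Eqs. (31.1)-(31.2)."""
    rng = np.random.default_rng(seed)
    centers = rng.choice(pixels, size=n_clusters, replace=False).astype(float)
    for _ in range(iters):
        k = np.exp(-(pixels[:, None] - centers[None, :]) ** 2 / (2 * sigma) ** 2)
        dist = 2.0 * (1.0 - k)                        # kernel-induced distance G
        labels = np.argmin(dist, axis=1)               # assign each pixel to its nearest center
        new_centers = np.array([pixels[labels == c].mean() if np.any(labels == c)
                                else centers[c] for c in range(n_clusters)])
        if np.allclose(new_centers, centers):          # stop when the centers become fixed
            break
        centers = new_centers
    return labels, centers

image = np.random.default_rng(1).random((32, 32))      # stand-in for a pre-processed CT slice
labels, centers = gk_kmeans(image.ravel(), n_clusters=3)
print(centers)
```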
31.3.3 Detection of Nodule Candidates

Subsequent to the segmentation, the nodule candidates among the available objects in the lung area are detected using a morphological process. The steps of the morphological process are specified below.
• The unwanted objects are smaller in size than the structuring element, and erosion is executed to eradicate them. The unwanted pixels are removed, and the net outcome is a sharpening of the objects in the image. The erosion of image C by structuring element Y is expressed as:

C \ominus Y = \{z \mid (Y)_z \subseteq C\}    (31.4)

wherein z denotes a translation of the structuring element within the image.
• For filling the missing pixels, dilation is performed. It adds pixels at the object's boundaries, which affects the intensity; consequently, a blurring effect can be observed. The dilation of image C by structuring element Y is expressed as:

C \oplus Y = \{z \mid (Y)_z \cap C \neq \emptyset\}    (31.5)
31.3.4 Feature Extraction Using ResNet 152 For additional processing, ResNet 152 is employed for extracting the deep features of the nodule candidate. ResNet 152 stands as a deep residual pre-trained CNN having identical structures; however, conflicting depths equal 152 layers. For training good
Fig. 31.2 Architecture of ResNet 152
features, an appropriate depth is still vital. ResNet 152 has reported improved accuracy along with low training times. The main principle of residual learning is exhibited in Fig. 31.2. With the aid of the fully stacked layers (depth), the feature level can be improved. The residual function (δ) is given as δ = Ñ_c − N_c
(31.6)
wherein Ñ_c signifies the parametric mapping learned directly via training. Therefore, augmenting the network's depth to a certain level can extract features with strong representational ability. The ResNet output is the set of extracted features (F_i), given by F = (F_1, F_2, …, F_m)
(31.7)
wherein m is the total number of extracted features.
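An illustrative feature-extraction sketch is shown below (assumption: PyTorch/torchvision; the paper does not specify a framework). The final fully connected layer is dropped so that the pooled 2048-dimensional activations serve as the feature vector for each candidate patch.

```python
# ResNet-152 feature-extraction sketch (PyTorch/torchvision assumed).
import torch
import torch.nn as nn
from torchvision import models

resnet = models.resnet152(pretrained=True)   # pre-trained ResNet-152
resnet.fc = nn.Identity()                    # keep everything up to global average pooling
resnet.eval()

patch = torch.randn(1, 3, 224, 224)          # a nodule-candidate patch resized to 224x224
with torch.no_grad():
    features = resnet(patch)                 # shape: (1, 2048) -> (F_1, ..., F_m)
```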
31.3.5 Classification Using PTRNN The extracted features (F_i) are categorized by the PTRNN into '2' classes: nodule and non-nodule data. To predict a layer's output, a Recurrent Neural Network (RNN) operates on the principle of saving that layer's output and then feeding it back to the input layer. The output is influenced by the weights applied to the input and the hidden states, together with the bias. The weight and bias values are normally initialized randomly,
which brings about slower computation. Thus, the CMWOA is utilized to tune the RNN's bias and weight parameters. Additionally, the Activation Function (AF) in the common RNN is a parametric function that converges to the sign function as the parameter goes to infinity; thus, a novel AF is employed for computing the hidden layer's final output. The RNN with these tuned parameters is labeled the PTRNN. The RNN updates its recurrent state h(t) by
h(t) = 0 if t = 0, and h(t) = υ[h(t − 1), F(t)] otherwise
(31.8)
wherein υ signifies the novel AF, h(t − 1) is the preceding state, and F(t) is the input vector at each time step t. The novel AF (υ) utilized here is given as
υ = F(t) / (1 + [F(t)]^(4s−2))^(1/(4s−2))
(31.9)
wherein s = 1, 2, …, n. The present hidden state h(t) and the output state y(t) at time step t are computed by
h(t) = υ_h(w_h F(t) + u_h h(t − 1) + b_h)
(31.10)
y(t) = υ_y(w_y h(t) + b_y)
(31.11)
wherein w_h and w_y characterize the input-to-hidden and hidden-to-output weight matrices, u_h denotes the recurrent weights between the hidden layers of two adjacent time steps, b_h and b_y are the biases, and υ_h and υ_y indicate the novel AF in the hidden and output regions. The CMWOA optimizes these weight and bias values, as elucidated in the section below. The PTRNN's output comprises the nodule and non-nodule data; in this way, the LN is classified from the CT images.
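A single time step of this recurrence, with the novel activation of Eq. (31.9), can be sketched as follows (plain NumPy; the weight shapes are assumptions and the CMWOA tuning of the weights and biases is omitted).

```python
# One PTRNN time step (sketch; in the paper the weights and biases are tuned by the CMWOA).
import numpy as np

def novel_af(x, s=1):
    """Novel activation of Eq. (31.9): x / (1 + x^(4s-2))^(1/(4s-2))."""
    p = 4 * s - 2
    return x / np.power(1.0 + x ** p, 1.0 / p)

def ptrnn_step(f_t, h_prev, w_h, u_h, b_h, w_y, b_y, s=1):
    h_t = novel_af(w_h @ f_t + u_h @ h_prev + b_h, s)   # Eq. (31.10)
    y_t = novel_af(w_y @ h_t + b_y, s)                   # Eq. (31.11)
    return h_t, y_t
```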
31.3.5.1 Optimization Using CMWOA
Here, the CMWOA optimizes the PTRNN's weight and bias values. The Whale Optimization Algorithm (WOA) is a relatively new swarm-intelligence optimization algorithm inspired by the unique hunting behavior of humpback whales, and it is normally utilized for continuous function optimization problems. When solving intricate optimization problems, the standard WOA suffers from premature convergence and limited local search capability. To augment the standard WOA's ability to escape from local optima, a Cauchy Mutation (CM) is
incorporated in it. This CM-incorporated WOA is called the CMWOA. It uses '3' rules to update and improve the candidate solutions' positions in every step, as elucidated below. Step 1. Encircling the prey. The CMWOA presumes that the present best candidate solution (weight and bias), denoted as Y_{w,b} with w = (w_h, w_y) and b = (b_h, b_y), is the target prey. The candidate solution's position (d) is updated by the subsequent equations,
d = |a · Ỹ_{w,b}(t) − Y_{w,b}(t)|
(31.12)
Y_{w,b}(t + 1) + C_m(Y_{w,b}) = Ỹ_{w,b}(t) − b · d
(31.13)
wherein t signifies the current iteration, a = 2cr − c and b = 2r are coefficients in which c decreases from 2 to 0 over the iterations and r is a Cauchy random vector in [0, 1], Ỹ_{w,b}(t) denotes the best solution's position, which is updated in every iteration if a better solution is found, and C_m(Y_{w,b}) is the CM operator. Step 2. The population generated by the CM operator differs considerably from its parents. The bound of the random value becomes wider, so more chances are offered for a solution to escape from the local optima. The Cauchy distribution function is rendered by
C_m(Y_{w,b}) = 1/2 + (1/π) arctan(Y_{w,b} / k)
(31.14)
wherein k > 0 signifies the scale factor. Step 3. Spiral updating of the position (if p ≥ 0.5), wherein p signifies a Cauchy random number in [0, 1]. A spiral equation is generated between the whale's and the prey's positions to imitate the helix-shaped movement, as exhibited in Eq. (31.15)
Y_{w,b}(t + 1) + C_m(Y_{w,b}) = d(j) · e^{zl} · cos(2πl) + Ỹ_{w,b}(t)
(31.15)
wherein d(j) = |Ỹ_{w,b} − Y_{w,b}| is the distance between the jth whale and the best solution, z is the constant defining the logarithmic spiral shape, and l is a Cauchy random number in [−1, 1]. Step 4. Searching for prey. A randomly chosen candidate solution Y_{w,b}^{rand}(t), as indicated below, is utilized for searching the prey.
d = |a · Y_{w,b}^{rand}(t) − Y_{w,b}(t)|
(31.16)
Y_{w,b}(t + 1) + C_m(Y_{w,b}) = Y_{w,b}^{rand}(t) − b · d
(31.17)
The CMWOA commences with a set of random solutions. In every iteration, each search agent updates its location with respect to either a randomly chosen search agent or the best solution attained so far.
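One position update of this scheme can be sketched as below (an assumption-laden illustration: fitness evaluation and the random-agent search of Step 4 are omitted, and all shapes and constants are illustrative).

```python
# One CMWOA position update (sketch only; not the authors' implementation).
import numpy as np

rng = np.random.default_rng(0)

def cauchy_mutation(y, k=1.0):
    """Cauchy mutation operator of Eq. (31.14)."""
    return 0.5 + np.arctan(y / k) / np.pi

def cmwoa_step(positions, best, t, t_max, z=1.0):
    c = 2.0 * (1.0 - t / t_max)                       # c decreases from 2 to 0
    updated = np.empty_like(positions)
    for j, y in enumerate(positions):
        r = rng.random(y.shape)
        a, b = 2.0 * c * r - c, 2.0 * r
        if rng.random() < 0.5:                        # encircling prey, Eqs. (31.12)-(31.13)
            d = np.abs(a * best - y)
            updated[j] = best - b * d - cauchy_mutation(y)
        else:                                         # spiral update, Eq. (31.15)
            l = rng.uniform(-1.0, 1.0)
            d_j = np.abs(best - y)
            updated[j] = d_j * np.exp(z * l) * np.cos(2 * np.pi * l) + best - cauchy_mutation(y)
    return updated
```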
31.4 Result and Discussion The results of the proposed segmentation and classification models for LN detection are rendered here. The publicly accessible LIDC/IDRI dataset is employed for the performance analysis, and the proposed method is executed in MATLAB. Initially, Table 31.1 tabulates the segmentation outcomes of the proposed and existing clustering methods (KMA, Fuzzy C-Means (FCM), and K-medoids clustering). These methods are contrasted on the metrics precision, F-measure, recall, and accuracy. As seen from Table 31.1, the best clustering performance is achieved by the GK-KMA on all metrics. The proposed method attains a precision of 95.63 and a recall of 94.89, which are greater than the precision and recall attained by the existing methods. The F-measures of the existing KMA, FCM, and K-medoids are 87.56, 83.25, and 78.32, whilst the F-measure acquired by the GK-KMA is 95.82. Similarly, the GK-KMA obtains an accuracy of 96.26, the greatest among all the compared lung-region segmentation techniques. It is therefore concluded from these outcomes that, compared with the existing approaches, the GK-KMA performs well for segmenting the lung regions from the pre-processed CT images with the best clustering accuracy. Then, the performance of the proposed classification algorithm for LN detection is appraised by contrasting it with the prevailing algorithms, namely RNN, Deep Convolutional Neural Network (DCNN), and Support Vector Machine (SVM). These techniques are contrasted on the metrics sensitivity, specificity, accuracy, error rate, training time, and testing time. The comparison results concerning sensitivity, specificity, accuracy, and error rate are exhibited in Fig. 31.3. From Fig. 31.3, when analyzing the sensitivity and specificity metrics, the proposed PTRNN attains 95.68 and 96.54, which are higher than those of RNN, DCNN, and SVM. Similarly, the proposed PTRNN's accuracy is very high when contrasted to the others.

Table 31.1 Results of clustering techniques for lung segmentation

Techniques        Performance metrics (%)
                  Precision   Recall   F-measure   Accuracy
Proposed GK-KMA   95.63       94.89    95.82       96.26
KMA               89.85       88.45    87.56       89.63
FCM               82.36       84.89    83.25       85.23
K-medoids         78.02       76.98    78.32       78.62
Fig. 31.3 Performance of classifiers regarding sensitivity, specificity, accuracy, and error rate
It attains the greatest accuracy of 96.45, while accuracies of 92.87, 89.65, and 83.69 are obtained by the existing RNN, DCNN, and SVM. For nodule classification, the error rate is minimal for the proposed method: it obtains an error of only 3.55, while errors of 7.13, 10.35, and 16.31 are produced by the existing RNN, DCNN, and SVM. These results exhibit the performance efficiency of the PTRNN for LN detection, with higher accuracy and a lower error rate. Figure 31.4 shows the techniques' performance as measured by the AUC scores. When contrasted with the existing RNN (94.58), DCNN (90.25), and SVM (86.98), the AUC of the PTRNN is the highest (98.56). From all the results it is deduced that the proposed GK-KMA and PTRNN together attain effective lung-region segmentation and LN detection with maximum accuracy.
Fig. 31.4 AUC score of the classifiers
31.5 Conclusion This paper renders a PTRNN for effectual LN detection. The proposed method's performance is contrasted with the prevailing techniques to weigh up its efficiency. The proposed classifier's performance is examined in terms of sensitivity, specificity, accuracy, error rate, and AUC, and the clustering technique of the proposed work is assessed in terms of precision, accuracy, F-measure, and recall. The proposed PTRNN delivers higher sensitivity (95.68%), specificity (96.54%), accuracy (96.45%), and AUC (98.54%), together with a low error rate of 3.55. Similarly, the GK-KMA conveys precision (95.63%), accuracy (96.26%), F-measure (95.82%), and recall (94.89%) that are high with regard to the customary methods. From this, it is deduced that the PTRNN-centered LN detection is better when weighed against the prevailing ones. In the future, the work will be extended by using advanced clustering techniques.
Chapter 32
Advertisement Click Fraud Detection Using Machine Learning Algorithms Bhargavi Mikkili and Suhasini Sodagudi
Abstract Online advertising has become one of the important funding models that support websites. Advertisers pay the publisher a set amount of money for each click on their ad (which leads to the advertiser's website). Because Internet advertising involves such large sums of money, malevolent parties try to obtain an unfair edge. To raise money illegally, attackers use "click fraud", in which a person or script repeatedly clicks on a specific link with the motive of earning money illegally. Click fraud occurs when a script exploits Internet marketers by repeatedly clicking on a pay-per-click (PPC) advertisement to generate fake charges. Several models exist to detect click fraud by checking, whenever a person or a computer program clicks on a specific link, whether the user is legal or illegal, for example by calculating the click-through rate, in order to provide security. This paper proposes a machine learning approach to identify false clicks and differentiate illegal users from genuine users; for this, logistic regression and Gaussian naive Bayes classifier algorithms are used to classify clicks, with respective accuracies of 99.76% for logistic regression and 99.23% for Gaussian naive Bayes.
32.1 Introduction The popularity of the Internet has exploded in recent decades. For advertisers, the rise of the web-based online advertisement industry has opened up a slew of new lead generation, brand awareness, and electronic commerce opportunities. Advertisement is the method of distributing a message about ideas, products, and services to the public through the media for a fee charged by a specific sponsor. Advertising can be done in almost any medium, either physical, such as a newspaper, or digital, using the Internet. In a pay-per-click (PPC) arrangement, the price the advertiser pays is typically based on each click the ad receives on a publisher's website. The popularity of the Internet also creates a
brand-new gateway for illegal users to exploit these advertisements to make money simply through click fraud, that is, by repeatedly clicking on a specific link to gain money illegally. Detecting duplicate clicks over decreasing windows, such as jumping windows and sliding windows, is a critical problem in detecting click fraud, and decaying window models can be extremely useful for identifying it. Ad fraud thrives by diverting money from advertisement transactions, and it most often occurs in the form of domain spoofing. The most popular AI techniques used for fraud detection process the data to identify, cluster, and segment the information and automatically find connections and rules within it. Machine learning can reduce fraud before it has an effect on the victim. Combining machine learning and AI, firms can detect data anomalies in 5–10 ms and make decisions based on the data as it occurs, even predicting outcomes. Logistic regression is a valuable analytic technique since aspects of cyber security, such as attack detection, are classification issues. Complex machine learning models working in high-dimensional feature spaces are typically used to detect fraud, and naive Bayes is a strong model in such feature spaces. If fraud could be discovered in real time, ad networks and advertisers would be better positioned to punish offenders and take preventative action. Such models are easy to construct; despite their inexpensive cost, they would likely have poor precision and recall on their own. It is ideal to have a fraud detection system that optimizes for minimal cost while giving excellent precision and recall.
32.2 Literature Survey This section illustrates the different project-related works. The following are some of the papers that were studied and summarized: Mouawi et al. [1] proposed a model including KNN, ANN, and SVM classifiers; they used self-designed features based on a variety of pattern recognition theories. Pearce et al. [2] characterized large-scale click fraud in the ZeroAccess botnet. A malware protection mechanism from MMPC (Microsoft Malware Protection Center) was discussed by Tommy Blizard, Nikola Livic, and others [3], and an analysis of hapili was done manually.
32.3 Proposed Methodology The proposed model is used to identify fraudulent digital advertising clicks that result in illegal revenue. It was created to resolve an existing problem through a thorough analysis and a highly accurate algorithm, and the design was built following extensive research and testing. The dataset has been thoroughly used for training and testing to produce the best outcome (Fig. 32.1). The system's overall architecture and methodology are carried out in the following steps: to train any model, the data first needs to undergo some preprocessing steps to
Fig. 32.1 Architecture diagram
prepare the raw data and convert it into a form suitable for a machine learning model. The data is then split into 80% training data and 20% testing data. The data is trained and tested on machine learning models, namely Gaussian naive Bayes and logistic regression. The model with the highest accuracy is taken as the one which performs well on the data, and the performance of each algorithm is measured using evaluation metrics.
32.3.1 Overview of Dataset The dataset is taken from the Kaggle platform [4]. It has the following attributes: ip: IP address of the click; app: app id for marketing; device: device type id of the user's mobile phone (e.g., iPhone 6 Plus, iPhone 7, Huawei Mate 7, etc.); os: OS version id of the user's mobile phone; channel: channel id of the mobile ad publisher; click_time: timestamp of the click (UTC); attributed_time: if the user downloads the app after clicking an ad, this is the time of the app download; is_attributed: the target to be predicted, indicating whether the app was downloaded. The dataset consists of 100,000 randomly selected rows of training data, used to inspect the data before downloading the full set for training and testing of the model. The dataset is formed with a combination of legal and illegal clicks.
32.3.2 Algorithm Building Data exploration, data analysis and preprocessing, and data forecasting are the three important parts that make up the entire algorithm. All of the components are executed in order, one after the other, and are described in detail below.
32.3.2.1 Data Preprocessing
During the data preprocessing step, the model corrects missing or null data, filters out inaccurate data, and minimizes irrelevant data. Subtasks such as aggregation, smoothing, feature creation, summation, normalizing, and generalization are performed. After sorting, the data is converted into numerical form using a label encoder; the null or zero values in the dataset then need to be replaced, for which imputation is performed using the mean of the values. Two correlation measures, Pearson and Spearman, are used to evaluate the strength of the relationship between two quantitative variables. The above-mentioned processes are used to predict the results accurately: the count of clicks on the ad is monitored daily, and the frequency of each click is calculated precisely in combination with the proposed methodology.
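A possible preprocessing sketch is shown below (pandas/scikit-learn; the file name and the derived hour feature are assumptions, not taken from the paper).

```python
# Preprocessing sketch: label encoding, mean imputation, and correlation check.
import pandas as pd
from sklearn.preprocessing import LabelEncoder

df = pd.read_csv("train_sample.csv")                    # Kaggle TalkingData sample [4]
df["click_time"] = pd.to_datetime(df["click_time"])
df["hour"] = df["click_time"].dt.hour                   # simple time-based feature

for col in ["ip", "app", "device", "os", "channel"]:    # label-encode categorical ids
    df[col] = LabelEncoder().fit_transform(df[col].astype(str))

df = df.fillna(df.mean(numeric_only=True))              # mean imputation of null values
print(df[["ip", "app", "device", "os", "channel", "hour",
          "is_attributed"]].corr(method="pearson"))     # or method="spearman"
```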
32.3.2.2 Analyze Exploratory Data
Exploratory Data Analysis (EDA) is the most important step in analyzing the dataset. It comprehends and consolidates the dataset's contents and visualizes them to recognize patterns and features that have been defined. The dataset size is 3.89 MB, and the following data are tracked for analysis.
32.3.2.3 Logistic Regression Model
A logistic regression (LR) model is used to classify samples based on a large number of features with a binary output (true/false), because it gives an easy method for categorizing problems into binary or even multiple classes. Hyperparameter tuning was used to acquire the best results for each dataset, and numerous parameters were tested before obtaining the greatest accuracies from the LR model. Mathematically, the logistic regression hypothesis function, which transforms the output into a probability value using a sigmoid function, is as follows:
h_0(X) = 1 / (1 + e^{−(β_0 + β_1 X)})
(32.1)
The goal is to obtain the best probability by minimizing the cost function, which is calculated as shown in Eq. (32.2):
Cost(h_0(x), y) = −log(h_0(x)) if y = 1, and −log(1 − h_0(x)) if y = 0
(32.2)
32.3.2.4 Gaussian Naive Bayes Algorithm
Gaussian naive Bayes is a naive Bayes variant that supports continuous data and assumes it follows a Gaussian (normal) distribution. The Bayes theorem is the basis for the group of supervised machine learning classification algorithms known as naive Bayes. It is a simple classification technique with a lot of power, useful when the dimensionality of the inputs is high, and the naive Bayes classifier can also be used to solve complex classification problems. In Gaussian naive Bayes, the likelihood of the features is assumed to be Gaussian:
P(x_i | y) = (1 / √(2πσ_y²)) exp(−(x_i − μ_y)² / (2σ_y²))
(32.3)
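Reusing the data frame from the preprocessing sketch above, both classifiers can be trained as follows (a hedged scikit-learn sketch; the feature list and the split are assumptions consistent with the paper's description).

```python
# Training sketch for the two classifiers, reusing df from the preprocessing sketch.
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split

X = df[["ip", "app", "device", "os", "channel", "hour"]]
y = df["is_attributed"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

models = {"Logistic regression": LogisticRegression(max_iter=1000),
          "Gaussian naive Bayes": GaussianNB()}
for name, clf in models.items():
    clf.fit(X_train, y_train)
    print(name, "accuracy:", clf.score(X_test, y_test))
```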
32.3.2.5 Performance Metrics
Confusion matrix: We used a number of different techniques to measure algorithm performance, most of which are based on the confusion matrix. The confusion matrix is a four-parameter table that shows the results of a classification model on the test set: true positives, false positives, true negatives, and false negatives (Table 32.1). For logistic regression, the confusion matrix values are 29,919 for actual true and predicted true, 81 for actual false and predicted true, and 0 for both actual false and predicted false and actual true and predicted false. For Gaussian naive Bayes, the values are 29,762 for actual true and predicted true, 73 for actual false and predicted true, 8 for actual false and predicted false, and 157 for actual true and predicted false.

Table 32.1 Confusion matrix

               Predicted true        Predicted false
Actual true    True positive (TP)    False negative (FN)
Actual false   False positive (FP)   True negative (TN)

Accuracy: Accuracy is a commonly used metric that represents the percentage of correctly predicted true or false observations. The following equation can be used to
measure the accuracy of a model’s performance. The accuracy obtained for logistic regression as 1.00 and accuracy for Gaussian naive Bayes as 1.00 as shown below Accuracy =
TP + TN TP + TN + FP + FN
(32.4)
Recall: Recall is the proportion of positive-class observations that are classified correctly. In our situation, it measures the percentage of clicks predicted to be true out of the total number of true clicks.
Recall = TP / (TP + FN)
(32.5)
Precision: The precision score, on the other hand, is the proportion of true positives to all instances predicted positive; here, it is the number of clicks labeled as true out of all the positively predicted (true) clicks:
Precision = TP / (TP + FP)
(32.6)
F1-Score: The F1-score is a measure of the precision versus recall trade-off. It calculates the harmonic mean of the two and therefore takes both false positive and false negative observations into account:
F1-Score = 2 · (Precision · Recall) / (Precision + Recall)
(32.7)
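All four metrics can be computed directly from the confusion matrix, as in the following sketch (scikit-learn assumed; `clf`, `X_test`, and `y_test` come from the training sketch above).

```python
# Evaluation sketch using one of the fitted classifiers from the training sketch.
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score, f1_score)

y_pred = clf.predict(X_test)
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
print("confusion matrix (tn, fp, fn, tp):", tn, fp, fn, tp)
print("accuracy :", accuracy_score(y_test, y_pred))                    # Eq. (32.4)
print("recall   :", recall_score(y_test, y_pred, zero_division=0))     # Eq. (32.5)
print("precision:", precision_score(y_test, y_pred, zero_division=0))  # Eq. (32.6)
print("f1-score :", f1_score(y_test, y_pred, zero_division=0))         # Eq. (32.7)
```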
32.4 Results and Discussion

Confusion matrix: See Tables 32.2 and 32.3.

Table 32.2 Confusion matrix for logistic regression

               Predicted true   Predicted false
Actual true    29,919           0
Actual false   81               0

Table 32.3 Confusion matrix for Gaussian Naïve Bayes

               Predicted true   Predicted false
Actual true    29,762           157
Actual false   73               4
The suggested model’s procedure of detecting legal and criminal users concludes with the data prediction phase. This is how the prediction process works, an algorithms known as Gaussian naive Bayes classifier and logistic regression. It is employed the Gaussian naive Bayes classifier and logistic regression, which outperforms all other classification and regression techniques. It employs a depth-first method to tree pruning, which effectively manages missing data while avoiding over fitting. It employs a parallelized tree-building strategy. It has been shown to be ten times more effective than current packages with a gradient 75% of the dataset is utilized to train the algorithm in the prediction stage, while the remaining 25% is used to test and compare the results. The dataset is divided into two segments (Train X and Train Y) for the algorithm’s training. The trained algorithm is then used to generate Test Y, which is subsequently compared to the Y component of the train dataset to forecast the model’s accuracy. The graphics be low are used to categorizes users as either fraudulent or non-fraudulent. It is easier to comprehend a graphic representation of the conclusion than it is to comprehend a written version. The output is an array in which each transaction is represented as a number with a 0 or 1, with 0 being a lawful user and 1 representing a fraudulent or unlawful user. A report on classification on the prediction can be produced to determine the detection model’s recall, accuracy, and precision. Given the significant amount of fraud that occurs and is generally caught only after the fraud has happened, the results highlight the need of having a fraud detection and prevention strategy. We can learn about previous attacks and prevent them from happening again in the future using this analysis (Figs. 32.2 and 32.3; Tables 32.4 and 32.5). We compare the accuracy for both algorithms and take the highest accuracy algorithm as the best model to classify the click fraud addresses. The outcome demonstrates the significance of building a fraud detection system and hindrance system because the amount of fraud that occurs is significant, and it is usually found only after the scam has occurred. We will be able to recognize previous attacks and prevent them in the future using this research (Fig. 32.4; Tables 32.6, 32.7 and 32.8).
Fig. 32.2 Actual true values for logistic regression and Gaussian Naïve Bayes
Fig. 32.3 Actual true values for logistic regression and Gaussian Naïve Bayes

Table 32.4 Classifier accuracy for click fraud detection

Algorithms             TPR    FPR    TNR    FNR    Accuracy
Logistic regression    0      0      0      1      1
Gaussian Naïve Bayes   0.99   0.68   0.31   2.68   0.99

Table 32.5 Accuracy comparison of two algorithms

Model name             Accuracy (%)
Gaussian Naïve Bayes   99.23
Logistic regression    99.76

Fig. 32.4 Classifier accuracy
Table 32.6 Calculation of performance metrics for both logistic regression and Gaussian Naïve Bayes

Performance metrics   Logistic regression                                          Gaussian Naïve Bayes
Accuracy              (29919 + 81)/(29919 + 81 + 0 + 0) = 30000/30000 = 1.0        (29762 + 73)/(29762 + 73 + 157 + 8) = 29835/30000 = 1.0
Precision             29919/(29919 + 0) = 1.0                                      29762/(29762 + 157) = 0.99
Recall                29919/(29919 + 0) = 1.0                                      29762/(29762 + 8) = 0.99
F1-Score              2 · (1.0 · 1.0)/(1.0 + 1.0) = 1.0                            2 · (0.99 · 0.99)/(0.99 + 0.99) = 0.99

Table 32.7 Precision, recall, and F1-score for Gaussian Naïve Bayes

Performance   Precision   Recall   F1-Score
0             1           0.99     1.0
1             0.5         0.10     0.7

Table 32.8 Precision, recall, and F1-score for logistic regression

Performance   Precision   Recall   F1-Score
0             1           1        1
1             0           0        0
We summarize each algorithm’s accuracy on the single dataset under consideration. It is evident that the maximum accuracy achieved on dataset is 99.23%, achieved by logistic regression algorithm and Gaussian naive Bayes classifiers achieved an accuracy of 99.78%. Summarize each algorithm’s recall, precision, and F1score across the dataset. Logistic regression produced the best results in terms of average precision. The average performance of learning algorithms on the dataset is represented graphically using precision, recall, and F1-score. Except for logistic regression and Gaussian naive Bayes classifier, there is little difference in the performance of learning algorithms employing multiple performance criteria.
32.5 Conclusion The amount of ad click fraud has been determined to be considerable, and recent figures show that the problem is common and likely to increase in the future. The proposed methodology was therefore created to detect and reduce malware that
earns money through click fraud. Such a system is required because thieves profit from click fraud and the problem is widespread. The proposed solution has resolved the difficulty to the greatest extent possible and offers an accurate result.
References 1. Mouawi, R., El Hajj, I.H.: Towards a machine learning approach for detecting click fraud mobile advertising. In: 13th International Conference on Innovations in Information Technology (2018) 2. Pearce, P., Dave, V., Grier, C., Levchenko, K., Guha, S., McCoy, D., Voelker, G.M.: Characterizing large-scale click fraud in ZeroAccess. In: Proceedings of the ACM SIGSAC Conference on Computer and Communications Security, pp. 141–152, Arizona, USA, ACM (2014) 3. Blizard, T., Livic, N.: Click-fraud monetizing malware: a survey and case study. In: 2012 7th International Conference on Malicious and Unwanted Software (MALWARE), pp. 67–72, Puerto Rico, USA, IEEE 4. https://www.kaggle.com/c/talkingdata-adtracking-fraud-detection/data?select=train_sample. csv
Chapter 33
Wind Power Prediction Using Time Series Analysis Models Bhavitha Katari and Sita Kumari Kotha
Abstract Wind power prediction is the estimation of the energy generated by one or more wind farms; production is usually expressed in units that depend on the wind farm's capacity. Forecasting energy helps decision makers know the volume and any underlying trends of future energy consumption so that the supply system can be better organized. Various prediction models have been developed for wind power prediction, but it is still important to consider time while predicting the energy from wind farms. This paper aims to predict the wind energy generated from four different wind farm datasets; for this, Random Forest Regression, Autoregressive Integrated Moving Average (ARIMA), and Long Short-Term Memory (LSTM) algorithms are used to predict the amount of energy, and the model which works best on the different datasets is chosen using error metrics.
33.1 Introduction Wind energy is the energy generated from the wind. It is one of the natural energy resources often used by developed nations, and today it is one of the most established renewable energy sources. Wind energy is generated by the rotation of turbine blades attached to a wind turbine; the energy from the rotating blades is converted into electrical energy using an electrical generator. We need to forecast the amount generated from wind turbines for the subsequent 48–72 h, and for extended time scales of up to 5–7 days; with such forecasting, maintenance of a wind farm becomes less costly. Forecasting wind energy depends on different factors, for example temperature, humidity, and wind direction, which can affect the energy produced, so it is necessary to take these factors into consideration when predicting the wind energy [1]. Prediction of wind power generation can be done using either direct models or indirect models, where direct models use past values
of energy as input data for the prediction model, from which the predicted value is obtained, while indirect prediction models predict the energy using the wind speed and a power curve that converts wind speed into power output [2]. This paper aims to predict the wind energy generated over the next few days from four different companies' data and to compare how the models work on these datasets, each of which consists of a measure of the energy generated every day over a year. Time series forecasting models, namely random forest regression, autoregressive integrated moving average, and the long short-term memory algorithm, are used to predict the energy, and the error values for each dataset are used to choose which algorithm works well on each dataset.
33.2 Related Work In one study, the authors proposed an LSTM prediction model to forecast the wind power for the next 24 h; numerical weather prediction (NWP) data is used to predict the wind power, and Principal Component Analysis is used to reduce the dimensions of the input variables and to choose the samples. Simulation results comparing the proposed algorithm with traditional models such as a BP neural network and a support vector machine (SVM) show that the LSTM prediction model has the highest accuracy among them [3]. The random forest algorithm has been applied to short-term wind power prediction; it has many advantages, such as fewer parameters to adjust, high prediction precision, and better generalization, and using random forest in wind power prediction effectively captures nonlinearity and interactions in the data [4]. An ARIMA model has been proposed and used to predict the future price of gold to reduce the risk of purchasing gold; the model is trained on past observed gold values from November 2003 to January 2014. Forecasting with the ARIMA model has limitations: the technique is proposed for the short run, and if there is any sudden change in the data, for example changes in policies provided by the government, the model becomes ineffective [5].
33.3 Proposed Methodology 33.3.1 Wind Power Generation Dataset The dataset used to predict wind energy was taken from the Kaggle wind power generation dataset, which is a collection of data from four German energy companies (50 Hz, Amprion, TenneTTSO, and TransnetBW). The wind energy dataset is a time series that contains non-normalized power generation data with an interval of 5 min, 96 points in total (Fig. 33.1).
Fig. 33.1 Architecture diagram
The wind power generation data is pre-processed to delete any unnecessary data and null values. Exploratory data analysis is used to analyze the data and summarize its main characteristics. The dataset is divided into 70% training data and 30% test data. The model is built on the training data so that it learns to predict future values from previous observations. Here, random forest regression, autoregressive integrated moving average (ARIMA), and long short-term memory models are used to predict the future values of the energy generated by the four companies, and the algorithm with the lowest error is considered the one which works best.
33.3.2 Time Series Forecasting Time series forecasting is the prediction of future trends in time series data, that is, data points collected at specific intervals of time. These points are analyzed to find future trends or to better understand the data.
33.3.3 Random Forest Regression Breiman’s work of random forest got inspiration from the work of Amit and German who introduce the searching for a value over a random sampled subset of decisions while splitting a node to from a single tree [6]. Random forest regression is the combination of two are more decision trees the final output from a random forest is
Fig. 33.2 Random forest algorithm
the mean of all the outputs. The following steps describe the working of the random forest algorithm [4]. Step 1: Extract k random samples, along with their classification trees, from the original training data N using the bootstrap method. Step 2: From the k samples, select m variables randomly at each node of a tree; these m variables are examined against a threshold value to choose the splitting variable. Step 3: Each tree is grown to its maximum size and no branches are pruned. Step 4: The final random forest classifies new data using all the trees. Figure 33.2 illustrates its working. The root mean square error is used to measure the performance of the random forest regression algorithm:
RMSE = √( Σᵢ₌₁ⁿ (Yᵢ − Ŷᵢ)² / n )
(33.1)
RMSE is the root mean square error, the square root of the mean of squared differences between the predicted values Ŷᵢ and the observed values Yᵢ, where n is the number of observations in the data and i indexes them.
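A hedged scikit-learn sketch of this approach is given below; the CSV name, column name, and the 7-day lag window used to turn the series into a supervised problem are assumptions, not the authors' exact setup.

```python
# Random-forest forecasting sketch with lagged features and RMSE evaluation.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

series = pd.read_csv("amprion_daily.csv")["power"]          # hypothetical daily series
frame = pd.DataFrame({f"lag_{k}": series.shift(k) for k in range(1, 8)})
frame["target"] = series
frame = frame.dropna()

split = int(len(frame) * 0.7)                               # 70 % train / 30 % test
X_train, y_train = frame.iloc[:split, :-1], frame.iloc[:split, -1]
X_test, y_test = frame.iloc[split:, :-1], frame.iloc[split:, -1]

rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_train, y_train)
rmse = np.sqrt(mean_squared_error(y_test, rf.predict(X_test)))   # Eq. (33.1)
print("RMSE:", rmse)
```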
33.3.4 Autoregressive Integrated Moving Average (ARIMA) Herman Wold introduced Autoregressive Moving Average (ARMA) models for stationary series; an autoregressive integrated moving average (ARIMA) model combines an autoregressive (AR) model and a moving average (MA) model with an integrated (I) differencing term. The ARIMA model is used to predict future values from past values and can be used either to better understand the data or to forecast. The underlying process that generates the time series has the form of the equation below [5]:
y_t = θ_0 + ϕ_1 y_{t−1} + ϕ_2 y_{t−2} + ⋯ + ϕ_p y_{t−p} + ε_t − θ_1 ε_{t−1} − θ_2 ε_{t−2} − ⋯ − θ_q ε_{t−q}
(33.2)
where y_t and ε_t are the actual value and the random error at time t, respectively, and ϕ_i (i = 1, …, p) and θ_j (j = 0, …, q) are the model parameters; p and q are integer orders of the model. The random errors ε_t are assumed to be distributed with mean zero and a constant variance σ². The ARIMA model converts the historical data into a weighted moving average over past observations: the Autoregressive (AR) part represents the prediction of future values from the historical data points, Integrated (I) represents any differencing term, and Moving Average (MA) represents the dependence between observed and residual error values. Its three model parameters AR(p), I(d), and MA(q) therefore combine to give the ARIMA(p, d, q) model. The root mean square error is used to measure the performance of the ARIMA algorithm:
RMSE = √( Σᵢ₌₁ⁿ (Yᵢ − Ŷᵢ)² / n )
(33.1)
RMSE is the root mean square error, the square root of the mean of squared differences between the predicted values Ŷᵢ and the observed values Yᵢ, where n is the number of observations in the data and i indexes them.
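A minimal ARIMA sketch is shown below (statsmodels assumed, reusing the series and split from the random-forest sketch above; the order (5, 1, 0) is an illustrative guess, not the authors' chosen order).

```python
# ARIMA forecasting sketch (statsmodels), reusing `series` and `split` from above.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from sklearn.metrics import mean_squared_error

train, test = series.iloc[:split], series.iloc[split:]
model = ARIMA(train, order=(5, 1, 0)).fit()          # ARIMA(p, d, q)
forecast = model.forecast(steps=len(test))
print("RMSE:", np.sqrt(mean_squared_error(test, forecast)))
```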
33.3.5 Long Short-Term Memory Algorithm Long short-term memory (LSTM) is an artificial recurrent neural network (RNN) architecture introduced to overcome the disadvantages of RNNs. An RNN gives accurate results from recent information, but when the sequence becomes longer it cannot remember information from the distant past. An LSTM cell is composed of an input gate, an output gate, and a forget gate (Fig. 33.3).
Fig. 33.3 Long short-term memory algorithm
LSTM algorithm works well with classification and prediction problems based on time series data.
Input: x_t = w_i [h_{t−1}, x_t]
(33.3)
Forget gate: f_t = σ(x_t + b_f)
(33.4)
Input gate: i_t = σ(x_t + b_i)
(33.5)
Output gate: o_t = σ(x_t + b_o)
(33.6)
Candidate value: c̃_t = tanh(x_t + b_c)
(33.7)
Cell-state update: c_t = f_t ∗ c_{t−1} + i_t ∗ c̃_t
(33.8)
Hidden state: h_t = tanh(c_t) ∗ o_t
(33.9)
Here f_t represents the output of the forget gate at the current time t and b_f its bias, i_t represents the output of the input gate and b_i its bias, o_t represents the output of the output gate and b_o its bias, c̃_t is the candidate value obtained from the tanh layer, c_t is the updated cell value, and h_t is the value at the hidden state. The root mean square error is used to measure the performance of the LSTM algorithm:
RMSE = √( Σᵢ₌₁ⁿ (Yᵢ − Ŷᵢ)² / n )
(33.1)
RMSE is the root mean square error, the square root of the mean of squared differences between the predicted values Ŷᵢ and the observed values Yᵢ, where n is the number of observations in the data and i indexes them.
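A hedged Keras sketch of such an LSTM forecaster is given below, reusing the lagged frame from the random-forest sketch above; the layer size, epochs, and batch size are illustrative choices, not the authors' settings.

```python
# LSTM forecasting sketch (Keras), reusing X_train/X_test/y_train/y_test from above.
import numpy as np
from tensorflow import keras

X_tr = X_train.to_numpy().reshape(-1, X_train.shape[1], 1)   # (samples, timesteps, 1)
X_te = X_test.to_numpy().reshape(-1, X_test.shape[1], 1)

model = keras.Sequential([
    keras.layers.LSTM(50, input_shape=(X_tr.shape[1], 1)),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X_tr, np.asarray(y_train), epochs=30, batch_size=16, verbose=0)

pred = model.predict(X_te).ravel()
print("RMSE:", np.sqrt(np.mean((pred - np.asarray(y_test)) ** 2)))   # Eq. (33.1)
```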
33.4 Results and Discussion Wind power generation data is pre-proposed to delete any unnecessary data and null values. Exploratory data analysis is used to analyze datasets to summarize their main characteristics. Random forest: In this paper mean absolute error metrics is used to evaluate the random forest regression for time series forecasting. Here the algorithm is generated on first 100 test values (Table 33.1). The above table are the error metrics for random forest reggressor for four different datasets from German based wind farm. Below graphs shows the predicted values on the test dataset. The root mean square error values for Amprion dataset as 42.5 and the error values for 50 Hz dataset as 55.35, the error value for TenneTTSO dataset as 54.17, the error values for dataset TransnetBW as 11.46, respectively (Figs. 33.4, 33.5, 33.6 and 33.7). Table 33.1 Error metric values for random forest model Dataset
RMSE
Amprion       42.5
50 Hz         55.35
TenneTTSO     54.17
TransnetBW    11.46
Fig. 33.4 Random forest model for Amprion dataset
Fig. 33.5 Random forest model for 50 Hz dataset
Fig. 33.6 Random forest model for TenneTTSO dataset
Fig. 33.7 Random forest model for TransnetBW dataset
The above charts show the comparison between predicted data from ARIMA model and the expected data to know how algorithm works on different dataset from the above four graphs we can say that we get the data. Autoregressive integrated moving average (ARIMA): Root mean square error metrics are used to evaluate the working of ARIMA model (Table 33.2). The above table are the error metrics for Autoregressive Integrated Moving Average Model (ARIMA) for four different datasets from German based wind farm. Below graphs shows the predicted values on the test dataset. The root mean square
Table 33.2 Error metric values for ARIMA model

Dataset       RMSE
Amprion       50.9
50 Hz         72.6
TenneTTSO     67.0
TransnetBW    15.6
error values for Amprion dataset as 50.9 and the error values for 50 Hz dataset as 72.6, the error value for TenneTTSO dataset as 67.0, the error values for dataset TransnetBW as 15.6, respectively (Figs. 33.8, 33.9, 33.10 and 33.11). The above charts show the comparison between predicted data from ARIMA model and the expected data to know how algorithm works on different dataset from
Fig. 33.8 ARIMA model for Amprion dataset
Fig. 33.9 ARIMA model for 50 Hz dataset
Fig. 33.10 ARIMA model for TenneTTSO dataset
Fig. 33.11 ARIMA model for TransnetBW dataset
the above four graphs we can say that we get the data we predicted is nearly equal to the expected data expect the last chart. Long short-term memory algorithm (LSTM): In this paper root mean square error metrics is used to evaluate the long short-term algorithm for time series forecasting (Table 33.3). The above table is the error metrics for long short-term memory algorithm for four different datasets from German based wind farm. The root mean square error values for Amprion dataset as 52.01 and the error values for 50 Hz dataset as 74.37, the error value for TenneTTSO dataset as 68.24, the error values for dataset TransnetBW as 17.72 respectively (Figs. 33.12, 33.13, 33.14 and 33.15). Table 33.3 Error metric values for LSTM
Dataset       RMSE
Amprion       52.01
50 Hz         74.37
TenneTTSO     68.24
TransnetBW    17.72
Fig. 33.12 LSTM model for Amprion dataset
Fig. 33.13 LSTM model for 50 Hz dataset
The above charts show the predicted data from LSTM model to know how algorithm works on different dataset from the above four graphs we can say that the algorithm shows really good results on four different datasets.
33.5 Conclusion Wind power is a renewable energy resource that generates fewer greenhouse gases than other energy resources. Wind power prediction is considered
Fig. 33.14 LSTM model for TenneTTSO dataset
Fig. 33.15 LSTM model for TransnetBW dataset
an estimation of energy generated from one or more wind farms. These developed models help the farm owner to optimize the budget for predicting the amount of energy to be generated for wind power plants. In this paper, Random Forest Regression algorithm and Auto Regression Integrated Moving Average algorithms and Long Short-Term Memory algorithm are used for predicting the wind energy generated in the next few days. These algorithms train on different datasets to check how the proposed model works on different datasets. These models are compared inorder to find the best suitable algorithm in predicting the energy in respect to predicted energy
33 Wind Power Prediction Using Time Series Analysis Models
375
which gives the value closer to expected energy and it was resulted and observed that LSTM gives the value which closer to the original value.
References 1. A detailed literature review on wind forecasting. In: 2013 International Conference on Power, Energy and Control (ICPEC) 978-1-4673-6030-2/13/$31.00 ©2013 IEEE 630 2. Saleh, A.E., Moustafa, M.S., Abo-Al-Ez, K.M., Abdullah, A.A.: A hybrid neuro-fuzzy power prediction system for wind energy generation. https://doi.org/10.1016/j.ijepes.2015.07.039 3. Qu, X., Kang, X., Zhang, C., Jiang, S., Ma, X.: Short-term prediction of wind power based on deep long short-term memory. In: 2016 IEEE PES Asia-Pacific Power and Energy Engineering Conference. Shaanxi Key Laboratory of Smart Grid, Xi’an Jiaotong University Xi’an 710049, China (APPEEC). https://doi.org/10.1109/APPEEC.2016.7779672 4. Zhou, Z., Li, X., Wu, H.: Wind power prediction based on random forests. In: 4th International Conference on Electrical & Electronics Engineering and Computer Science (ICEEECS 2016), School of Electrical and Automation Engineering, Nanjing Normal University, Nanjing 210042, China 5. Guha, B., Bandyopadhyay, G.: Gold price forecasting using ARIMA model. J. Adv. Manage. Sci. 4(2) (2016) 6. Ho, T.K.: Random decision forests. In: Proceedings of the 3rd International Conference on Document Analysis and Recognition, Montreal, QC, 14–16 August 19 (1995)
Chapter 34
Classification of Astronomical Objects using KNN Algorithm Mariyam Ashai, Rhea Gautam Mukherjee, Sanjana P. Mundharikar, Vinayak Dev Kuanr, and R. Harikrishnan
Abstract This project lists out the concepts and applications of various approaches and algorithms that are in line with the problem statement and sets out to find the most effective solution. The availability of satellite and observational data across the NASA database has opened up a plethora of tasks, including planet identification and trajectory prediction, for machine learning model interpretation and development. The aim is to gain a deeper understanding of the planetary data through constructive analysis and comparison between different machine learning algorithms such as SVM, KNN, tree classifier, neural networks, and Naïve Bayes classifier models, as our problem is based on multiclass classification. The computed results are presented in the form of tables and figures for visual inspection. In this study, we compared different classification models and datasets, from which K-nearest neighbor (KNN) emerged as the most suitable machine learning model to classify these objects of interest, followed by a simulation of the trajectories in the Orange software. The KNN classifier obtained a testing accuracy score of 96.5%, a strong result for our work.
34.1 Introduction This project consists of classifying the planets on the basis of their orbital position. Two datasets have been framed, one with 627 labeled samples for training and one with 29 samples for testing. The samples include entries for five celestial bodies, Mercury, Venus, Mars, Jupiter, and the Moon, along with six features: right ascension and declination angle, magnitude, altitude, distance with respect to Earth, and azimuth angle. The five selected classifiers are then simulated on the Orange platform, and the most effective classifier is selected by comparing the performance scores. Finally, the accuracy of the model is determined by simulating it in MATLAB. The prime objective was to develop a convention that would
377
378
M. Ashai et al.
be friendly from an economic and systematized point of view. Machine learning aspirants form one of the major communities across the globe and this project would create a tangible impact and enthusiasm among them. The said aim has been implemented successfully via the series of procedures which are being explained in the forthcoming topics, in addition to that a detailed survey has been done to look into all the possible solutions to the problem statement. While classifying the Celestial body, the selection of the most effective classifier had to be resolved which has been chosen with respect to the performance factors of the classifier that are precision, recall, AUC scores, F1 scores, and CA scores.
34.2 Literature Survey A comparison of different algorithms, namely support vector machines (SVMs), learning vector quantization (LVQ), and single layer perceptron (SLP), for classification of this data between AGNs and Stars & Galaxies classes has been done. The accuracy for test data has been seen to be 75.52%, 97.66%, 97.69% for 2D, 98.09%, 97.69%, 98.09% for 5D and 98.31%, 97.80%, 98.05% for 10D space, respectively [1]. The approach of hierarchical clustering algorithm has been used to classify data or 4100 asteroids with their distant future constancy and categorize them in reliable families according to the belt. From this clustering, it was obtained that most of the powerful families are those which were related to Thennis, Eos, and Koronis which comprises of 14% of the main belt and other twelve more families were found but it was difficult to categorize them. In the inner belt region, identification of families was difficult because of high density and poor inclination [2]. The performance of two classifiers; support vector machine and K-dimension tree was compared for classifying Quasars from the stars. Database was taken from SDSS and 2MASS catalog. Different combinations were chosen to train the classifier, and performance was compared based on precision, recall, true positives, false negative, F-means, G-means, and weighted accuracy. SVM classifier indeed better results with high accuracy but the competition time taken by clear dimension tree was less. It was seen that performance obtained by 10 cross validations is much more effective than the train test method [3]. The classification of multiwavelength data of 1656 AGNs, 3718 stars, and 173 galaxies is done through the usage of algorithms for better accuracy during segregation. Initially, PCA method is used for reduction of dimensionality of the parameter space. The sample has AGNs, stars, and normal galaxies as three classes. PCA is then used to pre-processes the data after which classifiers of SVM and LVQ work toward the classification of AGNs, stars, and galaxies. The results show case that PCA + SVM and PCA + LVQ are effective ways of classification. Furthermore, PCA + SVM gave accurate results whereas PCA + LVQ had faster computation time [4].
34 Classification of Astronomical Objects using KNN Algorithm
379
Tracking objects are a conspicuous stride in tracking and detecting recording or optical devices. Thereby, it is marked as a noteworthy stride, which ascertains and traces the recording objects by making use of a hybrid model entitled nearest search algorithm–nonlinear autoregressive exogenous neural network. Moving forward, the traced occurrences are deployed for training, and the acknowledged occurrences associated to the query are fined by initiated Naïve Bayes classifier for recovering the tremendously admissible object trajectories and occurrences that correspond with it. The results achieved from the proposed Naïve Bayes classifier are better when compared to other prevailing techniques with acute precision of 81.985%, F-measure of 86.847%, and recall of 85.451%, consequently [5]. Another research put forward a method that uses cascaded regression algorithm that accommodates two consequent phases. In the first phase of the algorithm, effortlessly, perceptible negative candidates are excluded with the help of convolutional regression. Perhaps in the second phase, the less evident ones are excluded using a discrete sampling-based ridge regression. This phase serves as a surrogate of entirely connected layers and learns efficiently from the solver. While the reconditioning of the model takes place, extensible ridge regression and core mining techniques are used to generate improved results from the phase two regressor. This study conducted accomplished state-of-the-art execution in a present time environment [6]. According to the latest research done in year 2021, the method of transfer learning gated recurrent unit neural network was used to predict their satellite traffic and to neglect the complications of gradient disappearance and gradient explosion. The model was tuned by using the method of batch normalization. This algorithm has high traffic prediction accuracy lower computational complexity and fast convergence speed [7]. Same year, one more project was carried out to classify the heavenly bodies. The bodies which float in the expanse beyond the Karman line are known as heavenly bodies. The dataset was gathered from SDSS. PCA algorithm was used to reduce the dimensionality and allowed to extract the important features and remove which were not needed. 75% of the data was used for training and 25% of data was used for testing. 21 classifiers were used which included SVM, decision tree, KNN, Naïve Bayes, random forest classifier, XG boost classifier, and MLP. The precision score of random forest classifier was the highest with the precision score of 98.61%. The highest accuracy obtained among the 21 classifiers was 99%, out of classifiers code accuracy above 99%. The model can also be used for medical prediction in future [8]. Emission lines of galaxies find their origin from big young stars and huge black holes. This in turn puts the focus on the spectral classification of above-mentioned emission lines into the formation of respective AGN and stars. Specific spectral classification done on the basis of finding efficiency is carried out by using SVM. This SVM-based learning algorithm is applied to a sample provided by SDSS and data provided by the MPA and JHU. The process gets spilt into two parts, the first being the training of subsets of objects which are related to be AGN hosts or help in the formation of stars. The training is carried out by using the line measurements of powerful emissions as input vectors in the designated n-dimensional space. The
second part of the process is concerned with automatically classifying all leftover galaxies after a particular sample is successfully trained. Based on the emission line ratio used, the accuracy in classification is noted to be a high 98.8%, with the lowest result being 91% for the samples being tested [9]. The KNN search is an intriguing question pointed out in almost all research and industrial realms imaginable. Two GPU-based implementations of the KNN search algorithm, one written with CUDA and the other with CUBLAS, are used. Their performance is evaluated on synthetic data and high-dimensional SIFT matching, with the C++ ANN library serving as a comparative algorithm. It is one of the fastest KNN search methods. CUDA is implemented through the introduction of a distance matrix classifying query and reference points, while two kernels are used for its computation and sorting, respectively [10]. The approach of recognizing various patterns and their subsequent classification through KNN classifier algorithms is known for its excellent execution. The only drawback to this approach of utilizing individual classifiers is the heavy cost of computational demand. In order to implement KNN, a measure of dissimilarity is computed between a test sample and a huge number of samples, which further adds to the load of computation. Multistage classifiers are proposed as the solution to the problem of computational load as they reduce it. The system uses a training set (TS) and reduced set (RS), and the experimentation is carried out on four different kinds of datasets, i.e., Satimage, Statlog Shuttle, UCI letter recognition database, and NIST special databases (digits and letters). Training algorithms that have been utilized consist of TA, ITA, MRA, and MRAMA. The multistage classifier provides as good recognition results as an individual KNN classifier does, but it accomplishes this task with a reduced computational load. This reduction occurs due to easier patterns being classified at the initial levels and only the most complex patterns being put through the tougher last levels [11].
34.3 Experimental Work Carried Out The aim of the experiment was to develop a classification-based machine learning implementation. The positional parameters such as right ascension, declination, altitude, and azimuth serve a major purpose with regard to projecting the orbit of a celestial body. This is the key information that helps us differentiate between the orbits of different astronomical objects. Taking into consideration the above-mentioned parameters along with the magnitude and distance from earth, a potential dataset has been developed with the entries of five celestial bodies: Moon, Mercury, Venus, Mars, and Jupiter. As these bodies revolve in a fixed orbit in the solar system, the values for their location parameters tend to repeat in each revolution. This means that the values for each of these bodies need only be taken within the period of their respective revolutions. The revolution of the Moon takes 27.322 days, while those for Mercury, Venus, Mars, and Jupiter are 87.97 earth days, 224.7 days, 1.88 earth years, and 11.86 earth years, respectively. There is a significant amount of distance between
Fig. 34.1 Multidimensional space plot of different classified celestial bodies
the orbits of each of these bodies, making them a perfect fit for the classification-based implementation. Upon exploring the possibilities, it was deduced that a model can be trained to classify the said parameters and return the respective class, i.e., the celestial body those parameters belong to. The dataset has been divided into training and testing parts. The training dataset has 637 samples, out of which 86 samples are for Mars, 57 for Mercury, 25 for Moon, 72 for Venus, and 398 for Jupiter. The test dataset consists of 29 samples. The values of the parameter samples for different classes have been plotted in a multidimensional space (multi-axis plot), as shown in Fig. 34.1. As the problem is based on multiclass classification, the dataset was analyzed based on the comparison between the SVM, KNN, tree classifier, neural networks, and Naïve Bayes classifier models (Fig. 34.2). The accuracy for Naïve Bayes on the training dataset is the least, whereas the rest of the classifiers give accuracy in the range of 98%. Based on this information, the KNN classifier has been used. Though the KNN algorithm can be implemented in different programming languages, we have used MATLAB as it does not require intense modules for training the algorithm. While training the algorithm, the hyperparameters were tuned to optimize the algorithm and to try all the types of distance calculation formulas to find the nearest neighbor.
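The tuning described above was carried out in MATLAB; purely as an illustrative sketch (not the authors' code), the same kind of search over neighbour count and distance metric can be written with scikit-learn in Python. The random arrays below are placeholders for the positional-parameter dataset, and a plain grid search stands in for the Bayesian optimization used in the chapter.

```python
# Illustrative sketch (not the authors' MATLAB code): tuning a KNN classifier
# over neighbour count and distance metric with 10-fold cross-validation.
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.random((637, 6))            # RA, Dec, altitude, azimuth, magnitude, distance (dummy)
y = rng.integers(0, 5, size=637)    # 5 classes: Moon, Mercury, Venus, Mars, Jupiter (dummy)

param_grid = {
    "n_neighbors": [1, 3, 5, 7],
    "metric": ["euclidean", "manhattan", "chebyshev"],
}
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=10)
search.fit(X, y)
print("Best parameters:", search.best_params_)
print("Cross-validated accuracy:", round(search.best_score_, 4))
```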
Fig. 34.2 Comparison chart for different classifiers with their respective performance scores
34.4 Results The optimal results were achieved for the following hyperparameters:
• Number of neighbors: 1
• Distance type: Square Euclidean
• Optimization type: Bayesian optimization.
The model was cross-validated with 10 K-folds. Apart from these hyperparameters, the prior probability of each of the classes has been calculated for the KNN. It was found to be 0.6238 for Jupiter, 0.1348 for Mars, 0.0893 for Mercury, 0.1129 for Venus, and 0.0392 for Moon. Training Results: The plots for the training data are shown in Figs. 34.3 and 34.4. In Fig. 34.4, the KNN classifier has used all the different types of distance functions (such as Euclidean, Hamming, and Jaccard) for training to finalize the one which gives optimal results. Iterations: As shown in Fig. 34.5, the algorithm has been trained over 50 iterations over different distance functions along with their performance evaluations. The model summary, validation data, and classes calculated after running the iterations have been given below. It gives us the number of observations, K-fold value, finalized distance function, and other important information about the trained algorithm.
Fig. 34.3 Minimum objective versus number of function evaluations
Fig. 34.4 Objective function model
Fig. 34.5 Iterations while training the algorithm
Model Summary
Validation
Classes and their prior probabilities
Accuracy of Test Model
Upon testing, 28 out of 29 samples were classified correctly, making the test accuracy as 96.5517%.
34.5 Eccentricity of the Project As per the survey carried out, it was seen that a lot of classification-based work has been done in the domain of astrophysics which pertains to the machine learning domain. KNN along with other ML algorithms such as LVQ, CNN, PCA, and SVM have been used previously for classifying stars, galaxies, AGNs, space objects, and planets as well. Though some of the applications align with this project, the implementation and the data vary. When compared to this project, the following are the major conclusions drawn:
• As compared to the latest work done in the year 2021, the classification of heavenly bodies was done using 21 classifiers, among which 7 performed well with an accuracy score above 99%. Instead of the heavenly bodies in outer space, we classified the planets on the basis of their orbital position.
• We also compared 5 classifiers on the Orange simulation software, among which 4 yielded an accuracy score in the range of 98%, and based on the performance metrics, KNN was used.
• According to the previous research, the suggested ANN model was simulated on MATLAB because the inbuilt functions make the simulation faster with less computational time. Likewise, we implemented our model on MATLAB and yielded good results.
The papers reviewed present different algorithms that help in determining different features associated with different planets. With the previous work, planet positions were identifiable at a given date and time. This project stands out from all the previously carried out work due to its aim, the dataset used, the parameters for classification, and the implementation carried out. Apart from this, the dataset has not been taken from any external source; rather, it has been developed manually, unlike in any of the surveyed projects, thereby preserving the authenticity and originality of the data used.
34.6 Conclusion Through the study of different machine learning algorithms, it was realized to what extent different parameters like right ascension, declination, magnitude, altitude, earth distance, and azimuth affect the overall tracking efficiency. Different researchers suggest different classification techniques, however, while comparing different classifiers, it was observed that KNN gave maximum accuracy in terms of detection. All these classifiers were simulated on the Orange platform which is a data mining, open-source visualization, and machine learning toolkit. It must be noted that this project has the capacity to detect five planets with 98% training accuracy and when tested on unseen data, the accuracy achieved is 96.5%.
Due to time and resource constraints, the number of planets taken for classification is five. However, the scope of this project is not limited to only five planets; it could be further used to classify other planets and celestial bodies. Taking into consideration the criteria and features of the celestial objects, the KNN algorithm can be integrated with other sophisticated algorithms and hardware circuitry to improve the quality of results. Different optical devices or telescopes might also be incorporated to expand the applicability of this project.
References 1. Zhang, Y., Zhao, Y.: Automated clustering algorithms for classification of astronomical objects. Astron. Astrophys. 422(3), 1113–1121 (2004) 2. Zappala, V., Cellino, A., Farinella, P., Knezevic, Z.: Asteroid families. I-Identification by hierarchical clustering and reliability assessment. Astron. J. 100, 2030–2046 (1990) 3. Gao, D., Zhang, Y.X., Zhao, Y.H.: Support vector machines and kd-tree for separating quasars from large survey data bases. Mon. Not. R. Astron. Soc. 386(3), 1417–1425 (2008) 4. Zhang, Y., Zhao, Y.: Classification in multidimensional parameter space: methods and examples. Publ. Astron. Soc. Pacific 115(810), 1006 (2003) 5. Ghuge, C.A., Prakash, V.C., Ruikar, S.D.: Naive Bayes approach for retrieval of video object using trajectories. In: International Conference on Intelligent and Smart Computing in Data Analytics: ISCDA 2020, pp. 115–120 (2021) 6. Wang, N., Zhou, W., Tian, Q., Li, H.: Cascaded regression tracking: towards online hard distractor discrimination. IEEE Trans. Circuits Syst. Video Technol. (2020) 7. Li, N., Hu, L., Deng, Z.-L., Su, T., Liu, J.-W.: Research on GRU Neural Network Satellite Traffic Prediction Based on Transfer Learning. School of Electronic Engineering, Beijing University of Posts and Telecommunications, Beijing, China (2021) 8. Wierzbi´nski, M., Pławiak, P., Hammad, M., Rajendra Acharya, U.: Development of accurate classification of heavenly bodies using novel machine learning techniques. Soft Comput. 25, 7213–7228 (2021) 9. Shi, F., Liu, Y.-Y., Sun, G.-L., Li, P.-Y., Lei, Y.-M., Wang, J.: A Support Vector Machine for Spectral Classification of Emission-Line Galaxies from the Sloan Digital Sky Survey. North China Institute of Aerospace Engineering, Langfang 10. Soraluze, I., Rodriguez, C., Boto, F., Cortes, A.: Fast multistage algorithm for K-NN classifiers. In: Iberoamerican Congress on Pattern Recognition, pp. 448–455. Springer, Berlin, Heidelberg (2003) 11. Garcia, V., Debreuve, E., Nielsen, F., Barlaud, M.: K-nearest neighbor search: fast GPUbased implementations and application to high-dimensional feature matching. In: 2010 IEEE International Conference on Image Processing, pp. 3757–3760 (2010)
Chapter 35
Efficient Analogy-based Software Effort Estimation using ANOVA Convolutional Neural Network in Software Project Management K. Harish Kumar and K. Srinivas Abstract The analogy-based software effort estimation (SEE) technique assesses the requisite effort aimed at an advanced software project centred upon the whole effort utilized in accomplishing past similar projects. Therefore, it is not often probable aimed at antedating the precise guesses in the software development effort’s estimation. Existent works have established numerous techniques aimed at SEE, however, no consensus regarding the methodology and settings apt to generate precise estimates are attained. Aimed at boosting the accuracy and decrementing the training time, this research methodology presented an ANOVA convolutional neural network (A-CNN) methodology for effort estimation (EE). Initially, the input data are preprocessed, then the features are taken out as of the pre-processed data and the vital features are chosen by employing the linear scaling-based deer hunting optimization (LS-DHO) technique. Then, the weighting value is enumerated aimed at the features selected. Utilizing the modified K-harmonic means (MKHM) algorithm, the similar values are clustered centred on those values. Consequently, the data clustered are inputted to the A-CNN algorithm. It estimates the software project developments’ effort. In an experiential evaluation, the methodology proposed is examined with the existent research methodologies centred upon the accuracy, mean magnitude of relative error (MMRE), PRED, and also execution time metrics. The methodology proposed attains excellent outcomes analogized to the existent methodologies.
K. Harish Kumar (B) · K. Srinivas Department of Computer Science & Engineering, Koneru Lakshmaiah Education Foundation, Deemed to be University, Hyderabad, Telangana 500075, India K. Srinivas e-mail: [email protected] K. Harish Kumar Department of Computer Science & Informatics, Mahatma Gandhi University Nalgonda, Nalgonda, Telangana 508001, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 V. Bhateja et al. (eds.), Smart Intelligent Computing and Applications, Volume 1, Smart Innovation, Systems and Technologies 282, https://doi.org/10.1007/978-981-16-9669-5_35
35.1 Introduction Aimed at executing activities, like feasibility studies, project bidding, cost estimation, planning, and also resource allocation, software project managers need trustworthy methodologies, at software development’s primary stages [1]. So, SEE is one amidst the utmost essential activities in the software project development management. It is essential aimed at optimal planning and is vital aimed at governing the software development procedure [2]. Over-estimation can give rise to resources’ wastage, however, underestimation can cause schedule/budget overruns and/or quality compromises [3]. Hence, an efficient estimation method is needed. SEE is researched and developed in algorithmic and also machine learning methodology since the 1960s [4]. Two types of SEE types are implemented that are expert-centred and model-centred methodologies [5]. Estimation centred on expert judgement is one amidst the earliest and extensively utilized methodologies. The EE procedure is executed by pondering inputs, namely requirements, line of codes, function points, use case points, size, etc. Additionally, project plan, investment analysis, budgeting, pricing procedures, iteration plans, and bidding rounds are centred upon software effort estimates’ inputs [6]. Furthermore, it is well-known that the EE’s performance is prominently impacted via these parameter settings. The optimal parameters (i.e. parameters that yield the finest estimation results) depend upon the dataset utilized and thus must be elected and changed appropriately [7]. Excluding the expert and model-centred methodologies, diverse deep learning and also machine learning algorithms are recently utilized that are logistic regression, multiple linear regression, stepwise regression, linear regression, lasso, ridge, elasticnet regression, Naïve Bayes, support vector machines (SVMs), decision tree, random forest, neural net, recurrent neural network (RNN), Deepnet, ensemble methodologies, etc., are utilized aimed at estimation [8]. Ensemble methodologies as well gain much attention and create much prediction analogized to the individual techniques aimed at effort prediction [9]. But those methodologies are the software effort’s inaccurate estimation that is a key risk factor, which can cause the software project’s failure [10]. So, this research methodology proposes a novel method for the SEE procedure. The presented research paper’s structure is systematized as: Sect. 35.2 explains the existent papers relevant to the SEE; Sect. 35.3 explains the proposed A-CNN centred SEE; Sect. 35.4 examines the presented research’s result analysis; Sect. 35.5 concludes the paper with future enhancement.
35.2 Related Work Amazal and Idri [11] proffered the former 2FA-k prototypes technique’s improvement termed as 2FA-c-means aimed at SEE. 2FA-c-means utilizes a clustering technique termed common fuzzy c-means that was the fuzzy c-means clustering methodology’s generalization aimed at clustering the objects utilizing mixed features. The 2FA-c-means’s performance was examined and analogized with the former 2FAk prototypes methodology’s performance, as well as traditional analogy over ‘6’ datasets that are quite varied and have varied sizes. Experiential outcomes exhibited that 2FA-c-means outshined the ‘2’ other analogy methodologies utilizing all-in and jack-knife examination methods. In the approach, the best k-value was chosen by conducting various experiments that take more time. Eswara Rao and Appa Rao [12] recommended a novel design centred on ensemble learning and also recursive feature’s elimination centred methodology aimed at estimating the effort. Utilizing the feature’s ranking and also selection protocol, the methodology comprises the potential for assessing the efforts regarding the parameters, namely cost and size. The features were ranked, and the best features were input to another design centred upon ensemble learning. Simulation outcomes were encouraged with the methodology utilizing the COCOMO II dataset. The methodology was not effective all time if the dataset encompasses repeated information that will generate an error result. Phannachitta [13] presented an efficient solution aimed at constructing an exceptionally reliable, accurate, and also explicable design-centred software effort estimator. The software effort estimators’ systematic analogy that was wholly optimized via the Bayesian optimization technique was executed on ‘13’ standard benchmark datasets. The analogy was facilitated by the strong performance metrics and also statistical test methodologies. The approach uses the recursive feature elimination process that takes more time to accomplish the procedure.
35.3 Efficient Software Effort Estimation Methodology Amidst diverse probable selections of the EE methodologies, analogy-centred SEE is one amidst the most implemented methodologies in the industry and research communities. In the existent research technique, diverse feature weight optimization methodologies are proposed aimed at similarity functions in analogy-based estimation (ABE); however, no consensus concerning the methodology and also the settings apt to create accurate estimates are attained. Aimed at boosting the prediction EE accuracy and decrementing the training time, the novel proposed system was established. The methodology proposed encompasses pre-processing, feature extraction (FE), feature selection (FS), weighting calculation, similarity evaluation, and EE. In the pre-processing phase, the repeated data are eliminated and the string values are converted into numerical values. After pre-processing, the features are taken
Fig. 35.1 Block diagram for the proposed software effort estimation methodology
out as of the inputted data, and the important features are selected employing the LS-DHO algorithm. After that, the weight value is calculated aimed at the features utilizing entropy calculation and centred on the calculated weight values, the similarity evaluation is executed by MKHM. As of the similarity evaluation output, the software’s effort is estimated by utilizing A-CNN. Figure 35.1 exhibits the proposed methodology’s block diagram.
35.3.1 Pre-processing The inputted data comprise the historical information regarding the projects. The historical data comprise a few repeated data and also string values. Repeated data spend much time and create unwanted errors; thus, the repeated data are eliminated as of the inputted data, and all the string values are converted into numerical values aimed at making an effective estimation procedure. The final pre-processed input data are articulated as

K_d = {k_1, k_2, k_3, ..., k_n}    (35.1)

Herein, K_d signifies the input dataset; k_n implies the n-number of input data.
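As a minimal illustration of this pre-processing step (the file name and column handling are hypothetical placeholders, not the authors' code), duplicates can be dropped and string attributes encoded as numbers with pandas:

```python
# Minimal sketch of the pre-processing described above: remove repeated records
# and convert string-valued attributes into numerical codes.
import pandas as pd

df = pd.read_csv("historical_projects.csv")          # assumed export of the project data
df = df.drop_duplicates()                             # eliminate repeated data

for col in df.select_dtypes(include="object").columns:
    df[col] = df[col].astype("category").cat.codes    # string values -> numerical values

K = df.to_numpy()                                     # pre-processed input data of Eq. (35.1)
```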
35.3.2 Feature Extraction Herein, the features are taken out as of the pre-processed data. The features are the vital source aimed at examining the software’s effort. Herein, the requisite software’s reliability, product’s complexity, main storage constraint, execution time constraint, application experience, analyst’s capability, programmer’s capability, requisite development schedule, etc., are extracted. Thus, the feature terms are equated as Z l = {z 1 , z 2 , z 3 . . . z n } (or) z i , i = 1, 2, . . . , n
(35.2)
Here, Z l implies the features’ set; z n signifies the n-number features.
35.3.3 Feature Selection After FE, the vital features are selected aimed at decrementing the training time. Here, the LS-DHA is utilized aimed at FS. The conventional deer hunting algorithm is developed centred upon the hunting nature of humans in the deer’s or buck’s direction. The normal deer hunting optimization algorithm provides better FS result, but it has low exploration searching capability performance. Aimed at boosting the exploration searching capability, this research methodology employs the linear scaling technique. The prime step is initializing of the hunters’ population. Here, the initialized population is Z l that is explicated in Eq. (35.2). After population initialization, the fitness is enumerated aimed at the initialized population, the fitness FFt is equated as FFt = Max(Ac (Q s ))
(35.3)
Herein, Max(Ac (Q s )) signifies the cluster outcome’s maximal accuracy. Then, the parameters are initialized. Here, the wind angle and as well the deer’s position angle are initialized. The wind angle is derived centred on the circle’s circumference that is equated as w(θi ) = 2π α
(35.4)
Herein, α implies a random number with a value inside the [0, 1] range; i signifies the existent iteration. The position angle ωi is articulated as ωi = w(θ ) + π
(35.5)
Herein, w(θ ) implies the wind angle. After the parametric initialization, the position propagation is executed. The position propagation is handled in ‘3’ steps that
are propagation through the leader's position, propagation via ω_i, and propagation through the successor's position. Firstly, the propagation via the leader's position (Z_l)_f has been updated utilizing Eq. (35.6):

(Z_l)_{j+1} = (Z_l)_f − X·λ·P × ((Z_l)_f − (Z_l)_j)    (35.6)

Herein, (Z_l)_j implies the position at the existent iteration; (Z_l)_{j+1} symbolizes the position at the subsequent iteration; X and P signify coefficient vectors; λ is a random number developed pondering the wind speed, whose value ranges as of 0–2. Then, the propagation via ω_i is handled; the ω_i is pondered for the cause of incrementing the search space. Centred on the ω_i, the position updation is executed utilizing Eq. (35.7):

(Z_l)_{j+1} = (Z_l)_f − λ·cos(v) × ((Z_l)_f − (Z_l)_j)    (35.7)

Lastly, the position updation procedure takes place regarding a successor's position rather than pondering the finest position. Next, the global search is executed employing Eq. (35.8). Herein, the linear scaling procedure is handled aimed at incrementing the exploration phase, which is articulated as

(Z_l)_{j+1} = (Z_l)_f − X·λ·P × ((Z_l)_j − (Z_l)_s^min) / ((Z_l)_s^max − (Z_l)_s^min)    (35.8)

Herein, (Z_l)_s implies the search agent's successor position as of the prevalent population; (Z_l)_s^max and (Z_l)_s^min signify the successor position's maximum and minimum.
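A compact sketch of one LS-DHO position update, written directly from Eqs. (35.6)–(35.8), is given below. The population, leader, and successor arrays are dummy placeholders and the coefficient vectors are drawn at random, so the snippet only illustrates the update rules rather than reproducing the authors' implementation.

```python
# Sketch of one LS-DHO position update (Eqs. 35.6-35.8); inputs are dummy data.
import numpy as np

rng = np.random.default_rng(1)

def lsdho_step(positions, leader, successor):
    X = rng.random(positions.shape)              # coefficient vector X
    P = rng.random(positions.shape)              # coefficient vector P
    lam = rng.uniform(0, 2)                      # wind-speed-based random number in [0, 2]
    theta = 2 * np.pi * rng.random()             # wind angle, Eq. (35.4)
    omega = theta + np.pi                        # position angle, Eq. (35.5)

    by_leader = leader - X * lam * P * (leader - positions)          # Eq. (35.6)
    by_angle = leader - lam * np.cos(omega) * (leader - positions)   # Eq. (35.7), angle term
    # Linear scaling against the successor's range, Eq. (35.8) (simplified to scalars).
    scaled = (positions - successor.min()) / (successor.max() - successor.min() + 1e-12)
    by_successor = leader - X * lam * P * scaled
    return by_leader, by_angle, by_successor

population = rng.random((10, 4))                 # 10 hunters, 4 candidate features (dummy)
updates = lsdho_step(population, population[0], population[-1])
```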
35.3.4 Weighting Calculation Next, the weight value is computed aimed at the features chosen for the tenacity of finding the feature's rank. The 1st step is the measured values' standardization. Thus, the standardization L_i is equated as

L_i = z_i / Σ_{i=1}^{n} z_i    (35.9)

Next, for the standardized values, the entropy En_i is enumerated, which is articulated as

En_i = −(1 / log n) Σ_{i=1}^{n} L_i · log(L_i)    (35.10)
Lastly, the feature's weights are derived via the entropy value calculated, which is equated as

β_i = (1 − En_i) / Σ_{i=1}^{m} (1 − En_i)    (35.11)
Herein, βi symbolizes the weight value; m signifies the m-number of entropy values calculated.
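The entropy weighting of Eqs. (35.9)–(35.11) can be expressed in a few lines of NumPy; the feature matrix below is a random placeholder (rows are projects, columns are the selected features), so the snippet only illustrates the calculation, not the authors' code.

```python
# Sketch of the entropy-based feature weighting in Eqs. (35.9)-(35.11).
import numpy as np

def entropy_weights(Z):
    L = Z / Z.sum(axis=0)                                     # standardization, Eq. (35.9)
    n = Z.shape[0]
    En = -(L * np.log(L + 1e-12)).sum(axis=0) / np.log(n)     # entropy, Eq. (35.10)
    return (1.0 - En) / (1.0 - En).sum()                      # weights beta_i, Eq. (35.11)

Z = np.abs(np.random.default_rng(0).random((20, 6)))          # 20 projects, 6 features (dummy)
print(entropy_weights(Z))
```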
35.3.5 Similarity Evaluation Herein, centred on the β_i calculated, identical data are clustered utilizing the MKHM technique. The KHM clustering technique utilizes the harmonic means of the distances as of the data points to the cluster centres in its cost function. In the KHM, the normal membership function spends a greater time aimed at computation; consequently, this research methodology employs the Gaussian function aimed at decrementing the clustering time. Firstly, the cluster centres C(β_i) = {c(β_1), c(β_2), ..., c(β_n)} are chosen arbitrarily as of the n-weight values β_i = {β_1, β_2, ..., β_n}. Aimed at every data point β_i, enumerate its membership function χ(C(β_i)|β_i) in every C(β_i) and its weight a(β_i) as in Eqs. (35.12) and (35.13):

χ(C(β_i)|β_i) = exp(−ρ‖β_i − C(β_i)‖^(−g−2)) / Σ_{i=1}^{n} ‖β_i − C(β_i)‖^(−g−2)    (35.12)

a(β_i) = Σ_{i=1}^{n} ‖β_i − C(β_i)‖^(−g−2) / (Σ_{i=1}^{n} ‖β_i − C(β_i)‖^(−g))²    (35.13)

Herein, g implies the parameter that is higher than 2; ρ signifies another constant parameter. Next, re-compute the centre's location as of all data points regarding their memberships and also weights:

C(β_i) = Σ_{i=1}^{n} χ(C(β_i)|β_i) · a(β_i) · β_i / Σ_{i=1}^{n} χ(C(β_i)|β_i) · a(β_i)    (35.14)
Repeat Eqs. (35.12) to (35.14) until the pre-stated clustering criterion is met. Lastly, the cluster set attained is signified as Q_s = {q_1, q_2, ..., q_n}; in this cluster set, n-number of clusters are offered.
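For illustration, one update cycle of the classical K-harmonic-means scheme that the MKHM step builds on is sketched below; the data, the number of clusters, and the parameter g are arbitrary placeholders, and the Gaussian modification of Eq. (35.12) is omitted for brevity, so this is not the authors' implementation.

```python
# Sketch of a classical K-harmonic-means update cycle on weighted feature values.
import numpy as np

def khm_update(beta, centres, g=3.5):
    d = np.abs(beta[:, None] - centres[None, :]) + 1e-12              # point-to-centre distances
    m = d ** (-g - 2) / (d ** (-g - 2)).sum(axis=1, keepdims=True)    # membership of each point
    w = (d ** (-g - 2)).sum(axis=1) / (d ** (-g)).sum(axis=1) ** 2    # weight of each point
    num = (m * w[:, None] * beta[:, None]).sum(axis=0)
    den = (m * w[:, None]).sum(axis=0)
    return num / den                                                  # re-computed centres

rng = np.random.default_rng(0)
beta = rng.random(50)            # weighted feature values (dummy)
centres = rng.random(4)          # 4 initial cluster centres (dummy)
for _ in range(30):
    centres = khm_update(beta, centres)
print(centres)
```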
35.3.6 Effort Estimation Here, the clustered set is inputted into the A-CNN aimed at assessing the software development’s effort. Convolution neural network (CNN) algorithm is a multilayer perceptron. The CNN comprises ‘5’ layers that are input layer, convolutional layer (CL), pooling layer, fully connected layer (FCL), and then an output layer. The CNN algorithm yields an efficient outcome; however, if the data’s size is large, then the output layer’s normal Softmax kernel function causes the error in the training time. Hence, this research methodology utilizes the analysis of the variance (ANOVA) radial basis kernel function as the activation function that decrements the training error. Firstly, the inputted data Q s = {q1 , q2 , . . . , qn } are altered as a singular column in the inputted layer. The inputted layer’s outcome X i is articulated as X i = Qs
(35.15)
Next, the input layer's outcome is inputted into the CL. CLs convolve the input and then pass its outcome onto the subsequent layer. Every convolutional neuron processes data just aimed at its receptive field. The CL function is articulated as

O_i^l = Σ_{i=1}^{n} η_i · X_i + τ_i    (35.16)
Herein, O_i^l implies the CL's output; η_i signifies the weight value; τ_i symbolizes the bias value. Then, the pooling layer decrements the data's dimensions by compiling the neuron clusters' outputs at '1' layer into a single neuron in the subsequent layer. The pooling function H(.) is articulated as

J_i^l = H(O_i^l)    (35.17)
Here, J_i^l implies the pooling layer's output. After diverse CL and pooling layers, there can be '1' or more FCLs that target to execute high-level reasoning. The FCL function is signified as y_e^l and it is articulated in Eq. (35.18) as

y_e^l = Σ_{i=1}^{n} η_i · J_i^l + τ_i    (35.18)
The FCL’s outcome is inputted into the final output layer. Usually, in CNN, the Softmax kernel function is utilized as the final activation function; aimed at boosting the performance, this research methodology utilizes the ANOVA radial basis kernel function in place of utilizing Softmax kernel function, which decrements the error and it is articulated in Eq. (35.19). The ANOVA assesses the impacts of one or numerous features clusters on ‘1’ project.
A_l = Σ_{e=1}^{n} [exp(−ε (y_e^{l−1} − y_e^l)²)]^d    (35.19)
Herein, Al implies the activation layer’s output; yel−1 implies the FCL’s outcome at former iteration; ε signifies a parameter; d symbolizes the polynomial degree. Then, the loss lss is enumerated centred on Eq. (35.20): lss = Al − Tt
(35.20)
Herein, Tt implies the target function. Utilizing this A-CNN’s aid, the software’s effort is estimated aimed at the real-time project that is examined in the testing procedure.
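The ANOVA radial basis activation of Eq. (35.19) reduces to a short function; the values of ε and d below are illustrative only (the chapter does not fix them), and the layer outputs are dummy vectors.

```python
# Sketch of the ANOVA radial basis kernel used as the final activation, Eq. (35.19).
import numpy as np

def anova_rbf(y_prev, y_curr, eps=1.0, d=2):
    # Sum over neurons of exp(-eps * (y_prev - y_curr)^2), raised to the power d.
    return np.sum(np.exp(-eps * (y_prev - y_curr) ** 2) ** d)

y_prev = np.random.default_rng(0).random(8)   # FCL output at the former iteration (dummy)
y_curr = np.random.default_rng(1).random(8)   # current FCL output (dummy)
print(anova_rbf(y_prev, y_curr))
```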
35.4 Results and Discussion Herein, the proposed A-CNN-centred SEE’s performance is examined. The methodology proposed is applied in Python’s working platform.
35.4.1 Database Description In this methodology proposed, ‘4’ databases are pondered aimed at the performance examination. Hence, the ‘4’ databases are Desharnais, COCOMO81, COCOMONASA60, and also COCOMONASA93. The Desharnais dataset comprises 81 projects, COCOMONASA93 comprises a total of 93 projects, COCOMONASA60 comprises a total of 60 projects, and also COCOMO81 comprises 81 projects. Hence, the database is utilized aimed at training and testing time.
35.4.2 Performance Analysis Here, the proposed A-CNN-centred SEE’s performance is examined with the CNN, recurrent neural network (RNN), deep belief networks (DBN), and also deep neural network (DNN) regarding the accuracy and also training time. Table 35.1 exhibits the proposed A-CNN-centred SEE’s and the existent algorithm-centred EE’s performance. In Table 35.1(a), the method’s performance is examined centred upon the accuracy metric. Herein, the performance is examined aimed at all ‘4’ datasets. All the dataset comprises greater accuracy analogized to the existent research methodologies. Explicitly, the A-CNN yields 96.63% accuracy
Table 35.1 Performance analysis of the proposed and existing methods in terms of (a) accuracy and (b) execution time

(a) Accuracy (%)
Datasets         Proposed A-CNN   CNN     RNN     DBN     DNN
Desharnais       87.45            78.85   67      71.25   73
COCOMO81         89.75            82.13   69.58   74      77.89
COCOMONASA60     91.56            85.12   73.87   78      82
COCOMONASA93     96.63            89.86   77      83.74   87.96

(b) Execution time (s)
Datasets         Proposed A-CNN   CNN     RNN     DBN     DNN
Desharnais       562              783     1019    938     812
COCOMO81         578              789     1023    942     814
COCOMONASA60     246              575     972     806     603
COCOMONASA93     683              856     1356    1065    993
for the COCOMONASA93 dataset. In Table 35.1(b), the performance is examined regarding the execution time. Herein, the proposed and existent methodologies spend greater execution time for the COCOMONASA93 dataset; however, the A-CNN-centred SEE spends lesser time analogized with the other existent methodologies. The proposed A-CNN's execution time is 683 s. Hence, the table exhibits that the A-CNN-centred SEE proposed yields efficient performance analogized to the existent research methodologies.
35.4.3 Comparative Analysis Herein, the proposed and existent methodologies’ performance is analogized centred on MMRE and also prediction (PRED) metrics with graphical representation. Figure 35.2 exhibits the proposed A-CNNs and existent algorithm-centred SEE’s comparative analysis. In Fig. 35.2a, the performance is examined centred on the MMRE metric. The existent methodologies yield greater MMRE; if the methodology proposed yields lesser MMRE, then the system is signified as the effective system. Likewise, the A-CNN proposed yields lesser MMRE explicitly aimed at the COCOMONASA93, the methodology proposed yields very less MMRE (i.e.) 0.0523. In Fig. 35.2b, the performance is analogized centred on the PRED metric. Herein, the proposed yields greater PRED analogized to the existent methodology; the RNN yields very lesser PRED. The COCOMONASA93 yields greater PRED (i.e.) 94%. Hence, the comparative examination exhibits that the algorithm-centred SEE proposed yields efficient outcomes analogized to the prevalent techniques.
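For reference, the two evaluation metrics can be computed as follows; the effort values are dummy numbers, and PRED is shown at the 25% level that is conventional in effort-estimation studies (the chapter does not state the level it uses).

```python
# Sketch of the MMRE and PRED evaluation metrics referred to above.
import numpy as np

def mmre(actual, predicted):
    return np.mean(np.abs(actual - predicted) / actual)     # mean magnitude of relative error

def pred(actual, predicted, level=0.25):
    mre = np.abs(actual - predicted) / actual
    return np.mean(mre <= level)                             # share of estimates within 25%

actual = np.array([120.0, 60.0, 300.0, 45.0])                # dummy actual efforts
predicted = np.array([110.0, 64.0, 280.0, 52.0])             # dummy estimated efforts
print("MMRE:", mmre(actual, predicted), "PRED(25):", pred(actual, predicted))
```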
Fig. 35.2 Comparative analysis of proposed and existing methodologies centred on a MMRE and b PRED metrics
35.5 Conclusion A novel SEE methodology is propounded in this paper centred upon the A-CNN algorithm. The protocol proposed comprises pre-processing, FE, FS, weighting, similarity evaluation, and also EE steps. Here, the software’s effort is estimated employing the A-CNN algorithm. For the performance examination, this research methodology utilizes the ‘4’ databases, like Desharnais, COCOMO81, COCOMONASA60, and also COCOMONASA93. In the experiential examination, the proposed methodology’s performance is examined with the existent research protocols, namely CNN, RNN, DBN, and DNN centred upon the accuracy, training time, MMRE, and also PRED metrics. The methodology proposed yields efficient results aimed at all the datasets. Specifically, the proposed A-CNN attains higher accuracy and PRED for the COCOMONASA93 dataset (i.e.) 96.63 and 94%, and the proposed methodology’s
execution time is as well low. Hence, the suggested A-CNN-based SEE methodology attains excellent performance analogized to the existent research techniques. The methodology proposed can be progressed in the future by pondering advanced algorithms and better ranking methodologies aimed at boosting the EE procedure’s performance.
References 1. Benala, T.R., Mall, R.: DABE: differential evolution in analogy-based software development effort estimation. Swarm Evol. Comput. 38, pp. 158–172 (2018) 2. Ezghari, S., Zahi, A.: Uncertainty management in software effort estimation using a consistent fuzzy analogy-based method. Appl. Soft Comput. 67, 540–557 (2018) 3. Idri, A., Abnane, I., Abran, A.: Support vector regression-based imputation in analogy-based software development effort estimation. J. Softw.: Evol. Process 30(12), 1–23 (2018) 4. Ardiansyah, A., Mardhia, M.M., Handayaningsih, S.: Analogy-based model for software project effort estimation. Int. J. Adv. Intell. Inf. 4(3), 251–260 (2018) 5. BaniMustafa, A.: Predicting software effort estimation using machine learning techniques. In: In 2018 8th International Conference on Computer Science and Information Technology (CSIT), IEEE, 11–12 July 2018, Amman, Jordan, 2018 6. Saeed, A., Butt, W.H., Kazmi, F., Arif, M.: Survey of software development effort estimation techniques. In: Proceedings of the 2018 7th International Conference on Software and Computer Applications, 8 February, 2018, Kuantan Malaysia, 2018 7. Abnane, I., Hosni, M., Idri, A., Abran, A.: Analogy software effort estimation using ensemble KNN imputation. In: 2019 45th Euromicro Conference on Software Engineering and Advanced Applications (SEAA), IEEE, 28–30 Aug 2019, Kallithea, Greece, 2019 8. Priya Varshini, A.G., Anitha Kumari, K., Janani, D., Soundariya, S.: Comparative analysis of machine learning and deep learning algorithms for software effort estimation. J. Phys.: Conf. Ser., IOP Publ. 1767(1), 1–11 (2021) 9. Priya Varshini, A.G., Anitha Kumari, K.: Predictive analytics approaches for software effort estimation: a review. Indian J Sci Technol. 13(21), 2094–2103 (2020) 10. Wu, D., Li, J., Bao, C.: Case-based reasoning with optimized weight derived by particle swarm optimization for software effort estimation. Soft Comput. 22(16), 5299–5310 (2018) 11. Amazal, F.A., Idri, A.: Estimating software development effort using fuzzy clustering-based analogy. J. Softw.: Evol. Process 33(4), 1–23 (2021) 12. Eswara Rao, K., Appa Rao, G.: Ensemble learning with recursive feature elimination integrated software effort estimation: a novel approach. Evol. Intel. 14(1), 151–162 (2021) 13. Phannachitta, P.: On an optimal analogy-based software effort estimation. Inf. Softw. Technol. 125, 1–11 (2020)
Chapter 36
Hand Written Devanagari Script Short Scale Character Recognition Kachapuram BasavaRaju and Y. RamaDevi
Abstract India is a country of various languages right from Kashmir to Kanyakumari. The national language of India is Hindi which is also the third most popular language in the world. The script in which the Hindi language is written is known as Devanagari script which in fact is used to write many other languages such as Sanskrit, Marathi, Nepali, and Konkani languages. Neural networks are recently being used in several different ways of pattern identification. It is common knowledge that every person’s handwriting is dissimilar. Therefore, it is challenging to recognize those handwritten monograms. The sector of pattern recognition that has become a hot topic for research purposes is handwritten character recognition. This is where neural networks play an important role. The competence of a computer to take in and decipher comprehensible transcribed input whose origin is paper documents, touch screens, photographs, and alternative gadgets are termed as handwriting recognition. Handwritten recognition of words is a model which is used to convert the written text into words that are crucial in the human computer interface. The handwriting recognition area is an extensively experimented branch till date and the Devanagari script recognition is progressing area of research. The above application is used in mail sorting, office automation, cheque verification, and human computer communication, i.e. the growing age of artificial intelligence. A sample of the dataset of images which are centralized and grayscale are considered and analysed using the K-nearest neighbour classification, extremely randomized decision forest classification and random forest classification are considered.
K. BasavaRaju (B) SreeNidhi Institute of Science & Technology, Hyderabad, TS, India e-mail: [email protected] Y. RamaDevi Chaitanya Bharathi Institute of Technology, Hyderabad, TS, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 V. Bhateja et al. (eds.), Smart Intelligent Computing and Applications, Volume 1, Smart Innovation, Systems and Technologies 282, https://doi.org/10.1007/978-981-16-9669-5_36
36.1 Introduction Languages and scripts in India are the innumerable cause of its diversity in nature as well as culture. Devanagari is an Indic script that is used to write over 100 spoken languages such as Hindi, Sanskrit, Maithili, etc. in the region of India and Nepal. The script comprises 47 primary alphabets, 13 vowels, 33 consonants, and 10 digits. When a vowel is combined with a consonant, it forms another alphabet or sounding. Unlike the English language, wherein the alphabets are divided into lower case and upper case, the Devanagari script does not have any capitalization. The main objective of the proposed work is to develop a model for character recognition using machine learning and convolutional neural networks. The model is trained on the pre-processed images of 6 × 6 size which include alphabets and digits. There are basically two different algorithms based on which the model is trained and tested to calculate the results. The final accuracy results are displayed in the comparison table and confusion matrix. Handwritten character recognition [1] is that area of pattern recognition that has become the subject of research in the recent past. This is where neural networks play an important role. The potential of the computer to receive and understand intelligible handwritten input from various sources such as paper documents, touch screens, pictures, and other forms is termed handwriting recognition. It can be of two types, online and offline. Online recognition involves the conversion of digital pen-tip strokes into a list of co-ordinates, which is then used as an input for the classification system. On the other hand, offline recognition uses images of characters as input. Some of the prior works inculcate shallow learning with features designed by hand on both online and offline [2] datasets. The handwritten features comprise the density of pixels over the region of an image, the curves involved in the character, the dimensionality of the region, and the most important factor, i.e. the number of vertical as well as horizontal lines involved. The above model is developed with the help of Python and machine learning. ML can be defined as a branch of artificial intelligence (AI) wherein the developed model undergoes automated learning with little or no human intervention. The main reasons for choosing Python and ML are listed below:
1. Machine learning (ML) is a branch of computer science which is emerging with a very high growth rate.
2. Machine learning is a technology that deals with training the system so as to make it automatically learn and improve with experience.
3. The main aim of machine learning is to develop models which are trained on one or more algorithms applied to the previously available data and which progress over the time period, so that they can be used to make predictions on the input data.
4. Machine learning models are put into application in various industries such as health care, etc., wherein health care has benefitted a lot through the predictions made.
5. The main aspects of machine learning are the analysis as well as the interpretation of patterns and structures on the basis of previously available data to enable processes such as learning, reasoning, and decision making without help from human interaction.
6. In machine learning, the model analyses and makes suggestions and decisions on the input data, driven by the data available while training. The model stores the information in order to improve its future decision-making skills.
Objective of Character Recognition Even though a lot of research has been done in the region of handwritten literature recognition in English [3] and various other languages across Asia, unfortunately very few attempts have been made in the Indian languages such as Hindi, Marathi, Telugu, etc. In this proposed work, we aim at generating a handwriting character recognition algorithm for Devanagari (an Indian language) with high prediction accuracy and the least time taken for the process of being trained and classified.
36.2 Proposed Work The model developed should have the ability to predict the Devanagari character if given a full image of the character, which is possible on being trained with the full character image. The model trained on the full character image should be able to predict the Devanagari character on being given a partial or half image of the character. Likewise, the model trained on the half character image should be able to predict the Devanagari character on being given the full image of the character. Character Recognition Character recognition is a process where the handwritten scripts are captured and stored in the dataset as images. The images of the same character vary with the writer's strokes, angle deviations, etc. This proposed work can be subdivided into three broad steps:
(i) Preprocessing
(ii) Feature extraction
(iii) Classification using classifier algorithms.
The aim of the initial step is to remove the unnecessary information from the input dataset which would increase the recognition probability on the basis of the speed and accuracy aspects. The methods involved in the process of the preprocessing images are normalization, binarization, removing noise, sampling, and smoothening of the image. The intermediate step involves multi-dimensional vector fields, the output of the pre-processed images and higher dimensional data is fetched. The last step in the process is the classification, wherein numerous models are used to discriminate the
extracted features to respective character classes and hence verifying the words or characters with which the features match. The evaluation metric for the model will be transcription accuracy on the test images in the HP dataset of Devanagari characters. The definition of accuracy will be correctly predicting handwritten Devanagari [4] characters in the image. The model must not only predict the number of characters that are present but also identify each character correctly. In this proposed work the system will also evaluate it against other handwritten Hindi/Tamil/Bengali character recognition in both speed and accuracy. Data Collection The implementation is done on the modified national institute of standards and technology (MNIST) dataset which contains 80,000 training examples and 1000 testing examples. The database [5] is made up of grayscale handwritten character images that are resized to 6 × 6 pixel box, and these images are centered in 28 × 28 image (padded with white region). The images are resized with the help of an anti-aliasing technique in order to reduce the image distortion value. Step-1: Data Gathering Collection of data involves capturing a record of past events in order to define the recurring patterns. From the observed patterns, the predictive models are developed on being assisted by the learning algorithms which look for patterns and predict the changes in the mere future. The efficiency of the developed prediction models are completely based on the efficiency rate of the data on which they are trained, so good data collection is crucial to develop high performance models (Fig. 36.1). Step-2: Data Preprocessing The data has to be pre-processed before being implemented in any of the modelling techniques which means the data has to be normalized, error-free and should contain only the relevant information. Here the needed work is change the image filename as the writer name followed by the Unicode character so that each image will get a unique file name. And then all the images with the same Unicode were grouped under a folder with Unicode as its folder name so that system can use the load files method in sklearn.datasets2 module. Then, the original unequally sized rectangular images are resized into 80 × 80 quadratical images and saved as JPG files.
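An illustrative sketch of this preparation step is shown below, assuming one folder per Unicode character as described; the directory names are hypothetical, and the snippet uses sklearn.datasets.load_files together with Pillow rather than reproducing the authors' exact script.

```python
# Sketch of the folder-based loading and 80x80 grayscale resizing described above.
from pathlib import Path
from PIL import Image
from sklearn.datasets import load_files

dataset = load_files("devanagari_images", load_content=False)   # assumed folder layout

out_dir = Path("devanagari_resized")
out_dir.mkdir(exist_ok=True)
for filename, target in zip(dataset["filenames"], dataset["target"]):
    img = Image.open(filename).convert("L")        # grayscale image
    img = img.resize((80, 80))                     # 80 x 80 quadratical image
    label = dataset["target_names"][target]
    (out_dir / label).mkdir(exist_ok=True)
    img.save(out_dir / label / (Path(filename).stem + ".jpg"))
```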
Fig. 36.1 The Devanagari script
Step-3 Feature Extraction In this process we extract the required information from the entire dataset in order to increase the prediction efficiency. Hence, the dataset is scrutinized based on the key factors which affect the output and all the data which is unrelated to the output is removed. Step-4 Model Training The most common method is to take all the available data and split the dataset into training and evaluation subsets. The rule for splitting the data is 80–20 where in training is 80 and testing is 20, respectively. The model uses the training dataset to train models in order to view patterns and assemble the data to evaluate the predictive quality of service. The model evaluates the predictive efficiency by comparing the results from the evaluation dataset with the true value, i.e. theoretical value using a variety of metrics. Step-5 Prediction If the prediction is 75% then the system is taken for deployment. Machine learning uses the previously analysed data to predict answers. Hence, prediction is the main step where the user gets the answers to the questions. This is the point where the model is ready to put to practice and finally use to predict.
36.3 Implementation The model uses CNN and DNN for the processes of feature extraction and two different classifiers for the process of classification based on which the recognition is done. The following steps are followed: (1) (2) (3) (4) (5) (6) (7)
Develop a CNN model to extract and configure features of the images. Feature extraction from the images using convolutional and deep neural networks [6]. Using the classifiers, the process of classification is done where in the 36 unique characters are separated. Random forest (classifier-1). K-nearest neighbours (classifier-2). The developed model has to be trained on the training dataset of 1700 images of each character in the dataset. The trained model is tested on the testing dataset before being deployed and tested in the real time environment (Figs. 36.2 and 36.3).
Data Preprocessing The dataset chosen from the UCI comprises of training as well as test data. For each of the 36 characters in Devanagari script, the dataset contains 1600 images where in
Fig. 36.2 Flowchart of proposed system
the folders are named in English based on the sound of the alphabet. Hence, the data are pre-processed by extracting the character using the name of the data folder and storing them in the form of a label array which can further be implemented for the process of training. The images used in the training are 32 × 32 grayscale in nature, which are to be converted into an array and stored in the form of an image matrix. Feature Extraction The features which are to be extracted are to be chosen or selected in a way that lessens the intra-class inconsistencies and increments the inter-class segregation in the feature space. The convolutional neural network (CNN) is considered the best feature extraction neural model by various developers. The scope of the extraction is approved and analysed using the deep neural networks (DNN) in combination with the CNN. "ReLU" is a popularly used activation function in the input and hidden layers, whereas "sigmoid" is the popular activation function used in the output layer. The features thus fetched from the layers of higher density are considered and are further passed on to the process of classification, where various classifiers are used to discriminate the characters (Fig. 36.4).
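A hedged sketch of a feature-extraction network of the kind described (ReLU in the hidden layers, sigmoid at the output) is given below in Keras; the exact layer sizes are assumptions, since the chapter does not list the architecture.

```python
# Illustrative CNN feature extractor for 32x32 grayscale character images.
from tensorflow import keras
from tensorflow.keras import layers

feature_extractor = keras.Sequential([
    layers.Input(shape=(32, 32, 1)),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),        # dense features passed to the classifiers
    layers.Dense(36, activation="sigmoid"),      # output layer, one unit per character class
])
feature_extractor.summary()
```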
Fig. 36.3 The output parameters
Fig. 36.4 Steps in feature extraction process
36.4 Results The features which are extracted using the convolutional and deep neural networks are passed to the different classifiers which are used for the purpose of classification. The main objective of the classifiers is to predict the target labels based on the dense extracted features. As we all know there are 36 different characters, the different target labels are created for the characters which raises the multiclass classification problem. The two main classifiers which are used for the classification of the characters are random forest classifier and K-nearest neighbour algorithms.
Fig. 36.5 The accuracy score
Random Forest Classifier The random forest can accommodate an n-number of decision tree classifying algorithms on different sub-samples of the data and calculates the averages in order to increase the predictive accuracy. The model trained on all the 36 characters had less accuracy due to the various problems of the data; hence the value was limited to 60–68%. Nevertheless, the model trained on a limited number of characters had higher predictive probability, with accuracy values around 85–93%. Unfortunately, for a model on which the cropped images were put to train and validate, the random forest accuracy dropped to 22–30% (Fig. 36.5). K-Nearest Neighbour The above-mentioned classifier implements the methodology of the k-nearest neighbours vote, i.e. the nearest neighbour calls the vote of the stroke or line. If the model is trained on the entire dataset of 36 unique characters and validated, the classification accuracy was around 78–82%, whereas if the model is trained on a limited or small number of target classes then the classification accuracy was around 88–92%. Unfortunately, the accuracy of the model dropped to 25% on being trained and validated with the cropped images of characters which were used in the experiment. Prediction See Fig. 36.6. The process of prediction is done in various ways as listed below. The full image of the handwritten Devanagari character is predicted by using this trained model on the dataset comprising full character images of the Devanagari characters. The designed model predicts such cases with very high accuracy as compared to many other systems which were used in this domain. However, the characters in the Devanagari script are similar in their partial images, so much accuracy cannot be achieved.
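As an illustration of the comparison above, the two classifiers can be trained on the extracted dense features as follows; the random arrays are placeholders for the CNN/DNN features, and the hyperparameters shown are generic defaults rather than the tuned values.

```python
# Sketch of the random forest vs. k-nearest neighbour comparison on dense features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
features = rng.random((1800, 128))        # placeholder dense feature vectors
labels = rng.integers(0, 36, size=1800)   # 36 character classes (dummy labels)

X_tr, X_te, y_tr, y_te = train_test_split(features, labels, test_size=0.2, random_state=0)
for clf in (RandomForestClassifier(n_estimators=200), KNeighborsClassifier(n_neighbors=3)):
    clf.fit(X_tr, y_tr)
    print(type(clf).__name__, accuracy_score(y_te, clf.predict(X_te)))
```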
Fig. 36.6 The accuracy score of the k-nearest neighbour
To predict the full image of a printed Devanagari character, the model is trained on the handwritten Devanagari character dataset. Hence, using Tesseract, the model can also be used to make predictions on the printed data too.
36.5 Model Evaluation and Validation The developed system has proved with test accuracy of 91% and training accuracy of 95% on Devanagari character dataset with 20 epochs, if we increase number of epochs, then the accuracy will increase further. The loss reduced from 4.03 to 0.4 as the training progressed. For the first few epochs, the training accuracy is less than the validation accuracy and then after some epochs, train accuracy increased. The accuracy obtained by the model is greater than the benchmark reported earlier. It is likely that adding more epochs could increase the accuracy further. The benchmark reported is 90% of test accuracy. But by running 20 epochs, the model obtained 91% accuracy and 95% training accuracy. By adding few more epochs, the accuracy may increase further (Fig. 36.7). The following table shows the actual labels and the predicted labels of the first 50 images which show the capability of the model to predict almost all the correct labels except 2 or 3 out of 50 images (Fig. 36.8). Confusion matrix is used to check the performance of the neural network classification. Figure 36.9 portrays the plot of confusion matrix for the model used to categorize handwritten Devanagari characters. From this approach it is clear that the proposed system made use of only single characters of the Devanagari script. The main problem of the character recognition is that it is not able to recognize the vowels sounds and words [5] formed using the vowel sounds and characters which plays as a major drawback in the field of research.
Fig. 36.7 The accuracy of validation curves
Fig. 36.8 Actual labels versus predicted labels
36.6 Conclusion Handwritten character recognition is a hot topic in research, with immense experimentation being performed on pattern recognition. Steps such as preprocessing, segmentation, feature extraction, training methods, etc. contribute precisely to the accuracy of the system. Character recognition of Dravidian scripts is a difficult task because of the similar nature of the characters. Considering help from the neural networks for
Fig. 36.9 Plotting of confusion matrix
extorting the discriminating lineaments of the characters in the grayscale images has been effective in prospecting the certain aspects of image and classifying the characters using multiclass classifiers. The process of experimenting with the full and limited images of the characters helped immensely in studying the model’s accuracy in the excellence of the distilled features. It is well known that the deep convolution neural networks are very good at classifying the image data. There were many experiments conducted on handwritten character recognition using convolution neural networks for English alphabets, numbers, Kanji, Hangul and some of the Indian languages like Hindi, Bengali script, etc. But there is very less contribution on Hindi language character recognition. The number of contributions made to this language was very few which made it difficult to find a dataset consisting of all these characters. But this proposed work consisted of characters which can be compared to vowels and consonants in English metaphorically. Also, due to time constraint put on me, generating a dataset with all those characters was not feasible.
References
1. Pal, U., Jayadevan, R., Sharma, N.: Handwriting recognition in Indian regional scripts: a survey of offline techniques. ACM Trans. Asian Lang. Inf. Process. (2012). Article No.: 1. https://doi.org/10.1145/2090176.2090177
2. Jayadevan, R.: Offline recognition of Devanagari script: a survey. Available at https://ieeexplore.ieee.org/document/5699408
3. Kaur, S., Bawa, S., Kumar, R.: A survey of mono- and multi-lingual character recognition using deep and shallow architectures: indic and non-indic scripts. Artif. Intell. Rev. 53, 1813–1872 (2020)
4. Malanker, A.A., Patel, M.M.: Handwritten Devanagari script recognition: a survey. Available at http://www.iosrjournals.org/iosr-jeee/Papers/Vol9-issue2/Version-2/L09228087.pdf
5. Jayadevan, R.: Database development and recognition of handwritten Devanagari legal amount words. Available at https://ieeexplore.ieee.org/document/6065324/
6. Pradeep, J., Srinivasan, E., Himavathi, S.: Diagonal based feature extraction for hand written alphabet recognition system using neural network. Available at https://arxiv.org/ftp/arxiv/papers/1103/1103.0365.pdf
Chapter 37
Artificial Intelligence for On-Site Detection of Invasive Rugose Spiralling Whitefly in Coconut Plantation M. Kalpana and K. Senguttuvan
Abstract Coconut is a significant crop in the Indian economy and contributes 20% of the world’s production. Rugose spiralling whitefly is an invasive pest of coconut. It is a polyphagous pest with more than 200 host plants and can cause 15–20% yield reduction. Hence, there is an urgent need for management of this pest, and pest monitoring is one of the key strategies. Artificial intelligence tools such as neural networks, multinomial logistic regression, and random forest classifiers are used for identification of field-level images. The evaluation metrics for the different artificial intelligence tools are presented in order to compare the train and test accuracy in classifying Rugose spiralling whitefly images at field level. The random forest classifier is chosen to classify the three levels of images infested by Rugose spiralling whitefly, as its test accuracy of 97.50% is higher than that of the neural network and multinomial logistic regression. Image classification using the random forest classifier will enhance the real-time pest advisory system for Rugose spiralling whitefly for better management of coconut plantations.
M. Kalpana (B) Department of Social Sciences, Anbil Dharmalingam Agricultural College and Research Institute, Tiruchirappalli, India
K. Senguttuvan Department of Cotton, Tamil Nadu Agricultural University, Coimbatore, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 V. Bhateja et al. (eds.), Smart Intelligent Computing and Applications, Volume 1, Smart Innovation, Systems and Technologies 282, https://doi.org/10.1007/978-981-16-9669-5_37
37.1 Introduction Coconut is a significant crop in the Indian economy and contributes 20% of the world’s production. Rugose spiralling whitefly (RSW) is an invasive pest of coconut plantations. It is a polyphagous pest with more than 200 host plants. Hence, there is an urgent need for management of this pest. The whitefly identification system is developed with artificial intelligence tools. Currently, the incidence of whitefly in India is alarming due to its polyphagous nature and its spread to other coconut growing areas in Tamil Nadu, Kerala, Bihar, the North East, etc. The problem has been discussed by
farmers in various local and research forums, and many reports have confirmed the invasion of Rugose spiralling whitefly in coconut fields in India. Coconut is grown on 2.096 million ha in India, producing 23,798 million nuts with a productivity of 11,350 nuts/ha for the year 2017–18 (Coconut Development Board, India). The severity of RSW infestation ranged from 40 to 60%. Identification of this pest requires trained manpower; automated detection of RSW at an early stage of incidence will help farmers initiate management strategies to reduce further spread. Farmers are not aware of the pest complex and its implications, there is a shortage of trained manpower, and there is no image-based pest identification system, while the invasion and infestation of the Rugose spiralling whitefly on coconut palm have increased. The traditional system of whitefly identification is cumbersome because of the skill sets required and the time needed to develop pest advisories, and manual identification of whitefly by an entomologist is time consuming.
Image-based insect recognition is an emerging field of study concerning image processing and intelligent pattern recognition, replacing the traditional technique [1]. Insect recognition and classification have received much attention in recent decades. Image-based technology is used to overcome the shortcomings of traditional methods, such as manual identification of insects by experts, while enhancing accuracy and saving time. In fact, much research has been done on the classification and recognition of insects, and there have been many successful attempts at using machine learning to automate labour-intensive tasks [2]. Image-based insect recognition has a wide range of applications, especially in agriculture, ecology, and environmental science [3]. For eco-informatics research and the prevention of plant disease and insect pests, plant quarantine is essential. Insect detection has to be taken seriously, as insects present a severe threat because they can multiply alarmingly in a short period of time [4]. A well-developed intelligent system is very useful for laymen to identify insect species.
To monitor the water quality of rivers, population counts of aquatic insects are a valuable tool, but identification of species in the laboratory is time consuming and needs trained experts. An image identification system for stoneflies (Plecoptera) was developed using training and test sets, and the system was evaluated with respect to engineering requirements developed during the research, including image quality, specimen handling, and system usability [5]. DAIIS is a programme constructed to identify wing outlines in insects. The programme has two components: (1) digitization and elliptic Fourier transformation and (2) pattern recognition using a support vector machine. A sample of 120 owlflies (Neuroptera: Ascalaphidae) was taken and split into training and test sets. After training, the sample was sorted into seven species using the tool; in five repeated experiments, the accuracy ranged from 90 to 98%. DAIIS is therefore a useful tool for developing an automated insect identification system [6]. Due to smartphone penetration and advances in computer vision, deep learning is used to assist disease diagnosis in crops. A dataset consisting of 54,306 images of diseased and healthy plant leaves was collected from 14 crop species for 26 diseases. A deep convolution neural network was used to train the model, which achieves an accuracy of 99.35% on the test dataset.
With this image dataset, smartphone-assisted crop disease diagnosis was developed [7]. In stored grains, insects cause major losses, and early-stage identification and monitoring of insects are essential.
An image processing system embedded in smartphones identifies and counts insects using sliding window-based binarization and domain-based histogram statistics; a mobile application was developed to identify random insect images with an accuracy of 95% [8]. A recurrent neural network has been used to detect the crop suited to specific environmental conditions and to suggest the desired crop to grow in the field; it identifies the suitable crop based on the climatic conditions. The experiment was done using a decision tree classifier, logistic regression, a random forest classifier, a multilayer perceptron, a support vector machine classifier, and a recurrent neural network, and the results show that the recurrent neural network performs better than the other methodologies [9]. The invasion and infestation of the Rugose spiralling whitefly on coconut palm have increased in recent years. A field-level image identification system based on artificial intelligence tools will reduce the time gap in identification; the proposed programme has the potential to speed up identification of the whitefly and minimize the delay in issuing advisories.
37.2 Identification and Classification of Rugose Spiralling Whitefly Field Level Images using Artificial Intelligence In the field of agriculture, field-level image classification is a major challenge. In the current scenario, classification of the disease is important for monitoring coconut plantations using artificial intelligence approaches. The identified characteristics of field-level images for the AI application, covering the three levels of Rugose spiralling whitefly images, are shown in Fig. 37.1. Field-level images of Rugose spiralling whitefly are converted to the same dimensions, with a pixel size of 64 × 64, to classify the images. The sequential steps for detecting Rugose spiralling whitefly in field-level images are illustrated in Fig. 37.2. A total of 1669 images is collected for each level in a coconut plantation at Pollachi, Coimbatore, Tamil Nadu, India, and the total number of images collected for classification is 5007. The images are split into train and test datasets and coded as healthy, RSW-infested, and sooty mould leaflets. There are many artificial intelligence tools for image identification, but the present study uses three models to detect Rugose spiralling whitefly in field-level images: multinomial logistic regression, random forest classifier, and neural network. The classification model uses the image dataset collected from the coconut plantation. The image dataset is split into a training set (80%) and a test set (20%) to generate a training model as the first part of the algorithm; the test image dataset is used to compute the performance of the training model with accuracy as the metric. The entire analysis is carried out using Python 3.8.1. Table 37.1 shows the sequential steps in image classification.
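As a concrete illustration of the pipeline described above, the sketch below loads the field-level images, resizes them to 64 × 64 pixels, and performs the 80/20 train–test split; the folder layout and class-folder names are assumptions made for illustration rather than details reported in the chapter.

```python
# Sketch of the data preparation step, assuming one folder of JPEG images per class.
import numpy as np
from pathlib import Path
from PIL import Image
from sklearn.model_selection import train_test_split

CLASSES = ["healthy", "rsw_infested", "sooty_mould"]      # assumed folder names

def load_dataset(root="rsw_images"):
    images, labels = [], []
    for label, name in enumerate(CLASSES):
        for path in Path(root, name).glob("*.jpg"):
            img = Image.open(path).convert("RGB").resize((64, 64))
            images.append(np.asarray(img, dtype=np.float32) / 255.0)
            labels.append(label)
    return np.array(images), np.array(labels)

X, y = load_dataset()
X = X.reshape(len(X), -1)     # flatten 64 x 64 x 3 images for the classical models
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=42, stratify=y)
```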
Fig. 37.1 Three levels of field images of RSW in coconut: healthy leaflet, Rugose spiralling whitefly infestation leaflet, and sooty mould leaflet
37.2.1 Algorithm for Rugose Spiralling Whitefly Field Level Images
37.2.1.1 Multinomial Logistic Regression
Multinomial logistic regression is a statistical model that uses three target classes for classification. The inputs to the multinomial logistic regression are the images captured at field level: healthy leaflets, RSW-infested leaflets, and sooty mould leaflets of coconut. The leaflet images are treated as the inputs, and the analysis estimates the parameters of a logistic model. Mathematically, the multinomial logistic model holds three levels of images, X = [x1 (healthy images), x2 (RSW infestation), x3 (sooty mould)]. The logits (scores) are the output of the multinomial logistic regression model.
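A minimal scikit-learn sketch of this classifier is given below; it reuses the flattened image arrays and 80/20 split assumed in the earlier sketch, so the variable names are illustrative rather than taken from the paper.

```python
# Multinomial logistic regression over the flattened 64 x 64 leaflet images (sketch).
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

mlr = LogisticRegression(multi_class="multinomial", solver="lbfgs", max_iter=1000)
mlr.fit(X_train, y_train)

print("Train accuracy:", accuracy_score(y_train, mlr.predict(X_train)))
print("Test accuracy:", accuracy_score(y_test, mlr.predict(X_test)))
```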
Fig. 37.2 Architecture of Rugose spiralling whitefly field level images identification
Table 37.1 Sequential steps in image classification
Step 1: Collect the image dataset of Rugose spiralling whitefly field-level images in the coconut plantation
Step 2: Split the images into training and test sets
Step 3: Create the models using the image classification methods: multinomial logistic regression, random forest classifier, and neural network
Step 4: Take accuracy as the performance metric for the image classification models
37.2.1.2 Random Forest Classifier
The random forest classifier is an ensemble learning method for classification that constructs numerous decision trees at training time; the prediction is obtained by aggregating (averaging or voting over) the outputs of the individual trees. In the present study, the training image dataset consists of healthy, RSW-infested, and sooty mould leaflet images of coconut. The random forest classifier divides the image dataset into subsets, and each subset is given to a decision tree in the forest. Each decision tree produces its own output; for example, tree 1 may predict healthy, tree 2 RSW infested, and tree 3 sooty mould. The random forest classifier takes the majority vote to provide the final prediction for the RSW field-level images.
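The corresponding scikit-learn sketch is shown below, again reusing the assumed arrays and split; the number of trees is an illustrative choice, not a value reported in the chapter.

```python
# Random forest classifier over the flattened leaflet images (sketch).
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

rf = RandomForestClassifier(n_estimators=200, random_state=42)
rf.fit(X_train, y_train)

print("Train accuracy:", accuracy_score(y_train, rf.predict(X_train)))
print("Test accuracy:", accuracy_score(y_test, rf.predict(X_test)))
```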
37.2.1.3 Neural Network
A neural network is used to classify the three levels of field-level images of RSW. A neural network has interconnected processing elements called nodes, and the connections between nodes carry numerical values called weights. Each node takes many inputs from other nodes and gives a single weighted output, which is fed into other neurons, and the process is repeated. For RSW infestation, a three-layer neural network with an input layer, a hidden layer, and an output layer is used. The network is implemented with the sigmoid activation function: based on the weighted sum of the inputs, the sigmoid activation controls the amplitude of the outputs, which are mapped to the labels 0, 1, and 2. In the RSW classification, the healthy, RSW-infested, and sooty mould leaflets are identified with the output labels 0, 1, and 2, respectively. The accuracy of the model is calculated for both the train and test datasets.
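One way to realize such a network is with scikit-learn's multilayer perceptron, sketched below with a sigmoid (logistic) activation; the hidden-layer size is an assumption made for illustration, since the chapter does not state it.

```python
# Three-layer neural network (input -> hidden -> output) with sigmoid activation (sketch).
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

nn = MLPClassifier(hidden_layer_sizes=(128,), activation="logistic",
                   max_iter=500, random_state=42)
nn.fit(X_train, y_train)

print("Train accuracy:", accuracy_score(y_train, nn.predict(X_train)))
print("Test accuracy:", accuracy_score(y_test, nn.predict(X_test)))
```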
37.3 Experimental Results Accuracy is taken as the metric to evaluate the models. Accuracy is the fraction of the total RSW images that were correctly classified by the classifier and is calculated using the formula (TP + TN)/(TP + TN + FP + FN). True positive (TP) refers to predictions that correctly predict the positive class as positive, and true negative (TN) refers to predictions that correctly predict the negative class as negative. False positive (FP) refers to predictions that incorrectly predict the negative class as positive, and false negative (FN) refers to predictions that incorrectly predict the positive class as negative. The evaluation metrics for the train and test image datasets of the field-level images are presented in Table 37.2. For the present study, 80% of the images are used to train the model and 20% of the images are used for testing. The train and test accuracies are close for all the artificial intelligence models, as shown in Figs. 37.3 and 37.4. The train accuracy for the random forest classifier is 100%, and the test accuracy is 97.50%. So, the random forest classifier is identified as more effective than the multinomial logistic regression and neural network methods used for classification of field-level images (Table 37.2).
Table 37.2 Evaluation metrics for the field level images of RSW infestation in coconut
Methods                          Train accuracy (%)   Test accuracy (%)
Random forest classifier         100.00               97.50
Multinomial logistic regression  88.38                85.92
Neural network                   91.33                89.82
Fig. 37.3 Training accuracy for RSW infestation in coconut using artificial intelligence tools
Fig. 37.4 Test accuracy for RSW infestation in coconut using artificial intelligence tools
The random forest classifier will enhance the real-time pest advisory system for RSW for better management of coconut plantations.
37.4 Conclusion Among the artificial intelligence tools deployed for identification and classification of healthy and infested coconut leaflet images, the random forest classifier ranked first on train and test accuracy, followed by the neural network and multinomial logistic regression. The random forest classifier is therefore used to classify the three levels of images infested by Rugose spiralling whitefly, with a test-set accuracy of 97.50%, higher than that of the neural network and multinomial logistic regression. Image classification using the random forest classifier will enhance the real-time pest advisory system for the whitefly for better management of coconut plantations. As an early detection and prevention tool for Rugose spiralling whitefly in coconut plantations, it will be a great relief for farmers, helping them identify the symptoms at an early stage and take protective measures to prevent crop loss. The accuracy could be improved further by adding more images in subsequent analyses.
References
1. Zhu, L.Q., Zhang, Z.: Auto-classification of insect images based on color histogram and GLCM. In: Seventh International Conference on Fuzzy Systems and Knowledge Discovery (2010)
2. Htike, Z.Z., Win, S.L.: Recognition of promoters in DNA sequences using weightily averaged one dependence estimators. Procedia Comput. Sci. 23, 60–67 (2013)
3. Lu, A., Hou, X., Liu, C.-L., Chen, X.: Insect species recognition using discriminative local soft coding. In: 21st International Conference on Pattern Recognition (2012)
4. Davies, E.R.: Computer and Machine Vision: Theory, Algorithms, Practicalities, 4th edn. Academic Press, United States (2012)
5. Sarpola, M.J., Paasch, R.K., Mortensen, E.N., Dietterich, T.G., Lytle, D.A., Moldenke, A.R., Shapiro, L.G.: An aquatic insect imaging system to automate insect classification. Trans. ASABE 51(6), 1–9 (2008)
6. Yang, H.-P., Ma, C.-S., Wen, H., Zhan, Q.-B., Wang, X.-L.: A tool for developing an automatic insect identification system based on wing outlines. Sci. Rep. 5(12786), 1–11 (2015)
7. Mohanty, S.P., Hughes, D.P., Salathé, M.: Using deep learning for image-based plant disease detection. Front. Plant Sci. 7(1419), 1–10 (2016)
8. Zhu, C., Wang, J., Liu, H., Mi, H.: Insect identification and counting in stored grain: image processing approach and application embedded in smartphones. Mob. Inf. Syst. 1–5 (2018)
9. Agila, N., Senthil, K.P.: An efficient crop identification using deep learning. Int. J. Sci. Technol. Res. 9(01), 2805–2808 (2020)
Chapter 38
Prediction of Heart Disease Using Optimized Convolution Neural Networks (CNNs) R. Sateesh Kumar, S. Sameen Fatima, and M. Navya
Abstract As stated by the World Health Organization (WHO), coronary heart disease has been the cause of many deaths across the globe in the last 15 years. We can lessen the number of casualties if we discover heart disease at an early stage. In this paper, we present a model that properly and successfully predicts the disease in its early stages. There are various conventional techniques to predict such illnesses, but they are not sufficient for the present scenario. There is a direct need for a clinical examination system that predicts the ailment earlier and more accurately than the traditional methods. We strive to implement a more accurate model for the prediction of coronary heart disease using a convolutional neural network (CNN) (Albawi et al. in International Conference on Engineering and Technology (ICET) 2017:1–6, 2017) with the help of an ECG image dataset and the Adam optimizer. The results are compared with other optimizers such as RMSprop, SGD, Adagrad, and Adadelta, and the best accuracy score is obtained using the Adam optimizer.
R. Sateesh Kumar (B) · M. Navya Vasavi College of Engineering, Hyderabad, Telangana, India e-mail: [email protected]
S. Sameen Fatima Anurag University, Hyderabad, Telangana, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 V. Bhateja et al. (eds.), Smart Intelligent Computing and Applications, Volume 1, Smart Innovation, Systems and Technologies 282, https://doi.org/10.1007/978-981-16-9669-5_38
38.1 Introduction Cardiovascular diseases are among the most dangerous diseases of the contemporary world. According to a survey, more than 17.7 million deaths [1] occur worldwide annually because of heart illnesses; of these, four million have been due to coronary heart disease and around six million due to cardiac arrest. Our intention is to accurately predict heart disease, and this prediction has been done using deep learning methods. There has been increasing research in the field of health care in the last few years, particularly in pandemic situations. It has been observed
that heart disease is one of the deadliest sicknesses, causing many deaths in the world. It is also found that around 24% of the yearly deaths [2] in India are due to diverse kinds of coronary heart complaints. So, there is a need for a well-defined early prognosis system that forecasts the deaths occurring due to heart illnesses. Coronary heart illnesses [3], also referred to as cardiac illnesses, are normally caused by the narrowing of the coronary arteries that deliver blood to the heart. Strategies such as angiography are used for detecting coronary heart ailments, but many of these methods are costly; there is a need for healthcare products that offer quality outcomes at a lower cost, and healthcare organizations are also searching for clinical methods that can be performed non-invasively and cheaply. The development of a computer-based decision-aid system for the prognosis of numerous illnesses can help organizations cater to the needs of millions of human beings around the world. The traditional methods and equipment generate a large amount of data [4], which helps in training machine learning algorithms to develop an early prognosis model for heart problems. Numerous clinical features can be utilized by machine learning algorithms for predicting the likelihood of a patient’s heart problem. The proposed algorithm makes use of an aggregate of these features for categorizing healthy and non-healthy patients using the ECG image dataset.
38.2 Literature Survey Chitra et al. proposed prediction of coronary heart disease using a supervised learning classifier; 13 attributes of the UCI heart disease dataset are used as the training set, and the results are compared with an SVM. Abhishek et al. [3] predicted heart disease using data mining techniques: around 7008 patient records were collected from echocardiography reports, 20 variables were extracted from the reports, and with expert advice 15 attributes were used with decision tree, neural network, and Bayesian classifiers; the algorithms were evaluated based on classification accuracy and area under the ROC curve. Al-Milli et al. proposed a backpropagation neural network for predicting coronary heart disease using the UCI Cleveland dataset; 166 records were used for training and 50 for testing, with three hidden layers of 8, 5, and 2 neurons, respectively. Ishtake et al. [5] built an intelligent heart disease prediction system using data mining techniques; a total of 909 records with 15 clinical attributes (factors) was obtained from the Cleveland heart disease database, with decision trees and neural networks used as classification algorithms, while regression, association rules, and clustering were used as prediction algorithms. Medhekar et al. [6] used Naive Bayes for predicting the risk of heart disease; the paper presents a classifier method for the detection of coronary heart disease and
suggests how Naive Bayes can be used for classification purposes; results and analysis are carried out on the Cleveland dataset. Revathi et al. presented a comparative study of heart disease prediction systems using data mining techniques; they discussed various data mining approaches that might be beneficial in predicting heart disease, with the principal goal of creating a prototype heart disease prediction system using neural networks, Naive Bayes, and decision tree techniques. Jabbar et al. [7] classified heart disease using KNN and a genetic algorithm; KNN along with a genetic algorithm is used for classification, the proposed algorithm is tested on seven datasets, KNN + GA gave 100% accuracy for the heart disease dataset, and as the k value increases, accuracy decreases. Krishnaiah et al. described a heart disease prediction system using data mining techniques and an intelligent fuzzy approach; their study reported different accuracies for different numbers of features and methods used in the implementation. Dangre et al. presented an improved heart disease prediction system using data mining classification techniques; advanced data mining techniques are used to discover knowledge in databases for clinical research, in particular for heart disease prediction. Soni et al. proposed an intelligent and effective heart disease prediction system using weighted associative classifiers; the aim of that paper is to design a GUI-based interface to enter the patient record and predict whether the patient has heart disease using the weighted association rule-based classifier.
38.3 Methodology The model is implemented using deep learning algorithms with different activation functions. Figure 38.1 shows the methodology proposed in this paper. We take the ECG image dataset as input to the model and divide the data into two parts: 1405 images of training data with two classes (i.e., normal and abnormal) and 485 images of testing data with the same two classes.
38.3.1 Constructing the Convolution Neural Network (CNN) A CNN is a particular type of neural network with multiple layers. Figure 38.2 shows the entire architecture of the CNN. It processes data that have a grid-like [6] arrangement and then extracts essential features. One big benefit of using CNNs is that you do not need to do much preprocessing on images. With most algorithms that deal with image processing, the filters are generally created with
Fig. 38.1 Technique
Fig. 38.2 Structure of convolution neural network
the aid of an engineer based on heuristics. A CNN can learn which characteristics inside the filters are the most important. That saves a lot of time and trial-and-error work, since we do not need as many parameters; this does not appear to be a massive saving until you are working with high-resolution photographs that have lots of pixels. The convolutional neural network’s predominant purpose is to get the data into a form that is simpler to process without dropping the features that are important for figuring out what the data represent. This also makes CNNs excellent candidates for dealing with huge datasets.
A big difference between a CNN and a normal neural network is that a CNN uses convolutions to handle the mathematics behind the scenes: convolution is used instead of matrix multiplication in at least one layer of the CNN. A convolution takes two functions and returns a function. CNNs [2] work by applying filters to the input data, and what makes them so special is that a CNN is capable of tuning the filters as training takes place. In that manner, the results are fine-tuned in real time, even with big datasets such as image collections. Because the filters can be updated to train the CNN better, this removes the need for hand-crafted filters and gives us extra flexibility in the number of filters we can apply to the dataset and in the relevance of those filters. With this approach, we are able to work on more sophisticated problems like face recognition. One of the factors that prevents many problems from using CNNs is a lack of data: while networks can be trained with relatively few data points (~10,000), the more data available, the better tuned the CNN will be. Just understand that these data points need to be clean and labeled for the CNN to be able to use them.
38.3.2 Convolution Layer A convolution [8] is a combined integration of two functions that shows how one function modifies the other. There are three crucial elements in this operation: the input image, the feature detector, and the feature map. The input image is the picture being analyzed; the feature detector is a matrix, normally 3 × 3 (it can also be 7 × 7). The convolution layer (conv) uses filters that carry out convolution operations as it scans the input with respect to its dimensions. Its hyperparameters consist of the filter size and stride, and the resulting output is called a feature map or activation map (Fig. 38.3).
Fig. 38.3 Convolution layer
Fig. 38.4 Max pool
Fig. 38.5 Average pool
38.3.2.1 Pooling Layer
The pooling layer (pool) is a downsampling operation, normally applied after a convolution layer, which introduces some spatial invariance. Max pooling and average pooling are the main types of pooling layers, taking the maximum value or the average value, respectively. In max pooling, every pooling operation selects the maximum value of the current window; in average pooling, each pooling operation averages the values of the current window (Figs. 38.4 and 38.5).
38.3.3 Fully Connected Layer After flattening, the flattened feature map is passed through a neural network. This step is made up of the input layer, the fully connected layer [5], and the output layer. The fully connected layer is similar to the hidden layer in ANNs, but in this case it is fully connected. The output layer is where we get the predicted classes. The data are passed through the network, the prediction error is calculated, and the error is then backpropagated through the system to improve the prediction.
Fig. 38.6 Fully connected layers
The final values produced by the neural network do not normally add up to one. However, it is important that these values are brought down to numbers between zero and one that represent the probability of each class; that is the role of the softmax function (Fig. 38.6).
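A minimal Keras sketch of such a network for the two-class ECG images is shown below; the layer sizes and the 64 × 64 grayscale input resolution are assumptions made for illustration, not the configuration reported in the paper.

```python
# Sketch: small CNN with convolution, max-pooling, flatten, fully connected, and softmax layers.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(64, 64, 1)),              # assumed grayscale ECG image size
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),         # fully connected layer
    layers.Dense(2, activation="softmax"),        # class probabilities (normal / abnormal)
])
model.summary()
```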
38.3.4 Compiling CNN Using Adam Optimizer An optimizer’s function is to reduce the work and time required to train the network and obtain the weights at every stage, to show a better bias–variance trade-off, and to decrease computational time. The Adam optimizer is an extension of stochastic gradient descent; it is used to update the weights iteratively while training the network [7].
Input:
i. Alpha is the learning rate.
ii. β1 and β2 are hyperparameters with default values β1 = 0.9 and β2 = 0.999.
iii. Epoch is the maximum number of iterations.
iv. Initialize M, N, m, and n to 0, where M and m are weighted averages of the past gradients, and N and n are weighted averages of the squares of the past gradients before bias correction.
v. P is the weight input provided, and r is the bias that gets multiplied with the weights at every neuron.
Output (per iteration q):
i. Update M and m like momentum:
M = β1 × M + (1 − β1) × dP
m = β1 × m + (1 − β1) × dr
ii. Update N and n like RMSprop:
N = β2 × N + (1 − β2) × dP²
n = β2 × n + (1 − β2) × dr²
iii. Apply bias correction:
Mcorrected = M/(1 − β1^q), mcorrected = m/(1 − β1^q)
Ncorrected = N/(1 − β2^q), ncorrected = n/(1 − β2^q)
iv. Update the parameters P and r:
P = P − learning rate × Mcorrected/(√Ncorrected + ε)
r = r − learning rate × mcorrected/(√ncorrected + ε)
v. Repeat from step i until no additional correction is required.
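To make the update rule concrete, here is a small NumPy sketch of a single Adam step for one parameter vector, following the equations above; the epsilon term and the toy gradient are illustrative assumptions.

```python
# One Adam update step for a parameter vector, mirroring the equations above (sketch).
import numpy as np

def adam_step(param, grad, M, N, q, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    M = beta1 * M + (1 - beta1) * grad          # momentum-like first moment
    N = beta2 * N + (1 - beta2) * grad ** 2     # RMSprop-like second moment
    M_hat = M / (1 - beta1 ** q)                # bias correction
    N_hat = N / (1 - beta2 ** q)
    param = param - lr * M_hat / (np.sqrt(N_hat) + eps)
    return param, M, N

w, M, N = np.zeros(4), np.zeros(4), np.zeros(4)
for q in range(1, 4):                           # a few illustrative iterations
    grad = np.random.randn(4)
    w, M, N = adam_step(w, grad, M, N, q)
```

In a Keras workflow, selecting this optimizer typically amounts to calling model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]), although the paper does not spell out its exact training configuration.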
38.4 Results
38.4.1 Confusion Matrix See Table 38.1.
Table 38.1 Confusion matrix [8]
      X       Y
X     243.0   3.0
Y     14.0    225.0
Table 38.2 Comparison table of optimizers
Optimizers   Precision (%)   Recall (%)   F1-Score (%)   Accuracy (%)
Adam         99              94           96             96
RMSprop      97              87           92             92
SGD          57              99           72             62
Adagrad      50              100          66             50
Adadelta     49              100          66             49
Fig. 38.7 Comparison chart
38.4.2 Classification Report See Table 38.2 and Fig. 38.7.
38.5 Conclusion Convolution neural networks are multi-layer neural networks that are good at extracting features from data. They work well with images and do not require many preprocessing techniques. Using convolutions and pooling to reduce an image to its basic features, images can be recognized effectively. It is easier to train CNN models with fewer initial parameters than with other kinds of neural networks, and a large number of hidden layers may not be needed because the convolutions handle much of that work. The overall observation confirms that coronary heart disease can be predicted more accurately using convolution neural networks.
References
1. Sateesh Kumar, R., Sameen Fatima, S., Anna, T.: Heart disease prediction using ensemble learning method. Int. J. Recent Technol. Eng. 9(1), 2612–2616 (2020)
2. Albawi, S., Mohammed, T.A., Al-Zawi, S.: Understanding of a convolutional neural network. In: 2017 International Conference on Engineering and Technology (ICET), pp. 1–6 (2017). https://doi.org/10.1109/ICEngTechnol.2017.8308186
3. Abhishek, T.: Heart disease prediction system using data mining techniques. Orient. J. Comput. Sci. Technol. 6(4), 457–466 (2013)
4. Purushottam, Saxena, K., Sharma, R.: Efficient heart disease prediction system. Procedia Comput. Sci. 85, 962–969 (2016)
5. Ishtake, S.H., Sanap, S.A.: Intelligent heart disease prediction system using data mining techniques. Int. J. Healthc. Biomed. Res. 1(3), 94–101 (2013)
6. Shinde, R.M., Arjun, S., Patil, P., Waghmare, P.J.: An intelligent heart disease prediction system using K-means clustering and Naïve Bayes algorithm. Int. J. Comput. Sci. Inf. Technol. 6(1), 637–639 (2015)
7. Jabbar, M.A., Deekshatulu, B.L., Chandra, P.: Classification of heart disease using K-nearest neighbor and genetic algorithm. Procedia Technol. 10, 85–94 (2013)
8. Ghumbre, S., Patil, C., Ghatol, A.: Heart disease diagnosis using support vector machine. In: International Conference on Computer Science and Information Technology, pp. 84–88 (2011)
Chapter 39
Sentiment Analysis using COVID-19 Twitter Data Nagaratna P. Hegde, V. Sireesha, K. Gnyanee, and G. P. Hegde
Abstract The COVID-19 pandemic has essentially transformed the way millions of people across the world live their lives. As offices remained closed for months, employees expressed conflicting sentiments on the work from home culture. People worldwide now use social media platforms such as Twitter to talk about their daily lives. This study aims to gage the public’s sentiment on working from home/remote locations during the COVID-19 pandemic by tracking their opinions on Twitter. It is essential to study these trends at this point in the pandemic, as organizations must decide whether to continue remote work indefinitely or reopen offices and workspaces, depending on productivity and employee satisfaction. Tweets posted on the live Twitter timeline are used to generate the dataset, accessed through the Tweepy API. About 2 lakh tweets relevant to remote work during the pandemic were tokenized and then passed to a Naive Bayes classifier that assigns a positive, negative, or neutral sentiment to every tweet. Our findings highlight the effects of the COVID-19 pandemic on population sentiment, especially those resulting from the work from home policy.
N. P. Hegde (B) · V. Sireesha Department of Computer Science and Engineering, Vasavi College of Engineering (A), Hyderabad, Telangana, India e-mail: [email protected] V. Sireesha e-mail: [email protected] K. Gnyanee NCR Corporation, Hyderabad, India e-mail: [email protected] G. P. Hegde Department of Information Science and Engineering, SDMIT, Ujire, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 V. Bhateja et al. (eds.), Smart Intelligent Computing and Applications, Volume 1, Smart Innovation, Systems and Technologies 282, https://doi.org/10.1007/978-981-16-9669-5_39
39.1 Introduction As the dot-com bubble in the United States increased the world has seen an increase in the usage of Internet Websites which resulted in an increase in the use of social media and microblogging platforms such as Twitter, Facebook, and Instagram in recent years [1, 2]. Twitter, generates approximately 500 million tweets a day, with users sending out about 6000 tweets every second. Analyzing this barrage of data existing on the Internet can be used to gage public sentiment. On social networks sites, users discuss topics ranging from politics to movie reviews [3–5]. The data collected from social media are beneficial for companies to analyze what people think about their products and services. Private organizations and governments have begun concentrating on mining public views on social media to understand their opinions more precisely. Manually, analyzing the data and comprehending the outcomes is a laborious. Hence, it is vital to leverage computing power to process the data and obtain statistically sound results. Sentiment analysis is used to identify and classify subjective information in textual data [6–9]. Generally, the polarity of these statements is categorized as “positive,” “negative” or “neutral.” Sentiment analysis employs three subtasks to identify the polarity accurately: 1. 2.
3.
Determination of a phrase to be positive, negative, or neutral is done using phrase-level determination. Sentences in a given message are marked positive, negative, or neutral. For sentences that express mixed sentiments, whichever polarity is stronger is chosen. Document analysis emphases on aligning the whole message in the document as positive, negative, or neutral.
Sentiment analysis uses statistical machine learning methods and natural language processing (NLP) to project correct outcomes by making sense of the intricacies of human language. The accuracy of result obtained through sentiment analysis has pushed organizations to use it in various fields. Companies belonging to marketing sector use opinion mining to develop strategies to improve the business, understand customers’ satisfaction toward products or brands investors use sentiment analysis to understand a company’s standing and people’s sentiment toward it before investing in it. Pharmaceutical companies can get an estimate of the demand–supply of their drugs or vaccines in the actual market, as seen during the COVID-19 pandemic in India. From the conception of sentiment analysis, it is widely used in politics to keep track of political views, detect the success and failure of their campaigns, identify the inconsistencies in governance, and predict election results. Sentiment analysis is used to determine the general mood of the public and help respond to it. Analyzing tweets come with a set of challenges, as in addition to being written in common usage language, they are generally prepared with noisy, incomplete, and unproductively written sentences. As Twitter has a limit of 140 characters, users commonly use shortened sentences and acronyms, which change according to dialect.
It is easier to understand the exact feeling communicated by the user if the sentences are written following grammatical and semantic structure. Since most tweets do not follow such a structure, comprehending them can be quite a challenging task. To make the data suitable for analysis and obtain accurate results, the system is conceptualized in three stages: data extraction, pre-processing, and polarity assignment. In the data extraction phase, the system uses the Tweepy library to get the data as per query supplied. This process is carried out via the methods offered in the Twitter API. Next, in the pre-processing stage, the system uses the TextBlob library to tokenize the tweets, use stop words to ignore words irrelevant to the analysis, perform parts of speech (PoS) tagging, et cetera. These tokens are then fed into the Naive Bayes classifier, a probabilistic classifier trained on a previously classified database that classifies the tweets and assigns a polarity to each of them.
39.2 Proposed Scheme This research aims at answering a specific question: if remote work is a viable option for organizations post-pandemic, depending on employee opinions as expressed on Twitter. So, the proposed scheme analyzes the opinions about work from home policies during the COVID-19 period and understand the sentimentality on a large scale. The high-level system design of the proposed structure is given in the Fig. 39.1. This structure is used to quantify opinion estimation of remote work policies. The suggested framework mainly consists of data extraction, pre-processing, and sentiment determination modules. These modules are discussed in detail in the below sections. Fig. 39.1 Proposed scheme
39.2.1 Data Extraction The study aims to test user opinion as it changes in real time, so tweets that match the subject of the study are extracted from the live timeline. The Twitter API is used to collect tweet data, which are stored in a database, and Tweepy is the Python library used to access the Twitter API. Internally, the Twitter API uses two streaming endpoints to deliver the tweets. The sampled stream endpoint is used when the general mood of the audience and global events are to be monitored, as it returns a sample of 1% of the tweets appearing on the live timeline. For more specific queries, such as searching for tweets related to work from home policies during the COVID-19 period, the filtered stream endpoint can be used to apply filters and rules to track particular topics or events and get matching public tweets. Any one of these filtering criteria, or a combination of them, can be specified for implementation. Data are acquired in parts at various points in time to increase their relevance; collecting all the data in a single iteration is discouraged.
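A minimal sketch of this collection step with the Tweepy library is given below; the credentials, the search query, and the tweet count are placeholders, and the exact method names depend on the Tweepy and Twitter API versions in use.

```python
# Sketch: collecting work-from-home tweets with Tweepy (v4-style API, placeholder credentials).
import tweepy

auth = tweepy.OAuthHandler("API_KEY", "API_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")
api = tweepy.API(auth, wait_on_rate_limit=True)

query = "work from home -filter:retweets"          # illustrative filter rule
tweets = [status.full_text
          for status in tweepy.Cursor(api.search_tweets, q=query,
                                      lang="en", tweet_mode="extended").items(200)]
print(len(tweets), "tweets collected")
```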
39.2.2 Data Pre-processing Tweets are highly non-structured; they contain grammatical and non-grammatical errors, linguistic variations, and extensive usage of acronyms and emoticons. So, it is crucial to process these data before performing any statistical analysis on them. For the purpose of data pre-processing, the TextBlob library is used. It is a Python library for text-processing operations with an easy-to-use API; TextBlob objects are internally treated as Python strings, and standard natural language processing (NLP) tasks are applied. The following series of data formatting techniques are applied to the raw data:
• Tokenization: A token is a sequence of characters grouped together as a meaningful unit, such as symbols and words. The process of obtaining the tokens from the tweet text is called tokenization. Tokens are stored internally in a database.
• Cleaning: User references (identified by the token “@”) and URLs (identified by the token “http”) are removed from the tweet text.
• Stemming: It is the process of normalizing text so that a derived word is reduced to its stem. For example, the root word “run” is obtained from the form “ran”:
>>> from textblob import Word
>>> w = Word("ran")
>>> w.lemmatize("v")
'run'
• Stop words removal: A few words such as "a," "an," "the," "he," "she," "by," "on," etc., are used purely for grammatical and referential purposes which serve no purpose while analyzing the data. These words, called stop words, can be
removed as they are ubiquitous but give no additional information regarding the subject.
• Parts of speech (PoS) tagging: Based on the context of the sentence, each word previously broken down into tokens is assigned a tag corresponding to its grammatical part of speech (noun, verb, adjective, etc.):
>>> sen = TextBlob("JAVA is a high-level programming language.")
>>> sen.tags
[('JAVA', 'NNP'), ('is', 'VBZ'), ('a', 'DT'), ('high-level', 'JJ'), ('programming', 'NN'), ('language', 'NN')]
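Putting these steps together, a small illustrative cleaning routine might look as follows; the regular expressions and the NLTK stop-word list are assumptions, since the chapter does not give its exact implementation.

```python
# Sketch: cleaning and tokenizing a tweet with TextBlob and NLTK stop words (assumed approach).
import re
from textblob import TextBlob
from nltk.corpus import stopwords    # requires a one-time nltk.download("stopwords")

STOP = set(stopwords.words("english"))

def preprocess(tweet: str):
    tweet = re.sub(r"http\S+|@\w+", "", tweet)   # drop URLs and user references
    tokens = TextBlob(tweet.lower()).words       # tokenization
    return [w.lemmatize("v") for w in tokens if w not in STOP]

print(preprocess("Loving #WFH! More time with family https://t.co/xyz @friend"))
```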
39.3 Sentiment Determination The processed data are suitable to apply statistical methods to determine the sentiment of a particular tweet, thereby determining the sentiment of the collected data as a whole. This process can be divided into two subtasks: classification and polarity assignment.
39.3.1 Classification A pre-classified dataset is used to train the quantitative model so that the model can categorize the data into particular classes. The processed Twitter data are passed to the trained Naive Bayes classifier model. Naive Bayes is a probabilistic machine learning technique that uses Bayes’ theorem to find the probabilities of classes assigned to texts by calculating the joint probabilities of words and predefined classes. This classification algorithm works with an assumption of independence among predictors, meaning that the presence of a specific feature in a class is independent of the other features (Fig. 39.2). The Naive Bayes algorithm works by predicting the probability of membership of a particular word for each class, and the class with the highest probability is assigned as the most likely class. The following equation characterizes the unigram word model used by Naive Bayes: P(word/obj) = P(word in obj)/P(total words in obj). According to Bayes’ rule, the probability that a tweet is objective is obtained by computing the probability of the tweet given the objective class and the prior probability of the objective class, where the term P(tweet) can be substituted as shown below.
Fig. 39.2 Training and prediction modules
P(obj/tweet) = P(tweet/obj) · P(obj)/P(tweet), where P(tweet) = P(tweet/obj) + P(tweet/subj)
p(wi /obj)/
n i=1
p(wi /obj) +
n
p(wi /subj)
i=1
To predict the class of a word, the highest probability is calculated by the maximum a posteriori (MAP) method. The MAP for a hypothesis is defined by the formulae, MAP(H ) = max(P(H/E)) = max(P(E/H ) ∗ P(H )/P(E)) = max(P(E/H ) ∗ P(H ))
where P(H) is the probability of hypothesis and P(E) is evidence probability.
39 Sentiment Analysis using COVID-19 Twitter Data
437
39.3.2 Polarity Assignment In the final module of the system, the emphasis is on the determination of sentiment as positive, negative, or neutral. Sentiment the proposed schema follows general sentiment analysis to determine user sentiment toward remote work during the COVID-19 pandemic. The classifier used in this case, the Naive Bayes classifier, is a probabilistic classifier, meaning that for a document “d,” out of all classes c ∈ C the classifier returns the class ˆ c, which has the maximum posterior probability given the document. The following equation denotes the estimate of the appropriate class. c = argmax P(c|d) where ceC. Sentiment analysis is performed on the parsed tweets by using the sentiment polarity method of the TextBlob library. sentiment.polarity > 0 => positive sentiment.polarity == 0 => neural sentiment.polarity < 0 => negative Along with a polarity score, the model also assigns a subjectivity score in the range [0.0, 1.0]. Here, 0.0 is very objective and 1.0 is very subjective. Along with polarity and subjectivity, an intensity score is assigned to every word in the lexicon to determine how much it modifies the next word. Let us consider the word “very,” it has a polarity score of 0.3, subjectivity score of 0.3, and intensity score of 1.3. The negation of this word by adding “not” before it multiplies the polarity by -0.5 but keeps the subjectivity constant. In the phrase “very great,” TextBlob ignores polarity and subjectivity but takes intensity score into account while calculating the sentiment of the next word, “great.” TextBlob assigns polarity and subjectivity to words and phrases and averages the scores for longer sentences/text. According to sentiments, tweets are labeled as: positive, negative, and neutral. Positive: Tweet is positive if the user expressed tweet has an upbeat/thrilled/ecstatic/cheery outlook or if the words mentioned have positive annotations. Negative: Tweet is considered as negative if the tweet has negative/angered/sad feelings or negative annotations are mentioned. Neutral: Tweet is neutral if the user does not expresses his view/opinion in the tweet and merely communicates facts.
39.4 Results Goal is to predict employees’ opinions toward work from home policies enforced during the COVID-19. To test the model, result is compared with the result obtained by VADER tool considering the 300 tweets. Performance of Naïve Bayes classification is done by finding accuracy and precision (Tables 39.1 and 39.2). Accuracy = no of correct predictions /no of predictions made
438
N. P. Hegde et al.
Table 39.1 Classification details
Method                                  Positive tweets   Negative tweets   Neutral tweets
Manual classification of tweets         92                35                173
Tweets classified by VADER              126               33                141
Tweets classified by proposed method    80                36                184
Table 39.2 Confusion matrix for the proposed method (total tweets = 300)
                      True positive   True negative   True neutral
Predicted positive    71              5               4
Predicted negative    7               24              5
Predicted neutral     14              6               164
Accuracy of the multiclass classification is given by the average accuracy of the three classes. The accuracy of the methods listed in Table 39.3 is shown in Fig. 39.3; similarly, precision is compared in Table 39.4.
Table 39.3 Accuracy comparison
Class             Accuracy by Naïve Bayes   Accuracy by VADER
Positive tweet    0.91                      0.70
Negative tweet    0.92                      0.87
Neutral tweet     0.90                      0.76
Fig. 39.3 Comparison of accuracy of classification
Table 39.4 Precision comparison
Class             Precision by Naïve Bayes   Precision by VADER
Positive tweet    0.88                       0.56
Negative tweet    0.66                       0.60
Neutral tweet     0.89                       0.85
Table 39.5 Classification of the tweets
Total tweets considered   Positive tweets    Negative tweets    Neutral tweets
200,000                   56,400 (28.20%)    26,600 (13.30%)    117,000 (58.50%)
39.5 Conclusion Our findings emphasize the positive effects of the pandemic on overall population sentiment, especially resulting from the work from home policy. In light of the changing effect of working in isolation on employees, organizations might consider shifting to workspaces as the sentiments slowly turned more negative in the latter half of the study period. Thus, considering the opinions of Twitter users, these findings have important implications for organizations to maintain their productivity levels.
References 1. Kumar, A., Sebastian, T.M.: Twitter sentiment classification using distant supervision. Int. J. Comput. Sci. Issues 9(4) (2012) 2. Kisan, H.S., Suresh, A.P., Kisan, H.P.: Collective intelligence & sentimental analysis of Twitter data by using Standford NLP libraries with software as a service. In: Proceedings of International Conference on Computational Intelligence and Computing Research 3. Bouazizi, M., Ohtsuki, T.: A pattern-based approach for multi-class sentiment analysis in Twitter. IEEE Access 5, 20617–20639 (2017) 4. Shahana, P.H., Omman, B.: Evaluation of features on sentimental analysis. Procedia Comput. Sci. 46, 1585–1592 (2015) 5. Nigam, P.P., Patil, D.D., Patil, Y.S.: Sentiment classification of Twitter data: a review. Int. Res. J. Eng. Technol. (IRJET), 5(7), 929–931 (2018) 6. Bhayani, R., Huang, L.: Twitter sentiment classification using distant supervision. CS224N Project Report, Stanford 1.12(2009) (2009) 7. Padmavathia, S., Ramanujam, E.: Naïve Bayes classifier for ECG abnormalities using Multivariate maximal time series, otif. Procedia Comput. Sci. 47(2015), 222–228 (2020). https:// www.sciencedirect.com/topics/engineering/naive-bayesclassifier
440
N. P. Hegde et al.
8. Al-Subaihin, S.A., Al-Khalifa, S.H.: A system for sentiment analysis of colloquial Arabic using human computation. Sci. World J. 14, 1–9 (2014) 9. Jianqiang, Z., Xiaolin, G.: Comparison research on text pre-processing methods on Twitter sentiment analysis. IEEE Access 5, 2870–2879 (2017)
Chapter 40
Speech Mentor for Visually Impaired People P. G. L. Sahithi, V. Bhavana, K. ShushmaSri, K. Jhansi, and Ch. Raga Madhuri
Abstract Book reading is a very interesting habit, but it will be difficult for visually impaired ones and the blind people. Braille-related machines help them to an extent but are not affordable to everyone. The current application which was build aims to help such blind people by making their daily tasks simple and even easy. The system was built with a camera that reads the content (like books, currency notes, and online parcel.) and gives them an output in the form of audio to the user. We have used Raspberry Pi to accommodate the portable camera and the audio output through headphones or speakers. This application uses optical character recognition (OCR) to extract text from images where we try to convert the text to speech and send audio signals as output. In addition the above ability, the system will be capable of extracting the text from labels on product packaging and can even identify currency notes, etc.
40.1 Introduction Books and different types of documents including newspapers are the best sources of data/information. But, this data sources are simply restricted to people with clear eye vision. According to a statement given by WHO, the visually impaired people in the world are around 285 million who are finding it difficult to lead a normal life. In India, we have about 8.7 million people who are visually impaired. We have read many news about blind children and old people with visual impairment who were hit by a train stating that “can’t read, so use new tech to let books speak” in Deccan Chronicle. This statement provoked us to find a new alternate to help the blind people. Speech mentor for visually impaired people could help the blind to a great extent. Our device gives them an opportunity to concentrate any book which P. G. L. Sahithi (B) · V. Bhavana · K. ShushmaSri · K. Jhansi · Ch. R. Madhuri Department of CSE VR Siddhartha Engineering College, Vijayawada, India Ch. R. Madhuri e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 V. Bhateja et al. (eds.), Smart Intelligent Computing and Applications, Volume 1, Smart Innovation, Systems and Technologies 282, https://doi.org/10.1007/978-981-16-9669-5_40
441
442
P. G. L. Sahithi et al.
is viewed as the fundamental piece of anybody’s life. Not all the blind people are familiar with the braille which they use for reading and writing purpose. Around 20 million blind people in the USA have visual impairments disability. They face a lot of issues in their work life. Our device is completely usable for the sighted people. In our project, we developed a device which capture the image of the object with the help of a camera and convert it into speech with the help of OCR. This device brings a new ray of hope into the lives of the visually impaired people and helps them to lead a normal day to day work life.
40.2 Literature Survey ROI [1] detection and text localization algorithms are used for identifying text from objects. A camera is employed to capture the objects, and audio is produced as output for the blind user. The merit of this prototype is that it can identify moving objects and can also read text labels. Region of interest pooling is an operation widely used in object detection tasks with convolutional neural networks. The Radon Signature [2] is employed to identify the directionality feature of clothing patterns. The Radon Signature (RadoSig) relies on the Radon transform. This prototype is used to identify the fabric through a camera, and an audio description is provided about the color and pattern of the material. A sight-to-sound human–machine interface is proposed that uses a camera for performing scene analysis [3]. Here, the machine captures the images, identifies the key objects, and then builds a map so that the blind user can reach the destination. The output is in the form of audio, and objects are detected through a smart device camera. The demerit of the prototype is that it takes a lot of time to identify all the possible ways to reach the destination by detecting all the obstacles. Images are trained using a CNN [4], and text is converted to speech; an ultrasonic sensor and the CNN are used to identify potholes on the road. Infrared transceiver sensor modules and the triangulation method [5] are used for distance detection of aerial objects. It uses a stick and glasses as the medium to capture the images and gives audio as output. This model can read text from aerial objects, and pothole identification is also possible. It converts the text from the captured image into a voice using a TTS module [6]. It can detect objects within a fine range, and its accuracy is about 98.3%. The demerit of the prototype is that the detection of front aerial and ground objects is not possible. Sensors for obstacle avoidance and image processing algorithms [7] are used for detection of objects. The system includes a reading assistant in the form of an image-to-text converter and provides audio as output. It can also detect moving objects, allows the user to read the text from any document, and informs the user about his or her distance from the object. The output medium is a digital magnetic tape [8], which can also be checked by an editor. The prototype provides the facility to listen to the audio again and again, and the speaking rate is adjustable within wide limits. Using iOS devices, the image in which text is present can be captured, and this text is converted into a natural human voice as audio using OCR and TTS [9]. The image capture is completed inside the application, avoiding the
use of applications with additional features such as access to the photo gallery. The features that were taken into consideration here are the type of device used for capturing, the type of feedback signals provided, the covered area, the weight, and the cost. In computer vision-based walking assistant approaches, different types of cameras are used to capture images from the real-time environment, and computer vision-based algorithms [7] are used to detect obstacles. The tasks of the system are indoor object detection, to find nearby walls, doors, elevators, or signage, and text extraction, to read the relevant text information. To conveniently and efficiently assist blind people in reading the text on objects in the hand, a camera-based assistive text reading system [10] is used for extracting significant text from objects with complex and dim backgrounds and with multiple patterns of text. The input color image is converted into gray scale, and then adaptive thresholding (AT) is applied to construct the binarized image. An unwanted line elimination algorithm [11] is employed to remove noise. The main goal of the interface is to provide a user-friendly environment. This interface also addresses the foremost concerns of (i) ensuring that image acquisition maintains the integrity of the source, (ii) reducing the character recognition errors caused by document placement errors, and (iii) determining an appropriate classification algorithm for accurate character recognition by proposing a new neural network algorithm [12]. In our survey, we also came across a computer vision-based procedure for restroom signage detection and recognition [13]; the detection method obtains the location of a signage within the image, and a recognition procedure is then performed to recognize the detected signage as "Women," "Men," or "Disabled." A system called OPT-CON generates patterns similar to the shape of the characters. Another type is "intelligent systems" [14].
40.3 Proposed System The proposed system includes a Raspberry Pi and a camera that captures the input image, which is then sent for text extraction and converted to audio output. It helps visually impaired individuals identify text on various backgrounds and acts as an alternative to the traditional braille machine. The system also provides audio output, which assists the user. Hardware specifications: Raspberry Pi 3 Model B+, 1.4 GHz 64-bit quad-core processor, dual-band 2.4 GHz/5 GHz wireless LAN, Bluetooth 4.2/BLE. The text reading system has two main parts: image-to-text and text-to-voice conversion. The system consists of a portable camera, a computing device, and a speaker, headphone, or Bluetooth device. The image processing module recognizes the image from the camera and converts the image into text. The voice processing module converts the text into audio and processes it with specific physical characteristics so that the audio can be easily understood. The first module, the OCR algorithm, converts the .jpg image to .txt format; the second module, voice processing, converts the .txt file to speech (.mp3). Optical character recognition
(OCR) is a technology which automatically recognizes and converts characters through an optical mechanism. The Tesseract engine was chosen for its extensibility and flexibility and because many communities have active researchers working on this OCR engine; in this project, it is used to identify English alphabets in the captured image. The output of optical character recognition (OCR) is text, which is stored in the form of a file (File.txt). The .txt document is then converted into a speech file, i.e., an audio file, and the audio is played through a headset so that the user can listen to it. Figure 40.1 depicts the workflow of the proposed system, which takes a sample image through the camera, then uses the Tesseract engine to extract text from the captured image, and then makes use of gTTS to convert this text to audio. This application includes the following modules:
Module 1: Image capture through camera: The smart glasses (with camera) are the input medium.
Module 2: Text extraction from image using OCR: Text extraction/recognition from the captured image is done through Tesseract OCR. The Python library pytesseract serves this purpose.
Module 3: Text-to-speech conversion: The extracted text is now converted into speech. Google Text-to-Speech (gTTS) can be used.
Fig. 40.1 Workflow
Fig. 40.2 Proposed system flowchart
Module 4: Audio through headphones/earphones: The obtained audio file is played through speakers or headphones. Figure 40.2 represents the proposed system flowchart. It all starts with image capture; the image is then sent through the Tesseract engine, which uses the optical character recognition algorithm to extract text from the image.
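A minimal end-to-end sketch of these four modules in Python is given below; the camera index, the file names, and the use of mpg123 for playback are assumptions for illustration, and the opencv-python, pytesseract, and gTTS packages are assumed to be installed on the Raspberry Pi.

import os
import cv2
import pytesseract
from gtts import gTTS

# Module 1: capture a frame from the attached camera (camera index 0 is an assumption)
camera = cv2.VideoCapture(0)
ret, frame = camera.read()
camera.release()
if not ret:
    raise RuntimeError("Could not read a frame from the camera")
cv2.imwrite("images.jpg", frame)

# Module 2: extract text from the captured image with Tesseract OCR via pytesseract
text = pytesseract.image_to_string(cv2.imread("images.jpg"), lang="eng")
with open("File.txt", "w") as f:
    f.write(text)

# Module 3: convert the extracted text to speech with Google Text-to-Speech
if text.strip():
    gTTS(text=text, lang="en").save("speech.mp3")

# Module 4: play the audio through headphones/speakers (mpg123 assumed installed)
os.system("mpg123 speech.mp3")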
40.4 Methodology 40.4.1 Optical Character Recognition (OCR) The proposed optical character recognition (OCR) method is divided into four steps as shown in Fig. 40.3, i.e., detection of objects, preprocessing techniques with text localization, text extraction, and text-to-speech conversion. Object region detection—to ensure that the hand-held object appears within the view of the camera, we used a camera with a fairly wide angle in this prototype.
Fig. 40.3 Schematic diagram of the proposed OCR method
A Logitech C270 webcam supports this need. However, this may result in other inappropriate, but possibly text-bearing, objects appearing within the camera view. Localization of text on the object—for localization of the text on an object, we have made use of various techniques involving image processing. The region of interest (ROI) is calculated easily once the edges of the foreground are blurred. The blurred image, if any, is identified and converted to a binary image with the foreground containing all the dark pixels and the background containing all the bright pixels. Extraction of the text—after the image processing has been done, the required text in the image is extracted using Tesseract OCR. This image further goes through a component analysis phase, where words, paragraphs, and lines are detected using a line finding algorithm. Finally, an editable .txt file is generated. Conversion of the text—using the Tesseract engine, the text-to-string conversion operation is performed on the obtained image. There are multiple text-to-speech conversion modules; using gTTS, the text is finally converted to speech, and the obtained output is played through headphones, so the user can access whatever information is required, whether reading a book, a poster label, handwritten text, etc.
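A rough OpenCV-based illustration of this localization step is given below; the blurring and thresholding choices are illustrative assumptions, not the exact values used in the prototype.

import cv2

image = cv2.imread("images.jpg")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Blur to suppress background texture before separating foreground from background
blurred = cv2.GaussianBlur(gray, (5, 5), 0)

# Binarize: dark foreground (text) pixels versus bright background pixels
_, binary = cv2.threshold(blurred, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# Take the bounding box of the largest connected region as the region of interest (ROI)
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
if contours:
    x, y, w, h = cv2.boundingRect(max(contours, key=cv2.contourArea))
    roi = image[y:y + h, x:x + w]
    cv2.imwrite("roi.jpg", roi)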
40.5 Results Firstly, we need to connect to the VNC Viewer by using the IP address. Connect the Raspberry Pi to it. Connect the camera to the Raspberry Pi and switch it on. Connect a mobile hotspot to the laptop and share the same with the Raspberry Pi. Now, note down the IP address of
Fig. 40.4 VNC Viewer establishing connection
the Raspberry Pi on a paper. Open VNC Viewer, type the address in the search bar, and hit enter. Make sure that both the laptop and the Raspberry Pi are connected to the same network. Now, the Raspberry Pi environment will be opened on the laptop. Figure 40.4 shows the VNC Viewer screen from where the connection to the Raspberry Pi is established. Figure 40.5 indicates the final connections and components of the working hardware. Based on the picture taken from the camera, the processing is done and the outputs are obtained. An image captured using the camera, as shown in Fig. 40.6, is given as input. Text from this image is extracted in the next step. Figure 40.7 shows how the captured image is saved as images.jpg, which is then processed.
Fig. 40.5 Working hardware
Fig. 40.6 Input image
Fig. 40.7 Processing the image
Now, text is detected and extracted from the input image as shown in Fig. 40.8. This is done through tesseract OCR engine.
Fig. 40.8 Detecting the text
40.6 Conclusion and Future Work In this paper, we developed a device that captures the image being scanned by the blind user and converts the text into speech. Despite its limitations, the developed speech mentor is a supporting aid for the blind. We have proposed a prototype system which captures images and reads the printed text on hand-held objects, product packaging, posters, etc. The system detects the object of interest, and the blind user has to simply click a button to capture the object. The proposed system provides an accuracy of 75–80%. Our future work will try to extend our localization algorithm(s) to produce the audio in a few other languages. We will also extend our algorithm to handle captured images with brighter backgrounds more efficiently. Furthermore, we will address many human interface issues related to reading text by blind users.
References 1. Yang, X., Yuan, S., Tian, Y.: Assistive clothing pattern recognition for visually impaired people. IEEE Trans. Human-Mach. Syst. 44(2), 234–243 (2014). https://doi.org/10.1109/THMS.2014. 2302814 2. Yi, C., Tian, Y., Arditi, A.: Portable camera-based assistive text and product label reading from hand-held objects for blind persons. IEEE/ASME Trans. Mechatron. 19(3), 808–817 (2014). https://doi.org/10.1109/TMECH.2013.2261083 3. Yang, G., Saniie, J.: Sight-to-sound human-machine interface for guid-ing and navigating visually impaired people. IEEE Access 8, 185416–185428 (2020). https://doi.org/10.1109/ ACCESS.2020.3029426 4. Islam, M.M., Sadi, M.S., Bräunl, T.: Automated walking guide to en-hance the mobility of visually impaired people. IEEE Trans. Med. Robot. Bionics 2(3), 485–496 (2020). https://doi. org/10.1109/TMRB.2020.3011501 5. Infant Abinaya, R., Esakkiammal, E., Pushpalatha: Compact camera based assistive text product label reading and image identification for hand-held objects for visually challenged people. Int. J. Comput. Sci. Info. Technol. Res. 3(1), 87–92 (2015) 6. Chang, W.-J., Chen, L.-B., Chen, M.-C., Su, J.-P., Sie, C.-Y., Yang, C.-H.: Design and implementation of an intelligent assistive system for visually impaired people for aerial obstacle avoidance and fall detection. IEEE Sens. J. 20(17), 10199–10210 (2020). https://doi.org/10. 1109/JSEN.2020.2990609 7. Khan, M.A., Paul, P., Rashid, M., Hossain, M., Ahad, M.A.R.: An AI-based visual aid with integrated reading assistant for the completely blind. IEEE Trans. Human-Mach. Syst. 50(6), 507–517 (2020). https://doi.org/10.1109/THMS.2020.3027534 8. Nye, P., Hankins, J., Rand, T., Mattingly, I., Cooper, F.: A plan for the field evaluation of an automated reading system for the blind. IEEE Trans. Audio Electroacoustics 21(3), 265–268 (2015). https://doi.org/10.1109/TAU.1973.1162464 9. Roberto, N., Nuno, F.: Camera reading for blind people. Proc. Technol. 16, 200–1209 (2014). ISSN 2212–0173 10. Yi, C., Tian, Y.: Assistive text reading from complex background for blind persons. In: Iwamura, M., Shafait, F. (Eds) Camera-Based Docu-ment Analysis and Recognition. CBDAR 2011. Lecture Notes in Computer Science, vol 7139. Springer, Berlin, Heidelberg (2012) 11. Rajkumar, N, Anand, M.G., Barathiraja, N.: Portable camera-based product label reading for blind people. Int. J. Eng. Trends Technol. (IJETT) 10(1) (2014)
12. Adjouadi, M., Ruiz, E., Wang, L.: Automated book reader for persons with blindness. In: Miesenberger, K., Klaus, J., Zagler, W.L., Karshmer, A.I. (eds) Computers Helping People with Special Needs. ICCHP 2006. Lecture Notes in Computer Science, vol. 4061. Springer, Berlin, Heidelberg (2006) 13. Wang, S., Tian, Y.: Camera-based signage detection and recognition for blind persons. In: Miesenberger, K., Karshmer, A., Penaz, P., Zagler, W. (eds) Computers Helping People with Special Needs. ICCHP 2012. Lecture Notes in Computer Science, vol. 7383. Springer, Berlin, Heidelberg (2012) 14. Conter, J., Alet, S., Puech, P., Bruel, A.: A low cost, portable, optical character reader for the blind. In: Emiliani, P.L. (eds) Development of Elec-tronic Aids for the Visually Impaired. Documenta Ophthalmologica Proceed-ings Series, vol. 47. Springer, Dordrecht (1986)
Chapter 41
An Extended Scheduling of Mobile Cloud using MBFD and SVM Rutika M. Modh, Meghna B. Patel, and Jagruti N. Patel
Abstract Mobile cloud refers to a cloud network which supports mobility. The mobility provides a lot of benefits, including fast execution of user requests and less wastage of resources. To provide the services, the cloudlet uses virtual machines (VMs), which not only share the load of the cloudlet but also increase the processing speed. This paper presents a unique allocation and utilization policy for the mobile cloud, which uses load management and a VM allocation and migration policy to provide the best services to the user. The VM selection is done via a support vector machine (SVM). The paper uses the modified best fit decreasing (MBFD) algorithm to place the VMs in the mobile environment.
41.1 Introduction Mobile cloud is an extended cloud platform which allows the mobile cloudlets to connect to each other and to fulfill the demands of the users [1]. Connecting to each other also allows the sharing of resources, which makes this architecture more flexible in terms of resource utilization and data processing. The mobile computing node (MCN) takes the request from the user and generally broadcasts it into the network. The node nearest to the user which can fulfill the demand of the user is selected as the computation device (CD) [2]. The CD can also ask for resources from the MCN. A CD cannot ask for resources from another CD directly; the MCN takes the resources from the other CD and shares them with the demanding CD. A CD may also contain virtual machines (VMs) which can share the load of the CD [3]. This paper R. M. Modh City Civil and Sessions Court, Ahmedabad, Gujarat, India M. B. Patel (B) A. M. Patel Institute of Computer Studies, Ganpat University, Mehsana, Gujarat, India e-mail: [email protected] J. N. Patel Department of Computer Science, Ganpat University, Mehsana, Gujarat, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 V. Bhateja et al. (eds.), Smart Intelligent Computing and Applications, Volume 1, Smart Innovation, Systems and Technologies 282, https://doi.org/10.1007/978-981-16-9669-5_41
presents a unique solution to share the load through VMs by allocating each VM to the suitable CD. Some previous allocation policies are listed in the related section [4].
41.2 Related Work This section describes the work done in the field of mobile cloud computing; the work is summarized in Table 41.1 in terms of the proposed methods, the tools used, and the results.
1. In the cloud data center, energy-efficient management is a challenging problem. Servers are always on and consume 60% to 70% of their power, so a lot of power is wasted due to low server utilization. Thus, there is a need for power cycling of servers to decrease this indirect energy utilization.
2. Live VM migration technology has emerged as a solution to the energy efficiency problem. For reducing the energy consumption in the cloud data center, VM migration outperforms static allocation policies by minimizing the number of active physical machines and shutting down idle servers. But it is risky from a QoS aspect to turn off resources in a dynamic environment; thus, proper monitoring and measurement of resource utilization are needed, and it must be decided when and which VMs are to be migrated to avoid performance interference.
3. To optimize VM migration, the correct target host for migration needs to be selected, as aggressive VM consolidation can cause performance disruption when VM footprint fluctuations cause congestion at unexpected rates. Thus, there is a requirement to select the exact host and decrease the congestion on the network so as to provide network bandwidth in the best possible way.
Table 41.1 A glance of existing work
[5] Proposed method: DVFS, minimum migration. Tool used: .NET-based platform. Outcome: Parameters such as SLA violation, energy consumption, and number of VM migrations were evaluated; SLA violation with a 40% interval between thresholds has been attained.
[6] Proposed method: Structural constraint-aware virtual machine placement (SCAVP), minimum–maximum pool of virtual machines. Tool used: Java; the parameters are evaluated for 20 to 100 VMs. Outcome: The problem of large data size has been resolved using the proposed algorithms; the time complexity of the proposed procedure has been measured, and an application with availability constraints is less complex than one without any restrictions, so the complexity is reduced by 30% for either type of restriction.
[7] Proposed method: ProfminVmMaxAvaiSpace, used to increase the profit by reducing virtual machines that have maximum available space, and ProfminVmMinAvaiSpace, used to minimize costs by reducing the VM that has minimal available space. Tool used: CloudSim. Outcome: The SLA violation of the proposed algorithm is less than 13%, and VM migrations have been reduced by up to 49%.
[8] Proposed method: MPC algorithm for a dynamic model of the process. Tool used: MATLAB. Outcome: The solution is most effective in highly dynamic conditions (such as flash crowd effects) where demand can change dramatically over time.
[9] Proposed method: Parallel processing and two algorithms for task scheduling, i.e., dynamic list scheduling in the cloud and dynamic min–min scheduling in the cloud. Tool used: CloudSim. Outcome: The energy consumption using the proposed algorithms has been reduced; dynamic cloud min–min scheduling (DCMMS) performs better than dynamic cloud list scheduling and has the smallest execution time.
[10] Proposed method: Three different algorithms, the first being a first fit decreasing algorithm and the remaining two based on the best fit decreasing algorithm. Tool used: Java. Outcome: Energy degradation of up to 3.24% has been observed; the energy efficiency problem arising in VM migration is resolved using the three new algorithms.
[15] Proposed method: A genetic algorithm (GA) has been used to optimize the MBFD performance through a fitness function; for cross-validation, a polynomial support vector machine (P-SVM) is used.
41.3 Proposed Solution VM migration is known as a technique for the optimization of energy consumption in mobile cloud data centers [11]. Virtual machine placement and migration have been a difficult task in recent years. If the correct virtual machine is not selected, it can result in an SLA violation or increase the number of migrations, which can negatively affect power consumption. Whenever a physical machine
cannot meet all the needs of the virtual machine, there is a requirement to migrate virtual machines [12]. During this process, virtual machines are migrated without interrupting work in the running state. Many researchers have done their best to minimize the number of migrations and SLA violations with various algorithms. Existing implemented algorithms are complex in nature and time consuming in finding and mapping a physical machine [13]. So our aim is to minimize SLA violation and power consumption along with the number of migrations; the MBFD algorithm and the SVM technique are used for power optimization through VM migration. The proposed algorithm also utilizes the dynamic voltage and frequency scheduling (DVFS) algorithm to reduce the energy consumption of the planned solution [14].
Algorithm 1 Modified_Best_Fit_Decreasing (MBFD), with VML = [80:20:260] and Host = [10:2:28]:
1. for itr = 1 to 10
2.   tvm ← vml[itr]
3.   thost ← Host[itr]
4.   for each VM in tvm do
5.     minPower ← MAX
6.     allocatedHost ← NULL
7.     for each host in thost
8.       if host has enough resources for VM then
9.         power ← estimatePower(host, VM)
10.        if power < minPower then
11.          allocatedHost ← host
12.          minPower ← power
13.    if allocatedHost ≠ NULL then
14.      allocate VM to allocatedHost
15.  return allocation
16. for each h in hostList do
17.   vmList ← h.getVmList()
18.   hUtil ← h.getUtil()
19.   bestFitUtil ← MAX
20.   while hUtil > THRES_UP do
21.     for each vm in vmList do
22.       t ← vm.getUtil() − hUtil + THRES_UP
23.       if t < bestFitUtil then
24.         bestFitUtil ← t
25.         bestFitVm ← vm
26.       else
27.         if bestFitUtil = MAX then
28.           bestFitVm ← vm; break
29.     hUtil ← hUtil − bestFitVm.getUtil()
30.     migrationList.add(bestFitVm)
31.     vmList.remove(bestFitVm)
32. return migrationList; apply DVFS()
33. for each h in migrationList
34.   vmList ← h.getList()
35. end for
36. allocatedPower ← (host power + VM power + network power)
37. Group ← HostId
38. TrainingData ← allocatedPower
39. svmStruct ← svmtrain(allocatedPower, Group)
40. TrueAllocation ← svmclassify(svmStruct, allocatedPower)
41. K ← find(migrationList = TrueAllocation)
42. D ← migrationList − K
43. add D to the un-migrated list
44. end for
45. apply DVFS()
Algorithm 1 defines the allocation process and the migration in the mobile cloud. The total utilized power in the allocation acts as the training data, and the allocated VMs are the target labels. If the classified structure is similar to that of the training structure, then it is a suitable allocation; else it is treated as a false migration and is rehandled. The performance of the above algorithm is evaluated using the CloudSim tool on Windows 7 with 260 VMs.
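A minimal Python sketch of the best-fit-decreasing placement idea from Algorithm 1 is given below; the power model, the data structures, and the capacity values are simplified placeholders rather than the CloudSim implementation used for the evaluation.

def estimate_power(host, vm):
    # Placeholder linear power model: idle power plus a utilization-proportional term
    util = (host["used_cpu"] + vm["cpu"]) / host["cpu"]
    return host["idle_power"] + (host["max_power"] - host["idle_power"]) * util

def mbfd_allocate(vms, hosts):
    allocation = {}
    # Sort VMs in decreasing order of CPU demand (the "decreasing" part of MBFD)
    for vm in sorted(vms, key=lambda v: v["cpu"], reverse=True):
        best_host, min_power = None, float("inf")
        for host in hosts:
            if host["cpu"] - host["used_cpu"] >= vm["cpu"]:   # enough resources?
                power = estimate_power(host, vm)
                if power < min_power:
                    best_host, min_power = host, power
        if best_host is not None:
            best_host["used_cpu"] += vm["cpu"]
            allocation[vm["id"]] = best_host["id"]
    return allocation

vms = [{"id": i, "cpu": c} for i, c in enumerate(range(80, 261, 20))]     # VML = [80:20:260]
hosts = [{"id": j, "cpu": 300, "used_cpu": 0, "idle_power": 100, "max_power": 250}
         for j in range(10, 29, 2)]                                       # Host = [10:2:28]
print(mbfd_allocate(vms, hosts))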
41.4 Performance Metrics and Result Discussion During the placement process of VMs, the virtual machines are allocated to their respective hosts according to the resources (CPU utilization, memory) as per the MBFD algorithm. After that, VMs are migrated from over-utilized hosts toward under-utilized hosts; the number of migrations is reduced by using the SVM technique with a kernel function. With the reduced number of migrations, the energy consumption and SLA violation are also reduced. The following performance metrics shown in Table 41.2 are used for the evaluation of the proposed work.
Table 41.2 Performance metrics used for evaluation
I. SLA violation: SLA stands for "service level agreement." It is an assurance by the service provider to the user. An SLA is said to be violated, e.g., if a job ought to be scheduled and it is not scheduled. SLA violation = Σ (i = 1 to p) SLAv(host, VM), where p is the total number of iterations.
II. Number of migrations: VM migration includes the cost of RAM and hard disk, so it is a costly operation. It also involves the CPU use, link bandwidth, downtime of services, and total migration time, so our main aim is to reduce the number of migrations. Total number of migrations = Σ (i = 1 to j) mig(host, VM), where j is the total number of iterations.
III. Energy consumption: It is described as the total energy consumed by each server within the system. Mathematically, Energy consumption = Σ (i = 1 to n) VMe + Σ (i = 1 to k) hoste, where VMe signifies the energy of a VM and hoste signifies the energy of a host.
The proposed work aims to enhance the scheduling process by enhancing the MBFD algorithm, which was first proposed by Dr. Rajkumar Buyya in 2010 and has received a lot of modifications since then. The MBFD algorithm manages the virtual machine placement and scheduling architecture based on the CPU utilization. Hence, first of all, the simulation environment is set up with, say, N VMs. The result section first introduces the VMs with their associated CPU utilization and then sorts them accordingly. Figure 41.1 represents the CPU utilization of the VMs in the unsorted case; the CPU utilization is created randomly for the capacity of individual VMs. The next step sorts the CPU utilization in decreasing order, and hence, it is clearly visible in Fig. 41.2 that the CPU utilization is going down. Figure 41.2 represents the VMs sorted by the MBFD algorithm in decreasing order; the number of VMs is high when the CPU value is small, and as the CPU utilization value increases, the number of VMs decreases. Now, the allocation process is done by MBFD, and Fig. 41.3 demonstrates the results of the MBFD algorithm.
Fig. 41.1 Number of VMs V/S unsorted CPU utilization
Fig. 41.2 Sorted VMs using MBFD algorithm
Fig. 41.3 VM V/S host number
Figure 41.3 represents the graph between the total allocated VMs in order and the VM and host numbers. The x-axis represents the total allocated VMs in order, starting from 0 to 260, whereas the y-axis represents the VM and host numbers ranging from 4 to 255. Blue lines represent the number of hosts, whereas yellow lines represent the number of VMs. As shown in Fig. 41.3, a total of 260 hosts are considered, starting from 1. According to Fig. 41.3, the 1st host has gained 250 VMs and the 260th host has got around 70 VMs. Now, there are two stages of improvement, utilizing the swarm intelligence algorithm and the cross-validation architecture. The optimization stage checks for false allocations, and the cross-validation is done by a support vector machine (SVM). SVM is a kernel-based architecture, and the proposed work has utilized its linear and polynomial kernels. In Fig. 41.4, there are two types of data represented by "+" and "*". The training as well as the testing of the data is represented in Fig. 41.4. The red + sign represents the training of the 1st data, whereas the pink + sign represents the classification value. The green * and blue * represent the training and testing data of the 2nd class, respectively. The circles represent the data points supported by the support vector machine (SVM). Here, the slant line represents the kernel of the SVM. On the right side of the slant line, the 1st category support vector data are represented, whereas on the left side of the slant line, the 2nd category support vector data are represented. Figure 41.5 represents the support vector data along with the two categories of data. Here, the kernel is positioned at the (900, 900) location in the graph, because the data in the 2nd category are fewer than in the 1st category. The cross-validation architecture is a supervised architecture, and hence, all the allocation power consumption values are provided as the training data, and the same data values are provided as the test data. The classification architecture sets the power consumption as the input data and the VM number as the target label. If the cross-validation architecture returns the same target values, then the VM is considered to be at the right host; else the VM is considered to be at a false host. This process saves a lot of power, as a false allocation will consume a lot of power. Based on this architecture, the following parameters are evaluated.
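The cross-validation step can be illustrated with scikit-learn as below; the power values and host labels are synthetic placeholders, while in the proposed work the training data are the allocation power consumption values with the allocated host/VM number as the label.

import numpy as np
from sklearn.svm import SVC

# Training data: allocation power consumption per placement; label: allocated host id
power = np.array([[310.5], [298.2], [415.9], [402.3], [505.1], [498.7]])
host_id = np.array([1, 1, 2, 2, 3, 3])

# Linear and polynomial kernels, as mentioned for the proposed work
for kernel in ("linear", "poly"):
    clf = SVC(kernel=kernel, degree=3).fit(power, host_id)
    predicted = clf.predict(power)
    # Placements whose predicted host differs from the recorded one are flagged
    # as false allocations and would be rehandled / re-migrated
    false_allocations = np.where(predicted != host_id)[0]
    print(kernel, "false allocations at indices:", false_allocations)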
Fig. 41.4 SVM plot (I)
Fig. 41.5 SVM plot (II)
Figure 41.6 represents the total number of migrations with respect to lower utilization. Here, the total number of migrations for the proposed work is shown with a blue line, whereas the red line shows the total number of migrations obtained by Anton. The maximum count for lower utilization was 0.124 for fewer than 50 migrations. The maximum number of migrations was 245, which corresponds to a 0.106 lower utilization rate. In other words, if the lower utilization value is 0.106, then the numbers of VM migrations are 245 and 242.038 as observed for Anton's work and the proposed work, respectively.
Fig. 41.6 Total number of migrations V/S lower utilization
Fig. 41.7 Energy consumption V/S lower utilization
The average values obtained for the total number of migrations for the proposed and existing work are 79.63 and 82, respectively. Thus, it is concluded that there is a reduction of 2.89% in the total number of migrations from the previous work performed by Anton. The maximum count for lower utilization was 0.124, which gives 35 migrations; on the other hand, a lower utilization of 0.106 gives the maximum number of migrations, approximately 243. Figure 41.7 represents the energy consumption with respect to lower utilization. It is seen that the maximum energy consumption was 1200 for a 0.106 lower utilization rate, and the energy consumption remains at 600 for 0.11 and 0.118, respectively. The energy consumption for the proposed work is shown by the blue line, whereas the energy consumption values obtained by Anton are shown by the red line. If the lower utilization value is 0.106, then the energy consumption is 1181.579 and 1137.969 as observed for Anton's work and the proposed work, respectively. The average values obtained for the proposed and existing work are 369.103 and 410.167, respectively. The proposed value of energy consumption for 0.106 lower utilization was approximately 1138, while the Anton model consumes approximately 1182; this is a reduction of 10.01% from the previous work performed by Anton. Thus, it has been concluded that our proposed work consumes less energy as compared to the Anton model. In Fig. 41.8, the blue line shows the SLA violation for the proposed work, whereas the SLA violation obtained by Anton is shown by the red line. It is seen that the maximum SLA violation was 0.8 for 0.108 lower utilization, while the minimum SLA violation was almost zero for 0.112. The SLA violation was 0.4 at a 0.12 lower utilization rate.
Fig. 41.8 SLA violation V/S lower utilization
The average SLA violation obtained for the proposed and existing work is 0.005984 and 0.290600, respectively. It is concluded that there is a reduction of 97.94% in the SLA violation from the previous work performed by Anton. The proposed SLA violation was 0.00385 for a lower utilization of 0.106; in the case of Anton, it was 0.207. Hence, an improved SLA violation rate has been achieved using the proposed work.
41.5 Conclusion MCC is an emerging technology that is reaching its users and providing numerous services to attract consumers. Since there is no need to buy the resources, it also saves money. In this research work, we have discussed cloud computing and its types, along with VM migration and the algorithms used in the proposed work. It presents the concepts of the VM, virtualization, and VM migration and its methods. A virtual machine is migrated from an under-utilized server to a resource-rich server so that the former can be shut down for efficient use of resources. During this process, virtual machines are migrated without interrupting work in the running state. In this article, the MBFD algorithm is used to sort the list of virtual machines in descending order based on CPU usage, and the virtual machines are then assigned to hosts according to the MBFD algorithm. The MBFD algorithm is used to minimize SLA violation, power consumption, and the number of migrations, together with a support vector machine (SVM). SVM is used to classify overloaded and underloaded host machines; the VM is then migrated to the underloaded host, and thus the load is balanced. In future, the migrated VMs can be classified by using an artificial neural network (ANN) instead of SVM, because SVM is a binary classifier that classifies
only data of two types at a time, whereas ANN is a multi-class classifier that can classify a number of migrated VM classes at a time.
References 1. Mell, P., Grance, T.: The NIST Definition of Cloud Computing. National Institute of Standards and Technology: U.S. Department of Commerce, NIST Special Publication 800-145. 2. Huang, D., Xing, T., Wu, H.: Mobile cloud computing service models: a user-centric approach. IEEE Netw. 27(5), 6–11 (2013) 3. Li, X., Du, J.: Adaptive and attribute-based trust model for service level agreement guarantee in cloud computing. IET Inf. Secur. 7(1), 39–50 (2013) 4. Gonzales, D., Kaplan, J., Saltzman, E., Winkelman, Z., Woods, D.: Cloud-trust—a security assessment model for infrastructure as a service (IaaS) clouds. IEEE Trans. Cloud Comput. 5(3), 523–536 (2017) 5. Ahmed, M.T., Hussain, A.: Survey on energy-efficient cloud computing systems. Int. J. Adv. Eng. Res. 5(II), (2013). ISSN: 2231-5152 6. Esfandiarpoor, S., Pahlavan, A., Goudarzi, M.: Virtual machine consolidation for datacenter energy improvement. arXiv preprint arXiv:1302.2227 (2013) 7. Bertini, L., Leite, J.C., Mossé, D.: Power optimization for dynamic configuration in heterogeneous web server clusters. J. Syst. Softw. 83(4), 585–598 (2010) 8. Ahmad, R., Gani, A., Ab, S., Shiraz, H., Xia, F., Madani, S.: Virtual machine migration in cloud data centers: a review, taxonomy, and open research issues. J. Supercomput. 2473–2515 (2015) 9. Sonkar, S., Thorat, A.: A Review on Dynamic Consolidation of Virtual Machines for Effective Utilization of Resources and Energy Efficiency in Cloud (2016) 10. Speitkamp, B., Bichler, M.: A mathematical programming approach for server consolidation problems in virtualized data centers. IEEE Trans. Serv. Comput. 3(4), 266–278 (2010) 11. Tucker, R., Hinton, K., Ayre, R.: Energy-efficiency in cloud computing and optical networking. In: 2012 38th European Conference and Exhibition on Optical Communications, pp. 1–32. Amsterdam (2012) 12. Wajid, U., et al.: On achieving energy efficiency and reducing CO2 footprint in cloud computing. IEEE Trans. Cloud Comput. 4(2), 138–151 (2016) 13. Farahnakian, F. et al.: Using Ant colony system to consolidate VMs for green cloud computing. IEEE Trans. Serv. Comput. 8(2), 187–198 (2015) 14. Wang, Z., Yuan, Q.: A DVFS based energy-efficient tasks scheduling in a data center. IEEE Access 5, 13090–13102 (2017) 15. Singh, G, Mahajan, M.: VM Allocation in Cloud Computing Using SVM (2019). https://doi. org/10.35940/ijitee.I1123.0789S19
Chapter 42
Performance Enhancement of DYMO Routing Protocol in MANETs Using Machine Learning Technique P. M. Manohar and D. V. Divakara Rao
Abstract Mobile ad hoc networks (MANETs) are one of the notable areas of wireless ad hoc networks, in which nodes form a temporary network without any infrastructure. Because of the network's dynamic topology, routing in MANETs is a challenging task. As per the Internet Engineering Task Force (IETF) draft, protocols in MANETs have to adapt to the dynamic environment by changing the default configuration parameters to enhance the QoS performance. In this paper, the goal is to improve the performance of the dynamic MANET on-demand (DYMO) routing protocol by changing the default configurable parameters using a machine learning technique. The multiple linear regression technique is used to predict, according to the environment, the dynamic configurable value of the most significant parameter, route request wait time (RRWT). The NS2 simulator is used to carry out this research work. The results indicate that the proposed machine learning-based DYMO (ML-DYMO) enhances the QoS performance when compared to the existing DYMO routing protocol for different network sizes and mobility speeds.
42.1 Introduction MANET refers to a set of wireless computing devices that are autonomous and exchange information without any centralized administration [1, 2]. Such a network is called an infrastructure-less or ad hoc network. Nodes in these networks are selfconfiguring, self-organizing, and multi-hop to act as sender, receiver, and router to carry out information exchange and control information. Wireless nodes often join and leave the network due to these networks’ dynamic nature, which leads to link failures and route failures [3]. Hence, routing is a difficult task in MANETs. Wired network routing protocols are ineffective, so researchers developed various routing protocols that fall into proactive, reactive, and hybrid categories. Proactive routing protocols save each node’s routing information even P. M. Manohar (B) · D. V. Divakara Rao Computer Science and Engineering Department, Raghu Engineering College(A), Visakhapatnam, Andhra Pradesh, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 V. Bhateja et al. (eds.), Smart Intelligent Computing and Applications, Volume 1, Smart Innovation, Systems and Technologies 282, https://doi.org/10.1007/978-981-16-9669-5_42
before they are required. Because of their high control overhead, proactive routing techniques are not suitable. Reactive routing protocols are the ones in which nodes do not preserve routing data if there is no communication, and they have less control overhead, so the reactive routing protocols are chosen for this work. Major routing protocols in this group include the ad hoc on-demand distance vector (AODV), dynamic source routing (DSR), and dynamic MANET on-demand (DYMO) protocols [4]. Among the on-demand routing protocols, the DYMO routing protocol was chosen as it is extensible for larger networks and consumes minimal routing overhead. It has the disadvantage of not supporting smaller networks with low mobility speeds. As a result, RRWT has been recognized as the most important factor in improving the QoS performance of the protocol [5]. To enhance the QoS performance of the DYMO routing protocol, the default configurable value of RRWT has to be changed according to the dynamic environment. To implement this, a soft computing technique is used to capture the dataset of configurable values of RRWT, and to further refine it, the machine learning technique of multiple linear regression is used to predict the dynamically configurable value. The results show that the performance of ML-DYMO surpasses the existing DYMO. The remainder of the paper has the following sections: literature review of the related work, DYMO routing protocol discussion, machine learning technique, multiple linear regression, research methodology employed, machine learning-based DYMO routing protocol, simulation process used, and result and analysis discussion. To end, the conclusions and future scope of work are discussed.
42.2 Literature Review of Related Work Most researchers have used machine learning techniques in MANETs for intrusion detection and for identifying different security attacks. But machine learning techniques can also be used for dynamic routing. Sebopelo et al. [6] present an effective security method based on machine learning to detect and identify harmful attacks in practice, based on logistic regression and a support vector machine using the Iris dataset. The results demonstrate that logistic regression outperformed in terms of accuracy, with a detection rate of 100%. By incorporating machine learning into the MANET scenario, Nishani et al. [7] present the most notable models for developing intrusion detection systems. Laqtib et al. [8] conducted a comprehensive comparison of three models: the inception architecture convolutional neural network (inception-CNN), bidirectional long short-term memory (BLSTM), and deep belief network (DBN). The study of Suma et al. [9] focuses on enhancing the performance of location-based routing in MANETs. A machine learning-based attacker detection (MLAD) technique that leverages multi-path routing is offered to provide optimal routing even in the presence of attackers.
42.3 DYMO Routing Protocol DYMO is an on-demand, reactive, multi-hop unicast routing protocol. DYMO involves two functions: route discovery and route maintenance. When a node needs to deliver a packet to a target that is not currently in its routing table, on-demand routes are discovered. The network is flooded with route request (RREQ) messages, and when the packet arrives at its destination, a reply message carrying the discovered accumulated path is sent back. The packet’s originating node is notified when the destination’s path is uncertain or the link is broken. The originator receives a route error (RERR) packet indicating that the existing route to a certain destination is incorrect or unreachable. The originator node removes the route and initiates the route discovery procedure for that destination when it receives the RERR [10].
42.4 Machine Learning Machine learning is a rapidly evolving technology that allows computers to learn on their own using historical data. Machine learning employs a variety of algorithms to create mathematical models and make predictions based on past data or knowledge. It is being utilized for a variety of activities, including image identification, speech recognition, email filtering, recommender systems, and so on. The study employs the multiple linear regression technique of supervised machine learning.
42.4.1 Supervised Learning Users train the machine learning system by submitting sample labeled data, and it predicts the output based on that data. To analyze and learn about the datasets, the system creates a model using labeled data. We evaluate the model after it has been trained and processed by providing a sample dataset to check if it appropriately predicts the output. Spam filtering is one example of supervised learning.
42.4.2 Multiple Linear Regression The extension of simple linear regression is multiple linear regression, in which it predicts the response variable using more than one independent variable. y = A + B1 x1 + B2 x2 + B3 x3 + B4 x4 . . .
(42.1)
where A is an intercept to be computed, and B1, B2, B3, B4 are the slopes or coefficients concerning these independent features.
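A small illustration of fitting such a model with scikit-learn in Python is given below; the RRWT training values shown are placeholders, whereas in this work the dataset is obtained from the fuzzy logic toolbox (Sect. 42.6), with network size and mobility speed as the independent variables.

import numpy as np
from sklearn.linear_model import LinearRegression

# x1 = network size (nodes), x2 = mobility speed (m/s); y = RRWT value (placeholder data)
X = np.array([[20, 5], [20, 15], [20, 25], [45, 5], [45, 15], [45, 25]])
y = np.array([1.2, 1.0, 0.8, 1.6, 1.4, 1.1])

model = LinearRegression().fit(X, y)
print("intercept A:", model.intercept_)
print("coefficients B1, B2:", model.coef_)

# Predict the RRWT to configure for a new scenario (e.g., 45 nodes moving at 10 m/s)
print("predicted RRWT:", model.predict(np.array([[45, 10]]))[0])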
42.5 Research Methodology Three assessment methodologies were defined to test the proposed system in this paper:
1. Simulation, 2. Experimental, and 3. Mathematical.
The mathematical methodology is highly restrictive which contains assumptions and hypotheses that cannot suit realistic environments. Experimental methodology is not practicable due to high cost and their lack of flexibility and setup for such networks [11]. Simulation is an economical and easy approach to carry out experiments. Among the different simulators, the NS2 simulator is chosen for its features, advantages, and is highly preferred in networking research community.
42.6 Proposed Machine Learning-Based DYMO Routing Protocol The DYMO routing protocol was chosen as it is extensible for larger networks and consumes minimal routing overhead. It has the disadvantage of not supporting smaller networks with low mobility speeds. As a result, RRWT has been recognized as the most important parameter in enhancing the QoS performance of the protocol [5]. The default configurable value of RRWT of the DYMO protocol has to be changed according to the dynamic environment for performance enhancement. Algorithm:
3.
Using MATLAB, the soft computing technique, Fuzzy logic toolbox, is used to capture the dataset of configurable values. To further fine-tune and predict the dynamically configurable value of RRWT, machine learning technique, multiple linear regression using Python is used, with network size and mobility speed as the independent variable and RRWT as the predictor variable. The dynamically configurable parameter values of the proposed model are incorporated in NS2, and the experiments are run to enhance the performance of the DYMO protocol for different network sizes.
42 Performance Enhancement of DYMO Routing Protocol in MANETs … Table 42.1 Parameter setting for simulation process
Parameter: Value
Routing protocol: DYMO
Packet size: 512 Kb
Nodes: 20, 45
Mobility: 5, 15, 25 m/s
Pause time: 0 s
Data rate: 11 Mbps
Node placement: Random
Traffic agent: UDP
Application traffic: CBR
Simulation time: 300 s
Terrain size: 1000 × 1000 m2
Mobility model: Random waypoint
42.7 Experimental Process MATLAB’s fuzzy logic toolbox is used to collect the data of DYMO configurable values. Machine learning with Python is used for further fine-tuning of dynamically configurable values. The experiments are conducted on UBUNTU OS 12.04 LTS with NS2 (Network Simulator 2) for the performance evaluation. The performance analysis is carried out using the metrics average throughput, packet delivery ratio, average end-to-end delay, and routing overhead. The simulator parameters and their values are shown in Table 42.1.
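These metrics follow their standard definitions; as an illustration only (not the exact trace post-processing used in the experiments), they can be computed from simple per-simulation counters as in the following Python sketch, where all input values are placeholders.

def qos_metrics(sent, received, delays_s, routing_pkts, bytes_received, sim_time_s):
    pdr = 100.0 * received / sent                              # packet delivery ratio (%)
    throughput_kbps = bytes_received * 8 / sim_time_s / 1000   # average throughput (kbps)
    avg_delay_ms = 1000.0 * sum(delays_s) / len(delays_s)      # average end-to-end delay (ms)
    return pdr, throughput_kbps, avg_delay_ms, routing_pkts    # overhead = routing packets

print(qos_metrics(sent=1000, received=520, delays_s=[0.0008, 0.0011, 0.0009],
                  routing_pkts=12000, bytes_received=520 * 512, sim_time_s=300))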
42.8 Results and Analysis 42.8.1 Average Throughput (Kbps) According to the graph in Fig. 42.1, the proposed ML-DYMO delivers the maximum throughput when compared to standard DYMO for various network sizes and mobility speeds.
42.8.2 Packet Delivery Ratio According to the graph in Fig. 42.2, the proposed ML-DYMO achieves the highest packet delivery ratio when compared to standard DYMO for practically all network sizes and mobility speeds.
Fig. 42.1 Average throughput (kbps) of defacto DYMO versus ML-DYMO
Fig. 42.2 Packet delivery ratio (%) of defacto DYMO versus ML-DYMO
42.8.3 Average End-to-End Delay For almost all network sizes and mobility speeds, the proposed ML-DYMO provides a higher average end-to-end delay than traditional DYMO because of higher throughput, as illustrated in Fig. 42.3.
42.8.4 Routing Overhead The proposed ML-DYMO produces identical routing overhead compared to standard DYMO for all network sizes and mobility speeds, as illustrated in Fig. 42.4.
Fig. 42.3 Average end-to-end delay (ms) of defacto DYMO versus ML-DYMO
Fig. 42.4 Routing overhead (packets) of defacto DYMO versus ML-DYMO
42.9 Conclusions Machine learning techniques are mainly used in the area of MANETs for intrusion detection and identifying different security attacks, but they can also be used for dynamic routing. To adapt to the dynamic environment by changing the default configuration parameters and to enhance the performance of the DYMO routing protocol, the machine learning technique of multiple linear regression is used. The multiple linear regression technique is used to predict the most significant parameter, RRWT, of the DYMO protocol. The NS2 simulator is used to carry out this research work. The results indicate that ML-DYMO enhances the QoS performance when compared to the existing DYMO routing protocol for different network sizes and mobility speeds. The work can be further extended using other machine learning techniques and soft computing techniques.
References 1. Ismai, Z., Hassan, R.: Effects of packet size on AODV routing protocol implementation in homogeneous and heterogeneous MANET. In: 2011 Third International Conference on Computational Intelligence, Modelling & Simulation, pp. 351–356 (2011) 2. Divakara Rao, D.V., Pallam Shetty, S.: Integrating power saving mode in LAR protocol to minimize energy consumption in MANETs. J. Wirel. Commun. Netw. Mobile Eng. 3(3), 1–16 (2018) 3. Khan, A., Suzuki, T., Kobayashi, M., Morita, M.: Packet size based routing for route stability in mobile Ad-hoc networks. In: The 18th Annual IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC’07) (2007) 4. Attada, V., Pallam Setty, S.P.: Cross layer design approach to enhance the quality of service in mobile Ad Hoc networks. Wirel. Pers. Commun. 84, 305–319 (2015) 5. Manohar, P.M., Setty, S.P.: Taguchi design of experiments approach to find the most significant factor of DYMO routing protocol in mobile Ad Hoc networks. i-manager’s J. Wirel. Commun. Netw. 7(1), 1–11 (2018) 6. Sebopelo, R., Isong, B., Gasela, N.: Identification of compromised nodes in MANETs using machine learning technique. Int. J. Comput. Netw. Inf. Secur. (IJCNIS) 11(1), 1–10 (2019) 7. Nishani, L., Biba, M.: Machine learning for intrusion detection in MANET: a state-of-the-art survey. J Intell Inf Syst 46, 391–407 (2016) 8. Laqtib, S., El Yassini, K., Hasnaoui, M.L.: A technical review and comparative analysis of machine learning techniques for intrusion detection systems in MANET. Int. J. Electr. Comput. Eng. (IJECE) 10(3), 2701–2709 (2020) 9. Suma, R., Premasudha, B.G., Ravi Ram, V.: A novel machine learning-based attacker detection system to secure location aided routing in MANETs. Int. J. Netw. Virtual Organ. 22(1), 17–41 (2020) 10. Manohar, M.P.: Performance analysis of reactive routing protocols AODV, DYMO, DSR, LAR in MANETs. Int. J. Future Revolution Comput. Sci. Commun. Eng. 4(3), 1–7 (2018) 11. Hogie, L., Bouvry, P.: An overview of MANETs simulation. Electron. Notes Theor. Comput. Sci. 150, 81–101 (2006)
Chapter 43
Context Dependency with User Action in Recommendation System Arati R. Deshpande and M. Emmanuel
Abstract Recommendation systems recommend to users the items which they may prefer in the future. Context-based recommendation uses the user, item, or interaction context in the recommendation process. The tagging action of a user, as an interaction, can be used as an implicit action for recommendation generation, whereas the rating action can be used as an explicit action. The context which influences the tagging action or the rating action can also be used to generate the recommendation, in addition to the user profile and item profile. The prefiltering method with context is used in recommendation systems to reduce the data processed by the recommendation system and to increase the quality of recommendation. A method to determine the context dependency, for incorporating context to increase the relevance of recommendation, is proposed with a prefiltering approach using the implicit user tagging action and class association rule mining. The evaluation of the method on the MovieLens dataset with the tagging action shows that the selected context variable does not improve the relevance, which can be used to make the decision about the dependency of the context for recommendation.
43.1 Introduction The recommendation systems are part of many Web or mobile applications. These predict the items that may be preferred by the user in future. The prediction is based on the user’s past preferences or other similar users’ preferences. The items can be e-commerce products, music, movies, or any item that is part of Web or mobile applications. The personalized recommendations take into account the preferences A. R. Deshpande (B) Department of Computer Engineering, SCTR’s Pune Institute of Computer Technology, Pune 411043, India e-mail: [email protected] M. Emmanuel Department of Information Technology, SCTR’s Pune Institute of Computer Technology, Pune 411043, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 V. Bhateja et al. (eds.), Smart Intelligent Computing and Applications, Volume 1, Smart Innovation, Systems and Technologies 282, https://doi.org/10.1007/978-981-16-9669-5_43
of user [1]. Currently, the context-based recommendation systems are developed to improve the quality of recommendation [2, 3]. As the user interacts with the application, the attributes of interaction can be considered as the context [4]. The context of environment can be time, location, user, or item attributes. The context variable is used as the factor for influencing the user action which can be used to increase the quality of recommendation. It can be like time or location. This factor is used to prefilter the user profile in prefiltering method to reduce the data for recommendation [5]. To determine which context variable influences the recommendation, a method is proposed through association rules and collaborative filtering.
43.2 Related Work The recommendation systems are currently incorporated in many e-commerce, web, and mobile applications. The traditional methods of recommendation are collaborative, content, and hybrid filtering. These use the profiles of users and/or items, implicit actions, and explicit ratings to predict the top-N items preferred by the user. The prefiltering, post-filtering, and context modeling methods have been proposed recently. Prefiltering methods reduce the information needed for recommendation by context before applying the recommendation algorithm, post-filtering methods filter recommended items after the recommendation, and context modeling methods use the context in the recommendation algorithm.
43.2.1 Prefiltering Methods for Recommendation The two dimensions of user and item of the user-item matrix are extended with context as a third dimension [6]. In this multidimensional view, each rating is associated with the user, item, and context. The reduction-based method uses the context to prefilter the user-item matrix. This reduces the data to be processed for recommendation generation. If the context is time, then only the ratings given in that specific time are selected to generate the recommendation. The user-item matrix for a specific time is called a contextual segment. The reduction-based method improves over user-based collaborative filtering, and the improvement also depends on the application. The item is split into two separate items in the item splitting method [7] based on the context, and then the collaborative filtering method is applied. The item ratings are given for one item with context c = cj and for the other with c ≠ cj. The item splitting method improves the accuracy of the context-based method. The context is considered as time, with three segments of 'x' months duration, in a prefiltering method used to determine the contextual segments [8]. Three lists of recommendations are generated with collaborative filtering and are combined with the weights given
by fuzzy inference system (FIS). The recommendation method is evaluated for MovieLens dataset, and it increases the relevancy of recommendation. A context-based recommendation system uses clustering in [9]. The clusters of users with hierarchies are determined according to demographic values as context. The run time performance of collaborative filtering increases by a factor of k if k equal partitions of users are created.
43.3 Recommendation with Tagging Action

The user of a recommendation system interacts with the items using different actions. The actions can be explicit, like rating or giving a 'like' for an item, or implicit, like purchasing, downloading, viewing, clicking, or tagging. These implicit actions can convey the user's opinion about the items and can therefore be used to find recommendations for the active user. The history of tagging information is the set of tag names given by users to different items. If tagging is taken as the implicit feedback action of the user about an item, the recommendation can be generated by computing the preference of an item for the current (active) user using all the items tagged by the current user and other users [10, 11].

For a given current user u and item i, the preference is computed using the importance of a tag for the user and the importance of that tag for the item. This importance is taken as the weight of a tag for a user, defined as ut(u, t), and the weight of a tag for an item, defined as mt(i, t). The values of these two functions can be calculated in many ways, but here we take them as the number of times the user or item is tagged:

ut(u, t) = number of times user u has used tag t    (43.1)

mt(i, t) = number of times item i has been tagged with t    (43.2)

The preference of an item i for user u is defined as pref(u, i):

pref(u, i) = Σ_{t ∈ T} ut(u, t) × mt(i, t)    (43.3)
where T = set of all tags assigned to item i. The tagging action data can use the User X Item matrix with elements as tag ids. The algorithm for generating recommendations using tagging is given as follows.

Algorithm: Recommendation with tagging.
Input: User X Item matrix with tagging values, Active user u
Output: Recommendation list with top-N items
1. For each item i which is not tagged by user u, do
   a. pref(u, i) = 0
   b. Find the set T of all tags assigned to item i
   c. For each tag t in set T, do
      (1) Find ut(u, t) as given in Eq. 43.1.
      (2) Find mt(i, t) as given in Eq. 43.2.
      (3) Find pref(u, i) = pref(u, i) + ut(u, t) × mt(i, t).
2. Sort the items in decreasing order of their preference.
3. Select the top-N items as the recommended list.
4. End.
The matrix can be represented as a sparse matrix in memory. Computing the preferences of items for the current user requires one scan of the matrix, i.e., of its non-zero elements, whose count equals the number of taggings. If the number of users is m, the number of items is n, and the number of taggings is t, then the time complexity to find the preference is O(t), and the time complexity to find the recommended items for the current user over n items is O(nt).
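To make the algorithm concrete, a minimal Python sketch is given below. The data layout (a dictionary mapping (user, item) pairs to lists of tag ids, standing in for the sparse User X Item tag matrix) and all identifiers are illustrative assumptions, not the authors' implementation.

```python
from collections import defaultdict

def recommend_with_tagging(tagging, active_user, top_n=10):
    """tagging: dict mapping (user, item) -> list of tag ids (sparse User X Item matrix)."""
    ut = defaultdict(int)        # ut[(user, tag)] = times the user used the tag   (Eq. 43.1)
    mt = defaultdict(int)        # mt[(item, tag)] = times the item got the tag    (Eq. 43.2)
    item_tags = defaultdict(set)
    tagged_by_active = set()

    for (user, item), tags in tagging.items():
        if user == active_user:
            tagged_by_active.add(item)
        for t in tags:
            ut[(user, t)] += 1
            mt[(item, t)] += 1
            item_tags[item].add(t)

    # pref(u, i) = sum over tags t in T of ut(u, t) * mt(i, t)   (Eq. 43.3)
    pref = {}
    for item, tags in item_tags.items():
        if item in tagged_by_active:
            continue             # only items not yet tagged by the active user
        pref[item] = sum(ut[(active_user, t)] * mt[(item, t)] for t in tags)

    # Sort items by decreasing preference and return the top-N
    return sorted(pref, key=pref.get, reverse=True)[:top_n]

# Hypothetical usage with toy data
tags = {("u1", "m1"): ["funny"], ("u2", "m2"): ["funny", "dark"], ("u1", "m3"): ["dark"]}
print(recommend_with_tagging(tags, "u1", top_n=5))
```

A single pass over the non-zero entries builds both weight tables, matching the O(t) scan described above.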
43.4 Prefiltering with Tagging Action

The contextual segments are generated with time as the context for the tagging action of a user on an item. The rules used for prefiltering are generated and stored in data tables. The context rules relate time and the user's tagging action and are represented as class association rules [12]; class association rules have only one consequent, which is the class attribute [13]. The context information is stored along with the user and item profiles, and the context elements to be stored are designed at analysis and design time [14].

The context we have considered is time, and the user action is the tagging action. In the movie dataset, the contextual elements of the time dimension are season, weekday, and time of the day. The class attribute 'tagging' is added to generate rules, so for the tagging action the table for rule generation has season, weekday, time, and tagging, with tagging = {tagged, not tagged}. The generated rules are stored in a rule table, with the tagging action as the class attribute. A context element attribute which does not appear in a rule has the value null or 'xx', and each rule is stored with a rule id. The tagging action table consists of userid, movieid, tagid, season, weekday, and time as attributes. The context elements season, weekday, and time are generalized from the specific date and time values.

The generated rules are matched with the context of the user who needs a recommendation. Only the contextual segment of the matched rule is used to generate the recommendation for that user with the algorithm. The recommendation with tagging computes the preference of a movie for a user and not the prediction of a rating, so the evaluation of recommendation with tagging is carried out for precision, recall, and F1 measure.
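As an illustration of how specific date and time values could be generalized into the contextual elements described above, the following Python sketch shows one possible mapping; the season boundaries and the time-of-day bins (M/A/E) are assumptions made for the example, not values stated in the chapter.

```python
from datetime import datetime

def generalize_context(ts: datetime):
    """Map a raw timestamp to (season, weekday, time) contextual elements."""
    # Season from month (illustrative northern-hemisphere split)
    season = {12: "winter", 1: "winter", 2: "winter",
              3: "spring", 4: "spring", 5: "spring",
              6: "summer", 7: "summer", 8: "summer"}.get(ts.month, "autumn")
    # Weekday ('wd') versus weekend ('we')
    weekday = "wd" if ts.weekday() < 5 else "we"
    # Time of day: morning / afternoon / evening, coded M / A / E
    time_of_day = "M" if ts.hour < 12 else ("A" if ts.hour < 18 else "E")
    return season, weekday, time_of_day

print(generalize_context(datetime(2011, 1, 15, 20, 30)))  # ('winter', 'we', 'E')
```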
Fig. 43.1 Recommendation generation with tagging (flowchart): Start → User interaction with system → Extract the current context of user (season, day and time) → Match current context with the rules in Rule database to get the matching rule → Extract the tagging data matching the rule → Apply the recommendation with tagging algorithm → Select the top 10 recommendations and display → End
The steps for recommendation generation are given in Fig. 43.1. The rules are generated and stored in the database prior to recommendation. When the user interacts with the recommendation system, the current context of the user, i.e., season, weekday, and time, is compared with the stored rules and the matching rule is selected. The contextual segment for that rule is then selected from the user tagging data, and the recommendation is generated from the data in that contextual segment.
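The rule-matching and prefiltering steps of Fig. 43.1 can be sketched as follows. The rule representation (dictionaries with 'xx' as a wildcard, assumed to be ordered by the number of taggings they cover) and the tagging-record format are illustrative assumptions only.

```python
def match_rule(current_context, rules):
    """Return the first stored rule whose non-wildcard elements match the current context."""
    for rule in rules:  # rules assumed ordered, e.g. by the number of taggings they cover
        if all(rule[k] in ("xx", current_context[k]) for k in ("season", "weekday", "time")):
            return rule
    return None

def contextual_segment(tagging_records, rule):
    """Prefilter: keep only tagging records that fall inside the matched rule's context."""
    keys = [k for k in ("season", "weekday", "time") if rule[k] != "xx"]
    return [r for r in tagging_records if all(r[k] == rule[k] for k in keys)]

# Hypothetical usage: the reduced segment is then fed to recommend_with_tagging()
rules = [{"season": "xx", "weekday": "xx", "time": "E", "outclass": "tagged"}]
ctx = {"season": "winter", "weekday": "wd", "time": "E"}
records = [{"userid": "u1", "movieid": "m2", "tagid": 7,
            "season": "winter", "weekday": "wd", "time": "E"}]
print(contextual_segment(records, match_rule(ctx, rules)))
```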
43.5 Experimental Results

The hetrec2011-movielens-2k dataset [15] is used for generating the recommendations. The dataset contains 2113 users, 10,197 movies, 47,957 tag assignments, and 855,598 rating assignments. The recommendations for movies are generated with the implicit action of tagging. The dataset is sampled for 150 users and 200 items such that only users who have both rated and tagged movies are selected. The context used is time, with contextual elements season, weekday, and time of the day; sparsity can be reduced by generalizing the context. Weka software [16] is used to generate the rules with class association rule mining, with the tagging action as the consequent and minimum support and confidence thresholds.

The tagging action is used to generate the recommendation of movies for the sampled dataset. The recommendation with tagging action is implemented both without context and with context. The recommendation with tagging algorithm given in Sect. 43.3 is used for recommendation generation for the active user. The dataset for tagging is divided into train and test data in the ratio of 80:20. The rules are generated for the train set of tagging and are shown in Table 43.1. The preference of a movie using tagging is calculated for recommendation generation. The evaluation measures used are precision, recall, and F1 measure. The values for recommendation without context and with context for tagging are shown in Table 43.2, and the corresponding graph in Fig. 43.2. The top-N values are varied from 5 to 20 in intervals of 5.

The precision (Tag_Precision), recall (Tag_Recall), and F1 (Tag_F1) values without context are higher than the precision (Tag_PrecisionWC), recall (Tag_RecallWC), and F1 (Tag_F1WC) values of the with-context method. The difference in F1 measure between without and with context is 4.4, 1.8, 1.4, and 6.9% for top-5, top-10, top-15, and top-20, respectively, with an average of 3.6%. The recommendation with tagging with context has therefore not improved the relevance of recommendation, which shows that the contextual elements chosen are not influencing the tagging action of the users.

Table 43.1 Rules generated for train set with tagging
Rule id   Season   Weekday   Time   Outclass   Conf   Num taggings
1         xx       xx        E      One        1      231
2         xx       wd        xx     One        1      220
3         Winter   xx        xx     One        1      162
4         xx       we        xx     One        1      149
5         xx       wd        E      One        1      127
6         xx       we        E      One        1      104
7         Winter   wd        xx     One        1      102
8         xx       xx        M      One        1      83
9         Winter   xx        E      One        1      82
10        Summer   xx        xx     One        1      81
11        Spring   xx        xx     One        1      81
Table 43.2 Relevance measures for recommendation with tagging without context and with context (WC)

Relevance measures at KNN = 20
Top-N   Tag_Precision   Tag_PrecisionWC   Tag_Recall   Tag_RecallWC   Tag_F1   Tag_F1WC
5       0.412           0.343             0.235        0.203          0.299    0.255
10      0.332           0.345             0.331        0.354          0.331    0.349
15      0.300           0.290             0.451        0.429          0.360    0.346
20      0.295           0.242             0.552        0.454          0.385    0.316

Fig. 43.2 Relevance measures for recommendation with tagging without context and with context (WC): plot of Tag_Precision, Tag_PrecisionWC, Tag_Recall, Tag_RecallWC, Tag_F1, and Tag_F1WC (relevance at KNN = 20) against Top-N values 5, 10, 15, and 20
Although the generated rules associate the context elements with the tagging action, this is not reflected in the recommendation generation. This shows that the rules can be used to extract the influence of context on an action, but the evaluation of the recommendation improves only when the actions have more contextualized values. This observation can therefore be used as a guideline to select the contextual elements for the recommendation system.
43.6 Conclusion

Context is an important factor for increasing the quality of recommendation, and it can be modeled as a hierarchy from abstract to refined values. A context-based recommendation method with prefiltering and class association rule mining, with the user action as tagging, is proposed. The reduction-based prefiltering method is modified to use the tagging action, and the data used for recommendation generation are reduced by applying prefiltering with rules consisting of context and user action. However, the context should be used only when it actually influences the recommendation. This context dependency can be checked before model creation with the prefiltering and rule-based approach, as determined using the tagging action with context in our experiments. If the context does not influence the user action, the recommendation will be carried out without context.
References

1. Adomavicius, G., Tuzhilin, A.: Towards next generation of recommender systems: a survey of state of the art and possible extensions. IEEE Trans. Knowl. Data Eng. 17(6), 734–749 (2005)
2. Dourish, P.: What we talk about when we talk about context. Pers. Ubiquit. Comput. 8(1), 19–30 (2004)
3. Abowd, G., Dey, A., Brown, P., Davies, N., Smith, M., Steggles, P.: Towards a better understanding of context and context-awareness. In: Handheld and Ubiquitous Computing, pp. 304–307. Springer, Berlin Heidelberg (1999)
4. Ricci, F., Rokach, L., Shapira, B.: Introduction to recommender systems. In: Recommender Systems Handbook, pp. 1–35. Springer US (2011)
5. Adomavicius, G., Tuzhilin, A.: Context-aware recommender systems. In: Recommender Systems Handbook, pp. 217–253. Springer US (2011)
6. Adomavicius, G., Sankaranarayanan, R., Sen, S., Tuzhilin, A.: Incorporating contextual information in recommender systems using a multidimensional approach. ACM Trans. Inf. Syst. (TOIS) 23(1), 103–145 (2005)
7. Baltrunas, L., Ricci, F.: Experimental evaluation of context-dependent collaborative filtering using item splitting. User Model. User-Adap. Inter. 24(1–2), 7–34 (2014)
8. Ramirez-Garcia, X., Garcia-Valdez, M.: A pre-filtering based context-aware recommender system using fuzzy rules. In: Design of Intelligent Systems Based on Fuzzy Logic, Neural Networks and Nature-Inspired Optimization, pp. 497–505. Springer International Publishing (2015)
9. Datta, S., Das, J., Gupta, P., Majumder, S.: SCARS: a scalable context-aware recommendation system. In: 3rd International Conference on Computer, Communication, Control and Information Technology (C3IT), pp. 1–6. IEEE (2015)
10. Sen, S., Vig, J., Riedl, J.: Learning to recognize valuable tags. In: Proceedings of the 14th ACM International Conference on Intelligent User Interfaces, pp. 87–96 (2009)
11. Sen, S., Vig, J., Riedl, J.: Tagommenders: connecting users to items through tags. In: Proceedings of the 18th ACM International Conference on World Wide Web, pp. 671–680 (2009)
12. Agrawal, R., Srikant, R.: Fast algorithms for mining association rules in large databases. In: 20th International Conference on Very Large Data Bases, pp. 478–499 (1994)
13. Liu, B., Hsu, W., Ma, Y.: Integrating classification and association rule mining. In: 4th International Conference on Knowledge Discovery and Data Mining, pp. 80–86 (1998)
14. Deshpande, A.R., Emmanuel, M.: Conceptual modeling of context based recommendation system. Int. J. Comput. Appl. 180(12), 42–47 (2018)
15. Cantador, I., Brusilovsky, P., Kuflik, T.: 2nd Workshop on Information Heterogeneity and Fusion in Recommender Systems (HetRec 2011). In: Proceedings of the 5th ACM Conference on Recommender Systems (2011)
16. Frank, E., Hall, M.A., Witten, I.H.: The WEKA workbench. Online Appendix for "Data Mining: Practical Machine Learning Tools and Techniques", 4th edn. Morgan Kaufmann (2016)
Chapter 44
Industrial Automation by Development of Novel Scheduling Algorithm for Industrial IoT: IIoT Re-birth Out of Covid-19 Scenario Sujit N. Deshpande
and Rashmi M. Jogdand
Abstract The Internet of things is a progressive paradigm for the future era of computing. The COVID-19 outbreak has been a once-in-a-hundred-years experience for mankind, and, as the saying goes, "necessity is the mother of invention": nearly all scientific and technological fields have battled to develop solutions in verticals such as drug discovery, health care, and distance education, and to face the major challenge of running industry with modest human resources. To keep a country's economy strong, the Indian industrial domain, which contributes substantially to GDP growth, needs to be kept running in any predicament. The prolonged lockdown hampered the global economy; hence, this paper focuses on the Industrial IoT (IIoT), where many key processes can be monitored and executed without human assistance. Just as information technology (IT) employees can work from home, industrial processes can also execute various automation tasks using a process scheduler. This paper presents the new CatchPro protocol request scheduling algorithm for IIoT, which can be a practical solution for unmanned process execution, with a performance improvement of 2.56%. The paper also presents a methodology and framework for industrial automation using the MQTT protocol, where manufacturing units can work efficiently with remote monitoring and automation of processes. The paper targets three important goals: process execution without in-person process handling, algorithmic scheduling of industrial tasks, and more efficient execution of IoT protocols for machine-to-machine communication via a sensor network.
S. N. Deshpande (B) · R. M. Jogdand
KLS, Gogte Institute of Technology, Belagavi, Karnataka, India
R. M. Jogdand e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 V. Bhateja et al. (eds.), Smart Intelligent Computing and Applications, Volume 1, Smart Innovation, Systems and Technologies 282, https://doi.org/10.1007/978-981-16-9669-5_44

44.1 Introduction

Because of the lack of any tangible treatment methodology, social distancing continues to be recognized as the best available safeguard
against the COVID-19 outbreak [1, 2]. However, the reliance on social distancing has forced authorities around the globe to enforce lockdowns, which caused a considerable contraction of the worldwide economy. Almost all non-essential services were compelled to shut down, causing practically all industrial sectors to confront critical interruptions [3] in the supply chain and, subsequently, putting millions of citizens at risk of losing their jobs. The automotive sector has seen significant disruptions in production as a result of the strict lockdown [4] measures imposed in many countries as an attempt to contain the outbreak.

IoT has the potential to support wellness programs, home as well as industrial automation [5], smart transport, smart grids, etc. IIoT [6, 7] is premised on the idea that high-quality data can be analysed to enable real-time regulation of industrial procedures and to permit digital simulations that strengthen functional performance; in doing so, it can boost productivity by strengthening reliability, predictability, and security.

Considering the COVID-19 pandemic and lockdown scenario, this paper presents a new framework for industrial automation (machine-to-machine communication via the MQTT protocol) and scheduling of industrial tasks, which can cut down production downtime in emergency situations such as a lockdown. The 2019–2020 lockdown hampered industry, as the lockdown led to re-hauling of machinery, processing plants, etc., which increased maintenance cost and led to huge financial losses.
44.2 Related Work

Intelligent sensors [8, 9] and IoT-enabled industrial facilities are crucial for establishing contemporary solutions founded on a combination of intelligent sensors. Edge computing is a desirable alternative for smart industries [10], since it can resolve the difficulties of data overload and latency. Edge computing is a distributed processing paradigm that entails off-loading computation, backing up manufacturing data, and interacting with physical equipment on or around the shop floor; it contrasts with cloud computing, where production data are recorded and examined on central nodes.

Many recent analyses [11–13] have identified and addressed the reliability issues and criteria for IIoT using distinct techniques; as an illustration, reliability issues are categorized depending on whether they apply to IoT in general or are specific to IIoT. In [14], deep neural network based clustering is contemplated for IIoT reliability. A thorough literature analysis of IIoT requirements is offered in [15]. In [16], IIoT protocol concerns are discussed with an emphasis on Industry 4.0. In [17], the research presents a specific IIoT cloud system to resolve interoperability challenges by employing a Service-Oriented Architecture (SOA) through a cloud system, and the suggested method is implemented as a configurable edge plug-in
to load telemetry data intended to be utilized for new IIoT initiatives. In [18], the authors reviewed the MQTT protocol and compared the functionality, merits, and constraints of the MQTT protocol; they also outlined the brokers and client libraries presently available in the public domain to enable researchers and end-users to select a broker or client library depending on their preferences. When employing the MQTT protocol for Industry 4.0 applications [19], it is essential to evaluate the Quality of Service (QoS). In [20], the authors examined how MQTT effectiveness depends on the ranking of the available options in the access network, the edge, and the remote cloud, and on the attributes of the connection that delivers services for numerous quality of service levels. However, it is still necessary to identify QoS for machine-to-machine communication in unmanned environments.
44.2.1 Industrial IoT

As a growing concept, IIoT [21] has distinctive elements and criteria that distinguish it from consumer IoT, including the exclusive categories of smart devices integrated, network system features, quality-of-service criteria, and rigorous requirements of command and control. In IIoT, the quality of service (QoS) plays a vital role, and hence the proposed research targets protocol request scheduling. Service request scheduling can lower the redundant request processing overhead and can contribute to making the system more energy-efficient, reliable, and scalable.
44.2.2 IoT Protocol QoS

An alternative scheduling algorithm for standard sensor nodes with task deadlines is the lazy scheduling algorithm (LSA) [22]. LSA features the strategy of an energy variability characterization curve (EVCC), which reflects the characteristics of the energy source, and an off-line schedulability evaluation. The research in [23] presents effective scheduling algorithms for IoT-based systems, which are the need of the hour; nodes with self-centred behaviour weaken the efficiency of the network, so a scheduling algorithm which schedules request packets depending on their urgency assures enhanced outcomes. In [24], the authors recommend an architecture that incorporates hierarchical sensor networks to schedule the MQTT protocol so that it can communicate with MODBUS manufacturing networks. Since there are still several issues in adaptive CoAP mode decisions, incorporating imperfect data and differentiated quality of service (QoS) criteria of distributed IoT services, the authors of [25] recommend an upper confidence bound (UCB)-based dynamic CoAP mode selection algorithm for data diffusion to resolve these kinds of issues. As per [26], the effectiveness of M2M interaction predominantly relies on the underlying messaging protocols tailored for M2M interaction in IoT
use-cases. MQTT provides three QoS levels (ranging from 0 to 2) for delivering messages to an MQTT broker and to any client.
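As a brief illustration of these QoS levels, the following snippet (assuming the paho-mqtt 1.x client API) publishes the same sensor reading at QoS 0, 1, and 2; the broker address and topic name are placeholders, not values from this chapter.

```python
import paho.mqtt.client as mqtt

client = mqtt.Client(client_id="sensor-S1")
client.connect("broker.example.local", 1883)   # placeholder broker address
client.loop_start()

payload = '{"sensor": "S1", "temperature": 78.4}'
client.publish("plant/pasteurizer/temperature", payload, qos=0)          # at most once
client.publish("plant/pasteurizer/temperature", payload, qos=1)          # at least once
info = client.publish("plant/pasteurizer/temperature", payload, qos=2)   # exactly once
info.wait_for_publish()   # block until the QoS-2 handshake completes

client.loop_stop()
client.disconnect()
```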
44.3 Proposed Architecture

In this paper, we design a machine-to-machine and/or machine-to-sensor communication model based on MQTT protocol request/response for Industrial IoT. The key focus is on efficient MQTT request processing by means of the publish/subscribe mode. The proposed design targets process execution without human presence. The case study is conducted for an agro-processing plant, where milk and milk products are under essential services; however, the proposed model can be used for any industrial IoT (IIoT) system where a sensor network (SN) is implemented. MQTT publish and subscribe are executed as per the gateway's input requests from any sensor and/or machine to another sensor and/or machine, i.e., machine-to-machine (M2M) or sensor-to-sensor (S2S) communication.

Figure 44.1 depicts the real-time milk processing plant sensor network application for process scheduling. As presented in Fig. 44.1, the milk processing plant automation is targeted to execute without in-person monitoring of any process. In this scenario, process scheduling for the MQTT protocol is necessary to avoid any kind of delay in milk processing. The schematic of the publish/subscribe action with the broker for the MQTT protocol is shown in Fig. 44.2.

Further, as shown in Fig. 44.3, for multiple IoT protocols the proposed CatchPro scheduler can identify the priority of an incoming request, and further process execution is done according to priorities. For example, if there are two incoming requests of the MQTT and CoAP protocols with high and low priority, respectively, then the MQTT request will be processed by the broker based on the level of priority, irrespective of the type of incoming protocol. To execute the line of action for the proposed M2M communication and scheduler model, we developed the CatchPro scheduling algorithm for IIoT process automation. The proposed algorithm is further executed to identify the delay Quality of Service (QoS) parameter.
Fig. 44.1 Representation of milk processing plant with sensor model
"CatchPro" Request Scheduler Algorithm
Input: SensorIDs, sequence of operations, pre-defined protocol request priorities.
Output: Execution of machine-to-machine/sensor-to-sensor communication.
Assumption: Only the MQTT and CoAP protocols are considered here.

1: Parameter initialization
   Array SensorID [];
   Array ProtocolType [];
   Array RequestPriority [];
   Array HQueue [];  // High priority queue
   Array MQueue [];  // Medium priority queue
   Array LQueue [];  // Low priority queue
   addRequest ();
2: for each ProtocolRequest_i do
     if RequestPriority == 'High' then addRequest ('High') to HQueue [];
     else if RequestPriority == 'Medium' then addRequest ('Medium') to MQueue [];
     else addRequest ('Low') to LQueue [];
     SensorID_i ← PredefinedRequestPriority_i
3: if ProtocolType == MQTT then redirect request to MQTT broker
   else redirect request to CoAP broker
4: for each SensorID_i in RequestPriority [] do
5:    SensorID_i + PredefinedRequestPriority_i
6: end for
7: [SensorID_i if RequestPriority [] == 0]
8: Go to step 2

Fig. 44.2 MQTT publish/subscribe multi-channel model
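A minimal Python sketch of the scheduling idea behind this pseudocode is shown below: incoming requests are queued by predefined sensor priority and dispatched highest-priority-first, with MQTT requests handed to a paho client (1.x-style API assumed) and CoAP requests left as a stub. All class names, topics, and priority assignments are illustrative assumptions rather than the authors' implementation.

```python
import heapq
import paho.mqtt.client as mqtt

PRIORITY = {"high": 0, "medium": 1, "low": 2}                 # assumed priority encoding
SENSOR_PRIORITY = {"S2": "high", "S1": "medium", "S3": "low"}  # assumed per-sensor priorities

class CatchProScheduler:
    def __init__(self, broker="broker.example.local"):
        self.queue, self.counter = [], 0
        self.mqtt = mqtt.Client(client_id="catchpro")
        self.mqtt.connect(broker, 1883)
        self.mqtt.loop_start()

    def add_request(self, sensor_id, protocol, topic, payload):
        prio = PRIORITY[SENSOR_PRIORITY.get(sensor_id, "low")]
        # heapq pops the smallest tuple first -> highest priority, FIFO within a priority level
        heapq.heappush(self.queue, (prio, self.counter, protocol, topic, payload))
        self.counter += 1

    def dispatch_all(self):
        while self.queue:
            _, _, protocol, topic, payload = heapq.heappop(self.queue)
            if protocol == "MQTT":
                self.mqtt.publish(topic, payload, qos=1)
            else:
                self.send_coap(topic, payload)   # redirect CoAP requests to a CoAP broker

    def send_coap(self, resource, payload):
        pass  # placeholder: a CoAP client (e.g. aiocoap) would be used here

# Hypothetical usage: the smoke-sensor request (high priority) is dispatched first
sched = CatchProScheduler()
sched.add_request("S3", "MQTT", "plant/motion", "idle")
sched.add_request("S2", "MQTT", "plant/smoke", "alarm")
sched.dispatch_all()
```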
The proposed CatchPro algorithm is implemented using a Python library, and the simulated machine-to-machine communication is demonstrated for the PUB/SUB module of the paho broker. Initially, we fired messages from a simulated machine to the MQTT server
Fig. 44.3 CatchPro request scheduler architecture
for QoS-0 and QoS-1, as shown in Fig. 44.2. The machine-to-machine communication is pipelined over an Ethernet connection to maintain the same IP addresses. Machine 'M1' activates the temperature-rise sensor 'S1', and the request then goes to machine 'M2', which fires a cooling request via sensor 'S2'. Thus, the proposed algorithm is tested for the MQTT protocol only; as per the CatchPro scheduler architecture shown in Fig. 44.3, CoAP can be tested for QoS with modifications to the protocol identifiers.
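The M1-to-M2 interaction described above can be sketched with a paho subscriber on the M2 side; the topic names, the temperature threshold, and the callback logic are assumptions made only for illustration.

```python
import paho.mqtt.client as mqtt

def on_message(client, userdata, msg):
    # M2 reacts to a temperature-rise message published by sensor S1 on machine M1
    if float(msg.payload) > 75.0:                        # assumed threshold
        client.publish("plant/m2/cooling", "ON", qos=1)  # fire cooling request via sensor S2

m2 = mqtt.Client(client_id="machine-M2")
m2.on_message = on_message
m2.connect("broker.example.local", 1883)
m2.subscribe("plant/m1/temperature", qos=1)
m2.loop_forever()
```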
44.4 Result and Analysis

This section discusses the performance parameters for the proposed CatchPro scheduler algorithm. As discussed earlier, the experimental scenario is executed for sensors S1 (thermal sensor: medium priority), S2 (smoke sensor: high priority), and S3 (motion sensor: low priority). We executed benchmarking for two cases: the first without the CatchPro algorithmic scheduler, which means that all incoming MQTT requests will
be processed as First In First Out (FIFO). In the second case, the CatchPro scheduler identifies the predefined priority of each sensor (here, S1, S2, and S3) and redirects the incoming requests to the paho broker based on the level of priority, i.e., highest priority first, with the lowest-priority requests handled after the high- and medium-priority requests. The benchmarking is done by means of various test scenarios, as discussed in the next section of this paper.
44.4.1 Benchmarking Results

For execution of benchmarking for the CatchPro algorithm, the following format is used, as shown in Table 44.1.

Table 44.1 Proposed algorithm benchmarking for sensor-to-sensor communication for MQTT protocol

Test type: Test ID #1
Test category: Performance
Test type: Load testing
Aim: To identify performance of the CatchPro algorithm scheduler for the MQTT paho broker
Test description: Test scenario 1
Execution steps (script):
  ensure that {
    when {
      (.) multiple sensors request PUBLISH messages at the same time
      (!) message payload corresponding to PRIORITY_SETTING;
    } then {
      (!) without CatchPro_Activation: the entity sends the SUBSCRIBE messages; note the delay in ms;
      (!) with CatchPro_Activation: the entity sends the SUBSCRIBE messages; note the delay in ms;
    }
  }
Execution steps:
  • Fire requests R1, R2, and R3 from sensors S1, S2, and S3 via the MQTT Paho broker without the CatchPro scheduler
  • Fire requests R1, R2, and R3 from sensors S1, S2, and S3 via the MQTT Paho broker with the CatchPro scheduler
Output: Average MQTT PUBLISH/SUBSCRIBE delay (ms)
Measurements (recorded request time delay): Overall delay without CatchPro scheduler = 0.921 ms; Overall delay with CatchPro scheduler = 0.898 ms
Remark: Delay with the CatchPro scheduler performs better
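One way to record the publish-to-subscribe delay reported in Table 44.1 is to timestamp each message at publish time and compare it with the arrival time in the subscriber callback. The sketch below is an assumed measurement harness (paho-mqtt 1.x API, placeholder broker and topic), not the authors' test script.

```python
import json, time, statistics
import paho.mqtt.client as mqtt

delays = []

def on_message(client, userdata, msg):
    sent = json.loads(msg.payload)["t"]
    delays.append((time.time() - sent) * 1000.0)   # publish-to-subscribe delay in ms

sub = mqtt.Client(client_id="bench-sub")
sub.on_message = on_message
sub.connect("broker.example.local", 1883)
sub.subscribe("bench/requests", qos=1)
sub.loop_start()

pub = mqtt.Client(client_id="bench-pub")
pub.connect("broker.example.local", 1883)
pub.loop_start()

for i in range(100):                               # fire R1, R2, R3 ... style requests
    pub.publish("bench/requests", json.dumps({"seq": i, "t": time.time()}), qos=1)
    time.sleep(0.01)

time.sleep(1.0)                                    # let the last messages arrive
print("average PUBLISH/SUBSCRIBE delay: %.3f ms" % statistics.mean(delays))
```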
Fig. 44.4 Performance analysis for MQTT sensor request processing with and without execution of CatchPro scheduler
44.4.2 Benchmarking Analysis

Following the template shown in Table 44.1, several test IDs were executed to identify the performance of the proposed CatchPro scheduler algorithm. The graph in Fig. 44.4 presents the evaluation of the delay for multiple MQTT requests (from the respective sensors S1, S2, and S3) with and without the proposed scheduler algorithm. Figure 44.4 shows that without the CatchPro scheduler the overall delay in request processing is 0.921 ms, whereas with the CatchPro scheduler the overall delay is 0.898 ms, an overall improvement of 2.56%. With the CatchPro scheduler, high-priority requests get executed first, followed by medium- and low-priority requests, which cuts down the unnecessary hold-off time of the MQTT broker. If a request hold-on occurs, other resources get halted, which may be very costly for IIoT processes. Hence, the proposed algorithm performs better in cutting down the delay and subsequently improves the QoS.
44.5 Conclusion

In this paper, we have considered the IIoT protocol request scheduling problem in a multi-protocol environment where the MQTT and CoAP protocols communicate via their respective brokers. We discussed the milk processing plant application, where automation can be executed without human presence, and proposed the new CatchPro protocol request scheduler algorithm for an industrial framework. For efficient resource utilization and to cut down process downtime, the proposed algorithm provides
priority-wise request execution through sensor networks. The algorithm also redirects each incoming IoT protocol request to its specific protocol broker based on the request type.
44.6 Future Scope

The QoS for MQTT and CoAP can be further improved with the use of the proposed CatchPro algorithmic request scheduler. In the future, with the lessons learnt from the COVID-19 pandemic, lockdown, the need for social distancing, and the need for work-from-home facilities for non-IT employees, the proposed system can be further developed for remote monitoring and administration of smart factories.
References 1. Li, X., et al.: Intelligent manufacturing systems in COVID-19 pandemic and beyond: framework and impact assessment. Chi. J. Mech. Eng. 33(1), 1–5 (2020) 2. Ndiaye, M., et al.: IoT in the wake of COVID-19: a survey on contributions, challenges and evolution. IEEE Access 8, 186821–186839 (2020) 3. Chamola, V, et al.: A comprehensive review of the COVID-19 pandemic and the role of IoT, drones, AI, blockchain, and 5G in managing its impact. IEEE Access 8, 90225–90265 (2020) 4. Kumar, Mr S., et al.: Applications of industry 4.0 to overcome the COVID-19 operational challenges. Diabetes Metabolic Syndrome: Clin. Res. Rev. 14(5), 1283–1289 (2020) 5. Kamal, M., Aljohani, A., Alanazi, E.: IoT meets COVID-19: status, challenges, and opportunities. arXiv:2007.12268 (2020) 6. Potgieter, P.: IIoT sensors: making the physical world digital (2018) 7. Bansal, M., Goyal, A., Choudhary, A.: Industrial Internet of Things (IIoT): a vivid perspective. In: Inventive Systems and Control, pp. 939–949. Springer, Singapore (2021) 8. Sethi, R., et al.: Applicability of industrial IoT in diversified sectors: evolution, applications and challenges. In: Multimedia Technologies in the Internet of Things Environment, pp. 45–67. Springer, Singapore (2021) 9. Varshney, A., et al.: Challenges in sensors technology for industry 4.0 for futuristic metrological applications. MAPAN 1–12 (2021) 10. Mantravadi, S., et al.: Securing IT/OT links for low power IIoT devices: design considerations for industry 4.0. IEEE Access 8, 200305–200321 (2020) 11. Sengupta, J., Ruj, S., Bit. S.D.: A comprehensive survey on attacks, security issues and blockchain solutions for IoT and IIoT. J. Netw. Comput. Appl. 149, 102481 (2020) 12. Boyes, H., et al.: The industrial internet of things (IIoT): an analysis framework. Comput. Ind. 101, 1–12 (2018) 13. Hussain, S., et al.: A lightweight and provable secure identity-based generalized proxy signcryption (IBGPS) scheme for Industrial Internet of Things (IIoT). J. Inf. Secur. Appl. 58, 102625 (2021) 14. Mukherjee, A., et al.: Deep neural network-based clustering technique for secure IIoT. Neural Comput. Appl. 32(20), 16109–16117 (2020) 15. Ren, Y., et al.: Potential identity resolution systems for the Industrial Internet of Things: a survey. IEEE Commun. Surveys Tutor. (2020) 16. Municio, E., Latre, S., Marquez-Barja, J.M.: Extending network programmability to the things overlay using distributed industrial IoT protocols. IEEE Trans. Ind. Inf. 17(1), 251–259 (2020)
17. Tan, S.Z., Labastida, M.E.: Unified IIoT cloud platform for smart factory. Implementing Ind 4.0 55 18. Mishra, B., Kertesz, A.: The use of MQTT in M2M and IoT systems: a survey. IEEE Access 8, 201071–201086 (2020) 19. Zhao, Q.: Presents the technology, protocols, and new innovations in Industrial Internet of Things (IIoT). In: Internet of Things for Industry 4.0, pp. 39–56. Springer, Cham (2020) 20. Borsatti, D., et al.: From IoT to cloud: applications and performance of the MQTT protocol. In: 2020 22nd International Conference on Transparent Optical Networks (ICTON). IEEE (2020) 21. Xu, H., et al.: A survey on industrial Internet of Things: a cyber-physical systems perspective. IEEE Access 6, 78238–78259 (2018) 22. Dong, Y., et al.: Lazy scheduling based disk energy optimization method. Tsinghua Sci. Technol. 25(2), 203–216 (2019) 23. Deva, P.M., et al.: An efficient scheduling algorithm for sensor-based IoT networks. In: Inventive Communication and Computational Technologies, pp. 1323–1331. Springer, Singapore (2020). 24. Cabrera, E.J.S., et al.: Industrial communication based on MQTT and modbus communication applied in a meteorological network. In: The International Conference on Advances in Emerging Trends and Technologies. Springer, Cham (2020) 25. Zhang, S., et al.: A UCB-based dynamic CoAP mode selection algorithm in distribution IoT. Alexandria Eng. J. (2021) 26. Mishra, B., Mishra, B.: Evaluating and analyzing MQTT brokers with stress-testing. (2020)
Chapter 45
Speech@SCIS: Annotated Indian Video Dataset for Speech-Face Cross Modal Research Shankhanil Ghosh, Naagamani Molakathaala, Chhanda Saha, Rittam Das, and Souvik Ghosh
Abstract Researchers working on various speech-based applications for the Indian ethnicity need quality speech data with various speech parameters. There are limited open-source datasets of speech parameters for the Indian context which map voice and facial information. Manual acquisition of such data is time- and energy-consuming and expensive, and there is also a lack of data acquisition tools to facilitate new research domains. This paper presents a proposal for a new online video recording system, with an easy user interface, that facilitates the collection of speech and facial data with proper annotation. The paper also proposes Speech@SCIS, a gender- and age-annotated dataset for people of Indian ethnicity. The dataset is cleaned and segmented into ≈5 s quasi-static video clips and converted into separate speech and facial datasets. Extensive analysis is then performed on the audio segments, and their quality is tested on a gender-specific neural network classifier. The performance of the proposed audio dataset is compared with benchmark datasets (the ELSDSR and TIMIT datasets) and found to be improved. There are further enhancement plans to include language annotations and to use deep learning-based data augmentation methods to reduce skews within the dataset.

S. Ghosh · N. Molakathaala (B) · C. Saha
School of Computer and Information Sciences, University of Hyderabad, Hyderabad, India
e-mail: [email protected]
S. Ghosh e-mail: [email protected]
C. Saha e-mail: [email protected]
R. Das, Department of Computer Science and Engineering, University of Calcutta, Kolkata, India
S. Ghosh, Heritage Institute of Technology, Kolkata, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 V. Bhateja et al. (eds.), Smart Intelligent Computing and Applications, Volume 1, Smart Innovation, Systems and Technologies 282, https://doi.org/10.1007/978-981-16-9669-5_45
45.1 Introduction

India is home to over 1.5 billion people. The sheer size of this single ethnic group creates huge opportunities for research on biometric studies, in which speech-based research is a major focus. Speech research, not just in the Indian context, has been a major target of the research fraternity for quite some time: human voices have been used to build remarkable technology, to understand the human body, and for many other use-cases. In recent years, we and other researchers across the globe have been focusing on speech–face cross-modal biometric matching, where the focus of study is the relationship between the human face and its voice. Such research has major applications in security, forensics, biometric authentication, and other areas.

Interesting relationships have been established between the human face, the articulatory structure, and the voice [1]. Wen et al. [2] and Agarwal et al. [3] have proposed GAN architectures to generate a person's face from their voice. Hoover et al. [4] focus specifically on identifying a speaker's face from their voice. Other research addresses "talking faces" [5], where, given a static facial image and a voice, a short video is generated which mimics the face speaking out the speech in its own voice. However, a noticeable fact is that these works do not take specific details, such as ethnicity, into consideration. Due to the huge number of people of Indian ethnicity, this domain of research has various use-cases in this region. For example, a very recent work by Banothu et al. [6] describes speech-based access to a Kisan information system in the Telugu language. One of the most useful use-cases is in the domain of forensics: a lot of banking fraud in India happens over the phone, and one very useful piece of evidence in these cases is the voice of the perpetrator. Proper analysis of the voice and a face-reconstruction algorithm can drastically ease the investigation process. The Government of India is also interested in harnessing the capacity of face and speech for cross-modal identification of Indian citizens. In spite of these research and development opportunities, gathering a well-annotated speech and face database for ethnic Indians is still a big issue among the research fraternity.

Collecting speech and face datasets is a huge problem in itself. Since speech has a huge number of use-cases, each demanding a different kind of speech dataset, collecting an appropriate dataset is a major challenge, and gathering face and voice data together is a very time-consuming and expensive task. Also, there does not exist a well-annotated video dataset for Indian subjects. Hence, in this paper, we propose a gender- and age-annotated Indian video dataset that we have collected using an online Web application built by the research team. This Speech@SCIS dataset is collected and stored on cloud storage. The videos have been manually reviewed by members of the team, and only those in which the face is clearly visible have been retained. To check the quality of the speech input in our video dataset, we have taken two well-established speech datasets, namely the ELSDSR dataset, proposed by Feng and Hansen [7], and a
small subset of TIMIT Acoustic-Phonetic Continuous Speech Corpus provided by Garofolo et al. [8] as our baseline dataset. We have tested our dataset against these baseline datasets on NeuraGen [9], which is a gender-classifier neural network. For this work, we have compared the training accuracy, validation accuracy, training loss and validation loss of Speech@SCIS against the baseline datasets, to establish an idea about the quality of our proposed dataset.
45.2 Related Work

To understand the background of this research, we extensively studied past research on topics such as speech and facial datasets, speech data acquisition (covering both language-dependent and language-independent datasets), and GAN-based data augmentation methods for speech datasets.
45.2.1 Speech and Facial Dataset

Speech data acquisition is a very important contribution to the research fraternity. There are already existing speech datasets which contain samples obtained under quite constrained conditions and are sometimes hand annotated, and hence limited in size. There are also other large-scale speech datasets which lack proper annotations but are extremely useful in some areas of research. One such example is by Nagraniy et al. [10], who proposed a large-scale face–speech dataset collected from publicly available YouTube videos, "in the wild". The dataset, called VoxCeleb, contains face and speech utterances of over 1000 celebrities; it is gender balanced, and the speakers are from various demographic backgrounds. However, these data are not annotated. Srivastava et al. [11] worked on a low-resource automatic speech recognition challenge for Indian languages: given that India has more than 1500 languages but data about most of them are not widely available, they tried to provide speech recognition support for these low-resource languages.
45.2.2 GAN-Based Data Augmentation on Speech Datasets

There are several relevant works in the domain of data augmentation. Specifically for data augmentation in the speech processing domain, Wu et al. [12] proposed a model to detect face masks and monitor a subject's breath from speech; they describe the data augmentation, feature representation, and modelling, and mention the use of SpecAugment, proposed by Park et al. [13], which is described as a simple data augmentation method for automatic speech recognition.
Fig. 45.1 Workflow of the proposed data collection tool
Eskimez et al. [14] also proposed a GAN-based data augmentation model for speech emotion recognition; this is a CNN-based GAN with spectral normalization on both the generator and discriminator networks, pretrained on a large-scale un-annotated corpus. There are also other data augmentation models proposed for specific speech use-cases. For example, Chatziagapi et al. [15] proposed a data augmentation model for speech emotion datasets, in which the authors targeted the generation of enough data for under-represented emotions. Their models are tested on two large-scale datasets, IEMOCAP and FEEL25k, and achieve improvements of 10% and 5%, respectively, on the two datasets (Fig. 45.1).
45.3 Our Proposed Work

45.3.1 Data Collection Tool

The video recordings were collected using a Web application that was developed by us for the purpose of data collection. The online tool is publicly available, and anyone of Indian ethnicity can visit the live link of the Web application and contribute their video recordings for our research purpose. The online tool was built using ReactJS v17.0.1. The application uses the third-party video recorder tool
Fig. 45.2 Screenshots of online tool to collect video data: (a) self-annotation form for the Speech@SCIS online video collection tool; (b) recording video for the Speech@SCIS online video collection tool
react-video-recorder v3.17.2.¹ The tool works on modern versions of Mozilla Firefox and Google Chrome. It is compatible with mobile phones, PCs, and tablets, and the volunteers have used all kinds of devices to record their self-videos for our database. The Web application works on almost any browser, and the user must manually grant permissions to access the video camera and microphone. The volunteers were also requested to provide their own gender and age, which are used as annotation data. Currently, the Speech@SCIS dataset is a language-independent dataset; however, provisions will be made in the future to collect language-specific video data for other research use-cases. The Web application explicitly mentions the privacy policy that the research team adheres to: the video data recorded by the volunteers are securely stored in a Firebase cloud, which can be accessed only by authorized members of the research team.

The Web application was distributed among willing volunteers, who provided their self-video data, often multiple times. The first view of the application shows a list of usage instructions and the welcome page. When a volunteer begins his/her contribution, the application presents a small form asking for the volunteer's age and gender (Fig. 45.2a), and then gives a view where the volunteer can turn on the camera and record the self-video. The volunteers can read out any text of their liking or read the sample text given below the video camera component; the sample text is inspired by the United Nations Declaration of Human Rights² (Fig. 45.2b).
¹ https://github.com/fbaiodias/react-video-recorder
² https://www.un.org/en/about-us/universal-declaration-of-human-rights
Table 45.1 Exploratory data analysis of Speech@SCIS dataset

                          Male     Female   All
Utterance duration        –        –        3–5 s
Mean age                  26.44    25.18    28.04
Outlier count             5        4        9
Total utterances count    1747     617      2364
Male–female ratio         –        –        2.83:1
45.3.2 Proposed Dataset

As mentioned earlier, collecting speech data is time-consuming and often expensive, and different domains in speech and cross-modal matching research demand different annotations with the dataset. In this research, we propose a video-based speech and face dataset which we call the Speech@SCIS dataset. We have manually collected the Speech@SCIS dataset over a certain period of time; it is still in the process of being built and cleaned and is not publicly available as of now, but we publish its details for the purpose of this research.

The Speech@SCIS database is a collection of video recordings submitted by 30+ volunteers. The recordings consist of the speaker being present in front of the camera and speaking a few sentences. The speakers have spoken in a variety of languages, such as English, Bengali, and Hindi; the language of the utterances has not been taken into consideration for this research. The volunteers have recorded the videos in various environments, both indoors and outdoors, and the audio recordings have different levels of noise, loudness, clarity, etc. As mentioned before, we are still collecting utterances and the database is still growing. We have collected over 1.5 GB worth of video data, amounting to nearly 2364 utterances. We have performed exploratory data analysis on the available dataset, and the analysis results are given in detail in Table 45.1. The dataset is somewhat skewed in terms of gender distribution, with more male utterances than female, and in terms of age distribution it is heavily skewed towards younger voices.
45.3.3 Exploratory Analysis of the Audio from the Dataset

Since we are dealing with speech data, we performed feature extraction to obtain relevant features. Since this research targets gender and age as the annotations, with a special focus on gender, we have carefully chosen the features that affect these two biometrics. We have chosen a total of four speech characteristics and performed mathematical manipulations on them to generate the
final feature vector. The four speech characteristics chosen for feature extraction are the Mel frequency cepstrum coefficients (MFCC), root mean square (RMS), spectral centroid, and spectral bandwidth, implemented with the audio processing library Librosa. The MFCC of an utterance is generated as a 2-D matrix of size M × 20, where M may vary with the length of the utterance; for most of the utterances, the MFCC matrix dimensions were 216 × 20. We took the mean across the 0th axis of the MFCC matrix to generate a feature vector of size 1 × 20; this averages the 20 coefficients over the time axis, so the resulting vector represents the entire utterance. Since all the utterances are ≈5 s long, the meaned MFCC is a reasonable representation. We collected MFCC feature vectors for both males and females and analysed them separately. Theoretically, MFCC mainly captures the characteristics of the articulatory structure, so additional features are needed to properly identify gender.
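The feature extraction described above can be reproduced with Librosa roughly as follows; note that librosa returns the MFCC matrix as 20 × M (coefficients × frames), so the mean is taken over the frame axis, and the file path is a placeholder.

```python
import numpy as np
import librosa

def extract_features(path):
    y, sr = librosa.load(path, sr=None)                        # keep the native sampling rate
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)         # shape (20, M frames)
    rms = librosa.feature.rms(y=y)                             # shape (1, M)
    centroid = librosa.feature.spectral_centroid(y=y, sr=sr)   # shape (1, M)
    bandwidth = librosa.feature.spectral_bandwidth(y=y, sr=sr) # shape (1, M)
    # Average each feature over the time axis so every ~5 s utterance gives one fixed-size vector
    return np.concatenate([mfcc.mean(axis=1),
                           rms.mean(axis=1),
                           centroid.mean(axis=1),
                           bandwidth.mean(axis=1)])            # 20 + 1 + 1 + 1 = 23 values

features = extract_features("speech_scis/utterance_0001.wav")  # placeholder path
print(features.shape)   # (23,)
```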
Fig. 45.3 Snapshot of the dataset: (a) MFCC spectrogram of a subject's (female, 23 years old) voice and a snapshot of her face; (b) MFCC spectrogram of a subject's (male, 23 years old) voice and a snapshot of his face
The comparison between the MFCCs of a sample subject from each gender is shown in Fig. 45.3; both subjects are of the same age (23 years old). Since MFCC alone cannot be used for gender-based studies, we have introduced other features as part of the feature extraction process of our audio dataset: the root mean square, the spectral bandwidth, and the spectral centroid. For each utterance, the RMS, spectral bandwidth, and spectral centroid are feature vectors of size 1 × M, where M may vary between utterances; as mentioned before, M = 216 for most of the speech data we have. In a similar way as above, we averaged each of these features based on the gender of the speaker. It can be seen that there is a clear distinction in RMS, spectral bandwidth, and spectral centroid between male and female voices. Hence, for any gender-related study in the audio modality, these four parameters, namely MFCC, RMS, spectral bandwidth, and spectral centroid, can be used together as distinguishing features between voices. In Fig. 45.4, we give the results of the analysis of the different extracted features with respect to the gender of the subjects.
Fig. 45.4 Comparison of MFCC, RMS, spectral bandwidth and spectral centroid with respect to gender: (a) comparing MFCC and RMS with respect to gender; (b) spectral bandwidth and spectral centroid over 216 frames, with respect to gender
45.3.4 Comparing the Audio Subset of Speech@SCIS Against the Baseline

We have tested the quality of the Speech@SCIS dataset against two standard speech datasets, namely a smaller subset of the TIMIT dataset and the ELSDSR dataset, both of which are publicly available without any cost. To perform an unbiased comparison test, we have taken a randomly selected subset of the Speech@SCIS dataset containing 360 utterances, 187 male and 173 female, which makes our Speech@SCIS subset fairly even in terms of gender distribution.
Fig. 45.5 Training curves of both datasets, compared side by side: (a) training curves for the Speech@SCIS dataset; (b) training curves for the hybrid sub-TIMIT and ELSDSR dataset
Table 45.2 Comparison between Speech@SCIS and ELSDSR + sub-TIMIT

                      Accuracy                  Loss
                      Training    Validation    Training    Validation
ELSDSR + sub-TIMIT    0.947       0.943         0.133       0.125
Speech@SCIS           0.962       0.955         0.109       0.084
We have extracted the speech from the utterances and have not taken the video information into consideration. The ELSDSR dataset has a total of 198 utterances, and the sub-TIMIT dataset has 160 utterances. Since ELSDSR and sub-TIMIT are individually very small, we have combined both of them to produce a hybrid dataset of 358 utterances. This hybrid sub-TIMIT + ELSDSR dataset consists of 188 male utterances and 170 female utterances and hence is also fairly balanced in terms of gender distribution; we use this hybrid dataset as it is provided.

For comparing the datasets, we have used a low-resource neural network for gender classification, called NeuraGen [9]. We will publish a detailed paper about NeuraGen in future work; here, we only compare the performance results of the two datasets. We have extracted the previously mentioned features, i.e., MFCC, RMS, spectral bandwidth, and spectral centroid. The data have been split into training and validation sub-datasets using 20-fold cross-validation. The exact same configuration of NeuraGen has been used for the experiments on Speech@SCIS and the baseline datasets. NeuraGen was trained on both datasets separately, over 500 epochs, and we calculated the training accuracy, validation accuracy, training loss, and validation loss. The results are shown in Table 45.2. Speech@SCIS has 1.60% better training accuracy and 1.29% better validation accuracy than the ELSDSR + sub-TIMIT dataset; it also has a 17.85% lower training loss and a 33.02% lower validation loss. Hence, Speech@SCIS performed better in all four respects. The learning curves are shown in Fig. 45.5, which also supports the data shown in Table 45.2.
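Since NeuraGen itself is described in separate work, the comparison protocol can be illustrated with a generic low-resource classifier; the sketch below uses scikit-learn's MLPClassifier and k-fold cross-validation purely as a stand-in for NeuraGen, with features extracted as in the earlier snippet.

```python
from sklearn.model_selection import cross_validate
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

def evaluate(features, labels, folds=20):
    """features: (n_utterances, 23) array; labels: 0 = male, 1 = female."""
    model = make_pipeline(StandardScaler(),
                          MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000))
    scores = cross_validate(model, features, labels, cv=folds,
                            return_train_score=True, scoring="accuracy")
    return scores["train_score"].mean(), scores["test_score"].mean()

# Hypothetical usage with two feature matrices prepared beforehand:
# speech_scis_train, speech_scis_val = evaluate(speech_scis_X, speech_scis_y)
# baseline_train, baseline_val = evaluate(elsdsr_subtimit_X, elsdsr_subtimit_y)
```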
45.4 Conclusion

In this paper, we have proposed Speech@SCIS, a video dataset of face and speech data for people of Indian ethnicity, and have described the online tool that we built for this purpose. This video dataset has been gender and age annotated, cleaned, and analysed. We have also discussed the analysis results of the various speech characteristics obtained from the audio. We have shown that speech features like RMS, spectral bandwidth, and spectral centroid are significantly different for males and females; hence, this is an indication that those features, along with
MFCC, which is a very useful feature for speech research, can be used to study different speech use-cases that are based on gender-related biometrics. We have also tested our Speech@SCIS dataset on NeuraGen, a low-resource gender-classifier neural network, against the two baseline datasets, ELSDSR and a small subset of TIMIT, and have shown that Speech@SCIS has better performance in all four respects, namely training accuracy, validation accuracy, training loss, and validation loss.
45.5 Future Scope of Work

Speech@SCIS is continuously growing, and our generous volunteers are continuously contributing to the dataset. There are several areas where the dataset needs to improve. Currently, there is a heavy skew in terms of age, which will cause problems wherever age is a deciding factor. As of now, we are only collecting gender and age annotations; we plan to extend the set of annotations to include parameters like the language and location information of the speaker. The online tool used to collect the data has some bugs, as reported by our volunteers, which we are working to fix; we are also continually updating and optimizing the application and simplifying the UI so that inexperienced users can handle the tool with ease. We are planning to use the Speech@SCIS dataset for face–speech cross-modal research for forensic purposes, and to build a data augmentation model based on this dataset, so that we can generate data for under-represented annotations, build a fairly distributed dataset, and remove any existing skews.

Acknowledgements This research work is an outcome of the Advanced Operating System course (CS/AI/IT401) at the University of Hyderabad. The authors would like to thank Professor Chakravarthy Bhagvathy, Dean, SCIS, who is also the facilitator of this course, for his support and feedback towards our research. The authors also thank the University of Hyderabad for facilitating the lab environments. Lastly, the authors would like to extend their heartfelt gratitude to all the volunteers who have contributed their voices to the Speech@SCIS database. Without their contribution, this entire research would have remained as drawings and scribbles in our notebooks.
References 1. Singh, R., Raj, B., Gencaga, D.: Forensic anthropometry from voice: an articulatory-phonetic approach. In: 2016 39th International Convention on Information and Communication Technology, Electronics and Microelectronics, MIPRO 2016—Proceedings, pp. 1375–1380 (2016) 2. Wen, Y., Singh, R., Raj, B.: Reconstructing faces from voices. 1–10 (2019) 3. Agarwal, P., Poddar, S., Hazarika, A., Rahaman, H.: Learning to synthesize faces using voice clips for Cross-Modal biometric matching. In: Proceedings of 2019 IEEE Region 10 Symposium, TENSYMP 2019, vol. 7, pp. 397–402 (2019)
4. Hoover, K., Chaudhuri, S., Pantofaru, C., Slaney, M., Sturdy, I.: Putting a face to the voice: fusing audio and visual signals across a video to determine speakers, May 2017. https://arxiv.org/abs/1706.00079
5. Zhou, H., Liu, Y., Liu, Z., Luo, P., Wang, X.: Talking face generation by adversarially disentangled audio-visual representation. In: 33rd AAAI Conference on Artificial Intelligence, AAAI 2019, 31st Innovative Applications of Artificial Intelligence Conference, IAAI 2019 and the 9th AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019 (2019)
6. Banothu, R., Basha, S.-S., Molakatala, N., Gautam, V.K., Gangashetty, S.V.: Speech Based Access of Kisan Information System in Telugu Language, pp. 287–298. Springer, Cham (2021). https://link.springer.com/chapter/10.1007/978-3-030-68449-5_29
7. Feng, L., Hansen, L.: A new database for speaker recognition. In: Test, pp. 1–4 (2005). http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.126.1673&rep=rep1&type=pdf
8. Garofolo, J.S., Lamel, L.F., Fisher, W.M., Pallett, D.S., Dahlgren, N.L., Zue, V., Fiscus, J.G.: TIMIT acoustic-phonetic continuous speech corpus (1993). https://catalog.ldc.upenn.edu/LDC93S1
9. Ghosh, S., Saha, C., Molakaatala, N.: [DRAFT] NeuraGen: A low-resource neural network based approach to gender classification (2020)
10. Nagraniy, A., Chungy, J.S., Zisserman, A.: VoxCeleb: a large-scale speaker identification dataset. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, vol. 2017-August, pp. 2616–2620 (2017)
11. Srivastava, B.M.L., Sitaram, S., Kumar Mehta, R., Doss Mohan, K., Matani, P., Satpal, S., Bali, K., Srikanth, R., Nayak, N.: Interspeech 2018 Low Resource Automatic Speech Recognition Challenge for Indian Languages, no. August, pp. 11–14 (2018)
12. Wu, H., Zhang, L., Yang, L., Wang, X., Wang, J., Zhang, D., Li, M.: Mask detection and breath monitoring from speech: On data augmentation, feature representation and modeling (2020)
13. Park, D.S., Chan, W., Zhang, Y., Chiu, C.-C., Zoph, B., Cubuk, E.D., Le, Q.V.: SpecAugment: a simple data augmentation method for automatic speech recognition, Apr 2019. https://arxiv.org/abs/1904.08779
14. Emre Eskimez, S., Dimitriadis, D., Gmyr, R., Kumanati, K.: GAN-based data generation for speech emotion recognition. Technical Report
15. Chatziagapi, A., Paraskevopoulos, G., Sgouropoulos, D., Pantazopoulos, G., Nikandrou, M., Giannakopoulos, T., Katsamanis, A., Potamianos, A., Narayanan, S.: Data augmentation using GANs for speech emotion recognition. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, vol. 2019-Sept, pp. 171–175 (2019)
Chapter 46
Task Scheduling in Cloud Using Improved ANT Colony Algorithm Shyam Sunder Pabboju and T. Adilakshmi
Abstract In recent years, with the in-depth development of power informatization, more and more power applications and tasks are deployed in the cloud. Due to the dynamic heterogeneity of cloud resources and power applications, realizing resource division and task scheduling is a challenging problem in cloud computing systems. Power applications need to achieve rapid response and minimum completion time, and the scheduler must consider the load of each cloud computing node to ensure the reliability of cloud computing. A task scheduling algorithm based on improved ant colony algorithm is proposed to solve the task scheduling problem in virtual machines. Through the improvement of the standard ant colony algorithm, the task scheduling time reduction and load balancing are realized while minimizing the overall completion time. The research results show that the algorithm effectively shortens the task scheduling time and realizes the load balancing of cloud nodes, providing a technical basis for the optimization of power cloud computing.
S. S. Pabboju (B) MGIT Hyderabad, Hyderabad, Telangana, India e-mail: [email protected]
T. Adilakshmi Vasavi College of Engineering, Hyderabad, Telangana, India e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 V. Bhateja et al. (eds.), Smart Intelligent Computing and Applications, Volume 1, Smart Innovation, Systems and Technologies 282, https://doi.org/10.1007/978-981-16-9669-5_46
46.1 Introduction
Cloud computing technology can perform highly computationally intensive tasks as needed. Currently, most of the information technology industry has migrated to cloud computing-based frameworks to serve its customers [1]. The cloud computing framework provides services through three different levels, namely the infrastructure layer, the platform layer and the application layer, to support the real-time needs of users (such as data storage, processing capacity and bandwidth) [2]. Enterprises and individuals can use these services through the methods provided by virtualization technology. There are numerous cloud service providers available right now (such as
Amazon EC2, Google, HP and IBM) [3] whose resources are virtualized according to customer requirements and provided with "pay-as-you-use" service level agreements (SLAs). Many purposes can be accomplished using cloud computing, including flexible resource storage, scalable and dynamic on-demand services and service pricing based on consumption and service quality (QoS) [4]. The application characteristics of the power industry are very consistent with the service mode and technical mode of cloud computing. Electric power business is divided into real-time business and non-real-time business [5, 6], and the two types of business have different requirements for resource allocation. Cloud computing gathers originally dispersed resources and provides them to the audience in the form of services to realize group operation, intensive development, lean management and standardized construction [7, 8]. The use of cloud computing can not only achieve data collection and sharing in the power industry, but ultimately realize data mining, provide business intelligence (BI), assist decision-making analysis and promote the coordinated development of production and business [9]. It can also help grid companies convert data into services, enhance service value and achieve information neural network integration. Cloud computing in the power industry can automatically switch storage resources according to power business and demand, access computing and storage resources on demand and transfer calculations running on grid nodes and individual computers to the cloud in the system, where the cloud handles the calculation demand [10]. Using cloud computing, computing power and resources can be obtained directly from the cloud, greatly improving the computing power of the entire power system [11]. Cloud computing is not only a new way of sharing architecture, but it also integrates system resources to provide multi-style services [12]. In the cloud computing system, resource management plays an important role; its purpose is to optimize the configuration and assign tasks across the entire cloud resource pool [7, 13]. The task scheduler is the most important aspect of the cloud computing system's resource management, which can optimize the pre-configuration of the "virtual machine (VM)" through optimization algorithms and allocate user tasks among logical resources to enhance the performance of the work scheduling process [14, 15]. In cloud computing, task scheduling is a fundamental challenge that requires cognitive search and decision-making to locate the optimum virtual machine for each user job [16]. The power management module needs to respond quickly and allocate the best resources to a task as quickly as possible [17] in order to improve the experience of power users. The allocation of cloud resources by the scheduler also needs to consider the reliability and sustainability of the overall cloud computing system [18], so cloud computing needs to optimize the node load so that it can complete more power tasks [19].
46.2 Related Works
In cloud computing, task scheduling is handled by the task scheduler, and scheduling decisions are taken according to various metrics to bind user tasks to connected resources. Algorithms for scheduling user tasks in cloud computing systems have been proposed in previous research. For effective scheduling of user work in a cloud context, Abdullahi et al. [20] suggested a discrete symbiotic organisms search (DSOS) algorithm, which achieves its objective functions through symbiotic and parasitic interactions and was reported to offer faster completion and response times than adaptive particle swarm optimization. To overcome the local convergence problem, Jeyakrishnan et al. [21] offered a hybrid strategy-based bacterial swarm optimization (BSO) for resource allocation, a technique that avoids local convergence while attaining global search and significantly improved optimization results. Mondal et al. [22] suggested an approach for balancing cloud load distribution by utilizing a stochastic hill climbing method to achieve balanced task allocation among virtual machines, which performed well in local search but poorly in global optimization. Guoning et al. [23] suggested the genetic simulated annealing algorithm as a new technique: after selection, crossover and mutation, they employed the simulated annealing process to improve local search speed, but the approach failed when the size of the user task set increased. Dengke et al. [24] suggested a cloud computing task scheduling algorithm combining particle swarm optimization and ant colony optimization to improve the program's execution speed and optimization capabilities. Although Linjie [25] suggested an improved particle swarm optimization task scheduling strategy based on a biological symbiosis mechanism that can achieve rapid convergence, the aforementioned two algorithms do not take virtual machine load balancing into account. The above task scheduling algorithms only consider either the fast convergence of the algorithm or the load balancing of the virtual machine, but do not fully consider the integration of the two. In this paper, the classic ant colony algorithm is improved to retain fast convergence while satisfying the load balancing of the virtual machine. The simulation results show that the algorithm not only has better real-time performance, but can also achieve better load balancing.
46.3 Proposed Scheduling Model 46.3.1 System Model Power cloud computing provides on-demand high-performance computing services for the operation and management of power companies through the power private network. Its infrastructure resources include cloud servers, data storage and middleware. One or more hosts with one or more virtual machines are the primary resources
(VM). The hosting computer’s computing power (CPU), memory and storage location are shared by each VM. Each power task request is mapped to the appropriate virtual machine based on resource requirements (such as CPU, memory, storage and network bandwidth). The workflow of this model is: Each business interacts with the power cloud computing infrastructure and accesses infrastructure resources through cloud security interfaces to ensure identity verification and authorization. The resource scheduler is the major component that operates as a middleware between the client and the resource manager, processing the client’s request, obtaining further task request information from the client and allocating resources using the built-in scheduling algorithm. Finally, the resource manager serves as a resource information system and provides services to the resource scheduler in order to assign user tasks to the “best possible” virtual machine.
46.3.2 Scheduling Problem
Tasks are dynamically provided by multiple power users in the power cloud computing ecosystem. The cloud resource scheduler is in charge of locating the most appropriate resource for the task at hand, which is the cloud virtual machine. It is also crucial to keep track of the load on each virtual machine, because scheduling jobs on overburdened resources might slow down the overall cloud computing system. The task scheduling problem is considered a critical problem in this research since it has an impact on the overall performance of power cloud computing in terms of service quality. Many factors affect service quality, including execution time, transmission time, delay, resource usage and completion time. In order to provide better performance for power users, this article focuses on task scheduling time and load balancing. In the task scheduling process, consider n user tasks submitted to the system, namely $T = \{T_1, T_2, T_3, \ldots, T_n\}$, which need to be allocated to m virtual machines, namely $VM = \{VM_1, VM_2, VM_3, \ldots, VM_m\}$, and each virtual machine is configured with different resource parameters, such as million instructions per second (MIPS) CPU capacity, storage and network bandwidth. Users create tasks with a variety of parameters, such as task duration, deadline and cost. In order to form the objective function of the proposed algorithm, taking $C_{ij}$ as the completion time of task $T_i$ on the virtual machine $VM_j$, Formula (46.1) can be used to calculate the completion time of all tasks:

$$C_{ij} = E_{ij} + W_j \qquad (46.1)$$

Among them, $E_{ij}$ is the expected time to complete the i-th task on $VM_j$, and $W_j$ is the task waiting time for $VM_j$. The completion time of all submitted tasks is the completion time of the virtual machine with the longest total execution time of the tasks completed on it.
This can be represented through the overall makespan $C_{\max}$, the largest sum of task completion times across all virtual machines:

$$C_{\max} = \max \sum_{i=1}^{k} C_{ij}, \quad i \in [1, k] \qquad (46.2)$$

Among them, k is the number of tasks allocated to the virtual machine $VM_j$. Based on the total quantity of power cloud computing resources and the requirements of power users, the goal of the approach provided in this work is to enable the cloud scheduler to give the optimum resource allocation for all user jobs with the shortest completion time and lowest cost. Based on the above considerations, the objective function proposed in this paper is defined as

$$\text{fitness} = \min(C_{\max}) \qquad (46.3)$$
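To make the objective concrete, the following minimal sketch evaluates $C_{\max}$ of Eq. (46.2), i.e. the fitness of Eq. (46.3), for a given task-to-VM assignment. The task lengths, VM speeds and the sequential queueing assumption are illustrative; the paper itself evaluates this objective inside CloudSim rather than with standalone code.

```python
# Minimal sketch of the makespan objective in Eqs. (46.1)-(46.3).
# Assumptions: task lengths in million instructions (MI), VM speeds in MIPS,
# and an assignment mapping each task index to a VM index; names are illustrative.
import numpy as np

def fitness(task_lengths, vm_mips, assignment):
    """Return C_max, the largest accumulated completion time over all VMs."""
    vm_time = np.zeros(len(vm_mips))            # accumulated waiting time W_j per VM
    for i, j in enumerate(assignment):
        e_ij = task_lengths[i] / vm_mips[j]     # expected execution time E_ij
        vm_time[j] += e_ij                      # C_ij = E_ij + W_j as tasks queue up
    return vm_time.max()                        # C_max; the scheduler minimises this

# Example: 5 tasks on 2 VMs
# print(fitness([400, 300, 500, 200, 100], [100, 250], [0, 1, 1, 0, 1]))
```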
46.3.3 Improved ANT Colony Algorithm
Standard ANT Colony Algorithm
The standard ant colony algorithm must first initialize the pheromone function. The pheromone function $\tau_{ij}(t)$ represents the pheromone concentration of the virtual machine $VM_j$ for a certain task $T_i$. This article uses the computing power $\text{MIPS}_j$ of the virtual machine $VM_j$ and the communication bandwidth $\text{Bandwidth}_j$; the pheromone function is initialized with the predicted execution time of the task:

$$\tau_{ij}(0) = \frac{\text{MIPS}_j}{C} + \frac{\text{Bandwidth}_j}{D} \qquad (46.4)$$

Among them, C and D are constants. After completing the initialization of the heuristic function and the pheromone function, the probability that the task $T_i$ is scheduled to the virtual machine $VM_j$ is obtained by Formula (46.5):

$$p_{ij}^{k}(t) = \begin{cases} \dfrac{[\tau_{ij}(t)]^{\alpha}\,[\eta_{ij}(t)]^{\beta}}{\sum_{x_{iy} \in \text{allowed}_k} [\tau_{iy}(t)]^{\alpha}\,[\eta_{iy}(t)]^{\beta}}, & x_{ij} \notin \text{tabu}_k \\[1ex] 0, & \text{otherwise} \end{cases} \qquad (46.5)$$

$$\eta_{ij}(t) = \frac{1}{C_{ij}} \qquad (46.6)$$

Among them, the heuristic function $\eta_{ij}(t)$ represents the tendency of a task $T_i$ towards the virtual machine $VM_j$, which is inversely proportional to $C_{ij}$. $x_{ij}$ represents the search path of the ant, that is, the assignment of task $T_i$ to the virtual machine $VM_j$. $\text{tabu}_k$ (k = 1, 2, ..., m) is the forbidden list of the kth ant. After the kth ant has selected the node $x_{ij}$ at time t, $\{X_{ij}\}_{i=1}^{m}$ is added to the forbidden list $\text{tabu}_k$.
$\alpha$ and $\beta$, as the pheromone factor and heuristic factor, respectively, represent the relative degree of influence of the pheromone and the heuristic function. After the ant selects the node according to $p_{ij}^{k}(t)$, that is, after task $T_i$ is assigned to a virtual machine $VM_j$, the pheromone function completes the partial update through Eqs. (46.7) and (46.8):

$$\tau_{ij}(t+1) = (1-\rho) \times \tau_{ij}(t) + \sum_{k=1}^{m} \Delta\tau_{ij}^{k}(t) \qquad (46.7)$$

$$\Delta\tau_{ij}^{k}(t) = \begin{cases} \dfrac{Q}{C_{ij}}, & \text{if the } k\text{th ant chooses } x_{ij} \text{ between } t \text{ and } t+1 \\[1ex] 0, & \text{otherwise} \end{cases} \qquad (46.8)$$
In the above equation, $\rho \in (0, 1)$ is the volatilization coefficient of the pheromone, $\Delta\tau_{ij}^{k}(t)$ is the amount of information generated on the path (i, j) after ant k performs a cycle, and Q is a constant. After all the ants complete a cycle, each task completes the selection of the virtual machine, and the pheromone is updated globally according to Formulas (46.9) and (46.10):

$$\tau_{ij}(t+n) = (1-\rho) \times \tau_{ij}(t) + \sum_{k=1}^{m} \Delta\tau_{ij}^{k}(t) \qquad (46.9)$$

$$\Delta\tau_{ij}^{k}(t) = \begin{cases} \dfrac{Q}{C_{\max}}, & \text{when the } k\text{th ant selects } x_{ij} \text{ in this cycle} \\[1ex] 0, & \text{otherwise} \end{cases} \qquad (46.10)$$
In the formula, $C_{\max}$ is the optimal solution obtained in the completed iteration.
Improvement of heuristic factor β
In the standard ant colony algorithm, the heuristic factor $\beta$ is a constant. It can be seen from Formula (46.5) that the value of the pheromone function does not change in the initial stage of the algorithm, and the pheromone function has a small influence on the probability of ant selection. The pheromone function is continuously updated locally, and its degree of influence on the final solution continuously increases, so the heuristic factor $\beta$ is transformed into a decreasing function that changes with the number of iterations of the algorithm:

$$\hat{\beta}(i) = b^{1/i} \qquad (46.11)$$
In the formula, b is a constant, and i is the number of current iterations of the algorithm. In the initial stage of the algorithm, the value of i is smaller, and the value of β is larger. The ants first choose the path through the heuristic function. When the algorithm continues to iterate, the value of i increases, the value of β decreases, and
the pheromone concentration on the path increases. The influence of the pheromone function on the path chosen by the ants is enhanced, which can improve the guiding effect of the pheromone function and help find the optimal solution.
Load balancing implementation
In order to reduce the overload of the virtual machine and realize the load balancing of the virtual machine in the task allocation process, this paper improves the pheromone function update process and adds the pheromone adjustment factor δ:

$$\delta = 1 - \left( \left(C_j - C_{\text{avg}}\right) \Big/ \sum_{j \in VM} C_j \right) \qquad (46.12)$$
Among them, $C_j$ is the running time of the virtual machine $VM_j$ in the last iteration, and $C_{\text{avg}}$ is the average running time of all virtual machines, that is, $C_{\text{avg}} = \sum_{j \in VM} C_j / m$. The pheromone function update method is improved as

$$\tau_{ij}(t+n) = (1-\rho) \times \tau_{ij}(t) + \sum_{k=1}^{m} \Delta\tau_{ij}^{k}(t) \times \delta \qquad (46.13)$$
The pheromone function thus incorporates the influence of the load status of the virtual machine during the update process. When the load of the virtual machine is too large, the δ value decreases, and the pheromone function decreases; in the next iteration, the probability that this virtual machine is assigned a task is also reduced, so as to achieve load balancing.
Improved ant colony algorithm process
The process of the improved ant colony algorithm is shown in Fig. 46.1.
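To make the flow of Fig. 46.1 easier to follow, the sketch below strings Eqs. (46.4)–(46.13) together into a compact scheduler. It is a sketch under stated assumptions only: the problem instance, the constants Q, C and D, treating all VMs as allowed for every task, and merging the local (46.7)–(46.8) and global (46.9)–(46.13) pheromone deposits into a single per-iteration update are our simplifications; the authors' CloudSim-based implementation may differ.

```python
# Compact sketch of the improved ant colony scheduler (Eqs. 46.4-46.13); data are illustrative.
import numpy as np

rng = np.random.default_rng(0)
task_len = rng.uniform(200, 1000, size=20)      # task lengths in million instructions
vm_mips = rng.uniform(100, 400, size=5)         # VM computing power (MIPS)
vm_bw = rng.uniform(50, 150, size=5)            # VM communication bandwidth
n, m = len(task_len), len(vm_mips)

# alpha, b, rho follow Table 46.1; Q, C, D and the ant/iteration counts are assumed.
alpha, b, rho, Q, C, D = 1.0, 3.0, 0.65, 100.0, 10.0, 10.0
n_ants, n_iter = 20, 100

E = task_len[:, None] / vm_mips[None, :]        # E_ij: expected execution times
tau = np.tile(vm_mips / C + vm_bw / D, (n, 1))  # Eq. (46.4): initial pheromone

best_assign, best_cmax = None, np.inf
for it in range(1, n_iter + 1):
    beta = b ** (1.0 / it)                      # Eq. (46.11): decaying heuristic factor
    deposit = np.zeros((n, m))
    vm_runtime = np.zeros(m)
    for _ in range(n_ants):
        load = np.zeros(m)                      # waiting time W_j accumulated on each VM
        assign = np.empty(n, dtype=int)
        for i in range(n):
            c_i = E[i] + load                   # Eq. (46.1): C_ij = E_ij + W_j
            eta = 1.0 / c_i                     # Eq. (46.6)
            p = (tau[i] ** alpha) * (eta ** beta)   # Eq. (46.5); all VMs treated as allowed
            p /= p.sum()
            j = rng.choice(m, p=p)
            assign[i] = j
            deposit[i, j] += Q / c_i[j]         # Eq. (46.8): local pheromone contribution
            load[j] += E[i, j]
        cmax = load.max()                       # Eq. (46.2): makespan of this ant's schedule
        if cmax < best_cmax:
            best_cmax, best_assign = cmax, assign.copy()
        deposit[np.arange(n), assign] += Q / cmax   # Eq. (46.10): global contribution
        vm_runtime += load
    c_avg = vm_runtime.mean()
    delta = 1.0 - (vm_runtime - c_avg) / vm_runtime.sum()  # Eq. (46.12): adjustment factor
    tau = (1.0 - rho) * tau + deposit * delta[None, :]     # Eqs. (46.7)/(46.13) combined

print("Best makespan (fitness, Eq. 46.3):", round(best_cmax, 2))
```

In this toy run the load-balancing factor δ simply damps pheromone deposits on VMs whose accumulated runtime exceeds the average, which is the behaviour described for Eq. (46.12).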
46.4 Results and Discussions
For the implementation of this work, the cloud computing simulation platform CloudSim is used. In order to verify the scheduling efficiency of the algorithm in this paper and the realization of load balancing, the experiment was carried out under the same conditions and environment and compared with the task assignment results of the classic ant colony scheduling algorithm that comes with CloudSim. Related parameters are shown in Table 46.1. In the experiment, the number of tasks is increased from 20 to 100, the number of computing nodes is 6, the number of virtual machines is 10, the number of ants is 100, and the number of iterations is 600. The comparison of the time required by the basic ant colony algorithm and the algorithm in this paper to perform tasks is shown in Fig. 46.2, where ACO means the basic ant colony algorithm, and IACO stands for the algorithm in this paper.
Fig. 46.1 Improved ant colony algorithm process

Table 46.1 Related parameter values
Parameter    Value
α            1
b            3
ρ            0.65
Fig. 46.2 Time required for task scheduling of two algorithms
Fig. 46.3 Relative deviation of task assignment results
It can be seen from Fig. 46.2 that when the number of tasks is small, there is no obvious difference between the two algorithms. However, when the virtual machines face a more severe load, the algorithm in this paper can complete task scheduling in less time; in particular, when the number of tasks increases to 100, the time difference between the two increases to about 2 s. Obviously, the algorithm in this paper has higher scheduling efficiency. In order to observe the task load, the number of tasks allocated on each computing node was obtained and the relative standard deviation of the task allocation of the two algorithms was computed (refer to Fig. 46.3). It can be seen from Fig. 46.3 that when the improved ant colony algorithm is used to perform tasks, the standard deviation of the
task assignment of the computing node is low, which indicates that the load balance of the node is high. Therefore, the algorithm in this paper has achieved a certain effect in load balancing.
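For completeness, the relative standard deviation used as the load-balance indicator in Fig. 46.3 can be computed as below; the per-node task counts shown are hypothetical and are not the experimental values.

```python
# Minimal sketch: relative standard deviation of per-node task counts,
# the load-balance indicator plotted in Fig. 46.3. Counts are illustrative.
import numpy as np

def relative_std(task_counts):
    counts = np.asarray(task_counts, dtype=float)
    return counts.std() / counts.mean()

aco_counts = [22, 9, 25, 14, 18, 12]     # hypothetical tasks per node, basic ACO
iaco_counts = [17, 15, 18, 16, 17, 17]   # hypothetical tasks per node, improved ACO
print(relative_std(aco_counts), relative_std(iaco_counts))   # lower means better balance
```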
46.5 Conclusion
This paper proposes a task scheduling algorithm based on an improved ant colony algorithm for cloud computing to solve the task scheduling problem in virtual machines. Through the improvement of the standard ant colony algorithm, task scheduling time reduction and load balancing are realized while minimizing the overall completion time. The simulation results show that task scheduling with this algorithm not only shortens the task scheduling time, but also realizes the load balancing of cloud nodes.
References 1. Abd Elaziz, M., Xiong, S., Jayasena, K.P.N., et al.: Task scheduling in cloud computing based on hybrid moth search algorithm and differential evolution. Knowl. Based Syst. 169(04), 39–52 (2019) 2. Boveiri, H.R., Khayami, R., Elhoseny, M., et al.: An efficient Swarm-intelligence approach for task scheduling in cloud- based internet of things applications. J. Amb. Intell. Humaniz. Comput. 10(9), 3469–3479 (2019) 3. Chen, W., Wang, D., Li, K.: Multi-user multi-task computation offloading in green mobile edge cloud computing. IEEE Trans. Serv. Comput. 12(5), 726–738 (2019) 4. Guo, S., Liu, J., Yang, Y., et al.: Energy-efficient dynamic computation offloading and cooperative task scheduling in mobile cloud computing. IEEE Trans. Mob. Comput. 18(2), 319–333 (2019) 5. Haidri, R.A., Katti, C.P., Saxena, P.C.: Cost-effective deadline-aware stochastic scheduling strategy for workflow applications on virtual machines in cloud computing. Concurr. CompPract. Exper. 31(7), 1–24 (2019) 6. Hung, P.P., Alam, G., Hai, N., et al.: A dynamic scheduling method for collaborated cloud with thick clients. Int. Arab. J. Inf. Technol. 16(4), 633–643 (2019) 7. Domanal, S.G., Guddeti, R.M.R., Buyya, R.: A hybrid bio-inspired algorithm for scheduling and resource management in cloud environment. IEEE Trans. Serv. Comput. 13(1), 3–15 (2020) 8. Garg, S., Chaurasia, P.K.: Application of genetic algorithms task scheduling in cloud computing. Int. J. Comput. Sci. Eng. 7(6), 782–787 (2019) 9. Karthikeyan, T., Vinothkumar, A., Ramasamy, P.: Priority based scheduling in cloud computing based on task–aware technique. J. Comput. Theor. Nanosci. 16(5), 1942–1946 (2019) 10. Kaur, A., Kaur, B., Singh, D.: Meta-heuristic based framework for workflow load balancing in cloud environment. Int. J. Inf. Technol. 11(1), 119–125 (2019) 11. Gong, X., Liu, Y., Lohse, N., et al.: Energy- and labor-aware production scheduling for industrial demand response using adaptive multiobjective memetic algorithm. IEEE Trans. Industr. Inf. 15(2), 942–953 (2019) 12. Yuan, H.: Application of cloud computing in power industry. J. Inf. Comput. (Theoret. Edn.) 9, 129–130 (2016)
13. Jain, R.: EACO: an enhanced ant colony optimization algorithm for task scheduling in cloud computing. Int. J. Secur. Appl. 13(4), 91–100 (2020) 14. LMarahatta, A., Wang, Y., Zhang, F., et al.: Energy-aware fault-tolerant dynamic task scheduling scheme for virtualized cloud data centers. Mobile Netw. Appl. textbf24(3), 1063–1077 (2019) 15. Meshkati, J., Safi-Esfahani, F.: Energy-aware resource utilization based on particle swarm optimization and artificial bee colony algorithms in cloud computing. J. Supercomput. 75(5), 2455–2496 (2019) 16. CKaur, A., Sood, S.K.: Cloud-fog based framework for drought prediction and forecasting using artificial neural network and genetic algorithm. J. Exp. Theor. Artif. Intell. 32(2), 273– 289 (2020) 17. Vila, S., Guirado, F., Lerida, J.L., et al.: Energy-saving scheduling on laaS HPC cloud environments based on a multi-objective genetic algorithm. J. Supercomput. 75(3), 1483–1495 (2019) 18. Selvakumar, A., Gunasekaran, G.: A novel approach of load balancing and task scheduling using ant colony optimization algorithm. Int. J. Softw. Innov. 7(2), 9–20 (2019) 19. Nayak, S.C., Tripathy, C.: An improved task scheduling mechanism using multi-criteria decision making in cloud computing. Int. J. Inf. Technol. Web. Eng. 14(2), 92–117 (2019) 20. Abdullahi, M., Ngadi, M.A., Abdulhami, D., et al.: Symbiotic organism search optimization based task scheduling in cloud computing environment. J. Future Gener. Comput. Syst. https:// doi.org/10.1016/j.future.2015.08.006 21. Jeyakrishnan, V., Sengottuvelan, P.: A hybrid strategy for resource allocation and load balancing in virtualized data centers using BSO algorithms. J. Wirel. Person. Commun. 94(4), 2363–2375 (2017) 22. Mondal, B., Dasgupta, K., Dutta, P.: Load balancing in cloud computing using stochastic hill climbing-a soft computing approach. J. Procedia Technol. 4 (2012) 23. Guoning, G., Tingiei, H., Shuai, G.: Genetic simulated annealing algorithm for task scheduling based on cloud computing environment. In: IEEE International Conference on Intelligent Computing and Integrated Systems, Oct 22–24, Guilin, China. IEEE Press, Piscataway (2010) 24. Wang, D.K., Li, Z.: Cloud computing task scheduling algorithm based on particle swarm optimization and ant colony optimization. J. Comput. Appl. Softw. 30(1), 290–293 (2013) 25. Wang, L.J.: Task scheduling scheme based on bio-symbiosis mechanism to improve particle swarm optimization in cloud computing. Telecommun. Sci. 32(9), 113–119 (2016)
Chapter 47
Technological Breakthroughs in Dentistry: A Paradigm Shift Towards a Smart Future Anjana Raut, Swati Samantaray, and P. Arun Kumar
Abstract As a part of medical science, dentistry stands at the forefront in the use of newer technologies and advanced materials required to address the need for complex oral environments. Digitalization has become a part of dentistry with the transformation of routine dental practices into more predictable treatment results and patient care. In the new information age, dental professionals too have begun to adopt technologically supported methods to improvise their practice. There is enormous scope for the application of digitalization in dentistry, and the marketing avenues and trends are numerous. The emerging field of digital shift and workflow models, techniques and equipment have changed practice management towards a more promising future. This paper overviews digital developments in dentistry and identifies the gaps for future research. It also addresses the necessity of digitalization in practice, barriers encountered in digital dentistry, and primarily the ethical challenges in adaptation to digitalization.
47.1 Introduction to the Digital Transformation in Dentistry In the early 1990s, the digital health infrastructure of our country was in a rudimentary stage as far as the healthcare industry is concerned. Dentistry also needed an amelioration in standard-based digital systems. With the launch of digital radiography, intraoral scanning tools, and CAD/CAM system, contemporary dental practice shifted to digital dentistry. The Cone Beam Computed Tomography was a breakthrough in the early diagnosis and therapeutic management of craniofacial region. The early 2000s, marks the conglomeration of hardware, software, and materials to A. Raut · P. A. Kumar Kalinga Institute of Dental Sciences, Bhubaneswar, Odisha, India e-mail: [email protected] S. Samantaray (B) School of Humanities, Bhubaneswar, Odisha, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 V. Bhateja et al. (eds.), Smart Intelligent Computing and Applications, Volume 1, Smart Innovation, Systems and Technologies 282, https://doi.org/10.1007/978-981-16-9669-5_47
517
518
A. Raut et al.
accomplish milestones in clinical dentistry. Conventional treatment methods have dominated practice for decades, but with changing technological trends as well as increasing demand for superior aesthetics, longevity, and perfection in the workflow, optoelectronics and robotics, digital dentistry has emerged as a speciality. Different philosophies and approaches as documented in recent past have worked together to integrate multiple engineering applications for oral rehabilitation. The presentday dentistry is called the golden era of dentistry owing to advances in computerbased technologies, improved workflow, precision, minimal treatment time and better quality in rendering service to patients. Broad-spectrum applications of digitalization in dentistry include 3D imaging Cone Beam Computed Tomography, Digital Radiography, Virtual planning and guided surgery, 3D printing and Rapid Prototyping, Intraoral scanning for patient counselling, Digital impression making and laboratory communication, Computer-aided design and computer-aided manufacturing, Digital Smile Designing, Temporomandibular joint (TMJ) analysis and diagnosis, Photography(extraoral and intraoral) and Material science. By virtue of hardware (display) and software (processing) the user experiences a virtual (non-real) environment and is able to interact with virtual world from diverse angles. VR has applications in Finite Element Analysis (FEA) of virtual prototypes, restoration designing and CAD systems. AR comprises of superimposition of virtual and real images for applications like guided maxillofacial surgery, implant placement and mandibular movement analysis (Fig. 47.1). Fig. 47.1 Components of dental digital technology
47.2 Review of Literature
Many research works have been undertaken pertaining to digitalization and its applications in dentistry; however, some of the important scholarly works done in this field are mentioned chronologically (based on VR, AR and CAD): Pradies et al. [1] proposed a novel digital impression method for multiple implants involving the use of stereophotogrammetry (3D camera technology) and concluded that the frameworks manufactured from this technique showed a correct clinical passive fit [1]. Kattadiyil et al. [2] in their study compared clinical treatment outcomes, patient satisfaction, and dental student preferences for digitally and conventionally processed complete dentures in a predoctoral setting [2]; patients reported significantly higher overall satisfaction scores with digital dentures. Vögtlin et al. [3] conducted a study to compare the accuracy of master models based on two intraoral digital scanners and silicone impressions and concluded that the accuracy of the master models obtained from digital scans is clinically sufficient to fabricate dental restorations [3]. Mostafa et al. [4] compared the marginal fit of all-ceramic crowns fabricated with digital impression and manufacturing, digital impression and traditional pressed manufacturing, and traditional impression and manufacturing, and concluded that digital impression and CAD/CAM technology is a suitable, better alternative to traditional impression and manufacturing [4]. Cervino et al. [5] prepared virtual models of the Osstem implant system and studied loading in different directions. Finite element analysis (FEA) was used for stress distribution analysis of implant-prosthetic components, which was again expressed in different colour distributions. This software enabled testing of any material in virtual mode before actually being produced and used clinically; the whole method was based on mathematical logic, and the scope for error was zero [5]. Omar and Duarte [6] studied the reliability of various digital smile design (DSD) software packages. This study included Photoshop CS6, Keynote, Planmeca Romexis Smile Design, CEREC SW 4.2, Aesthetic Digital Smile Design, Smile Designer Pro, DSD App, and VisagiSMile [6]; no significant differences were found. Cervino et al. [7] carried out a systematic analysis of published papers on digital smile design (DSD) software usage. This study reflected that the digital workflow is a reliable and helpful tool in educating patients regarding predicted outcomes before starting the clinical procedure [7]. Kessler et al. [8] published an article giving a practical and scientific overview of the nature, application, advantages, and disadvantages of the different additive procedures (3D printing) in dentistry. Due to the elimination of production restrictions, it is now possible to produce dental work on an industrial level, economically and with increased complexity, on-site [8].
47.3 Objectives For an evidence-based approach, extensive research areas need to be identified and long-term studies must report how digital technology is bridging the existing gap between conventional and digital modes of treatment. Currently, literature available is scarce on digitalization in the field of dentistry due to lack of clinical trials and timely documentation. The objectives of this research paper are to review the necessity of digitalization in practice, barriers and ethical challenges encountered in digital dentistry and the recent applications.
47.4 Discussion Digitalisation has introduced newer technologies like ‘Dental Informatics’ and ‘Cloud Computing’ to facilitate communication, documentation, manufacture and timely delivery of dental treatment and care [9]. It has become easy to store the records of patients which include tabulating clinical data, photographs and radiographs and subsequently planning a comprehensive treatment protocol. Various researches and applications are ongoing to amalgamate these technologies collectively to create improved therapeutic treatment option.
47.4.1 Digitalized Patient Records and Enhanced Quality of Treatment Electronic patient records ensure clear exchange of information between dentists, patients and dental laboratory technicians. Photographic images of intraoral structures and digital radiographs are easy to store and transfer to patients and health care professionals seamlessly through cloud-based dental practice management software. The intraoral scanned images of prepared teeth when viewed in high contrast on a computer screen permits real-time modification to avoid fabrication error. Digital impression systems for crowns and bridges offer accurate fit and precision and save time. However, multiple tooth preparations and multiple implants are challenging to scan accurately [10]. Digital radiology offers information that can be stored for years and makes patient counselling hassle-free. Digital images are safe as they have lower exposure doses, minimal instrumentation, no processing errors, image contrast enhancement, reduced working time and simplified manipulation [11]. Digital file of photographs and models can also be prepared and stored for future reference. Digital models are equally reliable and clinically acceptable and can be used by CAD/CAM systems [12].
47.4.2 Systematic Storage of Individual Patient Data and Enhanced Patient Experience Innovative analytic and design software reduce human errors and cost of storage [13]. The process of digitalization increases the volume of data exponentially in healthcare records, insurance cards and the like; for that reason, analog format of data are easy to use, extract, transfer, or link. Cloud storage is commonly used dental data storage system maintained on monthly fee. Moreover, data superimposition, comparison and multiple treatment options and most predictable treatment outcomes can be determined by the clinician. Improved treatment experience is facilitated by the electronic modes of communication between dental office and patient pertaining to appointments, billing and review reducing the waiting period significantly. Digitalization brings clarity by elevating discussion platform for treatment planning, patient counselling and consent [14]. Single-visit indirect restoration has been made possible because of digital tools and recent software. Compared to conventional elastomeric impressions intraoral scanning offers greater comfort for majority of patients. Rapid prototyping and its application in dentistry require digital data and software. Moreover, it further enhances production by automation and high fidelity giving more precise and accurate fit [15].
47.5 Probable Barriers in Adopting Digital Technology For adopting new digital technologies, the dental experts need to take into account critical appraisal of the relative advantages as compared to conventional methods with regard to clinical outcome, financial constraints, and time spent. The barriers that are recognized while adopting digital technology are [16–18] high installation cost and timely technical support, absence of adequate computer knowledge, financial disadvantages (taxes), high equipment costs, understanding of risk and liability, workflow interruption, security issues, communication and collaboration between professionals, technical support, lack of basic research and regulatory concerns.
47.6 Ethical Challenges in Adaptation to Digital Dentistry Some of the key challenges associated with digitalization in dentistry are.
47.6.1 Data Security and Digital Literacy Digital data can be manipulated easily compared to analog data. It is a challenge to ensure proper storage, sharing and use of the sensitive information and prevent any copying or tempering of original data without the patient’s knowledge. Therefore data security is an ethical issue in case of mismanagement and must comply with federal law. In the contemporary scenario, access to information is increasingly through dental informatics. The operator requires to have good cognitive and technical skills. Therefore, it is essential that the clinician is well trained to use latest digital platforms for patient counselling, diagnosis as well as treatment planning. Digital technology requires constant updates and changes to cater to changing software. Patient needs to give written consent prior to intervention which will be possible provided the technology is fully explained to the beneficiary.
47.6.2 Dental Practitioner-Patient Relationship and Overdiagnosis The latest information and communication technologies enable patients to gather information in plenty, thereby making patient education and counselling more effective. However, many predatory journals and magazines as well as commercial health apps offer invalidated and unverified data that can obscure the users [19]. To keep abreast of new developments dental clinicians must understand informatics so as to maintain competency. Treatment procedure can be explained efficiently to the patient in a virtual reality mode. Overdiagnosis and excessive treatment both are alarming for the patients. The paradigm shift to digitalization has made patient work more precise and comfortable, but it has incurred to added cost. In order to recoup the investment cost the clinician tends to use more frequently digital aid and technology than otherwise indicated.
47.6.3 Frequent Technology Update Protagonists of digitalization support digital technologies for future needs. However, the commercialization of newer devices brings forth frequent developments with additional features or regular updates to make the deal more lucrative. This indeed adds to new growth but escalates cost. Therefore, the sustainability aspect of digitalization is another determining factor [20].
47.7 Digitalization and Global Market The Asia–Pacific digital dentistry market is vast and its growth potential is yet to be explored in the coming years. A recent study has reported high levels of growth through 2024, reaching almost $1.1 billion. Market analysts project a tremendous scope of CAD/CAM systems and related dental materials (as shown in Fig. 47.2). China is constantly growing its digital market and expected to become the biggest hub of digital trends in dentistry. The CAD/CAM system is gradually bringing a paradigm shift in the Indian market and will reach a significant level by 2024 [21]. CAD system assisted smile designing includes intelligent simulation and previsualization of the therapeutic outcome. However, when the patients are not attuned or without inclination towards the technology they need to rely on the decision taken by the dentist [22]. AR-supported navigation systems provide better depth perception intraoperatively but high cost and procedure time limit its wide application [23]. Using digital designed files and subtractive or additive CAD/CAM technologies, the prosthodontics treatment of the elderly patients became shorter, more accurate and greatly predictable [2]. The procedure is beneficial for communicating with the patients, increasing their understanding and acceptance of the proposed treatment. However, the autonomy principle is mandatory to be followed, privacy concerns need to be taken care of, and obtaining informed consent is required for sharing patients’ data with the dental laboratory and colleagues from other specialties.
Fig. 47.2 Transition to digital dentistry in Asia pacific region
47.8 3D Imaging and Digital Radiography Advance imaging methods like magnetic resonance imaging (MRI) and cone beam computed tomography (CBCT) assist to diagnose errors that would have been missed by human eye. This helps to identify organs at risk of head and neck cancer or cystic growth. Temporomandibular disorders (TMD) are also easy to diagnose and assess degree of impairment before appropriate intervention is initiated. From a Digital Imaging and Communication in Medicine (DICOM) file obtained from cone-beam computed tomography available bone density, bone quality, pathologic enlargements and important structures like blood vessels and nerve bundles are critically assessed and radiographic diagnosis obtained. Moreover, digital radiography can identify root canals, vertical root fracture and lesions around the root apex.
47.9 Use of Digital and Virtual Devices Digital tracers are more accurate and eliminate the error possibility while recording maxillomandibular relations. They can be repeated with ease since they are reliable and recordable. Moreover, the digital images can be shared for better communication. Intraoral tracing device is connected to a computer and patient is instructed and trained to make mandibular movements in centric and eccentric position. The sensor records the movement and digital display helps to assess jaw relation recording [24]. The virtual articulator (VR) overcomes the limitations of the conventional articulator by virtually articulating the models both in static and dynamic relationship. Moreover, it helps to appreciate the real maxillomandibular relationship and jaw movements on a virtual screen. This reduces instrument error drastically and prevents a faulty prosthesis. Instrumentation is drastically reduced and patient appointments are further minimized. This saves a lot of laboratory time and material.
47.10 Digital Shade Matching Shade matching is very challenging and has remained an inherently complex task due to inconsistencies in the clinician’s perception of tooth structure shade. Dental professionals experience difficulty in communicating the proper shade to the dental laboratory. The ceramist is tasked to interpret the correct shade information to create highly aesthetic restorations. Colour selection with shade guides is commonly used method but it incorporates error base on operator’s perceiving ability. Digitalization of the shade matching procedure makes the method more standardized and minimizes error. Computerized digital shade guides make the procedure simplified and easy to interpret. Effective digital photography along with adobe photoshop can significantly help in the shade determination process. It is recommended to capture image in
RAW file format with the most colour information. Electronic shade-matching devices such as colorimeters and spectrophotometers have revolutionized shade selection and are becoming widespread compared to conventional methods [25]. They are more reliable and provide quantified shade data that can be reproduced multiple times. Their application in research studies has made optical property analysis more predictable [26].
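Instrumental shade data are usually expressed in CIELAB coordinates, so agreement between a measured tooth shade and a shade-tab reference can be quantified as a colour difference. The sketch below computes the classical ΔE*ab; the sample coordinates and the acceptability threshold quoted in the comment are illustrative assumptions, not values taken from the cited studies.

```python
# Minimal sketch: CIELAB colour difference (Delta E*ab) between a measured shade
# and a reference shade tab. Coordinates and threshold are illustrative only.
import math

def delta_e_ab(lab1, lab2):
    """Classical CIE76 colour difference between two (L*, a*, b*) triples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(lab1, lab2)))

measured = (72.1, 1.8, 18.5)     # hypothetical spectrophotometer reading of a tooth
reference = (73.0, 1.2, 17.6)    # hypothetical shade-tab value

de = delta_e_ab(measured, reference)
print(f"Delta E*ab = {de:.2f}")  # differences below roughly 3 are often treated as clinically acceptable
```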
47.11 Digital Planning and Guided Surgery
The implant prosthesis follows the concept of reverse planning, as the design of the prosthesis is planned prior to surgery and to the choice of implant used [27]. The CT scan images enable a virtual surgery by predicting the optimal position of implants, bone anchorage and the future prosthesis that these implants will receive, based on VR (as shown in Figs. 47.3 and 47.4).
Fig. 47.3 Virtual implant placement
Fig. 47.4 Virtual implant prosthesis
A surgical guide is fundamentally a transfer tool. Its objective is to transfer the diagnosis and planning of both surgical and prosthetic aspects of treatment from the planning stage to the patient during surgery. Being a very meticulous transfer tool, a CAD/CAM surgical guide requires very precise treatment planning. The basis of its conception starts with a well-framed prosthetic design taking into account
the patient's needs and the desired functional and aesthetic outcomes. Then, one must plan the desired implant positions and distribution in accordance with prosthetic planning, bone density, biomechanical factors and availability. The surgical guide will then permit the clinician to perform the surgery with the utmost precision, transferring the design carefully conceived during the planning phase. The planning by the clinician will result in a third data set containing the implant position. The combination of the radiological, implant position and intraoral surface datasets via software will generate the necessary information for constructing the surgical guide. Recently, an in vitro study compared the accuracy of implant placement using a CAD/CAM surgical guide to a conventional placement method. The results showed that the average differences between the planned and actual entry points, in the different directions, lengths and angles of the implants and the osteotomy, were considerably reduced in the CAD/CAM group. Hence, it was concluded that the accuracy of implant placement was improved using an innovative CAD/CAM surgical template [28].
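Accuracy in such studies is usually reported as the linear deviation at the implant entry point and apex plus the angular deviation between the planned and placed implant axes. The sketch below shows one way to compute these metrics from 3D coordinates; the coordinates and the function name are hypothetical and are not taken from the cited study [28].

```python
# Minimal sketch: linear and angular deviation between a planned and a placed implant,
# the metrics typically reported in guided-surgery accuracy studies. Data are hypothetical.
import numpy as np

def implant_deviation(planned_entry, planned_apex, actual_entry, actual_apex):
    planned_entry, planned_apex = np.asarray(planned_entry), np.asarray(planned_apex)
    actual_entry, actual_apex = np.asarray(actual_entry), np.asarray(actual_apex)
    entry_dev = np.linalg.norm(actual_entry - planned_entry)     # mm at the entry point
    apex_dev = np.linalg.norm(actual_apex - planned_apex)        # mm at the apex
    v1 = planned_apex - planned_entry                            # planned implant axis
    v2 = actual_apex - actual_entry                              # placed implant axis
    cos_a = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    angle = np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))     # angular deviation in degrees
    return entry_dev, apex_dev, angle

# Hypothetical coordinates in mm (e.g. digitised from pre- and post-operative CBCT):
print(implant_deviation((0, 0, 0), (0, 0, 11), (0.4, 0.2, 0.1), (0.9, 0.5, 11.2)))
```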
47.12 Material Science and Digital Technology The three basic restorative materials—ceramics, metal and polymers have now moved into high technology phase of development and structural modifications. In the past decade, the demand for non-metallic biocompatible dental restorative material has increased. Digital systems like CAD/CAM technologies have been integrated to automate fabrication of teeth alike restorations. Dental ceramics have evolved from conventional firing methods to CAD-produced and 3D-printed substitutes. They can be electronically designed and precision milled or additively layered within few minutes and delivered on the same day. CAD/CAM system together with zirconia all-ceramic has enabled fabrication of void-free restorations without firing shrinkage and excellent adaptation. They are the most logical materials to be used with dental implants. Direct metal laser sintering and micromachining have increased demand for high-strength components for dental application. Titanium and its alloys are the most inert, strong and biocompatible metals desirable for intraoral use. 3D printed resins are the next-generation polymers that are being researched and explored (as shown in Fig. 47.5). Additive manufacturing (AM) is a revolutionary technology
Fig. 47.5 3D printed produced restoration
and facilitates production of complex structures. There is a reduction of production-related material loss. It bridges the gap between conventional and digital procedures by overcoming the limited programming of available systems and reducing laboratory fabrication time.
47.13 Forensic Dentistry and Medico-Legal Considerations Dental informatics is highly useful in retrieving data of deceased, missing or antisocial subjects for their identification and tracking. It aids access to important health information like medications, allergies, systemic illness and the like. Digital approach in delivering care has enabled the patient and the operator to view the treatment outcome before they start. This encourages better understanding and realistic expectations from the service imparted by the dental professionals. Therefore, medico-legal complexities can be avoided by both parties as communication is better established. Digital Smile Design is an excellent tool to predict final results from an aesthetic rehabilitation. Another medico-legal aspect revolves around development of newer materials pertaining to their safe testing on patients, quality assurance and minimal associated risk.
47.14 Conclusion and Future Scope of Digitalization in Dental Practice Future developments in dentistry must aim at optimizing surface quality and increasing process reliability and property gradients within the materials at lower costs and with shorter production times. There is a vast scope to explore. Robotics and Artificial Intelligence are the next generation tools as they lead us towards a smarter future. They can assist the dental professional in making decisions and offering predictable treatment and also help patient understand disease and its prognosis in greater depth and dimension. Digitalization is gradually overpowering conventional dental practice in a way to revolutionize the quotidian methods and practice. Technological advancements are a continuous process of evolution for sustainability. The best-performing technology is finally decided by adaptation and understanding of reality.
References 1. Pradíes, G., Ferreiroa, A., Özcan, M., Giménez, B., Martínez-Rus, F.: Using stereophotogrammetric technology for obtaining intraoral digital impressions of implants. J. Am. Dent. Assoc. 145(4), 338–344 (2014). https://doi.org/10.14219/jada.2013.45
2. Kattadiyil, M.T., Jekki, R., Goodacre, C.J., Baba, N.Z.: Comparison of treatment outcomes in digital and conventional complete removable dental prosthesis fabrications in a predoctoral setting. J. Prosthet. Dent. 114(6), 818–825 (2015). https://doi.org/10.1016/j.prosdent.2015. 08.001 3. Vögtlin, C., Schulz, G., Jäger, K., Müller, B.: Comparing the accuracy of master models based on digital intra-oral scanners with conventional plaster casts. Phys. Med. 1(1), 20–26 (2016). https://doi.org/10.1016/j.phmed.2016.04.002 4. Mostafa, N.Z., Ruse, N.D., Ford, N.L., Carvalho, R.M., Wyatt, C.C.: Marginal fit of lithium disilicate crowns fabricated using conventional and digital methodology: a three dimensional analysis. J. Prosthodont. 27(2), 145–152 (2018).https://doi.org/10.1111/jopr.12656 5. Cervino, G., Romeo, U., Lauritano, F., Bramanti, E., Fiorillo, L., D’Amico, C., Milone, D., Laino, L., Campolongo, F., Rapisarda, S., et al.: Fem and von mises analysis of OSSTEM® dental implant structural components: Evaluation of different direction dynamic loads. Open Dent. J. 12, 219–229 (2018) 6. Omar, D., Duarte, C.: The application of parameters for comprehensive smile esthetics by digital smile design programs: a review of literature. Saudi Dent. J. 30(1), 7–12 (2018). https:// doi.org/10.1016/j.sdentj.2017.09.001 7. Cervino, G., Fiorillo, L., Arzukanyan, A.V., Spagnuolo, G., Cicciù, M.: Dental restorative digital workflow: digital smile design from aesthetic to function. Dent. J. (Basel). 7(2), 30 (2019). https://doi.org/10.3390/dj7020030 8. Kessler, A., Hickel, R., Reymus, M.: 3D Printing in dentistry—State of the art. Oper. Dent. 45(1), 30–40 (2020) 9. Jain, P., Gupta, M.: Digitization in Dentistry. Springer (2021). 10. Fasbinder, D.J.: Computerized technology for restorative dentistry. Am. J. Dent. 26(3), 115–120 (2013) 11. Ertas, E.T., Küçükyılmaz, E., Erta¸s, H., Sava¸s, S., Atıcı, M.Y.: A comparative study of different radiographic methods for detecting occlusal caries lesions. Caries Res. 48(6), 566–574 (2014). https://doi.org/10.1159/000357596 12. Czarnota, J., Hey, J., Fuhrmann, R.: Measurements using orthodontic analysis software on digital models obtained by 3D scans of plaster casts. J. Orofac. Orthop. 77(1), 22–30 (2016). https://doi.org/10.1007/s00056-015-0004-2 13. Dinkova, M., Yordanova, G., Dzhonev, I.: 3D archive in dental practice–A technology of new generation. IJSR. 3(11), 1574–1576 (2014) 14. Ahlholm, P., Sipilä, K., Vallittu, P., Jakonen, M., Kotiranta, U.: Digital versus conventional impressions in fixed prosthodontics: a review. J. Prosthodont. 27(1), 35–41 (2018). https://doi. org/10.1111/jopr.12527 15. Agbaje, J.O., Jacobs, R., Maes, F., Michiels, K., Van Steenberghe, D.: Volumetric analysis of extraction sockets using cone beam computed tomography: a pilot study on ex vivo jaw bone. J. Clin. Periodontol. 34(11), 985–990 (2007). https://doi.org/10.1111/j.1600-051X.2007.011 34.x 16. Joda, T., Katsoulis, J., Brägger, U.: Clinical fitting and adjustment time for implant-supported crowns comparing digital and conventional workflows. Clin. Implant. Dent. Relat. Res. 18(5), 946–954 (2016). https://doi.org/10.1111/cid.12377 17. Chochlidakis, K.M., Papaspyridakos, P., Geminiani, A., Chen, C.J., Feng, I.J., Ercoli, C.: Digital versus conventional impressions for fixed prosthodontics: a systematic review and meta-analysis. J. Prosthet. Dent. 116(2), 184–190 (2016). https://doi.org/10.1016/j.prosdent. 2015.12.017 18. 
Calberson, F.L., Hommez, G.M., De Moor, R.J.: Fraudulent use of digital radiography: methods to detect and protect digital radiographs. J. Endod. 34(5), 530–536 (2008). https://doi.org/10. 1016/j.joen.2008.01.019 19. Gani, F., Evans, W.G., Harryparsad, A., Sykes, L.M.: Social media and dentistry: Part 8: ethical, legal, and professional concerns with the use of internet sites by health care professionals. South Afr. Dent. J. 72(3):132–137 (2017). https://hdl.handle.net/10520/EJC-7e9d22af0
20. Rolston III H.: A New Environmental Ethics: The Next Millennium for Life on Earth. Routledge (2012). https://doi.org/10.4324/9780203804339 21. Asia pacific digital dentistry market report (2014–2024). Market size, share, price, trend and forecast. idataresearch.com 22. Gross, D., Gross, K., Wilhelmy, S.: Digitalization in dentistry: ethical challenges and implications. Quintessence Int. 50(10) (2019). https://doi.org/10.3290/j.qi.a43151 23. Al-Khaled, I., Al-Khaled, A., Abutayyem, H.: Augmented reality in dentistry: uses and applications in the digital era. Edelweiss Appli. Sci. Tech. 5, 25–32 (2021) 24. Gupta, M.: A comparative clinic-radiographic analysis of horizontal condylar guidance determined by height tracer, novel indigenous intraoral digi tracer and check bite in complete denture prosthesis. Int. J. Curr. Res. 9, 49940–49946 (2017) 25. Chu, S.J., Trushkowsky, R.D., Paravina, R.D.: Dental color matching instruments and systems. Rev. Clin. Res. Aspects. Dent. J. (Basel). 38:e2–e16 (2010). https://doi.org/10.1016/j.jdent. 2010.07.001 26. Paravina, R.D.: Performance assessment of dental shade guides. Dent. J. (Basel) 1(37), e15-20 (2009). https://doi.org/10.1016/j.jdent.2009.02.005 27. Ramasamy, M., Giri, R.R., Subramonian, K., Narendrakumar, R.: Implant surgical guides: from the past to the present. J. Pharm Bioallied Sci. 5(Suppl 1), S98. 10.4103%2F0975-7406.113306 28. Nokar, S., Moslehifard, E., Bahman, T., Bayanzadeh, M., Nasirpouri, F., Nokar, A.: Accuracy of implant placement using a CAD/CAM surgical guide: an in vitro study. Int. J. Oral Maxillofac Implants. 26(3) (2011). 10.5037%2Fjomr.2018.9101
Chapter 48
A Study on Smart Agriculture Using Various Sensors and Agrobot: A Case Study Shraban Kumar Apat, Jyotirmaya Mishra, K. Srujan Raju, and Neelamadhab Padhy Abstract The agricultural sector is rising as a high-tech industry, attracting young workers, new enterprises, and buyers. The technology is increasingly evolving, improving farmers’ processing capability, and progressing robotics and automation technology as we know it. The introduction of sensors and agrobot has sparked a new path in agricultural and farming science. Smart agriculture is an innovative theory about the technologies since different sensors provide information on agricultural fields. Most agriculture these days is fully automated with programmed autonomous robots. We addressed a seed sowing agrobot in our article. It is intended primarily to ease farmers’ work. In this article, we have used the decision table and cluster approach. The main aim is to limit the work of farmers. The rudimentary power of soil seeds and soil covering is illustrated by frameworks such as the temperature, moisture, and even movement of animals, giving results in agriculture. With the IoT application, AI sensors will track the crop field and provide farmers with preventive measures to alert them of any errors via SMS. This paper has developed an IoT system to use arms, performance, and analysis to control crop development agrobot seeds.
S. K. Apat (B) School of Engineering and Technology (Department of CSE), GIET University, Gunupur, Odisha, India e-mail: [email protected] J. Mishra · N. Padhy School of Engineering and Technology, Dept of CSE, GIET University, Gunupur, Odisha, India e-mail: [email protected] N. Padhy e-mail: [email protected] K. S. Raju CMR Technical Campus, Hyderabad, Telangana, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 V. Bhateja et al. (eds.), Smart Intelligent Computing and Applications, Volume 1, Smart Innovation, Systems and Technologies 282, https://doi.org/10.1007/978-981-16-9669-5_48
48.1 Introduction Substantive innovations have been made throughout human history to increase agricultural yield with less effort and resources. Nonetheless, during all these times, the high population rates never match demand and supply. The forecast figures suggest that the world’s population will reach around 9.8 billion people in 2050, roughly 25 percent higher than the current figure. Nearly the entire population growth in the developing countries listed is expected to occur. On the other hand, the urbanization trend is expected to continue to intensify, with about 70% of the world’s population anticipated to be urban by 2050 (now 49%). In addition, the levels of income are numerous, contributing to more food demand in the developed world in particular. Agricultural robotics automate long, routine, and boring operations, encouraging farmers to concentrate more on growing average productivity. Some of the most popular robotics in agriculture are used: cultivation, planting, pest and disease detection, weed control, autonomous mowing, pruning, seeding, spritzing and dilution, phénotyping, triage and packaging, cultivation, and choice. Owing to their accuracy, these are the essential robotic applications in agriculture. In our proposed paper, we have used an agrobot and have tracked crop growth on the agricultural side by the IoT unit.
48.2 Literature Survey It looked at the differences in WSNs and their potential to promote different improvements in agricultural applications. It optionally examines the suitability of WSNs for increased efficiency and profitability for most applications in agriculture and agriculture. It covers system design, node design, and agricultural communication technology standards. Nodes can be found in real wireless devices and various sensors such as soil, air, pH, and plants. It provided a comprehensive overview of sophisticated farm WSN applications. It contains details regarding the WSN system, node architectures, related components, and a unique application classification. It lists all of the wireless equipment nodes that are present, as well as the various methods for communicating with the sensors. Exactness agriculture uses WMSN to alter good irrigation. It is particularly in inexperienced house atmospheres in IoT and WMSN applications in agriculture. The power of feedback handling techniques in inexperienced household irrigation is often explained and unquestionably. A diode check to visualize these two methods was a semiconductor. The irrigation unit techniques are mainly irrigated on a timeline or with feedback. The planned irrigation shall provide the plant with water at a specific time. The primary basis for irrigation feedback should be irrigated when wet conditions or medium levels are predetermined. The check shows that the average saving per tree is one, 500 mL per day. During the WSN or WMSN greenhouse atmosphere, the test shows that the most effective contrast to scheduled irrigation
lies between an in-depth bowl system or associated automatic water irrigation. Automated irrigation will improve the use of water and fertilizer and further preserve the condition or humidity of the field in the same way that the expert indicated. The wireless sensors network (WSN) mainly uses call support systems. This system solves several problems. Exactness in agriculture is the main attention-grabbing field, along with a growing need for call assistance systems. Agriculture is often connected to IoT by device networks, which helps US agriculturists, farmers, and crops to establish relations between farmers as an alternative to geographical variations. The benefit of this scheme gives a period for farmers to make reasonable choices on lands and crops. Agriculture (PA) might minimize water fertilizer use, although increasing crop production and increased output may improve the industry’s weather inquiry. Various benefits include applying the WSN in precision. It isolates the administration of the entire structure. This can make it easier for naive users to reach and understand the results. Even the farmer keeps getting the notifications for almost every connected event occurring in the field. The computer network structure that allows agriculture to be linked to the IoT is focused in particular. The relationship establishes connexions between agronomists, and farms and enhances the assembly of agricultural products. It is an extensive system to achieve agricultural precision. WSN is another economic approach to improve and depend on agricultural resources. Precision farming systems that are hit by net of things (IoT) technologies are especially well-illustrated on the hardware and network architecture of the irrigation system’s software package management method. The machine gathers data from sensors and tracks data on a circuit that activates predetermined threshold values assisted by management gadgets. Weeding, watering, signing rain, treating birds and cattle, tracking, and following up are examples of adaptive and responsive GPSbased remote operating systems. It also includes sensitive irrigation, with receptive management and smart decision-making in real-time data. Finally, the warehouse organization provides temperature regulation, moisture conservation, and theft detection in the warehouse. These operations govern every remote responsive gizmo or laptop connected to the Internet and the function of connecting sensors, Wi-Fi or angularity units, camera, microcontroller, and Raspberry Pi through the exploitation. Agricultural measurement temperature and wetness by using single-chipped sensors CC3200. To collect images and send images via the mms to mobile phone wireless farmers, the CC3200 interfaces with the camera.CC3200 victimization is mainly based on microcontrollers, Network processors, and Wi-Fi systems. This is a compact, low-cost, safe, and quick association for battery-operated systems. Baz-Lomba et al. [1] proposed a model for selecting a particular crop that can be grown in a particular soil. The dataset includes the soil parameters like nitrogen, phosphorus, and potassium (N, P, K) and pH value of the soil. It is implemented using machine learning algorithm, i.e., K-nearest neighbor with cross-validation having obtained an accuracy of 88% to predict the type of crop that can be grown. In the future, this prediction can be made by using IoT to obtain real-time soil values. Baz-Lomba et al. 
[2] suggested a model of the cultivable in a particular soil using a Naive-Bayes algorithm, which contains two site suitability datasets: soil and crop
requirement datasets specifying the conditions required to grow the soil and crop requirement datasets. The machine learning algorithm was found to give an accuracy of 60%, which gave a successful result. In the future, we can add more data of the different locations of the country so that it helps the farmer of any location predict the type of crop that can grow. Shahhosseini et al. [3] suggested a model that uses a hybrid approach (crop modeling + ML) to predict crop yield by considering five ML models, i.e., linear regression, LASSO, LightGBM, random forest, XGBoost, and six ensemble models. Here, agricultural production systems simulator (APSIM) is used as input to ML models. The dataset was from USDA National Agricultural Statistics Service from the year 1984- 2018. This model increased the crop yield prediction by 8%-9%. In the future, we can add more data to the dataset, further improving crop yield prediction. Paudel et al. [4] proposed a workflow on how machine learning algorithms help forecast crop yield in multiple locations. Four machine learning algorithms, i.e., Ridge regression, K-NN support vector regression, and gradient boosted, decision tree are used to assess and compare among the state of the art of the other techniques. It with the prediction of the “null” technique. This model uses MARS Crop Yield Forecasting System (MCYFS) data from the year 1993–2015. The workflow showed a better accuracy of prediction than the null method. New data sources can be added in the future, and more algorithms can be applied for more accurate predictions. Cao et al. [5] integrated data from multiple sources and used machine learning methods, i.e., least absolute shrinkage and selection operator (LASSO), random forest (RF), and deep learning method, i.e., long short-term memory networks (LSTM), to predict rice yield across China. The dataset for this study includes the satellite data, climate data, and soil properties data from the year 2001–2015. The LSTM model performed better than the other two, with the highest R2 value of 0.85/0.86/0.82 and the lowest RMS E values of 357.59/347.80/614.92 kg/ha early rice, late rice, and single rice, respectively. The LSTM model also showed the highest RMSE and lowest R2 at the “early” and “peak” stages. The study concluded that ML and DL methods showed better yield prediction than traditional yield prediction methods. Further improvements can be made by using different crop prediction models and including more farming data for more accurate predictions. Abbas et al. [6] proposed a study where different machine learning algorithms, i.e., linear regression (LR), elastic net (EN), k-nearest neighbor (k-NN), and support vector regression (SVR), were implemented to predict potato tuber yield from six fields in Atlantic Canada. The data of soil and crop properties were collected through proximal sensing for 2017 and 2018. Four datasets were formed, i.e., PE-2017, PE-2018, NB-2017, NB-2018, by combining data points from the studied fields. The study showed that SVR performed better with RMSE of 5.97, 4.62, 6.60, and 6.17 t/ha for the four datasets. In the future, large datasets can be used to make more accurate predictions using different machine learning models.
Fig. 48.1 Seed showing agrobot
48.3 Proposed System The AI sensors play an important part in data sensing and provide data to ARM controllers with overall agricultural performance. Wireless sensors with high accuracy are used for the early checking and detection of unwanted seeds. Our paper’s AI sensor in smart agriculture better understands agricultural improvements, increased productivity, and more precision and general efficiency. AI is the most serious difficulty as the model has to reproduce the parameters found in the data. To solve problems in agriculture, parametric calculations in AI can be very useful. In our work, different IoT sensors have been used to detect and respond to certain physical entertainment by using this type of optical, electrochemical, and moisture sensor Fig. 48.1. It is also found in existing packages of time.
48.4 Design Process of Seed Sowing Agrobot With the assistance of the ATMEGA8-16PU controller, the agricultural robot can be built. The 600Mah 12 V DC power battery is linked to the L293D power supply unit, similar to the 12 V 30 RPM power supply, connected to the ATMEGA2560 versions LP3 and LP4. Figure 48.1 indicates that the chambers 1 and 2 middle elements may select the location and shift it. Therefore, the combined rotor body of each chamber is usable. Mechanism: Both the body component’s configuration and characteristics are considered. Part 2 is the seeds rotor body. The rotor body, the third section of the mid-van, includes four blades, which the blade can raise from the tank and push at
the outlet, in compliance with the time and place required. In CAD programming, it has developed base features, robot body, and mid-car. The base component has a void over the seed component, a distance long absent from the case. The rotor body has a stretch cylinder, which can fall as the outlet for the frame shown in Fig. 48.1. The same interface is connected on either side of the seeds to be reduced accordingly. The V ATMEGA microcontroller fireboard drives the device. The composition of the natural clay and the moisture content of the soil is calculated.
48.5 Proposed Model Used A. Optical Sensors Optical sensors essentially measure and record crop and soil data in real time with the light reflection shined on growing crops. As a result, applicators will advise them to use less nitrogen in healthy plants and more nitrogen in healthier, harmful plants. The soil was measured with natural clay and moisture content. B. Electrochemical Sensors It is used in the production of processes, diagrams, and soil chemical records. The PH is obtained utilizing the ions used for these electrodes, which gives the sensation that these ions’ behavior tinges nitrate, potassium, or hydrogen tinges. This sensor provides the status required for precisión agriculture, soil nutrient, and pH. Soil testing for the formulated fertilization is to decide the number of soil nutrients observed using a specific recommendation on nutrient requirements and location of fertilization. C. HTE MIX Sensors It is used to determine both soil and environmental moisture content and temperature. It is a common environment parameter that regularly occurs and is very critical in several areas for its manipulation. It is an electric capacity-type sensor that meets the cell of a smart soil moisture sensor. D. Motion Detector Sensors Motion sensors are utilized all over the field. When the records take place around the camp, those sensors can perform a server-to-server operation and then transmit a message for each other tool after data processing, which is furthermore operating within the limits of the farm. This gadget can also be used to produce noise to remove animals from harmful crops or plants. The below mentioned Fig. 48.1 shows seed showing agrobot.
48.6 ARM Cortex-M Processor The sensors provide plant color information and soil reflections for every minute connected by digital pin port 1 with the arm Cortex-M processor in AI Optical Sensor Degree Residencies and varied light wavelengths reflecting near-infrared, midinfrared, and polarized light spectrum. The START read sensor data is distributed, and the electric chemical sensor drives a motor via a circuit with a specific number of transistors. The microcontroller calculates these devastating factors by stretching past 10 RH percent. By using rims (1RH percent to 100RH percent), process map and ground chemical information associated with the use of digital pinport4 can generally be obtained. This sensor contains information needed for agriculture with precision, soil nutrients, and pH. High-end linearity, low use of resources, large estimation extended quick responses to pollution, extreme sensitivity, elite performance percentage highlight AI sensors. The HTE MIX sensor serves to determine both the soil moisture content and the ambient temperature. Motion sensors can be connected to pin 5 to detect an unexpected moment that reaches the sphere and produces an umbrella noise. The results are ship to the cloud, in which, after data analysis, an action may be conducted. It gives a message to a separate instrument which is also exercised from the farm limits. It is used when it listens to sounds, makes noise, and expels animals. The IoT is the interconnected networking system that allows all objects to communicate to collect and share information. The module can also be adapted to operate as an independent Wi-Fi Internet access. A solar power panel 3 V supplying electricity to the ARM processor is provided in the module.
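To make the data flow concrete, a minimal Python-style sketch of the sensing-and-alerting cycle outlined in Sects. 48.5 and 48.6 is given below. The stubbed sensor reads, the cloud and SMS helper functions and the sampling period are illustrative assumptions for this sketch and are not taken from the actual firmware of the agrobot.

import random, time

MOISTURE_THRESHOLD_RH = 10.0  # alert below 10 %RH, the threshold mentioned in Sect. 48.6

# Stub sensor reads; on the agrobot these values come from the optical,
# electrochemical, HTE-MIX and motion sensors wired to the controller pins.
def read_moisture_rh():
    return random.uniform(5, 60)

def read_temperature_c():
    return random.uniform(15, 45)

def read_soil_ph():
    return random.uniform(5.5, 8.0)

def motion_detected():
    return random.random() < 0.05

def push_to_cloud(sample):        # placeholder for the cloud-based IoT upload (Sect. 48.7)
    print("cloud <-", sample)

def send_sms(message):            # placeholder for the GSM alert (Sect. 48.8)
    print("SMS ->", message)

def control_cycle():
    sample = {"moisture_rh": read_moisture_rh(),
              "temperature_c": read_temperature_c(),
              "soil_ph": read_soil_ph(),
              "motion": motion_detected()}
    push_to_cloud(sample)
    if sample["moisture_rh"] < MOISTURE_THRESHOLD_RH:
        send_sms("Low soil moisture: irrigation recommended")
    if sample["motion"]:
        send_sms("Movement detected near the crop field")

if __name__ == "__main__":
    for _ in range(3):            # a few cycles for demonstration
        control_cycle()
        time.sleep(1)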
48.7 Analysis and Implementation Data is collected in the cloud-based IoT, and the crop analysis output is used to increase farmers’ yield. It replaces the conventional IOT-based AI sensors with cloud connectivity features such as field visualization, access to data stock from anywhere, live tracking, and end-to-end communication. This cloud-based IoT increases the productivity of soil and water, nutrient quality, and chemical products. The processing of environmental information that eliminates crop injury can be done with high precision. IoT-based farming that produces on time lowers labor costs to crop quality and increases farmers’ output.
48.8 Message Sending to Farmers The AI sensor can continuously monitor crop growth, transmit it through cloud-based IoTs, and collect data to process and analyze crops. This sensed data is saved from all sensors, and performance analysis is performed in the cloud. The data is then sent to the farmers by SMS by using the GSM module.
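A hedged sketch of this SMS step, using a SIM800-class GSM module driven over a serial link with standard AT commands, is shown below. The serial port name, baud rate and phone number are placeholders, and the use of pyserial on the host side is an assumption rather than a detail of the original system.

import time
import serial  # pyserial

def send_sms(port, phone_number, message):
    # AT+CMGF=1 selects text mode; AT+CMGS starts a message; Ctrl+Z (0x1A) sends it.
    with serial.Serial(port, 9600, timeout=5) as gsm:
        gsm.write(b"AT+CMGF=1\r")
        time.sleep(0.5)
        gsm.write(f'AT+CMGS="{phone_number}"\r'.encode())
        time.sleep(0.5)
        gsm.write(message.encode() + b"\x1a")
        time.sleep(3)
        return gsm.read_all().decode(errors="ignore")  # raw modem response

if __name__ == "__main__":
    print(send_sms("/dev/ttyUSB0", "+911234567890",
                   "Low soil moisture detected in the crop field"))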
48.9 Result
The agrobot helps to demonstrate temperature changes and track plant growth in sensor-based smart agriculture. Sensed data is stored in the cloud, its analysis is performed there, and the results are delivered to farmers on their mobile phones; changes in temperature influence plant growth and productivity. We collected the dataset from the different sensors and applied classifier rules to it. The dataset contains 295 instances with eight attributes.
48.9.1 Decision Table
In this section, we build the decision table model, applying it to the dataset of 295 records. All 295 instances are used for training, and 41 rules are produced; non-matches are covered by the majority class.
48.9.2 Evaluation on Training Set
Table 48.1 summarizes the evaluation of the decision table model in terms of the different performance parameters, and in Fig. 48.2 the training set is evaluated with respect to the root relative squared error, RMSE, mean absolute error and correlation coefficient. For the cluster analysis we have taken 100 instances and 11 attributes; these attributes are divided into ten classes, a0 to a9. The number of clusters is selected by cross-validation, but we analyze only two of them here. Table 48.2a shows the cluster ratio of 66:34 and hence describes the cluster analysis, while Table 48.2b, c shows the true/false distributions for a0 and a1 across the two clusters. Table 48.2d shows the clustered instances, with 68% of the instances in cluster 0 and 32% in cluster 1. In this way ten classes are formed, and the time taken to build the model on the full training data is 0.34 s. Finally, the model is built and evaluated on the training set.
Table 48.1 Decision table analysis

Correlation coefficient        0.6864
Mean absolute error            0.2768
Root mean squared error        0.3841
Root relative squared error    72.7195%
Total no. of instances         295
Fig. 48.2 Evaluation of the training set
Table 48.2 Cluster analysis

(a) Cluster
Attribute    0          1
             (0.66)     (0.34)

(b) Cluster: a0
False        30.2327    17.7673
True         37.8843    18.1157
[Total]      68.117     35.883

(c) Cluster: a1
False        32.2914    20.7086
True         35.8256    18.1157
[Total]      68.117     35.883

(d) Clustered instances
0    68 (68%)
1    32 (32%)
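The figures in Tables 48.1 and 48.2 correspond to a Weka-style evaluation. As a rough illustration only, the scikit-learn sketch below computes the same kinds of metrics and a two-component clustering on synthetic data of the same shape; the synthetic data and the choice of a tree-based stand-in for the decision table are assumptions, since the original 295-instance sensor dataset is not reproduced here.

import numpy as np
from sklearn.model_selection import cross_val_predict
from sklearn.tree import DecisionTreeRegressor
from sklearn.mixture import GaussianMixture
from sklearn.metrics import mean_absolute_error, mean_squared_error

rng = np.random.default_rng(0)
X = rng.normal(size=(295, 8))                            # 295 instances, 8 attributes
y = 0.7 * X[:, 0] + rng.normal(scale=0.4, size=295)      # synthetic target

# Rule/tree-based regressor standing in for the decision table model
pred = cross_val_predict(DecisionTreeRegressor(max_depth=4), X, y, cv=10)
print("Correlation coefficient :", np.corrcoef(y, pred)[0, 1])
print("Mean absolute error     :", mean_absolute_error(y, pred))
print("Root mean squared error :", mean_squared_error(y, pred) ** 0.5)

# Two-component clustering, analogous to the 0.66/0.34 split of Table 48.2
labels = GaussianMixture(n_components=2, random_state=0).fit_predict(X)
for k in range(2):
    print(f"cluster {k}: {(labels == k).mean():.0%} of instances")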
48.10 Conclusion and Future Scope So, this study confirmed the potential effectiveness of integrating AI algorithms into a decision-making system that implements precision farming while improving yields of crops, agriculture in tomorrow’s future. This must be developed into complete agro
technologies with artificial intelligence, deep knowledge, and massive data systems, integrating the end system into one unit for seeding to be handled in the production forecast utilizing current technology like robotics to usher in a new era. Agrobot can be introduced by planting the soil seeds to increase average crop production considering the related criteria such as atmospheric conditions, humidity, and temperature. Based on the environment of this specific area, humidity may be regulated. This smart agricultural IoT deployment can increase crop quality. This can be humiliated by expanding the frame into the end definition via SMS, which explicitly encourages farmers to use their flexible GSM package instead of the transportable device. This process can be modified over time and manual power reduction.
References 1. Baz-, J.A., Salvatore, S., Gracia-Lor, E., Bade, R., Castiglioni, S., Castrignanò, E., Thomas, K., Causanilles, K., Hernandez, F., Kasprz-Hordern, B., Kinyua, J., McCall, A.K., Thomas, K.: Comparison of pharmaceutical, illicit drug, alcohol, nicotine, and caffeine levels in wastewater with sale, seizure, and consumption data for 8 European cities. BMC Public Health 16(1), 1–11 (2016) 2. Reddy, D.J., Kumar, M.R.: Crop yield prediction using machine learning algorithm. In: 2021 5th International Conference on Intelligent Computing and Control Systems (ICICCS), pp. 1466– 1470. IEEE (2021) 3. Shahhosseini, M., Hu, G., Huber, I., Archontoulis, S.V.: Coupling machine learning and crop modeling improve crop yield prediction in the US Corn Belt. Sci. Rep. 11(1), 1–15 (2021) 4. Paudel, D., Boogaard, H., de Wit, A., Janssen, S., Osinga, S., Pylianidis, C., Athanasiadis, I.N.: Machine learning for large-scale crop yield forecasting. Agric. Syst. 187, 103016 (2021) 5. Cao, J., Zhang, Z., Tao, F., Zhang, L., Luo, Y., Zhang, J., Xie, J.: Integrating multi-source data for rice yield prediction across china using machine learning and deep learning approaches. Agric. Forest Meteorol. 297, 108275 (2021) 6. Abbas, F., Afzaal, H., Farooque, A.A., Tang, S.: Crop yield prediction through proximal sensing and machine learning algorithms. Agronomy 10(7), 1046 (2020)
Chapter 49
Task Scheduling in Cloud Computing Using PSO Algorithm Sriperambuduri Vinay Kumar, M. Nagaratna, and Lakshmi Harika Marrivada
Abstract The performance of the cloud is largely determined by resource allocation strategies. In cloud computing, service providers have to supply their services to a huge number of customers. As a result, one of the most difficult aspects of cloud computing is assigning cloudlets/tasks to appropriate virtual machines. Different approaches can be used to allocate tasks onto virtual machines. In this paper, we offer a method based on the particle swarm optimization technique for assigning tasks to virtual machines. The primary goal is to reduce the total time of tasks. The proposed method is tested using the cloud simulator Cloudsim-3.03. The existing scheduling methods, First Come First Serve (FCFS) scheduling, and the Round Robin, Min-Min scheduling policy were used for comparison of simulated results from the proposed algorithm on scientific workflows. Overall execution time and makespan time (the time it takes for the last cloudlet to finish) of the proposed approach outperforms the other algorithms.
49.1 Introduction Cloud computing is a type of web computing where On-demand services are provided based on the needs of the customer. The services through which cloud computing works, are as follows: PaaS, IaaS, and SaaS [1]. This cloud computing framework is designed on a pay-per-use basis. In terms of customer service, cloud computing provides mobility, scalability, and flexibility. Cloudlets are the request for resources made by users in the cloud. With the resources available, these cloudlets must be managed properly. As a result, the primary objective in cloud computing is always to use systematic techniques to assign virtual machines to tasks [2]. The cloud S. V. Kumar (B) · L. H. Marrivada Vasavi College of Engineering, Hyderabad, India e-mail: [email protected] M. Nagaratna JNTUH College of Engineering, Hyderabad, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 V. Bhateja et al. (eds.), Smart Intelligent Computing and Applications, Volume 1, Smart Innovation, Systems and Technologies 282, https://doi.org/10.1007/978-981-16-9669-5_49
Fig. 49.1 Depicts the Cloudlet allocation as a graphic depiction in the cloud
computing environment serves as a primary source for implementation of scheduled algorithms. Some fundamental cloud entities are Datacenter, Host, Tasks or Cloudlets, Virtual Machines (VMs), Datacenter Broker, & Information Service in the Cloud (CIS) [3]. A datacenter is a collection of various hosts. Each host may have a variety of virtual machines with varying specs. The metadata of many cloud entities is stored in CIS. In cloud computing, CIS server plays a key role in Data storage and management. A datacenter acts as a link between clients and the companies which provide and maintain public clouds also called as providers of cloud services (e.g., Amazon web services, Microsoft Azure, IBM Cloud, Google Cloud platform) (Fig. 49.1). It compiles all the necessary data on resource availability, usage, and communication costs. According to the scheduling algorithm used, Cloudlets are assigned to virtual machines by the broker. In our study, we used one of the task assignment problem’s solutions, namely Allocation of tasks using a particle swarm optimization technique. The overall execution time of our suggested approach is minimized.
49.2 Related Work Many studies have been undertaken to arrive at results, all of which show an increase in the use of computing facilities over the previous decade. Cloud computing services are regarded as critical. In [4] researchers did a rigorous examination of a range of job scheduling methods in 2018. Priority-based performance algorithm, templatebased genetic algorithm, hybrid multi-objective PSO algorithm, intelligent water drop algorithm, enhanced genetic algorithm, and others were tested, and none of them were found to be satisfactory on all parameters. Cloud computing is a computational talent that involves the allocation of virtual servers as well as associated services that are based on a pay-per-use basis. Scheduling is believed to be the most significant duty to access machines that are situated remotely. The task scheduling is regarded as an NP-complete problem. We require some effective and efficient scheduling approaches to obtain optimal and improved performance for cloud resources. A scheduling technique called "Priority-based
Performance Improved Algorithm" is discussed in the relevant work. The priority of users’ meta-tasks is taken into account by this algorithm. The Min-Min method is used to schedule the high priority meta-task, whereas the Max–Min method is used to schedule the low priority meta-task. It was determined that the proposed approach gives the shortest make span while using the minimum resources [5]. Another paper by Gajera, Vatsal, Rishabh Gupta, and Prasanta K Jana uses the Min–Max technique, which is extensively used in the field of data mining for data normalization, as the foundation for customizing the functionality to fit with cloud computing. Normalized Multi-Objective Min-Min Max–Min Scheduling emerged as the term for this method. The basic Min–Min and Max–Min techniques were outperformed by this method [6]. In 2015, Toktam Ghafarian and Bahman Javadi suggested a data-intensive process scheduling system that was cloud aware. With the goal of solving the problem by combining volunteer computing and cloud resources, the chance of increasing the efficiency of these systems’ usage increased [7]. This analytic method got quite popular because of CloudSim. Many algorithms were suggested and developed, including Weighted Round Robin, Start Time Fair Queuing, and Borrowed Virtual Time. Jambigi, Murgesh V, Vinod Desai, and Shrikanth Athanikar compared the three and found that BVT outperformed the other two [8].
49.3 Existing System 1.
First-Come-First-Serve scheduling Algorithm
Tasks that came earliest are served first in this method. Jobs that enter the queue are placed at the end of the queue [9]. Each process is taken from the front of the line one at a time. This algorithm is easy to implement. • There is no prioritizing, and as a result, every process must finally complete before any other process can be introduced. • This sort of algorithm does not perform well with time-sensitive traffic since the waiting time and latency are quite short on the top part. 2.
Round Robin scheduling Algorithm
In this algorithm, processes are implemented like FIFO, but they are limited to processor time, which is known as time-slice [10]. The processer continues to work on the next process in the waiting queue, if it was not done before the closing time specified in the process. The queue is in a state of wait. The pre-empted or new process is then moved to the back of the ready list, and new processes are put on the tail of the queue. The characteristics of the algorithm are.
• If we use a shorter time-slice or a quantum, we will have lesser CPU efficiency. • If we use a lengthy time-slice or quantum, we will have a slow reaction time. • Due to the length of the wait, there is a very minimal probability that deadlines will be reached. 3.
Min–Min Algorithm
For each task, the earliest or shortest completion time is calculated for all machines. The task with the shortest overall completion time is chosen and assigned to the machine with the shortest turnaround time. The drawbacks of the algorithm are. • The Min-Min algorithm’s drawback is that it prioritizes smaller tasks, wasting resources with greater processing capacity. • As a result, when the number of smaller jobs exceeds the number of large activities, the schedule generated by Min-Min is not ideal.
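As a concrete illustration of the heuristic just described, a minimal Python sketch of Min-Min is given below; the execution-time matrix and the tie-breaking are illustrative assumptions and are not taken from the present study.

def min_min_schedule(etc):
    # etc[t][m] = execution time of task t on machine m (illustrative values)
    n_tasks, n_machines = len(etc), len(etc[0])
    ready = [0.0] * n_machines          # machine ready times
    unscheduled = set(range(n_tasks))
    mapping = {}
    while unscheduled:
        # pick the (task, machine) pair with the earliest overall completion time
        finish, task, machine = min((ready[m] + etc[t][m], t, m)
                                    for t in unscheduled for m in range(n_machines))
        mapping[task] = machine
        ready[machine] = finish
        unscheduled.remove(task)
    return mapping, max(ready)          # assignment and resulting makespan

etc = [[14, 16, 9], [13, 19, 18], [11, 13, 19], [7, 8, 17], [12, 10, 9]]
assignment, makespan = min_min_schedule(etc)
print(assignment, makespan)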
49.4 Proposed System
Particle Swarm Optimization (PSO) Algorithm. PSO is a nature-inspired swarm algorithm, modelled on flocking birds, that may be applied to identify the best solution in a search space. Each solution is treated as a particle in this algorithm [11], and these particles seek the best solution, which corresponds to food in this scenario. Each particle has its own fitness value, which must be computed in order to obtain the best solution. In each iteration of the algorithm, the position and velocity of every particle are updated. The algorithm starts with random particles and then updates the two best values, the local best and the global best, to reach the optimum solution. The particle swarm optimizer keeps track of the best solution obtained by any particle so far, referred to as the global best (gbest); the best solution a given particle has obtained so far is marked as its personal best (pbest). After these values are found in each iteration, the particle adjusts its velocity and position using the equations below [12].

v_i^{k+1} = ω v_i^k + c_1 rand_1 (pbest_i − x_i^k) + c_2 rand_2 (gbest − x_i^k)    (49.1)

x_i^{k+1} = x_i^k + v_i^{k+1}    (49.2)

where
v_i^k           velocity of particle i at iteration k
v_i^{k+1}       velocity of particle i at iteration k + 1
ω               inertia weight
c_j             acceleration coefficients, j = 1, 2
rand_1, rand_2  random numbers between 0 and 1
x_i^k           current position of particle i at iteration k
pbest_i         best position of particle i
gbest           position of the best particle in the population
x_i^{k+1}       position of particle i at iteration k + 1

Fig. 49.2 Represents a particle in a 2D searching space updating its velocity and location
To determine the fitness of a particle we take the makespan as the parameter; the fitness function F is represented as F = C_max = max{C_i}, where C_i is the completion time (execution cost) on the ith machine under the assignment encoded by the particle. To update the position vector from the velocity, we examine the values of all rows in each column: the element with the greatest value has its corresponding element in the position vector set to 1, while the remaining elements in that column are set to 0. The fitness function is specified in relation to our goal: in this study, we want to minimize the time it takes to finish the execution of the final job. So, for each machine, we add up the run times of all tasks allocated to that machine, and the highest of these sums represents the particle's fitness. We repeat this procedure until the stopping criterion is fulfilled, which in our method is a fixed number of iterations. The flowchart of the PSO algorithm is depicted in Fig. 49.3, while Fig. 49.2 illustrates how a particle updates its velocity and position.
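A compact Python sketch of the velocity and position updates of Eqs. 49.1 and 49.2 applied to task-to-VM mapping with a makespan fitness is given below. The run-time matrix, the rounding-based decoding of a continuous position into a task-to-VM assignment and the swarm size are illustrative assumptions; the inertia weight and learning factors follow Table 49.3, but this is a sketch rather than the authors' exact encoding.

import random

RUNTIME = [[14, 16, 9], [13, 19, 18], [11, 13, 19], [7, 8, 17], [12, 10, 9]]
N_TASKS, N_VMS = len(RUNTIME), len(RUNTIME[0])
W, C1, C2 = 0.1, 2.0, 2.0            # inertia and learning factors (Table 49.3)

def decode(position):                # map a continuous position to a VM index per task
    return [abs(int(round(p))) % N_VMS for p in position]

def fitness(position):               # F = max over machines of their summed run times
    load = [0.0] * N_VMS
    for task, vm in enumerate(decode(position)):
        load[vm] += RUNTIME[task][vm]
    return max(load)

particles = [[random.uniform(0, N_VMS) for _ in range(N_TASKS)] for _ in range(30)]
velocity = [[0.0] * N_TASKS for _ in range(30)]
pbest = [p[:] for p in particles]
gbest = min(pbest, key=fitness)[:]

for _ in range(500):                 # iterations (Table 49.3)
    for i, x in enumerate(particles):
        for d in range(N_TASKS):     # Eq. 49.1 followed by Eq. 49.2
            r1, r2 = random.random(), random.random()
            velocity[i][d] = (W * velocity[i][d]
                              + C1 * r1 * (pbest[i][d] - x[d])
                              + C2 * r2 * (gbest[d] - x[d]))
            x[d] += velocity[i][d]
        if fitness(x) < fitness(pbest[i]):
            pbest[i] = x[:]
    gbest = min(pbest, key=fitness)[:]

print("best mapping:", decode(gbest), "makespan:", fitness(gbest))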
49.5 Implementation Cloudsim architecture is made up of four fundamental components that are extremely useful in setting up a basic cloud computing environment. Datacenter Broker, Virtual Machine, and Cloudlet are the elements that define this system. For our experiments, we utilized CloudSim-3.0.3, an open-source simulator, and the Eclipse Java IDE. The Cloudsim toolbox allows users to create their own resource provisioning algorithms [13, 14]. Virtual machines, data centers, applications, users, and scheduling mechanisms are all represented by Cloudsim classes. The life cycle begins with the setup of the Cloudsim environment and concludes with the simulation results [15, 16]. Following the formation of cloudlets, the scheduling method is used. We used Cloudsim to test our suggested method and compared the results to a traditional FCFS policy, Min-Min, and a Round Robin scheduling approach (Fig. 49.4).
Fig. 49.3 PSO algorithm flowchart
49.6 Experimental Results and Analysis
In this section, we run the PSO algorithm on the scientific workflow dataset to analyze a scenario with different parameters in a cloud computing system. The proposed method was implemented in the Eclipse IDE using Java. The virtual machine, task and host configurations are listed in Tables 49.1, 49.2 and 49.3. The figures referenced below compare the algorithms discussed, namely PSO (particle swarm optimization), FCFS (first come first serve), Round Robin and Min-Min. The comparative execution times are shown in Fig. 49.5, where PSO performs better than FCFS, Round Robin and Min-Min. Figure 49.6 shows the makespan of the different algorithms; the graph clearly indicates that the makespan of PSO is better than that of the existing methods. Further study can address other multi-objective task scheduling problems.
49.7 Conclusion and Future Work Scheduling of tasks in a cloud scenario is a difficult problem in cloud computing. It is a challenge for task schedulers nowadays to meet thousands of user demands while making the best possible use of available resources and fulfilling both user and
Fig. 49.4 Depicts the many stages of Cloudsim’s implementation
Table 49.1 VM parameters

Image size (MB)    10,000
Int. RAM           512
VM memory (MB)     1000
Long BW            1000

Table 49.2 Cloudlet parameters

Long length             1000
Long file size          300
Long output file size   300
Int PES number          1
Table 49.3 Parameter setting for PSO algorithm

Data centers              5
Number of tasks/jobs      40–100
Population size           30
Iterations                500
Inertia (ω)               0.1
Learning factors c1, c2   2.0
Fig. 49.5 Comparison of total execution time
Fig. 49.6 Comparison of algorithms based on Makespan
service provider requests. This paper provides an overview of several task scheduling methods that can be utilized to solve challenges while scheduling tasks. PSO outperforms the existing standard FCFS policy, Round Robin, and Min-Min scheduling methods in terms of both execution time and makespan. In the future, work will be carried out utilizing a new algorithm for handling multi-objective task scheduling problems, such as considering fault tolerance constraints while minimizing energy, cost, and load balancing.
References 1. Buyya, R.K., Garg, S.K., Versteeg, S.: A framework for ranking of cloud computing services. Future Gener. Comput. Syst. 29, 1012–1023 (2013) (S.K. Garg et al.) 2. Manan, M.R., Shah, D., Dipak, M.R., Agrawal, L., Amit, M.R., Kariyani, A.: Using a Load balancing algorithm to allocate virtual machines in cloud computing. Int. J. Comput. Sci. Inf. Technol. Secur. (IJCSITS) 3(1) (2013). ISSN 2249-9555 3. Goyal, A., Goyal, T., Singh, A.: Cloudsim: simulator for cloud computing infrastructure and modelling. In: International Conference on Modelling, optimization and computing (ICMOC) (2012) 4. Anushree, B., Xavier, A.: Comparative analysis of latest task scheduling techniques in cloud computing environment. In: Second International Conference on Computing Methodologies and Communication (ICCMC). IEEE (2018) 5. Kavitha, S., Amalarethinam, George, D.I.: Priority-based performance improved algorithm for meta-task scheduling in a cloud environment. In: 2nd International Conference on Computing and Communications Technologies (ICCCT), IEEE (2017) 6. Gajera, V., Gupta, R, Jana, P.K.: An effective multi-objective task scheduling algorithm using min-max normalization in cloud computing. In: 2nd International Conference on Applied and Theoretical Computing and Communication Technology (iCATccT). IEEE (2016) 7. Ghafarian, T., Javadi, B.: Cloud-aware data-intensive workflow scheduling on volunteer computing systems. Futur. Gener. Comput. Syst. 51, 87–97 (2015) 8. Jambigi, M.V., Desai, V., Athanikar, S.: Comparative analysis of different algorithms for scheduling of tasks in cloud environments. In: International Conference on Computational Techniques, Electronics and Mechanical Systems (CTEMS). IEEE (2018) 9. Kaur1, A., Dr. Maini2, R.: Different task scheduling algorithms in cloud computing, Int. J. Latest Trends Eng. Technol. 10. Nusrat P., Dr. Agarwal, A., Dr. Ravi R.: Round Robin method for VM load balance algorithm in cloud computing. Int. J. Adv. Res. Comput. Sci. Softw. Eng. (IJRISE) 4(5) (2014). ISSN 2277-128X 11. Bratton, D., Kennedy J.: Defining a standard for particle swarm optimization. In: Proceedings of the IEEE Swarm Intelligence Symposium, Honolulu, HI, pp. 120–127 (2007) 12. Suganthan, P.N.: Optimization of particle swarms using the neighbourhood operator. Proceedings of IEEE International Conference on Evolutionary Computation, vol. 3, pp. 1958–1962 (1999) 13. Hicham, G.T., El Chaker, A.: Cloud computing CPU allocation and scheduling algorithms using CloudSim simulator. Int. J. Electr. Comput. Eng. IJECE) 6(4), 1866–1879 (2016). ISSN: 2088
14. Rawat, P.S., Saroha, G.P., Barthwal, V.: QoS evaluation of SaaS Modeller (task) running on virtual cloud computing location using CloudSim. Int. J. Comput. Appl. 53(13) (2012) (09758887) 15. Khatibi, A., Khatibi, O.: Criteria for the CloudSim environment. arXiv: 1807.03103v1 [cs.DC] (2018) 16. Mishra, S., Sahoo, M.N.: On using CloudSim as a cloud simulator: the un-official manual, research (2017). https://doi.org/10.13140/RG.2.2.30215.91041
Chapter 50
A Peer-to-Peer Approach for Extending Wireless Network Base for Managing IoT Edge Devices Off-Gateway Range Ramadevi Yellasiri, Sujanavan Tiruvayipati, Sridevi Tumula, and Khooturu Koutilya Reddy Abstract Internet of Things (IoT) is a feature of the future Internet that has been portrayed as a worldview which principally coordinates and empowers advancements and correspondence arrangements with an outstanding interest to characterize how current standard conventions could uphold the acknowledgment of the far technological vision. Inside this specific situation, remote sensor organizations close to handle radio correspondences directing conventions as an intent to their appropriateness toward IoT. Apart from the standard infrastructure components especially power and network which are absolutely necessary for IoT gadgets, there is also a need to eliminate such dependencies to make the IoT future ready. One such move in building wireless-fidelity (Wi-Fi) peer-to-peer (P2P) communication strategy for IoT gadgets is portrayed in the research work. An uncomplicated approach is proposed through this research in order to establish a network backbone among IoT gadgets and also make them self-powered without the need to rely on an external infrastructure. The proposed methodology would minimize the overall capital investment in IoT network and power infrastructure including their maintenance but with the trade-off for lower data rates on further expansion.
The original version of this chapter was revised: The authors “R. Yellasiri, S. Tumula, and K. K. Reddy” affiliation has been updated. The correction to this chapter is available at https://doi.org/10.1007/978-981-16-9669-5_59 R. Yellasiri (B) · S. Tumula · K. K. Reddy Department of Computer Science and Engineering, Chaitanya Bharathi Institute of Technology, Hyderabad, India e-mail: [email protected] S. Tumula e-mail: [email protected] S. Tiruvayipati Department of Computer Science and Engineering, Osmania University, Hyderabad, India e-mail: [email protected] Maturi Venkata Subba Rao Engineering College, Hyderabad, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022, corrected publication 2022 V. Bhateja et al. (eds.), Smart Intelligent Computing and Applications, Volume 1, Smart Innovation, Systems and Technologies 282, https://doi.org/10.1007/978-981-16-9669-5_50
50.1 Introduction The Internet of Things (IoT) is an arising region, and it is anything but between associated worlds brimming with physical just as virtual items, gadgets, cycles and administrations fit for giving an alternate focal point on the best way to interface them by means of the Internet. While IoT as a feature of the future Internet has been portrayed as a worldview that principally coordinates and empowers a few advancements and correspondence arrangements portray a future savvy conceivable model design [1]. Researchers have proposed and carried out a clinical emotionally supportive network thinking about P2P and IoT advancements utilizing a Smart Box [2]. It has numerous capacities and sensors, for example, body sensor, infrared sensor, seat or bed vibration control, light control, smell control and sound control. An earlier work [3] presented a performance evaluation study of LOADng routing protocol in IoT applications. LOADng is a simplified light version of AODV developed taking into account the limited resources of IoT devices. The use of LOADng is justified once several studies have exposed the drawbacks of RPL in applications with multipoint-to-point traffic, which are common in IoT environments. Without the presence of access points (APs), Wi-Fi direct, otherwise called Wi-Fi peer-to-peer (P2P), turns into an inherent facilitator for IoT applications because of its prevalence. Specifically, the P2P group-of-operators (GO) assumes the part of the AP and is answerable for making associations with P2P group-of-customers (GC). Investigations show that the systems can essentially decrease the force utilization and accordingly makes Wi-Fi P2P association more productive for IoT applications [4]. Lately various steering conventions were proposed that gives some data about the directing conventions utilized in IOT and in MANETS [5]. Different practices [6] received by associations in the investigation of IoT concentrate more about the functioning standards as opposed to improving their correspondence. P2P application in IoT will help in appropriate self-setup of gadgets for setting up powerful correspondence.
50.2 Related Work An energy proficient IoT virtualization structure with shared P2P systems administration and edge preparation was proposed by researchers [7] in which the IoT task preparing demands are served by peers. A model that incorporates an AdHoc IoT organization [8] presents the arranging ventures, as far as steering advancement and repetition, thinking about a crisis situation. Consequently, blockchain-based IoT gadget was proposed to get a safer verification [9]. Investigations have considered the prerequisites for IoT that thinks about various existing rise of fog, it is a holistic peer-to-peer (HPP) design, and application layer convention meets the necessities set for IoT [10]. To handle security, blockchain has tackled numerous downsides of IoT models. Subsequently, a proposed disseminated trust model for IoT was made to
depend on another methodology called Holochain considering some security issues, like identifying mischievous activities, information respectability and accessibility [11]. The ensuing age of IoT gadgets should chip away at a multi-convention design to work with M2M correspondence alongside endpoint client interfacing to tackle the organization foundation conditions joined by excess information stream overhead. A philosophical arrangement [12] is proposed to work with change while reducing down expense and improving the arrangements through appropriate execution of edge. Numerous articles utilized in human existence send and get information continuously through a variety of organization advances including RFID, NFC, WIFI, Bluetooth, ZigBee, GPS and 4G. For productive preparation, a P2P network should be investigated. Through collaboration among innovative work groups, P2P advancements are applied to set up for taking care of issues during the utilization [13]. To facilitate the sending and synchronization of decentralized edge registering hubs, a work performed [14] which depicts a M2M circulated convention dependent on peer-to-peer (P2P) interchanges that can be executed on low-power ARM gadgets and utilized brokerless interchanges by utilizing a disseminated distribution/membership convention. Investigations have centered around utilizing IoT and AI via a P2P IoT design by running an AI model in each companion. AI models [15] predict necessity in each companion with less mistakes. Meanwhile researchers also proved that NG-RPL diminishes directing stir, which improves parcel gathering proportion, full circle time and energy utilization of P2P correspondence contrasted with standard RPL [16]. Few works have also proposed a DCNI strategy [17] where an exhaustive measurement gauges the hub significance in the geography preview. The investigations sought come to a final point that though there is a lot of research ongoing in the field of security of devices and communication protocols to enhance IoT there is still a demand in the research to fulfill the gap in implementation of cost-effective IoT solutions where network infrastructure is partially missing or completely absent, i.e., off-gateway range.
50.3 Proposed System Architecture Our organization of arrangement of the IoT components for communication has been decentralized in order to deal with the missing support of power and network infrastructure by incorporating self-sustainable nodes bringing in only simple design prerequisites in construction (see Fig. 50.1). Each IoT node is also made independent of power infrastructure by incorporating a solar energy harvesting system which is composed of a polycrystalline solar panel whose energy produced is transferred via a variable to constant DC-DC buck converter to a battery management system attached to a lithium-polymer battery pack which then drives the IoT nodes. It is to be observed that even though the nodes are
Fig. 50.1 Overview architecture of the proposed system where each IoT Wi-Fi repeater level contains a set of nodes that help to further the communication range themselves instead of depending on additionally laid network infrastructure
self-sustainable in terms of power, for the network they depend on the neighbors as traffic is to be handled layer by layer (see Fig. 50.2).
Fig. 50.2 Hierarchical representation of the nodes in the proposed system as branches of a tree structure for the formation of the P2P network for communication where each level acts as a layer responsible for extending the range of communication
Fig. 50.3 Connections among various components of packet tracer simulator up to a depth of level-2 of the proposed system architecture
50.4 Experimental Setup In order to prove the working principles of the proposed system architecture, a similar packet tracer environment is constructed with various components connected (see Fig. 50.3). For implementing the proposed system architecture, a WRT300N wireless router was chosen as the main point of network access, and single board computer (SBC) devices are used to mimic the behavior of a general Raspberry Pi’s equivalent. They are configured and coded to closely simulate the proposed system architecture. The instructions in the IoT nodes create two parallel processes: (1) Executing the proposed system architecture where two Wi-Fi controllers are initialized in two different modes station mode to access the nodes in the upper level of the tree and access point mode to repeat the network to the lower level tree nodes in order to extend the range of the network, (2) Executes the regular IoT-related functions for fetching sensor values to update its values to the server and fetch updates from the server to issue actuator commands.
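For the first of the two parallel processes, a minimal MicroPython-style sketch of a repeater node is given below: the station interface joins the Wi-Fi of the upper level while the access-point interface re-exposes the network to the level below. The SSIDs, credentials and level naming are placeholders, and the sketch assumes an ESP32-class board with the standard MicroPython network module rather than the simulated SBC devices of the Packet Tracer setup.

import network

UPLINK_SSID, UPLINK_PASS = "iot-level-1", "uplink-secret"   # parent level (assumed names)
MY_SSID, MY_PASS = "iot-level-2", "downlink-secret"         # network served to the next level

def start_repeater():
    sta = network.WLAN(network.STA_IF)     # station mode: uplink towards the gateway side
    sta.active(True)
    sta.connect(UPLINK_SSID, UPLINK_PASS)
    while not sta.isconnected():
        pass                               # wait until the parent level accepts the node

    ap = network.WLAN(network.AP_IF)       # access-point mode: downlink for the lower level
    ap.active(True)
    ap.config(essid=MY_SSID, password=MY_PASS,
              authmode=network.AUTH_WPA_WPA2_PSK)
    return sta.ifconfig(), ap.ifconfig()

print(start_repeater())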
50.5 Results and Analysis
The major parameters for comparing the proposed model (Pm) with the existing model (Em) are capital investment (Ci), maintenance cost (Mc) and performance of the network (Pn). In the existing system, the components include IoT nodes (In), networking devices (Nd) and power supply equipment (Pe), whereas the proposed model requires capital investment only in the IoT nodes (In).
Table 50.1 Assignment of cost as a value of units to various parameters

No. of IoT nodes    Ci(In)    Ci(Nd)    Ci(Pe)
1                   1         1         1
10                  10        1         10
20                  20        2         20
30                  30        3         30
100                 100       10        100

Table 50.2 Minimum capital investment required for various nodes

No. of IoT nodes    Ci(Em)     Ci(Pm)
1                   $150       $50
10                  $1050      $500
20                  $2100      $1000
30                  $3150      $1500
100                 $10,500    $5000
Ci(Em) = Ci(In) + Ci(Nd) + Ci(Pe)    (50.1)

Ci(Pm) = Ci(In)    (50.2)
Expanding Eqs. 50.1 and 50.2 with the component costs expressed in units (1 unit = $100), as listed in Table 50.1, the capital investment Ci(Em) of the existing model is higher than the capital investment Ci(Pm) of the proposed model, as shown in Table 50.2 (assuming that one network device provides the Internet connection). Maintenance cost (Mc) is the yearly budget that needs to be spent on servicing and fixing the components deployed for an IoT application. In the existing system, the components include the IoT nodes (In), networking devices (Nd) and power supply equipment (Pe), whereas the proposed model requires maintenance cost only for the IoT nodes (In).

Mc(Em) = Mc(In) + Mc(Nd) + Mc(Pe)    (50.3)

Mc(Pm) = Mc(In)    (50.4)
Expanding Eqs. 50.3 and 50.4 with the component costs expressed in units (1 unit = $5), as listed in Table 50.1, the maintenance cost Mc(Em) of the existing model is higher than the maintenance cost Mc(Pm) of the proposed model, as shown in Table 50.3 (assuming negligible maintenance cost for the energy harvesting modules).
Table 50.3 Approximate yearly maintenance required for various nodes

No. of IoT nodes    Mc(Em)    Mc(Pm)
1                   $15       $5
10                  $105      $50
20                  $210      $100
30                  $315      $150
100                 $1050     $500
Table 50.4 Data rates of existing and proposed models over communication of various nodes

No. of IoT nodes    Depth of the repeater network tree (Dt)    Pn(Em) (Mbps)    Pn(Pm) (Mbps)
1                   1                                           1000             1000
10                  1                                           100              100
20                  2                                           50               25
30                  3                                           33               11
100                 10                                          10               1
Performance of the network (Pn) in the existing system is not affected by network expansion, since the dedicated networking devices take on the additional overhead; however, the total network bandwidth (TNB) is divided among the number of IoT nodes. The proposed model, in contrast, is bound by the repeater network: based on the depth of the tree (Dt) that results from how the IoT nodes are laid out physically, and given the constraint that the nodes manage routing and network address translation themselves, there is a major data-rate penalty.

Pn(Em) = TNB / In    (50.5)

Pn(Pm) = TNB / (Dt × In)    (50.6)

where In here denotes the number of IoT nodes sharing the bandwidth.
Assuming that a TNB of 1 Gbps (1000 Mbps) is provided to both models there would be a drastic difference in the existing and proposed as shown in Table 50.4. It is to be observed that the huge number of hops involved in communication of nodes for the proposed system is the sole cause of the drastic decrease in data rates leading to less feasibility for large-scale deployments.
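The comparison above can be reproduced with a short script. In the sketch below, the per-unit cost of $50 and the one-network-device-per-ten-nodes rule are reverse-engineered so that the figures of Tables 50.2 and 50.4 are matched; they are illustrative assumptions rather than values stated in the analysis.

def compare(n_nodes, depth, tnb_mbps=1000, unit_cost=50):
    n_net_devices = max(1, n_nodes // 10)                           # assumed provisioning rule
    ci_existing = unit_cost * (n_nodes + n_net_devices + n_nodes)   # Eq. 50.1: In + Nd + Pe
    ci_proposed = unit_cost * n_nodes                               # Eq. 50.2: In only
    pn_existing = tnb_mbps / n_nodes                                # Eq. 50.5
    pn_proposed = tnb_mbps / (depth * n_nodes)                      # Eq. 50.6
    return ci_existing, ci_proposed, round(pn_existing), round(pn_proposed)

for nodes, depth in [(1, 1), (10, 1), (20, 2), (30, 3), (100, 10)]:
    print(nodes, compare(nodes, depth))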
50.6 Conclusion To offer understanding to specialists, we presented a novel design for the IoT deployment meant to be cost-effective considering few strategies that are unsurprising and
simpler design for IoT association. This proposed system is not intended where the existing network architecture is laid for multiple utility purposes but in the scenario where infrastructure is to be laid only for the purpose of IoT. It is to note that the energy harvesting modules utilized are not tested for real-time working conditions and therefore a further study on this context has to be made. Another aspect is in terms of automated or remote systems administration which still stands as a challenge as the nodes are treated to be independent from each other. Acknowledgements This work was supported by the Research Promotion Scheme (RPS) by All India Council for Technical Education (AICTE): Quality Improvement Schemes(AQIS) [Sanction Letter—File No. 8-85/FDC/RPS(POLICY-1)/2019-20] under the Ministry of Human Resource Development(HRD), Government of India (GoI).
Chapter 51
Analysis of Different Methodologies for Sentiment in Hindi Language
Rohith Reddy Byreddy, Saketh Malladi, B. V. S. S. Srikanth, and Venkataramana Battula
Abstract Sentiment analysis is an important task for companies, data analysis, and many other areas. Until now, most research has been done for English and other predominant languages. This paper presents a comparative analysis of different approaches to sentiment analysis in the Hindi language. In the first method, Hindi sentences are translated into English, and sentiment analysis is performed on the translated data. Contrary to this approach, in the second method, sentiment analysis is done without translation of the Hindi sentences. In the third method, linear discriminant analysis (LDA) is applied to the features to reduce their dimensionality before a probabilistic linear classification model is trained. Finally, following recent developments of neural networks in NLP, a convolutional neural network-based classification approach is implemented to perform sentiment analysis. A complete comparative analysis is carried out to find the best sentiment classification system, and various evaluation metrics with different classifiers are also discussed in this paper.
R. R. Byreddy (B) · S. Malladi · B. V. S. S. Srikanth · V. Battula
Department of Computer Science Engineering, Maturi Venkata Subba Rao (MVSR) Engineering College, Hyderabad, India
V. Battula e-mail: [email protected]

51.1 Introduction
Sentiment analysis, also known as opinion mining, is the process of extracting people's opinions and using them to understand the attitudes and reactions expressed on the Web about various issues in the world. Nowadays, with the increasing usage of the Internet, a large amount of information is available on the Web, consisting of reviews of different products, movies, books, technologies, etc. People express their views and opinions on these products, services, and books on the Web. Classifying such statements into positive (1) and negative (0) is called sentiment analysis.
As four methods are discussed in this paper, the translation-based method uses labelled English movie reviews as input data, with 5331 positive and 5331 negative sentences. For all other methods, Hindi movie reviews were taken, with 464 positive and 462 negative sentences. Using the tweepy library, a dataset for sentiment analysis was retrieved from Twitter. All sentences are preprocessed and cleaned using NLP techniques such as noise removal, tokenization, and lemmatization before being fed into the training system.

Following data preparation, a unigram model is used to extract features from the training data. The unigram model considers each word individually; it does not take word order into account, so the order makes no difference to how words are tagged or split up. In this model, we generate a corpus that contains all the words appearing in any review of our dataset; the corpus is built from all positive and negative reviews combined into one vocabulary. To exclude terms that do not contribute much to sentiment categorization, we only consider those corpus words whose frequency count lies within a certain range. We then generate a feature matrix of size m × n (where m is the number of reviews in our dataset and n is the number of words in the corpus). Each element of the matrix is assigned the frequency count of the corresponding corpus word in that review if the word occurs in the review. Since the output class contains 1 or 0, depicting positive or negative, different classifiers were trained on this feature matrix: logistic regression (LR), support vector classifier (SVC), and Gaussian naïve Bayes (GNB).

Each of the four methods has its own way of training the model. In the translation approach, the classifier is trained on the English dataset and the test instances are translated from Hindi into English, whereas in the Hindi-specific approach, the classifier is trained and tested directly on Hindi. In the linear discriminant analysis method, training and testing are also done on Hindi instances, but the feature matrix is reduced to a single input dimension and the classifiers are trained accordingly. In the convolutional neural network (CNN) method, a model with hidden layers is defined and trained on the Hindi instances. Using held-out test data, the models are evaluated against the predicted sentiment, which measures how accurate each model is.

The rest of the paper is organized as follows: Sect. 51.2 presents related work on multilingual sentiment analysis. In Sect. 51.3, we introduce the proposed sentiment analysis models. Sect. 51.4 presents the experimental results of the proposed approach, and Sect. 51.5 concludes the work.
51.2 Literature Survey Sonali Rajesh Shah, Abhishek Kaushik developed a count vector-based sentiment classification model for Indian indigenous languages. They analyzed different classifiers and approaches to find the sentiment. They have given insights regarding future sentiment analysis models [1].
Rohini V., Merin Thomas, and Latha C. A. proposed a comparison between the direct method and the translation method in the Kannada language. Their results are based on movie and book reviews from different review sites, and they used a part-of-speech tagging method for the sentiment analysis [2].

Santwana Sagnika, Anshuman Pattanaik, Bhabani Shankar Prasad Mishra, and Saroj K. Meher analyzed multilingual and translation-based sentiment analysis using machine learning methods. They reviewed existing work in multilingual sentiment analysis, and their results suggest that translation is a feasible approach for multilingual sentiment analysis [3].

Reddy Naidu, Santosh Kumar Bharti, Korra Sathya Babu, and Ramesh Kumar Mohapatra proposed a two-phase sentiment analysis for Telugu news sentences using Telugu SentiWordNet, which first classifies sentences as subjective or objective [4].

Ethem F., Aysu Ezen, and Fazli discussed a model that can be reused for sentiment analysis in other languages with large resources. They used an RNN with a translation approach to perform the sentiment analysis; on Spanish movie reviews, the RNN model outperformed the alternatives [5].

Sannikumar Patel, Brian Nolan, Markus Hofmann, Philip Owende, and Kunjan Patel presented research and results that are important for deciding whether machine translation should be used for multilingual sentiment analysis or whether language-specific systems should be built [6].

Carlos Argueta and Yi-Shin Chen proposed a system based on emotion-bearing patterns that can be used to perform more complex analysis such as ambiguity and sarcasm identification. They experimented with different patterns and values to find the set that is optimal for a given language and best fits the sentiment model [7].

Bidhan Sarkar, Nilanjan Sinhababu, Manob Roy, and Pijush Kanti Dutta Pramanik, in their study of multilingual tweets and scripts, showed how important mining data from Twitter is and how much work is needed to convert Twitter data into a normal script [8].

Siaw Ling Lo, Erik Cambria, Raymond Chiong, and David Cornforth examined the various methods and approaches currently used for multilingual sentiment analysis and identified the challenges, including a framework particularly suitable for languages with limited resources [9].

Sreekavitha Parupalli, Vijjini Anvesh Rao, and Radhika Mamidi used word-level annotations to set up a baseline accuracy for sentiment analysis on Telugu language data. They generated an annotated corpus that enhanced the sentiment task, which was further improved by annotating bi-grams from the corpus [10].
51.3 Implementation
51.3.1 Translation-Based Method
To create a model, we need to build a feature set to give to the classifier. After the data is prepared and preprocessed, the sentences are fed to the TextBlob library and segregated into positive and negative categories. This step is vital for training the model, for finding its accuracy, and for creating the feature set. Before extraction, every word is taken and its count over the whole document of sentences is calculated; only words whose count is less than 4000 are considered. This set of words forms the corpus (i.e., the English data). To create the feature set, we used the unigram method. For each sentence list built this way, the polarity is appended: "1" if the sentence is positive, otherwise "0". All these lists (i.e., sentences) are accumulated to form the feature set, which is then fed into the classifiers. A support vector classifier and a naïve Bayes classifier have been applied, and logistic regression has also been implemented.
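As an illustration of this labelling step, the following Python sketch tags already-translated English sentences using TextBlob's polarity score. The Hindi-to-English translation step itself is not shown, since the exact translation tool is not specified here, and the zero cut-off is an assumption.

```python
from textblob import TextBlob

def label_polarity(english_sentences):
    """Tag translated (English) review sentences as positive (1) or negative (0)."""
    labels = []
    for text in english_sentences:
        polarity = TextBlob(text).sentiment.polarity   # value in [-1.0, 1.0]
        labels.append(1 if polarity >= 0 else 0)        # assumed cut-off at zero
    return labels

print(label_polarity(["The movie was wonderful", "The plot was boring"]))
```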
51.3.2 Hindi Corpus-Based Method (Without Translation)
Since the data was collected from Twitter using the tweepy library, we manually classified the Hindi sentences into positive and negative. In this approach, only Hindi sentences are trained and evaluated. Before features are extracted from the training data, preprocessing is performed. From the preprocessed data, a lexicon (a Hindi word corpus) is created using the tokenization method in the NLTK library. Feature extraction is then done with the unigram method: for each sentence, a list is built in which the count of each word is inserted if the word exists in the corpus. For each such list, the polarity is appended: "1" if the sentence is positive, otherwise "0". All these lists (i.e., sentences) are accumulated to form the feature set, which is fed into the classifiers. A support vector classifier and a naïve Bayes classifier have been applied, and logistic regression has also been implemented.
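A minimal sketch of this unigram feature-matrix construction is shown below; the NLTK tokenizer and the frequency bounds are illustrative assumptions (the paper only states that words outside a certain count range are excluded, and the punkt tokenizer data is assumed to be installed).

```python
import numpy as np
from nltk.tokenize import word_tokenize

def build_unigram_features(sentences, min_count=2, max_count=4000):
    """Build an m x n unigram count matrix; words outside the count range are dropped."""
    tokenized = [word_tokenize(s) for s in sentences]
    counts = {}
    for tokens in tokenized:
        for w in tokens:
            counts[w] = counts.get(w, 0) + 1
    corpus = [w for w, c in counts.items() if min_count <= c <= max_count]
    index = {w: i for i, w in enumerate(corpus)}
    features = np.zeros((len(sentences), len(corpus)), dtype=int)
    for row, tokens in enumerate(tokenized):
        for w in tokens:
            if w in index:
                features[row, index[w]] += 1   # frequency of the word in this review
    return features, corpus
```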
51.3.3 Linear Discriminant Analysis Method
Linear discriminant analysis is applied to reduce the dimensionality of the training data and give better accuracy. Here, we take the feature set already created in the Hindi corpus-based method and apply LDA to it. The feature set has dimension 926 × 3257; after applying linear discriminant analysis, it is reduced to 926 × 1. LDA uses space-reduction techniques to reduce the dimensionality. After applying LDA, the reduced feature set is fed into the classifiers: a support vector classifier and a naïve Bayes classifier have been applied, and logistic regression has also been implemented.
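A minimal sketch of this reduction using scikit-learn is shown below; for a two-class problem, LDA yields at most one discriminant component, which matches the 926 × 1 reduction described above. The variable names are assumptions.

```python
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression

# X: 926 x 3257 unigram feature matrix, y: 926 polarity labels (0/1)
lda = LinearDiscriminantAnalysis(n_components=1)   # two classes -> one component
X_reduced = lda.fit_transform(X, y)                # shape becomes (926, 1)

clf = LogisticRegression().fit(X_reduced, y)       # SVC or GaussianNB can be used instead
```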
51.3.4 CNN-Based Method
As in the LDA method, the feature set from the Hindi corpus-based method is used. A neural network model is then created using a one-dimensional convolutional neural network with three hidden layers. Convolutional neural networks are composed of multiple layers of artificial neurons. Artificial neurons, a rough imitation of their biological counterparts, are mathematical functions that compute a weighted sum of multiple inputs and output an activation value. Convolutional neural networks are commonly used to process images, but a one-dimensional convolution can be used for language processing.
The CNN receives the features: the embedding layer receives the feature-set matrix. The number of output filters is set to 64, with a kernel size of 4, which determines the length of the convolution window. The ReLU activation function is used for the 1D convolution layers, and max-pooling 1D layers are applied after each of them. All the outputs are then concatenated, followed by dropout, a dense layer, another dropout, and a final dense layer. The first dense layer also uses the ReLU activation, and the activation for the final layer is sigmoid.
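A hedged Keras sketch of this architecture is shown below. The embedding dimension, dense width, dropout rates, and the use of a sequential stack (rather than concatenating parallel branches) are assumptions; the filter count (64) and kernel size (4) follow the description above, and the reviews are assumed to be fed as padded word-index sequences.

```python
from tensorflow.keras import layers, models

def build_text_cnn(vocab_size, seq_len):
    model = models.Sequential([
        layers.Input(shape=(seq_len,)),
        layers.Embedding(vocab_size, 64),
        layers.Conv1D(64, 4, activation="relu"),   # 64 filters, convolution window of 4
        layers.MaxPooling1D(2),
        layers.Conv1D(64, 4, activation="relu"),
        layers.MaxPooling1D(2),
        layers.GlobalMaxPooling1D(),
        layers.Dropout(0.5),
        layers.Dense(64, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(1, activation="sigmoid"),     # positive vs. negative
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model
```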
51.4 Evaluation and Discussion The proposed four methodologies are assessed based on test reviews in the Hindi language. In our experiments, the predicted sentiment is evaluated against the actual sentiment.
51.4.1 Experimental Setup The sentences for the test data are movie reviews taken from Twitter; information about the dataset is given in Table 51.1. Since we have implemented sentiment analysis with four methods, English movie reviews are used in the first method, whereas Hindi movie reviews are used in the other methods. All training and testing data are polarity-labelled, i.e., each sentence is marked as positive (1) or negative (0).
Table 51.1 Dataset description

Language   Data       Total    Positive   Negative
English    Training   10,662   5331       5331
Hindi      Training   926      464        462
Hindi      Testing    100      50         50
Table 51.2 Accuracy and F1 scores of the sentiment analysis task

Method                  LR (ACC / F1)    SVC (ACC / F1)   GNB (ACC / F1)   CNN (ACC / F1)
Translation approach    62.22 / 0.62     59.43 / 0.58     55.44 / 0.54     –
Hindi corpus approach   82.22 / 0.83     82.22 / 0.83     74.44 / 0.74     –
LDA approach            74.44 / 0.76     74.44 / 0.76     74.44 / 0.76     –
CNN approach            –                –                –                81.11 / 0.71
51.4.2 Evaluation Metrics Accuracy (ACC) and F-measure (F1) are used as evaluation metrics; both scores are determined for each approach.
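Both scores can be obtained directly from scikit-learn, as in the short sketch below; the variable names are placeholders.

```python
from sklearn.metrics import accuracy_score, f1_score

# y_true: gold polarity labels of the 100 Hindi test reviews, y_pred: classifier output
acc = accuracy_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred)          # binary F1 for the positive class
print(f"ACC = {100 * acc:.2f}, F1 = {f1:.2f}")
```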
51.4.3 Results
The performance achieved by the sentiment classification task under the four methods is given in Table 51.2; 100 reviews were used for testing. The Hindi corpus-based method achieves the best accuracy, and the CNN-based method also produces remarkable results. The translation-based method performs worse than expected: translation may lose some words or meaning, which reduces accuracy. The LDA method reduces the computation while still giving acceptable results. Overall, all of the proposed approaches achieve good results.
51.5 Conclusion
In English, there are many ways of extracting features; however, feature extraction in Hindi is a difficult undertaking. We extracted features using the unigram term frequency technique, which has the potential to be extended to other languages with limited resources. We investigated several approaches to sentiment analysis, including translation-based, Hindi corpus, LDA, and convolutional neural network methods, and recorded the accuracy and F1 scores of the different classifiers for all methods. Of all the methods, the Hindi corpus method gave the best results. The performance of the translation-based strategy dropped because translation loses part of the meaning of the text, which is the primary source of incorrect outcomes. The CNN technique yielded positive results in sentiment analysis and now plays a key role in sentence-level opinion mining. In general, the CNN approach, as well as the Hindi corpus method, outperforms the translation method.
References 1. Shah, S.R., Kaushik, A.: Sentiment analysis on Indian Indeginous languages: a review on multilingual opinion mining. https://doi.org/10.20944/preprints201911.0338.v1 2. Rohini, V., Thomas, M., Latha, C.A.: A paper domain based sentiment analysis in regional language—Kannada (2016). ISSN 2278-0181 3. Sagnika, S., Pattanaik, A., Mishra, B.S.P., Meher, S.K.: A review on multi-lingual sentiment analysis by machine learning methods (2020). https://doi.org/10.25103/jestr.132.19 4. Naidu, R., Bharti, S.K., Babu, K.S., Mohapatra, R.K.: Sentiment analysis using Telugu SentiWordNet (2017). 10.1109/ WiSPNET.2017.8299844 5. Ethem, F., Ezen, A., Fazli: Multilingual sentiment analysis: an RNN-based framework for limited data (2018). arXiv:1806.04511v1 [cs.CL] 6. Patel, S., Nolan, B., Hofmann, M., Owende, P., Patel, K.: Sentiment analysis: comparative analysis of multilingual sentiment and opinion classification techniques (2017). ISNI 0000000091950263 7. Argueta, C., Chen, Y.-S.: (ACL-2014). Multi-lingual sentiment analysis of social data based on emotion-bearing patterns. https://doi.org/10.3115/v1/W14-5906 8. Sarkar, B., Sinhababu, N., Roy, M., Pramanik, P.K.D.: Mining multilingual and multiscript Twitter data: unleashing the language and script barrier (2019).https://doi.org/10.1504/IJB IDM.2020.103847 9. Parupalli, S., Rao, V.A., Mamidi, R.: (ACL-2018). BCSAT: a benchmark corpus for sentiment analysis in Telugu using word-level annotations. Report No: IIIT/TR/2018/-1 56th Annual Meeting of the Association for Computational Linguistics 10. Lo, S.L., Cambria, E., Chiong, R., Cornforth, D.: Multilingual sentiment analysis: from formal to informal and scarce resource languages (2017). https://doi.org/10.1007/s10462-016-9508-4
Chapter 52
Analysis of Efficient Handover CAC Schemes for Handoff and New Calls in 3GPP LTE and LTE-A Systems
Pallavi Biradar, Mohammed Bakhar, and Shweta Patil
Abstract In this work, a novel CAC scheme is analyzed for handoff handling and new call blocking attempts. As traffic in mobile cellular networks increases, handoff becomes an increasingly important issue, and as cell sizes shrink to meet the growing demand for services, better and more efficient handoff mechanisms must be implemented. In this paper, various handoff schemes are analyzed for a multiple-traffic system, and an ATM-based wireless personal communication network is simulated to implement the bandwidth level degradation scheme along with the dynamic guard channel scheme.
P. Biradar (B) · M. Bakhar · S. Patil
E and CE Department, Gurunanak Dev Engineering College, Bidar 585401, India

52.1 Introduction
The most significant aspect of a wireless cellular communication system is mobility; in most cases, continuous service is necessary, which is accomplished by the process of handoff from one cell to another. Because of technological advances, user mobility in wireless cellular networks is increasing, as is the demand for multimedia and voice services. Handoff is the process of changing the channel (frequency, timeslot, spreading code, or a combination of them) associated with the current connection while a call is in progress. It is usually triggered by crossing a cell boundary or by a drop in signal quality on the current channel. Poorly constructed handoff schemes result in a significant increase in signaling traffic and, as a result, a significant reduction in quality of service (QoS). Handoffs become increasingly critical as traffic in these mobile cellular networks grows, and newer, more efficient handoff techniques are required as cell sizes shrink to meet the increased demand for services.
Handoff management operation: the handoff management operation allows a mobile terminal (MT) to switch seamlessly from one access point to another while maintaining the active connection's quality of service (QoS). Handoff management is divided into three stages: 1. Initiation: this stage involves handoff decision-making,
in which the need for handoff is detected by either the mobile terminal or the network agent, and the various handoff initiation criteria determine the best handoff initiation algorithm and related parameters. 2. Generation of new connections: this entails identifying new resources and performing additional routing procedures in order to create the new connections. 3. Data flow control: this involves data delivery along the newly established path with the agreed QoS guarantees.

The literature shows that existing work addresses the concepts of different channel handling schemes. In the fixed channel allocation (FCA) scheme, there are no separate channels allocated for handoff calls or requests: the entire set of available channels is shared between newly originating and handoff calls on a first come, first served basis [11]. Both new call requests and handoff request calls are treated equally. However, the termination of an ongoing call due to handoff failure is fundamentally less desirable than the blocking of a fresh call, and because the handoff blocking rate equals the new call blocking rate [12], the quality of service does not meet expectations.

Several studies in the related literature [13] discuss the categorization of handoff methods based on the guard channel idea. The guard channel (GC) concept improves the probability of a successful handoff by reserving a certain number of channels exclusively for handoff requests, while the remaining channels are shared equally by handoff request calls and new call requests. Allocating guard channels for handoff requests improves the overall system throughput [14]. However, because a fixed number of channels is assigned solely to handoff requests even when traffic is low, a guard channel number that is too large leads to a high blocking rate for new calls; when the traffic load is excessive, the available resources are wasted by fulfilling neither handoff nor new call requests, and if the number is too small, the handoff call blocking rate cannot be guaranteed. So, while this approach improves quality of service by lowering the handoff blocking rate under a steady traffic load, it is not flexible enough to achieve high quality of service when the incoming traffic load changes dynamically owing to large events or during working hours [12].

The proposed system describes how guard channels are dynamically allocated for handoff request calls using a unique and efficient data format. The proposed approach allocates guard channels dynamically for handoff requests depending on the traffic load over a specified time period and enables call queuing in the event of heavy traffic. If the guard channel is full, it uses a bandwidth level degradation technique to handle handoff and new calls. Finally, it uses a priority-based queue that serves the call with the least tolerable delay first. This approach improves QoS, maximizes channel utilization, minimizes the call blocking rate, minimizes delay, and avoids dropping of handover requests.
52.2 Literature Survey • Tho [1] proposed handover for a multimedia Wireless LAN’s. New design issue for a multimedia handover is specified and a fast, continuous hybrid handover protocol is proposed. A general access point-initiated, inter- switch handoff scheme is proposed by Yuan et al. in [2], where to contact the neighboring access points, the current access point uses wireless control protocol to provide service to handoff calls. • An overlay signaling and migratory signaling scheme are proposed by Akhyol in [3]. The overlay signaling follows an overlay network, where the migratory signaling upgrades the network in regions and maintains compatibility. Wong [4] and Salah [5] presented a two-phase handoff protocol. The path optimization phase is enabled when delay constraint is violated, and the path extension phase is activated for inter-switch handoff. • A low-latency handoff offered by handover protocol is proposed by Naylon in [6]. The two-phase backward handover is used by the proposed handover protocol, which decouples the re-routing of mobile terminal from the radio handoff, the interruption for data transmission is minimized. • The quality of service-based path optimization scheme is proposed in [7]. This scheme outperforms the delay and hop-based optimization scheme in terms of average delay, number of hops, handover-drop rates. The uniform pre-establishment algorithm (UPA) and Pre-establishment algorithm (PAP) are proposed in [8], to improve the quality of service namely dropping handovers and blocking calls. • Siddiqui and Zeadally [9] proposed a handoff management procedure which includes a discussion on handoff decision, implementation procedures and current handover techniques, aiming mobility over wide range of different access technologies and on the mobile terminal capabilities, which are necessary for seamless mobility over hybrid wireless cellular networks. One of the primary difficulties for the FGWN, according to [10], is mobility, which allows consumers to benefit from ongoing services while traveling across networks. • Rami Tawil in [11] proposed T-DVHD a trusted and distributed vertical handoff decision mechanism, to provide seamless and trusted handover in vertical handoff. A novel call admission control policy for wireless multimedia cellular network is proposed by Ojesanmi in [13], two kinds of traffics, the real-time and nonreal-time calls are assumed, to access the available limited frequency channels in cell. Calls of same cell and priority are served using ticket scheduling. The total throughput of the network can be increased by providing buffer to the handoff calls in presence of new calls. • Alagu and Meyyappan in [14] analyze the various traffic schemes for new call blocking attempts and for handoff handling. Using simulation on two types of networks, it is shown that an allocation of separate channel for handover requests (guard channel) shows improvement [15, 16]. Ni et al. [17] proposed slotted call admission control mechanism with the dynamic channel allocation scheme, to
introduce the performance degradation of secondary users due to a failed handover. In this approach, users are admitted at the start of each new slot; as a result, new secondary users that arrive between two slots must join a queue until the next slot is available.
• A novel call admission control and adaptive bandwidth management scheme for heterogeneous wireless networks is proposed in [18]. In this scheme, the incoming traffic of each class is separated, and handoff calls are prioritized over new calls.
• A dynamic channel allocation scheme is proposed in [19], which adjusts the number of channels reserved for handoff calls over new call requests to increase the successful access rate. The method is used in TD-SCDMA wireless mobile communication to provide integrated voice/data services.
• An efficient channel utilization scheme based on distributed dynamic channel assignment (DDCA) is proposed in [20]; it transfers an ongoing call from one channel to a released channel to overcome the situation in which relatively few channels remain available due to channel constraints.
• Another call admission control policy is proposed in [22] for long term evolution (LTE) networks. The incoming calls are classified into new calls and handoff calls, and handoff calls are given priority over new calls. The proposed call admission control scheme guarantees the quality of service and prevents network congestion.
• In [21], an adaptive call admission management technique with bandwidth reservation for downlink LTE networks is presented to reduce user traffic starvation and increase resource utilization. To avoid user traffic exhaustion, this scheme introduces call admission control criteria, and when there is inadequate bandwidth, a bandwidth degradation method is employed to admit users. The results show good performance, with improved data throughput and reductions in the call blocking probability (CBP), the call dropping probability (CDP), and the degradation ratio compared with the other bandwidth degradation and reservation schemes.
52.3 Objective of the Work
The objective of this research work is to design a solution for efficient utilization of resources for handoff calls and new calls, in order to provide good quality of service and higher reliability for multimedia users.
• To survey the current methods used for handover decisions in a wireless network area.
• To develop new algorithms using MATLAB and Python and implement them for heavy-traffic areas in wireless cellular networks.
• To design a novel call admission control (CAC) scheme for computing the call blocking probability (CBP) and the handoff dropping probability (CDP) against the number of call arrivals, which in turn should minimize the call blocking rate, minimize the dropping of handover requests, compute the delay by dynamically allocating channels based on past history, and utilize the available resources efficiently.
• To minimize call blocking also by call queuing for handoff calls and new calls, improving quality of service.
• To run the code with standard examples and test the code.
• To provide good quality of service and, above all, utilize the available resources efficiently through dynamic guard channel allocation with bandwidth level degradation and priority call queuing, with higher priority given to the lowest tolerable delay for both handoff and new calls, avoiding handoff failure due to channel unavailability and low tolerable delays.
52.4 System Model and Scheduling of Users
A heterogeneous wireless network consists of different radio access technologies coexisting in the same area and can operate with 3GPP-LTE, Wi-MAX, and Wi-Fi technologies. In a heterogeneous network, applications with delay constraints can be used; to ensure good quality of service, the proposed CAC limits the timeout for each call. The heterogeneous wireless network traffic is categorized into three different traffic classes of calls: the first class is tolerant real-time, the second class is intolerant real-time, and the last class is nonreal-time. VoIP, video calls, etc., are nontolerant real-time (RT-INTR) traffic. Video and audio streaming, uploading files, etc., are tolerant real-time (RT-TR) traffic. Sending mails and downloading files are nonreal-time (NRT) traffic. All these calls can operate between a minimum and a maximum bandwidth: at the maximum bandwidth a call gets good quality of service, whereas at the minimum bandwidth a call gets just enough bandwidth to operate. RT-INTR calls have the highest priority, followed by RT-TR; NRT calls have the least priority. In our proposed admission control system, handoff calls are given higher priority over new calls. The traffic profile for each call i is
Traffic profile = {Bmn, Bmx, Brq, Dmx, P, K}
where:
• Bmn: minimum bandwidth required for call i
• Bmx: maximum bandwidth required for call i
• Brq: required bandwidth for call i
• Dmx: maximum delay for call i
• TC: indicates the call type (e.g., handoff call (HC) or new call (NC))
• K: the type of service for call i
• Bgc: reserved bandwidth for the guard channel
A minimal data-structure sketch for this profile is given below.
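The following Python sketch is one way to hold this per-call profile; the field names mirror the symbols above, and the priority field is an assumption for the symbol P, which is not defined explicitly.

```python
from dataclasses import dataclass

@dataclass
class TrafficProfile:
    b_min: int       # Bmn: minimum bandwidth (resource blocks)
    b_max: int       # Bmx: maximum bandwidth
    b_req: int       # Brq: requested bandwidth
    d_max: float     # Dmx: maximum tolerable delay (ms)
    call_type: str   # TC: "HC" (handoff call) or "NC" (new call)
    service: str     # K: "RT-INTR", "RT-TR", or "NRT"
    priority: int    # P: class priority (assumption; RT-INTR highest)

voip_handoff = TrafficProfile(1, 1, 1, 300, "HC", "RT-INTR", priority=1)
```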
52.5 Proposed Call Admission Control Scheme
The suggested CAC algorithm divides incoming calls into NC and HC categories; for both nonreal-time and real-time applications, every call has a QoS request and a delay restriction to adhere to. If an incoming call is a handoff call and bandwidth is available in the guard channel, the handoff call is accepted. Otherwise, if the CAC determines that the network resources in use do not exceed the predetermined threshold capacity associated with the kind of application requested, the call is admitted by allocating the requested resources in the normal channel. If both the guard channel and the normal channels are full, the CAC uses the bandwidth degradation algorithm to obtain the required bandwidth for the new service (HC or NC). If sufficient resources are then present, the call is accepted; otherwise the call is moved to a priority-based queue that follows a lowest-delay-first policy. Whenever any ongoing call or service completes, the freed channel is given to the queued call with the least tolerable delay. The steps of the call admission control algorithm are as follows:
Step 1: An incoming call arrives with the required parameters Bmn, Bmx, Brq, Dmx, K, P.
Step 2: The call type (new call or handoff call) is determined.
Step 3: If it is a handoff call and the GC is available, allocate bandwidth in the GC; else go to Step 4.
Step 4: The call class (RT-TR, RT-INTR, or NRT) is determined.
Step 5: The call is accepted if the resources are sufficient; otherwise the condition Dr < Dmx is verified. If true, proceed to the next step; else the call is rejected.
Step 6: If the condition is satisfied, the bandwidth level degradation (BLD) algorithm is applied, starting with the call having the minimum tolerated delay.
Step 7: The call is accepted if resources are made available; otherwise the condition Dr < Dmx is checked again.
Step 8: If no resources are made available before the call's maximum tolerable delay, the call is rejected.
A sketch of this decision logic is given below.
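The following simplified Python sketch follows Steps 1-8 above; the resource bookkeeping, the queue, and the BLD routine are placeholders, and the elapsed-delay handling is an assumption about how Dr is tracked.

```python
def admit_call(call, gc_free_bw, normal_free_bw, elapsed_delay, bld, queue):
    # Step 3: a handoff call is served from the guard channel when possible
    if call.call_type == "HC" and gc_free_bw >= call.b_req:
        return "accept (guard channel)"
    # Step 5: accept in the normal channels when resources suffice
    if normal_free_bw >= call.b_req:
        return "accept (normal channel)"
    # Steps 5-7: while Dr < Dmx, try bandwidth level degradation, else queue the call
    if elapsed_delay < call.d_max:
        freed = bld(call.b_req)          # Step 6: degrade ongoing calls
        if freed >= call.b_req:
            return "accept (after degradation)"
        queue.append(call)               # served lowest-tolerable-delay first
        return "queued"
    # Step 8: the deadline has passed
    return "reject"
```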
52.5.1 Dynamic Guard Channel Allocation
The main aim is to handle all handoff calls through dynamic guard channel allocation while providing good quality of service. The guard channel allocation is computed dynamically from the history, from the real-time and nonreal-time handoff call dropping probabilities, and from the current traffic load.
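The exact computation is not specified here; the following rule is only an illustrative assumption of how the reserved guard bandwidth could be adapted from the measured dropping probability and the current load.

```python
def update_guard_bandwidth(bgc, measured_cdp, target_cdp, load,
                           step=1, bgc_min=0, bgc_max=10):
    """Illustrative adaptation only; thresholds and bounds are assumptions."""
    if measured_cdp > target_cdp:                 # handoffs dropped too often
        bgc = min(bgc + step, bgc_max)
    elif measured_cdp < 0.5 * target_cdp and load < 0.8:
        bgc = max(bgc - step, bgc_min)            # release reservation under light load
    return bgc
```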
52.5.2 Bandwidth Level Degradation Algorithm
The bandwidth level degradation algorithm decreases the bandwidth of selected ongoing calls, and the freed bandwidth is used to handle the new incoming call. If the freed bandwidth is at least the bandwidth required by the new call, the call is accepted. The BLD procedure is activated when the available bandwidth is insufficient to serve the new incoming call. In the BLD approach, all active calls are ordered in decreasing order of bandwidth, and the procedure first reduces the resources of the calls belonging to the class group that holds the most resources. If the freed bandwidth is greater than or equal to the bandwidth needed by the new call, the call is accepted and the freed bandwidth is allocated to it; if the freed resources do not meet the bandwidth needed, the next level of call sets is used, and the procedure continues over all levels. No call in any set is degraded below Bmn or Brq: a call is removed from a set when its resources reach the Brq level while a new call is being accepted, and when its resources reach the Bmn level while a handoff call is being handled.
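A hedged Python sketch of this procedure is given below; each active call is assumed to expose its currently allocated bandwidth and its Brq/Bmn floors.

```python
def bandwidth_level_degradation(active_calls, needed_bw, admitting_handoff):
    """Free bandwidth by degrading ongoing calls, largest allocations first."""
    floor_attr = "b_min" if admitting_handoff else "b_req"   # HC may degrade calls further
    freed = 0
    for call in sorted(active_calls, key=lambda c: c.allocated, reverse=True):
        spare = call.allocated - getattr(call, floor_attr)    # bandwidth this call can give up
        if spare <= 0:
            continue
        take = min(spare, needed_bw - freed)
        call.allocated -= take
        freed += take
        if freed >= needed_bw:                                # enough has been released
            break
    return freed
```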
52.6 Simulation Parameters
A 3GPP-LTE network and an IEEE 802.11a wireless LAN were considered in a heterogeneous wireless network environment with full radio coverage to test the proposed call admission control method. MATLAB is used to simulate and implement the proposed call admission control mechanism. Three different service types of traffic class are considered: VoIP (RT-INTR), video streaming (RT-TR), and data on demand (NRT). A Poisson process is used to generate call requests randomly with a mean arrival rate of λ (calls/s), the channel holding times are exponentially distributed, and the random walk model is used as the mobility model. The simulation parameters and their values are given in Table 52.1.
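A minimal sketch of the traffic generation assumed above (Poisson arrivals with exponentially distributed holding times) is shown below; the numeric values are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def generate_calls(rate_per_s, mean_hold_s, sim_time_s):
    """Return (arrival time, holding time) pairs for one traffic class."""
    arrivals, t = [], 0.0
    while True:
        t += rng.exponential(1.0 / rate_per_s)   # exponential inter-arrival times
        if t > sim_time_s:
            break
        arrivals.append((t, rng.exponential(mean_hold_s)))
    return arrivals

voip_calls = generate_calls(rate_per_s=0.5, mean_hold_s=180, sim_time_s=1800)
```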
52.6.1 Parameter Values See Table 52.1.
Table 52.1 Simulation parameters for the proposed CAC system

Parameter                                 Value
Bandwidth                                 20 MHz (100 RBs, 180 kHz per RB)
WLAN data rate                            100 Mbps
VoIP (Bmx, Brq, Bmin, Dmx)                1 RB, 1 RB, 1 RB, 300 ms
Video streaming (Bmx, Brq, Bmin, Dmx)     2 RB, 1 RB, 1 RB, 1500 ms
Data on demand (Bmx, Brq, Bmin, Dmx)      3 RB, 2 RB, 1 RB, 10,000 ms
Average connection holding time           3, 5, 9 min (VoIP, video, DoD)
Data simulation time                      30 min
52.7 Results
The simulation results of the proposed CAC scheme are compared with the resource block reservation algorithm [23] and with the BLD scheme [18]. Figures 52.1 and 52.2 show that the handoff call dropping probability and the new call blocking probability decrease compared with the resource block reservation algorithm [23] and the BLD algorithm [18]. The simulation analysis shows that the proposed CAC scheme has better resource utilization during peak loads than the other schemes. For handoff calls in real-time nontolerant systems, the proposed system reduces the call dropping probability by 2.1% compared with the bandwidth degradation algorithm and by 13% compared with the resource block reservation algorithm. It also reduces the call blocking probability by 2.5% compared with the bandwidth degradation algorithm and by 11% compared with the resource block reservation algorithm.
Fig. 52.1 Call dropping probability versus call arrival rate for handoff calls
Fig. 52.2 Call blocking probability versus call arrival rate for new calls
References 1. Tho, C.: Evaluation and implementation of a mobile handover in Fairsle, January 1995 2. Yuan, R., Biswas, K., Raychaudhuri, D.: A signaling and control architecture for mobility support in wireless ATM networks. Mob. Netw. Appl. 1, 287–298 (1996) 3. Khyol, B.A., Cox, D.C.: Signalling alternatives in a wireless ATM network. IEEE J. Sel. Areas Commun. 35–49 (1997) 4. Wong, Chan, Leung: Performance evaluation of path optimization schemes for inters-witch handoff in wireless ATM networks. In: Proceedings of IEEE Mobicom-98, pp. 242–252. Dallas, TX (1998) 5. Salah, K., Drakoponlos, E.: A two-phase inter-switch handoff scheme for wireless ATM networks. In: IEEE Proceedings of 7th International Conference on Computer Communication and Networks, LA, USA (1998) 6. Naylon, J., Gilmurray, D., Porter, J., Hopper, A.: Low-latency handover in a wireless ATM LAN. IEEE J. Sel. Areas Commun. 909–921 (1998) 7. Hac, A., Peng, J.: A two-phase combined QoS-based handoff scheme in a wireless ATM network. Int. J. Netw. Manag. 11(5), 309–330 (2001) 8. Gueroui, A., Boumerdassi, S.: A handover optimization scheme in cellular networks. Rapport Scientifique CEDRIC No. 390 (2002) 9. Siddiqui, F., Zeadally, S.: Mobility management across hybrid wireless networks: trends and challenges. Comput. Commun. 1363–1385 (2006) 10. Ma, L.: A class of Geom/Geom/1 discrete-time queueing system with negative customers. Int. J. Nonlinear Sci. 275–280 (2008) 11. Tawil, R., Demergian, J., Pujolle, G.: A trusted handoff decision scheme for the next generation wireless networks. IJCSNS Int. J. Comput. Sci. Netw. Sec. 8(6) (2008) 12. Akki, C.B., Chadchan, S.M.: The survey of handoff issues in wireless ATM networks. Int. J. Nonlinear Sci. 7, 189–200 (2009) 13. Ojesanmi, O.A., Ojesanmi, A., Makinde, O.: Development of prioritized handoff scheme for congestion control in multimedia wireless network. In: Proceedings of the World Congress on Engineering, vol. I, London, UK (2009) 14. Alagu, S., Meyyappan, T.: Analysis of handoff schemes in wireless mobile network. IJCES Int. J. Comput. Eng. Sci. 1 (2011). ISSN 2250–3439
15. Alagu, S., Meyyappan: Efficient utilization of channels using dynamic guard channel allocation with channel borrowing strategy in handoffs (2012) 16. Alagu, S., Meyyappan: An efficient call admission control scheme for handling handoffs in wireless mobile networks. Int. J. ADHOC Netw. Syst. 29–46 (2012) 17. Ni, Z., Shan, H., Shen, W., Wang, J., Huang, A., Wang, X.: Dynamic channel allocationbased call admission control in cognitive radio networks. In: IEEE International Conference on Wireless Communications and Signal Processing, Hangzhou (2013) 18. Omheni, N., Gharsallah, A., Zarai, F., Obaidat, M.S.: Call admission control and adaptive bandwidth management approach for HWNs. In: IEEE Global Communications Conference, TX, USA (2014) 19. Sun, Y., Li, M., Tang, L.: A dynamic channel allocation scheme based on handoff reserving and new call queuing. In: 2011 International Conference on CASE, Singapore. IEEE 20. Ramesh, T.K., Giriraja, C.V.: Study of reassignment strategy in dynamic channel allocation scheme. In: 2016 3rd International Conference on Signal Processing and Integrated Networks. IEEE 21. Hanapi, Z.M., Abdullah, A., Muhammed, A.: An adaptive call admission control with bandwidth reservation for downlink LTE networks. IEEE Access 10986–10994 (2017) 22. Khdhir, R., Mnif, K., Belghith, A.: Kamoun, L.: An efficient call admission control scheme for LTE and LTE—a networks. In: 2016 International Symposium on Networks, Computers and Communications (ISNCC), Tunisia. IEEE 23. Zarai, F., Ali, K.B., Obaidat, M.S., Kamoun, L.: Adaptive call admission control in 3GPP LTE networks. Int. J. Commun. Syst. (2012)
Chapter 53
Artificial Intelligence Framework for Skin Cancer Detection
K. Mohana Lakshmi and Suneetha Rikhari
Abstract Skin cancer has become a matter of serious concern and is responsible for millions of deaths. The objective of this work is to identify new harmful skin cancer cases at an early stage so that the death rate can be lowered compared to the existing rate. Several conventional approaches have been implemented with machine learning-based algorithms, but they fail to give maximum accuracy and specificity. An advanced deep learning-based probabilistic neural network (PNN) classification mechanism is proposed in this work. Discrete wavelet transform (DWT)-based low-level features are extracted, a grey level co-occurrence matrix (GLCM) is developed for texture feature extraction, and k-means clustering is proposed for segmentation. The PNN, which emulates a biological neural scheme, is used for classification to achieve maximum efficiency.
K. Mohana Lakshmi (B)
CMR Technical Campus, Hyderabad, Telangana, India
S. Rikhari
Mody University of Science and Technology, Lakshmangarh, Rajasthan, India

53.1 Introduction
In recent days, skin cancer has become one of the most common types of cancer in several countries and worldwide. There are various types of skin cancer, basically divided into benign and malignant; of these, melanoma is the most aggressive when compared with non-malignant types. Skin cancer is a malignant tumour that grows in the skin cells, and it is rare in children. A skin cancer detection system is used to recognize the cancer at an accurate level. Most of the presently employed anti-cancer agents do not significantly differentiate between normal and cancerous cells [1]. The skin is the largest organ and covers the entire body; its thickness differs, being greater on the soles of the feet and the palms of the hands. The skin plays an important role in maintaining the internal systems and protecting the body; it also produces vitamin D and provides sensation, temperature regulation, and insulation.
Fig. 53.1 Skin structure
The skin is divided into three layers: the epidermis (top layer), the dermis (middle layer), and the hypodermis (bottom layer). There are two major sorts of skin cancer, namely non-melanoma and melanoma [2]. Melanoma is one of the most dangerous skin cancers and can be fatal if not treated; if it is detected in the early stages, it is highly curable, yet advanced melanoma is deadly, so early detection and treatment of skin cancer can minimize morbidity. Digital image processing methods are widely considered and accepted in the medical field [3]. An automatic image processing approach normally has several stages: initial analysis of the given image, proper segmentation, then feature extraction, selection of the needed features, and lesion recognition. The segmentation process is especially significant since it affects the precision of the subsequent steps [4]. The prototype skin cancer detection system pinpoints melanoma through this procedure and provides efficient classification and extraction of pigmented skin lesions such as melanoma. This approach is very useful for timely treatment and diagnosis of skin cancer, may also help in rural areas, and its main aim is to reduce cost and time (Fig. 53.1).
53.2 Related Work
Suspicious skin lesions are often biopsied, which is unpleasant for the patient, and obtaining diagnostic results is slow; additionally, the rate of unnecessary biopsies is very high. Hence, a skin cancer image processing system can be used to recognize the texture of the skin or tissue from symptoms, signs, and the results of various kinds of diagnostic processes. The skin cancer images are analysed by pre-processing, segmentation, and classification. For pre-processing of the image, the CLAHE algorithm is used together with the advanced technique of bilateral filtering, and a recursive C-means algorithm provides the primary segmentation result.
In Li et al. [5], a three-step algorithm was suggested in which the data are transformed using non-linear fuzzy techniques to expand the information available from the current dataset. Next, principal component analysis (PCA) is applied to the transformed dataset to extract the best-separating characteristics. Finally, the transformed data with the favourable characteristics are used as input to an SVM on the WBCDD data, and the performance of the approach was 96.35%.
A review of the numerous image segmentation techniques that help analyse skin lesions from images was presented by Oliveira et al. in 2016 [6]. In several algorithms, thresholding is used extensively because of its simplicity; examples include Otsu thresholding as discussed in [7] (2013), type-2 fuzzy logic introduced by Yuksel and Borlu in 2009 [8], and the Renyi entropy technique by Beuren et al. in 2012 [9], which divides the input image into different regions.
An alternative method for melanoma recognition from dermoscopy images, based on the extraction of appearance and colour-shade characteristics, was presented by Barata [10]; the main disadvantage of the scheme is the difficulty of building a real-time analysis application from the captured images and operations.
Sadeghi et al. [11] proposed a graph-based technique to classify and visualize pigment networks; they validated the technique by evaluating its ability to organize and depict real dermoscopic images.
Glaister et al. [12] proposed a novel multistage illumination modelling technique to correct illumination variation in photographs. This technique first estimates the illumination non-parametrically using Monte Carlo sampling, then a parametric polynomial surface model is used to obtain the final illumination estimate, and lastly the brightness-corrected image is acquired using the reflectance map computed from the estimated illumination.
An ANN classifier has been simulated in MATLAB for skin tumour recognition. The images contain hairs and other noise, which cause inaccuracies in classification; this noise is removed by filtering, and the filtering technique used there is median filtering [13].
The paper [14] proposes a simple yet efficient, integrated computer vision algorithm for identifying and analysing the initial stage of melanoma.
Fig. 53.2 Block diagram of skin cancer detection
53.3 Proposed Methodology
Training is performed on benign and malignant images; the dataset consists of 15 benign and 15 malignant images. These images are used to train the PNN network model with GLCM, statistical, and texture features.
53.3.1 Pre-processing
Firstly, pre-processing is done, in which noise is removed and the image is enhanced, i.e., colour and contrast enhancement. A Gaussian filter is used for pre-processing. A skin image has two parts, i.e., the lesion part (affected area) and the normal part (unaffected area). This stage removes noise from the image: when noise is present, pixels show intensities different from their true values (Fig. 53.2).
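The experiments here use MATLAB; the following OpenCV sketch is only an equivalent illustration of the Gaussian smoothing step, with an assumed kernel size and sigma.

```python
import cv2

def preprocess(image_bgr, sigma=1.5):
    """Gaussian smoothing to suppress noise before segmentation."""
    return cv2.GaussianBlur(image_bgr, (5, 5), sigma)
```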
53.3.2 Segmentation
Using k-means clustering, the image is divided into multiple clusters or sub-blocks, each with well-defined boundaries. The number of clusters is first selected, and the initial cluster centres are chosen randomly, so a centre may fall in any area or block. Each pixel (object) is then assigned to the closest cluster centre, and the centres are recalculated from the members of each cluster. Assignments are based on the smallest distance, separating the affected and unaffected parts, and the process repeats until the assignments no longer change and the segmented output is produced. Feature extraction is then important for the learning stage, since the classifier relies on the properties extracted from these regions (Fig. 53.3).
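A minimal sketch of this segmentation step is given below; clustering pixel colours into two groups (lesion vs. normal skin) is an illustrative choice.

```python
from sklearn.cluster import KMeans

def kmeans_segment(image_rgb, n_clusters=2):
    """Return a per-pixel cluster map obtained from the pixel colours."""
    h, w, c = image_rgb.shape
    pixels = image_rgb.reshape(-1, c).astype(float)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(pixels)
    return labels.reshape(h, w)
```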
Fig. 53.3 k-means clustering
53.3.3 Feature Extraction
In this mechanism, we use several kinds of feature extraction: GLCM features, DWT features, and statistical colour features. From the GLCM, we calculate contrast, homogeneity, correlation, and angular second moment. The DWT is used to extract low-level features, and the mean and standard deviation are calculated as statistical colour features.
Fig. 53.4 PNN layered architecture
GLCM is a texture analysis technique that examines textures by considering the spatial relationship of image pixels: the texture of an image is characterized by GLCM functions that compute how often pairs of pixels with specific values occur in a particular spatial relationship in the image.
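A hedged sketch of the feature computation is shown below; the GLCM distance and angle, the wavelet family, and the exact statistics kept are assumptions, since only the property names are listed above.

```python
import numpy as np
import pywt
from skimage.feature import graycomatrix, graycoprops

def extract_features(gray_uint8):
    """GLCM texture properties, DWT low-frequency statistics, and simple statistics."""
    glcm = graycomatrix(gray_uint8, distances=[1], angles=[0], levels=256,
                        symmetric=True, normed=True)
    texture = [graycoprops(glcm, p)[0, 0]
               for p in ("contrast", "homogeneity", "correlation", "ASM")]
    cA, _ = pywt.dwt2(gray_uint8.astype(float), "haar")   # approximation (low-level) sub-band
    stats = [gray_uint8.mean(), gray_uint8.std(), cA.mean(), cA.std()]
    return np.array(texture + stats)
```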
53.3.4 Classification
For classification we use a PNN, a type of neural network that has been used effectively in domains such as finance, medicine, engineering, geology, physics, and biology. The PNN is one of the methods developed by emulating a biological neural scheme, and its use for classification is divided into two stages: training and testing (Fig. 53.4).
The network has four layers, and the hidden layer evaluates the probabilities of the trained image patterns. The class nodes form two levels produced by the classification, corresponding to normality and abnormality of the skin, and from these class nodes we obtain the output layer. The abnormal class is further divided into benign and malignant cancer. Using these types, the input layer is tested to obtain the output layer. From the structure above, if a feature is present with the c1 label and maximum weight distribution, the output is benign; if the feature is present with the c2 label and minimum weight distribution, the output is malignant cancer; and if neither type is present, the output is normal skin.
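A minimal NumPy sketch of a PNN classifier is shown below; the Gaussian smoothing parameter sigma and the absence of feature normalization are assumptions, since these details are not given above.

```python
import numpy as np

class PNN:
    """Pattern layer: one Gaussian kernel per training sample; summation per class."""
    def __init__(self, sigma=1.0):
        self.sigma = sigma

    def fit(self, X, y):
        self.X, self.y = np.asarray(X, float), np.asarray(y)
        self.classes = np.unique(self.y)
        return self

    def predict(self, X_new):
        preds = []
        for x in np.asarray(X_new, float):
            d2 = np.sum((self.X - x) ** 2, axis=1)                 # squared distances
            k = np.exp(-d2 / (2.0 * self.sigma ** 2))               # pattern-layer outputs
            scores = [k[self.y == c].sum() for c in self.classes]   # summation layer
            preds.append(self.classes[int(np.argmax(scores))])      # decision layer
        return np.array(preds)
```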
53.4 Experimental Results
Experiments are done using MATLAB. ISIC is one of the biggest available collections of quality-controlled dermoscopic images. For the implementation of the proposed method, the spatial-domain and frequency-domain representations of 30 dermoscopic skin lesion images (15 benign and 15 malignant) have been obtained by applying rotations at different angles. Training images of each label have been used to train the PNN architecture for fifty epochs, whereas the remaining twenty percent is used for testing. The features extracted by the GLCM and DWT feature network are used to train the PNN classifier to classify the images into their respective classes. The efficiency of the model can be computed using various performance metrics. The proposed method can effectively detect the regions of skin cancer, indicating that the segmentation is done very effectively compared to the active contour approach, as shown in Fig. 53.5. Here, the TEST-1 and TEST-2 images are considered benign and the TEST-3 and TEST-4 images are considered malignant. For the malignant images, the segmentation accuracy is higher.
Fig. 53.5 Segmented output images of various methods (columns: input image, active contour segmented output, k-means segmented output)
For calculating the performance measures, the proposed methodology is implemented with two segmentation methods, active contour (AC) and k-means clustering. The metrics are compared with conventional methods in Tables 53.1 and 53.2; it is observed that the proposed clustering technique gives higher accuracy for both benign and malignant cases than the various SVM kernels of [14], namely the SVM linear kernel, RBF kernel, polynomial kernel, and fivefold cross-validation (Fig. 53.6).
Table 53.1 Performance comparison
Metric        Method        Test 1    Test 2    Test 3    Test 4
Accuracy      PNN-AC        0.9157    0.78099   0.85796   0.47765
              PNN-k-means   0.99985   0.99715   0.99999   0.99999
Sensitivity   PNN-AC        0.70588   0.90024   0.9166    0.83857
              PNN-k-means   0.99931   0.99198   1         1
F measure     PNN-AC        0.82207   0.68494   0.79395   0.44602
              PNN-k-means   0.99965   0.99381   0.99998   0.99998
Precision     PNN-AC        0.98404   0.55275   0.70023   0.30381
              PNN-k-means   1         0.99852   0.99997   0.99997
MCC           PNN-AC        0.7869    0.56857   0.70305   0.1835
              PNN-k-means   0.99956   0.99198   0.99998   0.99998
Dice          PNN-AC        0.82207   0.68494   0.79395   0.44602
              PNN-k-means   0.99965   0.99381   0.99998   0.99998
Jaccard       PNN-AC        0.69789   0.52085   0.65831   0.28702
              PNN-k-means   0.99931   0.9877    0.99997   0.99977
Specificity   PNN-AC        0.99564   0.73812   0.83298   0.35685
              PNN-k-means   1         0.99956   0.99999   0.99998
Table 53.2 Accuracy comparison

Method                                Test 1    Test 2    Test 3    Test 4
SVM-linear kernel [14]                0.4       0.40      0.7       0.7
SVM-RBF kernel [14]                   0.4       0.45      0.55      0.6
SVM-polynomial kernel [14]            0.4       0.3667    0.50      0.5667
SVM-fivefold cross validation [14]    0.6       0.55      0.60      0.45
Proposed PNN-AC                       0.9157    0.78099   0.85796   0.47765
Proposed PNN-k-means                  0.99985   0.99715   0.99999   0.99999
Fig. 53.6 Performance evaluation of conventional and proposed methods
53.5 Conclusion The proposed approach is used for the detection of skin cancer and the classification of dermoscopic images with a PNN deep learning-based approach. Here, Gaussian filters are utilized for pre-processing, which removes the unwanted noise elements or artefacts. The purpose of the ROI extraction is the detection of cancerous cells using segmentation, i.e., k-means. Then, a GLCM- and DWT-based method was developed for the extraction of statistical, colour and texture features from the segmented image, respectively. Finally, PNN was used to classify and identify the type of cancer. Thus, upon comparing with state-of-the-art works, we conclude that PNN is better than the conventional SVM method. In future, this method can be extended by implementing more network layers of the PNN for benign and malignant cancer images.
References 1. Revathi, V.L., Chithra, A.S.: A review on segmentation techniques in lesion on skin images. Int. Res. J. Eng. Technol. (IRJET) 2(9) (2015) 2. Hemalatha, R.J., Babu, B., Dhivya, A.J.A., Thamizhvani, T.R., Chandrasekaran, J.E.J.R.: A comparison of filtering and enhancement methods in malignant melanoma images. In: IEEE International Conference on Power, Control, Signals and Instrumentation Engineering (ICPCSI) (2017) 3. Ma, L., Staunton, R.C.: Analysis of the contour structural irregularity of skin lesions using wavelet decomposition. Pattern Recogn. 46, 98–106 (2013) 4. Ray, P.J., Priya, S., Kumar, T.A.: Nuclear segmentation for skin cancer diagnosis from histopathological images. In: IEEE Proceedings of 2015 Global Conference on Communication Technologies (GCCT) (2015)
588
K. Mohana Lakshmi and S. Rikhari
5. Li, D.-C., Liu, C.-W., Hu, S.C.: A fuzzy-based data transformation for feature extraction to increase classification performance with small medical data sets. Artif. Intell. Med. 52, 45–52 (2011) 6. Oliveira, R.B., Filho, M.E., Ma, Z., Papa, J.P., Pereira, A.S., Tavares, J.M.R.S.: Computational methods for the image segmentation of pigmented skin lesions: a review. Comput. Methods Prog. Biomed. 131, 127–141 (2016) 7. Scharcanski, J., Celebi, M.E.: Computer vision techniques for the diagnosis of skin cancer. Springer, Berlin Heidelberg (2013) 8. Yuksel, M.E., Borlu, M.: Accurate segmentation of dermoscopic images by image thresholding based on type-2 fuzzy logic. IEEE Trans. Fuzzy Syst. 17, 976–982 (2009) 9. Beuren, A.T., Janasieivicz, R., Pinheiro, G., Grando, N., Facon, J.: Skin melanoma segmentation by morphological approach. In: International Conference on Advances in Computing, Communications and Informatics, pp. 972–978. ACM, Chennai (2012) 10. Barata, C., Ruela, M., Francisco, M., Mendonça, T., Marques, J.S.: Two systems for the detection of melanomas in dermoscopy images using texture and color features. Syst. J. IEEE 8, 965–979 (2014) 11. Sadeghi, M., Razmara, M., Ester, M., Lee, T.K., Atkins, M.S.: Graph-based pigment network detection in skin images. Proc. SPIE 7623–762312 (2010) 12. Glaister, J., Amelard, R., Wong, A., Clausi, D.: MSIM: multistage illumination modeling of dermatological photographs for illumination-corrected lesion on skinanalysis. IEEE Trans. Biomed. Eng. 60, 1873–1883 (2013) 13. Dumitrache, I., Sultana, A.E., Dogaru, R.: Automatic detection of skin melanoma from images using natural computing approaches. In: 2014 10th International Conference on Communications (COMM) (2014) 14. Vidya, M., Karki, M.V.: Skin cancer detection using machine learning techniques. In: 2020 IEEE International Conference on Electronics, Computing and Communication Technologies (CONECCT), pp. 1–5, Bangalore, India (2020)
Chapter 54
Telugu Text Summarization Using HS and GA Particle Swarm Optimization Algorithms M. Varaprasad Rao, A. V. Krishna Prasasd, A. Anusha, and K. Srujan Raju
Abstract For the extraction of summaries of individual Telugu documents, we propose in this study the use of the Particle Swarm Optimization (PSO) algorithm. The PSO technique is evaluated against the genetic algorithm (GA) and harmony search (HS) approaches. The proposed technique is examined using the Telugu NLP documents and the ROUGE tool. Experimental tests have demonstrated that the suggested approach achieves competitive and higher ROUGE values compared with conventional HS and GA techniques.
54.1 Introduction A summary is a description of the original text that retains its overall context [1]. An automatic summary is one created by a machine. Because of the time and cost required to understand multiple documents manually, several automated text summarization solutions have been built over the past decade. Automatic Text Summarization (ATS) has attracted both the academic community and the business world as a solution for reducing information overload and assisting consumers to analyze various documents to find documents of interest [2]. ATS is critical for both academic research and business people in training and analysis; an example is searching a phrase in the Google Search engine, which displays a list of results with snippets from the corresponding sources. Approaches to ATS can be divided into abstractive and extractive methods. An abstractive synthesis seeks to establish an understanding of the key concepts in a text using linguistic methods and then to communicate them in plain, understandable phrases [3, 4]. An extractive
technique selects the key sentences and lists them in a new text based on the statistical and linguistic features of the sentences [5]. A good summary should select the most important information in the document while keeping redundancy low [6]. The key issue with the extractive method is that it can create an incoherent synthesis because the semantic relations between the sentences are not considered; it is based on statistical characteristics alone [3]. Text summarization may be grouped into single- or multi-document summarization. In multi-document systems, the documents have to be linked to each other and a description is then produced based on these documents; in single-document systems, the overview is based on the single document only [7]. There are typically four major approaches to extractive synthesis: statistical [8], semantic [9], machine learning [10] and heuristic [11]. Harmony Search [14] and the Genetic Algorithm [15], two evolutionary technologies, have been suggested for ATS. We propose using the PSO algorithm in this research to summarize Telugu text from a single document. PSO is an intelligent searching algorithm, and here two evolutionary algorithms (GA and HS) are used as baselines against which the summaries generated by the proposed method are compared.
54.2 Literature Review This section examines the most important extraction techniques employed in studies of various languages, including Hindi, Telugu and Arabic. Al-Omour and Al-Taani's [12] hybrid Arabic ATS system is based on both a graph-based and a statistical methodology; the reported results indicated improvements over n-gram baselines in the summarization phase. In the ATS provided by Elghazaly and Gheith [13], the most relevant portions are extracted as phrased text; for grading and extracting relevant paragraphs, the cosine similarity metric is employed on vector representations, and accuracy is used to assess the summaries. However, their method is limited when applied to large-scale texts. To approach an optimum summary of a text, Jaradat [14] includes HS in the summarization process, utilizing the EASC corpus and ROUGE tools for evaluating the presented method; the results revealed that the presented technique outperformed other state-of-the-art approaches. A hybrid technique utilizing GA has been described by Jaradat and Al-Taani [15] for ATS; to determine the correctness of the suggested methodology, the technique is tested using the EASC corpus and the ROUGE assessment method, and the results demonstrate that several cutting-edge methods were surpassed by the suggested strategy. In assessing the efficacy of the numerous characteristics used to summarize Arabic text, Al-Zahrani et al. [27] utilized PSO; the PSO is trained on the Essex Arabic Summaries Corpus to identify the best combination of eight informative features that Arabic summarizers commonly utilize.
Input sentences are scored and evaluated, and heavily redundant sentences are eliminated, to produce an output summary depending on the characteristics selected in each PSO iteration and their corresponding weights. The output summary is then compared with the reference summary using a cosine-based fitness function. The findings of the trial show summarized texts focusing on every paragraph's opening phrase. For the automated extraction of single Arabic documents, Mahjoub [28] has suggested the use of the PSO method; experimental findings have shown that the proposed technique is successful in extracting document summaries. It is determined from the literature study on ATS that graphical approaches [12] ensure semantic connections between the phrases, thereby improving the flow of the extracted summaries. On the other hand, due to the usage of standard graph search techniques, these approaches suffer from the local optimum issue. It is also observed that PSO has not yet been utilized for extracting Telugu summaries. Moreover, as the existing methodologies do not account for the important information dispersed across phrases, their output summaries are inconsistent, whereas the semantic/similarity relation is considered in the suggested approach and the output summaries are consequently cohesive. The benefits of statistical and semantic strategies, together with the merits of the PSO algorithm, are combined in this research paper for the extraction of the more significant phrases from a single Telugu document. PSO, proposed by Eberhart and Kennedy in 1995, is inspired by the social behaviour of particles; each particle modifies its flight based on its own flying experience and that of its neighbours [16]. The particles explore the search zone, and the goal function is evaluated at each location. Each particle then calculates its movement by integrating the value of its present position with the values of the surrounding swarm members. Finally, the whole swarm will probably move towards an optimal solution [17]. A PSO-based classification of text was proposed by Binwahlan et al. [18]; these findings have not been widely utilized by researchers for extracting summaries from Arabic documents. M. Varaprasad Rao et al. [29, 30] proposed a statistical approach and a hybrid approach for the gist generation of Hindi news articles. B. Kavitha Rani et al. [31] proposed a Telugu text summarization using a deep learning methodology.
54.3 Proposed System—HS and GA PSO Technique 54.3.1 Text Pre-preparation This step contains segmentation, tokenization, stop-word removal, stemming and construction of the word-stem list of the document. In segmentation, the original text is separated into sentences; to split the text into sentences, special symbols and punctuation marks such as commas, question marks, etc. are removed, and a sequential number (ID) is given to each sentence. In tokenization, the sentences are split into words based on blanks and symbols. The stop-word list is a list of terms that are
omitted prior to automated language processing of texts. In this study, the Arabic stop-word list compiled by El-Khair [19] was utilized. We also utilized similar terminology from the Information Science Research Institute's (ISRI) Arabic stemmer [20]. We extracted distinct words to generate an ordered list of different terms for a particular document. A vector is constructed from all the phrases of the document in order to maintain the semantic score of each phrase. More exactly, we utilize the document list to generate vectors for the Inverse Document Frequency (IDF) and the Term Frequency (TF).
54.3.2 Calculating Informative Scores To compute the information score for each sentence, the structural elements of a sentence are utilized to rank the sentences of the source material. Each phrase carries heuristic values, and a score is assigned to each sentence by the statistical metrics [9, 14].

Similarity with the title (T): If a word of the title is included in a sentence, the sentence has great relevance to the document; the score is calculated as

T = (Number of title words in S) / (Number of words in the title)   (54.1)

Length of the sentence (L): The length of a sentence is determined by the number of words it contains. Longer phrases are considered more important than shorter ones, as shorter phrases may carry very little information. The sentence length score is

L = (Number of words in S) / (Number of words in the longest sentence)   (54.2)

Location of the sentence (Loc): The location indicates the value of a sentence, since authors frequently place key information at the beginning of the article; the first phrases are therefore expected to convey the most significant material. The location score is computed from the position Loc_S of the sentence in the document over the number N of sentences in that document:

Loc = 1 − Loc_S / N   (54.3)

TF and TF-IDF: TF refers to the number of times a term occurs in a sentence, and IDF reflects the number of documents in which the term occurs. The TF-IDF function balances the occurrences of a term between the local (sentence) and global (document) levels [21]. TF-IDF is calculated with formulae (54.4) to (54.7):

TF-IDF(t, S_i) = TF(t, S_i) × IDF(t)   (54.4)
IDF(t) = log(N / N_t)   (54.5)
TF-IDF(S_i) = Σ_{t=1}^{n} TF-IDF(t, S_i)   (54.6)
iSCORE(S_i) = T + L + Loc + TF-IDF(S_i)   (54.7)
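As a concrete illustration of Eqs. (54.1)–(54.7), a small Python sketch for computing the informative score of each sentence is given below; the whitespace tokenization and the helper names are assumptions made for this sketch only.

import math
from collections import Counter

def informative_scores(sentences, title):
    # Compute iSCORE (Eq. 54.7) = T + L + Loc + TF-IDF for each sentence.
    tokenized = [s.lower().split() for s in sentences]
    title_words = set(title.lower().split())
    n_sents = len(sentences)
    max_len = max(len(t) for t in tokenized)
    # Sentence frequency of each term, used for IDF (Eq. 54.5)
    df = Counter(w for t in tokenized for w in set(t))
    scores = []
    for idx, words in enumerate(tokenized):
        T = sum(1 for w in words if w in title_words) / max(len(title_words), 1)  # Eq. 54.1
        L = len(words) / max_len                                                  # Eq. 54.2
        Loc = 1 - idx / n_sents                                                   # Eq. 54.3
        tf = Counter(words)
        tfidf = sum(tf[w] * math.log(n_sents / df[w]) for w in tf)                # Eqs. 54.4-54.6
        scores.append(T + L + Loc + tfidf)                                        # Eq. 54.7
    return scores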
54.3.3 Calculating Semantic Scores Semantic values are numeric values that represent the amount of semantic familiarity between the sentences. The cosine similarity metric, built on the TF and IDF scores, is used to quantify the relationship among the sentences; the TF-IDF metric is thereby used to find the similarity between the sentences [22, 23], calculated by formulae (54.8) to (54.11):

TF(S_i) = (TF(t_1, S_i), TF(t_2, S_i), ..., TF(t_n, S_i))   (54.8)
IDF(S_i) = (IDF(t_1, S_i), IDF(t_2, S_i), ..., IDF(t_n, S_i))   (54.9)
TF-IDF(S_i) = (TF-IDF(t_1, S_i), TF-IDF(t_2, S_i), ..., TF-IDF(t_n, S_i))   (54.10)
Sim-Score(S_i, S_j) = Σ_{t=1}^{n} TF-IDF(t, S_i) × TF-IDF(t, S_j) / sqrt( Σ_{t=1}^{n} (TF-IDF(t, S_i))² × Σ_{t=1}^{n} (TF-IDF(t, S_j))² )   (54.11)
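The cosine similarity of Eq. (54.11) between the TF-IDF vectors of two sentences can be sketched as follows, assuming the vectors are stored as term-to-weight dictionaries.

import math

def sim_score(vec_i, vec_j):
    # Cosine similarity between two TF-IDF dictionaries (Eq. 54.11)
    common = set(vec_i) & set(vec_j)
    num = sum(vec_i[t] * vec_j[t] for t in common)
    den = math.sqrt(sum(v * v for v in vec_i.values())) * \
          math.sqrt(sum(v * v for v in vec_j.values()))
    return num / den if den else 0.0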
54.3.4 Design the Solution Graph The DAG represents the semantic relationships between statements and documents [24]. The diagram consists of two sets (V, E) where V is the vertical set and E is the edge set. The vertex of the graph is assumed to be each sentence in the document. There is a resemblance between the two phrases in the weight of each rim. We must add each phrase to the vertex list in the document. If the Si sentence happened before
S_j in the text and the semantic value is non-negative, then the edge from S_i to S_j is added to the edge list; this is done for every such pair of sentences [25].
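A minimal sketch of the solution-graph construction described above (one vertex per sentence, an edge S_i → S_j when S_i precedes S_j and the similarity is positive) is shown below; the adjacency-list representation is an assumption for illustration.

def build_sentence_graph(sent_vectors, sim_score):
    # Directed acyclic graph over sentences; edge weight = semantic similarity.
    # sent_vectors : list of TF-IDF dictionaries, one per sentence, in document order
    edges = {i: [] for i in range(len(sent_vectors))}
    for i in range(len(sent_vectors)):
        for j in range(i + 1, len(sent_vectors)):   # S_i occurs before S_j
            w = sim_score(sent_vectors[i], sent_vectors[j])
            if w > 0:                               # keep only positive similarity
                edges[i].append((j, w))
    return edges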
54.3.5 Experimentation Proposed Algorithm The PSO is a population-based optimizer made up of a swarm of particles. The method is initialized with a collection of random solutions and then searches iteratively for the optimum; it discovers the optimal solution by the particles swarming after the best particle. Each solution in PSO is termed a particle in the search space. All particles take suitable values that are assessed by optimizing the fitness function over the particles. PSO consists of the following steps [17]:

1. Set population: identify each particle's fitness in the population using an appropriate fitness function.
2. Discover the best solution: find the current result with the best fitness value.
3. Compute the velocity and position: using the best position, current speed and neighbour information, each particle updates its speed and position. For the update of velocity and location, formulae (54.12) and (54.13) are used, respectively:

v_i = v_i + φ_1 ⊗ (p_i − x_i) + φ_2 ⊗ (p_g − x_i)   (54.12)
x_i = x_i + v_i   (54.13)

where
x_i : the ith particle position in the swarm
v_i : the ith particle velocity
p_i : the ith particle local best position
p_g : global best position
⊗ : element-wise multiplication of the vectors
φ_1 = c_1 r_1 and φ_2 = c_2 r_2, where r_1, r_2 are two vectors of random numbers selected uniformly from [0, 1], and c_1, c_2 are the personal and social acceleration coefficients.

4. Stop criteria check: when the stop criteria are satisfied, the particles return the best result of step 2. The fitness is calculated by combining the similarity between sentences with the informative scores [25]:

F_α = Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} Sim-Score(S_i, S_j) × iSCORE(S_i)   (54.14)
where F_α represents the fitness value of the αth candidate summary and n represents the size of the candidate summary. The final decision is based on the PSO, which selects the most readable sentences. Each particle improves its velocity and position in the PSO algorithm using its global optimum, its speed and certain neighbouring knowledge. The flight of the particles is directed by their velocity, and the speed and location of the particles are adjusted according to Eqs. 54.12 and 54.13, respectively. The population becomes homogeneous after a certain number of repetitions, and for some stages the best score does not change. The summary is produced from the particle with the best fitness value.
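A compact sketch of the PSO updates of Eqs. (54.12) and (54.13), driven by a generic fitness function such as Eq. (54.14), is given below; the swarm size, iteration count, random seed and the continuous encoding of candidate summaries are illustrative assumptions.

import numpy as np

def pso_optimize(fitness, dim, n_particles=20, iters=150, c1=2.0, c2=2.0):
    # Particle swarm optimization of `fitness` over [0, 1]^dim (maximization).
    rng = np.random.default_rng(1)
    x = rng.random((n_particles, dim))          # particle positions
    v = np.zeros((n_particles, dim))            # particle velocities
    p_best = x.copy()
    p_val = np.array([fitness(p) for p in x])
    g_best = p_best[p_val.argmax()].copy()
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        v = v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)   # Eq. 54.12
        x = np.clip(x + v, 0.0, 1.0)                              # Eq. 54.13
        vals = np.array([fitness(p) for p in x])
        improved = vals > p_val
        p_best[improved], p_val[improved] = x[improved], vals[improved]
        g_best = p_best[p_val.argmax()].copy()
    return g_best, p_val.max()

# Usage: pso_optimize(candidate_summary_fitness, dim=number_of_sentences)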
54.4 Results and Discussion To build a precise procedure, the Telugu NLP corpus [26] and the ROUGE toolkit [7] are used. Two major techniques, HS [14] and GA [15], are taken as baselines and compared with the outcomes. Table 54.1 shows the PSO experiment parameter settings; the same parameter values are utilized for both ROUGE-1 and ROUGE-2 measurements. The comparative results for the proposed strategy on ROUGE-1 and ROUGE-2, together with the two key evolutionary approaches, are shown in Table 54.2. The table shows that the proposed PSO-based method achieves competitive accuracy and F-measure values (Fig. 54.1). Table 54.1 Setting PSO values
Parameter               Value
Iterations              150
Size of particles       20
Personal constant (c1)  2
Social constant (c2)    2
Table 54.2 Results summary
Method   ROUGE     Recall   Precision   F-measure
HS       ROUGE-1   0.5668   0.5698      0.5523
PSO      ROUGE-1   0.4544   0.5894      0.5432
GA       ROUGE-1   0.5813   0.5678      0.5436
HS       ROUGE-2   0.4890   0.4693      0.4569
PSO      ROUGE-2   0.4583   0.4843      0.4558
GA       ROUGE-2   0.4770   0.4590      0.4422
Fig. 54.1 The summary of the results for the PSO method
54.5 Conclusion In this paper, we proposed a single-document PSO method for Telugu text summarization. The proposed solution combines informative scoring with semantic scoring to obtain the optimum document summary. The proposed methodology shows greater F-measure results than the evolutionary GA and HS approaches because the new heuristic TF-IDF weight function is used in the informative calculation; this function enhances the summary's coherence and continuity. The PSO approach demonstrated higher efficiency than the other evolutionary approaches; PSO only requires 100 iterations, while HS requires a total of 100000 iterations. The PSO method is guided by the positions of the particles and converges quickly to the optimum solution, whereas neither HS nor GA is position-based. In the future, we suggest improving the informative scoring by incorporating structural features of other terms, such as reference words, which could improve the generated summary.
References 1. Nagwani, N.K., Verma, S.: A frequent term and semantic similarity based single document text summarization algorithm. Int. J. Comput. Appl. 17(2), 36–40 (2011) 2. Mani, I., Maybury, M.T. (eds.): Advances in Automatic Text Summarization, vol. 293. MIT Press, Cambridge, MA (1999) 3. J.C. Cheung, Comparing Abstractive and Extractive Summarization of Evaluative Text: Controversiality and Content Selection, Doctoral dissertation, University of British Columbia (2008) 4. E. Lloret, M. Palomar, A gradual combination of features for building automatic summarisation systems. Proceedings of the International Conference on Text, Speech and Dialogue (Springer, Berlin Heidelberg, 2009), pp. 16–23 5. Gupta, V., Lehal, G.S.: A survey of text summarization extractive techniques. J. Emerg. Technol. Web Intell. 2(3), 258–268 (2010) 6. K. Ganapathiraju, J. Carbonell, Y. Yang, The relevance of cluster size in MMR based summarizer: a report 11–742: Self-paced lab in Information Retrieval (2002) 7. C.Y. Lin, E. Hovy, Manual and automatic evaluation of summaries. In Proceedings of the ACL-02 Workshop on Automatic Summarization-Volume 4. (Association for Computational Linguistics, 2002), pp. 45–51
8. Gaikwad, D.K., Mahender, C.N.: A Review Paper on Text Summarization. Int. J. Adv. Res. Comput. Commun. Eng. 5(3), 154–160 (2016) 9. Kumar, Y.J., Salim, N.: Automatic multi-document summarization approaches. J. Comput. Sci. 8(1), 133–140 (2012) 10. Brin, S., Motwani, R., Page, L., Winograd, T.: What can you do with a web in your pocket? IEEE Data Eng. Bull. 21(2), 37–47 (1998) 11. Alguliev, R., Aliguliyev, R.: An evolutionary algorithm for extractive text summarization. Intell. Inf. Manag. 1, 128–138 (2009) 12. M.M. Al-Omour, Extractive-based Arabic Text Summarization Approach, Master thesis, (Yarmouk University, Irbid, Jordan, 2012) 13. Ibrahim, A., Elghazaly, T., Gheith, M.: A novel Arabic text summarization model based on rhetorical structure theory and vector space model. Int. J. Comput. Linguistics Natl. Lang. Process. 2(8), 480–485 (2013) 14. Y.A. Jaradat, Arabic Single-Document Text Summarization Based on Harmony Search. Master Thesis, (Yarmouk University, Irbid, Jordan, 2015) 15. Y.A. Jaradat, A.T. Al-Taani, Hybrid-based Arabic single-document text summarization approach using a genetic algorithm. 7th International Conference on in Information and Communication Systems (ICICS2016), 5–7 April, Irbid, Jordan (2016) 16. R.C. Eberhart, J. Kennedy, A new optimizer using particle swarm theory. Proc. Sixth International Symposium on Micro Machine and Human Science (Nagoya, Japan). IEEE Serv. Center Piscataway, 1, 39–43, (1995) 17. Poli, R., Kennedy, J., Blackwell, T.: Particle swarm optimization. Swarm Intell. 1(1), 33–57 (2007) 18. M.S. Binwahlan, N. Salim, L. Suanmali, Swarm based text summarization. International Association of Computer Science and Information Technology—Spring Conference, IACSIT-SC (Singapore, 2009), pp. 145–150 19. El-Khair, I.A.: Effects of stop words elimination for Arabic information retrieval: a comparative study. Int. J. Comput. Inf. Sci. 4(3), 119–133 (2006) 20. L.S. Larkey, L. Ballesteros, M.E. Connell, Improving stemming for Arabic information retrieval: light stemming and co-occurrence analysis. In Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (2002), pp. 275–282 21. Alguliev, R.M., Alyguliev, R.M.: Summarization of text-based documents with a determination of latent topical sections and information-rich sentences. Autom. Control. Comput. Sci. 41(3), 132–140 (2007) 22. Qazvinian, V., Hassanabadi, L.S., Halavati, R.: Summarising text with a genetic algorithmbased sentence extraction. Int. J. Knowl. Manage. Stud. 2(4), 426–444 (2008) 23. C.D. Manning, P. Raghavan, H. Schtze, Introduction to Information Retrieval: Cambridge University Press (2008) 24. M. Mitra, A. Singhal, C. Buckley, ‘Automatic text summarization by paragraph extraction’. Proceedings of the ACL’97/EACL’97 Workshop on Intelligent Scalable Text Summarization (1997), pp. 39–46. 25. Nandhini, K., Balasundaram, S.R.: Use of genetic algorithm for cohesive summary extraction to assist in reading difficulties. Appl. Comput. Intell. Soft Comput. 2013(3), 1–11 (2013) 26. Anusha, Telugu NLP Indic script dataset; https://www.kaggle.com/sudalairajkumar/telugunlp?select=telugu_news. Accessed on 20 Dec 2020 27. Al-Zahrani, A., Mathkour, H., Abdalla, H.: PSO-Based feature selection for arabic text summarization. J. Univ. Comput. Sci. 21(11), 1454–1469 (2015) 28. A.Y. Mahjoub, Text Summarization Using Particle Swarm Optimization Algorithm. 
Master Thesis, College of Graduate Studies, Sudan University of Science and Technology, Sudan (2015) 29. Rao, M.V., Vardhan, B.V.: A mixed approach for Hindi news article gist generation. Int. J. Adv. Res. Comput. Sci. Softw. Eng. 3(8), 900–906 (2013)
30. Varaprasad Rao, M., et al.: A statistical model for gist generation a case study on Hindi news article. Int. J. Data Min. Knowl. Manage. Process (IJDKP) 3(5), 15–23 (2013) 31. Rani, B.K., et al.: Telugu text summarization using LSTM deep learning. Pensee J. 51(1), 355–363 (2021)
Chapter 55
A Dynamic Model and Algorithm for Real-Time Traffic Management M. N. V. M. Sai Teja, N. Lasya Sree, L. Harshitha, P. Venkata Bhargav, Nuthanakanti Bhaskar, and V. Dinesh Reddy
Abstract This work portrays traffic congestion, which has been a persistent problem in many cities in India. The major problems that lead to traffic congestion in India are primarily associated with one or a combination of factors such as signal failures, inadequate law enforcement and relatively poor traffic management practices. Traffic congestion should be treated as a grave issue, as it significantly reduces freight vehicle speed, increases wait time at checkpoints and toll plazas, and causes an uncountable loss of productive man-hours spent in unnecessary journey time as well as physical and mental fatigue of humans. In addition, cars that are waiting in traffic jams contribute to 40% more pollution than those moving normally on the roads, by way of increased fuel wastage and therefore excessive carbon dioxide emissions, which also results in more frequent repairs and replacements. To avoid such unwarranted and multi-dimensional losses to mankind, we developed a technological solution, and our experiments on real-time data show that the proposed approach is able to reduce the waiting time and travelling time for the users.
55.1 Introduction How good does it feel when a doctor says that you brought the patient in on time! We live in a generation where reaching your destination on time is precious, but the journey is hectic due to several traffic jams and road blockages. For instance, when a patient is being transported to hospital, the ambulance has to cross several traffic sectors, and sometimes it takes more than 20 minutes just to change roads; for an ambulance carrying an emergency patient, those 20 minutes are the golden time to save that patient's life. Delays may also result in late arrival for employment, meetings and education, resulting in loss of business and other personal losses. The aim of our project is to reduce traffic congestion as quickly as possible and also to help emergency vehicles reach their destination in time. In the current scenario, the green signal is turned on for an equal time on two lanes without considering traffic density, thus increasing the waiting time of people in other lanes [6]. So, to overcome such problems due to traffic congestion, we come up with an algorithm which can reduce the waiting time by predicting the number of vehicles and changing the signal accordingly [7].
55.2 Related Work Researchers have been working on traffic congestion problems for decades. As part of our research, we found some interesting papers which dealt with the same problem and offered some manageable solutions. Ninad Lanke and Sheetal Koul proposed smart traffic management based on Radio Frequency Identification (RFID) [1]. They explained that this technology can be coupled with the existing infrastructure, which can lead to smart traffic management; it is cost-effective and the installation process is not time consuming. The system consists of an RFID controller and RFID tags. The RFID controller is divided into the controller core and RFID interrogators: the interrogator communicates with the tag to get signals/data, and the controller core receives the signal from the interrogator and performs read or write operations depending on the configuration. RFID tags are wireless devices that use radio-frequency electromagnetic fields to transfer data to the interrogator. These tags are classified as active and passive: active RFID tags work on batteries, whereas passive RFID tags depend on external sources. The tag is installed within the vehicle and stores information about the vehicle, and the controller is coupled with the existing system. In this way, the signal can count the number of vehicles that passed through it. Each traffic signal stores a threshold value for red and green. Based on a predefined minimum frequency, the controller identifies the vehicles passing by the signal; once a vehicle is identified, the traffic signal is changed dynamically [8]. Edward et al. solved the traffic congestion problem with the help of deep learning and computer vision [2]. Using deep learning, this problem was also solved by Zhao et al.
using a complex algorithm that deals with real traffic scenarios. Edward's research created a smart light model that can dynamically adjust the signal duration based on the current and predicted number of vehicles. They built a model using an Arduino as a traffic light, including a calculation function to adjust the traffic light duration and a deep learning algorithm to predict the count of vehicles in the next 30 minutes. They used three different algorithms to build this model, as shown below.

Algorithm 1: Pseudocode to Control a Traffic Light (Arduino)
Begin
  Put on the red signal
  Set the yellow signal duration
  Set the green signal duration
  For each of the signal lights do
    Put on the yellow signal
    Delay for yellow duration
    Turn out the red signal
    Turn out the yellow signal
    Activate the green signal
    Delay for green duration
    Turn out the green signal
    Activate the yellow signal
    Delay for yellow duration
    Turn out the yellow signal
    Turn on the red signal
End.

Algorithm 2: Computer Vision Pseudocode for Vehicle Detection
Begin
  Read dataset for training the model
  Train the model using the dataset
  Capture image from a traffic camera
  Change captured image colour space
  Count detected vehicles
End.

Algorithm 3: Deep Learning Pseudocode for the Number of Vehicles Prediction
Begin
  Initialize model for deep learning
  Read train data
  For every 30 minutes
    Read live traffic data
    Train the model with live traffic data
    Get number of car prediction
End
55.2.1 Video Proctoring A smart camera consists of sensors, a dedicated processor in each unit, and communication interfaces. We monitor the traffic using these cameras. After the recording is
55 A Dynamic Model and Algorithm for Real-Time Traffic …
603
completed, some images are captured from the video which helps us to create statistical data based on the number of vehicles passed through the signal, the average speed of the vehicles, and lane occupancy. But there are some disadvantages like it is not cost-efficient, this idea may not work during some weather conditions (rains or fog, etc), besides, night time monitoring becomes difficult due to inadequate sources of light [5].
55.2.2 Inductive Loop Detectors These detectors use an electrically conducting loop installed in the pavement to detect the vehicle and send a signal to the system. Then the system changes its light and allows the traffic to pass. If we use inductive loop detectors triangular-, diamond-, or square-shaped outline is visible on the pavement [1].
55.2.3 Infrared Sensors These are the type of sensors that we often use at traffic signals. These sensors are installed overhead to recognize the presence of a vehicle. There are active and Passive IR sensors. Active IR sensors produce some low-level infrared energy to detect the presence of a vehicle. When there is a vehicle in that region, then the sensor sends the pulse to the traffic signal to change the light. Passive IR sensors, don’t actually emit the IR energy. So, they collect the energy received from the vehicles to detect them and send the pulse to the traffic signal to change the light. The disadvantage of these sensors is the maintenance is difficult [5].
55.2.4 Microwave Sensors This sensor is also a mounted overhead sensor and its work is similar to IR sensors but microwave sensors use electromagnetic energy to detect the traffic. These are less expensive compared to IR sensors. Both IR and Microwave sensors are easy to maintain compared to Inductive loops.
55.2.5 Microcontrollers To overcome the traffic congestion problem, we can also use microcontrollers. They handle the traffic based on the traffic flow/density at that particular time. Based on vehicle count, microcontrollers decide the time delay in the traffic lights which allows
us to control the traffic. This delay is classified into three categories—Low, Medium, High range and this delay is predefined based on the vehicles count [5].
55.2.6 Through Personal Computers In this method, PCs are connected to the microcontroller. By this connection, we can save the data related to vehicle count according to the interval on the PC. This data transfer can be done through a telephone network. This data can be used to analyze the traffic control in particular intervals and predefined light delay values are set [5].
55.3 Proposed Work Our objective is to reduce the waiting time and find the best possible time to reach the destination. Our proposed algorithm first takes the arrival time, waiting time, cycle time and time taken to travel to the next junction, and stores that information in lists. We then reduce or increase the cycle time by considering the new arrival time of the vehicle (calculated by removing the waiting time) and the given cycle time: with t = time[i] // cycle[i], we add to the cycle time if time[i] % (cycle[i] * t) is less than 100 (so that the vehicle crosses the junction at its time of arrival), and we reduce the cycle time if it is greater than 100. Pseudocode for the proposed approach is given in Algorithm 4.

Algorithm
Step 1: Import the dataset and needed libraries.
Step 2: Consider the data for the time taken to the next junction, the cycle time and the arrival time from the dataset, then append the values to empty lists.
Step 3: Slice the above-created lists according to the junction names.
Step 4: Calculate the new cycle time and, taking it as reference, adjust the signal1, signal2, signal3 and signal4 values respectively. First, we calculate the new cycle time by considering the arrival time without waiting time and the time taken to the next junction. After calculating the new cycle time, we adjust the signal times accordingly and update the values in the dataset from the new signal values.

Algorithm 4: Pseudocode for Calculating the New Cycle Time
If time[i] is equal to 0
  Append the values to the list
If time[i] // cycle[i] is equal to 0
  Append the difference of time[i] and cycle[i] to the list
If time[i] // cycle[i] is greater than or equal to -1
  t = time[i] / cycle[i]
  check = (time[i]) % (cycle[i] * t)
  If (check is less than or equal to signal[i])
    Append cycle value to the list
  Else If (check is less than 100)
    Increase cycle time by 10 sec and check the condition again
  Else
    Decrease cycle time by 10 sec and check the condition again
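A Python sketch of the cycle-time adjustment of Algorithm 4 is given below; the 100-second threshold and the 10-second step follow the pseudocode above, while the function name, the list-based inputs and the safety guard are assumptions added for illustration.

def adjust_cycle_times(arrival, cycle, green):
    # arrival : list of arrival times (waiting time removed) at each junction
    # cycle   : list of current signal cycle times per junction
    # green   : list of green-phase durations per junction
    new_cycle = []
    for i in range(len(arrival)):
        c = cycle[i]
        if arrival[i] == 0:
            new_cycle.append(c)
            continue
        if arrival[i] // c == 0:
            new_cycle.append(c - arrival[i])   # difference, as in the pseudocode
            continue
        while True:
            t = max(arrival[i] // c, 1)
            check = arrival[i] % (c * t)
            if check <= green[i]:              # arrival falls inside the green window
                new_cycle.append(c)
                break
            c = c + 10 if check < 100 else c - 10   # adjust in 10-second steps
            if c <= 0:                         # safety guard for this sketch
                new_cycle.append(cycle[i])
                break
    return new_cycle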
55.4 Results We have collected the data pertaining to the traffic flow within the two major cities namely Guntur and Vijayawada. In Guntur city, we have collected data from two different junctions namely Guntur market and Brundhavan Gardens while at Vijayawada the data has been collected at four different junctions namely Benz circle, Seetharamapuram, Guru Nanak Colony and Ramavarappadu Junction. At each junction, four different signal points named as signal1, signal2, signal3 and signal4 were considered and the time taken by a vehicle to reach the other junction is also considered. By taking the above-created dataset as input, we proposed an algorithm which reduces the waiting time. The first dataset is as follows. We can see that it has a waiting time of 250 seconds and a total travel time of 2130 seconds from Guntur market to Ramavarappadu (Table 55.1). By using our algorithm, we adjusted the signal timings and reduced the waiting time and given the best possible result to reach the destination faster as shown in the below table. The travel time has reduced from 2130 seconds to 1660 seconds and the waiting time is zero seconds (Table 55.2). Further, we have generated another dataset for the same route as shown in Table 55.3. Here we can observe a total travel time of 2470 seconds and a waiting time of 150 seconds. Our aim is to reduce the waiting time by using our algorithm. We have reduced the travel time to 2270 seconds and we can see the result in Table 55.4.
1880
2130
250
660
Ramavarappadu
Total travel time
130
500
Guru Nanak Colony
0
60
40
240
20
360
120
Brundavan Garden
0
Seetarampuram
0
Guntur market
Waiting time
Benz circle
Time
Junction name
50
25
25
40
30
20
Signal 1 green
5
5
5
5
5
5
Signal 1 yellow
40
35
20
50
30
20
Signal 2 green
Table 55.1 Traffic flow from Guntur market to Ramavarappadu
5
5
5
5
5
5
Signal 2 yellow
50
25
20
40
30
20
Signal 3 green
5
5
5
5
5
5
Signal 3 yellow
40
35
15
50
30
20
Signal 4 green
5
5
5
5
5
5
Signal 4 yellow
200
140
100
200
140
100
Cycle
100
100
120
100
120
Time taken for next junction
1660
1660
0
540
Ramavarappadu
Total travel time
0
440
Guru Nanak Colony
0
0
0
220
0
340
120
Brundavan Garden
0
Seetarampuram
0
Guntur market
Waiting time
Benz circle
Time
Junction name
50
25
25
40
30
20
Signal 1 green
5
5
5
5
5
5
Signal 1 yellow
30
35
20
50
30
20
Signal 2 green
Table 55.2 Updated traffic flow from Guntur market to Ramavarappadu
5
5
5
5
5
5
Signal 2 yellow
50
25
20
40
20
20
Signal 3 green
5
5
5
5
5
5
Signal 3 yellow
30
35
25
50
20
20
Signal 4 green
5
5
5
5
5
5
Signal 4 yellow
180
140
110
200
120
100
Cycle
100
100
120
100
120
Time taken for next junction
2470
2470
150
790
Ramavarappadu
Total travel time
90
640
Guru Nanak Colony
40
0
0
320
20
540
180
Brundavan Garden
0
Seetarampuram
0
Guntur market
Waiting time
Benz circle
Time
Junction name
50
35
40
25
45
30
Signal 1 green
5
5
5
5
5
5
Signal 1 yellow
50
35
40
25
45
30
Signal 2 green
Table 55.3 Traffic flow from Guntur market to Ramavarappadu
5
5
5
5
5
5
Signal 2 yellow
50
35
40
25
45
30
Signal 3 green
5
5
5
5
5
5
Signal 3 yellow
50
35
40
25
45
30
Signal 4 green
5
5
5
5
5
5
Signal 4 yellow
220
160
180
120
200
140
Cycle
150
100
180
120
180
Time taken for next junction
2270
2270
0
730
Ramavarappadu
Total travel time
0
580
Guru Nanak Colony
0
0
0
300
0
480
180
Brundavan Garden
0
Seetarampuram
0
Guntur market
Waiting time
Benz circle
Time
Junction name
50
35
40
25
50
30
Signal 1 green
5
5
5
5
5
5
Signal 1 yellow
55
35
40
35
60
30
Signal 2 green
Table 55.4 Updated traffic flow from Guntur market to Ramavarappadu
5
5
5
5
5
5
Signal 2 yellow
50
35
40
25
60
30
Signal 3 green
5
5
5
5
5
5
Signal 3 yellow
55
35
20
35
40
30
Signal 4 green
5
5
5
5
5
5
Signal 4 yellow
230
160
160
140
230
140
Cycle
150
100
180
120
180
Time taken for next junction
55.5 Conclusion Our work presents a real-time traffic information collection and monitoring system to solve the problem of monitoring and controlling road vehicles through traffic signals in real time by predicting the number of vehicles. The lifestyle of people in metro cities, where there is a large volume of population, is equally affected by various application and service systems; consequently, most cities are in the process of transforming themselves into smart cities by adopting automated systems in all sectors. So, considering the above, we have developed an algorithm which reduces the traffic rate and the waiting time to reach the destination in the best possible way.
55.6 Future Scope The logic and methodology demonstrated in the present report can further be extended to four signals, where the traffic rate is checked at the respective signal and that particular signal time is reduced so that the traffic rate reduces. This four-signal checking can be pursued by the authors in the forthcoming semester and would also improve the overall efficiency of the system.
References 1. Ninad, L., Sheetal, K.: Smart traffic management. Int. J. Comput. Appl. 75, 19–22 (2013) 2. D. Garg, M. Chli, G. Vogiatzis, Deep reinforcement learning for autonomous traffic light control. In 3rd IEEE International Conference on Intelligent Transportation Engineering (ICITE), (IEEE, 2018), pp. 214–218 3. M. Attila, V. Simon, Novel congestion propagation modeling algorithm for Smart Management System. Pervasive Mobile Comput. 73 (2021) 4. Ata, A., Khan, M.A., Abbas, S., Ahmad, G., Fatima, A.: Modelling smart road traffic congestion control system using machine learning techniques. Neural Network World 29(2), 99–110 (2019) 5. A. Basavaraju, S. Doddigarla, N. Naidu, S. Malgatti, Vehicle density sensor system to manage traffic. Int. J. Res. Eng. Technol. 2319–1163 (2014) 6. S. Jacob, A. Rekh, G. Manoj, J. Paul, Smart traffic management system with real time analysis. Int. J. Eng. Technol. (UAE), 7, 348–351 7. A.S. Baker et al., “smart traffic monitoring system”, https://www.slideshare.net/alivxlvie/smarttraffic-monitoring-system-report 8. A. Saikar, M. Parulekar, A. Badve, S. Thakkar, A. Deshmukh, TrafficIntel: smart traffic management for smart cities. In International Conference on Emerging Trends and Innovation in ICT, IEEE (2017), pp. 46–50
Chapter 56
A Selection-Based Framework for Building and Validating Regression Model for COVID-19 Information Management Pravinkumar B. Landge, Dhiraj V. Bhise, Kapil Kumar Nagwanshi, Raj Kumar Patra, and Santosh R. Durugkar Abstract The world is facing a pandemic situation, i.e., COVID-19, and all researchers and scientists are working hard to overcome this situation. Being human, it is everyone's duty to take care of their family and society. In this case study, an attempt has been made to find the relation between various variables by dividing them into independent and dependent variables. A dataset is selected for analysis purposes which consists of variables like location (countries across the globe), date, new cases, new deaths, total deaths, smoking habits, washing habits, diabetes prevalence, etc. The approach is to identify the impact of the independent variables on the dependent variable by applying regression modelling. Hence, the proposed case study is based on a selection-based framework for validating regression modelling for COVID-19 data analysis. Regression modelling is applied, and a few representations are shown to understand the current pandemic situation across the world. In the end, using regression modelling, intercept and coefficient values for different approaches (using different variables) are computed.
56.1 Introduction The proposed case study presents an approach to analyse the impact of various factors affecting the current pandemic situation (COVID-19). A large dataset covering the globe has been processed and analysed to suggest preventive measures to be taken. In this case study, the regression method is discussed and applied to identify the impact of independent variables on dependent variables. Demographic analysis, environmental analysis and other living habits are the core parts of this study, which suggests how people are facing this pandemic situation; the study also reveals country-wise data analysis. Data mining is the extraction of data and of relationships among variables using different tools. An attempt has been made to analyse the large dataset to draw certain (accurate) conclusions from uncertain relationships using regression modelling.
56.2 Review of Literature Using data mining methods, it is easy to explore new relationships existing in a dataset. To solve research problems and other statistical issues, one can use data mining with the different approaches used in machine learning. One can use these data mining methods for scholastic purposes such as education, science and gaining meaningful data from repositories, and in recent computing paradigms such as ubiquitous computing and big-data computing. Data mining is widely used in decision-making systems to analyse records using algorithms like J48, K-means, KNN, etc. A detailed investigation can easily be made using data mining methods so that one can identify the relationships among different variables. Using data mining, a cluster of similar data can be prepared so that distinguishing records becomes simpler, which enhances the retrieval process. Required records can be efficiently retrieved by applying pre-processing data mining techniques, performance measures and feature selection [2, 3, 11]. The classification method can be used to classify the records available in datasets: if a user has a large dataset, relevant facts can be identified and stored in one group, which is known as classifying the records. A recent trend is machine learning (ML), where a user wishes to automate systems; systems are being trained, and regular innovations are made available in this field. Correlational, statistical, Bayesian and regression approaches are popular approaches to classification. If one wishes to correlate variables (records), the correlational approach identifies the relationship among those variables. In regression analysis, one needs to analyse the independent and dependent variables, and the regression method explores how the independent variable impacts the dependent one. In addition to the available approaches, there is a simple approach, i.e., regression based on assumption theory [7].
Many times one has to deal with big data, where large datasets are processed to retrieve meaningful relations. Momentum has been gained in this field as large datasets are being added and processed every day. In the future, while dealing with socio-economic systems and processing the data, one has to maintain privacy so that information cannot be disclosed to unauthorized persons. Hence, to maintain privacy, one also has to deal with the accuracy of the data; maintaining privacy in data processing requires the application of optimization methods so that the system's performance does not deviate and accuracy is achieved [5]. To mine large volumes of data, parallelizing data mining algorithms is necessary to increase throughput. This can be implemented as a system that distributes data across various nodes (other systems) to reduce delay and optimize the results and throughput, similar to load balancing in distributed systems. One can also check the processing capabilities of the processing nodes as data is added incrementally [4]. In the recent era, big, heterogeneous data needs to be processed, and to do so one has to identify the factors/variables affecting others. Data mining and ML methods can be used in geological data processing to increase prediction accuracy; such a prediction model is based on a neural network with input and output neurons and an implemented training function [10]. It is important to identify the factors affecting the price of products and influencing market demands. In this type of predictor model, historical data is important; with the number of training samples v = 1, 2, ..., s, the input vector of the network can be visualized as Eq. 56.1 and the desired output vector as Eq. 56.2:

I_u = (i_u1, i_u2, ..., i_us)   (56.1)
O_u = (O_u1, O_u2, ..., O_un)   (56.2)
Fuzzy regression can be categorized into non-interval and interval types; the intention is to avoid errors and increase regression accuracy. In [13], Yabuuchi compared these two types (non-interval and interval) and focused on the least squares method. Using these approaches, one can identify different possibilities by processing the data. An interval-type regression model was proposed by the author to eliminate the problems encountered in the fuzzy regression model. For fuzzy regression with data (X_i, Y_i), where i = 1, 2, 3, ..., n and X_i = [X_i1, X_i2, ..., X_ip], the model can be written as Eq. 56.3:

Y_i* = A_0 + A_1 · X_i1 + ... + A_p · X_ip ⊆ Y_i   (56.3)
In the above equation, X_i is an independent variable and Y_i is the dependent variable. Using these functions, one can predict the possibility of the regression output. The possibility may vary; if one considers the sample Y_i, it may vary from k = Y_i^C to Y_i, and the regression output illustrates the interval [Y_i^C, Y_i]. Hence, the possibility of the system, P(Y_i), is calculated by Eq. 56.4:
P(Y_i) = ∫_k^{Y_i} y_i u_i dy_i   (56.4)
If outliers are mixed in during processing of the data in a fuzzy regression model, the model's shape gets distorted and defining the center of the data distribution becomes difficult. Many authors have proposed regression models; for instance, the paper [6] introduces pattern-aided regression models to process the data. Prediction models usually cannot produce accurate predictions if the data is collected from diverse sources: when a model deals with diverse data, defining predictor rules and adjusting the relationships become difficult, and hence these models cannot produce very accurate results. Hence, the authors proposed a new approach in that paper which characterizes the logical data before defining the model; they used an entropy-based method to partition the variables into disjoint intervals to measure the purity of the sets, and a binning method is also discussed to split the intervals. In the paper [1], the authors discussed regression modelling applied to students' confusion in enrolling in the various courses offered by universities; linear regression, SVM models, decision tree algorithms, etc., are discussed and compared. This approach helps students in selecting the right course offered by the universities; the algorithm considers various parameters like the student's scores in the GRE and TOEFL, the ranking of the institute, etc. The conclusion is to obtain a close estimate of the student's chance of admission [8, 12]. The paper [9] addresses many limitations of regression, like automated regression, transformations and identifying interactions between various variables; the authors proposed a framework for the qualitative analysis and quantification of the relevance obtained between various variables by presenting an interactive workflow for feature subset selection.
56.3 Methodology 56.3.1 Regression Modeling We are all dealing with machine learning (ML), which can play a major role in diverse fields. One can use the methods suggested in ML for decision making, recommender systems, etc. Those methods, such as regression, classification and clustering, can be applied so that systems work without being explicitly programmed. The line of best fit is the line that best expresses the relationship between the data points; let us see how to find the best-fit line in linear regression. For example, one can plot the relationship between temperature and humidity and predict how temperature affects humidity (Fig. 56.1). In this case study, the COVID-19 dataset (ourworldindata.org/coronavirus-sourcedata) is used for analysis purposes. It has many attributes like location, date, population, total deaths, whether the patient is a smoker or not, whether the patient is diabetic or not, total cases found, and so on. An attempt has been made to identify the relationship between the independent (smoking habits, diabetes, etc.) and dependent
Fig. 56.1 An example of regression modeling
Fig. 56.2 Total deaths due to COVID-19
variables (total deaths). Simple and multiple regressions are applied on the attributes of the given dataset to draw conclusions. We believe these regression methods will help spread awareness about COVID-19 and support taking the necessary actions (Fig. 56.2). Figure 56.3 shows a country-wise analysis of the pandemic situation. The main concern is to protect people with lower immunity, especially those older than 65-70 and those who are diabetic or smokers.
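As a rough illustration of this setup, the sketch below fits the simple and multiple regressions with scikit-learn; the column names (total_deaths, diabetes_prevalence, male_smokers) are assumed to follow the public OWID CSV, and the file name is a placeholder.

```python
# Hedged sketch of the simple and multiple regression described above.
import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.read_csv("owid-covid-data.csv")  # ourworldindata.org/coronavirus-source-data
df = df[["total_deaths", "diabetes_prevalence", "male_smokers"]].dropna()

# Simple regression: total deaths against diabetes prevalence.
simple = LinearRegression().fit(df[["diabetes_prevalence"]], df["total_deaths"])

# Multiple regression: total deaths against diabetes prevalence and smoking.
features = df[["diabetes_prevalence", "male_smokers"]]
multiple = LinearRegression().fit(features, df["total_deaths"])

# .score() returns R Square, the statistic reported in the results below.
print("R Square (simple):", simple.score(df[["diabetes_prevalence"]], df["total_deaths"]))
print("R Square (multiple):", multiple.score(features, df["total_deaths"]))
```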
Fig. 56.3 Multi-attribute analysis
Fig. 56.4 Confirmed and deaths by location
Figure 56.4 clearly shows that the USA is the country worst affected by the pandemic, followed by the other countries shown in the figure. Emphasis is placed on saving lives by giving preliminary medical treatment immediately. As discussed earlier, the COVID-19 dataset is used for the analysis, and among the various independent variables we have selected 'diabetic' and 'smokers'. Our main concern is to identify the causes behind 'total deaths', so we focus on the 'diabetic' and 'smoking' independent variables. Running the regression yields the important values 'Multiple R' and 'R Square'.
Table 56.1 Regression analysis (deaths vs. smokers)
Regression Statistics: Multiple R = 0.510418453; R Square = 0.260526997; Adjusted R Square = 0.255450855; Standard Error = 6686.532207; Observations = 198
ANOVA (Regression): df = 1, SS = 3103119632, MS = 3.1E+09, F = 69.40593949, Significance F = 1.37524E-14
ANOVA (Residual): df = 197, SS = 8807813452, MS = 44709713
Fig. 56.5 Regression analysis (deaths vs. smokers)
Fig. 56.6 Function of the model
The R Square obtained in the above table is 26.05%, which should be taken seriously: as the number of new cases increases, R Square is likely to increase as well. This means that people who are diabetic or smokers especially need to take care in this pandemic situation (Fig. 56.5). Based on the compiled result, our choice of independent and dependent variables is justified (Fig. 56.6).
Fig. 56.7 Worldwide COVID data as on 17 July
We have Y_p = f(θ, x), where θ represents the model parameters and Y_p is the value we predict, given our model. The fit parameters (the term 'fit' will come up again when we code our machine learning models) are the aspects of the model that we estimate from the data. Each observation x relates to some outcome variable y, and the more such pairs we have, the better the machine learning model can learn the parameters that define the relationship between the features x and the outcomes y. Once we have trained on past data, we can score new observations. In Fig. 56.7, the worldwide COVID-19 trend is shown: as the number of cases increases, total deaths also increase, which is why countries across the globe started lockdowns to contain the pandemic. As already discussed, the independent variables 'diabetic prevalence' and 'smoking habits' impact the dependent variable 'total deaths'. In Fig. 56.8, the worldwide data is plotted for the two variables 'diabetic prevalence' and 'total deaths' (Fig. 56.9).
56.4 Results
In Table 56.2 and Fig. 56.10, the regression is evaluated on diabetes and smokers, the two independent variables. In Table 56.1, the R Square and Multiple R obtained are 26.05% and 51%, with the dependent variable 'total deaths' computed against the 'diabetic' and 'smoking habits' variables. In Table 56.2, R Square is 2.04% and Multiple R is 14.28%, with the regression applied to 'diabetic prevalence' and 'smoking habits'.
Fig. 56.8 Worldwide diabetes prevalence
Fig. 56.9 Worldwide diabetes prevalence
Table 56.2 Regression statistics for diabetes and smoking
Regression Statistics: Multiple R = 0.142884411; R Square = 0.020415955; Adjusted R Square = 0.010852747; Standard Error = 42412.90267; Observations = 209
ANOVA (Regression): df = 2, SS = 7.76E+09, MS = 3880291348, F = 2.15709, Significance F = 0.118266667
ANOVA (Residual): df = 207, SS = 3.72E+11, MS = 1798854313
Fig. 56.10 COVID in diabetes and smoking
Fig. 56.11 Male and female smokers and diabetes prevalence
Now, we divide the data into independent and dependent variables, e.g., 'diabetes_prevalence' and 'total_deaths', and split it into train and test sets. After a few more steps (reshaping the train and test sets, training the algorithm, and retrieving the intercept and coefficient), the values obtained are as follows (Fig. 56.11):
Intercept and Coefficient
regressor.intercept_ = 7.34091782
regressor.coef_ = 1.93983881e-05
In the next phase, the data is divided into 'male_smokers' and 'total_deaths', and the intercept and coefficient values obtained are as follows:
Intercept and Coefficient (different variables)
regressor.intercept_ = 22.20282162
regressor.coef_ = 3.78252668e-05
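A minimal sketch of this train/test workflow is given below; the split ratio, random seed, and column names are assumptions, and the attribute names intercept_ and coef_ follow scikit-learn.

```python
# Hedged sketch of the split-train-inspect steps described above.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

df = pd.read_csv("owid-covid-data.csv")
df = df[["diabetes_prevalence", "total_deaths"]].dropna()

X = df[["diabetes_prevalence"]].values   # independent variable
y = df["total_deaths"].values            # dependent variable

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

regressor = LinearRegression().fit(X_train, y_train)
print("intercept_:", regressor.intercept_)
print("coef_:", regressor.coef_)

# Repeating the same steps with 'male_smokers' as X gives the second
# intercept/coefficient pair quoted in the text.
```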
Figure 56.12 presents a histogram of deaths during the different lockdown phases. In a histogram, each bar groups values into a range (a binning method is applied), so the chart displays the shape and spread of continuous sample data, as shown in the figure.
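Such a histogram can be produced with a short matplotlib sketch like the one below; the column name and bin count are illustrative assumptions.

```python
# Minimal sketch of the binned histogram described above.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("owid-covid-data.csv")
deaths = df["new_deaths"].dropna()

plt.hist(deaths, bins=20)                 # each bar groups values into one bin
plt.xlabel("New deaths per day")
plt.ylabel("Frequency")
plt.title("Deaths during lockdown phases")
plt.show()
```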
Fig. 56.12 Male and female smokers and diabetes prevalence
56.5 Conclusion
We have analyzed the data over many attributes, especially the 'diabetic concern' and 'smoking habits' of males and females. The whole world is suffering in this pandemic situation, and this small study uses regression modeling to find the relevance between independent and dependent variables. We obtained R Square and Multiple R values using the regression method. R Square is a statistical measure of how close the data is to the fitted regression line; our attempt to fit the selected data succeeded, as the values are positive in both approaches using 'diabetic concerns' and 'smoking habits'. An in-depth analysis has been presented through various charts for ease of understanding.
References 1. Acharya, M.S., Armaan, A., Antony, A.S.: A comparison of regression models for prediction of graduate admissions. In: 2019 International Conference on Computational Intelligence in Data Science (ICCIDS), pp. 1–5 (2019) 2. Anoopkumar, M., Rahman, A.M.J.M.Z.: A review on data mining techniques and factors used in educational data mining to predict student amelioration. In: 2016 International Conference on Data Mining and Advanced Computing (SAPIENCE), pp. 122–133 (2016). https://doi.org/10.1109/SAPIENCE.2016.7684113 3. Basheer, S., Nagwanshi, K.K., Bhatia, S., Dubey, S., Sinha, G.R.: FESD: an approach for biometric human footprint matching using fuzzy ensemble learning. IEEE Access 9, 26641– 26663 (2021) 4. Challa, J.S., Goyal, P., Nikhil, S., Mangla, A., Balasubramaniam, S.S., Goyal, N.: DD-Rtree: a dynamic distributed data structure for efficient data distribution among cluster nodes for spatial data mining algorithms. In: 2016 IEEE International Conference on Big Data (Big Data), pp. 27–36 (2016)
5. Cuzzocrea, A.: Privacy-preserving big data stream mining: Opportunities, challenges, directions. In: 2017 IEEE International Conference on Data Mining Workshops (ICDMW), Los Alamitos, CA, USA, IEEE Computer Society (Nov 2017), pp. 992–994 6. Dong, G., Taslimitehrani, V.: Pattern-aided regression modeling and prediction model analysis. IEEE Trans. Knowl. Data Eng. 27(9), 2452–2465 (2015) 7. Jalota, C., Agrawal, R.: Analysis of educational data mining using classification. In: 2019 International Conference on Machine Learning, Big Data, Cloud and Parallel Computing (COMITCon), pp. 243–247 (2019) 8. Mahmood, M.R., Patra, R.K., Raja, R., Sinha, G.R.: A novel approach for weather prediction using forecasting analysis and data mining techniques. In: Saini, H.S., Singh, R.K., Kumar, G., Rather, G., Santhi, K. (eds.) Innovations in Electronics and Communication Engineering, pp. 479–489. Springer, Singapore (2019) 9. Mühlbacher, T., Piringer, H.: A partition-based framework for building and validating regression models. IEEE Trans. Vis. Comput. Graph. 19(12), 1962–1971 (2013) 10. Ming, J., Zhang, L., Sun, J., Zhang, Y.: Analysis models of technical and economic data of mining enterprises based on big data analysis. In: 2018 IEEE 3rd International Conference on Cloud Computing and Big Data Analysis (ICCCBDA), pp. 224–227 (2018) 11. Nagwanshi, K.K., Dubey, S.: Estimation of centroid, ensembles, anomaly and association for the uniqueness of human footprint features. Int. J. Intell. Eng. Inf. 8(2), 117–137 (2020) 12. Sahu, A.K., Sharma, S., Tanveer, M., Raja, R.: Internet of things attack detection using hybrid deep learning model. Comput. Commun. 176, 146–154 (2021) 13. Yabuuchi, Y.: Evaluation of an Interval-Type Model on Fuzzy Regression. In: 2018 International Conference on Unconventional Modelling, Simulation and Optimization-Soft Computing and Meta Heuristics-UMSO, pp. 1–5. IEEE (2018)
Chapter 57
Fingerprint Liveliness Detection to Mitigate Spoofing Attacks Using Generative Networks in Biometric System
Akanksha Gupta, Rajesh Mahule, Raj Kumar Patra, Krishan Gopal Saraswat, and Mozammil Akhtar
Abstract Today, fingerprint detection systems are used widely, from corporate offices to military camps. They are secure, fast, and accurate, but they are vulnerable to spoof attacks. The primary aim of a fingerprint reader is to provide definitive and exact user authentication while remaining secure and maintaining user confidence. The most prominent vulnerability of fingerprint spoof detection systems is poor generalization across spoof classes: whenever a spoof made from an unknown material is presented to the detection system, the error rate increases up to threefold. Our goal is to improve the accuracy and performance of fingerprint detection systems when confronted with spoofs of unknown materials, thereby decreasing the cross-material error rate and addressing the poor generalization of fingerprint spoof detectors using generative and other convolutional networks. We use one-class classification and minutiae-extraction approaches built on DCGANs and MobileNets, respectively; these networks assign a spoof score to a given fingerprint, and our results show an accuracy 5-10% higher than previous binary spoof classifiers.
57.1 Introduction Today, fingerprint biometrics are taking place of traditional IDs, used in forensics, border crossing security, mobile authentication, payment transactions, ATM machines, laptops and places where user authentication is required [1]. Bolts can be stolen, safes can be broken, and passwords can also be guessed sooner or later. So how do we save the things that we value? Here then, we use biometrics say fingerprint scan, A. Gupta · R. Mahule · K. G. Saraswat · M. Akhtar Department of IT, School of Studies in Engineering and Technology, Guru Ghasidas Vishwavidyalaya, Bilaspur, Chhattisgarh, India R. K. Patra (B) CMR Technical Campus, Hyderabad, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 V. Bhateja et al. (eds.), Smart Intelligent Computing and Applications, Volume 1, Smart Innovation, Systems and Technologies 282, https://doi.org/10.1007/978-981-16-9669-5_57
retinal scans, iris scans, or face scans, as they cannot easily be forged. Fingerprints in particular, the tiny friction ridges on the ends of human fingers and thumbs, make it easier to hold things [2]. What makes fingerprints such a good way of telling people apart is that they are virtually unique: these ridge patterns are unique to each person and help recognize an individual within a whole population. Fingerprints are inherent to individuals and can neither be stolen nor lost, which makes them highly precise and reliable. Additionally, the availability of low-cost fingerprint readers combined with easy integration has led to the widespread deployment of fingerprint biometrics in a variety of organizations. To counter spoofing, automated fingerprint detectors were trained to distinguish bonafide (live) fingerprints [3, 4] from known spoof materials, but they remained vulnerable to spoofs made with materials not seen in training. To solve this, many deep convolutional networks using the whole image or minutiae-based local patches have been proposed [5].
57.1.1 Steps of Fingerprint Scanning
Biometric systems usually use one of three characteristics [6, 7]: fingerprint, iris, or face. Fingerprint scanning involves the following steps (a sketch of the Gabor filtering step is given after this list):
1. Fingerprint template formulation (minutiae extraction).
2. Scanning into an 8-bit monochrome grayscale image, then converting it into a 0/1 scalar map.
3. The image is processed into thin, sharp ridge lines and then passed through a Gabor filter.
4. A pixel matrix of four times the ridge size is created, and the ridges, valleys, and bifurcations are marked.
5. Fingerprint matching algorithm: 1:1 matching or 1:n matching.
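The sketch below illustrates step 3 with OpenCV; it is not the authors' code, and the threshold and Gabor parameters are assumptions chosen for illustration.

```python
# Hedged sketch: binarize a grayscale fingerprint and enhance ridges with a
# small Gabor filter bank over several orientations.
import cv2
import numpy as np

img = cv2.imread("fingerprint.png", cv2.IMREAD_GRAYSCALE)          # 8-bit grayscale
_, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

enhanced = np.zeros_like(img, dtype=np.float32)
for theta in np.arange(0, np.pi, np.pi / 8):                        # 8 ridge orientations
    kernel = cv2.getGaborKernel(ksize=(21, 21), sigma=4.0, theta=theta,
                                lambd=10.0, gamma=0.5, psi=0)
    response = cv2.filter2D(binary.astype(np.float32), cv2.CV_32F, kernel)
    enhanced = np.maximum(enhanced, response)                       # keep strongest response
```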
57.1.2 Different Spoof Attacks In general, a spoof attack is providing false data to gain illegitimate access to the system. Spoof artifacts are provided to the sensor to fool the system. These artificial objects imitate biological and behavioral characteristics. There are a number of unknown and known spoof materials and techniques for the forgery of data or other resources [8].
57.1.3 Example of Spoof Attacks In smart phones, unlocking and accessing with fingerprint has become very common, and hackers are gaining access by scanning and printing fingerprints by using accompany inks and printing on paper cut accessing mobile phones [9].
At MSU, researchers have developed a wearable finger that mimics human skin in its optical, mechanical, and electrical properties. Similarly, there are various other cloning materials such as Play-Doh, dental molding, and 3D-printed fingerprints. An example is shown in Fig. 57.1.
Fig. 57.1 Example spoof artifacts
57.2 Related Works
This work builds on the following related papers. Learning to Discover Cross-Domain Relations with Generative Adversarial Networks [10] covers the basic concepts of generative adversarial networks (GANs), explaining the layers required to build one and how to construct the network for most image-analysis tasks. DCGANs for image super-resolution, de-noising and de-blurring [11] covers deep convolutional generative adversarial networks (DCGANs) applied to tasks such as super-resolution, de-noising, and deconvolution; DCGANs can learn from large datasets and automatically add high-frequency features to images, which traditional models cannot. Our problem is to mitigate fingerprint spoof attacks across all spoof materials available in the dataset. For this we first need a dataset, and we use one available on Kaggle [12]. The objective is to train a spoof detector on live fingerprints so that, once the concept of "live" has been learned, spoofs of any material can be rejected. We accomplish this by training one or more generative adversarial networks (GANs) on live fingerprint images captured with the open-source, dual-camera RaspiReader fingerprint reader.
57.3 Approaches
There are two methodologies. In the first, we use GANs and treat spoof detection as a one-class classification problem [13]. In the second, we extract minutiae-based patches and assign them a spoof score using the MobileNet architecture [14]. Our aim is to implement these approaches and obtain better results than binary CNN classifiers and CNN models that use whole fingerprint images or randomly chosen patches.
57.3.1 Adversarial Liveness Detection: One-Class Classifier
To counter spoofing, automated fingerprint detectors have been trained to distinguish live, bonafide fingerprints from known spoof materials, but they remain vulnerable to spoofs made with materials not seen in training. To solve this, one-class classification was proposed: the aim is to teach the detector only to recognize live fingerprints so that spoofs of any other material can be rejected. We accomplish this by training a GAN over our dataset and treating spoof detection as a one-class classification problem [15, 16].
57.3.2 Advantages of One-Class Classifier Over Binary Classifier
A one-class classifier does not overfit the data the way a binary classifier does, so cross-material performance degrades less, and only live samples are required for training. This removes the task of fabricating an enormous number of spoof impressions from multiple materials [17, 18]. The classifier only learns what constitutes a live fingerprint and does not use spoofs of any specific material during training. It therefore draws a strict decision boundary around the single class of live samples, and all samples from other classes are treated as unknown (i.e., spoof) [19].
57.3.3 Deep Convolutional Generative Adversarial Networks Generative adversarial network is one where two networks contend with each other in a zero-sum game and converge at nash equilibrium. GANs have two components, namely generator and discriminator. A discriminator is fed both generators synthesized and live fingerprints. Generator learns to synthesize better fingerprints, and discriminator learns to differentiate between synthesized and real. Basically, generator and discriminator both are two CNN models coupled together.
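The following Keras sketch shows one way such a coupled generator/discriminator pair can be defined for 64 × 64 grayscale fingerprints; the layer sizes are illustrative assumptions, not the authors' exact architecture.

```python
# Hedged sketch of a DCGAN generator (noise -> 64x64 image) and a mirror-like
# discriminator whose sigmoid output acts as the spoof score.
from tensorflow.keras import layers, models

def build_generator(latent_dim=100):
    return models.Sequential([
        layers.Dense(8 * 8 * 128, input_dim=latent_dim),
        layers.Reshape((8, 8, 128)),
        layers.Conv2DTranspose(128, 4, strides=2, padding="same", activation="relu"),  # 16x16
        layers.Conv2DTranspose(64, 4, strides=2, padding="same", activation="relu"),   # 32x32
        layers.Conv2DTranspose(1, 4, strides=2, padding="same", activation="tanh"),    # 64x64
    ])

def build_discriminator():
    return models.Sequential([
        layers.Conv2D(64, 4, strides=2, padding="same", input_shape=(64, 64, 1)),
        layers.LeakyReLU(0.2),
        layers.Conv2D(128, 4, strides=2, padding="same"),
        layers.LeakyReLU(0.2),
        layers.Flatten(),
        layers.Dense(1, activation="sigmoid"),   # value in [0, 1] used as spoof score
    ])
```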
57.4 Training and Architecture Used
The spoof detection model is based on the deep convolutional generative adversarial network (DCGAN) architecture. Training the DCGAN involves the following steps (a minimal training-loop sketch follows this list):
1. Gaussian noise is generated and passed through the generator; the binary cross-entropy loss is updated with the Adam optimizer at a learning rate of 0.0002, using strided convolutions in place of pooling.
2. The dataset containing only genuine fingerprints, together with the generated fingerprints, is fed to the discriminator, whose sigmoid output gives a value between 0 and 1.
3. In the testing phase, the generator is removed and only the discriminator is used to produce the sigmoid output.
4. The generator uses transposed (deconvolution) layers and outputs 64 × 64 images; the discriminator is a mirror network of the generator.
5. Initially during DCGAN training, the weight updates of the discriminator are frozen so that only the generator is trained; the synthesized images are saved.
6. The genuine and generated images are then used to train the discriminator again (Fig. 57.2).
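A hedged sketch of this alternating training loop, reusing the build_generator/build_discriminator definitions given earlier, is shown below; batch handling and hyperparameters other than the stated learning rate are assumptions.

```python
# Hedged sketch of the DCGAN training steps 1-6 (binary cross-entropy,
# Adam at learning rate 0.0002, discriminator frozen inside the combined model).
import numpy as np
from tensorflow.keras import models, optimizers

generator = build_generator()
discriminator = build_discriminator()
discriminator.compile(optimizer=optimizers.Adam(2e-4, beta_1=0.5),
                      loss="binary_crossentropy")

discriminator.trainable = False                       # frozen while the generator updates
gan = models.Sequential([generator, discriminator])
gan.compile(optimizer=optimizers.Adam(2e-4, beta_1=0.5), loss="binary_crossentropy")

def train_step(real_batch, latent_dim=100):
    n = real_batch.shape[0]
    noise = np.random.normal(0.0, 1.0, (n, latent_dim))      # Gaussian noise
    fakes = generator.predict(noise, verbose=0)

    # Update the discriminator on genuine and generated fingerprints.
    d_loss_real = discriminator.train_on_batch(real_batch, np.ones((n, 1)))
    d_loss_fake = discriminator.train_on_batch(fakes, np.zeros((n, 1)))

    # Update the generator through the frozen discriminator.
    g_loss = gan.train_on_batch(noise, np.ones((n, 1)))
    return d_loss_real, d_loss_fake, g_loss
```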
Fig. 57.2 Flow diagram of the proposed one-class classifier fingerprint spoof detector
57.5 Implementation Details 57.5.1 Dataset We used dataset from Sokoto Coventry Fingerprint Dataset (SOCOFing) [10], and a biometric fingerprint database is made of up 6000 fingerprint images from 600 African subjects, having attributes like gender, hand, finger and have altered images for three different levels such as obliterations, z-cut and central rotations. We used it for training and validation. For testing, we used LIVDET dataset 2011, 2013, 2015 (Clarkson University—University of Cagliari, Joint Multi-modal Biometric Dataset). MNIST dataset was also used for digit generation using simple GAN model.
57.5.2 Libraries Used: Keras is used for implementing the one-class classification DCGAN spoof detection and minutiae-based extraction. Pytorch has been used for synthesizing fake images in GAN experimented with MNIST data.
57.5.3 Data Analysis Strategies
In the SOCOFing dataset, the fingerprint images were altered with the STRANGE toolbox at 500 dpi resolution using the easy, medium, and hard settings, giving a total of 55,734 altered images of size 1 × 96 × 103. Minutiae extraction algorithms are then used to extract patches.
Table 57.1 Comparison of all spoof materials as compared to any other CNN model
Training set: SOCOFing dataset with 4800 real fingerprint images; 98.3% accuracy
Validation set: SOCOFing dataset with 1200 real fingerprints and 600 spoof fingerprints; 96% accuracy
Testing set: LivDet 2015, a dataset of 1000 GreenBit images including real fingerprints and spoof materials such as gelatin, Ecoflex, and latex; true detection rate 51.3%, fake detection rate 19%
57.6 Experimental Results and Inference
After running the one-class classifier over the dataset, the sigmoid output of the discriminator gave the spoof score for each input image; the spoof score was measured over the complete testing dataset and averaged. GANs work well for materials that are anomalous, such as Play-Doh and gold fingers. This approach achieved an average true detection rate at least 10% higher across all 12 spoof materials when compared with other CNN models. Training accuracy was 98.3% and validation accuracy was 96%, and precision was 51.2% compared with 49% in previous work [20] (Table 57.1).
57.6.1 Generated Images Output
Figure 57.3 shows the generated images, progressing from noise to the final synthesized fingerprint (images 1-4, respectively).
Fig. 57.3 Results of the proposed system
57.7 Conclusion and Future Work
We have addressed the generalization problem in spoof detection through a one-class classifier based on DCGANs, which requires a large dataset but eliminates the poor generalization to unknown spoof materials. The proposed spoof detection model nevertheless leaves room for future improvement on transparent spoof materials. Indeed, transparent spoofs have been reported as the most challenging materials, because much of the live finger's coloration transmits through the clear spoof material; GANs struggle to differentiate clear spoofs such as Ecoflex from a live finger, since the live finger can be seen behind the spoof.
References 1. Raja, R., Raja, H., Patra, R.K., Mehta, K., Gupta, A., Laxmi, K.R.: Assessment methods of Cognitive ability of human brains for inborn intelligence potential using pattern recognition. IntechOpen Biometric Syst (2020). ISBN 978-1-78984-188-6 2. Raja, R., Patra, R.K., Sinha, T.S.: Extraction of features from dummy face for improving biometrical authentication of human. Int. J. Luminescence Appl. 7(3–4), 507–512 (2017, October–Decemeber). ISSN 1 2277-6362 (Article 259) 3. Mahmood M.R., Patra R.K., Raja R., Sinha G.R.: A novel approach for weather prediction using forecasting analysis and data mining techniques. In: Saini H., Singh R., Kumar G., Rather G., Santhi K. (eds.) Innovations in Electronics and Communication Engineering. Lecture Notes in Networks and Systems, vol 65. Springer, Singapore (2019). https://doi.org/10.1007/978-98113-3765-9_50 4. Raja, R., Patra, R.K., Mahmood, M.R.: Image registration and rectification using background subtraction method for information security to justify cloning mechanism using high end
5.
6.
7.
8.
9.
10.
11.
12.
13.
14.
15.
16. 17.
18. 19.
20.
computing techniques. In: 3rd International Conference on Computational Intelligence & Informatics. Springer (2018) Pandey, S., Miri, R., Sinha, G.R., Raja, R.: AFD filter and E2n2 classifier for improving visualization of crop image and crop classification in remote sensing image. Int. J. Remote Sens. 1(1), 1–14 (2022) Sinha, T.S., Chakraverty, D., Patra, R., Raja, R.: Modelling and simulation for the recognition of physiological and behavioural traits through human gait and face images. Discrte Wavelet Transforms Compendium New Approaches Recents Appl. Intech China 95–125 (2013, February). ISBN 978-953-51-0940-2. http://dx.doi.org/https://doi.org/10.5772/52565. (edited by AwadKh. Al. Asmari) Chandrakar, R., Raja, R., Miri, R., Patra, R.K., Sinha, U.: Computer succored vaticination of multi-object detection and histogram enhancement in low vision. Int. J. Biometrics Spec. Issue Investig. Robustness Image Enhancement Preprocess. Tech. Biometrics Comput. Vis. Appl. (1), 1–20 (2022) Ding, Y., Ross, A.: An ensemble of one-class SVMs for fingerprint spoof detection across different fabrication materials. In: IEEE International Workshop on Information Forensics and Security (WIFS) 1–6 (2016). https://doi.org/10.1109/WIFS.2016.7823572 Sahu, A.K., Sharma, S., Tanveer, M., Raja, R.: Internet of Things attack detection using hybrid deep learning model. Comput. Commun. 176, 146–154 (2021). ISSN 0140-3664. https://doi. org/10.1016/j.comcom.2021.05.024 Patra, R.K., Raja, R., Sinha, T.S.: Extraction of geometric and prosodic features from humangait-speech data for behavioral pattern detection: Part II. In: Bhattacharyya, S., Chaki, N., Konar, D., Chakraborty, U., Singh, C. (eds.) Advanced Computational and Communication Paradigms. Advances in Intelligent Systems and Computing, vol. 706. Springer, Singapore (2018). ISBN 978-981-10-8236-8 Kumar, S., Jain, A., Shukla, A.P., Singh, S., Raja, R., Rani, S.: A comparative analysis of machine learning algorithms for detection of organic and nonorganic cotton diseases. Math. Prob. Eng. 18 (2021), Article ID 1790171. https://doi.org/10.1155/2021/1790171. Tiwari, L,., Raja, R., Awasthi, V., Miri, R., Sinha, G.R., Alkinani, G.H., Polat, K.: Detection of lung nodule and cancer using novel Mask-3 FCM and TWEDLNN algorithms. Measurement 172, 108882 (2021). ISSN 0263-2241. https://doi.org/10.1016/j.measurement.2020.108882 Raja, R., Kumar, S., Rashid, M., Color object detection based image retrieval using ROI segmentation with multi-feature method. Wirel. Pers. Commun. 1–24 (2020). ISSN 0929-6212 (print). ISSN 1572-834 (online). https://doi.org/10.1007/s11277-019-07021-6 Tiwari L., Raja R., Sharma V., Miri R.: Fuzzy inference system for efficient lung cancer detection. In: Gupta, M., Konar, D., Bhattacharyya, S., Biswas, S. (eds.) Computer Vision and Machine Intelligence in Medical Image Analysis. Advances in Intelligent Systems and Computing, vol 992. Springer, Singapore (2020). https://doi.org/10.1007/978-981-13-879 8-2_4 Raja, R., Sinha, T.S., Dubey, R.P.: Soft computing and LGXP techniques for ear authentication using progressive switching pattern. Int. J. Eng. Future Technol. 2(2), 66–86 (2016). ISSN 2455-6432 Jangde, K., Raja, R.: Image compression based on discrete wavelet and lifting wavelet transform technique. Int. J. Sci. Eng. Technol. Res. (IJSETR) 3(3), 394–399 (2014). ISSN 2278-7798 Tiwari, L., Raja, R., Sharma, V., Miri, R.: Adaptive neuro fuzzy inference system based fusion of medical image. Int. J. Res. Electron. Comput. Eng. 7(2), 2086–2091. 
ISSN: 2393-9028 (print). |ISSN: 2348-2281 (online) Raja, R., Sinha, T.S., Dubey, R.P.: Orientation calculation of human face using symbolic techniques and ANFIS. Int. J. Eng. Future Technol. 7(7), 37–50 (2016). ISSN: 2455-6432 Raja, H., Gupta, A. Miri, R.: Recognition of automated hand-written digits on document images making use of machine learning techniques. Eur. J. Eng. Technol. Res. 6(4), 37–44 (2021). https://doi.org/10.24018/ejers.2021.6.4.2460 Diwan, S.D., Thakur, A.K., Raja, H.: Fixed point of expansion mapping fuzzy menger space with CLRs property. Int. J. Innov. Sci. Math. 4(4), 143–145 (2016)
Chapter 58
Pulmonary Nodule Detection Using Laplacian of Gaussian and Deep Convolutional Neural Network Nuthanakanti Bhaskar and T. S. Ganashree
Abstract The early detection of pulmonary nodules on CT images is vital for optimal patient treatment. We present a CAD system for pulmonary nodule identification that combines a standard nodule detection method with a deep learning architecture to identify genuine nodules. To find nodule candidates, we first apply multi-scale Laplacian of Gaussian filters together with prior size and shape constraints, and then pass the candidates to a deep CNN. On the benchmark LUNA16 dataset, we achieved 93.2% validation accuracy, 89.3% precision, 71.2% recall, 98.2% specificity, and a 0.97 AUC.
58.1 Introduction
For the year 2020, the expected number of cancer patients in India was 6,79,421 (94.1/100,000) for males and 7,12,758 (103.6/100,000) for females. Cancer affects 1 in 68 men (lung cancer), 1 in 29 women (breast cancer), and 1 in 9 Indians (0-74 years of age) [1]. Noncommunicable diseases (NCDs) account for 71% of all deaths worldwide, and in India NCDs are thought to account for 63% of all deaths, with cancer being one of the primary causes (9%). Cancer is among the most common causes of death in the world, and according to the World Health Organization, lung cancer was the leading cause of cancer death in 2018. Early detection and treatment can significantly lower mortality rates. Deep learning is displacing traditional machine learning approaches in order to provide efficient diagnostics, and it is increasingly being employed in CAD systems [2, 3].
N. Bhaskar (B) CSE Department, VTU-RRC, Visvesvaraya Technological University, Belagavi, India CSE Department, CMR Technical Campus, Hyderabad, Telangana, India T. S. Ganashree Department of Telecommunication Engineering, Dayanandasagar College of Engineering, Bangalore, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 V. Bhateja et al. (eds.), Smart Intelligent Computing and Applications, Volume 1, Smart Innovation, Systems and Technologies 282, https://doi.org/10.1007/978-981-16-9669-5_58
Manual detection, segmentation, recognition, and volumetric measurement are hindered by the irregular shapes, intricate anatomical placement, and sometimes poor intensity of nodules; doing these tasks manually is tedious, time-consuming, and imprecise [4]. Considering these issues, a fully automatic system needs to be developed. Many systems have been developed and research on lung cancer detection is continuing; however, some systems are unsatisfactory in terms of cancer detection accuracy, while others have been improved to achieve high accuracy in classifying nodules in CT scans. Image processing and deep learning methods have been used for lung cancer nodule identification and classification in CT scans. We reviewed the latest systems for pulmonary nodule recognition and classification on CT scans, selected the best ones for analysis, and propose a new system.
58.2 Related Work Shuo Wang et al. [5] suggested a Central Focused CNN to distinguish lung nodules from diverse CT images. This strategy includes 2 main ideas: (1) The influence of surrounding voxels on the classification of an image voxel might vary based on their spatial positions; (2) The proposed model concurrently collects a wide region of nodule sensitive characteristics from both 3D and 2D CT images. This approach was tested using the public LIDC dataset, which included 893 nodules, as well as an independent dataset from Guangdong General Hospital, which had 74 nodules. In segmentation average dice scores of 82.15% for the 2 datasets and 80.02% for the other. Nasrullah et al. [6] created a new deep learning-based model using a variety of techniques. For lung nodule identification and classification, they used two deep 3D tailored mixed link network topologies. The quicker R-CNN using efficiently knowledgeable characteristics from CMixNet and U-Net, such as encoder–decoder architecture, was employed to identify nodules. This method was employed to classify the nodules based on the learnt characteristics from the specified 3D CMixNet structure. In early-stage lung cancer diagnosis, the deep learning architecture for nodule identification and classification, along with clinical considerations, aids to reduce abnormality and false-positive outcomes. On LIDC-IDRI datasets, this approach scored a 94 percent sensitivity and 91% specificity. Gu et al. [7] implemented a multi-scale prediction 3D Deep CNN. The LUNA16 database had 888 thin-slice scans containing 1186 nodules, which were used to test this approach. All of the results were acquired employing a tenfold cross-validation procedure. The proposed approach has a sensitivity of 87.94 percent and 92.93 percent at 1 & 4 false positives/scans, respectively. And the CPM is 0.7967. To accomplish lungs segmentation, Shaziya et al. [8] presented a U-Net convolutional network that was implemented on a lungs dataset. The lungs dataset contains
267 CT pictures of the lungs and the segmentation maps that go with them. 0.9678 and 0.0871, respectively, are the accuracy and loss attained. Zhang et al. [9] proposed a new 3D spatial pyramid dilated convolution network for lung nodule malignancy classification. To understand the exact characteristic information of the pulmonary nodules, researchers used 3D dilated convolution instead of pooling layers. Extensive testing revealed that the model outperformed the competition, with an accuracy of 88.6%. In a 2D CT slice, Karrar et al. [10] described a CAD system for detecting candidate nodules and diagnosing them as solitary or juxtapleural with equivalent dimensions ranging from 7.78 mm to 22.48 mm. A segmentation and improvement technique employs bilateral filtering, gray-level thresholding, bounding box, and highintensity extrapolation [11]. The lung boundary is cleared, erosion, dilation, and superimposing are used to remove border artefacts. Two classifiers are presented for classifying two types of nodules depending on their placements in the classification step: juxtapleural and solitary nodules. The CNN and KNN algorithms are the two classifiers. The CNN attained accuracy and sensitivity rates of 96% and 95%, respectively. Lin et al. [12] created a Taguchi-based CNN for identifying whether nodules are cancerous or benign. They created an orthogonal table and employed the Taguchi method to pick preliminary factors. The proposed approach had a 99.6% accuracy using the optimal parameter combination, according to the research observations. Lin et al. [13] used the Taguchi approach to choose optimal parameters with fewer trials. Based on the current results of the experiments, the accuracy of the used optimal parameter combination is 99.86% (Table 58.1).
58.3 Dataset
The LUNA16 dataset [30], which contains 888 CT scans and 1186 positive pulmonary nodules, was derived from the LIDC database. The reference standard was established using manual annotations from four radiologists, each of whom read every scan twice. In the first, blinded reading round, every radiologist classified lesions as non-nodule, nodule < 3 mm, or nodule ≥ 3 mm. In the second, unblinded session, the annotations of all four radiologists were reviewed, and each radiologist decided whether to accept or reject each annotation. CT scans with slice thickness greater than 2.5 mm were removed from the dataset, leaving scans with an image size of 512 × 512 pixels. Every scan was resampled to an effective resolution of 1.0 mm/voxel along all three axes using spline interpolation, to reduce the noise produced by the variability in spatial resolution. All nodules ≥ 3 mm accepted by at least three of the four radiologists serve as the reference standard of the LUNA16 challenge.
Year
2017
2018
2018
2018
2018
2018
2018
2019
2019
2019
2019
2019
2019
2019
2019
2020
2020
2020
2020
Author
Alakwaa et al. [14]
Zhang et al. [9]
Xie et al. [15]
Tang et al. [16]
Gu et al. [7]
Shaziya et al. [8]
Qin et al. [17]
Cao et al. [18]
Nasrullah et al. [6]
Carvalho et al. [19]
Fu et al. [20]
Huang et al. [21]
Xiao et al. [22]
Li et al. [23]
Jakimovski et al. [24]
Perez et al. [25]
Karrar et al. [10]
Baldwin et al. [26]
Lin et al. [12]
CNN
LCP-CNN
CNN
3D CNN
NA
CNN, MTAN, SDAE-ELM
3D CNN
R-CNN
3DCNN
DCNN
3D CMixNet
TSCNN
3D U-Net
U-Net convolutional network
3D deep CNN
3D Faster R-CNN architecture
2D CNN
3DCNN architecture
U-Net
Deep learning architecture
Consortium
LCP-CNN
NA
LIDC-IDRI
CDNN
LIDC-IDRI
LUNA16
LIDC
LIDC-IDRI
LIDC-IDRI
LIDC-IDRI
LUNA
LUNA16
Lungs dataset
LUNA16
LUNA16
LIDC-IDRI
LIDC
LIDC-IDRI
Dataset
98.8
99.5
95%
87
99.91
74.697
91.7
95.2
96.88
NI
94
NI
96.7
NI
87.94/92.93
NI
73.4/74.4
86.3%
78.9/71.2
Sensitivity
Table 58.1 The results of investigations looking into the identification of pulmonary nodules
99.5
28.03
NI
NI
98.66
NI
NI
NI
NI
NI
91
NI
NI
NI
NI
NI
NI
90.3
NI
Specificity
NI
NI
NI
NI
NI
NI
2
19.8
NI
NI
NI
NI
NI
NI
1/4
NI
1/8 and 1/4
NI
20/10
FPs/scan
NI
NI
NI
NI
NI
NI
0.874
0.880
0.882
NI
NI
0.925
0.834
NI
0.7967
0.723
0.790
NI
NI
CPM
NI
95
NI
0.913
NI
NI
NI
NI
NI
0.87
NI
NI
NI
NI
NI
NI
0.954
0.883
0.83
AUC
(continued)
99.6
NI
96%
99.6
99.62
68–99.6
NI
94.6
NI
NI
94.7
NI
NI
0.9678
NI
NI
NI
88.6%
86.6%
Accuracy
2020
2020
2021
Silva et al. [28]
Lin et al. [29]
Lin et al. [13]
NI-not included
2020
Ying Su et al. [27]
*
Year
Author
Table 58.1 (continued)
ALEX NET
2D CNN
CNN
R-CNN
Deep learning architecture
GAN
SPIE-AAPM
LIDC-IDRI
LIDC-IDRI
Dataset
94.30
99.97
NI
NI
Sensitivity
99.30
99.93
NI
NI
Specificity
NI
0.06
NI
NI
FPs/scan
NI
NI
NI
NI
CPM
NI
NI
0.936
NI
AUC
99.86
99.97
NI
91.2
Accuracy
58.4 Methodology
The proposed method consists of three important phases: lung segmentation, nodule identification, and classification. Figure 58.1 depicts a flowchart that summarizes the procedure. Each of the three procedures was developed by us.
58.4.1 Segmentation of Lung
Fig. 58.1 Proposed method
To minimize the processing cost of the subsequent operations and boost the accuracy of nodule identification, we segment the lung in every CT scan using the steps listed below (a resampling sketch follows the list):
(1) Image masks are created with the help of the data annotations.
(2) The region outside the lung is removed and the nodule range is expanded; values are normalized to 0-1, and the overlap between the lung mask and the source image is computed with lower and upper bounds, after which the CT scans are analyzed for slice thickness, window width, and position.
(3) Lung nodule image masks are generated using the SITK ResampleImageFilter.
(4) 3D sub-image masks are generated with a patch block size of 96 × 96 × 16.
(5) The nodule and non-nodule data are classified using the candidates file.
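The sketch below shows one way step (3) could be realized with SimpleITK's ResampleImageFilter; the target spacing, interpolators, and file names are assumptions for illustration.

```python
# Hedged sketch: resample a CT volume and its nodule mask to isotropic spacing.
import SimpleITK as sitk

def resample_to_isotropic(image, new_spacing=(1.0, 1.0, 1.0), interpolator=sitk.sitkLinear):
    old_spacing, old_size = image.GetSpacing(), image.GetSize()
    new_size = [int(round(sz * sp / nsp))
                for sz, sp, nsp in zip(old_size, old_spacing, new_spacing)]

    resampler = sitk.ResampleImageFilter()
    resampler.SetOutputSpacing(new_spacing)
    resampler.SetSize(new_size)
    resampler.SetOutputOrigin(image.GetOrigin())
    resampler.SetOutputDirection(image.GetDirection())
    resampler.SetInterpolator(interpolator)
    return resampler.Execute(image)

ct = sitk.ReadImage("scan.mhd")
mask = sitk.ReadImage("scan_nodule_mask.mhd")
ct_iso = resample_to_isotropic(ct)
mask_iso = resample_to_isotropic(mask, interpolator=sitk.sitkNearestNeighbor)  # keep labels crisp
```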
Augmentation. To supplement the image data, which is gathered in small batches, additional transformations are used. To normalize pixel values and enlarge the training set, we apply (a) ZCA whitening, (b) random rotations, (c) random shifts, and (d) random flips.
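A minimal Keras sketch of these augmentations is shown below; the parameter values and the x_train/y_train arrays are assumptions.

```python
# Hedged sketch of the augmentation pipeline: ZCA whitening, rotations,
# shifts and flips applied to the nodule patches.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rescale=1.0 / 255,        # normalize pixel values
    zca_whitening=True,       # (a) ZCA whitening
    rotation_range=15,        # (b) random rotations
    width_shift_range=0.1,    # (c) random shifts
    height_shift_range=0.1,
    horizontal_flip=True,     # (d) random flips
    vertical_flip=True,
)
datagen.fit(x_train)          # statistics needed for ZCA whitening (x_train: patch array)
batches = datagen.flow(x_train, y_train, batch_size=32)
```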
58.4.2 Nodule Detection
Let f(x, y) be a CT slice. A scale-normalized LoG filter at scale δi is

L_norm(x, y; δi) = δi² ∇²G(x, y; δi)    (58.1)

where G(x, y; δi) is a Gaussian function with standard deviation δi. We apply scale-normalized LoG filters at 21 scales to every CT slice to identify nodules of various sizes, resulting in 21 response maps:

Vi(x, y) = L_norm(x, y; δi) ∗ f(x, y)    (58.2)

where i = 1, 2, ..., 21 and δi ranges from 1 to 5 (Fig. 58.2). Then, for each nodule candidate region, the circularity is computed as

Circularity = 4π·S / L²    (58.3)

where L is the perimeter and S is the area of the candidate region.
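The multi-scale filtering of Eqs. 58.1 and 58.2 can be sketched with scipy as below; the exact sigma spacing is an assumption beyond the stated range of 1 to 5 over 21 scales.

```python
# Hedged sketch of scale-normalized multi-scale LoG filtering (Eqs. 58.1-58.2).
import numpy as np
from scipy import ndimage

def multiscale_log(ct_slice, n_scales=21, sigma_min=1.0, sigma_max=5.0):
    sigmas = np.linspace(sigma_min, sigma_max, n_scales)
    responses = []
    for sigma in sigmas:
        # gaussian_laplace convolves with the LoG kernel; sigma**2 normalizes the
        # response across scales, and the minus sign makes bright blobs peak positive.
        responses.append(-sigma ** 2 * ndimage.gaussian_laplace(ct_slice.astype(float), sigma))
    return np.stack(responses)        # (21, H, W) response maps V_i
```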
Fig. 58.2 a Lung Nodule CT image. b Nodule candidates identified by multi-scale LoG filters. c Nodule candidate discovered by removing the region
Fig. 58.3 Deep convolutional network
The perimeter L and area S of every region can be approximated by counting the edge pixels and the total pixels, respectively, because the nodule candidates are binarized regions produced by the multi-scale LoG filters. A region is removed if its area is less than 9 or greater than 1000 pixels, or if its circularity is less than 0.1; regions that are too small, too large, or elongated are thus excluded from further analysis. After recognizing nodule candidate regions on each slice, nodule candidate volumes are obtained by evaluating the 3D connectivity of these regions across neighbouring slices. Finally, any two nodule candidates that were less than 3 mm apart were merged to reduce false positives.
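These filtering rules can be expressed compactly with skimage region properties, as in the hedged sketch below; the thresholds follow the text, and everything else is illustrative.

```python
# Hedged sketch of candidate filtering: keep regions with area in [9, 1000]
# pixels and circularity 4*pi*S/L**2 >= 0.1.
import numpy as np
from skimage import measure

def filter_candidates(binary_slice):
    labeled = measure.label(binary_slice)
    keep = np.zeros_like(binary_slice, dtype=bool)
    for region in measure.regionprops(labeled):
        S, L = region.area, region.perimeter
        if L == 0:
            continue
        circularity = 4.0 * np.pi * S / (L ** 2)
        if 9 <= S <= 1000 and circularity >= 0.1:
            keep[labeled == region.label] = True   # plausible nodule candidate
    return keep
```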
58.4.3 Nodule Classification
We built a CNN with three 2D convolutional layers and four 3D pooling layers to classify the nodule candidates into nodules and non-nodules (Fig. 58.3). The first convolutional layer produces 50 × 50 × 32 feature maps, the second uses 64 5 × 5 filters, and the third uses 64 3 × 3 filters.
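A hedged Keras sketch of such a classifier is given below; the input patch size, activations, and the dense head are assumptions, while the filter counts follow the description above.

```python
# Illustrative sketch of the nodule / non-nodule CNN classifier.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(32, (5, 5), activation="relu", input_shape=(48, 48, 1)),
    layers.MaxPooling2D((2, 2)),                   # downsample the feature map by 2
    layers.Conv2D(64, (5, 5), activation="relu"),  # 2nd layer: 64 5x5 filters
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),  # 3rd layer: 64 3x3 filters
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),         # nodule vs. non-nodule
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```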
58.5 Results and Discussions
58.5.1 Metrics for Productivity
The proposed approach was tested on the LUNA16 dataset using tenfold cross-validation: the data was divided into ten random subsets of equal size, and the experiment was performed ten times, each time testing on one subset and training on the others (Figs. 58.4, 58.5, 58.6, 58.7 and 58.8). The model was trained on 80% of the data in the training set, with the remaining 20% used for validation. Nodule identification algorithms are developed for finding nodules on CT so that radiologists can focus primarily on the positive instances.
Fig. 58.4 Image files extraction from Luna 16 dataset
Fig. 58.5 Generation of lung nodule CT images and it’s masks (2D) from image files
Fig. 58.6 Generation of patch (96 * 96 * 16) lung nodule images and it’s Masks (3D) from 2D images and masks
In this setting, radiologists can correct false-positive cases during manual review, but false-negative cases cannot be recovered. A good algorithm should therefore score well on accuracy and on sensitivity, the proportion of lung nodules correctly detected.
Fig. 58.7 Classified non-nodule and nodule images (48 * 48 * 48) in 2 different folders (0: nonnodule and 1: nodule)
Fig. 58.8 Augmentation of the above nodule image data
Sensitivity = TP / (TP + FN)    (58.4)
On the above data, we first used the CNN model consisting of three convolutional layers (Figs. 58.9, 58.10, 58.11, 58.12). The feature map is downsampled by 2 in the max-pooling layer that follows the initial layer; the downsampled feature map is then fed into the second convolutional layer, which consists of 64 5 × 5 filters.
Fig. 58.9 CNN architecture
Fig. 58.10 The above feature map is generated by the 1st convolutional layer (50 × 50 × 32)
Fig. 58.11 The above feature map is produced by the 2nd convolutional layer of 64 5 × 5 filters
Fig. 58.12 The above feature map is produced by the 3rd convolutional layer of 64 3 × 3 filters
58.5.2 Testing
On 1622 images, we achieved 93.2% validation accuracy, 89.3% precision, 71.2% recall, and 98.2% specificity using the suggested CNN model (Figs. 58.13, 58.14). False-Negative Rate: a False Negative (FN) is a negative prediction that is actually positive, i.e., a case the model predicted incorrectly.
Fig. 58.13 Confusion matrix
FNR = FN / (FN + TP)    (58.5)
False-Positive Rate: a False Positive (FP) is a positive prediction that is actually negative, i.e., a case the model predicted incorrectly.

FPR = FP / (FP + TN)    (58.6)
True Negative Rate: a True Negative (TN) is a negative prediction that is genuinely negative, i.e., a case the model predicted correctly.

TNR = TN / (TN + FP)    (58.7)
True Positive Rate: a True Positive (TP) is a positive prediction that is truly positive, i.e., the actual value is positive and the predicted value is also positive (Fig. 58.15).

TPR = TP / (TP + FN)    (58.8)
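For reference, the four rates in Eqs. 58.5-58.8 can be computed directly from a confusion matrix, as in the small sketch below; y_true and y_pred are placeholder arrays.

```python
# Hedged sketch: compute FNR, FPR, TNR and TPR from a 2x2 confusion matrix.
from sklearn.metrics import confusion_matrix

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

fnr = fn / (fn + tp)   # false-negative rate (Eq. 58.5)
fpr = fp / (fp + tn)   # false-positive rate (Eq. 58.6)
tnr = tn / (tn + fp)   # true-negative rate, specificity (Eq. 58.7)
tpr = tp / (tp + fn)   # true-positive rate, sensitivity (Eq. 58.8)
```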
Fig. 58.14 a False-negative predictions. b False-positive predictions. c True negative predictions. d True positive predictions
Fig. 58.15 Receiver operating characteristic (ROC) curve
58.6 Conclusion
In this work, we offer a CAD system for pulmonary nodule identification that combines a classical method for detecting nodule candidates (multi-scale LoG filters with region and size constraints) with a CNN model for identifying actual nodules among those candidates. On the LUNA16 dataset, this system obtained 93.2% validation accuracy, 89.3% precision, 71.2% recall, 98.2% specificity, and 0.97 AUC.
References 1. Mathur, P., Sathishkumar, K., Chaturvedi, M., Das, P., Sudarshan, K.L., Santhappan, S., Nallasamy, V., John, A., Narasimhan, S., Roselind, F.S.: ICMR-NCDIR-NCRP Investigator Group. Cancer Statistics, 2020: Report From National Cancer Registry Programme, India. JCO Glob Oncol. 6, 1063–1075. PMID: 32673076; PMCID: PMC7392737 (2020). https://doi. org/10.1200/GO.20.00122 2. Tiwari, L., Raja, R., Awasthi, V., Miri, R., Sinha, G.R., Alkinani, M.H., Polat, K.: Detection of lung nodule and cancer using novel Mask-3 FCM and TWEDLNN algorithms. Measurement 172, 108882, ISSN 0263-2241 (2021). https://doi.org/10.1016/j.measurement.2020.108882 3. Siegel, R., Ma, J., Zou, Z., Jemal, A.: Cancer statistics, 2014. CA Cancer J. Clin. 64(1), 9–29, 2014 Jan-Feb. https://doi.org/10.3322/caac.21208. Epub 2014 Jan 7. Erratum in: CA Cancer J. Clin. 64(5), 364. 2014 Sep-Oct; PMID: 24399786 (2014) 4. Raja, R., Kumar, S., Rani, S., Laxmi, K. (eds.): Artificial intelligence and machine learning in 2D/3D medical image processing. CRC Press, Boca Raton (2021). https://doi.org/10.1201/ 9780429354526 5. Wang, S., Zhou, M., Liu, Z., Liu, Z., Gu. D., Zang, Y., Dong, D., Gevaert, O., Tian. J.: Central focused convolutional neural networks: developing a data-driven model for lung nodule segmentation. Med. Image Anal. 40, 172–183. 2017 Aug. Epub 2017 Jun 30. PMID: 28688283; PMCID: PMC5661888 (2017). https://doi.org/10.1016/j.media.2017.06.014 6. Nasrullah, N., Sang, J., Alam, M.S., Mateen, M., Cai, B., Hu, H.: Automated lung nodule detection and classification using deep learning combined with multiple strategies. Sensors (Basel) 19(17), 3722. 2019 Aug 28. PMID: 31466261; PMCID: PMC6749467 (2019). https:// doi.org/10.3390/s19173722 7. Gu, Y., Lu, X., Yang, L., Zhang, B., Yu, D., Zhao, Y., Gao, L., Wu, L., Zhou, T.: Automatic lung nodule detection using a 3D deep convolutional neural network combined with a multi-scale prediction strategy in chest CTs. Comput. Biol. Med. 103, 220–231. ISSN 0010-4825 (2018). https://doi.org/10.1016/j.compbiomed.2018.10.011 8. Shaziya, H., Shyamala, K., Zaheer, R.: Automatic lung segmentation on thoracic CT scans using U-net convolutional network. 2018 International Conference on Communication and Signal Processing (ICCSP), pp. 0643-0647 (2018). https://doi.org/10.1109/ICCSP.2018.852 4484 9. Zhang, G., Liu, X., Zhu, D., He, P., Liang, L., Luo, Y., Lu, J.: 3D spatial pyramid dilated network for pulmonary nodule classification. Symmetry 10(9), 376 (2018). https://doi.org/10. 3390/sym10090376 10. Karrar, A., Mabrouk, M.S., Wahed, M.A.: DIAGNOSIS OF LUNG NODULES FROM 2D COMPUTER TOMOGRAPHY SCANS. Biomed. Eng. Appl. Basis Commun. 32(03), 2050017 (2020). https://doi.org/10.4015/S1016237220500179 11. Tiwari, L., Raja, R., Sharma, V., Miri, R.: Adaptive neuro fuzzy inference system based fusion of medical image. Int. J. Res. Elec. Comput. Eng. 7(2): 2086–2091, ISSN: 2393-9028 (PRINT) |ISSN: 2348-2281 (ONLINE)
12. Lin, C.-J., Li, Y.-C.: Lung nodule classification using Taguchi-based convolutional neural networks for computer tomography images. Electronics 9(7):1066 (2020). https://doi.org/10. 3390/electronics9071066 13. Lin, C.-H., Lin, C.-J., Li, Y.-C., Wang, S.-H.: Using generative adversarial networks and parameter optimization of convolutional neural networks for lung tumor classification. Appl. Sci. 11(2), 480 (2021). https://doi.org/10.3390/app11020480 14. Alakwaa, W., Nassef, M., Badr, A.: Lung cancer detection and classification with 3D convolutional neural network (3D-CNN). Int. J. Adv. Comput. Sci. Appl. (IJACSA) 8(8) (2017). http:// doi.org/10.14569/IJACSA.2017.080853 15. Xie, H., Yang, D., Sun, N., Chen, Z., Zhang, Y.: Automated pulmonary nodule detection in CT images using deep convolutional neural networks. Pattern Recogn. 85, 109–119, ISSN 0031-3203 (2019). https://doi.org/10.1016/j.patcog.2018.07.031 16. Tang, H., Kim, D.R., Xie, X.: Automated pulmonary nodule detection using 3D deep convolutional neural networks. 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018), pp. 523–526 (2018). https://doi.org/10.1109/ISBI.2018.8363630 17. Qin, Y., Zheng, H., Zhu Y.-M., Yang, J.: Simultaneous accurate detection of pulmonary nodules and false positive reduction Using 3D CNNs. 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1005-1009 (2018). https://doi.org/10.1109/ ICASSP.2018.8462546 18. Cao, H., Liu, H., Song, E., Ma, G., Xu, X., Jin, R., Liu, T., Hung, C.C.: A two-stage convolutional neural networks for lung nodule detection. IEEE J. Biomed. Health Inform. 24(7), 2006–2015. Epub 2020 Jan 3. PMID: 31905154 (2020). https://doi.org/10.1109/JBHI.2019.2963720 19. Carvalho, J.B.S., Moreira, J.-M., Figueiredo, M.A.T., Papanikolaou, N.: Automatic detection and segmentation of lung lesions using deep residual CNNs. 2019 IEEE 19th International Conference on Bioinformatics and Bioengineering (BIBE), pp. 977–983 (2019). https://doi. org/10.1109/BIBE.2019.00182 20. Fu, L., Ma, J., Chen, Y. et al.: Automatic detection of lung nodules using 3D deep convolutional neural networks. J. Shanghai Jiaotong Univ. (Sci.) 24, 517–523 (2019). https://doi.org/10.1007/ s12204-019-2084-4 21. Huang, X., Sun, W., Tseng, T.-L. (Bill), Li, C., Qian, W.: Fast and fully-automated detection and segmentation of pulmonary nodules in thoracic CT scans using deep convolutional neural networks. Comput. Med. Imaging Graph. 74, 25–36, ISSN 0895-6111 (2019). https://doi.org/ 10.1016/j.compmedimag.2019.02.003 22. Xiao, Z., Du, N., Geng, L., Zhang, F., Wu, J., Liu, Y.: Multi-scale heterogeneous 3D CNN for false-positive reduction in pulmonary nodule detection, based on chest CT images. Appl. Sci. 9(16), 3261 (2019). https://doi.org/10.3390/app9163261 23. Li, D., Mikela Vilmun, B., Frederik Carlsen, J., Albrecht-Beste, E., Ammitzbøl Lauridsen, C., Bachmann Nielsen, M., Lindskov Hansen, K.: The performance of deep learning algorithms on automatic pulmonary nodule detection and classification tested on different datasets that are not derived from LIDC-IDRI: a systematic review. Diagnostics 9(4), 207 (2019). https:// doi.org/10.3390/diagnostics9040207 24. Jakimovski, G., Davcev, D.: Using double convolution neural network for lung cancer stage detection. Appl. Sci. 9(3), 427 (2019). https://doi.org/10.3390/app9030427 25. Perez, G., Arbelaez, P.: Automated lung cancer diagnosis using three-dimensional convolutional neural networks. Med. Biol. Eng. Comput. 
58(8), 1803–1815. Epub 2020 Jun 5. PMID: 32504345 (2020). https://doi.org/10.1007/s11517-020-02197-7 26. Baldwin, D.R., Gustafson, J., Pickup, L., Arteta, C., Novotny, P., Declerck, J., Kadir, T., Figueiras, C., Sterba, A., Exell, A., Potesil, V., Holland, P., Spence, H., Clubley, A., O’Dowd, E., Clark, M., Ashford-Turner, V., Callister, M.E., Gleeson, F.V.: External validation of a convolutional neural network artificial intelligence tool to predict malignancy in pulmonary nodules. Thorax 75(4), 306–312. Epub 2020 Mar 5. PMID: 32139611; PMCID: PMC7231457 (2020). https://doi.org/10.1136/thoraxjnl-2019-214104 27. Su, Y., Li, D., Chen, X.: Lung nodule detection based on faster R-CNN framework. Comput. Methods Programs Biomed. 200, 105866. Epub 2020 Nov 22. PMID: 33309304 (2021). https:// doi.org/10.1016/j.cmpb.2020.105866
28. Silva, F., Pereira, T., Frade, J., Mendes, J., Freitas, C., Hespanhol, V., Costa, J.L., Cunha, A., Oliveira, H.P.: Pre-training autoencoder for lung nodule malignancy assessment using CT images. Appl. Sci. 10(21), 7837 (2020). https://doi.org/10.3390/app10217837 29. Lin, C.-J., Jeng, S.-Y., Chen, M.-K.: Using 2D CNN with Taguchi parametric optimization for lung cancer recognition from CT images. Appl. Sci. 10(7), 2591 (2020). https://doi.org/10. 3390/app10072591 30. Setio, A.A.A., Traverso, A., de Bel, T., Berens, M.S.N., van den Bogaard, C., Cerello, P., Chen, H., Dou, Q., Fantacci, M.E., Geurts, B., van der Gugten, R., Heng, P.A., Jansen, B., de Kaste, M.M.J., Kotov, V., Lin, J.Y.-H., Manders, J.T.M.C., Sóñora-Mengana, A., GarcíaNaranjo, J.C., Papavasileiou, E., Prokop, M., Saletta, M., Schaefer-Prokop, C.M., Scholten, E.T., Scholten, L., Snoeren, M.M., Lopez Torres, E., Vandemeulebroucke, J., Walasek, N., Zuidhof, G.C.A., van Ginneken, B., Jacobs, C.: Validation, comparison, and combination of algorithms for automatic detection of pulmonary nodules in computed tomography images: the LUNA16 challenge. Med. Image Anal. 42, 1–13, ISSN 1361-8415 (2017). https://doi.org/10. 1016/j.media.2017.06.015
Correction to: A Peer-to-Peer Approach for Extending Wireless Network Base for Managing IoT Edge Devices Off-Gateway Range Ramadevi Yellasiri, Sujanavan Tiruvayipati, Sridevi Tumula, and Khooturu Koutilya Reddy
Correction to: Chapter 50 in: V. Bhateja et al. (eds.), Smart Intelligent Computing and Applications, Volume 1, Smart Innovation, Systems and Technologies 282, https://doi.org/10.1007/978-981-16-9669-5_50 In the original version of the chapter, the affiliation of the chapter author was inadvertently published incorrectly. The authors “R. Yellasiri, S.Tumula, and K. K. Reddy” affiliation has been now updated as below: Department of Computer Science and Engineering, Chaitanya Bharathi Institute of Technology, Hyderabad, India The chapter has been updated with the changes.
The updated version of this chapter can be found at https://doi.org/10.1007/978-981-16-9669-5_50
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 V. Bhateja et al. (eds.), Smart Intelligent Computing and Applications, Volume 1, Smart Innovation, Systems and Technologies 282, https://doi.org/10.1007/978-981-16-9669-5_59
Author Index
A Abdul Muhaimin, 57 Abinash Sahoo, 307, 319 Adilakshmi, T., 247, 331, 505 Akanksha Gupta, 623 Akash, M. D. N., 239 Alluri Kranthi, 151 Alphonse, P. J. A., 1 Anitta Joseph, 269 Anjana Raut, 517 Anne, K. R., 197 Anuradha, P., 131, 141 Anusha, A., 589 Arnav Machavarapu, 75 Arun Kumar, P., 517 Ayush Noel, 331
B Bhargavi Mikkili, 353 Bhaskara Rao, G., 103 Bhavana, V., 441 Bhavitha Katari, 363 Bhise, Dhiraj V., 611
C Chhanda Saha, 493
D Deba Prakash Sathpathy, 319 Deshpande, Arati R., 471 Deshpande, Sujit N., 481 Dinesh Reddy, V., 599
Dillip Kumar Ghose, 307 Divakara Rao, D. V., 463 Durugkar, Santosh R., 611 E Emmanuel, M., 471 G Ganashree, T. S., 633 Gandhe Srivani, 17 Ganesh Bhandarkar, 187 Gnyanee, K., 431 Golla Bharadwaj Sai, 239 Guna Santhoshi, 33 H Harikrishnan, R., 377 Harish Kumar, K., 389 Harshitha, L., 599 Harsh Raj, J., 257 Hegde, G. P., 431 Hegde, Nagaratna P., 431 J Jabbar, M. A., 239 Jhansi, K., 441 Jogdand, Rashmi M., 481 Jyotirmaya Mishra, 531 K Kachapuram BasavaRaju, 401
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 V. Bhateja et al. (eds.), Smart Intelligent Computing and Applications, Volume 1, Smart Innovation, Systems and Technologies 282, https://doi.org/10.1007/978-981-16-9669-5
649
650 Kalpana, M., 413 Kapil Kumar Nagwanshi, 611 Khooturu Koutilya Reddy, 551 Kotha Manohar, 85 Krishan Gopal Saraswat, 623
L Lakshmi Harika Marrivada, 541 Landge, Pravinkumar B., 611 Lasya Sree, N., 599 Logashanmugam, E., 85
M Madhira Venkata Sai Yeshwanth Reddy, 239 Mallikarjuna Rao, Ch., 215 Mamata Das, 1 Manohar, P. M., 463 Mariyam Ashai, 377 Modh, Rutika M., 451 Mohamed Basheer, K. P., 57 Mohammad Ulfath, 177 Mohammed Bakhar, 569 Mohana Lakshmi, K., 579 Mohd Hasham Ali, 117 Mozammil Akhtar, 623 Mundharikar, Sanjana P., 377 Muneer, V. K., 57
N Naagamani Molakathaala, 493 Nagaratna, M., 541 Naik, M. T., 117 Navya, M., 421 Neelamadhab Padhy, 531 Nitish Reddy, T., 281 Nizar Banu, P. K., 269 Nuthanakanti Bhaskar, 599, 633
P Padmalaya Nayak, 215 Pallavi Biradar, 569 Pallavi Reddy, R., 177 Parthiban, A., 103 Patel, Jagruti N., 451 Patel, Meghna B., 451 Prahasit Reddy, P., 281 Pranathi Jalapally, 47 Prasanna Kumar, J., 67 Prasasd, A. V. Krishna, 589
Author Index Pratap Singh, R., 215 Priya, K., 161 Puneet Mittal, 95 Pusarla Samyuktha, 151
R Raga Madhuri, Ch., 441 Rajesh Mahule, 623 Rajeshwar Rao Arabelli, 131, 141 Rajkumar, K., 131, 141 Rajkumar, K. K., 161 Raj Kumar Patra, 611, 623 Ramadevi Yellasiri, 401, 551 Ramya Manaswi, V., 225 Ravi Teja, G., 331 Rhea Gautam Mukherjee, 377 Rittam Das, 493 Riyan Pahuja, 187 Rizwana, K. T., 57 Rohith Reddy Byreddy, 561
S Sahithi, P. G. L., 441 Saif Ali Athyaab, 257 Saketh Malladi, 561 Sameen Fatima, S., 421 Sandeep Samantaray, 319 Sandhya, B., 67 Sangeeta Gupta, 257 Sankarababu, B., 225 Sateesh Kumar, R., 421 Savadam Balaji, 339 Selvakumar, K., 1 Senguttuvan, K., 413 Shamal Bulbule, 205 Shankhanil Ghosh, 493 Shanmuga Sundari, M., 151 Shareena, R., 141 Shitharth, S., 281 Shraban Kumar Apat, 531 Shreekumar, T., 95 Shrey Agarwal, 187 Shridevi Soma, 205 ShushmaSri, K., 441 Shweta Patil, 569 Shyam Sunder Pabboju, 505 Sireesha, V., 431 Sirisha, B., 67 Sita Kumari Kotha, 363 Souvik Ghosh, 493 Sravya Madala, 47
Author Index Sridevi Tumula, 551 Sri Harsha, S., 197 Srikanth, B. V. S. S., 561 Srinivas, K., 389 Srinivasu Badugu, 17, 33, 293 Sriperambuduri Vinay Kumar, 541 Srujan Raju, K., 531, 589 Subba Raju, G. V., 293 Suhasini Sodagudi, 353 Sujanavan Tiruvayipati, 551 Sukhwinder Sharma, 95 Suma, K., 95 Suneetha Rikhari, 579 Sunitha, M., 331 Sunitha, N. V., 95 Suparna Das, 151 Supriya, M., 247 Suresh Chandra Satapathy, 187 Suvarna, Vani K., 47 Swati Samantaray, 517
651 Syed Nawazish Mehdi, 117 Sai Teja, M. N. V. M., 599
V Vaheed, Sk., 215 Vankayalapati, H. D., 197 Varaprasad Rao, M., 589 Varayogula Akhila, 293 Vasavi, J., 131 Venkata Bhargav, P., 599 Venkataramana Battula, 561 Vijay Kumar Gugulothu, 339 Vinayak Dev Kuanr, 377 Vishal, A., 281
Y Yashaswi Upmon, 187